To REST Or Not To REST

REST has become the de-facto standard for building modern APIs, but it’s not without its drawbacks. Now that alternatives are available, it seems reasonable to question that standard.

In many teams I’ve worked with so far, the question of whether or not to use the Representational State Transfer (REST) architectural style to build the API of some new application or service wasn’t even raised – using REST was just obvious. To REST, then, seemed almost as natural as breathing, and seriously questioning that would probably have been considered something like heresy.

This is perhaps because REST has seen tremendous adoption since Roy Fielding first described it in 2000 in his PhD dissertation, “Architectural Styles and the Design of Network-based Software Architectures”. Over time, it has become the de-facto standard for building a modern API, specifically in combination with JSON as a much more lightweight alternative to XML for data exchange. And yet, REST is not without its drawbacks, and it’s two of those drawbacks – the most significant ones, in my estimation – that we’ll take a look at in this blog post.

Core Concepts…

The REST philosophy is based on the following two fundamental questions:

  • How should the server represent the objects of the domain model in terms of resources?
  • Which endpoints should the server expose to allow clients to query those resources?

So, the two most important concepts in the REST philosophy are resources and endpoints. This results in a very server-focused way of thinking about data and how it can be delivered, and I think it is this server-focused approach that sits at the very root of the issues clients may face when working with a REST API.

… And Their Drawbacks

Consider the following simple example: Suppose you have a server application that delivers information about actors and movies, and in the server’s data model, they have an n-to-m relationship to one another. So, which issues will clients probably face if the server exposes that information via a REST API, that is, in terms of resources and endpoints?

Little Flexibility In Terms Of Data Representation

With REST, the server has a very specific idea not only of what, for example, an actor or a movie object looks like, but also of how such an object is presented to the client as a resource. If a client wants to query actors via an /actors endpoint, then the server will deliver those actors with all the pieces of information – properties – it thinks are important about actors, even if the client is only interested in, say, the actors’ first names. Depending on the client’s needs, this can lead to poor efficiency in terms of data delivered by the server vs. data actually used by the client, and we’ll see an example of how bad this can get a bit further down the line.
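To make the delivered-vs-used gap a bit more tangible, here is a minimal Python sketch. It assumes a hypothetical /actors response; the property names (firstName, lastName, biography, agentContact) and the placeholder values are invented for illustration and don’t come from any real API. The HTTP call is replaced by a plain function so the snippet runs on its own.

```python
import json

# Hypothetical actor resources as a server might deliver them from /actors.
# Property names and placeholder values are assumptions for illustration only.
ACTORS = [
    {"id": 1, "firstName": "Keanu", "lastName": "Reeves",
     "biography": "x" * 500, "agentContact": "y" * 100},
    {"id": 2, "firstName": "Carrie-Anne", "lastName": "Moss",
     "biography": "x" * 500, "agentContact": "y" * 100},
]


def get_actors() -> str:
    """Stand-in for GET /actors: the server decides which properties to send."""
    return json.dumps(ACTORS)


def first_names(payload: str) -> list[str]:
    """What this particular client actually wants: just the first names."""
    return [actor["firstName"] for actor in json.loads(payload)]


payload = get_actors()
names = first_names(payload)

# Compare the size of what was delivered with the size of what was needed.
delivered = len(payload.encode("utf-8"))
used = len(json.dumps(names).encode("utf-8"))

print(names)
print(f"{used} of {delivered} bytes were of interest to the client")
```

The exact numbers depend entirely on the made-up payload, of course; the point is only that the client has no say in what the resource contains.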

Representing References Requires Painful Compromises

In our simple example, each actor can appear in multiple movies, and each movie features more than one actor. There are multiple options to expose those references to clients, and none is really satisfactory (let’s suppose the client wants to find all actors that starred in “The Matrix”):

  1. Don’t represent references at all – resources are “self-sufficient” and the client needs to retrieve IDs and implement some logic to traverse references on its own. This would require clients to make a call like /movies?title=The%20Matrix to retrieve the movie’s ID and then make another call to /movies/{id}/actors to retrieve all actors. Not very elegant – the client needs to make two calls to the API and maintain some logic to handle the information retrieved in each one.

  2. For a REST API with a higher maturity level, you could go for HATEOAS and deliver references in the form of URLs. Each movie resource would then carry a list of URLs, one for each of its actors. The original call to the /movies endpoint for our simple demo use case would remain the same, but the client wouldn’t need to parse the ID (in fact, in HATEOAS, a resource’s URL is its ID), and it also wouldn’t need to know about the /movies/{id}/actors endpoint, because the server provides it with the list of URLs and the client can simply invoke each one. (One of the ideas behind HATEOAS is that the API essentially documents itself and a client can discover all of its endpoints if it is given only one “starting point”, but an explanation of HATEOAS is beyond the scope of this blog post.) This should make the client-side logic simpler, but it entails making many (1+n) calls to the API – one to retrieve the movie resource and then n calls to retrieve its n actors. Not very nice, either.

  3. An extension of the idea above is that nested objects (actors, from each movie’s perspective) could be fully expanded and then embedded in the response, that is, the response would contain the referenced resource itself rather than just the reference. But this approach also seems unsatisfying: only the first-level nested objects can be expanded (otherwise you’d run into issues with circular references and responses would become unreasonably large), and it aggravates the issue of amount of data delivered vs. amount of data used – the greater the former, the greater the risk of most of it going to waste. On the plus side, in our simple use case, the client really would only need to make one call to the API (/movies?title=The%20Matrix) to retrieve all the information it desires.
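The three options above can be compared with a small Python sketch that counts calls against an in-memory stand-in for the API. Everything here is an assumption made for illustration: the FakeApi class, the IDs, the /actors/{id} URL shape used for the HATEOAS-style links, and the “style” switch that selects how references are represented. No real HTTP is involved.

```python
# In-memory stand-in for the server's data; IDs are illustrative.
MOVIES = {1: {"id": 1, "title": "The Matrix", "actor_ids": [10, 11, 12]}}
ACTORS = {
    10: {"id": 10, "name": "Keanu Reeves"},
    11: {"id": 11, "name": "Carrie-Anne Moss"},
    12: {"id": 12, "name": "Laurence Fishburne"},
}


class FakeApi:
    """Hypothetical API that counts calls instead of doing real HTTP."""

    def __init__(self):
        self.calls = 0

    def find_movie(self, title, style):
        # Corresponds to GET /movies?title=...
        self.calls += 1
        movie = next(m for m in MOVIES.values() if m["title"] == title)
        if style == "ids":        # option 1: no references at all
            return {"id": movie["id"], "title": movie["title"]}
        if style == "links":      # option 2: references as URLs
            return {"title": movie["title"],
                    "actors": [f"/actors/{a}" for a in movie["actor_ids"]]}
        # option 3: first-level references expanded and embedded
        return {"title": movie["title"],
                "actors": [ACTORS[a] for a in movie["actor_ids"]]}

    def movie_actors(self, movie_id):
        # Corresponds to GET /movies/{id}/actors
        self.calls += 1
        return [ACTORS[a] for a in MOVIES[movie_id]["actor_ids"]]

    def follow(self, url):
        # Corresponds to GET on a link delivered by the server
        self.calls += 1
        return ACTORS[int(url.rsplit("/", 1)[1])]


def option_1(api):
    movie = api.find_movie("The Matrix", "ids")
    return [a["name"] for a in api.movie_actors(movie["id"])]


def option_2(api):
    movie = api.find_movie("The Matrix", "links")
    return [api.follow(url)["name"] for url in movie["actors"]]


def option_3(api):
    movie = api.find_movie("The Matrix", "expanded")
    return [a["name"] for a in movie["actors"]]


results = {}
for label, fn in [("ids", option_1), ("links", option_2), ("expanded", option_3)]:
    api = FakeApi()
    actors = fn(api)
    results[label] = api.calls
    print(label, api.calls, actors)
```

With three actors in the fake data, the options cost 2, 4 (that is, 1+n), and 1 call respectively. The third option wins on round trips but is the one that ships the most data per response.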

Example: Putting It To The Extreme

A while ago, I experienced a situation in the client company I currently work for (a large bank), and I want to use it as an example here because it beautifully illustrates the drawbacks outlined above.

My task was to build a REST API to deliver some infrastructure data (which applications run on which hosts, which technology stack is in place on each host, stuff like that) to internal clients, and during the planning phase of the project, we decided (a) to use HATEOAS to represent and deliver resources (REST was a requirement that had been set in stone long before I joined, so there was nothing to be changed about that) and (b) to represent references as URLs by default, but also give clients the option to ask the API for what we called an expanded view, which would expand all first-level nested objects (this is, of course, a combination of the aforementioned options two and three). In this manner, the clients would actually have a little bit of flexibility in terms of what perspective on data they would like to consume.

Fast forward: The application is in production and its API contains one endpoint whose resources link to three other kinds of resources. There are many resources (on the order of thousands) of the former type within the business domain, and each carries many references – dozens in total – to other resources. If a client queried that endpoint without any filter parameters, asked for the expanded view, and then iterated over the paged results, it would end up with 160 to 170 MB (!) of raw, unformatted JSON. With the non-expanded view – references represented only as URLs – the client would have had to make thousands of “round trips” to the API to get all the data it wanted, so that wasn’t a great option, either. The only chance to optimize this would have been for the client in question to make more fine-grained calls by moving away from its batch-processing-style workflows, but the team responsible for that client’s implementation was very quick to decide they weren’t really in the mood for that. Welp…

To add insult to injury, that client was only interested in the name attribute of the root resource plus the name attribute of the three linked resources. Therefore, the destiny of the vast majority of the humongous amounts of data queried from the API was to disappear into oblivion by eventually being garbage-collected away. Yuck!

Is REST Obsolete?

It’s obvious that REST’s neglect of the client’s perspective leads to problems on the client side, and it doesn’t even take a large, complex business domain for those problems to manifest (although they will manifest all the more painfully if the business domain is large and complex) – you’ve seen above that the endpoint-and-resources-focused approach of REST makes an API built in that style inelegant and clunky to handle even in our simple movie-and-actor example.

Still, I’d by no means consider REST to be obsolete. It’s just that the question of whether or not to use it is very worthy of discussion today (much more so than a couple of years ago). There is one particularly interesting – and powerful – alternative to REST that I want to talk about in the next blog post. Stay tuned!