In the year 2000 Roy Fielding’s acclaimed dissertation introduced the Representational State Transfer (REST) software design criteria, focusing on a series of constraints to simplify and standardize web services development. Although these guidelines were not immediately adopted as the norm, they paved the way for today’s petabyte-scale web architectures. Before exploring the principles behind REST APIs and high scalability, an overview of how the web works is required, with specific attention to its most popular protocol — HTTP.
I. HTTP: the winged messenger
If the World Wide Web were explained through Greek mythology (its precursor in complexity), the HyperText Transfer Protocol would be the winged messenger Hermes. Just as Hermes was the only Olympian god with the power to travel between the realms of the living and the dead, HTTP is the protocol that carries messages between a seemingly “living” client (a web browser, to simplify) and a “stateless” web server. The decoupling of client and server, where one requests information and the other sends it, is the first of the REST constraints.
Browsers like Chrome, Safari or Firefox are helpful examples to illustrate this data exchange, but a client can be any software tool programmed to request and receive information. Whether you click on an image that directs you to a new web page, or a retail site connects to a courier service during checkout, the client-server computing model remains the same.
Hypertext, hypermedia and web resources
In simple terms, hypertext is text that contains links to other texts. The same concept applies to hypermedia, except that the linking is done through images, graphics, video and sound. The target of a link lives on a web server and is considered a web resource. It’s returned to the client in the form of representations, hence the name “Representational State Transfer”.
Think of a resource as anything on the internet that should be identifiable so that it can be stored, retrieved and modified: a user account, blog post, shopping cart item, flight destination and so on. Resources are identified using unique strings called URIs (Uniform Resource Identifiers).
For web tech initiates, we should start by looking at the most recognizable type of URI: the URL (Uniform Resource Locator). Notice how web URLs begin with “http://” (or its secure sibling “https://”). That’s our winged deity, indicating its status as the web’s standard protocol for retrieving a resource. The pantheon of concepts we’ve just sprinted through should become clearer once we look at what happens when you type a URL into a browser:
1. The client contacts the Domain Name System (DNS) to locate the IP address that is mapped to the domain name in the requested URL.
*An IP address indicates where the resource that corresponds to that URL is located, i.e. the server that the webpage or web resource is hosted on.
2. Once the client knows which server to contact, it establishes a TCP connection with that server and sends an HTTP request.
3. The server processes this request and returns an HTTP response, which for a web page contains the HTML (HyperText Markup Language) document in the response body.
4. The response is rendered by the web browser and the solicited content, including text, image, video or sound, appears.
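The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not how a real browser works: it assumes a plain http:// URL, skips HTTPS and chunked transfer encoding, and the helper names build_request, parse_response and fetch are made up for this example.

```python
import socket
from urllib.parse import urlsplit


def build_request(url: str) -> tuple[str, int, bytes]:
    """Return (host, port, raw GET request) for a plain http:// URL."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    host_header = host if port == 80 else f"{host}:{port}"
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return host, port, request.encode("ascii")


def parse_response(raw: bytes) -> tuple[int, dict[str, str], bytes]:
    """Split a raw HTTP response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status = int(lines[0].split(" ", 2)[1])  # e.g. "HTTP/1.1 200 OK" -> 200
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


def fetch(url: str) -> tuple[int, dict[str, str], bytes]:
    """Steps 1 to 3: resolve the name, open a TCP connection, exchange messages."""
    host, port, request = build_request(url)
    ip_address = socket.gethostbyname(host)  # step 1: DNS lookup
    with socket.create_connection((ip_address, port), timeout=10) as conn:  # step 2
        conn.sendall(request)
        chunks = []
        while chunk := conn.recv(4096):  # read until the server closes the connection
            chunks.append(chunk)
    return parse_response(b"".join(chunks))  # step 3: the body is left as raw bytes
```

Calling fetch("http://example.com/") needs network access; step 4, rendering the body, is the part a browser adds on top.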
Voilà, mission (almost) accomplished. Under the REST architectural style, a client-server transaction is only considered complete when it leaves no session data from the exchange on the server: the server keeps the resources themselves, but remembers nothing about the client between requests. This principle is referred to as “statelessness” and leads us to the next section for understanding REST APIs.
II. “A State for one man is no State at all.”
A stateful system is deemed so from the perspective of the backend server, which stores vital information related to the client session, such as user authentication, authorization and data validation. In REST, however, all the information required to understand and process an incoming request is provided by the client. The stateless restriction stipulates that each client-issued request is handled as a single, isolated transaction. Client devices store and resend data, and the server cannot reuse or rely on data from previous requests.
The impact this constraint has on a web service’s scalability is monumental, as the stateless protocol allows for load balancing. Incoming requests can be routed to any web server, and the number of web servers needed to match the expected workload can be scaled up or down.
In Fielding’s words: “REST ignores the details of component implementation and protocol syntax in order to focus on the roles of components, the constraints upon their interaction with other components, and their interpretation of significant data elements.”
In your average mortal’s words: statelessness is what enables scaling requests to multitudes of servers distributed across the globe.
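To make the link between statelessness and load balancing concrete, here is a small sketch. Everything in it is hypothetical (the Request fields, the token table, the server names, the round-robin policy); the point is only that a server holding no per-client memory can be swapped for any other.

```python
import itertools
from dataclasses import dataclass

# Hypothetical token table; a real service would verify a signed token instead.
KNOWN_TOKENS = {"token-abc": "alice"}


@dataclass(frozen=True)
class Request:
    token: str   # credentials travel with every request
    cart: tuple  # so does the client-held state (an imaginary shopping cart)


class Server:
    def __init__(self, name: str):
        self.name = name  # no per-client attributes: nothing is remembered between calls

    def handle(self, request: Request) -> dict:
        user = KNOWN_TOKENS.get(request.token)
        if user is None:
            return {"status": 401, "served_by": self.name}
        return {
            "status": 200,
            "served_by": self.name,
            "user": user,
            "items": len(request.cart),
        }


class RoundRobinBalancer:
    """Hands each incoming request to the next server in the pool."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self, request: Request) -> dict:
        return next(self._cycle).handle(request)


balancer = RoundRobinBalancer([Server("eu-1"), Server("us-1"), Server("ap-1")])
request = Request(token="token-abc", cart=("book", "lamp"))
for _ in range(3):
    print(balancer.route(request)["served_by"])  # eu-1, then us-1, then ap-1
```

The same request gets the same answer from all three servers, which is why servers can be added to or removed from the pool without the client noticing.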
III. Scalabilitas Opus Magnum: REST APIs
REST APIs function in a very similar way to common web transactions, as they also use HTTP; the difference is that the data exchange occurs between two software programs or products. No graphical user interface relays the result of the transaction, since the client is usually a software program that requires limited human interaction.
Referred to as the “glue” that connects modern apps, a well-designed API or “Application Programming Interface” helps a business stand out from its competitors and connect with partners in today’s digital economy.
API methods are called when you create a playlist on Spotify, look up a profile on Instagram or make a purchase via PayPal. In fact, the wave of excitement surrounding APIs is largely owing to how they enable developers to build services that easily integrate with other, more powerful services. When you’re redirected to Facebook or Google to log in to a third-party application, API calls use these web giants’ authentication servers (OAuth 2.0) and access tokens to verify identity without revealing credentials to the external application, providing a safe method and more seamless user experience.
APIs’ building block potential extends far beyond personal verification data to finance, mapping, billing, mobility, sports, travel, farming and so on, resulting in the “softwarization” of an endless stream of services. In 2016, tractor manufacturer John Deere opened its API, allowing farm management and construction machinery companies to maximize profits by integrating crucial data into their applications. Thanks to API-generated data, the coffee mogul Starbucks has one of the most successful rewards-based loyalty programs, with over 16 million members. The driving force behind the growth of APIs for revenue creation is mass migration to cloud-based systems, with both digitally native and offline brands transforming their business models by leveraging REST and RESTful services (RESTful implies following most but not all the constraints).
Do all APIs adhere to the REST standard?
Negative. An API can be any interface layer that makes one application able to interact with another, but not necessarily over the internet. REST and RESTful APIs follow a standardized set of guidelines and, in practice, run over HTTP. Basically, any process referred to as a “web service” can be considered an API, but not all APIs are web services. The advantage of adhering to the REST design pattern is that the constraints themselves, summarized below, make for greater flexibility and reliability.
1. Separation of Client and Server
Based on a crucial principle in software engineering named Separation of Concerns (SoC), components are designed and developed to be independent, so changes to one will not affect how the others operate.
2. Stateless
“No client context shall be stored on the server between requests”. When data related to the end-user (client context) is needed to carry out an authorized operation, it must be provided by the client in each request, making the server stateless.
3. Cacheable
Any response message from server to client must be labelled as cacheable or non-cacheable. If the data is cacheable and hasn’t changed since the last response, it can be reused by the client. Caching increases an application’s responsiveness by improving client-side performance and reducing load and latency on the server.
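One common way HTTP expresses this is a validator such as the ETag header, which the client sends back in If-None-Match so the server can answer 304 Not Modified instead of resending the body. The sketch below only simulates that exchange in memory; the OriginServer and CachingClient classes are invented for illustration, and the ETag is a simplified hash rather than a properly quoted header value.

```python
import hashlib


class OriginServer:
    """Toy server holding one resource and answering conditional GETs."""

    def __init__(self, body: str):
        self.body = body
        self.full_responses = 0  # counts how often the whole body was sent

    def _etag(self) -> str:
        # The validator changes whenever the resource content changes.
        return hashlib.sha256(self.body.encode("utf-8")).hexdigest()[:16]

    def get(self, if_none_match=None) -> dict:
        etag = self._etag()
        if if_none_match == etag:
            # The client's copy is still current, so no body is sent.
            return {"status": 304, "headers": {"ETag": etag}, "body": None}
        self.full_responses += 1
        # "no-cache" labels the response as storable but to be revalidated before reuse.
        headers = {"ETag": etag, "Cache-Control": "no-cache"}
        return {"status": 200, "headers": headers, "body": self.body}


class CachingClient:
    def __init__(self, server: OriginServer):
        self.server = server
        self.cached_etag = None
        self.cached_body = None

    def fetch(self) -> str:
        response = self.server.get(if_none_match=self.cached_etag)
        if response["status"] == 304:
            return self.cached_body  # reuse the stored representation
        self.cached_etag = response["headers"]["ETag"]
        self.cached_body = response["body"]
        return self.cached_body
```

Fetching twice in a row sends the full body only once; the second round trip carries just the validator, which is where the savings in load and latency come from.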
4. Layered system
Also borrowed from the Separation of Concerns principle, layers of intermediary services can be implemented to help serve a client-issued request for a resource’s state. The constraint establishes that each layer can only communicate with the layer closest to it. If an authentication layer and a load-balancing layer are injected between the client and the end server, the client is agnostic to what these layers are or do and connects only to the layer adjacent to it, which improves system scalability and security.
5. Code on Demand
This (optional) constraint covers the client’s ability to download and execute code that is returned from a server as an applet or script. Code on Demand temporarily extends a client’s functionality, but it’s not a mandatory feature for a web service to be considered RESTful.
6. Uniform Interface
The interface between client and server must be defined and designed to ensure that any machine trying to access data hosted on a server uses the same interface. To support achieving this constraint, the following sub-topics were included:
a. Resource-based: requests to the server define the solicited resource state by including URIs (Uniform Resource Identifiers).
b. Manipulation of resources through representations: responses from the server contain the necessary representation information of a resource to allow the client to change the resource state.
c. Self-descriptive messages: each request message must contain the exact information required to enable serving it, and the returned message must contain all the data, and respective metadata, needed to understand it.
d. Hypermedia as the Engine of Application State (HATEOAS): each response from the server should include the requested URI along with hyperlinks that inform the client of the options for changing the current state of the application.
The last topic deserves further attention as it’s perhaps the most convoluted and debated of Mr. Fielding’s standards. Once understood though, it powers a well-designed REST web service like Zeus’ lightning bolt.
IV. Do Believe the Hype(rmedia)
HATEOAS — Hypermedia As The Engine Of Application State — key constraint
The World Wide Web was conceived as a virtual state machine where websites and applications continuously pass from one state to the next. The path that an application state follows is relative to the resource state, so distinguishing between the two is key. Depending on the HTTP method sent in a client-issued request, a resource can be created, retrieved, updated or deleted. Known as CRUD operations, these actions correspond to the HTTP methods POST (create), GET (retrieve), PUT or PATCH (update) and DELETE (delete).
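The usual pairing is POST for create, GET for retrieve, PUT for update and DELETE for delete. As a rough sketch of how a server might act on it, the snippet below dispatches on the method name against an in-memory store. The /posts URIs, the PostStore class and the choice of status codes are assumptions made for this example, not something a particular framework prescribes.

```python
from itertools import count

# Conventional pairing of HTTP methods with CRUD operations.
CRUD_BY_METHOD = {
    "POST": "create",
    "GET": "retrieve",
    "PUT": "update",
    "DELETE": "delete",
}


class PostStore:
    """In-memory collection of blog posts addressed as /posts/<id> (hypothetical URIs)."""

    def __init__(self):
        self._posts = {}
        self._ids = count(1)

    @staticmethod
    def _parse_id(uri: str):
        prefix, _, tail = uri.rpartition("/")
        return int(tail) if prefix == "/posts" and tail.isdigit() else None

    def handle(self, method: str, uri: str, body=None):
        """Return a (status code, representation) pair for one request."""
        operation = CRUD_BY_METHOD.get(method)
        if operation is None:
            return 405, None  # method not supported by this sketch
        if operation == "create":
            if uri != "/posts":
                return 404, None
            post_id = next(self._ids)
            self._posts[post_id] = body
            return 201, {"uri": f"/posts/{post_id}", "post": body}
        post_id = self._parse_id(uri)
        if post_id not in self._posts:
            return 404, None
        if operation == "retrieve":
            return 200, self._posts[post_id]
        if operation == "update":
            self._posts[post_id] = body
            return 200, body
        del self._posts[post_id]  # the only remaining operation is delete
        return 204, None
```

A short session would be handle("POST", "/posts", {...}) to create a post, then GET, PUT and DELETE against the URI that the creation step returned.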
When a resource is modified on a server as the outcome of a CRUD operation, a different representation of that resource state is returned to the client, and the application state also transitions. Although the client context exists separately from the server-stored resource state, their respective transits are enmeshed. How hypermedia functions as the engine that determines as much is our next concern.
HATEOAS stipulates that the resource representation returned by the server must include a series of follow-up links in hypermedia format, along with standardized link relations. As mentioned before, hypermedia refers to media that interactively allows hyperlinking to other data sources, and comprises text URIs, audio, video and images. To simplify, let’s condense hypermedia into “clickable items” that allow users to navigate from page to page. In a web-browser-as-client scenario, this isn’t too hard to grasp. Applied to a REST API where the client is a software tool, the abstraction gets pretty darned abstract.
On its HATEOAS page Wikipedia uses a banking application sample response for an HTTP “GET account” request. The server-issued code itself (copied below) helps elucidate how the application state is determined through the actions afforded by “clickable items”.
Example of HATEOAS (from Wikipedia) — Banking App
The “account” resource representation incorporates hypermedia links with the options to make a deposit, withdrawal, transfer funds or close the account. These options are traversed by the user in the form of buttons, icons, hypertext and so on. In the response, not only is information being shipped (such as the current balance) but instructions for the resource’s next state are offered. The ensuing client-side action determines what happens to the resource on the server-side, triggering a subsequent shift in application state. Thus, the hypermedia sent in the response drives the application state and not vice-versa.
In a REST API software-to-software transaction, the process mimics human interaction with a web app, but it’s the REST client that uses server-provided hypermedia URI links to access the resources it needs.
“Hypertext does not need to be HTML on a browser. Machines can follow links when they understand the data format and relationship type”. Roy Fielding
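To see what “following links” might look like for a machine client, here is a minimal sketch. The two JSON bodies are hypothetical responses shaped like the banking example described above (the field names, account number and URIs are illustrative), and the assumption that every action is a POST is made only for this example.

```python
import json

# Hypothetical body returned for a GET on an account in good standing.
ACCOUNT_RESPONSE = json.dumps({
    "account": {
        "account_number": 12345,
        "balance": {"currency": "usd", "value": 100.00},
        "links": {
            "deposits": "/accounts/12345/deposits",
            "withdrawals": "/accounts/12345/withdrawals",
            "transfers": "/accounts/12345/transfers",
            "close-requests": "/accounts/12345/close-requests",
        },
    }
})

# Hypothetical body for the same account once overdrawn: fewer links are offered.
OVERDRAWN_RESPONSE = json.dumps({
    "account": {
        "account_number": 12345,
        "balance": {"currency": "usd", "value": -25.00},
        "links": {"deposits": "/accounts/12345/deposits"},
    }
})


def available_actions(response_body: str) -> dict[str, str]:
    """Map each link relation offered by the server to its URI."""
    return dict(json.loads(response_body)["account"]["links"])


def next_request(response_body: str, relation: str) -> tuple[str, str]:
    """Pick the next request from the links in the response, never from hard-coded URIs."""
    links = available_actions(response_body)
    if relation not in links:
        raise LookupError(f"server did not offer '{relation}' in this state")
    return "POST", links[relation]  # assumption: every action here is a POST
```

The client only needs to know the relation names (“deposits”, “withdrawals” and so on). Which of them are possible right now, and where to send the request, comes from the server’s response, so an overdrawn account simply stops advertising withdrawals.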
How will you benefit from knowing any of this?
The Geek Mythology Guide to REST APIs provides a basic intellectual framework for web APIs and the REST design pattern. With API-powered embedded financial services achieving skyrocketing valuation for new fintechs, and companies like Salesforce acquiring Mulesoft (an API management platform) for $6.5 billion, it’s no surprise that businesses everywhere are scrambling to implement an API strategy. Opening access to critical information enables customers to tailor their interactions with a product, while companies can also monitor API usage to better understand customer behavior.
At this rate, your not-too-far-into-the-future car is already using APIs to deliver automated updates on everything from insurance to mileage and repairs (while it drives itself). Anarchic scalability has forever transformed how we interact with the external world, so you can now pride yourself on understanding the technologies that made it possible.
Fun facts I purposefully left out:
- On top of authoring the REST design pattern, Roy Fielding co-authored the HTTP specification, co-founded the Apache HTTP Server Project and chaired the Apache Software Foundation, one of the largest open source organizations on the planet.
- The World Wide Web began in 1989 as a non-profit project at the European Organization for Nuclear Research (CERN). By August 1991 Sir Tim Berners-Lee and his CERN colleagues had invented HTML, HTTP, URIs, and the first web client and server. Within 5 years the internet expanded to 40 million users and its ability to scale became a matter of serious concern. In came Roy Fielding with the constraints that made Web history.
- HTTPS was developed to stop sensitive data being intercepted and compromised during web transactions. The added “S” stands for “secure”. HTTPS encryption and certificates issued by trusted authorities are now used by over 50% of websites worldwide, guarding against dangers such as phishing, internet service providers injecting extra advertising or tracking into the sites you visit, and governments obtaining confidential browsing activity.
- HTTP and HTTPS are considered limited protocols for IoT (Internet of Things) applications and other application-layer protocols have been developed as an alternative.
- Hermes was also the god of trade, wealth, luck, language, thieves, and travel, all facets of the internet’s potential. It was believed that while still a baby, he stole 50 cows from his half-brother Apollo.
Written by Mercedes Arias-Duval