HTTP - in-depth explanation

While there are voluminous books on HTTP, I have done my best to condense the important parts that every developer must understand about HTTP in one article.

Let's look at the HTTP versions used by some of the top websites in the world.

Open Chrome and press F12 for dev tools
Click on the Network tab
In the address bar type google.com (or any website) and press Enter

In the Protocol column, h3, h2, or http/1.1 denotes the HTTP version in use. If the Protocol column is not visible, right-click any column header in the request table and enable it.

Notice how these websites use different HTTP versions to serve different objects. Chances are you have never looked closely at this.

How Abstractions Shape Developer Perspectives on HTTP

It's fascinating how abstractions have shaped our perspective as developers, often leading us to primarily focus on surface-level aspects.

If you were to inquire about HTTP with an average developer, their response would likely revolve around the basic HTTP methods such as GET, POST, PUT, and so on.

Perhaps a few might mention HTTP headers, content-type, and a handful of other specifics, but overall, most developers wouldn't have extensively explored the intricacies of HTTP.

What are the benefits, you may ask?

By understanding and effectively utilizing the capabilities of HTTP, we can optimize and provide users with blazing-fast, interactive web experiences.

Here are a few benefits of a deeper understanding of HTTP...

Minimizing Round Trips
Efficient Resource Delivery
Caching for Performance
Prioritizing Resource Loading
Reducing Redirects

Let's begin with the basics of HTTP.

HTTP - Hypertext Transfer Protocol.

What began as a simple, one-line protocol for retrieving hypertext quickly evolved into a generic hypermedia transport, and now, more than three decades later, it can be used to power just about any use case you can imagine.

Evolution

Let's look at the evolution of HTTP over the years.

📆 1989 - Tim Berners-Lee proposes the World Wide Web at CERN, and work on HTTP begins.

📆 1991 - The first documented version of HTTP, later named HTTP/0.9, is published. HTTP/0.9 was a very simple protocol that only allowed clients to request documents from servers.

📆 1996 - HTTP/1.0 is published as RFC 1945. HTTP/1.0 added several new features, including version numbers, status codes, headers, and basic caching.

📆 1997 - HTTP/1.1 is released. HTTP/1.1 added several performance improvements, including support for persistent connections and pipelining.

📆 2015 - HTTP/2 is released. HTTP/2 is a major revision of HTTP that provides significant performance improvements over HTTP/1.1.

📆 2022 - HTTP/3 becomes a standard (RFC 9114) on June 6, 2022. It is designed to improve the performance and security of HTTP.

Building blocks of HTTP

Client/Server

A client and a server are the basic components of the Web.
An HTTP client (e.g., a browser) makes a request to an HTTP server. The server responds with data.

Resources

Be it HTML files, static files, images, videos, etc. - a web server responds with a resource.

Because the Internet hosts many thousands of different data types, HTTP carefully tags each object being transported through the Web with a data format label called a MIME type.

MIME type

MIME (Multipurpose Internet Mail Extensions) was originally designed for email and was later adopted by HTTP to describe and label its own multimedia content. A MIME type has the form type/subtype, such as text/html or image/png.
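
As a rough illustration, Python's standard `mimetypes` module maps file extensions to MIME types, much as a static file server might when labelling a response. The `mime_type_for` helper and the file names are made up for this example.

```python
import mimetypes


def mime_type_for(filename: str) -> str:
    """Guess the MIME type from the file extension, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # application/octet-stream is the conventional label for unknown binary data.
    return guessed or "application/octet-stream"


for name in ("index.html", "logo.png", "archive.unknownext"):
    print(name, "->", mime_type_for(name))
```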

How does the client request a resource?

It does this through the server resource name called a Uniform Resource Identifier, or URI.

URI

URIs are like postal addresses of the Internet, uniquely identifying and locating information resources around the world.

For example, the URI of a PNG image hosted on a CDN looks like this:

https://cdn.hashnode.com/res/hashnode/image/upload/v1684977294735/57562ef5-369b-4127-90b9-bd8ec6d92ea9.png

The most common form of a resource identifier is the Uniform Resource Locator or URL.

URL

URLs describe the specific location of a resource on a particular server.

Below are some of the different types of URLs
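
As a small sketch of what a URL contains, Python's standard `urllib.parse` can split one into its parts. The `describe_url` helper and the second URL are made up for this example; the first is the image URI shown earlier.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url: str) -> dict:
    """Split a URL into the parts a client needs to locate the resource."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # No explicit port in the URL means the scheme's default port.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",
        "query": parts.query,
    }


print(describe_url(
    "https://cdn.hashnode.com/res/hashnode/image/upload/"
    "v1684977294735/57562ef5-369b-4127-90b9-bd8ec6d92ea9.png"
))
print(describe_url("http://example.com:8080/search?q=http"))
```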

Transactions

An HTTP transaction consists of a request (sent from client to server), and a response (sent from the server back to the client).

There are several request methods. Each method tells the server the action to perform.

Some common HTTP methods.

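Method	Meaning
GET	Read a resource
POST	Submit data to the server, typically to create a new resource
PUT	Store data at the given URI, creating or replacing the resource (the inverse of GET)
PATCH	Partially update a resource
DELETE	Delete a resource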
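
To make the differences between these methods concrete, here is a toy in-memory sketch. `ToyResourceStore` is a hypothetical stand-in for a server, not a real framework, and the status codes it returns are the ones conventionally paired with each outcome.

```python
class ToyResourceStore:
    """In-memory stand-in for a server, keyed by resource path."""

    def __init__(self):
        self.resources = {}
        self.next_id = 1

    def handle(self, method, path, body=None):
        """Return (status_code, payload) for a request."""
        if method == "GET":
            if path in self.resources:
                return 200, self.resources[path]
            return 404, None
        if method == "POST":
            # The server chooses the new resource's URI under the collection.
            new_path = f"{path}/{self.next_id}"
            self.next_id += 1
            self.resources[new_path] = dict(body or {})
            return 201, new_path
        if method == "PUT":
            # Create or fully replace the resource at exactly this path.
            status = 200 if path in self.resources else 201
            self.resources[path] = dict(body or {})
            return status, self.resources[path]
        if method == "PATCH":
            if path not in self.resources:
                return 404, None
            self.resources[path].update(body or {})  # partial update
            return 200, self.resources[path]
        if method == "DELETE":
            if self.resources.pop(path, None) is None:
                return 404, None
            return 204, None
        return 405, None  # method not allowed


store = ToyResourceStore()
status, location = store.handle("POST", "/articles", {"title": "HTTP", "draft": True})
print(status, location)
print(store.handle("PATCH", location, {"draft": False}))
print(store.handle("PUT", location, {"title": "HTTP in depth"}))
print(store.handle("DELETE", location))
print(store.handle("GET", location))
```

Note how PATCH keeps the fields it was not given, while PUT replaces the whole resource.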

Note that a connection between the client and server must be established before any of these HTTP methods can be used.

Connections

Before an HTTP client can send a message to a server, it needs to establish a TCP or QUIC connection (more on QUIC below) with the server using Internet Protocol (IP) addresses and port numbers.

To make a connection, we need the IP address of the server computer and the port number associated with the specific software program running on the server.

The following steps show how a browser uses HTTP to connect to a remote server and display simple HTML...

The browser extracts the server’s hostname from the URL
The browser converts the server’s hostname into the server’s IP address
The browser extracts the port number (if any) from the URL
The browser establishes a TCP connection with the web server
The browser sends an HTTP request message to the server
The server sends an HTTP response back to the browser
The connection is closed (or kept open for reuse by later requests), and the browser displays the document
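
The steps above can be sketched with Python's standard `socket` module. This is a minimal, plain-HTTP-only illustration, not how a real browser is built; the helper names `split_target`, `build_request`, and `fetch` are made up for this example.

```python
import socket
from urllib.parse import urlsplit


def split_target(url: str):
    """Steps 1 and 3: pull the host, port, and path out of the URL."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port, parts.path or "/"


def build_request(host: str, path: str) -> bytes:
    """Step 5: a minimal HTTP/1.1 GET request message."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close after one response
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(url: str) -> bytes:
    """Plain HTTP only; an https URL would also need a TLS wrapper (ssl module)."""
    host, port, path = split_target(url)
    ip_address = socket.gethostbyname(host)  # step 2: hostname to IP address
    with socket.create_connection((ip_address, port), timeout=10) as conn:  # step 4
        conn.sendall(build_request(host, path))  # step 5: send the request
        chunks = []
        while True:  # step 6: read the response until the server closes
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)  # step 7: the connection is closed on leaving the block
```

Calling `fetch("http://example.com/")` on a machine with network access should return the raw response bytes, status line and headers included.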

Types of Headers

There are several types of headers in HTTP, each serving a specific purpose. Here are some commonly used header types:

General Headers: These headers apply to both requests and responses and provide general information about the message. Examples include:
- Cache-Control: Specifies caching directives for both the client and server.
- Connection: Indicates whether the connection should be kept alive after the request/response.
- Date: Represents the date and time when the message originated.
Request Headers: These headers are sent by the client as part of an HTTP request to provide information about the request or the client itself. Examples include:
- Host: Specifies the target host and port number of the server.
- User-Agent: Identifies the client application, typically a web browser.
- Accept: Informs the server about the acceptable content types for the response.
Response Headers: These headers are sent by the server as part of an HTTP response and provide additional information about the response. Examples include:
- Content-Type: Specifies the media type of the response content.
- Server: Indicates the server software used to handle the request.
- Set-Cookie: Sets a cookie value to be stored on the client for future requests.
Entity Headers: These headers apply to the message body or entity in both requests and responses. Examples include:
- Content-Length: Specifies the length of the message body in bytes.
- Content-Encoding: Indicates the encoding applied to the entity, such as gzip or deflate.
- Content-Language: Specifies the natural language of the content.
Security Headers: These headers are used to enhance security in the context of web applications. Examples include:
- Strict-Transport-Security: Enforces the use of secure connections (HTTPS).
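- X-Content-Type-Options: Tells browsers not to guess (sniff) a content type that differs from the declared Content-Type.
- X-XSS-Protection: Controlled the XSS filter built into older browsers. Modern browsers have dropped it in favor of Content-Security-Policy.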

Headers play a crucial role in communicating additional information between clients and servers, enabling enhanced functionality and security in web applications.
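
As a small sketch, the header block of a response can be parsed with Python's standard `http.client.parse_headers` and grouped into the categories listed above. The sample header values and the `categorize` helper are made up for this example.

```python
import http.client
import io

# A made-up response header block (the status line is not part of it).
RAW_HEADERS = (
    b"Date: Mon, 05 Jun 2023 10:00:00 GMT\r\n"
    b"Cache-Control: max-age=3600\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 1256\r\n"
    b"Server: demo-server\r\n"
    b"Strict-Transport-Security: max-age=31536000\r\n"
    b"\r\n"
)

CATEGORIES = {
    "general": {"cache-control", "connection", "date"},
    "request": {"host", "user-agent", "accept"},
    "response": {"content-type", "server", "set-cookie"},
    "entity": {"content-length", "content-encoding", "content-language"},
    "security": {"strict-transport-security", "x-content-type-options", "x-xss-protection"},
}


def categorize(headers):
    """Group header names by the categories used in this article."""
    grouped = {}
    for name in headers.keys():
        category = next(
            (c for c, names in CATEGORIES.items() if name.lower() in names),
            "other",
        )
        grouped.setdefault(category, []).append(name)
    return grouped


headers = http.client.parse_headers(io.BytesIO(RAW_HEADERS))
print(headers["content-type"])      # header lookup is case-insensitive
print(headers.get_content_type())   # the MIME type without its parameters
print(categorize(headers))
```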

Issues with HTTP/1.1

Head-of-Line blocking

Head of Line blocking (HOL blocking) is a performance-limiting phenomenon that occurs in HTTP/1.1 when a client sends multiple requests to a server over a single TCP connection.

Each request is queued behind the previous one, and the server must send its responses back in the same order, one at a time.

This means that if the first request in the queue takes a long time to complete, all of the other requests will have to wait.

For example, imagine that you are trying to load a web page that contains images. Your browser will send multiple HTTP requests to the server, one for each image.

If the first request takes a long time to complete, because the image is large or the server is overloaded, then the other requests will have to wait. This can lead to a noticeable delay in loading the web page.
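
A back-of-the-envelope sketch of the effect: if responses must come back in request order, each one finishes only after everything ahead of it. The durations are invented, and the "independent" case is an idealized comparison that ignores shared bandwidth.

```python
from itertools import accumulate


def completion_times_in_order(durations_ms):
    """Responses are delivered one after another, in request order."""
    return list(accumulate(durations_ms))


def completion_times_independent(durations_ms):
    """Idealized case: no response has to wait for another."""
    return list(durations_ms)


# One slow image followed by two small ones (milliseconds, made up).
durations = [3000, 100, 100]
print(completion_times_in_order(durations))
print(completion_times_independent(durations))
```

In the ordered case the two small images wait behind the slow one, even though each needs only 100 ms of work.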

Congestion Window

The congestion window is the sender's limit on how much unacknowledged data can be in flight on a TCP connection at any moment.

When a new connection is opened, the congestion window starts at a small value. As acknowledgments arrive without packet loss, the window is gradually increased, a phase known as slow start.

The problem with this approach is that HTTP/1.1 clients work around head-of-line blocking by opening several connections to the same server, and each new connection starts with a small window. Many web responses are short, so a transfer often finishes before its window has grown enough to use the available bandwidth.

Competing connections can also congest the network, which can slow down requests or even prevent them from completing.
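
A simplified model shows why a small starting window costs round trips. It assumes an initial window of 10 segments, 1460-byte segments, a window that doubles every round trip, and no packet loss; all of these are assumptions for the sketch, and real TCP stacks behave in more nuanced ways.

```python
import math


def round_trips_to_send(size_bytes, initial_cwnd_segments=10, mss_bytes=1460):
    """Count round trips needed to deliver size_bytes under idealized slow start."""
    segments_left = math.ceil(size_bytes / mss_bytes)
    cwnd = initial_cwnd_segments
    round_trips = 0
    while segments_left > 0:
        segments_left -= cwnd  # send one full window, then wait for the ACKs
        cwnd *= 2              # the window doubles each round trip
        round_trips += 1
    return round_trips


for size in (14_600, 100_000, 1_000_000):
    print(size, "bytes ->", round_trips_to_send(size), "round trips")
```

Every fresh connection starts this ramp from the beginning, which is one reason reusing a single warm connection tends to beat opening several cold ones.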

No Compressed Headers

HTTP/1.1 does not support header compression, which can lead to large request sizes and slow page load times. This is especially true on low-bandwidth or congested networks.

HTTP/2 key features and how they solved HTTP/1.1 problems

Multiplexing

One of the major improvements in HTTP/2 is multiplexing, which allows multiple requests and responses to be sent and received concurrently over a single TCP connection.

In HTTP/1.1, only one request/response could be processed at a time per connection, leading to head-of-line blocking where slower resources delayed others.

With multiplexing, HTTP/2 eliminates head-of-line blocking at the HTTP level, enabling parallel and efficient resource fetching, resulting in improved performance and reduced latency.
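
The idea can be sketched as a toy: each response is cut into frames tagged with a stream ID, frames from different streams are interleaved on one connection, and the receiver reassembles them per stream. The round-robin order and the tiny frame size are assumptions for illustration; real HTTP/2 scheduling is up to the implementation.

```python
from itertools import zip_longest


def to_frames(stream_id, payload, frame_size):
    """Cut one stream's payload into (stream_id, chunk) frames."""
    return [
        (stream_id, payload[i:i + frame_size])
        for i in range(0, len(payload), frame_size)
    ]


def interleave(streams, frame_size=4):
    """Round-robin frames from all streams onto a single connection."""
    per_stream = [to_frames(sid, data, frame_size) for sid, data in streams.items()]
    wire = []
    for round_of_frames in zip_longest(*per_stream):
        wire.extend(frame for frame in round_of_frames if frame is not None)
    return wire


def reassemble(wire):
    """Receiver side: rebuild each stream from its frames."""
    received = {}
    for stream_id, chunk in wire:
        received[stream_id] = received.get(stream_id, b"") + chunk
    return received


streams = {1: b"AAAABBBBCCCC", 3: b"xxxxyy"}
wire = interleave(streams)
print(wire)
print(reassemble(wire) == streams)
```

The short stream 3 finishes before the longer stream 1, instead of waiting behind it.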

Binary Protocol

HTTP/2 uses a binary protocol instead of the text-based protocol used by HTTP/1.1.

The binary framing enables more efficient parsing and serialization of data, reducing overhead and improving overall performance.
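
To give a feel for the binary format: every HTTP/2 frame starts with a fixed 9-byte header holding a 24-bit payload length, an 8-bit type, 8 bits of flags, and a 31-bit stream identifier. The sketch below packs and unpacks that header with Python's `struct` module; the function names are made up for this example.

```python
import struct

FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x4: "SETTINGS"}
END_STREAM = 0x1  # flag: last frame the sender will send on this stream


def pack_frame_header(length, frame_type, flags, stream_id):
    """Build the 9-byte header that precedes every HTTP/2 frame payload."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    # Pack the length as 32 bits and drop the leading byte to get 24 bits.
    length_bytes = struct.pack(">I", length)[1:]
    # The top bit of the stream identifier field is reserved and sent as 0.
    return length_bytes + struct.pack(">BBI", frame_type, flags, stream_id & 0x7FFFFFFF)


def unpack_frame_header(header):
    """Parse a 9-byte frame header into (length, type, flags, stream_id)."""
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", header[3:9])
    return length, frame_type, flags, stream_id & 0x7FFFFFFF


header = pack_frame_header(length=5, frame_type=0x0, flags=END_STREAM, stream_id=1)
print(header.hex())
length, frame_type, flags, stream_id = unpack_frame_header(header)
print(length, FRAME_TYPES[frame_type], flags, stream_id)
```

Because every field sits at a fixed offset, a parser never has to scan for line endings or whitespace as it does with HTTP/1.1 text.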

Header Compression

HTTP/2 introduces header compression using the HPACK algorithm.

With header compression, HTTP/2 reduces the size of headers, leading to reduced bandwidth consumption and faster communication between the client and server.
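
The core idea can be shown with a toy: both peers remember the headers they have already exchanged, so a repeated header is sent as a small index instead of full text. This is a deliberately simplified sketch, not real HPACK, which also has a predefined static table, Huffman coding, and a size-bounded dynamic table. `ToyHeaderTable` is a made-up name.

```python
class ToyHeaderTable:
    """Shared memory of (name, value) pairs; one instance per peer."""

    def __init__(self):
        self.entries = []

    def encode(self, headers):
        """Replace already-seen headers with their table index."""
        encoded = []
        for pair in headers:
            if pair in self.entries:
                encoded.append(self.entries.index(pair))  # small integer
            else:
                self.entries.append(pair)
                encoded.append(pair)  # literal, sent in full once
        return encoded

    def decode(self, encoded):
        """Rebuild the header list, learning new literals along the way."""
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self.entries[item])
            else:
                self.entries.append(item)
                headers.append(item)
        return headers


request = [(":method", "GET"), (":path", "/"), ("user-agent", "demo/1.0")]
sender, receiver = ToyHeaderTable(), ToyHeaderTable()

first = sender.encode(request)   # all literals
second = sender.encode(request)  # all indices
print(first)
print(second)
print(receiver.decode(first) == request, receiver.decode(second) == request)
```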

Server Push

HTTP/2 introduces server push, where the server can proactively push resources to the client without waiting for explicit requests.

This feature allows the server to anticipate the client's needs and send relevant resources, eliminating the need for additional round trips.

It helps reduce latency and speeds up page loading times by reducing the number of requests required.

Stream Prioritization

HTTP/2 allows stream prioritization, where resources can be assigned different priorities.

This feature enables the client and server to communicate the importance of resources, ensuring that critical resources are given priority and delivered promptly, improving the overall user experience.
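
One way to picture weights: streams that compete for the same connection share its capacity in proportion to their weights. The sketch below is a simplified model that ignores stream dependencies; the resource names, weights, and capacity figure are made up.

```python
def bandwidth_shares(weights, capacity_kbps):
    """Split capacity among streams in proportion to their weights."""
    total_weight = sum(weights.values())
    return {
        name: capacity_kbps * weight / total_weight
        for name, weight in weights.items()
    }


# Render-blocking CSS gets twice the weight of the script and the image.
weights = {"style.css": 200, "app.js": 100, "hero.jpg": 100}
print(bandwidth_shares(weights, capacity_kbps=1000))
```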

Limitations of HTTP/2

Case Study: How Lucidchart found out the hard way why turning on HTTP/2 was a mistake

When Lucidchart enabled HTTP/2 on the load balancers for some of their services, they immediately noticed that the load balancers had higher CPU load and slower response times.

The overall volume of incoming traffic was the same as usual, but the requests were arriving in spiky bursts.

The reason? HTTP/2 multiplexing

Multiplexing substantially increased the strain on their servers.

First, they received requests in large batches instead of smaller, more spread-out ones. Second, because HTTP/2 sent the requests together, their start times were closer together, which meant that a slowdown was likely to make them all time out at once.
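
A toy model of the burst effect, under stated assumptions: a browser on HTTP/1.1 keeps at most 6 requests in flight per host, every request takes the same time, and HTTP/2 sends them all at once. The numbers are invented and do not come from Lucidchart's data.

```python
from collections import Counter


def start_times_ms(request_count, max_in_flight, service_ms):
    """When each request starts if only max_in_flight can run at once."""
    return [(i // max_in_flight) * service_ms for i in range(request_count)]


def peak_simultaneous_starts(start_times):
    """Largest number of requests that begin at the same instant."""
    return max(Counter(start_times).values())


http1 = start_times_ms(request_count=12, max_in_flight=6, service_ms=100)
http2 = start_times_ms(request_count=12, max_in_flight=12, service_ms=100)
print(http1, peak_simultaneous_starts(http1))
print(http2, peak_simultaneous_starts(http2))
```

The total work is identical, but under HTTP/2 the backend sees twice as many requests begin at the same moment.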

Implementing server push

The server push feature in HTTP/2 can be challenging for developers to implement and integrate into existing applications. It requires careful consideration and implementation to ensure optimal usage and avoid potential issues.

TCP-level blocking

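While HTTP/2 addressed the head-of-line blocking problem present in HTTP/1.1 at the HTTP level, there can still be blocking at the TCP level. All streams share one TCP connection, and TCP delivers bytes strictly in order, so a single lost packet stalls every stream until it is retransmitted.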

Protocol Ossification

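Ossification is the progressive loss of flexibility in a protocol as network devices such as firewalls, NATs, and proxies come to depend on how it looks on the wire. Because many of these devices are configured to accept only TCP or UDP traffic in familiar forms, it is hard to change TCP itself or to deploy a new transport protocol. This limits how far the transport underneath HTTP/2 can be improved in certain network environments.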

HTTP/3 key features and how they solved HTTP/2 problems

Improved Performance

One of the key features of HTTP/3 is its improved performance, specifically in terms of reducing latency.

It achieves this through the use of the QUIC transport protocol (originally short for Quick UDP Internet Connections), which is built on top of UDP (User Datagram Protocol) instead of the TCP (Transmission Control Protocol) used by HTTP/2.

TCP suffers from head-of-line blocking: because it delivers bytes strictly in order, the delay or loss of a single packet blocks the delivery of all subsequent packets, resulting in increased latency.

With QUIC, HTTP/3 mitigates this issue by making each stream independent, so a lost packet delays only the streams whose data it carried, reducing latency and improving overall performance.
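
The difference can be sketched with a toy model of one lost packet. Each packet is tagged with the stream it belongs to; the functions return which packets the application can read before the lost one is retransmitted. The function names and packet layout are made up, and real QUIC packets can carry data from several streams.

```python
def deliverable_tcp(packets, lost_index):
    """TCP hands bytes over strictly in order: everything after the gap waits."""
    return packets[:lost_index]


def deliverable_quic(packets, lost_index):
    """QUIC orders data per stream: only the lost packet's stream waits."""
    lost_stream = packets[lost_index][0]
    return [
        packet
        for i, packet in enumerate(packets)
        if i < lost_index or (i > lost_index and packet[0] != lost_stream)
    ]


# (stream, sequence number within that stream), in the order sent.
packets = [("A", 0), ("B", 0), ("A", 1), ("C", 0), ("B", 1)]
lost = 1  # the first packet of stream B is lost

print(deliverable_tcp(packets, lost))
print(deliverable_quic(packets, lost))
```

With TCP, streams A and C are held up by a loss that only concerns stream B; with QUIC they keep flowing.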

Enhanced Security

HTTP/3 incorporates built-in encryption as a standard feature: QUIC integrates TLS 1.3, so all communication is encrypted by default. The HTTP/2 specification, by contrast, permits unencrypted connections, although browsers only use HTTP/2 over TLS.

HTTP/1.1 vs HTTP/2 vs HTTP/3

Feature	HTTP/1.1	HTTP/2	HTTP/3
Protocol	Text-based protocol	Binary protocol	Binary protocol
Multiplexing	Not supported	Supported (multiple requests/responses are sent over a single connection simultaneously)	Supported (streams are independent at the transport level, so a lost packet does not stall other streams)
Header Compression	Not supported	Supported (headers are compressed to reduce overhead)	Supported (compression adapted to streams that can arrive out of order)
Flow Control	Not at the HTTP level (relies on TCP flow control)	Supported (per stream and per connection)	Supported (per stream and per connection, provided by QUIC)
Server Push	Not supported	Supported (allows the server to push additional resources to the client without waiting for a request)	Supported (same concept, carried over QUIC streams)
Header Compression Mechanism	None (headers are sent uncompressed)	HPACK (Header Compression for HTTP/2)	QPACK (Header Compression for HTTP/3)
Request/Response Prioritization	Not supported	Supported (priorities are expressed through stream weights and dependencies)	Supported (a simpler scheme based on urgency levels, defined separately in RFC 9218)
Stream Dependencies	Not supported	Supported (allows establishing dependencies between streams)	Not supported (HTTP/3 dropped the HTTP/2 dependency tree)
Connection Reuse	Limited reuse of TCP connections	Multiplexing allows efficient connection reuse	Multiplexing over a QUIC connection, which can also survive a change of network
Server Push Cancellation	Not supported	Supported (allows canceling pushed resources if they are no longer needed)	Supported (via a dedicated CANCEL_PUSH frame)
Security	No mandatory encryption (HTTPS optional)	Encryption optional in the specification, but browsers require HTTPS	Always encrypted (TLS 1.3 is built into QUIC)
Latency	Higher latency due to sequential request/response handling	Lower latency due to multiplexing and compressed headers	Further reduced latency due to faster connection setup and no transport-level head-of-line blocking

HTTP/2+ - why is it still not fully adopted?

Despite the obvious benefits, HTTP/2+ has not found 100% adoption yet. Some reasons could be...

Compatibility and Support

HTTP/2 and HTTP/3 have gained significant support from major web browsers and servers. However, there are still legacy systems, outdated browsers, or network infrastructure that do not fully support these newer protocols. This lack of compatibility can hinder the widespread adoption of HTTP/2 and HTTP/3.

Migration Complexity

Transitioning from HTTP/1.1 to HTTP/2 or HTTP/3 can be complex and require infrastructure upgrades, configuration changes, and updates to server and client software. The complexity and effort involved in the migration process can slow down adoption, particularly for organizations with large and complex systems.

Network Infrastructure Limitations

Some network infrastructure, particularly in enterprise or legacy environments, may have limitations or restrictions that make it challenging to fully adopt HTTP/2 or HTTP/3. This can include firewalls, proxies, or network configurations that do not support the necessary protocols or encryption mechanisms required by these newer versions.

Performance Trade-offs

While HTTP/2 and HTTP/3 offer significant performance improvements over HTTP/1.1, there can be trade-offs in certain scenarios. For example, the increased resource consumption in HTTP/2 due to multiplexing may pose challenges for resource-constrained servers. Additionally, the encryption overhead in HTTP/2 and HTTP/3 can introduce additional processing requirements for both servers and clients.

Slow Industry Standardization

The process of standardization and widespread adoption of new protocols takes time. HTTP/2 and HTTP/3 have undergone rigorous standardization processes, but it still takes time for organizations, developers, and vendors to implement and support these newer protocols.

Conclusion

Having a strong understanding of HTTP is crucial for developers. By grasping its building blocks and staying updated with the advancements in protocols such as HTTP/2 and HTTP/3, developers can optimize web applications for better performance, security, and user experience.

A deep knowledge of HTTP also enables effective troubleshooting, caching, compression, and encryption, leading to high-performing and reliable applications. With this knowledge, developers can unlock the full potential of the web and deliver seamless user experiences while driving innovation in the digital world.
