Next level customer experience with HTTP/3 traffic engineering

HTTP/3 addresses key challenges such as latency reduction, concurrent access, and low-latency content delivery.

Dmitry Kolesnikov

Senior Principal Engineer

TL;DR: The industry has reached consensus on HTTP/3 as the best technical solution for improving the Web protocol stack. Usage statistics indicate that 29.8% of websites worldwide have already embraced HTTP/3 to serve their users, with Zalando being among them. The architecture of HTTP/3, coupled with the underlying QUIC transport, brings concurrent access and low-latency capabilities to solutions, enabled by user-space flow and congestion control operating over the User Datagram Protocol (UDP). QUIC is used by 8.0% of all websites. The result is an enhanced customer experience that fundamentally transforms content consumption, promising visually stunning displays on customers' mobile screens. This post delves into the intricacies of HTTP/3 traffic engineering, Zalando's experience with it and our vision for the next steps.

The significance of HTTP/3 adoption

Nowadays, 85% of total Internet traffic is TCP traffic. HTTP traffic accounts for about 54.6%, and 54.4% of it is traffic to mobile devices. TCP was developed in the 1970s to build reliable client/server communication. The TCP-based family of Web protocols, specifically HTTP/1.0, HTTP/1.1 and HTTP/2, inherits the legacy TCP inefficiencies when it comes to building concurrent and low-latency Web applications on wireless networks. Looking in depth at the protocol stack involved in end-to-end communication, there are issues in (1) network infrastructure utilisation and (2) protocol design:

Issues with Utilisation of IP Network

(1) Issues with Utilisation of IP Network: The Internet comprises a heterogeneous mix of packet-switched networks, including ISP Access Networks, ISP Core Networks, and numerous Tier 1/2/3 telecom carriers. For European customers connecting to load balancers deployed in the eu-central-1 region, packets traverse about 15 hops. Each hop introduces a blend of processing and waiting times and the inherent risk of packet loss or network congestion, particularly when nodes or links are strained beyond capacity. The architecture of the access network, including its physical medium and the transmission delays it incurs, further compounds these challenges. Finally, the saturated capacity of the radio spectrum used for communication within the access network adds another layer of complexity.

Issues with Protocol Design

(2) Issues with Protocol design: Recent development of the Web protocol stack has brought several notable improvements, foremost among them a reduction of the excessive signalling and handshakes that upper-layer protocols need to negotiate communication parameters before payload transfer. Despite this, each "cold" HTTP/2 request still requires approximately 5 to 6 round-trips, including 1xDNS, 1xTCP, 3xTLS, and 1xHTTP handshakes, which adds up to significant network signalling overhead. Moreover, TCP, functioning as a single ordered stream of bytes, lacks concurrent multiplexing capabilities for application traffic over the transport layer. Consequently, any networking failure, such as packet loss or congestion, blocks the entire byte stream, hindering performance and responsiveness. Existing transport congestion control algorithms often fail to optimise network bandwidth utilisation, leading to suboptimal performance and efficiency. Additionally, poorly designed protocols cause fragmentation and reassembly, which is necessary for packets to traverse links with a Maximum Transmission Unit (MTU) smaller than the original packet size. Fragmentation increases the likelihood of excessive retransmissions in the event of packet loss, because losing any single fragment loses the whole packet, further impeding network efficiency and reliability.
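
The effect of fragmentation on loss can be illustrated with a small calculation. The sketch below is not from the original measurements: the 576-byte MTU, the 2% loss rate and the assumption that fragment losses are independent are all hypothetical values chosen for illustration.

```python
import math

IPV4_HEADER_BYTES = 20  # IPv4 header without options


def fragments_needed(ip_payload_bytes: int, link_mtu: int) -> int:
    """Number of IPv4 fragments needed to carry an IP payload over a link."""
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = ((link_mtu - IPV4_HEADER_BYTES) // 8) * 8
    return math.ceil(ip_payload_bytes / per_fragment)


def delivery_probability(fragments: int, loss_rate: float) -> float:
    """Probability that every fragment arrives, assuming independent losses."""
    return (1.0 - loss_rate) ** fragments


# Hypothetical example: a 1480-byte IP payload (a full 1500-byte packet)
# crossing a link with a 576-byte MTU and an assumed 2% loss rate.
count = fragments_needed(1480, 576)
print(count, delivery_probability(count, 0.02))
```

Under these assumptions the packet is split into three fragments, so the chance of losing it (and triggering a retransmission of the whole packet) is roughly three times higher than for an unfragmented packet.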

The industry has shown that customers love fast experiences in both applications and web sites. About 70% of mobile app users will stop using an app if it takes too long to load. Slow pages have a higher bounce rate, and site speed is considered a ranking signal for search. A fast site makes for a good user experience, which helps improve rankings and brings in visitors, keeps them on your site and ultimately leads to more conversions.

Knowing these issues, we assume that the first group of factors, related to network infrastructure, will remain unchanged in the near future (3 to 5 years), because infrastructure improvements are driven by economics. Only remediation of the second group of factors, related to protocol design, can bring about a significant improvement in customer experience. We also assume that mobile device replacement is cyclical, with longer or shorter cycles depending on the country and economic situation, but certain to happen.

HTTP/3 has gathered consensus as the best technical solution to the second group of problems related to protocol design at this time.

What enhancements does HTTP/3 bring?

In the past, the industry has made multiple attempts at improving protocol design through Structured Stream Transport (SST), Stream Control Transmission Protocol (SCTP), Multipath TCP (MP-TCP) and kernel-less TCP/IP implementations (e.g. uIP and lwIP). None of these became widely adopted because they focused on the transport layer only and ignored the end-to-end Web perspective. In June 2022, the IETF published HTTP/3 as a Proposed Standard. It is built over a new protocol called QUIC (standardised in May 2021).

QUIC is a transport layer network protocol. In contrast to TCP, it implements flow and congestion control in user space on top of the User Datagram Protocol (UDP). Its architecture is built on the principle of cooperation between protocols rather than strict OSI layering. The protocol brings improvements in the following areas:

HTTP/3 Improvements

Multiplexing: TCP is a single stream that guarantees strict ordering of bytes. Any concurrency requires multiplexing over that single stream. Network conditions (e.g. packet loss, congestion) turn the TCP stream into a bottleneck that blocks all senders and receivers on it. QUIC multiplexes streams over UDP datagrams; each stream is ordered independently and has its own flow control, while congestion control is applied to the connection as a whole. QUIC also controls the fragmentation and packetisation of the payload, producing optimally sized network datagrams.
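
The difference between the two ordering models can be shown with a toy simulation. This is a simplified sketch, not a model of a real TCP or QUIC stack: the stream names and arrival times are hypothetical, and each packet is assumed to carry data for exactly one stream.

```python
def delivery_times(packet_streams, arrival_times, per_stream_ordering):
    """Return the time at which each packet's data reaches the application.

    packet_streams: stream id of each packet, in send order.
    arrival_times: time at which each packet (or its retransmission) arrives.
    per_stream_ordering: False models one ordered byte stream (TCP-like),
    True models independently ordered streams (QUIC-like).
    """
    latest = {}  # ordering domain -> latest arrival time seen so far
    result = []
    for stream, arrived in zip(packet_streams, arrival_times):
        domain = stream if per_stream_ordering else "connection"
        # Data is released only once everything before it in the same
        # ordering domain has arrived.
        latest[domain] = max(latest.get(domain, 0), arrived)
        result.append(latest[domain])
    return result


# Hypothetical trace: the second packet (stream B) is lost and its
# retransmission only arrives at t=9.
streams = ["A", "B", "C", "A", "B", "C"]
arrivals = [1, 9, 3, 4, 5, 6]

print(delivery_times(streams, arrivals, per_stream_ordering=False))
print(delivery_times(streams, arrivals, per_stream_ordering=True))
```

With a single ordered stream every packet sent after the lost one waits until t=9, whereas with per-stream ordering only stream B is delayed.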

Handshake: Each “cold” HTTP/2 request demands about 5 to 6 round-trips (1xDNS, 1xTCP, 3xTLS, 1xHTTP). HTTP/3 requires 3 round-trips (1xDNS, 1xQUIC, 1xHTTP). The QUIC handshake combines the negotiation of cryptographic and transport parameters. It is structured to permit the exchange of application data as soon as possible, so that the actual waiting time is a single round-trip. Peers establish a single QUIC connection that multiplexes a large number of parallel streams. The handshake is required only once; setting up a stream is an instant operation that does not require any additional handshake.
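
The round-trip counts above translate directly into waiting time. The following sketch uses the counts from this post; the 50 ms round-trip time is a hypothetical value, and the model assumes, as a simplification, that every round-trip (including DNS) costs the same.

```python
# Round-trips for a "cold" request, as listed in the text above.
COLD_HTTP2 = {"DNS": 1, "TCP": 1, "TLS": 3, "HTTP": 1}
COLD_HTTP3 = {"DNS": 1, "QUIC": 1, "HTTP": 1}


def time_to_first_response_ms(round_trips, rtt_ms):
    """Total waiting time if every round-trip costs the same RTT."""
    return sum(round_trips.values()) * rtt_ms


rtt_ms = 50  # hypothetical mobile round-trip time
print(time_to_first_response_ms(COLD_HTTP2, rtt_ms))
print(time_to_first_response_ms(COLD_HTTP3, rtt_ms))
```

Under this simplified model the cold start shrinks from 300 ms to 150 ms, and the gap grows linearly with the round-trip time.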

TLS: The traditional layered architecture isolates the security and transport layers, causing significant overhead when negotiating encryption keys and transmitting encrypted data. Customers perceive a bad experience when the chain of TLS certificates exceeds 4KB and TLS records are fragmented across multiple packets. QUIC adopts TLS 1.3 by default and integrates the security protocol into the transport, encrypting each individual packet.

Congestion: QUIC provides an open architecture for congestion control, whereas TCP implements it on the kernel side of the operating system. QUIC does not aim to standardise congestion control algorithms. Instead, it provides generic signals for congestion control, and the sender is free to implement its own mechanisms. As a benefit, the sender can align the payload to the actual size of the congestion window. The drawback is a performance inefficiency, because the user-space implementation involves copying extra packet data from kernel memory to user memory, so research on improving that efficiency is key.

Handover: QUIC connections are not strictly bound to a single network path. The protocol supports transferring a connection to a new network path, ensuring a low-latency experience when consumers switch from mobile to WiFi. With TCP-based HTTP, such a switch always requires a “cold” start.
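
One way to picture why handover works is to compare how a server finds the session for an incoming packet. The sketch below is a conceptual illustration with hypothetical class names and addresses; it leaves out everything a real QUIC stack does during migration, such as validating the new path.

```python
class TupleKeyedSessions:
    """TCP-like: a session is identified by the address 4-tuple."""

    def __init__(self):
        self._sessions = {}

    def open(self, four_tuple, state):
        self._sessions[four_tuple] = state

    def lookup(self, four_tuple):
        return self._sessions.get(four_tuple)


class ConnectionIdSessions:
    """QUIC-like: a session is identified by a connection ID in each packet."""

    def __init__(self):
        self._sessions = {}

    def open(self, connection_id, state):
        self._sessions[connection_id] = state

    def lookup(self, connection_id):
        # The client's current address plays no role in finding the session.
        return self._sessions.get(connection_id)


# Hypothetical addresses: the client moves from a mobile network to WiFi.
mobile = ("10.0.0.7", 50000, "192.0.2.1", 443)
wifi = ("172.16.0.9", 50001, "192.0.2.1", 443)

tcp_like = TupleKeyedSessions()
tcp_like.open(mobile, "established")
print(tcp_like.lookup(wifi))  # None: the session must be set up again

quic_like = ConnectionIdSessions()
quic_like.open("c1d2", "established")
print(quic_like.lookup("c1d2"))  # found regardless of the client address
```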

Outstanding HTTP/3 protocol challenges

QUIC has emerged as a serious alternative to TCP in the Web domain. Unfortunately, QUIC and HTTP/3 are not a “silver bullet” for concurrency and low latency. Open issues remain that engineers need to consider during application development:

Multiplexing: Stream frames are multiplexed into QUIC packets, which are in turn coalesced into a single UDP datagram. Congestion or loss of datagrams therefore has an effect similar to that on TCP for every stream whose frames the datagram carried. The application needs to implement its own traffic prioritisation scheme(s) to mitigate this effect if necessary.
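
A prioritisation scheme can be as simple as filling each datagram with the most urgent frames first. The sketch below is one possible approach, not a description of any particular QUIC library: the function name, the urgency values (lower means more urgent), the stream names and the 1200-byte budget are all assumptions made for illustration.

```python
import heapq


def fill_datagram(pending_frames, budget_bytes):
    """Pick stream frames for one datagram, most urgent first.

    pending_frames: iterable of (urgency, stream_id, size_bytes) tuples;
    a lower urgency value means more urgent.
    Returns (chosen stream ids, frames left for a later datagram).
    """
    heap = list(pending_frames)
    heapq.heapify(heap)
    chosen, leftover = [], []
    remaining = budget_bytes
    while heap:
        urgency, stream_id, size = heapq.heappop(heap)
        if size <= remaining:
            chosen.append(stream_id)
            remaining -= size
        else:
            # Does not fit: keep it for the next datagram.
            leftover.append((urgency, stream_id, size))
    return chosen, leftover


frames = [(2, "image", 600), (0, "html", 400), (1, "css", 500)]
print(fill_datagram(frames, budget_bytes=1200))
```

With this ordering, the loss of a single datagram is less likely to delay the content the page needs first.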

Memory management: HTTP/3 and QUIC demand a greater commitment of memory resources than the traditional Web protocol stack. HTTP/3 mitigates the protocol overhead with various compression techniques, but stream-oriented ordering of bytes requires extensive buffering of any data that is received out of order. Additionally, a user-space implementation leads to performance inefficiencies because it involves copying extra packet data from kernel memory to user memory.
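
The buffering cost of out-of-order data can be seen in a minimal stream reassembler. This is a simplified sketch with a hypothetical class name: it assumes frames never overlap and always start exactly where an earlier frame ended, which a real implementation cannot rely on.

```python
class StreamReassembler:
    """Buffers out-of-order stream frames until they can be delivered in order."""

    def __init__(self):
        self.next_offset = 0  # offset of the next byte the application expects
        self.pending = {}     # frame offset -> frame bytes held in memory

    def receive(self, offset, data):
        """Store a frame and return any bytes that are now deliverable in order."""
        self.pending[offset] = data
        ready = b""
        while self.next_offset in self.pending:
            chunk = self.pending.pop(self.next_offset)
            ready += chunk
            self.next_offset += len(chunk)
        return ready

    def buffered_bytes(self):
        """Memory currently held for data that arrived ahead of a gap."""
        return sum(len(chunk) for chunk in self.pending.values())


stream = StreamReassembler()
print(stream.receive(5, b"world"), stream.buffered_bytes())  # nothing deliverable yet
print(stream.receive(0, b"hello"), stream.buffered_bytes())  # gap closed
```

Every stream with a gap holds memory like this at the same time, which is where the additional memory commitment comes from.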

Traffic shaping and security: Networking infrastructure was dominated by TCP for so long that it developed indirect dependencies on it. ISPs enforce different traffic routing policies for TCP and UDP traffic, and various in-network optimisation techniques, such as Quality of Service and Active Queue Management, affect UDP. Massive adoption of QUIC would require reconfiguration of networking gear. For example, Facebook reported having to address client-side heuristics tuned for TCP, heuristics for estimating the available download bandwidth, Linux kernel bottlenecks in UDP packet processing, and new load balancing and firewall policies.

Congestion control: There is no ultimate solution in this problem domain. QUIC inherits its algorithms from TCP. Historically, congestion control was owned by “hardware” companies, those who developed networking equipment and operating systems. Because of its user-space implementation, QUIC shifts the ownership towards “software” companies, those who own Web browsers. Nowadays, NewReno (1999), CUBIC (2008) and Bottleneck Bandwidth and Round-trip propagation time (BBR, 2016) are the common heuristic congestion control algorithms. The QUIC standard is confusing on this point: it proposes NewReno as the default algorithm, although CUBIC is the dominant algorithm for broad Internet traffic today. BBR has also increased its share of practical implementations and can be expected to become the dominant algorithm in the future. A positive side effect of shifting congestion control to user space is that it unblocks innovation (e.g. there is research on adopting Deep Reinforcement Learning to boost customer experience).
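
To make the difference between the heuristics concrete, the sketch below compares how the congestion window recovers after a loss under a Reno-style rule and under the CUBIC window function. It is an idealised illustration: the constants C = 0.4 and beta = 0.7 are the commonly published CUBIC values, the Reno-style rule is reduced to "halve, then add one packet per round-trip", and the window size and RTT are hypothetical.

```python
CUBIC_C = 0.4     # CUBIC scaling constant
CUBIC_BETA = 0.7  # CUBIC multiplicative decrease factor


def cubic_k(w_max):
    """Time (seconds) CUBIC needs to grow back to the pre-loss window."""
    return (w_max * (1 - CUBIC_BETA) / CUBIC_C) ** (1 / 3)


def cubic_window(t, w_max):
    """CUBIC congestion window (packets) t seconds after a loss event."""
    return CUBIC_C * (t - cubic_k(w_max)) ** 3 + w_max


def reno_window(t, w_max, rtt):
    """Reno-style window (packets): halve on loss, then +1 packet per RTT."""
    return w_max / 2 + t / rtt


w_max = 100  # hypothetical window (packets) at the moment of loss
rtt = 0.1    # hypothetical round-trip time in seconds
for t in (0.0, 1.0, 2.0, 4.0):
    print(t, round(cubic_window(t, w_max), 1), round(reno_window(t, w_max, rtt), 1))
```

CUBIC's growth depends on elapsed time rather than on the round-trip time, which is one reason the two algorithms behave so differently on long or variable paths.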

MTU: The QUIC protocol, as standardised by the IETF, does not support network MTUs smaller than 1280 bytes. This makes the protocol compatible with IPv6 networks (1280 bytes is the minimum IPv6 MTU). However, it poses challenges for networks operating on "non-standard" IPv4 configurations, potentially leading to packet fragmentation, especially on radio channels. Presently, the industry predominantly adheres to Ethernet standards, assuming a physical link MTU of 1500 bytes. While larger datagrams are feasible, they require Path Maximum Transmission Unit Discovery to ensure optimal performance and compatibility across diverse network environments.
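
The relation between the link MTU and the space left for QUIC can be worked out from the header sizes. The sketch below assumes IP headers without options or extension headers; the 1200-byte threshold is the minimum datagram size QUIC requires a path to support.

```python
IPV4_HEADER = 20  # bytes, without options
IPV6_HEADER = 40  # bytes, without extension headers
UDP_HEADER = 8    # bytes
QUIC_MIN_DATAGRAM = 1200  # smallest UDP payload a QUIC path must carry


def max_udp_payload(mtu, ipv6):
    """Largest UDP payload that fits into one unfragmented IP packet."""
    return mtu - (IPV6_HEADER if ipv6 else IPV4_HEADER) - UDP_HEADER


def path_usable_for_quic(mtu, ipv6):
    return max_udp_payload(mtu, ipv6) >= QUIC_MIN_DATAGRAM


print(max_udp_payload(1280, ipv6=True))   # 1232
print(max_udp_payload(1500, ipv6=False))  # 1472
print(path_usable_for_quic(1000, ipv6=False))  # False
```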

Viewing HTTP/3 from the Radio Access Network (Physical Link) angle

The architecture of the HTTP/3 protocol assumes low latency and high reliability within access networks. While the QUIC protocol brings notable enhancements for "interactive" communication over 3G/4G/LTE wireless networks, it was not designed around the specific attributes of 5G networks. It is important to note that 5G networks are poised to solve latency issues effectively. Engineers need to be aware of the limitations within Radio Access Networks and carefully weigh the adoption of 5G technology, particularly in the European context. 5G stands out for its speed, with peak data rates of up to 20 Gigabits-per-second (Gbps) and average data rates exceeding 100 Megabits-per-second (Mbps). Compared with its predecessor, 4G, 5G offers significantly enhanced capacity and is designed to accommodate a 100-fold increase in traffic capacity and network efficiency. Theoretical estimates suggest that 5G can support up to 1 million devices per square kilometer, which shows its potential for meeting the growing demands of modern connectivity.

HTTP/3 Radio Access Network Perspective

Advertisements about 5G talk about millimeter-wave (mmWave), but 5G technology is built over three frequency bands: (a) low-bands (sub-1 GHz) support wide-area coverage; (b) mid-bands (1 - 6 GHz) offer a trade-off between coverage and capacity, and most commercial 5G networks will use the 3.3 GHz to 4.2 GHz range of the mid-band spectrum; and (c) high-bands (24–52 GHz) are required to achieve ultra-high data rates and ultra-low latencies. High-bands (mmWave) are highly susceptible to blockage by various objects (e.g. buildings, vehicles, trees) and even the human body. Operating in the mmWave spectrum at mass scale is a demanding challenge in terms of practical implementation and cost. The physical link in the Radio Access Network emerges as the primary bottleneck on low- and mid-bands, primarily due to the constrained capacity of the radio spectrum. Frequency bands below 6 GHz, traditionally used by pre-5G technologies, are progressively saturating and unable to meet escalating consumer demand. We assume massive adoption of mid-bands across Europe: 5G mid-bands still outperform 3G/4G/LTE in terms of latency and packet loss probability but require less investment in network infrastructure than mmWave deployments. For example, serving multiple real-time video streams over 5G is not magic anymore. We are able to build a customer experience with about 13 ms latency for 99.9% of downlink packets and 28 ms for 99.9% of uplink packets, even with “bad” signal strength from -100 dBm to -113 dBm.

On the mid-bands, 5G still outperforms 3G/4G/LTE in terms of latency and packet loss probability. However, the congestion control algorithms used by QUIC work against this high reliability. Conventional algorithms cannot differentiate between the potential causes of packet loss or congestion on the radio channel, such as noise, interference, blockage or handover. NewReno and CUBIC have resulted in very poor throughput and latency performance. Only BBR exhibited the lowest round-trip time values across all physical failure scenarios and can satisfy the typical 5G requirements. Advancing the adoption of HTTP/3 for low-latency communication scenarios requires research and development into congestion control algorithms that are sensitive to bandwidth variations across different frequency bands.

Adoption of HTTP/3 by Zalando

Despite the limitations discussed, we have adopted the HTTP/3 protocol at Zalando for distributing all media content. We have successfully brought our vision to life: delivering a premium customer experience on top of the foundation laid by industry enablers. Akamai Technologies has been supporting QUIC since July 2016. Amazon supports QUIC (UDP) at the Network Load Balancer. Most importantly, HTTP/3 is available at CloudFront, giving us the ability to serve European customers through Edge Locations. Apple has maintained a proprietary closed-source implementation of the QUIC and HTTP/3 protocols since iOS 15. On Android, the open-source Cronet library exists. Google Chrome has supported the protocol since 2012. Apple added official support in Safari 14. Support in Firefox arrived in May 2021.

Since HTTP/3 was enabled in our production environment, we have observed that 36.6% of our users seamlessly migrated to consuming content over the HTTP/3 protocol. The average latency for these customers has improved from a double-digit to a single-digit value, an improvement of about 94%. The p99 latency has improved from a four-digit to a double-digit value, a 96% gain in comparison with HTTP/2. About 61.6% of our users continue to use the HTTP/2 protocol and the remaining 1.8% of users fall back to HTTP/1. We have observed no incidents or severe anomalies caused by HTTP/3.

Exploring further directions on traffic engineering opportunities with HTTP/3

Before concluding, I would like to outline two significant directions for further enhancing HTTP/3, aimed at crafting next-level customer experiences.

Congestion Control with Deep Reinforcement Learning

Conventional congestion control (CC) algorithms base their decisions on pre-defined criteria (heuristics) such as packet loss or delay, and they lack the ability to learn and adapt their behaviour in complex dynamic environments such as 5G cellular networks. Some heuristic algorithms use statistics to incorporate previous experience into the decision-making process, but they still cannot achieve the full potential of modern networks.

HTTP/3 Traffic Engineering Opportunities

Machine Learning techniques outperform conventional CC algorithms by dynamically adapting their parameters. Deep Reinforcement Learning (DRL) is a prominent technique that has been assessed with QUIC. The Reinforcement Learning agent makes decisions about the size of the congestion window or the sending rate while interacting with the environment. The reward metric, which is tuned for a particular application, is either throughput or network delay, with a penalty for packet losses. In the lab, analysis of DRL algorithms has shown higher throughput and better round-trip performance under various network settings compared with competing solutions (e.g. BBR or Remy). Aurora, Eagle, Orca and PQB are worth mentioning as known DRL algorithms. We expect this to become the main concept exploited in research dedicated to protocol improvements in 5G networks.
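
The reward described above can be sketched as a weighted sum. The function and its weights below are hypothetical and only illustrate the shape of such a reward; they are not taken from Aurora, Eagle, Orca, PQB or any other published algorithm.

```python
def reward(throughput_mbps, delay_ms, loss_rate,
           w_throughput=1.0, w_delay=0.1, w_loss=100.0):
    """Reward for one observation interval: favour throughput, penalise
    delay and packet loss. All weights are hypothetical and would be tuned
    for a particular application."""
    return (w_throughput * throughput_mbps
            - w_delay * delay_ms
            - w_loss * loss_rate)


# Two hypothetical outcomes after the agent changed the sending rate:
# pushing harder yields more throughput but also more queueing and loss.
cautious = reward(throughput_mbps=50, delay_ms=20, loss_rate=0.01)
aggressive = reward(throughput_mbps=55, delay_ms=80, loss_rate=0.05)
print(cautious, aggressive)
```

With these weights the cautious outcome scores higher, so an agent trained on this reward would learn to back off before the queue builds up. A latency-insensitive bulk transfer would use a smaller delay weight.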

Streaming of 4K Ultra High Definition videos

Streaming 4K Ultra High Definition 3840x2160 video at 60 fps requires H.265 (High Efficiency Video Coding) and demands 30 - 50 Mbps of network bandwidth, 6 - 11 ms packet latency and 99.999% reliability of packet delivery. This is a tough requirement for 5G mid-bands and is practically achievable in urban areas only.

HTTP/3 introduces concurrent access and low-latency capabilities to video streaming solutions. Our initial investigations have revealed that only Video on Demand applications use Dynamic Adaptive Streaming over HTTP/3, with an assumption of 5.6 MB of HEVC-compressed video per second. QUIC stream concurrency enables parallel fetching of video chunks, leading to an improved user experience compared to HTTP/2. Real-time video streaming with QUIC over less than ideal network conditions faces an issue due to the reliable nature of the protocol. Retransmissions of lost packets in a video stream inadvertently lead to stalls in the video. QUIC also performs poorly when it encounters packet losses that are not due to congestion. This is another improvement opportunity: if QUIC offered a selectively reliable transport in which not all video frames are delivered reliably, we could optimise video streaming and improve the end-user experience. We believe this improvement would impact content consumption by supporting up to 4096 × 2160 at 60 fps (True 4K).
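
The figures above can be cross-checked with simple arithmetic. The sketch below converts the assumed 5.6 MB of video per second into a bit rate and compares it with the 30 - 50 Mbps range quoted earlier; the 2-second segment duration is a hypothetical value, and 1 MB is taken as 1,000,000 bytes.

```python
def megabytes_per_second_to_mbps(mb_per_s):
    """Convert megabytes per second to megabits per second."""
    return mb_per_s * 8


def segment_size_mb(mb_per_s, segment_seconds):
    """Size of one video segment of the given duration."""
    return mb_per_s * segment_seconds


video_mb_per_s = 5.6  # HEVC-compressed video per second, from the text
bitrate = megabytes_per_second_to_mbps(video_mb_per_s)
print(bitrate, 30 <= bitrate <= 50)

# Hypothetical 2-second segments fetched in parallel over QUIC streams.
print(segment_size_mb(video_mb_per_s, 2))
```

The resulting bit rate of roughly 44.8 Mbps sits near the upper end of the required bandwidth range, which is why the requirement is demanding for 5G mid-bands.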

Conclusion

Usage statistics indicate that 29.8% of websites worldwide have already embraced HTTP/3 to cater to their users, with Zalando being among them. Through its adoption, significant strides have been made towards improving the efficiency and responsiveness of web communications, ultimately enhancing the end-user experience.

We've explored how HTTP/3 addresses key challenges such as latency reduction, concurrent access, and low-latency content delivery. We’ve also emphasised the remaining issues engineers should be aware of, specifically in the context of radio access networks, and discussed exciting opportunities for further advancements in traffic engineering and network optimization, especially as technologies like Deep Reinforcement Learning continue to mature.

Overall, the insights shared in this post underscore the pivotal role of HTTP/3 in shaping the future of web communication, paving the way for richer, more immersive online experiences. Our observations tell us that 36.6% of our users seamlessly migrated to consuming content over the HTTP/3 protocol. The average latency for these customers has improved from a double-digit to a single-digit value, an improvement of about 94%.

