HLS Latency Sucks, But Here’s How to Fix It (Update)

January 19, 2022 by

Apple’s HTTP Live Streaming (HLS) protocol is the preferred format for video delivery today. The streaming protocol boasts unmatched compatibility across devices, high-quality video, and scalability to countless viewers. But while HLS surges in popularity, we continue to hear from customers who want to reduce latency. The general feedback is: “HLS latency sucks.”

Just peek at the following two charts. More than 70% of respondents in our 2021 Video Streaming Latency Report indicated that they use HLS — with more than 23% experiencing latency in the 10-45 second range.


Which streaming formats are you currently using?

A graph comparing the use of different streaming protocols for last-mile delivery and playback, with HLS leading the way, followed by MPEG-DASH and WebRTC.

How much latency are you currently experiencing?

A graph showing how much latency companies are currently experiencing when delivering video online.

Lag time aside, HLS has a lot of things going for it. And it can’t be ignored as a viable option for your streaming workflow. Keep reading to find out what causes HLS delay, routes for increasing delivery speed, as well as alternative technologies for real-time streaming.


Table of Contents

  • Risks of Tuning HLS for Latency
  • Alternative Formats for Low-Latency Streaming
  • Conclusion

    Benefits of the HLS Protocol

    Remember when streaming video online meant constant buffering? HLS solved that problem by using chunks to ensure that your stream can be played back seamlessly, in high quality, without causing the spinning ball of death.

    Beyond that, HLS reduced the cost of delivering content. Using affordable HTTP infrastructures, content owners could easily justify delivering their streams to online audiences and expand potential viewership. HLS is also ubiquitous — ensuring playback on more devices and players than any other technology — which means that it’s a convenient technology to use that doesn’t require specialized workflows.

    Regardless of whether viewers are using an iOS device, an Android device, an HTML5 player, or even a set-top box (Roku, Apple TV, etc.), HLS streaming is available. It also scales well. Like the MPEG-DASH (Dynamic Adaptive Streaming over HTTP) standard, HLS uses a packetized content distribution model that cuts and then reassembles video chunks based on the manifest file (HLS uses an .m3u8 playlist). It provides content delivery networks (CDNs) and streaming service providers with a relatively common platform to standardize across their infrastructure, allowing for edge-based adaptive bitrate (ABR) transcoding.
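
To make the chunk-and-manifest model concrete, here is a minimal sketch that writes a live media playlist for three six-second chunks. The function name and the chunk file names are made up for illustration, and a production playlist would carry more tags than this.

```python
import math


def build_media_playlist(segments, media_sequence=0):
    """Return a minimal live HLS media playlist (.m3u8) as a string.

    `segments` is a list of (uri, duration_seconds) tuples.
    """
    # EXT-X-TARGETDURATION is an integer upper bound on segment duration.
    target = math.ceil(max(duration for _, duration in segments))
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target}",
        f"#EXT-X-MEDIA-SEQUENCE:{media_sequence}",
    ]
    for uri, duration in segments:
        # Each chunk is announced by an EXTINF tag followed by its URI.
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(uri)
    return "\n".join(lines) + "\n"


# Hypothetical chunk names, used only for this example.
playlist = build_media_playlist(
    [("chunk_0.ts", 6.0), ("chunk_1.ts", 6.0), ("chunk_2.ts", 6.0)]
)
print(playlist)
```

The player downloads this playlist over plain HTTP, fetches the listed chunks in order, and polls for an updated playlist as new chunks appear, which is why ordinary web servers and CDNs can deliver it.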


    Causes of Latency With HLS

    For many of the same reasons that HLS is great, it drags its heels when it comes to delivery speed. The sources of this latency include encoding, transcoding, distribution, and the default playback buffer requirements for HLS.

    When streaming with HLS, Apple recommends a six-second chunk size (also called segment duration) and a certain number of buffered chunks to create a meaningful playback buffer. This results in about 20 seconds of glass-to-glass delay. Plus, when you introduce CDNs for greater scalability, you inject another 15-30 seconds of latency so the servers can cache the content in-flight, not to mention any last-mile congestion that might slow down a stream.
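
As a back-of-the-envelope sketch of that arithmetic, the helper below multiplies chunk duration by the number of chunks the player buffers and adds a lump sum for everything else. The three-chunk buffer and the two seconds of encoding and network overhead are assumptions chosen for illustration, not measured values.

```python
def estimate_hls_latency(segment_seconds, buffered_segments, pipeline_overhead_seconds=0):
    """Rough glass-to-glass estimate: player buffer plus other pipeline delay."""
    return segment_seconds * buffered_segments + pipeline_overhead_seconds


# Assumed: the player buffers three chunks, and encoding plus network
# transfer add roughly two seconds (both figures are illustrative).
default_latency = estimate_hls_latency(6, 3, pipeline_overhead_seconds=2)

# The same overhead with one-second chunks and six chunks buffered.
tuned_latency = estimate_hls_latency(1, 6, pipeline_overhead_seconds=2)

print(default_latency, tuned_latency)
```

The point of the sketch is that chunk duration dominates the total: shrinking the chunks cuts the delay even when the player buffers more of them.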

    For these reasons, standard HLS isn’t a viable option when interactivity or broadcast-like speed matters. Nobody wants to see spoilers in their phone’s Twitter feed before the ‘live’ broadcast plays a game-winning touchdown. Likewise, large delays interrupt the two-way nature of game streaming and user-generated content (UGC) applications like Facebook Live or Twitch. Consumers today expect their content to arrive as fast as possible, regardless of the kind of streaming app they are using.


    Tuning HLS for Low Latency

    One option for streaming lower-latency Apple HLS content is to tune your workflow. Using the Wowza Video service or the Wowza Streaming Engine software, users can manipulate the segment size for reduced-latency HLS streams. Below we provide a tutorial for the latter.


    Steps for Reducing Latency in Wowza Streaming Engine

    When delivering lower-latency HLS streams in Wowza Streaming Engine, there are four settings you’ll want to modify. The list below walks through each one.

    1. Reduce your HLS chunk size. Currently, in the HLS Cupertino default settings, Apple recommends a minimum of six seconds for each segment. We have seen success reducing this to half a second. To do so, set chunkDurationTarget to your desired length (in milliseconds). HLS chunks will only be created on keyframe boundaries, so if you reduce the minimum chunk size, you need to ensure it is a multiple of the keyframe interval or adjust the keyframe interval to suit.
    2. Increase the number of chunks. Wowza Streaming Engine stores chunks to build a significant buffer should there be a drop in connectivity. The default value is ten, but for reduced-latency streaming, we recommend storing 50 seconds of chunks. For one-second chunks, set the MaxChunkCount to 50; if you’re using half-second chunks, the value should be 100.
    3. Modify playlist chunk counts. The number of items in an HLS playlist defaults to three, but for lower-latency scenarios, we recommend sending 12 seconds of data to the player. This prevents the loss of chunks between playlist requests. For one-second chunks, set the PlaylistChunkCount value to 12; if you’re using half-second chunks, the value should be doubled (24).
    4. Set the minimum number of chunks. The last thing you want to adjust is how many chunks must be received before playback begins. We recommend that a minimum of six seconds of chunks be delivered. To configure this in Wowza Streaming Engine, use the custom CupertinoMinPlaylistChunkCount property. For one-second chunks, set it to 6, or 12 for half-second chunks.
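
The four recommendations above reduce to simple arithmetic, sketched below. The helper name is hypothetical, and the dictionary keys simply repeat the setting names as written in the list, so check them against the property names in your Wowza Streaming Engine version before applying anything.

```python
def tuned_hls_settings(chunk_seconds, keyframe_interval_seconds):
    """Derive the four chunk settings from the recommendations in the list above."""
    chunk_ms = round(chunk_seconds * 1000)
    keyframe_ms = round(keyframe_interval_seconds * 1000)
    # Chunks are only cut on keyframe boundaries, so the chunk duration
    # has to be a whole multiple of the keyframe interval.
    if chunk_ms % keyframe_ms != 0:
        raise ValueError("chunk duration must be a multiple of the keyframe interval")

    def chunks_for(seconds):
        # Number of chunks needed to cover `seconds` of media, rounded up.
        return -(-seconds * 1000 // chunk_ms)

    return {
        "chunkDurationTarget": chunk_ms,                   # milliseconds
        "MaxChunkCount": chunks_for(50),                   # 50 seconds stored
        "PlaylistChunkCount": chunks_for(12),              # 12 seconds in the playlist
        "CupertinoMinPlaylistChunkCount": chunks_for(6),   # 6 seconds before playback
    }


print(tuned_hls_settings(1, 1))
print(tuned_hls_settings(0.5, 0.5))
```

With one-second chunks the helper yields 50, 12, and 6; with half-second chunks it yields 100, 24, and 12, matching the values given in the list.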

    Risks of Tuning HLS for Latency

    Tuned HLS comes with inherent risks. First, smaller chunk sizes may cause playback errors if you fail to increase the number of seconds built into the playlist. If a stream is interrupted, and the player requests the next playlist, the stream may stall when the playlist doesn’t arrive.

    Additionally, by increasing the number of segments that are needed to create and deliver low-latency streams, you also increase the server cache requirements. To alleviate this concern, ensure that your server has a large enough cache, or built-in elasticity. You will also need to account for greater CPU and GPU utilization resulting from the increased number of keyframes. This requires careful planning for load-balancing, with the understanding that increased computing and caching overhead incurs a higher cost of operation.
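
For planning purposes, a rough sizing sketch might look like the following. The bitrate ladder is invented for the example, the cache figure ignores container and per-chunk overhead, and the keyframe count is only a proxy for encoder load.

```python
def cache_bytes(bitrates_kbps, stored_seconds):
    """Approximate bytes needed to hold `stored_seconds` of every rendition."""
    total_bits_per_second = sum(bitrates_kbps) * 1000
    return total_bits_per_second * stored_seconds // 8


def keyframes_per_minute(keyframe_interval_seconds):
    """How many keyframes the encoder must produce each minute."""
    return 60 / keyframe_interval_seconds


# Hypothetical ABR ladder in kilobits per second.
ladder_kbps = [4500, 2500, 1200]

per_stream = cache_bytes(ladder_kbps, stored_seconds=50)
print(f"{per_stream / 1_000_000:.2f} MB cached per stream")

# Moving from a two-second to a half-second keyframe interval
# quadruples the number of keyframes the encoder has to produce.
print(keyframes_per_minute(2), keyframes_per_minute(0.5))
```

Multiply the per-stream figure by the number of concurrent streams to get a first estimate of the cache you need to provision.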

    Lastly, smaller HLS chunk sizes can affect the overall quality of video playback. This may mean that 4K video cannot be delivered to the player, or that small playback glitches appear along with an increased risk of packet loss. Essentially, as you increase the number of bits (markers on the chunks), you require more processing power for smooth playback. Otherwise, you get packet loss and interruptions.


    Alternative Formats for Low-Latency Streaming

    Luckily, there are two lower-risk options for speeding things up: Apple Low-Latency HLS and Web Real-Time Communications (WebRTC).


    Apple Low-Latency HLS

    Apple’s Low-Latency HLS protocol was designed to achieve sub-two-second latencies at scale — while also offering backwards compatibility to existing clients.  


    Backwards compatibility is a major benefit of Low-Latency HLS. Players that don’t understand the protocol extension will play the same streams back, with a higher (regular) latency. This allows content distributors to leverage a single HLS solution for optimized and non-optimized players.  
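
The sketch below illustrates how that fallback can work. The playlist is an abbreviated, hand-written example with made-up file names rather than the output of a real packager. A parser that only understands EXTINF entries still finds the full segment, while a Low-Latency-aware parser also picks up the partial segments advertised with EXT-X-PART.

```python
# Abbreviated, illustrative Low-Latency HLS playlist (file names are made up).
LL_HLS_PLAYLIST = """#EXTM3U
#EXT-X-TARGETDURATION:4
#EXT-X-SERVER-CONTROL:CAN-BLOCK-RELOAD=YES,PART-HOLD-BACK=1.0
#EXT-X-PART-INF:PART-TARGET=0.333
#EXT-X-MEDIA-SEQUENCE:100
#EXTINF:4.000,
segment100.mp4
#EXT-X-PART:DURATION=0.333,URI="segment101.part0.mp4"
#EXT-X-PART:DURATION=0.333,URI="segment101.part1.mp4"
"""


def legacy_segments(playlist_text):
    """Read the playlist as a player without Low-Latency support might:
    keep full segments (EXTINF + URI) and skip tags it does not recognize."""
    segments = []
    pending_duration = None
    for line in playlist_text.splitlines():
        line = line.strip()
        if line.startswith("#EXTINF:"):
            pending_duration = float(line[len("#EXTINF:"):].split(",")[0])
        elif line and not line.startswith("#") and pending_duration is not None:
            segments.append((line, pending_duration))
            pending_duration = None
    return segments


def partial_segments(playlist_text):
    """Collect the partial segments advertised with EXT-X-PART.

    The attribute parsing is deliberately naive: it assumes no commas
    appear inside quoted values.
    """
    parts = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-PART:"):
            attributes = dict(
                item.split("=", 1) for item in line[len("#EXT-X-PART:"):].split(",")
            )
            parts.append((attributes["URI"].strip('"'), float(attributes["DURATION"])))
    return parts


print(legacy_segments(LL_HLS_PLAYLIST))
print(partial_segments(LL_HLS_PLAYLIST))
```

The legacy view contains only the complete four-second segment, so that player runs at regular latency, while the Low-Latency view can start on the newest parts a fraction of a second after they are produced.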

    Low-Latency HLS promises to put streaming on par with traditional live broadcasting in terms of content delivery time. Consequently, we expect it to quickly become a preferred technology for OTT, live sports, esports, interactive streaming, and more.


    The Low-Latency HLS spec is still evolving, and support for it is being added across the streaming ecosystem. As a result, large-scale implementations remain few and far between. It also lags behind WebRTC in terms of delivery speed. For the lowest latency possible, you’ll be better off swapping out HLS altogether.

    A graphic showing the continuum of interactivity and different latency values for streaming protocols like WebRTC, Tuned DASH, HLS, and more.

    WebRTC


    WebRTC delivers near-instantaneous streaming to and from any major browser. The technology was designed for video conferencing and thus supports sub-500-millisecond latency — without requiring third-party software or plug-ins.


    Everything from low-latency delivery to interoperability makes WebRTC an attractive, cutting-edge technology. All major browsers and devices support WebRTC, making it simple to integrate into a wide range of apps without dedicated infrastructure. It’s also the quickest method for transporting video across the internet, as mentioned above.


    While it’s the speediest protocol out there, scaling to more than 50 concurrent peer connections requires additional resources. Luckily, we designed Real-Time Streaming at Scale for Wowza Video to overcome this limitation. The new feature deploys WebRTC across a custom CDN to support interactive streaming to a million viewers.


    Conclusion



    No matter what your latency requirements are, Wowza makes it happen. Our full-service platform can power any workflow — with reliability to boot. We offer protocol flexibility on the ingest side as well as delivery, meaning you’re able to design the best streaming solution for your use case rather than sticking with one prescriptive workflow.

