HLS Latency Sucks, But Here’s How to Fix It (Update)
December 3, 2019
Ever since Android added support for Apple’s HTTP Live Streaming (HLS) protocol in 2011, HLS has taken over the world of streaming video. But while HLS surges in popularity, we continue to hear from customers who want to reduce latency. The general feedback is: “HLS latency sucks.”
Let’s take a look at the numbers. In our 2019 Video Streaming Latency Report, more than 45% of respondents were using HLS for video playback. What’s more, almost 40% reported that they were experiencing latency in the 10-45 second range.
[Chart: Which streaming formats are you currently using?]
It’s not all doom and gloom, though. HLS has a lot of things going right. And it can’t be ignored or dismissed as a viable option for your streaming decisions.
Benefits of the HLS Protocol
HLS provides excellent video quality. Remember when streaming sucked because it was always buffering? HLS solved that problem by using chunks to ensure that your stream can be played back seamlessly, in high quality, without causing the “spinning ball of death.”
Secondly, HLS reduced the cost of delivering content. Using affordable HTTP infrastructures, content owners could easily justify delivering their content to online audiences, and expand potential viewership. HLS is also ubiquitous — delivering to more devices and players — which means that it’s cheaper for the consumer to watch without needing a specialized device.
Whether you’re using an iOS device, an Android device, an HTML5 player, or even a set-top box (Roku, Apple TV, etc.), HLS streaming is available. It also scales well. Like the MPEG-DASH (Dynamic Adaptive Streaming over HTTP) standard, HLS uses a packetized content distribution model that cuts video into chunks and then reassembles them based on the manifest file (in HLS, an .m3u8 playlist). It provides CDNs and encoding/transcoding software providers with a relatively common platform to standardize across their infrastructure, allowing for edge-based adaptive bitrate (ABR) transcoding.
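To make the manifest-driven model concrete, here is a minimal Python sketch of how a player might read a media playlist and recover the ordered list of chunks. The playlist contents and file names are hypothetical, and a real player handles many more tags than this.

```python
# Hypothetical three-segment media playlist, for illustration only.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:100
#EXTINF:6.0,
media_100.ts
#EXTINF:6.0,
media_101.ts
#EXTINF:6.0,
media_102.ts
"""


def parse_media_playlist(text):
    """Return a list of (uri, duration_seconds) tuples in playback order."""
    segments = []
    pending_duration = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("#EXTINF:"):
            # "#EXTINF:<duration>,[<title>]" describes the URI line that follows.
            pending_duration = float(line[len("#EXTINF:"):].split(",")[0])
        elif line and not line.startswith("#"):
            # A non-tag, non-blank line is a segment URI.
            if pending_duration is not None:
                segments.append((line, pending_duration))
                pending_duration = None
    return segments


segments = parse_media_playlist(SAMPLE_PLAYLIST)
total_seconds = sum(duration for _, duration in segments)
print(segments)
print(total_seconds)
```

The player fetches each listed URI over plain HTTP and plays the chunks back to back, which is why ordinary web servers and CDNs can deliver the stream.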
Causes of Latency With HLS
For many of the same reasons that HLS is great, it also has faults when it comes to latency. The most likely sources of latency include the encoding, transcoding, distribution, and the default playback buffer requirements for HLS.
When a player switches renditions in an adaptive HLS stream, it has to build a new buffer. Apple recommends a six-second segment duration and a certain number of segments to create a meaningful playback buffer. This results in about 20 seconds of glass-to-glass delay, measured from capture to final packet assembly. But when you introduce CDNs for greater scalability, you inject another 15-30 seconds of latency so the servers can cache the content in flight, not to mention any last-mile congestion that might slow down a stream.
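The arithmetic behind those numbers can be sketched in a few lines of Python. The six-second segment and the 15-30 second CDN range come from the paragraph above; the three-segment buffer and the two seconds of capture and encode overhead are assumptions chosen to illustrate how a figure of roughly 20 seconds can arise, not measured values.

```python
def estimate_latency_seconds(segment_seconds, buffered_segments,
                             encode_seconds, cdn_seconds=0.0):
    """Rough glass-to-glass estimate: player buffer + encode + CDN delay."""
    return segment_seconds * buffered_segments + encode_seconds + cdn_seconds


# Assumptions: 3 buffered segments, 2 s of capture/encode overhead.
baseline = estimate_latency_seconds(6.0, 3, 2.0)
with_cdn_low = estimate_latency_seconds(6.0, 3, 2.0, cdn_seconds=15.0)
with_cdn_high = estimate_latency_seconds(6.0, 3, 2.0, cdn_seconds=30.0)

print(baseline, with_cdn_low, with_cdn_high)
```

Because the buffer term is segment duration multiplied by segment count, shrinking the segment duration is the most direct lever on latency.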
Standard HLS isn’t a viable option when interactivity or broadcast-like speed matters. Nobody wants to see spoilers in their Twitter feed while watching a game on their phone. Likewise, you don’t want large delays in interactivity with game streaming or UGC broadcasters, as in a Facebook Live or Twitch stream. That’s because consumers today expect their content to arrive as fast as satellite or cable feeds, regardless of what the streaming app can realistically deliver.
Tuning HLS for Low Latency
One option for streaming lower-latency Apple HLS content is to tune your workflow. Using the Wowza Streaming Engine™ software or the Wowza Streaming Cloud™ service, users can manipulate the segment size for reduced-latency HLS streams. Let’s take a look at how you’d go about delivering low-latency Apple HLS streams when delivering video with Wowza Streaming Engine.
Steps for Reducing Latency
There are four simple steps to tune your workflows to deliver lower-latency HLS streams:
- Reduce your chunk size. Currently, in the HLS Cupertino default settings, Apple recommends a minimum of six seconds for each segment. We have seen success reducing the duration to as little as half a second. To change it, set chunkDurationTarget to your desired length (in milliseconds). HLS chunks are only created on keyframe boundaries, so if you reduce the chunk size, you need to ensure it is a multiple of the keyframe interval or adjust the keyframe interval to suit.
- Increase the number of chunks. Wowza Streaming Engine stores chunks to build a significant buffer should there be a drop in connectivity. The default value is ten, but for reduced-latency streaming, we recommend storing 50 seconds of chunks. For one-second chunks, set the MaxChunkCount to 50; if you’re using half-second chunks, the value should be 100.
- Modify playlist chunk counts. The number of items in an HLS playlist defaults to three, but for lower-latency scenarios, we recommend sending 12 seconds of data to the player. This prevents the loss of chunks between playlist requests. For one-second chunks, set the PlaylistChunkCount value to 12; if you’re using half-second chunks, the value should be doubled (24).
- Set the minimum number of chunks. The last thing you want to adjust is how many chunks must be received before playback begins. We recommend a minimum of 6 seconds of chunks to be delivered. To configure this in Wowza Streaming Engine, use the custom CupertinoMinPlaylistChunkCount property. For one-second chunks, set it to 6, or 12 for half-second chunks.
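The four steps above reduce to simple arithmetic on the chunk duration. Here is a minimal Python sketch that derives the values; the function name is hypothetical, the property names are written as they appear in the steps above, and the 50, 12, and 6 second targets are the recommendations from this article. It does not talk to Wowza Streaming Engine, it only computes the numbers you would enter.

```python
def tuned_hls_settings(chunk_seconds, keyframe_interval_seconds):
    """Derive the four tuning values for a given chunk duration."""
    chunk_ms = round(chunk_seconds * 1000)
    keyframe_ms = round(keyframe_interval_seconds * 1000)
    if chunk_ms <= 0 or keyframe_ms <= 0:
        raise ValueError("durations must be positive")
    # Chunks are cut on keyframe boundaries, so the chunk duration must be
    # a whole multiple of the keyframe interval.
    if chunk_ms % keyframe_ms != 0:
        raise ValueError("chunk duration must be a multiple of the keyframe interval")

    def chunks_for(seconds):
        # Number of chunks needed to cover `seconds`, rounded up.
        return (seconds * 1000 + chunk_ms - 1) // chunk_ms

    return {
        "chunkDurationTarget": chunk_ms,                  # milliseconds
        "MaxChunkCount": chunks_for(50),                  # 50 s stored on the server
        "PlaylistChunkCount": chunks_for(12),             # 12 s listed in the playlist
        "CupertinoMinPlaylistChunkCount": chunks_for(6),  # 6 s before playback starts
    }


print(tuned_hls_settings(1.0, 1.0))
print(tuned_hls_settings(0.5, 0.5))
```

Rejecting a chunk duration that is not a multiple of the keyframe interval catches the most common misconfiguration before it reaches the server.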
Risks of Tuning HLS for Latency
Tuned HLS doesn’t come without inherent risks. First, smaller chunk sizes may cause playback errors if you fail to increase the number of seconds built into the playlist. If the connection is interrupted while the player is requesting the next playlist and that playlist doesn’t arrive in time, playback may stall.
Additionally, by increasing the number of segments that are needed to create and deliver low-latency streams, you also increase the server cache requirements. To alleviate this concern, ensure that your server has a large enough cache, or built-in elasticity. You will also need to account for greater CPU and GPU utilization resulting from the increased number of keyframes. This requires careful planning for load-balancing, with the understanding that increased computing and caching overhead incurs a higher cost of operation.
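For capacity planning, a rough Python sketch like the following can help size the cache and gauge the extra keyframe load. The bitrate ladder and keyframe intervals are hypothetical examples, and the estimate ignores container overhead and audio.

```python
def cache_bytes(bitrates_bps, retained_seconds):
    """Approximate bytes needed to retain `retained_seconds` of every rendition."""
    return sum(bitrate * retained_seconds // 8 for bitrate in bitrates_bps)


def keyframes_per_minute(keyframe_interval_seconds):
    """Keyframes the encoder must produce per minute, per rendition."""
    return 60.0 / keyframe_interval_seconds


# Hypothetical ABR ladder: 5 Mbps, 2.5 Mbps, and 1 Mbps renditions.
ladder = [5_000_000, 2_500_000, 1_000_000]

per_stream = cache_bytes(ladder, 50)        # 50 s of chunks, as recommended above
standard_rate = keyframes_per_minute(2.0)   # assumed two-second keyframe interval
tuned_rate = keyframes_per_minute(0.5)      # half-second chunks need 0.5 s keyframes

print(per_stream, standard_rate, tuned_rate)
```

Multiply the per-stream figure by the number of concurrent live streams to estimate the total cache footprint on a server.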
Lastly, smaller chunks can reduce the overall quality of video playback. You may not be able to deliver full 4K video to the player, or viewers may see small playback glitches caused by an increased risk of packet loss. Essentially, as you increase the number of bits (markers on the chunks), you require more processor power for smooth playback. Otherwise, you get packet loss and interruptions.
Luckily, there’s now a lower-risk way to speed things up: Apple Low-Latency HLS.
Apple Low-Latency HLS
We recently announced support for Apple’s Low-Latency HLS protocol in the Wowza Streaming Engine software. As an ideal spec for live and interactive streaming, Low-Latency HLS enables sub-two-second video delivery at scale — while also offering backward compatibility to existing clients. Players that don’t understand the protocol extension will play the same streams back, with a higher (regular) latency. This allows content distributors to leverage a single HLS solution for optimized and non-optimized players.
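The backward compatibility comes from a simple rule: HLS clients ignore playlist tags they don't recognize. The Python sketch below illustrates the idea with a simplified, hypothetical playlist fragment (not a complete valid playlist). The EXT-X-PART and EXT-X-PART-INF tags are part of the Low-Latency HLS extension; the file names and both functions are made up for illustration.

```python
import re

# Simplified, hypothetical Low-Latency HLS fragment: one complete segment
# followed by two partial segments of the segment still being produced.
SAMPLE_LL_PLAYLIST = """#EXTM3U
#EXT-X-TARGETDURATION:4
#EXT-X-PART-INF:PART-TARGET=1.0
#EXT-X-MEDIA-SEQUENCE:10
#EXTINF:4.0,
seg10.mp4
#EXT-X-PART:DURATION=1.0,URI="seg11.part0.mp4"
#EXT-X-PART:DURATION=1.0,URI="seg11.part1.mp4"
"""


def legacy_client_uris(text):
    """URIs seen by a client that skips every tag line, known or not."""
    lines = (raw.strip() for raw in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def low_latency_client_uris(text):
    """URIs seen by a client that also understands EXT-X-PART."""
    uris = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-PART:"):
            match = re.search(r'URI="([^"]*)"', line)
            if match:
                uris.append(match.group(1))
        elif line and not line.startswith("#"):
            uris.append(line)
    return uris


print(legacy_client_uris(SAMPLE_LL_PLAYLIST))
print(low_latency_client_uris(SAMPLE_LL_PLAYLIST))
```

The legacy client sees only the completed segment and plays at regular latency, while the low-latency client can also fetch the newest partial segments.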
Low-Latency HLS promises to put streaming on par with traditional live broadcasting in terms of content delivery time. Consequently, we expect it to quickly become a preferred technology for OTT, live sports, esports, interactive streaming, and more. The spec is still evolving, though, and support for it is being added across the streaming ecosystem.
- Configure Apple HLS Packetization in Wowza Streaming Engine
- Deliver a Reduced-Latency HLS Stream in Wowza Streaming Cloud
- Apple Low-Latency HLS: What It Is and How It Relates to CMAF
- Delivering Low-Latency HLS Live Streams Using Wowza Streaming Engine