How Does Latency Affect Streaming Video?
Anyone who is managing or delivering video feeds knows that latency is a major roadblock to responsiveness and interactivity. What’s more, latency is cumulative across the entire workflow. It starts at ingest and continues into encoding or transcoding steps. Then, even more latency is introduced when video is passed to a CDN for delivery, and ultimately when the viewer watches it in the player.
This blog explores the lag between when a video is captured and when it's displayed on a viewer's device.
The Levels of Latency
Latency dictates the success of a stream: the lower it is, the less a stream feels like a passive broadcast and the more it becomes an interactive, real-time experience. These tiers of latency can be visualized on a spectrum that, at the high end, runs from 30 seconds to a minute and, at the low end, approaches a nearly instantaneous experience measured in milliseconds.

High Latency Protocols
At the highest end of the latency spectrum sit common HTTP latencies, or High Latency, a tier that spans from 30 to 45 seconds or more. This tier is only acceptable for traditional linear, non-interactive, TV-like content. It is often the default behavior for standard HTTP Live Streaming (HLS) and Dynamic Adaptive Streaming over HTTP (MPEG-DASH), protocols that prioritize reach and quality over speed.
Reduced Latency Protocols
Next comes Reduced Latency, which ranges from 6 to 30 seconds of delay. This tier is most common in major live news and sports broadcasts. While functional, this range means significant events (like a goal scored) are still noticeably delayed. Tuned HLS and DASH configurations typically land here, toward the lower end of this range.
Low Latency Protocols
A typical latency for broadcast HD and cable TV is around 5 seconds. The desirable range for modern applications is Low Latency, which usually spans 1 to 6 seconds. This range of delay is good for most social media interactions and less time-critical live events, making a stream feel far more current. Protocols like Real-Time Streaming Protocol (RTSP), Real-time Transport Protocol (RTP), and Secure Reliable Transport (SRT) for ingest or contribution feeds, along with Common Media Application Format (CMAF) chunks for DASH, shine here. RTMP, a legacy protocol still often used for ingest, is also natively low-latency, unlike HTTP-based protocols.
Near Real-Time Protocols
Finally, the closest thing to being in person is Ultra-Low Latency (ULL), or Near Real-Time. This tier operates on a sub-second basis and is required for true interactivity. Its near-instantaneous immersion powers live auctions and remote collaboration, where even a fraction of a second of latency can make a noticeable impact. Web Real-Time Communication (WebRTC) is currently the most widely adopted protocol here.
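To make the tiers above concrete, here is a minimal Python sketch that maps a measured glass-to-glass delay to the tier names used in this post. The thresholds are the approximate boundaries quoted above, not a formal industry standard.

```python
def latency_tier(delay_seconds: float) -> str:
    """Map a glass-to-glass delay to the tiers described in this post.

    Boundaries are the approximate figures quoted above, not a formal standard.
    """
    if delay_seconds < 1:
        return "Ultra-Low Latency / Near Real-Time (e.g., WebRTC)"
    if delay_seconds <= 6:
        return "Low Latency (e.g., RTSP/RTP, SRT, CMAF)"
    if delay_seconds <= 30:
        return "Reduced Latency (e.g., tuned HLS/DASH)"
    return "High Latency (e.g., standard HLS/DASH defaults)"


print(latency_tier(0.4))  # Ultra-Low Latency / Near Real-Time (e.g., WebRTC)
print(latency_tier(18))   # Reduced Latency (e.g., tuned HLS/DASH)
```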
What Causes Latency in Streaming Video?
The journey a video stream takes from the camera to the viewer’s device is a complex series of steps, each one a potential source of added delay, compounding into the final latency number. This pipeline can be broken down into three major stages:
- Ingest and Processing
- Distribution
- Playback
Managing buffering tolerances across all three of these pipeline stages is crucial. Every millisecond of latency introduced, especially at earlier stages, can have a sizeable and noticeable impact on the end viewer experience.
The goal is to reduce latency wherever possible without compromising smooth playback.
Streaming Delay Causes During Ingest & Processing
During Ingest and Processing, the stream's first hurdle is the initial encoding/transcoding step. Once footage is captured in-camera, the video is compressed and converted into multiple bitrates across an ABR ladder, and each of these procedures introduces delay. The length of the video segments, whether 2 seconds or 6 seconds, has a direct impact on latency, though segment size is only part of the picture. A common engineering rule of thumb is that end-to-end latency is roughly three times the segment duration, so a standard 6-second segment results in roughly 18 seconds of glass-to-glass latency.
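As a quick sanity check, that rule of thumb can be expressed in a few lines of Python. This is a back-of-the-envelope estimate, not a measurement from any specific workflow.

```python
def estimated_glass_to_glass(segment_seconds: float, multiplier: float = 3.0) -> float:
    """Rule-of-thumb estimate: players typically hold roughly three segments' worth
    of video, so latency scales with segment duration."""
    return segment_seconds * multiplier


for seg in (6, 2, 1):
    print(f"{seg}-second segments -> ~{estimated_glass_to_glass(seg):.0f} s of latency")
# 6-second segments -> ~18 s of latency
# 2-second segments -> ~6 s of latency
# 1-second segments -> ~3 s of latency
```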
The very act of preparing the video for distribution takes time. Raw video from the camera must be converted into multiple compressed formats and bitrates (e.g., 1080p, 720p, 480p). This process, known as encoding and transcoding, requires the encoder to process a chunk of video before it can output the first playable segment, adding a constant overhead to the overall delay. Faster encoding presets reduce latency at the cost of lower video quality or larger file sizes, while higher-quality compression and a greater number of output renditions require more processing time, further contributing to the stream's inherent latency.
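For illustration, the sketch below builds a small ABR ladder with FFmpeg (assumed to be installed on the system). The bitrates, resolutions, preset, and file names are placeholder choices, and a production workflow would normally use a single multi-output command or a dedicated transcoder rather than one pass per rung.

```python
# Minimal sketch of generating an ABR ladder with FFmpeg (assumed installed).
# Bitrates, resolutions, and the preset are illustrative placeholders.
import subprocess

LADDER = [
    ("1080p", "1920x1080", "5000k"),
    ("720p",  "1280x720",  "3000k"),
    ("480p",  "854x480",   "1200k"),
]

for name, resolution, bitrate in LADDER:
    subprocess.run([
        "ffmpeg", "-i", "input.mp4",
        "-c:v", "libx264",
        "-preset", "veryfast",   # faster presets lower encode delay at some quality cost
        "-b:v", bitrate,
        "-s", resolution,
        "-c:a", "aac",
        "-f", "hls",
        "-hls_time", "6",        # 6-second segments: the common high-latency default
        f"out_{name}.m3u8",
    ], check=True)
```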
Where Latency Is Introduced During Distribution
The next stage, Distribution, involves delivering the video segments to viewing devices, most often through a Content Delivery Network (CDN). CDNs are essential for reliability and for distributing streams globally, but they introduce latency (sometimes 15 to 30 seconds in older configurations) in order to cache enough content across their distributed network. This caching lets the CDN absorb traffic spikes and guarantee that segments are available quickly and reliably to millions of simultaneous viewers, at the cost of significant end-to-end delay.
Network congestion across global data centers, even in optimal conditions, can still add milliseconds of latency. More significantly, the origin server must wait for segments to be fully created before making them available, so any latency introduced at ingest creates downstream bottlenecks for distribution.
How Playback Adds Latency to Optimize Viewer Experiences
Finally, at the viewer's end, Playback introduces the last opportunity for delay. Players are engineered to be robust against momentary network hiccups: before starting playback, the player must load a minimum number of segments (often 2 to 3) to create a safe cushion of video data. This buffer exists to smooth out network jitter, but the pre-loading creates a noticeable startup delay and adds to the overall latency.
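Putting the three stages together, a rough latency budget might look like the sketch below. Every figure is an illustrative assumption, not a measurement from any particular workflow, but it shows how the delays compound across ingest, distribution, and playback.

```python
# Illustrative glass-to-glass latency budget; all figures are assumptions.
SEGMENT_SECONDS = 6.0

budget = {
    "encode/transcode overhead":   1.0,                  # encoder buffers frames before output
    "segment creation at origin":  SEGMENT_SECONDS,      # origin waits for a complete segment
    "CDN caching and propagation": 2.0,
    "player startup buffer":       2 * SEGMENT_SECONDS,  # players often hold 2-3 segments
}

total = sum(budget.values())
for stage, seconds in budget.items():
    print(f"{stage:<30} {seconds:>5.1f} s")
print(f"{'total (approx.)':<30} {total:>5.1f} s")
```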
The Business Case for Sub-Second, Ultra-Low Latency
The push toward sub-second or real-time streaming with ultra-low latency is a powerful business imperative driven by scenarios where every fraction of a second translates directly into profit, safety, or engagement. Where up to 30 seconds of latency is sometimes passable, sub-second latency is critical for certain use cases.
Key Scenarios Where Delay is Costly
The core business benefit of sub-second streaming is eliminating the costs of delay: lost revenue, decreased productivity, or compromised safety.
Live Auctions & Real-Time Bidding
For platforms hosting live auctions (whether for fine art, collectibles, or digital assets), sub-second streaming is absolutely critical. It ensures fair and timely bidding and levels the playing field for all remote participants. Any delay can mean a bidder misses the closing moment, resulting in lost revenue for the auction house. Additionally, the immediate response builds a sense of urgency and excitement among participants that is crucial to stimulating purchases.
Corporate & Educational Events
In professional settings, productive communication is paramount. During Corporate Town Halls, Training Sessions, or Educational Webinars, a significant delay hinders any real-time interaction. When a participant asks a question via chat or video, a speaker’s delayed response creates an awkward gap that breaks the flow of discussion. Sub-second latency facilitates productive Q&A and collaboration. This allows for a genuine, natural, back-and-forth dialogue that maximizes knowledge retention and engagement. The virtual event, in turn, feels as responsive as an in-person meeting.
Remote Facility Surveillance & Operations
For monitoring critical infrastructure, manufacturing lines, or remote construction sites, latency directly impacts response time and safety. Lower latency ensures a prompt response to any events, anomalies, or security breaches that occur. This is especially crucial if the facility is in a remote location. In operational technology (OT) settings, where streams might be used to control machinery, near-instant feedback is vital for preventing errors and maintaining precision.
Transportation & Traffic Cameras
Utilizing live video feeds for traffic management and public safety relies heavily on real-time data. Constant, low-latency monitoring of congestion and traffic patterns enables authorities to make immediate, critical decisions. This could include adjusting signal timing, deploying emergency services, or diverting traffic. Monitoring and optimizing these areas saves time, fuel, and potentially lives. A 10-second delay in identifying a sudden accident could mean a 10-second delay in dispatching help.
How To Reduce Video Streaming Latency
To effectively reduce latency, engineers must either tune existing HTTP-based protocols or adopt entirely new protocols built from the ground up for real-time performance. In some cases, this could mean moving from standard HLS/DASH to solutions like WebRTC or specialized low-latency standards like LL-HLS. This pursuit of speed involves trade-offs between compatibility, scale, and complexity.
HTTP-Based Streaming Innovations
Standard HTTP Live Streaming (HLS) and DASH are inherently high-latency due to their large segment sizes. However, industry innovators have developed extensions to make these protocols viable for low-latency delivery while retaining their massive scalability advantages.
Low-Latency HLS (LL-HLS) and Low-Latency DASH (LL-DASH) are modern solutions, championed by Apple and others, that directly address the segment-size problem by introducing partial segments, or "chunks." Instead of waiting for a full, large segment (e.g., 6 seconds) to be generated, segments are divided into smaller, immediately transferable parts (e.g., 200 milliseconds). This dramatically reduces the waiting time and brings latency down to approximately 2 to 5 seconds. The key advantage of LL-HLS and LL-DASH is that they offer high compatibility with existing CDNs and players while providing substantial latency reduction.
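The sketch below shows why smaller parts matter, comparing a rough latency estimate for full 6-second segments against 200-millisecond parts. The hold-back multiplier (LL-HLS typically holds back around three part durations) and the fixed pipeline overhead are illustrative assumptions.

```python
PIPELINE_OVERHEAD_S = 2.0  # illustrative allowance for encode, CDN, and playlist refresh


def hold_back_latency_estimate(unit_seconds: float, units_held_back: int = 3) -> float:
    """Rough estimate: latency scales with the size of the unit the player waits on,
    plus a fixed overhead for the rest of the pipeline (all figures illustrative)."""
    return units_held_back * unit_seconds + PIPELINE_OVERHEAD_S


print(f"6 s full segments: ~{hold_back_latency_estimate(6.0):.1f} s")  # ~20.0 s
print(f"200 ms parts:      ~{hold_back_latency_estimate(0.2):.1f} s")  # ~2.6 s
```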
Tuning Standard HLS is an option used in environments where the newer LL-HLS/LL-DASH specifications aren't supported. It simply reduces the segment duration of standard HLS (e.g., to 0.5 or 1 second). While this provides a quick latency win, it introduces significant risks: the CDN has many more files to manage, leading to increased server costs, and the client faces a higher risk of playback errors or stalling if network conditions are poor.
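The file-count cost of simply shrinking segments is easy to quantify. The sketch below counts the media files produced per hour for a single rendition, a rough proxy for the extra load on the origin and CDN.

```python
def segments_per_hour(segment_seconds: float) -> int:
    """Number of media segments one rendition produces per hour of live content."""
    return int(3600 / segment_seconds)


for seg in (6, 2, 1, 0.5):
    print(f"{seg}-second segments: {segments_per_hour(seg):>5} files/hour per rendition")
# 6-second segments:     600 files/hour per rendition
# 2-second segments:    1800 files/hour per rendition
# 1-second segments:    3600 files/hour per rendition
# 0.5-second segments:  7200 files/hour per rendition
```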
Ultra-Low Latency Standards
For mission-critical interactive applications where sub-second performance is mandatory, completely different protocols are required that bypass the limitations of segment-based HTTP streaming.
WebRTC (Web Real-Time Communication) is the premier sub-second protocol, offering true real-time capabilities. WebRTC was originally designed for two-way, peer-to-peer (P2P) video conferencing, so it focuses on instant data transfer rather than large-scale distribution. It can achieve latency as low as 100 to 500 ms, making it ideal for live video chat, interactive classrooms, and real-time surveillance or monitoring. The primary drawback has been the complexity of scaling to massive one-to-many audiences, though specialized WebRTC CDN solutions are available. Unlike HTTP-based protocols, which are stateless and easily cacheable, WebRTC relies on stateful UDP connections, which makes it incredibly fast but difficult to scale cost-effectively to millions of concurrent viewers.
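To show what "stateful" means in practice, here is a minimal publish sketch using the third-party aiortc library (an assumption for illustration, not something this post prescribes). WebRTC does not define signaling, so the send/receive callables are placeholders for whatever signaling channel your platform provides.

```python
# Minimal WebRTC publish sketch using the third-party aiortc library (an assumption).
# send_offer/receive_answer are placeholders for your own signaling channel
# (WebSocket, HTTP, etc.), which WebRTC itself does not define.
from aiortc import RTCPeerConnection, RTCSessionDescription
from aiortc.contrib.media import MediaPlayer


async def publish(send_offer, receive_answer):
    pc = RTCPeerConnection()                              # stateful peer connection (UDP underneath)
    player = MediaPlayer("/dev/video0", format="v4l2")    # placeholder capture device
    pc.addTrack(player.video)

    offer = await pc.createOffer()
    await pc.setLocalDescription(offer)
    await send_offer(pc.localDescription)                 # hand the SDP offer to your signaling layer
    answer: RTCSessionDescription = await receive_answer()
    await pc.setRemoteDescription(answer)                 # media flows once ICE connectivity completes
```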
SRT (Secure Reliable Transport) is a protocol designed specifically for reliably transporting high-quality video over unreliable networks, like the public internet. SRT is most often used for the ingest or contribution phase, delivering the stream from the source to the origin server. It achieves low latency (typically 150 ms to 3 seconds) by using advanced retransmission techniques to quickly recover from packet loss without adding excessive buffering.
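A common sizing heuristic for SRT is to set the receiver latency to a few multiples of the link's round-trip time, so lost packets can be retransmitted and still arrive before their play-out deadline. The sketch below applies that heuristic; the 4x multiplier is an assumed starting point, and the right value depends on measured packet loss.

```python
# Back-of-the-envelope SRT latency sizing. The multiplier is an assumed starting
# point; tune it based on measured RTT and packet loss on your contribution link.
SRT_DEFAULT_LATENCY_MS = 120  # typical library default


def suggested_srt_latency_ms(rtt_ms: float, multiplier: float = 4.0) -> float:
    """Never go below the default; otherwise allow a few RTTs for retransmission."""
    return max(SRT_DEFAULT_LATENCY_MS, rtt_ms * multiplier)


for rtt in (20, 80, 250):
    print(f"RTT {rtt:>3} ms -> SRT latency ~{suggested_srt_latency_ms(rtt):.0f} ms")
# RTT  20 ms -> SRT latency ~120 ms
# RTT  80 ms -> SRT latency ~320 ms
# RTT 250 ms -> SRT latency ~1000 ms
```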
Mastering the Latency Trade-Off
The delivery protocol an architect chooses is one of the most critical decisions they must make. The balancing act pits latency against cost, scale, and complexity. From ingest through encoding, distribution, and final playback, every technical decision is a choice between speed (low latency) on one side and scalability and reliability (broad compatibility and stable playback) on the other.
While protocols like standard HLS ensure massive global reach and highly stable viewing, they achieve this by intentionally introducing significant delay. On the other hand, cutting-edge protocols like WebRTC deliver the immediate, sub-second performance required for near real-time interaction, but may not scale well on their own.
Therefore, the ultimate strategy for a successful live video infrastructure is supporting a wide array of protocols. The choice of protocol (such as WebRTC or LL-HLS) must align directly with the use case. If the goal is a non-interactive linear broadcast (like a TV channel), a low-cost, high-latency solution is sufficient. However, if interactivity is key, sub-second latency is non-negotiable. Taking advantage of both, leveraging WebRTC for ingest and HLS for delivery to public audiences, can yield significant benefits.
By meticulously identifying and eliminating bottlenecks at every stage of the pipeline and selecting the appropriate technology, you can successfully navigate the tiers of lag and deliver an immersive, real-time experience. To explore how you can deploy an optimized, low-latency streaming workflow tailored precisely to your application’s needs, contact the Wowza Streaming Engine experts today for a demo of our flexible low-latency media technology solutions.