Scaling and Load Balancing

When streaming live and on-demand (VOD) content from your Wowza™ media server, there's a maximum number of concurrent streams that the media server can handle comfortably before it starts to slow down or get overloaded. The number varies between media servers based on factors including hardware specification, network configuration, stream type, stream bitrate, and connection types. The only sure way to determine the limits for a particular configuration is to perform load tests that show when performance is likely to degrade. When the limit is likely to be reached, it's a good idea to scale your Wowza media server configuration to increase capacity. In addition to scaling, you can incorporate a load-balancing system to offload player connections from more heavily loaded servers to less-loaded servers.


What is scaling?
When to scale?
What options are available for scaling?
Load balancing
Scaling and load balancing scenarios

What is scaling?

In simple terms, scaling means using two or more media servers to make your streams available to a larger audience. Servers arranged in this way are usually referred to as a "server pool" or "server cluster." Depending on hardware and infrastructure, servers can be added to or removed from the cluster manually based on estimated peak loads or automatically based on real-time metrics. Different methods can be used to scale, depending on stream type and target audience, and it's possible to combine methods to suit your needs. Some form of load balancing is also generally required to route player connections to the correct servers in order to balance the cluster.

When to scale?

There isn't an easy answer as to when to scale a video system. When designing or specifying a new video system, it's best to do so with scaling in mind. To help determine when and how best to scale, consider the following areas:
  • The number of video streams that would be available concurrently and their bitrates.
  • The estimated audience size and geographical location. Can you control the number of viewers and their location? What would happen if one or more of your videos went viral?
  • The short-term and long-term project budget. For your deployment, is it more cost-effective to keep systems online 24/7 to handle peak loads or to use a cloud service that allows you to bring systems online as demand increases and take them offline when demand is low?
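As a rough starting point for the first two questions, a bandwidth-only estimate can be sketched as follows. This is an illustrative calculation, not a Wowza tool: the function name, the 70% headroom factor, and the example figures are assumptions, and CPU, disk, and protocol overhead are ignored. Only a load test shows the real limit.

```python
import math


def servers_needed(viewers: int, bitrate_kbps: float, nic_mbps: float,
                   headroom: float = 0.7) -> int:
    """Estimate server count from outbound bandwidth alone.

    viewers      -- expected peak concurrent viewers
    bitrate_kbps -- bitrate of one stream, in kilobits per second
    nic_mbps     -- outbound network capacity per server, in megabits per second
    headroom     -- fraction of that capacity you are willing to use (assumed 0.7)
    """
    if viewers <= 0:
        return 0
    usable_kbps = nic_mbps * 1000 * headroom
    viewers_per_server = int(usable_kbps // bitrate_kbps)
    if viewers_per_server < 1:
        raise ValueError("a single stream exceeds the usable bandwidth of one server")
    return math.ceil(viewers / viewers_per_server)


# Example: 5,000 viewers of a 2,000 kbps stream on servers with 1 Gbps uplinks.
print(servers_needed(5000, 2000, 1000))
```

Rerunning the estimate with a "viral" audience figure gives a quick sense of how much extra capacity would have to be brought online, which feeds directly into the budget question above.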

What options are available for scaling?

Several options are available for scaling, and sometimes more than one is used at the same time to handle different demands. The following articles can help you decide when and how to scale your Wowza media server configuration:

Wowza live stream repeater

Applies to: Live streaming

A live stream repeater (origin/edge) configuration is a system where multiple Wowza media servers are configured as either origin or edge servers. Live encoders that produce the content connect to the origin servers and players connect to the edge servers. The edge servers connect to the origin servers in order to repeat the live stream. A live stream repeater configuration can be used for all types of live video streaming. To increase capacity ("scale up"), you can just add more edge servers to the cluster and point them to the streams originating from the origin servers. A separate load balancer is typically used to direct the players to the correct edge server in the cluster.

Article: Scaling with Wowza live stream repeater

Wowza Media Cache

Applies to: Video On Demand streaming

The Media Cache feature in Wowza Streaming Engine software can be used on multiple Wowza media servers to provide access to video on demand (VOD) content from a central storage location. Each media server that uses Media Cache can access the same content without having to store the content locally. As with live streaming, scaling up is just a matter of adding new media servers to the cluster and configuring them to access the central media storage. A load balancer can be used in the same way that it would be for live content.

Article: Scaling with Wowza Media Cache

Wowza HTTP Origin mode

Applies to: Live and Video On Demand streaming

Normally, when you enable an HTTP-based playback type (Adobe HDS, Apple HLS, Microsoft Smooth Streaming, or MPEG-DASH), Wowza Streaming Engine associates every separate HTTP connection with a different internal session so that each one can be tracked and different options can be used for the session. After the initial stream request, subsequent requests have the unique session ID attached.

Wowza HTTP Origin mode is a special session-less mode that enables regular HTTP caching proxies to be used in front of the Wowza media server to cache the live or VOD content, thus reducing the load on the Wowza media server. As the name indicates, HTTP Origin mode is only available for HTTP player connections.
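The effect of session-less requests on a caching proxy can be illustrated with a small simulation. This is a sketch under stated assumptions: the URL layout and the `session` query parameter are hypothetical and are not the actual request format used by Wowza Streaming Engine. The point is only that a proxy keyed on the full URL can share cached segments between viewers when the URLs carry no per-viewer session ID.

```python
class CachingProxy:
    """Toy HTTP cache keyed on the full request URL."""

    def __init__(self):
        self.cache = {}
        self.origin_fetches = 0

    def get(self, url: str) -> str:
        if url not in self.cache:
            # Cache miss: the request has to go back to the media server.
            self.origin_fetches += 1
            self.cache[url] = f"bytes-for-{url}"
        return self.cache[url]


def segment_url(segment: int, session_id=None) -> str:
    # Hypothetical URL shape, used for illustration only.
    base = f"/live/stream/segment_{segment}.ts"
    return base if session_id is None else f"{base}?session={session_id}"


def origin_fetches(viewers: int, segments: int, sessionless: bool) -> int:
    """Count how many requests reach the origin for a given audience."""
    proxy = CachingProxy()
    for viewer in range(viewers):
        for segment in range(segments):
            proxy.get(segment_url(segment, None if sessionless else viewer))
    return proxy.origin_fetches


print(origin_fetches(100, 10, sessionless=False))  # every request reaches the origin
print(origin_fetches(100, 10, sessionless=True))   # one origin fetch per segment
```

In this model, 100 viewers watching 10 segments cause 1,000 origin fetches with per-session URLs and 10 with session-less URLs.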

Article: Scaling with Wowza HTTP Origin applications

UDP multicast

Applies to: Live and Video On Demand streaming

If the intended audience is on a controlled private network, you can use UDP multicast to deliver live streams to a wider audience. Multicast differs from regular scaling in that routers in the network provide the capacity instead of a media server cluster. Multicast generally only works in private networks where the routers can be configured to allow multicast traffic to traverse the network. A single stream is sent out from the Wowza media server to the multicast address and the players all connect to the multicast address to receive the stream. Wowza Streaming Engine supports both RTP and MPEG-TS streaming over UDP multicast; however, RTP is recommended due to MPEG-TS streaming licensing requirements.
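To show what "players all connect to the multicast address" means at the socket level, here is a minimal receiver sketch using Python's standard `socket` module. The group address `239.1.1.1` and port `10000` are placeholder assumptions, not values from any Wowza configuration, and a real player would go on to parse the RTP or MPEG-TS payload.

```python
import socket


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq value: group address, then local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_multicast_receiver(group: str, port: int) -> socket.socket:
    """Open a UDP socket and ask the network to deliver the group's traffic."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group triggers an IGMP membership report; routers that
    # allow multicast then forward the single stream to this host.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


# Usage (placeholder address and port; requires a multicast-enabled network):
#   sock = open_multicast_receiver("239.1.1.1", 10000)
#   packet, sender = sock.recvfrom(2048)
```

Every receiver runs the same join, and the server still sends only one copy of the stream, which is why capacity comes from the routers rather than from additional media servers.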

Load balancing

Load balancing is used to distribute the player connections between servers in a cluster. Simple load balancing can be performed using certain types of DNS services. Software or hardware solutions specifically designed for load balancing are also available. A properly designed load balancing solution should allow easy addition and removal of servers from the cluster and also allow for failure detection. Each type of load balancing has its own strengths and weaknesses and must be chosen to match the server cluster.
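The three properties named above (easy addition, easy removal, and failure detection) can be sketched with a small least-connections balancer driven by heartbeats. This is an illustrative model only: the class names, the 10-second timeout, and the selection rule are assumptions and do not describe how Wowza Load Balancer or any specific product is implemented.

```python
import time
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    connections: int = 0
    last_heartbeat: float = 0.0


class LoadBalancer:
    """Least-connections selection over servers that report heartbeats."""

    def __init__(self, heartbeat_timeout: float = 10.0):
        self.servers = {}
        self.heartbeat_timeout = heartbeat_timeout

    def heartbeat(self, name: str, connections: int, now=None) -> None:
        # A first heartbeat adds the server to the cluster; later ones refresh it.
        now = time.monotonic() if now is None else now
        self.servers[name] = Server(name, connections, now)

    def remove(self, name: str) -> None:
        self.servers.pop(name, None)

    def pick(self, now=None):
        """Return the least-loaded live server name, or None if none are alive."""
        now = time.monotonic() if now is None else now
        alive = [s for s in self.servers.values()
                 if now - s.last_heartbeat <= self.heartbeat_timeout]
        if not alive:
            return None
        best = min(alive, key=lambda s: s.connections)
        best.connections += 1  # count the connection we are about to redirect
        return best.name


lb = LoadBalancer()
lb.heartbeat("edge1", connections=5, now=0.0)
lb.heartbeat("edge2", connections=2, now=0.0)
print(lb.pick(now=1.0))   # edge2 has fewer connections
lb.heartbeat("edge1", connections=5, now=20.0)
print(lb.pick(now=25.0))  # edge2 has missed its heartbeats, so edge1 is chosen
```

A server that stops sending heartbeats simply drops out of the candidate list, so failure detection and manual removal lead to the same outcome for players.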

Article: Load balancing overview

Scaling and load balancing scenarios

Occasional live events with a mixture of players

In this scenario, a single server is deployed for day-to-day streaming. Every now and then, a larger event that requires more than one server is streamed. The players are a mixture of RTMP, RTSP, and HTTP players.

The best way to handle this type of scenario would be to have Wowza Load Balancer configured and running on the original server. When there is only one server in the cluster (the original server) the load balancer will redirect all connections to the streaming application on itself.

When required, extra servers can be configured as live stream repeater edges and pointed to the original server application and added to the cluster. The load balancer automatically detects these and starts redirecting connections to them. After the event is finished, the extra servers can be taken offline until they are next required.

If the event is short-lived, it's best to use cloud instances of Wowza Streaming Engine for the extra servers. They can be brought online and taken down quickly, you pay only for the time they are running, and no hardware has to be purchased for the additional servers.

Occasional live events with mainly HTTP players

This scenario is similar to the one above except that most of the players are HTTP-based. The best way to handle this type of scenario is to configure the Wowza application as an HTTP Origin application and use Amazon CloudFront to handle all HTTP connections for the event. If the event is regional, then CloudFront should be configured to use the closest region. All RTMP and RTSP connections would go directly to the Wowza media server while all HTTP connections would be routed to CloudFront for playback.

Large VOD library with constant new releases

In this scenario, there is a large video library with new releases that are very popular. There are a couple of ways of handling this, depending on how many players of each type there are.

As a minimum, Media Cache should be used to connect at least one Wowza media server to the video library using HTTP. This allows robust HTTP-based storage (such as multiple web servers) or a cloud solution (like Amazon S3 or Rackspace Cloud Files) to be used.

Multiple Wowza media servers can be connected to the video library as required, using a solution similar to the first scenario above. Alternatively, if the players are mainly HTTP-based, CloudFront can be used to connect to the Wowza media server. When CloudFront is used with VOD streams, the regions should be limited on the CloudFront distribution to closely match the intended audience. This ensures that the content is cached effectively on the CloudFront distribution.

Small to medium server cluster

In this scenario, a cluster of Wowza media servers is configured to handle regular loads of up to 50,000 concurrent connections. There is control over the players and the webpages that launch them, so compatibility with Wowza Load Balancer can be assured.

One or two servers in the cluster are configured as live stream repeater origins and the rest are configured as live stream repeater edges. The origins are also configured as Wowza Load Balancers.

Round Robin DNS would be used to resolve the origin and load balancer domain names.

Encoders would connect to the origins and the players would connect to the edges via the load balancer.

Media Cache would be used on the edge servers to retrieve VOD content from a central video library.

HTTP connections could be served using a cluster of HTTP caching proxies or CloudFront instead of many Wowza media servers. Regular proxies would require a different load balancer, and CloudFront could prove too expensive over a long period of heavy use.

New clusters could be started in a different region where Geographical DNS can be used to direct connections to the closest region.
Note: Custom code may be required to allow the edges to locate the origin streams.

Large server cluster

This scenario is similar to the one above, except that the cluster is sized to handle more than 50,000 connections. There may not be control over the players or the webpages that launch them, so they may not be compatible with Wowza Load Balancer.

The main difference between this and the above solution is how the load balancing is done. Instead of Wowza Load Balancer, one or two third-party load balancers are used to handle the high number of connections. There may be a number of players that aren't compatible with Wowza Load Balancer but will work with the third-party load balancer.

Again, HTTP caching proxies or CloudFront could be used for the HTTP connections.