WebRTC Server: What It Is and Why You Need One

January 4, 2023 by Sydney Whalen

The Web Real-Time Communications (WebRTC) protocol has been making waves with its promise of ultra-low latency streaming as the demand for interactive video continues to grow. WebRTC is also popularly known for not requiring a server to stream in real time between peers. However, the relationship between WebRTC and servers is more complex than it first seems, especially if you’re hoping to stream to wider audiences. 

In this article, we’ll touch on the different types of WebRTC servers and when you might need them. In particular, we’ll discuss the myriad benefits of media servers for a variety of WebRTC workflows and what you can do to take advantage of these benefits for your streaming solution. 

 

Do I Need a WebRTC Server? 

That entirely depends on what you are trying to accomplish. Let’s take a moment to break down how WebRTC works and why it’s often claimed that no servers are necessary. WebRTC uses three JavaScript APIs to capture, encode, and transmit data, eliminating the need for intermediary servers that might otherwise fulfill these functions. 

  • getUserMedia API: Lets a web page capture raw audio and video from the user’s own webcam and microphone. 
  • RTCPeerConnection API: Takes this raw data and encodes it for transmission. It also establishes the peer-to-peer connection and transmits the encoded media from one peer to another.
  • RTCDataChannel API: Transmits other types of data, including text and arbitrary application data, between peers. 

When it comes to a basic peer-to-peer connection, these APIs get the job done. However, they are insufficient on their own in most cases. This is particularly true if you want to transmit to a wider audience or traverse a NAT device. Even a standard browser-based peer-to-peer connection technically relies on an application server: the web server that delivers the page running the WebRTC code. 

Really, there’s no way to truly use WebRTC without any server. Even if you were transmitting from peer to peer over a local area network (LAN) connection and with access to both computers’ IP and port information, you’d need some way to host the application. So now that we’ve disabused you of the idea that WebRTC is a server-free technology in any practical sense of the term, let’s explore what the different WebRTC servers are and when you might need each one. 


Types of WebRTC Servers

There are four main types of servers you might encounter when using WebRTC. In this section, we’ve provided a brief overview of each, what they do, and when they are necessary. 

[Figure: Comprehensive WebRTC workflow with servers]

What Is a WebRTC Application Server?

We’ve touched on this a bit above. An application server quite simply hosts applications. For WebRTC, the application server is typically the web server hosting the site that runs the service. Sure, it isn’t technically part of the WebRTC connection itself, but because WebRTC is a browser-based technology, your service isn’t going to run without one. 

Is a WebRTC application server necessary? Yes. Even if you decide to take your WebRTC solution to a LAN setting, you still need some way to host the service. 

 

What Is a WebRTC Signaling Server? 

Signaling in WebRTC is the process by which client devices establish a connection. Basically, these devices need to agree to talk to one another before they can send and receive data. And to come to an agreement, they need to know how to “find” each other. 

A device sends a Session Description Protocol (SDP) offer, along with its Interactive Connectivity Establishment (ICE) candidates, which list the IP addresses and ports where it can be reached, to a signaling server. This server passes the offer along to the other device. It also relays the SDP answer and the other device’s ICE candidates back to the first peer. 
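To make the "IP and port information" concrete, here is a minimal sketch of pulling those fields out of a single ICE candidate line. It assumes the usual field order of the candidate attribute (foundation, component, transport, priority, address, port, then `typ` and the candidate type). The function name, the interface, and the sample candidate are all illustrative, and the address comes from a documentation-only range.

```typescript
// Fields carried by one ICE candidate line.
interface IceCandidateInfo {
  foundation: string;
  component: number; // 1 = RTP, 2 = RTCP
  transport: string; // "udp" or "tcp"
  priority: number;
  address: string;
  port: number;
  type: string; // "host", "srflx" (learned via STUN), "relay" (via TURN), or "prflx"
}

function parseIceCandidate(line: string): IceCandidateInfo {
  // Accept the line with or without the SDP "a=" prefix.
  const body = line.trim().replace(/^a=/, "");
  if (body.indexOf("candidate:") !== 0) {
    throw new Error("not an ICE candidate line");
  }
  const parts = body.slice("candidate:".length).split(/\s+/);
  // Expected order: foundation component transport priority address port "typ" type ...
  if (parts.length < 8 || parts[6] !== "typ") {
    throw new Error("malformed ICE candidate line");
  }
  return {
    foundation: parts[0],
    component: Number(parts[1]),
    transport: parts[2].toLowerCase(),
    priority: Number(parts[3]),
    address: parts[4],
    port: Number(parts[5]),
    type: parts[7],
  };
}

// Made-up candidate for illustration only.
const exampleCandidate = parseIceCandidate(
  "a=candidate:1 1 udp 1677729535 203.0.113.7 54321 typ srflx raddr 192.168.1.10 rport 50000"
);
```

Here `exampleCandidate.address` and `exampleCandidate.port` hold the address and port the remote peer would try to reach.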

[Figure: WebRTC NAT and signaling workflow]

Is a WebRTC signaling server necessary? Let’s put it this way: what you NEED is to relay SDP information between devices to establish a connection. If you have your IP address and port information readily available, you can establish a connection any way that makes sense, be it paper, phone, or carrier pigeon. At the end of the day, it’s just a piece of text. However, this isn’t practical for most people, making a signaling server effectively essential for your WebRTC workflow.
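Because signaling is just passing text between peers, the core of a signaling server can be very small. The sketch below is an in-memory stand-in, not a production design: `SignalingRelay`, `SignalMessage`, and the peer names are hypothetical, and a real deployment would carry these messages over something like WebSockets or HTTP rather than direct callbacks.

```typescript
// A signaling message is opaque text plus routing information.
interface SignalMessage {
  from: string;
  to: string;
  kind: "offer" | "answer" | "candidate";
  payload: string; // SDP text or a candidate line; the relay never inspects it
}

type Deliver = (msg: SignalMessage) => void;

class SignalingRelay {
  private peers = new Map<string, Deliver>();

  // Register a peer and the callback used to deliver messages to it.
  join(peerId: string, deliver: Deliver): void {
    this.peers.set(peerId, deliver);
  }

  leave(peerId: string): void {
    this.peers.delete(peerId);
  }

  // Forward a message to its addressee. Returns false if that peer is unknown.
  send(msg: SignalMessage): boolean {
    const deliver = this.peers.get(msg.to);
    if (!deliver) return false;
    deliver(msg);
    return true;
  }
}

// Usage: Alice sends an offer, Bob replies with an answer.
const relay = new SignalingRelay();
const aliceInbox: SignalMessage[] = [];
const bobInbox: SignalMessage[] = [];
relay.join("alice", (m) => { aliceInbox.push(m); });
relay.join("bob", (m) => { bobInbox.push(m); });

relay.send({ from: "alice", to: "bob", kind: "offer", payload: "v=0 (SDP offer text)" });
relay.send({ from: "bob", to: "alice", kind: "answer", payload: "v=0 (SDP answer text)" });
```

The relay never parses the payload, which is why the same job could in principle be done by any channel that moves text from one peer to the other.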

 

What Is a WebRTC NAT Traversal Server?

It sounds like it should be simple: connecting two or more peers remotely. However, the process is more complicated than it first seems thanks to Network Address Translation (NAT) devices. A device behind a NAT knows only its private address; the NAT hides the public internet protocol (IP) address and port that other peers would need to reach it. Before a device can advertise a reachable address in its ICE candidates, it must learn that public address. That’s where NAT traversal comes in. 

WebRTC STUN Server

The first method of NAT traversal is known as Session Traversal Utilities for NAT (STUN). Put simply, a client device sends a request to a STUN server on the public internet. The server sees the public IP address and port the request arrived from, as assigned by the NAT, and reports them back in its response. The client can then include this address as an ICE candidate in the information it sends over the signaling server. 
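As an illustration of what comes back from a STUN server, the sketch below decodes an IPv4 XOR-MAPPED-ADDRESS attribute value, the field in which a STUN response carries the client's public address. It follows the layout in RFC 5389 as I understand it (one reserved byte, one family byte, then the port and address XORed with the protocol's magic cookie). The function name and the sample bytes are illustrative, and a full client would also need to parse the message header and locate the attribute.

```typescript
// STUN magic cookie (0x2112A442) as bytes in network order.
const MAGIC_COOKIE = [0x21, 0x12, 0xa4, 0x42];

interface MappedAddress {
  ip: string;
  port: number;
}

// Decode the 8-byte value of an IPv4 XOR-MAPPED-ADDRESS attribute:
// [reserved, family, x-port (2 bytes), x-address (4 bytes)].
function decodeXorMappedAddressV4(value: Uint8Array): MappedAddress {
  if (value.length !== 8 || value[1] !== 0x01) {
    throw new Error("expected an 8-byte IPv4 XOR-MAPPED-ADDRESS value");
  }
  // The port is XORed with the top 16 bits of the magic cookie.
  const port = ((value[2] ^ MAGIC_COOKIE[0]) << 8) | (value[3] ^ MAGIC_COOKIE[1]);
  // Each address byte is XORed with the matching cookie byte.
  const octets = [0, 1, 2, 3].map((i) => value[4 + i] ^ MAGIC_COOKIE[i]);
  return { ip: octets.join("."), port };
}

// Hypothetical attribute value that encodes 203.0.113.7 on port 54321.
const stunReply = new Uint8Array([0x00, 0x01, 0xf5, 0x23, 0xea, 0x12, 0xd5, 0x45]);
const publicAddress = decodeXorMappedAddressV4(stunReply);
```

The decoded address is what the client would advertise as a server-reflexive (`srflx`) ICE candidate.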

[Figure: WebRTC STUN server workflow]

WebRTC TURN Server

If your NAT device is particularly strict, then STUN may not work for you. That’s where Traversal Using Relays around NAT (TURN) comes in. In this case, the peers give up on a direct connection and relay their media through a TURN server instead. TURN servers have public IP addresses, making them easy to connect to, and the relay address a client obtains is offered to the other peer as one more ICE candidate. Once both clients connect, they send media to one another using the TURN server as an intermediary.  

[Figure: WebRTC TURN server workflow]

Are WebRTC NAT traversal servers necessary? You need to be able to establish a connection with another device in order to send it media. If both devices are reachable at known public IP addresses, then you don’t need to worry about these fancy workarounds. Unfortunately for many, that’s a big “if.” 
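In practice, an application tells the browser which STUN and TURN servers to use through the configuration object passed to `RTCPeerConnection`. The sketch below builds such an object. The interfaces are local mirrors of the browser's `iceServers` and `iceTransportPolicy` fields, declared here so the snippet runs outside a browser, and the hostnames and credentials are placeholders rather than real servers.

```typescript
// Local mirrors of the shapes the browser expects in RTCPeerConnection's configuration.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServer[];
  iceTransportPolicy: "all" | "relay";
}

// Hypothetical hostnames and credentials; substitute your own servers.
function buildPeerConfig(forceRelay: boolean): PeerConfig {
  return {
    iceServers: [
      { urls: "stun:stun.example.org:3478" },
      {
        urls: [
          "turn:turn.example.org:3478?transport=udp",
          "turn:turn.example.org:3478?transport=tcp",
        ],
        username: "demo-user",
        credential: "demo-secret",
      },
    ],
    // "all" lets ICE consider host, STUN-derived, and relay candidates;
    // "relay" restricts it to candidates that go through the TURN server.
    iceTransportPolicy: forceRelay ? "relay" : "all",
  };
}

const peerConfig = buildPeerConfig(false);
// In a browser this would be used as: new RTCPeerConnection(peerConfig)
```

Listing both kinds of server lets the connection use a direct path when the NAT allows it and fall back to the relay when it does not.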

 

What Is a WebRTC Media Server?

By definition, a media server stores digital media and makes it available over a network. In the case of a peer-to-peer WebRTC connection, this server sits between the peers and acts as a multimedia middleman, taking in media from one end and sending it along to the other. In doing so, it makes things like transcoding and one-to-many streams possible. 

Is a WebRTC media server necessary? Technically, no, especially if you’re just using WebRTC for a one-to-one connection. However, media servers come with a myriad of benefits and make it possible to take advantage of numerous workflows. Let’s take a closer look at what a WebRTC media server can do for you.

 


Highlight on WebRTC Media Servers

First off, media servers can be a lot of different things. Literally any device or service that takes media, stores it, and makes it available to other devices is technically a media server. When it comes to WebRTC, media servers typically help to shoulder the load of high-volume data streams, making it possible to stream to larger audiences. This opens the door to a variety of alternative WebRTC workflows, including simulcasting and scalable video coding (SVC). 

 

Types of Media Servers

Your WebRTC media server will likely fall into one of two categories: selective forwarding unit (SFU) or multipoint control unit (MCU). Each of these media server types comes with different strengths. 

Multipoint Control Unit

The primary purpose of an MCU is to take media provided from peer devices and redistribute it as a single stream. Basically, it’s your quick fix for streaming to a larger group. Because it emits a standard signal, it can also be easily decoded and integrated into existing systems. However, it lacks the flexibility and scalability of an SFU since transcoding into a single stream takes a lot of CPU.

[Figure: MCU server illustrated. Source: Stream]

Selective Forwarding Unit

An SFU is, well, selective. It’s a bit more complex than an MCU as it receives media and then decides which media to send to other parties. It primarily differs from an MCU in that it’s not turning all media into a single stream. Instead, it chooses from multiple options according to certain criteria. A good example of this is in WebRTC simulcasting, where multiple versions of a stream are sent to an SFU for distribution to end-user devices according to their available bandwidth. In a more standard setup, the SFU takes in individual streams and sends them to all other users as individual streams. 

[Figure: SFU server illustrated. Source: Stream]
 

Workflows Enabled by Media Servers

The number one thing a media server allows you to take advantage of is one-to-many streaming. Technically, this is possible without the use of a media server. However, sending and receiving multiple streams can put a strain on an individual computer. Media servers act like a WebRTC peer on the server side and carry the load of collecting and sending this data to relieve said strain. SFU servers, in particular, also facilitate a handful of workflows aimed at improving stream quality and accessibility.
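The strain on an individual computer can be made concrete by counting streams. The sketch below compares a call with no media server (a full "mesh" of direct peer connections) against the SFU and MCU setups described earlier, assuming every participant both sends and receives and ignoring simulcast. The type and function names are illustrative.

```typescript
type Topology = "mesh" | "sfu" | "mcu";

interface StreamLoad {
  uploads: number; // streams each participant sends
  downloads: number; // streams each participant receives
}

// Per-participant stream counts for a call in which everyone sends and receives.
function perParticipantLoad(topology: Topology, participants: number): StreamLoad {
  if (!Number.isInteger(participants) || participants < 2) {
    throw new Error("a call needs at least two participants");
  }
  const others = participants - 1;
  if (topology === "mesh") {
    // No media server: one direct connection to every other participant.
    return { uploads: others, downloads: others };
  }
  if (topology === "sfu") {
    // One upload to the SFU, which forwards everyone else's stream individually.
    return { uploads: 1, downloads: others };
  }
  // MCU: one upload, one combined stream back.
  return { uploads: 1, downloads: 1 };
}

const meshLoad = perParticipantLoad("mesh", 5);
const sfuLoad = perParticipantLoad("sfu", 5);
const mcuLoad = perParticipantLoad("mcu", 5);
```

For a five-person call, each participant uploads four streams in a mesh but only one with either kind of media server. The MCU also cuts downloads to one, at the cost of the server-side transcoding noted above.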

WebRTC Simulcasting

Not to be confused with typical simulcasting, where one streams to multiple platforms at once, WebRTC simulcasting is a method by which media is encoded at a few different bitrates and selectively distributed to various end-user devices. In this case, the SFU’s job is to select the best bitrate for a given peer based on available bandwidth. This makes it easier to stream to a variety of devices on a range of bandwidths without sacrificing the integrity of the stream.
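The SFU's selection step can be sketched as a simple rule: forward the highest-bitrate encoding that fits within the viewer's estimated bandwidth. This is a simplified illustration, not how any particular SFU is implemented. The layer labels, bitrates, and function name are assumptions, and real servers also weigh factors such as resolution, frame rate, and how stable the bandwidth estimate is.

```typescript
interface SimulcastLayer {
  rid: string; // label for the encoding, e.g. "low", "mid", "high"
  bitrateKbps: number;
}

// Pick the highest-bitrate layer that fits the viewer's estimated bandwidth,
// falling back to the lowest layer when nothing fits.
function selectLayer(layers: SimulcastLayer[], availableKbps: number): SimulcastLayer {
  if (layers.length === 0) throw new Error("no layers published");
  const sorted = layers.slice().sort((a, b) => a.bitrateKbps - b.bitrateKbps);
  let chosen = sorted[0];
  for (const layer of sorted) {
    if (layer.bitrateKbps <= availableKbps) chosen = layer;
  }
  return chosen;
}

// Hypothetical set of encodings sent by one publisher.
const published: SimulcastLayer[] = [
  { rid: "high", bitrateKbps: 2500 },
  { rid: "low", bitrateKbps: 300 },
  { rid: "mid", bitrateKbps: 1000 },
];

const forMobileViewer = selectLayer(published, 1200); // the "mid" layer
const forDesktopViewer = selectLayer(published, 5000); // the "high" layer
```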

[Figure: WebRTC simulcasting workflow with SFU server]

WebRTC Scalable Video Coding

Similar to WebRTC simulcasting, scalable video coding makes multiple bitrates available for streaming. However, instead of receiving several distinct streams at different bitrates, the SFU receives a single stream with multiple bitrate layers. The SFU peels away layers of the stream as needed to accommodate the needs of different end-user devices. 
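The "peeling" can be sketched as keeping a prefix of the layer list: the base layer is always forwarded, and each enhancement layer is added only while the running total still fits the viewer's bandwidth. This is a simplified model under stated assumptions. The layer names and bitrates are made up, and real SVC streams distinguish spatial and temporal layers, which this sketch collapses into a single ordered list.

```typescript
interface SvcLayer {
  id: string;
  addedKbps: number; // extra bitrate this layer adds on top of the layers below it
}

// Layers are ordered base first. Each enhancement layer depends on all layers
// below it, so the SFU forwards a prefix of the list and drops the rest.
function layersToForward(layers: SvcLayer[], availableKbps: number): SvcLayer[] {
  if (layers.length === 0) return [];
  // Always forward the base layer; without it nothing can be decoded.
  const kept = [layers[0]];
  let total = layers[0].addedKbps;
  for (const layer of layers.slice(1)) {
    if (total + layer.addedKbps > availableKbps) break;
    kept.push(layer);
    total += layer.addedKbps;
  }
  return kept;
}

// Hypothetical layered stream: 300 kbps base plus two enhancement layers.
const svcStream: SvcLayer[] = [
  { id: "base", addedKbps: 300 },
  { id: "enhancement-1", addedKbps: 700 },
  { id: "enhancement-2", addedKbps: 1500 },
];

const constrainedViewer = layersToForward(svcStream, 1200); // base + enhancement-1
const unconstrainedViewer = layersToForward(svcStream, 3000); // all three layers
```

Unlike the simulcast case, the publisher here encodes and uploads once, and the SFU only decides how much of that single stream each viewer receives.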

[Figure: WebRTC scalable video coding workflow with SFU server]
 

Summary of Media Server Benefits

  1. Relieves pressure on media publishers / peer devices 
  2. Conserves resources
  3. Enables transcoding of data 
  4. Enables adaptive workflows like simulcasting and SVC
  5. Can sometimes add other complex features, such as server-side machine learning
 

WebRTC Media Servers and Wowza 

Getting started with a WebRTC media server doesn’t have to be complicated. Video solution providers like Wowza make it easy to build a WebRTC-based workflow that fits your needs. You can integrate our Wowza Streaming Engine into your existing infrastructure or opt for our cloud-based Wowza Video platform. 

 

Wowza Streaming Engine and WebRTC

Wowza Streaming Engine can ingest WebRTC streams for delivery to playback devices. It can also ingest non-WebRTC streaming protocols and transcode them into WebRTC streams for output. Our streaming engine also provides SSL/TLS encryption for your WebRTC stream and a range of configuration options.  

 

Wowza Video and WebRTC 

With Wowza Video, our cloud-based platform prepares data for delivery through a custom content delivery network (CDN), which acts as an SFU. In doing so, it makes sub-second latency streaming to a million users worldwide a reality. Real-Time Streaming at Scale recently added live to VOD via a content management system (CMS) to further enhance WebRTC streaming capabilities. 

[Figure: Wowza Real-Time Streaming at Scale workflow]

What WebRTC has in potential it lacks in inherent scalability (and thereby usability). Media servers and workflow solutions like those provided by Wowza give you the tools needed to make WebRTC work for you.

 


About Sydney Whalen

Sydney works for Wowza as a resident content writer and social media marketer, leveraging roughly a decade of experience in copywriting, technical writing, and content development. When observed in the wild, she can be found gaming, reading, hiking, parenting, overspending…