Real-Time Subtitles and Language Translation with Wowza Streaming Engine
Generate real-time, AI-powered subtitles and translations with Wowza Streaming Engine (WSE), and display them in a custom React web app using Wowza Flowplayer.
The code samples provided here are for demonstration purposes only, and are provided via the Wowza Public License.
How It Works
This demo uses two GitHub repos. The first configures Wowza Streaming Engine to perform the speech recognition and translation. The second provides a frontend React web app that shows how to embed the live stream and read the AI-generated text tracks, so you can build custom solutions that listen for specific keywords or produce summaries or transcripts of video streams.
Real-time speech-to-text:
Real-time language translation:
Speech recognition and machine translation
This demo integrates Wowza Streaming Engine with OpenAI's Whisper, a general-purpose speech recognition model, and LibreTranslate, a machine translation API. Both are available as open-source projects.
WSE embeds WebVTT timed text tracks in the stream: one containing the English transcription, and additional tracks containing translations in Spanish, French, German, and Japanese.
The wse-plugin-caption-handlers repo provides sample code showing how to implement this with Whisper and LibreTranslate, which is useful for on-premises, secure, edge, or offline scenarios.
Alternatively, the sample can be configured to use the Azure speech-to-text service. This serves as an example of how to connect Streaming Engine to other cloud services, which may offer domain-specific speech recognition models for industries such as legal or healthcare.
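For reference, LibreTranslate exposes a simple HTTP API. The Streaming Engine plugin calls it server-side, but the sketch below illustrates the shape of a `/translate` request against a local instance on LibreTranslate's default port 5000. The function name and target language here are placeholders, not part of the demo repos.

```typescript
// Sketch only: translate one caption line with a local LibreTranslate instance.
// Assumes LibreTranslate is reachable at http://localhost:5000 (its default port);
// the WSE plugin performs equivalent calls on the server, so this is illustrative.
async function translateCaption(text: string, target: string): Promise<string> {
  const response = await fetch("http://localhost:5000/translate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ q: text, source: "en", target, format: "text" }),
  });
  if (!response.ok) {
    throw new Error(`LibreTranslate request failed: ${response.status}`);
  }
  const data = (await response.json()) as { translatedText: string };
  return data.translatedText;
}

// Usage: translateCaption("Welcome to the live stream", "es").then(console.log);
```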
Displaying subtitles and text tracks
The second GitHub repo, dev-guides, provides a
frontend React web app that embeds Wowza Flowplayer to display the live stream within a web page.
By default, Flowplayer overlays the embedded, synchronized text tracks on top of the video feed.
The web app can also read the text tracks to customize or store the output, or to listen for specific keywords in industry-specific scenarios.
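As a rough illustration of the keyword idea, the sketch below uses the standard HTML5 TextTrack API to watch the embedded WebVTT cues as they arrive. The element lookup and keyword list are hypothetical; adapt them to however your page exposes the player's `<video>` element.

```typescript
// Sketch only: watch the embedded WebVTT cues for specific keywords.
// The selector and keyword list below are hypothetical placeholders.
const KEYWORDS = ["goal", "breaking", "warning"];

function watchForKeywords(video: HTMLVideoElement): void {
  for (const track of Array.from(video.textTracks)) {
    // "disabled" tracks never deliver cues; "hidden" delivers cues without
    // changing which track the player renders on screen.
    if (track.mode === "disabled") track.mode = "hidden";
    track.addEventListener("cuechange", () => {
      const cues = track.activeCues;
      if (!cues) return;
      for (const cue of Array.from(cues)) {
        const text = (cue as VTTCue).text;
        const hit = KEYWORDS.find((keyword) => text.toLowerCase().includes(keyword));
        if (hit) {
          console.log(`Keyword "${hit}" at ${cue.startTime.toFixed(1)}s: ${text}`);
        }
      }
    });
  }
}

// Usage: watchForKeywords(document.querySelector("video")!);
```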
Step 1: Sign up for a trial and set up your environment
To get a trial license key and get this demo working on your local machine, follow the steps in the first developer guide, Creating a Live Stream.
This first developer guide is a good place to start if you are new to Wowza Streaming Engine and will help you deploy your first live stream. It also provides more details and context, such as how the demos utilize environment variables.
Specifically, you'll need to:
- Sign up for a Wowza Streaming Engine free trial
- Obtain a Wowza Flowplayer token
- Clone the sample frontend GitHub repo: https://github.com/WowzaMediaSystems/dev-guides.
- (Optional) Install the free and open-source OBS Studio to stream from your webcam
Step 2: Clone the Caption Handlers Repo
You'll also need to clone a second repo, wse-plugin-caption-handlers, which configures Wowza Streaming Engine to provide WebVTT subtitles.
https://github.com/WowzaMediaSystems/wse-plugin-caption-handlers
Step 3: Deploy Wowza Streaming Engine using Docker
Option 1: English-only subtitles
- Download and install Docker on your computer.
- Open a terminal window and go to the `wse-plugin-caption-handlers` directory.
- Create a `.env` file:
  - Copy the `.env.example` file to `.env`: `cp .env.example .env`
  - Open `.env` and add your Wowza Streaming Engine license key: `WSE_LICENSE_KEY=YOUR_LICENSE_KEY_HERE`
- The Docker Compose setup will automatically use the value from your `.env` file. You can also set your admin username and password in `docker-compose.yaml` if you want to change them from the default.
- Start the Docker images. In the terminal, run the following command: `docker compose up`

Option 2: Translated subtitles

Performing live translation uses more computer resources, so it is turned off by default. To enable this capability:

- Make a backup of the existing `docker-compose.yaml` file.
- Rename the `docker-compose-translate.yaml` file to `docker-compose.yaml`.
- Uncomment lines `75-78` in `conf/whisper/Application.xml`.
- Start the Docker images. In the terminal: `docker compose up`
Step 4: Start live streaming
Option 1: Stream live with OBS Studio
OBS Studio is free and open source software for video recording and live streaming. We'll use this to create a live stream from your local computer using your webcam and microphone.
- Open OBS Studio.
- Add a new Video Capture Device.


- Select the webcam on your computer.


- You should see your webcam feed now.


- Click on Settings (lower right) and click on `Stream`.
- In the Destination section, set the Server and Stream Key:
  - Server: `rtmp://localhost/whisper`
  - Stream Key: `myStream`
- Click OK, then click `Start Streaming`.
Option 2: Stream an existing video with ffmpeg
Alternatively, you can stream an existing video file using ffmpeg. The `test-videos` subfolder of the dev-guides repo includes a sample video called `barry_360p.mp4`.
- Download and install `ffmpeg` from https://www.ffmpeg.org/
- In a second terminal window, navigate to the `dev-guides/test-videos` folder
- Run this command to start streaming the video to Wowza Streaming Engine:

ffmpeg -re -y -stream_loop -1 -i ./barry_360p.mp4 \
  -vcodec libx264 -f flv rtmp://localhost/whisper/myStream
Your output should look similar to this:
Subtitles and language translation output
In the terminal where Docker is running, you should see the speech-to-text output. In the next step, we'll display this in a web app.
Step 5: Embed the live stream using Wowza Flowplayer
Open the /frontend subfolder in the dev-guides repo to see
how to embed Wowza Streaming Engine streams using Wowza Flowplayer, the HTML5 video player for HLS
and MPEG-DASH playback for browsers and devices.
https://github.com/WowzaMediaSystems/dev-guides/tree/main/frontend
- Make sure Node.js is installed on your machine. If not, follow these installation instructions.
- Copy the `.env.example` file in the `dev-guides/frontend` folder to `.env`: `cp .env.example .env`
- Open `.env` and add your Flowplayer trial token: `FLOWPLAYER_TOKEN=YOUR_FLOWPLAYER_TOKEN_HERE`
- The web app will automatically use the value from your `.env` file for secure configuration.
- Start the web app by typing `npm run dev`. The web app listens on `localhost:8080`.
- Open http://localhost:8080/ai-transcription and you'll see the live stream embedded in the web page, with captions displayed overlaid on the video and on the web page.
