Real-Time Subtitles and Translation in Wowza Streaming Engine

The demand for accessible live streams with global reach has never been higher. For developers, meeting this demand means going beyond simply delivering video. Our latest developer guide shows you exactly how to integrate real-time subtitles and language translation into your live streams using Wowza Streaming Engine and Wowza Flowplayer. If you haven’t yet, follow the steps in our First Developer Guide: Setting Up Your Live Stream to get your Wowza Streaming Engine license key and Flowplayer token.

How to Set Up Live AI Subtitles & Translations in Wowza Streaming Engine

This developer guide explains how to quickly set up a live stream pipeline that performs Automatic Speech Recognition (ASR) and machine translation. The resulting captions are injected directly into the stream for the client-side player. The architecture utilizes the flexibility of Wowza Streaming Engine and its plugin framework. It can power custom caption handling in secure, on-prem, or offline scenarios.

This walkthrough demonstrates how to implement multilingual audio and captions in a live stream using Wowza Streaming Engine. You can read the developer guide for detailed guidance on how to:

  • Generate real-time, synchronized subtitles from your live audio feed
    • Display WebVTT text tracks directly over the video
    • Extract and display subtitles within a web page
  • Leverage powerful open-source models and flexible AI backends
    • OpenAI’s Whisper for speech-to-text
    • LibreTranslate for machine translation
    • Integrate with cloud services like Azure Speech Detect for specialized models (e.g., healthcare or legal)
  • Generate subtitles in multiple languages
    • Including Spanish, French, German, and Japanese
    • Offer a truly global reach for your content
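To make the WebVTT step concrete, here is a minimal Python sketch of turning timed transcript segments into WebVTT cue text. It assumes each segment is a dictionary with `start`, `end` (in seconds), and `text` keys, which mirrors the segment shape returned by the open-source Whisper package. The function names are hypothetical, and the guide's own pipeline may build and inject cues differently.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_webvtt(segments: Iterable[Mapping]) -> str:
    """Build WebVTT text from segments shaped like {"start", "end", "text"}.

    The segment shape is an assumption for illustration, not the guide's
    confirmed data model.
    """
    lines = ["WEBVTT", ""]  # required header followed by a blank line
    for seg in segments:
        text = str(seg["text"]).strip()
        if not text:
            continue  # skip empty segments so no blank cues are emitted
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"{start} --> {end}")
        lines.append(text)
        lines.append("")  # a blank line terminates each cue
    return "\n".join(lines)
```

Calling `segments_to_webvtt` with one segment per recognized phrase yields a text track that a WebVTT-aware player can render over the video.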

The demo uses Wowza Streaming Engine, Wowza Flowplayer, and OBS Studio, the open-source live-streaming switcher and encoder, to capture and stream live video. We then use Docker to deploy the Streaming Engine instance alongside additional images running Whisper and LibreTranslate. This allows developers to get the complex AI pipeline running locally with minimal setup.
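Once a LibreTranslate container is running, translating a caption comes down to a single HTTP call. The sketch below is a hedged illustration using only the Python standard library: it posts to LibreTranslate's `/translate` endpoint and reads the `translatedText` field from the response. The base URL `http://localhost:5000` is an assumption (LibreTranslate's usual default port); your Docker setup may map a different host or port, and the helper names are hypothetical.

```python
import json
import urllib.request


def build_translate_request(text, source, target,
                            base_url="http://localhost:5000"):
    """Build a POST request for LibreTranslate's /translate endpoint.

    base_url is an assumed local address; adjust it to your deployment.
    """
    payload = {"q": text, "source": source, "target": target, "format": "text"}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(text, source, target,
              base_url="http://localhost:5000", timeout=10.0):
    """Send the request and return the translated string."""
    req = build_translate_request(text, source, target, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["translatedText"]
```

For example, `translate("Hello, everyone", "en", "es")` would return the Spanish text when the service is reachable. Separating request construction from the network call keeps the payload easy to inspect without a running container.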

On the front end, Wowza Flowplayer not only overlays the captions but also allows the React web application to read the available text tracks. This enables highly customized styling, transcription displays, or listener functionality for specific keywords or industry scenarios.
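The keyword-listener idea reduces to checking each caption cue's text against a watch list. In the guide this logic would live in the React application that reads the player's text tracks; the Python sketch below only illustrates the matching step, and the function name and inputs are hypothetical.

```python
import re
from typing import Iterable, List


def find_keyword_hits(cue_text: str, keywords: Iterable[str]) -> List[str]:
    """Return the keywords that appear as whole words in a cue's text.

    Matching is case-insensitive. Intended for word-like keywords; terms
    that start or end with punctuation would need different boundaries.
    """
    hits = []
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, cue_text, flags=re.IGNORECASE):
            hits.append(keyword)
    return hits
```

Whole-word matching avoids false positives such as "stream" firing on "streaming", which matters when a hit triggers a visible action in the page.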

Start Streaming Live Video with Real-Time AI Captions

Whether you are seeking compliance, global reach, or enhanced user engagement, adding captions is a necessary step. Artificial intelligence and machine learning tools, especially open-source AI speech-to-text and translation models, have made this simpler than ever to achieve.

Watch the full developer walkthrough video to learn how to configure your environment files, manage the compute overhead of real-time translation, and build the custom front-end application. For more developer guides, visit https://www.wowza.com/developer.

About Don Kianian

Don Kianian is a seasoned marketing professional and content strategist with deep expertise in video production technology and media workflows. He has spent more than 10 years building content, fostering awareness, and driving demand for complex technology and media solutions. He holds a Master of Science in Marketing from Santa Clara University and a Professional Certificate in Data Analytics from Google. Prior to Wowza, Don led Marketing efforts for Sherpa Digital Media, which was later acquired by Telestream. As a thought leader in the media production and video streaming space, Don hosted and produced "The Wirecast Show" in 2022-2023, joined as a featured guest in interviews to secure prominent industry analyst coverage, and has helped secure numerous awards at NAB, IBC, and Streaming Media events.