Production Tools and Workflows, Part 2: Preparation

April 11, 2018

Media production tools and workflow preparation


This blog post comes to you from Las Vegas, Nevada, which this week is transformed into the wonderland of broadcasting—from QAM to IP, over-the-air to cable and even a bit of OTT (over-the-top delivery). Approximately 120,000 attendees of the National Association of Broadcasters (NAB) Show will flood the Las Vegas Convention Center and other venues throughout the Strip area this year.

In our first blog post on the topic of production tools and workflows, we discussed the challenges of acquisition. In this post, we’ll talk about preparing the acquired content for on-demand delivery, with a nod to some live-streaming delivery workflow steps.


Edit Video Content and Add Graphics for a Polished End Product

Unless you’ve been able to acquire all the content in a single take, the first step in this preparation process is non-linear editing (NLE). The NLE process can be as simple as trimming clips and abutting them to each other—often referred to as “cuts-only editing”—or as complex as adding motion graphics (such as titles, crawls or composited animations with transparency) to an NLE timeline full of multiple camera angles, dissolves and other transitions.

Simple content trimming and adding lower-thirds titles (e.g., the on-camera talent’s name, company and title) can often be accomplished on a mobile phone. However, complex NLE still requires a desktop or laptop using specialized software.

Video Editing


Include Closed Captioning for Engagement and Compliance

One aspect of preparation that can be time-consuming—especially for those blog readers in the United States—is the need to comply with Americans with Disabilities Act (ADA) requirements by creating time-based closed captions for enterprise and broadcast content.

We’ve all probably seen, perhaps in an airport where HD monitors are delivering a live news show, just how inconsistent live-television closed captioning can be, and how much it lags behind the actual talking-head pundit. A transcriptionist, or closed captioning specialist, listens to the live audio and then types—as quickly as they can, with an acceptable number of spelling or phonetic errors—the words that on-air talent utter. Since this occurs in real time, the typed words often lag behind the spoken words by three to five seconds.

Live closed captioning is inserted into the broadcast signal, and it’s often referred to by broadcast engineers as “Line 21 insertion,” based on the original NTSC signal that allowed several scan lines of the broadcast signal to hold metadata such as closed captions.

Once a live event has been broadcast, the transcriptionists often go back and clean up the captions, aligning them to the actual point at which words are spoken, as well as cleaning up spelling and phonetic issues. These final closed captions are then used for rebroadcast or on-demand delivery.

There are also audio recognition systems that can automatically generate captions from an audio feed. For live-streaming delivery on OTT channels, most captions are timecode-based: each caption carries a timecode, and the player displays it at the corresponding point in the video.
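To make the timecode-based approach concrete, here is a minimal sketch in Python that writes captions in WebVTT, a widely used text-track format for OTT players. The cue text and timings are invented for illustration, not taken from any real broadcast.

```python
def vtt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an HH:MM:SS.mmm WebVTT timestamp."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"

def build_webvtt(cues):
    """cues: list of (start_seconds, end_seconds, text) tuples."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(text)
        lines.append("")  # blank line terminates each cue
    return "\n".join(lines)

# Hypothetical cues, as a transcriptionist might finalize them post-event:
captions = [
    (0.0, 2.5, "Welcome back to the broadcast."),
    (2.5, 6.0, "Here's what's coming up this hour."),
]
print(build_webvtt(captions))
```

The player reads each cue's start and end timestamps and overlays the text during that interval, which is why cleaned-up, properly aligned captions from a live event can be reused as-is for on-demand delivery.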

Beyond compliance, closed captioning is beneficial in several other use cases—not just for viewers who have a disability, but for every viewer.

On another Wowza blog post, author Holly Regan talks about one benefit of closed captioning: video views while scrolling through social media feeds. Regan notes that some content creators with the Wowza ClearCaster™ appliance are using closed-captioned live streams to capture the attention of Facebook users who browse their News Feed with the audio turned off.

Both Facebook and YouTube have features that launch video playback with audio in a muted state, requiring the viewer to click or tap the screen to hear it. Facebook estimates that upwards of 80 percent of all video views on the platform are initially seen without sound.

According to Regan, using closed captioning for these Facebook videos “helps grab people and clue them into what’s happening, so they engage more deeply.” There are a number of software and hardware tools that can be used to add closed-captioning to live streams for Facebook Live and other destinations, such as ClearCaster, which offers simulcasting to the Wowza Streaming Cloud™ service.

Facebook Live News Broadcast Closed Captioning

Other examples of closed-captioning use cases include educational scenarios such as live lectures, debates or question-and-answer sessions. Closed captions are also used in religious settings for special ceremonies, Christmas tree lightings or song lyrics.

The use of lyrics overlaid on recorded videos for YouTube has been around for quite some time in worship and educational content, and the trend for slideshow-type music videos, complete with lyrics, seems to be growing across all genres. In fact, there’s an expanding market for “official lyric videos” for even major-label music artists who don’t want to spend significant production capital on a more traditional music video.


Transcode or Transrate to Reach End Viewers

Beyond editing, adding graphics and preparing closed captioning, the final piece of preparation for OTT delivery centers on transforming what may be very high-bitrate content into a lower-bitrate, streamable format.

Without going into too much detail here, there are three main elements of this transformation from “master quality” to a streamable version of prepared content: transcoding, transrating and segmentation or packaging.

Transcoding allows an acquisition “master quality” format (meaning full resolution, full quality and full file size) to be transformed into a format that NLE can process—and then processed again to a format that can be streamed. Along the way, the need to use the appropriate compression-decompression software (also known as a codec) for each step of the workflow may require converting the content from one codec to another.

Some acquisition formats, such as Motion JPEG, rely on intraframe compression, meaning they only consider video compression within the space of a single frame. This approach allows every frame to be captured accurately, but at the expense of overall bandwidth during recording—sometimes up to 100 megabits per second (Mbps).

This leads to a need to apply interframe compression—or compression that considers multiple frames, in what is called a Group of Pictures (GoP)—to significantly reduce the overall bandwidth of the streamed version to approximately 3 Mbps.

If the “master quality” output from an NLE already uses the same codec as the streaming delivery format, but the bandwidth is still too high for streaming delivery, the approach differs from transcoding.

Take the example of an I-frame-only format such as AVC-Intra: It uses the AVC (H.264) codec, but its acquisition capture rates can be as high as 100 Mbps, which is approximately 30 times the average bandwidth a viewer might have coming into their home.

Since YouTube and other primary video-on-demand platforms also use H.264 for delivery, there’s no need to transcode to a different codec—but there will be a need to apply rate-reduction algorithms to bring the streamable version of this H.264-based content down to the approximate 3 Mbps range (a practice known as “transrating”).
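One common way to transrate is to re-encode with FFmpeg, keeping the same codec but capping the bitrate. The sketch below, driven from Python, assumes FFmpeg is installed on the system; the filenames, 3 Mbps target and buffer size are illustrative assumptions, not values prescribed by any particular platform.

```python
import subprocess

def build_transrate_cmd(src: str, dst: str, video_bitrate: str = "3M") -> list:
    """Assemble an FFmpeg command that re-encodes H.264 at a lower bitrate."""
    return [
        "ffmpeg",
        "-i", src,               # high-bitrate H.264 output from the NLE
        "-c:v", "libx264",       # same codec family, so no format change
        "-b:v", video_bitrate,   # target ~3 Mbps for typical home bandwidth
        "-maxrate", video_bitrate,
        "-bufsize", "6M",        # rate-control buffer (commonly 2x the bitrate)
        "-c:a", "copy",          # audio is already small; pass it through
        dst,
    ]

def transrate(src: str, dst: str) -> None:
    subprocess.run(build_transrate_cmd(src, dst), check=True)

# Hypothetical usage:
# transrate("master_100mbps.mp4", "stream_3mbps.mp4")
```

Because the source and destination share a codec, the encoder's job is purely rate reduction—trading quality for bandwidth—rather than a full format conversion.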

As the final stage in the media production workflow, there’s segmentation and packaging. We’ll talk about that in the final blog post in this series, when we cover common OTT delivery issues.