Is it possible to accomplish speech recognition in live streaming using speech-to-text services from Google or Azure?

Spencer_Lin2 · August 4, 2023, 7:52am

Speech-to-text services from Google or Azure appear to accept only microphone input or audio files as the input stream.

So I’m curious: is there a way to achieve that with a live stream?

Kay_Werner · September 27, 2023, 8:27am

Did you find any solution to connect the Wowza SDK with the Azure cognitive-services-speech SDK?
The first step would be to extract the audio stream from the live stream. What features does Wowza provide to handle or redirect the audio feed in parallel to the default transcoding process?
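
As a rough illustration of that first step, here is a minimal sketch that pulls a live stream with ffmpeg and yields raw PCM audio chunks that a streaming speech-to-text client could consume. It assumes ffmpeg is installed and that the stream is reachable at a playback URL; it runs outside Wowza rather than as a WSE module, and the function names and the 16 kHz mono format are illustrative choices, not anything Wowza or Azure prescribes.

```python
import subprocess
from typing import BinaryIO, Iterator, List


def build_ffmpeg_cmd(stream_url: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that drops video and writes raw PCM to stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-i", stream_url,          # hypothetical playback URL of the live stream
        "-vn",                     # ignore the video track
        "-ac", "1",                # downmix to mono
        "-ar", str(sample_rate),   # resample (16 kHz is an assumed target)
        "-acodec", "pcm_s16le",    # 16-bit little-endian PCM
        "-f", "s16le",             # raw output, no container
        "pipe:1",                  # write to stdout
    ]


def read_chunks(pipe: BinaryIO, chunk_bytes: int = 3200) -> Iterator[bytes]:
    """Yield fixed-size chunks until the pipe closes.

    3200 bytes is 100 ms of 16 kHz, 16-bit mono audio.
    """
    while True:
        chunk = pipe.read(chunk_bytes)
        if not chunk:
            break
        yield chunk


def stream_audio(stream_url: str) -> Iterator[bytes]:
    """Run ffmpeg against the stream and yield PCM chunks as they arrive."""
    proc = subprocess.Popen(build_ffmpeg_cmd(stream_url), stdout=subprocess.PIPE)
    try:
        yield from read_chunks(proc.stdout)
    finally:
        proc.kill()
        proc.wait()
```

Each chunk from `stream_audio(...)` could then be written to whatever streaming input the chosen speech service exposes; that part depends on the service's SDK and is not shown here.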

Dorota_Szafer-Kwasik · September 27, 2023, 6:54pm

I can see that the do-it-yourself stage is beginning…
I know it is possible to create a Wowza module that automates the conversion of audio to text (closed captions).
I don’t understand why the Wowza team has concentrated its efforts on Wowza Video.
I believe such an audio-to-text conversion module would attract new WSE users and retain current ones.
Speech-to-text services from Google work great with open captioning.
Third-party developers handle English well, but their results for other languages are much worse.

Scott_Kellicker2 · October 5, 2023, 2:28pm

Hello. I’m formerly of Wowza and now working independently.
I’ve been working on a couple of speech-to-text implementations for WSE, although not yet with Google. If you are interested in such a module, I am available to build it on a contract basis.

Reach out to scott@blankcanvas.video

Dorota_Szafer-Kwasik · October 5, 2023, 6:14pm

Hi,
… and yet interest is emerging, and that’s good. You get the point.
Scott, you’ll get it done faster than Wowza will be interested in such a solution.
Greetings to you

Axel_Gomez1 · December 14, 2023, 8:19pm

Hi @Scott_Kellicker2

I came across your post about developing speech-to-text modules for WSE. I’m interested in a module that also integrates with JW Player. Could you provide some insights on feasibility, development time, and cost?

Scott_Kellicker2 · December 14, 2023, 9:31pm

Hi.

Yes, I could develop such a module.

Let’s connect via email. I’m at scott@blankcanvas.video

(I’m traveling this week but will reach out to you by email on Monday.)

Scott Kellicker

Scott_Kellicker2 · January 9, 2024, 12:56pm

Hi Axel. Do you still have interest in such a module? I’ve been working on something very close to this.

Let’s chat at scott@blankcanvas.video.

ScottK

Karel_Boek · January 9, 2024, 8:28pm

At Raskenlund we’ve worked with a variety of STT services, including IBM, Azure, Google, AWS and a few more.

We have a working module for integration with AWS Transcribe: audio is extracted and sent to AWS Transcribe, then the text is added to the stream as subtitles, or you can export it as VTT.
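
To illustrate the VTT export step only, here is a minimal sketch that turns timed transcript segments into a WebVTT document. It is not the Raskenlund module; the `(start, end, text)` tuple shape and the function names are assumptions made for the example, and a real transcription service returns its own result structure that would need mapping first.

```python
from typing import Iterable, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text); assumed shape


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, hh:mm:ss.ttt."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(segments: Iterable[Segment]) -> str:
    """Render segments as a WebVTT file: header, then one cue per segment."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text.strip())
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_webvtt([(0.0, 2.5, "Welcome to the stream."),
                     (2.5, 5.0, "Captions are generated live.")]))
```

For a live stream the cue times would need to be offset against the stream timeline, which this sketch leaves out.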
