DIY automatic transcription of online events

Although this article is filed under this category, it applies to any type of media event with audio that plays on your computer, e.g. a browser playing a webcast, an application playing live or on-demand video, the Zoom desktop client, etc.

For videoconferences, the two options explored below are not well suited if you need to take an active role in the meeting: the input and output audio channels are rerouted to allow better recognition, which sometimes prevents the audio from your microphone from getting through.

Specifically for Zoom meetings, there is an automated live transcription feature that you can activate as host: Activating automated live transcription for meetings you are hosting

We focus on passive transcription, where we are mainly interested in reading what others are saying, e.g. a webinar, webcast, VOD, etc.

Most of the content on this topic has been collected from other articles:

Using the Google Docs dictation feature

On macOS (tested on High Sierra 10.13.6) you will need to install the Soundflower driver for your OS version. Then go to Sound in your Mac's settings and select Soundflower for both Input and Output. Make sure that neither channel is muted.
You can then go to Google Docs (you will need a Google account), open a document, go to Tools and select Voice typing.

If you would also like to hear what is going on, you need to create a composite output channel (a Multi-Output Device). You can do this with the Audio MIDI Setup app, as in the picture:

On Windows

Tested on Windows 10. To work in the same way as described for Mac, you will need to have the Stereo Mix device enabled under Sound -> Recording. If it still does not appear after enabling "Show Disabled Devices", you will need to install the driver, which you can get from Realtek downloads. Remember that both Realtek Speakers under Playback and Stereo Mix under Recording should be enabled. See picture:



Note: Transcripts from Google Docs are not perfect (sometimes far from it), but they may be of help. Voice typing can be set up for different languages. One annoying detail is that the transcription stops as soon as Google Docs loses focus.
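Before opening Voice typing, it can help to confirm that the loopback device from the steps above (Soundflower on macOS, Stereo Mix on Windows) is actually visible to the system as an input. The following is a minimal sketch, not part of the original setup: it assumes the third-party sounddevice package is installed, and the name fragments in LOOPBACK_HINTS are assumptions that may need adjusting to match how the device is named on your machine. The helper names find_loopback_inputs and query_system_devices are made up for this illustration.

```python
"""Check whether a loopback capture device is visible as an audio input."""

# Assumption: these lowercase fragments identify the loopback devices
# discussed above; adjust them if your driver uses a different name.
LOOPBACK_HINTS = ("soundflower", "stereo mix")


def find_loopback_inputs(devices, hints=LOOPBACK_HINTS):
    """Return the names of input-capable devices whose name contains a hint.

    `devices` is any iterable of dicts with "name" and "max_input_channels"
    keys, which is the shape that sounddevice.query_devices() yields.
    """
    matches = []
    for dev in devices:
        name = dev.get("name", "")
        has_input = dev.get("max_input_channels", 0) > 0
        if has_input and any(hint in name.lower() for hint in hints):
            matches.append(name)
    return matches


def query_system_devices():
    """List the audio devices on this machine via sounddevice (third-party)."""
    import sounddevice as sd  # pip install sounddevice

    return list(sd.query_devices())
```

Calling find_loopback_inputs(query_system_devices()) returns a list of matching device names. An empty list suggests that the driver is not installed or that the device is still disabled in the sound settings.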

OBS Studio plugin

Following this article: Closed Captioning via Google Speech Recognition, you can set up OBS to do cloud closed captioning, as in the picture. In a way it’s a bit more comfortable than Google Docs, but the transcripts are still not very accurate in some situations.

The functionality is based on OBS-captions-plugin.