Automated SRT and VTT Video Captions (April 2020 Update)

This month we added new acoustic models for UK customers, automated video captioning (SRT/VTT), and automatic transcript summaries. Now, companies in industries like video hosting, media monitoring, e-discovery, or video interviewing will be able to improve video playback and search...

Building an end-to-end Speech Recognition model in PyTorch

Deep learning has changed the game in speech recognition with the introduction of end-to-end models. These models take in audio and directly output transcriptions. Two of the most popular end-to-end models today are Deep Speech by Baidu and Listen Attend Spell (LAS) by Google. Both Deep Speech ...

