Speech-to-Text features

Build, scale, and operate a world-class engine.


Top-Rated Accuracy

Our API is built on the latest advances in deep learning, drawing on cutting-edge research from our in-house team of AI researchers.

All Audio and Video Formats Accepted

Don't worry about file formats or sampling rates. Our API accepts virtually all audio and video files and transcribes them accurately, with no transcoding required.

Supported file formats

Keyword Boosts

Boost accuracy for key terms and phrases like person or product names, places, and other vocabulary unique to your application.

Boosting accuracy for keywords

Export as Captions (SRT/VTT)

Easily export your transcription in SRT or VTT format, to be plugged into a video player for subtitles and closed captions.

Exporting in SRT or VTT format
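
To illustrate what a caption export contains, here is a minimal sketch of formatting timed text as SRT. It does not call the API: the `(start_ms, end_ms, text)` segment tuples and the function names `srt_timestamp` and `to_srt` are assumptions made for this example, not the API's actual response format.

```python
def srt_timestamp(ms: int) -> str:
    """Format a millisecond offset as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments):
    """Build SRT text from (start_ms, end_ms, text) tuples (hypothetical shape)."""
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(segments, start=1):
        # Each cue is: sequence number, time range, caption text.
        time_range = f"{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0, 1500, "Hello there."), (1500, 4000, "Welcome to the demo.")]))
```

VTT is similar in structure but starts with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds.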

Automatic Punctuation and Casing

Punctuation and casing, including capitalization of proper nouns, are added to the transcription text automatically to make transcripts produced by the API more readable.

Automatic punctuation and casing

Automatic Transcript Highlights

The API can automatically detect key phrases and words in your transcription text, which is useful for tagging content and for summarizing the transcript.

Auto-detecting key phrases and words

Multiple Models for Accents

Select a model customized for Australian, UK, South African, and other accents to boost accuracy on your data.


Dual-Channel Support

Phone calls recorded in stereo/dual channel are transcribed separately, and you'll get a transcript for each channel.

Transcribing dual-channel recordings

Advanced Security & Privacy

We always follow best-practice guidelines such as encryption in transit and at rest, and we are not in the business of monetizing your data. Files sent to the API for transcription are never stored, and you can request permanent deletion of transcription text from our database.

Giving temporary access to private files

99.9%+ Uptime

With a historical uptime and transcription completion rate of 99.9%+, you can feel confident integrating our API into your product.


Additional features

Speed Boost

Request that a transcript be generated 25-50% faster than normal.

24x7 Support

We're here to work with you as much, or as little, as you'd like. Talk to us over live chat, Slack, email, or phone.

Transcribe Remote Files

Have audio/video files stored in S3, GCP, or on your server? The API can download and transcribe any file accessible via a URL.

Custom Models

Our team can work directly with you to fine-tune our models for your use case and data, in order to boost accuracy.

Upload Files for Transcription

Upload your audio/video files directly to the API for transcription.


Webhooks

Receive webhooks from our API as soon as your transcriptions are finished processing.

PII Redaction

Replace PII such as Social Security and credit card numbers with "#" in audio data that contains sensitive information.
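
As a rough illustration of what redacted output looks like, the sketch below masks digits in text that resembles a US Social Security number or a payment card number. This is a simple regex-based stand-in written for this example; the patterns and the `redact` function are assumptions and do not describe how the API detects PII.

```python
import re

# Illustrative patterns only (assumptions, not the API's detection logic).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")  # 13 to 16 digits


def _mask_digits(match: re.Match) -> str:
    """Replace every digit in the matched span with '#', keeping separators."""
    return re.sub(r"\d", "#", match.group())


def redact(text: str) -> str:
    """Mask SSN-like and card-like numbers in transcript text."""
    text = SSN_PATTERN.sub(_mask_digits, text)
    return CARD_PATTERN.sub(_mask_digits, text)


if __name__ == "__main__":
    print(redact("My SSN is 123-45-6789 and my card is 4111 1111 1111 1111."))
```

Short numbers such as years or quantities do not match either pattern and pass through unchanged.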

Word Timings

Word-by-word timestamps across the entire transcript text.

Confidence Scores

Get a confidence score (e.g., 0.96) for each word in the transcript.
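
One common use of per-word timings and confidence scores is flagging words for human review. The sketch below assumes each word arrives as a dictionary with `text`, `start`, `end` (milliseconds), and `confidence` keys; that shape, the `low_confidence_words` function, and the 0.80 threshold are hypothetical choices for this example rather than the API's documented response.

```python
def low_confidence_words(words, threshold=0.80):
    """Return the words whose confidence falls below the threshold."""
    return [word for word in words if word["confidence"] < threshold]


if __name__ == "__main__":
    # Sample data in the assumed shape; not real API output.
    sample = [
        {"text": "Hello", "start": 0, "end": 420, "confidence": 0.96},
        {"text": "Zyx", "start": 450, "end": 900, "confidence": 0.41},
        {"text": "team", "start": 930, "end": 1300, "confidence": 0.88},
    ]
    for word in low_confidence_words(sample):
        print(f'{word["text"]} ({word["start"]}-{word["end"]} ms): {word["confidence"]:.2f}')
```

The timestamps make it easy to jump to the matching position in the audio when reviewing a flagged word.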

Ready to start building?

Begin testing in under 2 minutes