Transcript export options

This page explains the different ways you can export and format your transcript data, including SRT/VTT caption files, paragraphs and sentences, and word-level timestamps.

Export SRT or VTT caption files

You can export completed transcripts in SRT or VTT format, which can be used for subtitles and closed captions in videos.

You can also customize the maximum number of characters per caption by specifying the chars_per_caption parameter.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True
)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

if transcript.status == "error":
    raise RuntimeError(f"Transcription failed: {transcript.error}")

srt = transcript.export_subtitles_srt(
    # Optional: Customize the maximum number of characters per caption
    chars_per_caption=32
)

with open(f"transcript_{transcript.id}.srt", "w") as srt_file:
    srt_file.write(srt)

# vtt = transcript.export_subtitles_vtt()

# with open(f"transcript_{transcript.id}.vtt", "w") as vtt_file:
#     vtt_file.write(vtt)
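
To make the effect of chars_per_caption more concrete, the following is a minimal, offline sketch of how word timings could be packed into SRT captions under a character limit. It is not the SDK's or the API's actual implementation: the Word class, srt_timestamp, and build_srt are hypothetical names introduced for illustration, and the greedy packing rule is an assumption. It only relies on word start and end times being expressed in milliseconds.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical stand-in for a transcript word; times are in milliseconds.
    text: str
    start: int
    end: int


def srt_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02}:{minutes:02}:{seconds:02},{millis:03}"


def build_srt(words: list[Word], chars_per_caption: int = 32) -> str:
    """Greedily pack words into captions no longer than chars_per_caption.

    A single word longer than the limit still gets a caption of its own.
    """
    captions: list[list[Word]] = []
    current: list[Word] = []
    for word in words:
        candidate = " ".join(w.text for w in current + [word])
        if current and len(candidate) > chars_per_caption:
            # Adding this word would exceed the limit: close the caption.
            captions.append(current)
            current = [word]
        else:
            current.append(word)
    if current:
        captions.append(current)

    blocks = []
    for index, group in enumerate(captions, start=1):
        text = " ".join(w.text for w in group)
        start = srt_timestamp(group[0].start)
        end = srt_timestamp(group[-1].end)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)
```

Calling build_srt with a smaller chars_per_caption yields more, shorter captions, which is the trade-off to consider when captions must fit a narrow video frame.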

Export paragraphs

You can retrieve a transcript that is automatically segmented into paragraphs. Each paragraph is returned with its text and additional metadata, such as start and end timestamps and a confidence score.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True
)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

if transcript.status == "error":
    raise RuntimeError(f"Transcription failed: {transcript.error}")

paragraphs = transcript.get_paragraphs()
for paragraph in paragraphs:
    print(paragraph.text)
    print()

Export sentences

You can retrieve a transcript that is automatically segmented into sentences, for a more reader-friendly experience. Each sentence is returned with its text and additional metadata, such as start and end timestamps and a confidence score.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True
)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

if transcript.status == "error":
    raise RuntimeError(f"Transcription failed: {transcript.error}")

sentences = transcript.get_sentences()
for sentence in sentences:
    print(sentence.text)
    print()

The response is an array of objects, each representing a sentence or a paragraph in the transcript. See the API reference for more info.
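
As an illustration of working with that array, the sketch below formats each sentence or paragraph object with its time range. It assumes each object carries text, start, and end fields, with times in milliseconds; the format_segments helper and the sample values are hypothetical and invented for this example, so check the API reference for the exact response shape.

```python
def format_segments(segments: list[dict]) -> list[str]:
    """Render each segment as '[start - end] text', converting ms to seconds."""
    lines = []
    for segment in segments:
        start_s = segment["start"] / 1000
        end_s = segment["end"] / 1000
        lines.append(f"[{start_s:.2f}s - {end_s:.2f}s] {segment['text']}")
    return lines


# Made-up sample data shaped like the assumed response objects.
sample_segments = [
    {"text": "First sentence.", "start": 250, "end": 3100},
    {"text": "Second sentence.", "start": 3300, "end": 6040},
]

for line in format_segments(sample_segments):
    print(line)
```

The same helper works for both sentences and paragraphs because it reads only the fields the two are assumed to share.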

Word-level timestamps

The transcript also includes a words array with the text, start and end timestamps (in milliseconds), and confidence score of each word:

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True
)

transcript = aai.Transcriber().transcribe(audio_file, config)

# Stop early on failure, since no words are available for a failed transcript
if transcript.status == "error":
    raise RuntimeError(f"Transcription failed: {transcript.error}")

for word in transcript.words:
    print(f"Word: {word.text}, Start: {word.start}, End: {word.end}, Confidence: {word.confidence}")
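
One common use of word-level data is flagging words that may need manual review. The following sketch uses a hypothetical TimedWord class that mirrors the four attributes printed above (text, start, end, confidence); the low_confidence_words helper and the 0.6 threshold are assumptions chosen for illustration, not recommended values.

```python
from dataclasses import dataclass


@dataclass
class TimedWord:
    # Hypothetical stand-in mirroring the word attributes shown above.
    text: str
    start: int  # milliseconds
    end: int  # milliseconds
    confidence: float


def low_confidence_words(words, threshold: float = 0.6):
    """Return (text, start) pairs for words whose confidence is below threshold."""
    return [(w.text, w.start) for w in words if w.confidence < threshold]


# Made-up sample values for demonstration only.
sample_words = [
    TimedWord("Smoke", 250, 650, 0.97),
    TimedWord("from", 730, 1022, 0.55),
    TimedWord("wildfires", 1100, 1800, 0.91),
]

for text, start in low_confidence_words(sample_words):
    print(f"Review '{text}' at {start} ms")
```

Because the helper only reads the text, start, and confidence attributes, it should also accept the objects in transcript.words, provided they expose those attributes as the example above suggests.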

API Reference

Additional resources

Is there a way to generate SRT or VTT captions with speaker labels?

Learn how to create caption files that include speaker identification.