VTT is a widely used subtitle file format for videos. In this guide, we'll show you how to create VTT (.vtt) files for videos in Python.
What is a VTT file?
VTT files are text files saved in the Web Video Text Tracks format, also known as WebVTT. A VTT file is generally saved with the .vtt extension and contains supplementary information about the video such as subtitles or captions. VTT is one of the most common file formats used for video subtitles.
The syntax is similar to SRT files but has some differences:
- The file should begin with the header WEBVTT.
- The start and end times are given in the format hours:minutes:seconds.milliseconds and are separated by -->. The hours part may be omitted.
- As in SRT, entries are separated by a blank line.
- No index numbers are required.
- This format is supported by many modern browsers and can be used with the HTML5 <track> element to add subtitles to a <video> element.
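To make the syntax rules above concrete, here is a minimal sketch that assembles a WebVTT string by hand from a list of cues. The helper names format_timestamp and build_vtt are hypothetical, chosen for illustration and not part of any library. The sketch always writes the hours part and separates cues with a blank line, which is what WebVTT parsers expect.

```python
def format_timestamp(ms: int) -> str:
    """Format milliseconds as hours:minutes:seconds.milliseconds."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def build_vtt(cues):
    """Build WebVTT text from (start_ms, end_ms, text) tuples."""
    blocks = ["WEBVTT"]  # the required file header
    for start_ms, end_ms, text in cues:
        timing = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
        blocks.append(f"{timing}\n{text}")
    # Blocks are joined with a blank line; no index numbers are written.
    return "\n\n".join(blocks) + "\n"


vtt_text = build_vtt([
    (170, 4234, "AssemblyAI is building AI systems"),
    (4282, 8106, "with spoken data."),
])
print(vtt_text)
```

In practice you would rarely build the file by hand; the sketch is only meant to show how little structure the format needs.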
VTT files make it possible to add subtitles to video content after it is produced. For example, they can be uploaded to YouTube videos to add missing subtitles or replace existing ones with higher-quality ones.
Example of a VTT file
This is what the first lines of the VTT file for this YouTube video look like:
WEBVTT

00:00.170 --> 00:04.234
AssemblyAI is building AI systems to help you build AI applications

00:04.282 --> 00:08.106
with spoken data. We create superhuman AI models for speech
Prerequisites to get VTT files in Python
We will use the following dependencies to complete this tutorial:
- The AssemblyAI Python SDK
- A free AssemblyAI API key, which can be copied from your AssemblyAI dashboard
If you want a working code example that transcribes and generates subtitles for YouTube videos, you can check out this Google Colab.
Project Setup for VTT generation
Make sure that you have Python 3.8 or newer already installed on your system and create a new folder for your project. Then, navigate to your project directory in your terminal, and create and activate a new virtual environment.
On macOS or Linux:
python3 -m venv venv
source venv/bin/activate
On Windows:
python -m venv venv
venv\Scripts\activate
Install the AssemblyAI Python package.
pip install assemblyai
Set your AssemblyAI API key as an environment variable named ASSEMBLYAI_API_KEY. You can get a free API key from your AssemblyAI dashboard.
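If you want the script to fail early with a clear message when the variable is missing, a small check like the following can help. This is a sketch using only the standard library; the helper name get_api_key is hypothetical and not part of the AssemblyAI SDK.

```python
import os


def get_api_key(env=os.environ):
    """Return the API key from the environment, or raise a clear error."""
    key = env.get("ASSEMBLYAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ASSEMBLYAI_API_KEY is not set; set it in your shell first."
        )
    return key


# Demonstration with an explicit mapping instead of the real environment:
print(get_api_key({"ASSEMBLYAI_API_KEY": "example-key"}))
```

Calling get_api_key() with no argument reads the real environment, so it could be placed at the top of your script before any transcription request is made.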
Create the VTT files for videos in Python
AssemblyAI can produce subtitles as both SRT and VTT files. We also have a guide on how to generate SRT files for videos.
First, you'll need the video file for which you want to create the VTT file. You can use either a path to a local file or a URL to a publicly accessible file. The AssemblyAI API supports most common audio and video file formats, so you can submit either audio or video files to generate VTT files. You’ll find all supported file formats in our API documentation.
Create a new file named main.py and insert the following code:
import assemblyai as aai
# If the API key is not set as an environment variable named
# ASSEMBLYAI_API_KEY, you can also set it like this:
# aai.settings.api_key = "YOUR_API_KEY"
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://storage.googleapis.com/aai-web-samples/aai-overview.mp4")
vtt = transcript.export_subtitles_vtt()
# Save it to a file
with open("subtitle_example.vtt", "w") as f:
    f.write(vtt)
The above code first imports the assemblyai Python package. Next, it instantiates a Transcriber object, which is used to call AssemblyAI's transcription service.
transcriber.transcribe() starts the transcription process on the specified video file. Here, we used a remote URL, but you can replace it with the path to your own file.
When the transcription is finished, the result is stored in the transcript object. Calling transcript.export_subtitles_vtt() then generates the subtitles in VTT format.
Lastly, the script dumps the VTT string into a .vtt file. You can modify the output filename to your liking.
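The save step can also be wrapped in a small helper. The sketch below is an illustration and not part of the AssemblyAI SDK: the function name save_vtt is hypothetical, it assumes the exported string begins with the WEBVTT header, and it writes with explicit UTF-8 encoding so that non-ASCII captions are stored the same way on every platform.

```python
from pathlib import Path


def save_vtt(vtt_text: str, path: str) -> Path:
    """Write WebVTT text to `path` as UTF-8 and return the path."""
    # Refuse to write text that does not look like a WebVTT file.
    if not vtt_text.lstrip().startswith("WEBVTT"):
        raise ValueError("text does not start with the WEBVTT header")
    out = Path(path)
    out.write_text(vtt_text, encoding="utf-8")
    return out


# Sample text standing in for the string returned by the export call.
sample = "WEBVTT\n\n00:00.170 --> 00:04.234\nAssemblyAI is building AI systems\n"
saved = save_vtt(sample, "subtitle_example.vtt")
print(saved.read_text(encoding="utf-8").splitlines()[0])
```

In main.py, the sample string would be replaced by the vtt variable from the script above.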
Running the subtitle generation code
Ensure that the main.py file is saved, that your virtual environment is still activated, and that your API key is set. Navigate to the project directory in a terminal and run the main.py file with the following command:
python main.py
Once the script has finished executing, you should see a new .vtt file with the generated subtitles in your folder.
Specify the number of characters per caption in the VTT file
You can also customize the maximum number of characters per caption by specifying the chars_per_caption parameter. For example:
vtt = transcript.export_subtitles_vtt(chars_per_caption=32)
The captions are then limited to 32 characters:
00:00.170 --> 00:01.754
AssemblyAI is building AI
00:01.802 --> 00:03.514
systems to help you build AI
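To confirm that a generated file respects the limit, you can measure the caption lines yourself. The following is a rough sketch with a hypothetical helper name, caption_lengths. It assumes a simple file containing only the header, timing lines, and caption text, so it does not handle cue identifiers or NOTE blocks.

```python
def caption_lengths(vtt_text: str):
    """Return the length of each caption text line in a WebVTT string."""
    lengths = []
    for line in vtt_text.splitlines():
        line = line.strip()
        # Skip blank lines, the header, and timing lines.
        if not line or line.startswith("WEBVTT") or "-->" in line:
            continue
        lengths.append(len(line))
    return lengths


sample = (
    "WEBVTT\n\n"
    "00:00.170 --> 00:01.754\nAssemblyAI is building AI\n\n"
    "00:01.802 --> 00:03.514\nsystems to help you build AI\n"
)
lengths = caption_lengths(sample)
print(lengths, max(lengths) <= 32)
```

The same function can be applied to the string returned by transcript.export_subtitles_vtt(chars_per_caption=32).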
In this tutorial, you’ve learned how to generate VTT files for videos with AssemblyAI using Python.
Here are a few other helpful resources to learn more about what you can do with transcripts and AssemblyAI’s Speech AI models:
- API documentation for subtitle generation
- How to generate SRT files for videos
- Automatically determine video sections with AI using Python
- Key phrase detection in audio files using Python