SRT is a widely used subtitle file format for videos. In this guide, we'll show you how to create SRT (.srt) files for videos in Python.
What is an SRT file?
An SRT file or SubRip file is one of the most common types of subtitle file formats for videos, generally saved with the .srt extension. The format contains human-readable plain text that provides the timing information for each subtitle along with the subtitle text itself.
Here's a breakdown of how the format works:
- Each subtitle entry consists of an index number, start time, end time, and text.
- The index number is a sequential number starting from 1.
- The start and end times are given in the format hours:minutes:seconds,milliseconds and are separated by -->.
- The text that follows the timing information is the subtitle text itself, and it may span multiple lines.
- Entries are separated by a blank line.
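The rules above can be sketched in a few lines of Python. This is only an illustration of the format, not part of the AssemblyAI SDK: the helper names format_timestamp and build_srt are hypothetical, and the cue times and texts are sample values.

```python
from typing import List, Tuple


def format_timestamp(ms: int) -> str:
    """Convert milliseconds to the SRT form hours:minutes:seconds,milliseconds."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def build_srt(cues: List[Tuple[int, int, str]]) -> str:
    """Build SRT text from (start_ms, end_ms, text) tuples."""
    entries = []
    # Index numbers are sequential and start at 1.
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        entries.append(f"{index}\n{timing}\n{text}")
    # Entries are separated by a blank line.
    return "\n\n".join(entries) + "\n"


sample = build_srt([
    (170, 4234, "First caption"),
    (4282, 8106, "Second caption,\nsplit over two lines"),
])
print(sample)
```

The subtitle text is passed through unchanged, so a caption that spans multiple lines only needs to contain a newline.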
SRT files make it possible to add subtitles to video content after it is produced. For example, they can be uploaded to YouTube videos to add missing subtitles or replace existing ones with higher-quality ones.
Example of an SRT file
This is what the first lines of the SRT file for this YouTube video look like:
1
00:00:00,170 --> 00:00:04,234
AssemblyAI is building AI systems to help you build AI applications

2
00:00:04,282 --> 00:00:08,106
with spoken data. We create superhuman AI models for speech
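If you want to check a file like this programmatically, a small parser is usually enough. The sketch below assumes well-formed input in which every entry starts with its index number, followed by the timing line and the text. parse_srt is a hypothetical helper, not an SDK function.

```python
import re
from typing import Dict, List

# Matches a timing line such as "00:00:00,170 --> 00:00:04,234".
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(content: str) -> List[Dict[str, object]]:
    """Split SRT text into entries with index, start, end, and text."""
    entries = []
    # Entries are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip incomplete entries
        match = TIMING.fullmatch(lines[1].strip())
        if match is None:
            continue  # skip entries without a valid timing line
        entries.append({
            "index": int(lines[0].strip()),
            "start": match.group(1),
            "end": match.group(2),
            "text": "\n".join(lines[2:]),
        })
    return entries


sample = """1
00:00:00,170 --> 00:00:04,234
AssemblyAI is building AI systems to help you build AI applications

2
00:00:04,282 --> 00:00:08,106
with spoken data. We create superhuman AI models for speech
"""

entries = parse_srt(sample)
for entry in entries:
    print(entry["index"], entry["start"], entry["end"])
```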
Prerequisites to get SRT files in Python
We will use the following dependencies to complete this tutorial:
- The AssemblyAI Python SDK
- A free AssemblyAI API key, which can be copied from your AssemblyAI dashboard
If you want a working code example that transcribes and generates subtitles for YouTube videos, you can check out this Google Colab.
Project Setup for SRT generation
Make sure that Python 3.8 or newer is installed on your system, and create a new folder for your project. Then navigate to the project directory in your terminal and create a new virtual environment:
# macOS/Linux
python3 -m venv venv
# Windows
python -m venv venv
Activate the environment with source venv/bin/activate on macOS/Linux or venv\Scripts\activate on Windows.
Install the AssemblyAI Python package:
pip install assemblyai
Set your AssemblyAI API key as an environment variable named ASSEMBLYAI_API_KEY. You can get a free API key here.
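A missing key otherwise only shows up once the first request fails, so it can help to check for it at the start of a script. The following is a minimal sketch using only the standard library. require_api_key is a hypothetical helper, and the demo value is a placeholder, not a real key.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment or raise a clear error."""
    key = env.get("ASSEMBLYAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ASSEMBLYAI_API_KEY is not set. "
            "Set it in your shell before running the script."
        )
    return key


# Example with an explicit mapping instead of the real environment:
print(require_api_key({"ASSEMBLYAI_API_KEY": "demo-key"}))
```

In a real script you would call require_api_key() without arguments so that it reads from the actual environment.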
Create the SRT files for videos in Python
AssemblyAI can produce subtitles as both SRT and VTT files.
First, you'll need the video file for which you want to create the SRT file. You can use either a path to a local file or a URL to a publicly accessible file. The AssemblyAI API supports most common audio and video file formats, so you can submit either audio or video files to generate SRT files. You'll find all supported file formats in our API documentation.
Create a new file named main.py and insert the following code:
import assemblyai as aai

# If the API key is not set as an environment variable named
# ASSEMBLYAI_API_KEY, you can also set it like this:
# aai.settings.api_key = "YOUR_API_KEY"

# Transcribe the video (a remote URL here; a local path also works)
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://storage.googleapis.com/aai-web-samples/aai-overview.mp4")

# Export the transcript as an SRT-formatted string
srt = transcript.export_subtitles_srt()

# Save it to a file
with open("subtitle_example.srt", "w") as f:
    f.write(srt)
The above code first imports the assemblyai Python package. Next, it instantiates the Transcriber object, which is used to call AssemblyAI's transcription service.
transcriber.transcribe() submits the specified video file for transcription and waits for the result. Here, we used a remote URL, but you can replace it with the path to your own file.
When the transcription is finished, the result is stored in the transcript object. Calling transcript.export_subtitles_srt() then generates the subtitles in SRT format.
Lastly, the script dumps the SRT string into a .srt file. You can modify the output filename to your liking.
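If you process several videos, you may want the subtitle file to be named after the media file. The sketch below is one way to do that with the standard library. save_srt is a hypothetical helper. It assumes a plain file name or path (not a URL with a query string), and writes UTF-8 explicitly because subtitles often contain non-ASCII characters.

```python
import tempfile
from pathlib import Path, PurePosixPath


def save_srt(srt_text: str, media_name: str, out_dir: str = ".") -> Path:
    """Write SRT text to a file named after the media file."""
    # "talk.mp4" -> "talk.srt"
    srt_name = PurePosixPath(media_name).with_suffix(".srt").name
    target = Path(out_dir) / srt_name
    # Set the encoding explicitly instead of relying on the platform default.
    target.write_text(srt_text, encoding="utf-8")
    return target


# Demo: write a one-entry subtitle file into a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = save_srt(
        "1\n00:00:00,170 --> 00:00:04,234\nHello\n", "aai-overview.mp4", tmp
    )
    print(path.name)
```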
Run the subtitle generation code
Ensure that the main.py file is saved, that your virtual environment is still activated, and that your API key is set. Navigate to the project directory in a terminal and run the main.py file with the following command:
python main.py
Once the script has finished executing, you should see a new .srt file with the generated subtitles in your folder.
Specify the number of characters per caption in the SRT file
You can also customize the maximum number of characters per caption by specifying the chars_per_caption parameter. For example:
srt = transcript.export_subtitles_srt(chars_per_caption=32)
The captions are then limited to 32 characters:
1
00:00:00,170 --> 00:00:01,754
AssemblyAI is building AI

2
00:00:01,802 --> 00:00:03,514
systems to help you build AI
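To get a feel for what a 32-character limit does to a sentence, you can approximate the splitting locally with Python's textwrap module. This is only a rough illustration. It is not how AssemblyAI segments captions, since the service also has to assign timings to each caption.

```python
import textwrap

text = "AssemblyAI is building AI systems to help you build AI applications"

# Break the sentence into chunks of at most 32 characters without splitting words.
captions = textwrap.wrap(text, width=32)

for caption in captions:
    print(f"{len(caption):2d}  {caption}")
```

For this sentence, the first two chunks match the caption texts shown above.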
In this tutorial, you’ve learned how to generate SRT files for videos with AssemblyAI using Python.
Here are a few other helpful resources to learn more about what you can do with transcripts and AssemblyAI’s Speech AI models:
- API documentation for subtitle generation
- Automatically determine video sections with AI using Python
- Key phrase detection in audio files using Python