How to build a YouTube downloader in Python

Have you ever wanted to download the audio from a YouTube video, such as a song or a speech? By the end of this tutorial, you will have a Python command line interface that can download the video or audio from any YouTube link you give it. You can find the source code here


What libraries and software will we need to download and configure?

  • youtube-dl (the Python library)
  • FFmpeg
  • Click (the Python library)


First things first: we have to download and configure our libraries. Since youtube-dl and click are both Python libraries, we can install both at the same time with pip:

pip install youtube-dl click
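To confirm that both packages ended up in the environment you plan to run the tool from, a quick check along these lines may help. This is only a sketch: missing_packages is a hypothetical helper name, and it relies on the fact that youtube-dl is imported as youtube_dl (with an underscore).

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # youtube-dl is imported as youtube_dl (underscore, not hyphen)
    missing = missing_packages(["youtube_dl", "click"])
    print("missing:", missing or "nothing")
```

If either name shows up as missing, pip most likely installed into a different Python environment than the one running the script.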

After these two libraries have finished installing, we’re going to install FFmpeg, FFprobe, and FFplay. FFmpeg is free, open source software for handling video, audio, and other multimedia files. We’ll be using it in conjunction with youtube-dl to convert the video we download into an audio file. This part is different for Windows and OSX users. In both cases, start by downloading the binaries from https://ffbinaries.com/downloads

If you’re a Windows user, download the binaries and unzip the files. You’ll see an executable file for each of the three binaries we need: ffmpeg, ffprobe, and ffplay. Copy each executable file to a folder and make sure you know where that folder is. For the purposes of this tutorial, I copied them to the same folder that I run the Python program from. Later we’ll add an option to the request we send youtube_dl that tells it where to find the programs.

If you’re an OSX user, download the binaries from the site, copy them to /usr/local/bin, and then make sure that folder is on your PATH variable, like so:

Run 

sudo cp ./ffmpeg ./ffplay ./ffprobe /usr/local/bin

Open up ~/.zshrc with whatever text editor you’d like. I just run

vim ~/.zshrc

Add the line 

PATH="/usr/local/bin:$PATH"
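Before moving on, it can be worth checking that the three binaries are actually discoverable, whichever setup you followed. The sketch below uses the standard library's shutil.which; find_ff_binaries and its extra_dir parameter are hypothetical names introduced for illustration, with extra_dir standing in for the folder you copied the executables to on Windows.

```python
import os
import shutil

FF_BINARIES = ("ffmpeg", "ffprobe", "ffplay")


def find_ff_binaries(extra_dir=None):
    """Return {name: path or None} for each of the three binaries.

    extra_dir is an optional folder that is searched before the regular PATH.
    """
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        search_path = os.path.abspath(extra_dir) + os.pathsep + search_path
    return {name: shutil.which(name, path=search_path) for name in FF_BINARIES}


if __name__ == "__main__":
    for name, location in find_ff_binaries(extra_dir=".").items():
        print(f"{name}: {location or 'not found'}")
```

Any entry reported as not found means the conversion step later on will probably fail, so it is easier to fix the setup now.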

We’ll create two commands: one to download audio only, and one to download videos. Before we get into creating our actual commands, we have to initialize our CLI. An important note: all of your commands should go between the definition of apis() and the definition of main().

import click

@click.group()
def apis():
    """A CLI for getting transcriptions of YouTube videos"""

# All of your commands go here, between apis() and main()

def main():
    apis(prog_name='apis')

if __name__ == '__main__':
    main()

The first thing we’ll need to do is make a download video function, which will download the YouTube video from a link we pass to it as an .mp4 file.

Our download video function specifies the format that it wants the video in (mp4) and an output template that tells youtube-dl how to name the saved file. We’re going to set the name of the file to the YouTube id of the video. This is totally optional; I did it because I find that video titles can get long and cumbersome to work with in some settings, especially if there are spaces in them. Then we’ll call youtube-dl to save the file and have the function return the filename back to us.

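import youtube_dl

@apis.command()
@click.argument('link')
def download_video(link):
    """Download the video at LINK as an .mp4 file named after its YouTube id."""
    ydl_opts = {
        'format': 'mp4',                # ask for an mp4 version of the video
        'outtmpl': "./%(id)s.%(ext)s",  # save as <video id>.<extension>
    }
    _id = link.strip()
    # extract_info downloads the video by default and returns its metadata
    meta = youtube_dl.YoutubeDL(ydl_opts).extract_info(_id)
    save_location = meta['id'] + ".mp4"
    print(save_location)
    return save_location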
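The output template uses Python's %-style formatting with named fields, so its effect is easy to preview on its own. Here is a minimal sketch, assuming a made-up metadata dictionary (the id and title below are placeholders, not a real video) and ignoring any extra filename sanitizing that youtube-dl itself may apply:

```python
OUTTMPL = "./%(id)s.%(ext)s"

# Hypothetical metadata, shaped like a small slice of what youtube-dl returns
sample_meta = {
    "id": "abc123XYZ_0",
    "ext": "mp4",
    "title": "A very long title with spaces in it",
}


def expand_template(template, meta):
    """Fill the named %(field)s placeholders from the metadata dict."""
    return template % meta


if __name__ == "__main__":
    print(expand_template(OUTTMPL, sample_meta))  # ./abc123XYZ_0.mp4
```

Swapping id for title in the template would give a filename based on the video title instead, spaces and all, which is exactly what the id-based name avoids.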


The other function we need to make is a download audio function, which will download the audio from a YouTube link that we pass it as an .mp3 file. If you decide you want to keep the video as well, pass the -k (or --keep-video) flag.

Like our download video function, our download audio function will also specify some options to youtube-dl. The extra options we need this time tell youtube-dl to use FFmpeg to convert the video file after downloading it, and whether or not to keep the video.

@apis.command()
@click.argument('link')
@click.option('-k', '--keep-video', is_flag=True, help="Pass this to keep the video")
def download_audio(link, keep_video):
    ydl_opts = {
        'format': 'bestaudio/best',
        # after downloading, have FFmpeg extract the audio as a 192 kbps mp3
        'postprocessors': [{
            'key': 'FFmpegExtractAudio',
            'preferredcodec': 'mp3',
            'preferredquality': '192',
        }],
        # folder holding the FFmpeg binaries (the Windows setup above);
        # remove this line if FFmpeg is on your PATH instead
        'ffmpeg_location': './',
        'outtmpl': "./%(id)s.%(ext)s",
        # must be a real boolean: the string 'False' would count as true
        'keepvideo': keep_video,
    }
    _id = link.strip()
    meta = youtube_dl.YoutubeDL(ydl_opts).extract_info(_id)
    save_location = meta['id'] + ".mp3"
    print(save_location)
    return save_location
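The option handling can also be pulled out into a small function, which makes it easy to inspect without downloading anything. This is a sketch rather than part of the original tool: build_audio_opts is a hypothetical helper, and it assumes youtube-dl's Python option names ffmpeg_location (with an underscore) and keepvideo, the latter as a real boolean.

```python
def build_audio_opts(keep_video=False, ffmpeg_dir=None):
    """Build a youtube-dl options dict for extracting mp3 audio.

    build_audio_opts is a hypothetical helper, not part of youtube-dl.
    """
    opts = {
        "format": "bestaudio/best",
        "postprocessors": [{
            "key": "FFmpegExtractAudio",
            "preferredcodec": "mp3",
            "preferredquality": "192",
        }],
        "outtmpl": "./%(id)s.%(ext)s",
        # store a real boolean; a non-empty string such as 'False' is truthy
        "keepvideo": bool(keep_video),
    }
    # only point youtube-dl at a folder when one was given;
    # otherwise it looks for FFmpeg on the PATH
    if ffmpeg_dir is not None:
        opts["ffmpeg_location"] = ffmpeg_dir
    return opts


if __name__ == "__main__":
    print(build_audio_opts(keep_video=True, ffmpeg_dir="./"))
```

Leaving ffmpeg_dir unset suits the OSX setup, where the binaries live in /usr/local/bin, while passing "./" matches the Windows setup with the executables next to the script.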


That’s it. We’re done! It’s that easy. No more using sketchy sites with tons of ads to download your YouTube videos or audio! See how to extend this into a CLI that will give you the transcript of the YouTube video. You can follow AssemblyAI for updates on Twitter @assemblyai, and you can follow me @yujian_tang
