Do you get a lot of spam calls? With the rise of automated calling, more and more of them are being made. Let’s fight back. In this tutorial, I’ll show you how to make a burner phone that records incoming calls with Twilio and transcribes them with AssemblyAI. Twilio is a communications API for programmable phone calls and messaging. AssemblyAI is a fast, automatic speech-to-text API, ranked by Nordic API as the top public API of 2020. You can find the source code here.
Today we're going to use Python to build a burner phone. What's a burner phone? It's a phone that you can use in place of your real one and then get rid of. You can use the free number that Twilio lets you provision to follow along. In this tutorial, we’ll provision a number from Twilio, create an endpoint with Python Flask that picks up phone calls and records them, programmatically update our number's webhook endpoint, make some example phone calls, download the recordings as .mp3 files, and transcribe those .mp3 files with AssemblyAI. At the end, we'll also compare the accuracy of the transcripts from Twilio's built-in transcription with those from AssemblyAI's Speech-to-Text API.
- Sign up for a Twilio Account
- Copy your credentials
- Provision a phone number from Twilio
- Create a Flask endpoint to record voice
- Get ngrok and run it to expose your Flask endpoint to the web
- Update Twilio provisioned phone number with new webhook endpoint
- Make some calls
- Check for and download recordings
- Get an AssemblyAI API key
- Transcribe them via AssemblyAI
- Comparison between Twilio’s and AssemblyAI’s transcription
Sign up for a Twilio Account
Sign up for a Twilio account at twilio.com and get your Account SID and Auth Token. I’ve circled where they should be on your console (and blocked out my Account SID).
Copy your credentials
Copy your credentials and either save them as environment variables or save them in a configure.py file in the folder you’re working in.
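As a rough sketch of what such a configure.py could look like: the helper name and placeholder strings below are illustrative assumptions, not the original file. TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN are the environment variable names Twilio's Python helper library conventionally reads.

```python
# configure.py (sketch): names other than the two environment variables are hypothetical
import os

# Placeholders; replace them with the values from your Twilio console,
# or export TWILIO_ACCOUNT_SID / TWILIO_AUTH_TOKEN and leave these alone.
PLACEHOLDER_SID = "<your account sid>"
PLACEHOLDER_TOKEN = "<your auth token>"


def load_twilio_credentials(env=None):
    """Return (account_sid, auth_token), preferring environment variables."""
    env = os.environ if env is None else env
    return (
        env.get("TWILIO_ACCOUNT_SID", PLACEHOLDER_SID),
        env.get("TWILIO_AUTH_TOKEN", PLACEHOLDER_TOKEN),
    )


# Other scripts can then do: from configure import account_sid, auth_token
account_sid, auth_token = load_twilio_credentials()
```

Keeping credentials in environment variables avoids committing them to source control by accident; if you hard-code them instead, keep configure.py out of your repository.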
Provision a phone number from Twilio
Now let’s programmatically provision a phone number from Twilio via their REST API using Python. We’ll list 20 local numbers and then pick one of them.
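The provisioning step might look roughly like the sketch below. It assumes the twilio Python package and the credentials saved earlier; the function names are hypothetical, and the Twilio import is done lazily so the helpers can be tried with a stand-in client. Note that buying a number draws on your trial credit or account balance.

```python
def list_local_numbers(client, country="US", limit=20):
    """Return up to `limit` available local numbers as E.164 strings."""
    available = client.available_phone_numbers(country).local.list(limit=limit)
    return [number.phone_number for number in available]


def provision_number(client, phone_number):
    """Buy `phone_number` and return the (sid, phone_number) Twilio reports."""
    purchased = client.incoming_phone_numbers.create(phone_number=phone_number)
    return purchased.sid, purchased.phone_number


def make_client(account_sid, auth_token):
    """Build a Twilio REST client (requires `pip install twilio`)."""
    from twilio.rest import Client  # imported here so the helpers above work without it
    return Client(account_sid, auth_token)


# Example flow (not executed here because it calls the live API):
# client = make_client(account_sid, auth_token)
# numbers = list_local_numbers(client)
# for i, number in enumerate(numbers):
#     print(i, number)
# sid, number = provision_number(client, numbers[0])
# print("Provisioned", number, "with SID", sid)
```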
When we’re done, our terminal output should look like:
Create a Flask endpoint to record voice
If you don’t already have Flask installed, you can install it with pip: pip install flask
It is very important that you read this block and the next block carefully. We will create an endpoint using Python Flask. Our endpoint will record an incoming phone call and transcribe it via Twilio (later on we’ll look at a comparison).
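A minimal sketch of such an endpoint follows. It builds the TwiML response with the standard library so you can see exactly what Twilio receives; the twilio package's VoiceResponse helper can generate equivalent XML. The function names are assumptions, the "/voice" route and the "Please leave a message" prompt come from this tutorial, and Flask is imported inside the factory so the TwiML builder can be tested on its own.

```python
import xml.etree.ElementTree as ET


def build_twiml(prompt="Please leave a message"):
    """Return TwiML that speaks a prompt, records the caller, then hangs up."""
    response = ET.Element("Response")
    ET.SubElement(response, "Say").text = prompt
    # transcribe="true" asks Twilio to transcribe the recording as well
    ET.SubElement(response, "Record", {"transcribe": "true"})
    ET.SubElement(response, "Hangup")
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


def create_app():
    """Create the Flask app that answers Twilio's voice webhook."""
    from flask import Flask, Response  # requires `pip install flask`

    app = Flask(__name__)

    @app.route("/voice", methods=["GET", "POST"])
    def voice():
        # Twilio expects an XML (TwiML) response describing what to do with the call
        return Response(build_twiml(), mimetype="text/xml")

    return app


# To serve it locally on port 5000, uncomment the next line and run this file:
# create_app().run(port=5000)
```

The Record verb comes after Say so the caller hears the prompt before the recording starts; Hangup only runs once the recording has ended.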
Now we run this in our terminal and we should see an output like this:
Get ngrok and run it to expose your Flask endpoint to the web
After our Flask app is up and running in our terminal, we also need to download and run ngrok. You can download ngrok here. After downloading ngrok and copying it into our working folder, we’ll open a second terminal and run ./ngrok http 5000
Note that the “5000” can be replaced with whatever port you’re running your Python Flask application on. As you can see above, ours is running on port 5000. Ngrok should expose and return an endpoint we can hit. We’ll need to keep track of the https forwarding link, because that is going to be the updated webhook URL that Twilio hits when someone calls our number.
Update Twilio provisioned phone number with new webhook endpoint
Alright, now that we have an app that will record phone calls, let’s update our webhook on Twilio via their Python REST API.
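One way this update could be written is sketched below. It assumes a Twilio REST client built from the credentials saved earlier and looks the number up by its phone number; the function names and the example ngrok URL are hypothetical.

```python
def voice_webhook_url(ngrok_url, path="/voice"):
    """Join the ngrok forwarding URL with the Flask route Twilio should call."""
    return ngrok_url.rstrip("/") + path


def update_voice_webhook(client, phone_number, ngrok_url):
    """Point `phone_number`'s voice webhook at our endpoint; return the new URL."""
    matches = client.incoming_phone_numbers.list(phone_number=phone_number)
    if not matches:
        raise LookupError(f"{phone_number} is not on this account")
    updated = matches[0].update(voice_url=voice_webhook_url(ngrok_url))
    return updated.voice_url


# Example (calls the live API, so it is left commented out):
# from twilio.rest import Client
# client = Client(account_sid, auth_token)
# print(update_voice_webhook(client, "+15005550006", "https://<your-id>.ngrok.io"))
```

ngrok hands out a new forwarding URL each time it restarts on the free plan, so this step needs to be rerun whenever the tunnel is restarted.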
When we run this, we should get this output in the terminal. Notice that I added “/voice” to the end of the URL provided by ngrok. That is because I defined “/voice” as the endpoint in our Flask application above.
Make some calls
There’s no code to write here, but make some calls to your new burner phone number. If everything is set up correctly, you should hear a female voice say “Please leave a message”. For the purpose of this tutorial, I made 3 phone calls and left 3 messages.
The messages I left are:
“This is a third and final recording that I’m going to use for testing transcription services. So yeah, I should have been a cowboy”
“This is a test recording for transcriptions. Sally sells seashells down by the sea shore”
“A B C D E F G This is a test call for recording transcriptions with Twilio and AssemblyAI”
Check for and download recordings
Now let’s check out our recordings on Twilio and download them.
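A sketch of checking for recordings and downloading one of them is shown below. It assumes recordings can be fetched from Twilio's media URL by appending ".mp3" to the recording resource path, which works when HTTP authentication on media is not enforced for the account. The function names and the `index` parameter are illustrative.

```python
import urllib.request


def recording_mp3_url(account_sid, recording_sid):
    """Build the .mp3 media URL for one recording."""
    return (
        f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}"
        f"/Recordings/{recording_sid}.mp3"
    )


def download_recording(client, account_sid, index=0, fetch=urllib.request.urlretrieve):
    """Download the recording at position `index` and return the local filename."""
    recordings = client.recordings.list(limit=20)
    recording = recordings[index]
    filename = f"{recording.sid}.mp3"
    # `fetch` takes (url, filename); the default saves the file to disk
    fetch(recording_mp3_url(account_sid, recording.sid), filename)
    return filename


# Example (calls the live API, so it is left commented out):
# from twilio.rest import Client
# client = Client(account_sid, auth_token)
# print("Saved", download_recording(client, account_sid, index=0))
```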
When we’re done, the output in the terminal should look like:
I ran this three times to pull down all three recordings. You can alternatively execute the code below to download all of them at once.
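Downloading everything in one pass could look something like this sketch. As before, it assumes the media URL is the recording path plus ".mp3" and that media downloads are not behind HTTP authentication; the function name is hypothetical.

```python
import urllib.request


def download_all_recordings(client, account_sid, fetch=urllib.request.urlretrieve):
    """Download every recording on the account; return the list of filenames."""
    filenames = []
    for recording in client.recordings.list():
        url = (
            f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}"
            f"/Recordings/{recording.sid}.mp3"
        )
        filename = f"{recording.sid}.mp3"
        fetch(url, filename)  # saves the .mp3 next to this script
        filenames.append(filename)
    return filenames


# Example (calls the live API, so it is left commented out):
# from twilio.rest import Client
# client = Client(account_sid, auth_token)
# for name in download_all_recordings(client, account_sid):
#     print("Saved", name)
```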
Get an AssemblyAI API key
Go to AssemblyAI to get an API key. You will see your API key where I’ve circled and blocked out in red:
Add this line to your configure.py file
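The exact line isn't reproduced here, so the following is only a guess at its shape. The variable name `assembly_key` and the environment variable name are assumptions; use whatever name your transcription script imports.

```python
import os

# Hypothetical variable name; paste your key in place of the placeholder,
# or export ASSEMBLYAI_API_KEY (also an assumed name) instead.
assembly_key = os.environ.get("ASSEMBLYAI_API_KEY", "<your AssemblyAI API key>")
```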
Transcribe them via AssemblyAI
Now we'll use AssemblyAI's API to transcribe the .mp3 files we have, with AssemblyAI's Topic Detection feature enabled. Topic Detection detects topics within the transcription text, which is useful for automatically triggering an action when the text contains a topic we're looking for. We'll upload each .mp3 file to AssemblyAI's upload endpoint, transcribe it with Topic Detection enabled via the AssemblyAI transcription endpoint, and download the transcription as a .json file.
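The upload, transcribe, and save flow can be sketched as follows. This version uses only the standard library (the original script may well have used the requests package), and it assumes AssemblyAI's v2 REST endpoints with the `iab_categories` flag enabling Topic Detection. All function names are hypothetical.

```python
import json
import time
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"


def _request(url, api_key, data=None, content_type=None):
    """Send a GET (no data) or POST (with data) and return the parsed JSON reply."""
    headers = {"authorization": api_key}
    if content_type:
        headers["content-type"] = content_type
    req = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def upload_audio(path, api_key):
    """Upload a local audio file and return the URL AssemblyAI assigns to it."""
    with open(path, "rb") as f:
        reply = _request(f"{API_BASE}/upload", api_key, data=f.read(),
                         content_type="application/octet-stream")
    return reply["upload_url"]


def transcript_request(audio_url):
    """Request body for a transcript with Topic Detection turned on."""
    return {"audio_url": audio_url, "iab_categories": True}


def start_transcript(audio_url, api_key):
    """Queue a transcription job and return its id."""
    body = json.dumps(transcript_request(audio_url)).encode("utf-8")
    reply = _request(f"{API_BASE}/transcript", api_key, data=body,
                     content_type="application/json")
    return reply["id"]


def wait_for_transcript(fetch, interval=3.0, max_polls=100):
    """Call `fetch()` until the job reports 'completed' or 'error'."""
    for _ in range(max_polls):
        result = fetch()
        if result["status"] in ("completed", "error"):
            return result
        time.sleep(interval)
    raise TimeoutError("transcript did not finish in time")


def transcribe_file(path, api_key, out_path):
    """Upload `path`, transcribe it, and save the full JSON result to `out_path`."""
    audio_url = upload_audio(path, api_key)
    transcript_id = start_transcript(audio_url, api_key)
    result = wait_for_transcript(
        lambda: _request(f"{API_BASE}/transcript/{transcript_id}", api_key))
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(result, f, indent=2)
    return result
```

Transcription is asynchronous, which is why the sketch polls the transcript endpoint instead of expecting the text in the first response.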
When we are done, a request to transcribe one file should look like:
Comparison between Twilio’s and AssemblyAI’s transcription
We weren’t able to get topic detection from Twilio’s built-in transcription as we could from AssemblyAI’s API, but we can still compare their transcription accuracy. We’ll build two new scripts to do this: first, one to retrieve all our transcripts from Twilio; second, one to print out our transcripts from AssemblyAI.
Our Twilio script will invoke the client, fetch all our transcripts on our account (we should only have 3 right now) and print them out:
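A sketch of that script, assuming a Twilio REST client built from the saved credentials; the function name is hypothetical.

```python
def fetch_twilio_transcripts(client, limit=20):
    """Return the text of each transcription stored on the Twilio account."""
    return [t.transcription_text for t in client.transcriptions.list(limit=limit)]


# Example (calls the live API, so it is left commented out):
# from twilio.rest import Client
# client = Client(account_sid, auth_token)
# for text in fetch_twilio_transcripts(client):
#     print(text)
```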
Our AssemblyAI script will just take our downloaded JSON files and print out the text. I manually entered the JSON file names.
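Something along these lines would do it, assuming each saved file is the raw transcript response, which keeps the transcript under a top-level "text" key. The file names in the usage comment are placeholders.

```python
import json


def load_transcript_texts(json_paths):
    """Read each saved AssemblyAI response and return its transcript text."""
    texts = []
    for path in json_paths:
        with open(path, encoding="utf-8") as f:
            texts.append(json.load(f)["text"])
    return texts


# Example with placeholder file names:
# for text in load_transcript_texts(["first.json", "second.json", "third.json"]):
#     print(text)
```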
In a side by side comparison:
We can see that AssemblyAI’s transcription is more accurate than Twilio’s built-in transcription service, even on messages as short as these. We can also compare the pricing for AssemblyAI with the pricing for Twilio’s built-in transcription and see that AssemblyAI costs less than ⅓ as much ($0.015 vs $0.05).
Finally, after you’re done giving out your burner phone number to those pesky services that keep asking for a phone number, you can delete it.
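Releasing the number might look like the sketch below. It assumes a Twilio REST client built from the saved credentials and looks the number up by its phone number; the function name is hypothetical.

```python
def release_number(client, phone_number):
    """Delete `phone_number` from the account; return True if it was removed."""
    matches = client.incoming_phone_numbers.list(phone_number=phone_number)
    if not matches:
        return False  # nothing to delete
    return matches[0].delete()


# Example (calls the live API, so it is left commented out):
# from twilio.rest import Client
# client = Client(account_sid, auth_token)
# print("Deleted:", release_number(client, "+15005550006"))
```

Once a number is released it goes back into Twilio's pool, so there is no guarantee you can get the same number again.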
It should run like this:
In this tutorial, we found out how to use Python to programmatically get a number from Twilio, set up a Flask application to respond to and record a phone call, transcribe our phone call with AssemblyAI, and delete our number after we’re done with it.