In this post, we'll walk through uploading an audio file from your local machine to the AssemblyAI Speech-to-Text API and submitting it for transcription using C#.
At a glance
To accomplish this, we'll interact with three separate AssemblyAI API endpoints.
Upload a file
This endpoint uploads an audio file directly to the AssemblyAI API. It returns the URL of the uploaded file, which we'll use in the subsequent request to start the transcription.
Note: All uploads are deleted immediately after transcription; AssemblyAI does not store uploads indefinitely.
Submit for transcription
This endpoint takes the URL of the uploaded file and submits the file for transcription. The response contains a unique identifier that is used to query the transcription status and to retrieve the result once it is complete.
Get the transcription
This endpoint takes the ID of the requested transcription and returns its status, along with the result once the transcription is complete.
Below is a list of the tools you'll need to follow along with this walk-through.
- .NET 5 SDK (earlier versions of .NET Core and .NET Framework 4.5+ should work as well)
- Visual Studio 2019 Community Edition or above.
- Optionally, you could use Visual Studio Code, but we won't be covering the steps for doing so.
- An AssemblyAI API key. See below for how to get one.
How to get an API key from AssemblyAI
Before you can get started, you'll need to sign up for a free account. With a free account, you can transcribe up to 5 hours of audio per month. If you exceed your 5 hours of transcription, you can upgrade to a Pro Plan.
To sign up for your account and get your API key, navigate to the sign-up page, add your information, and click submit. You'll need to verify your email address, so check your inbox for the verification email.
Once you click the verification link in your email, you'll be taken to your account dashboard, where you can copy your API token.
Create the project
- Open Visual Studio 2019, then locate and select the Console Application template.
- Give the project a name and choose where on disk to store the project files. We'll name the project AssemblyAITranscriber.
- Finally, make sure you select .NET 5 as your target framework version.
Creating the API model classes
Now we need to create our strongly typed API models that will be used for serialization/deserialization of requests and responses when interacting with the API. At the root of your project, create a new folder named Models and add each of these classes to that folder.
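The model classes are not reproduced here, so the following is a minimal sketch of what they might look like rather than the exact classes from the original project. It assumes System.Text.Json for serialization and the JSON field names upload_url, audio_url, id, status, text and error; the class names are hypothetical, and in a real project each class would typically live in its own file inside the Models folder.

```csharp
using System.Text.Json.Serialization;

namespace AssemblyAITranscriber.Models
{
    // Response from the upload endpoint (class name is an assumption).
    public class UploadAudioResponse
    {
        [JsonPropertyName("upload_url")]
        public string UploadUrl { get; set; }
    }

    // Request body sent to start a transcription.
    public class TranscriptionRequest
    {
        [JsonPropertyName("audio_url")]
        public string AudioUrl { get; set; }
    }

    // Response returned when submitting or fetching a transcription.
    // Only a handful of fields are modelled; the real payload has more.
    public class TranscriptionResponse
    {
        [JsonPropertyName("id")]
        public string Id { get; set; }

        [JsonPropertyName("status")]
        public string Status { get; set; }

        [JsonPropertyName("text")]
        public string Text { get; set; }

        [JsonPropertyName("error")]
        public string Error { get; set; }
    }
}
```

The JsonPropertyName attributes let the C# properties keep PascalCase names while the JSON on the wire keeps its snake_case names.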
Creating the API Client class
We'll need to create the API Client class that will be used to interact with the API. This class will handle making HTTP requests to the API and perform deserialization of API responses to strongly typed objects.
Add a new class file to the root of your project named AssemblyAIApiClient.cs.
To begin, we'll add a few properties to this class to hold our API settings as well as a constructor to set those properties.
Add a simple helper method that will support sending an HTTP request to the API and deserializing its response.
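As a rough sketch of the class so far (settings fields, constructor and request helper), something like the following could work. The member names, the optional HttpClient parameter and the use of System.Text.Json are assumptions made for illustration; the API token is sent as the value of the Authorization header.

```csharp
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

public class AssemblyAIApiClient
{
    private readonly string _apiToken;
    private readonly string _baseUrl;
    private readonly HttpClient _httpClient;

    public AssemblyAIApiClient(string apiToken, string baseUrl, HttpClient httpClient = null)
    {
        _apiToken = apiToken ?? throw new ArgumentNullException(nameof(apiToken));
        _baseUrl = (baseUrl ?? throw new ArgumentNullException(nameof(baseUrl))).TrimEnd('/');
        // Allow a shared HttpClient to be injected; otherwise create one.
        _httpClient = httpClient ?? new HttpClient();
    }

    // Sends a request to {baseUrl}/{path} with the API token attached,
    // then deserializes the JSON response body to TModel.
    public async Task<TModel> SendRequestAsync<TModel>(
        HttpMethod method, string path, HttpContent content = null)
    {
        using var request = new HttpRequestMessage(method, $"{_baseUrl}/{path}")
        {
            Content = content
        };
        request.Headers.TryAddWithoutValidation("Authorization", _apiToken);

        using HttpResponseMessage response = await _httpClient.SendAsync(request);
        response.EnsureSuccessStatusCode(); // throws on non-2xx responses

        string json = await response.Content.ReadAsStringAsync();
        return JsonSerializer.Deserialize<TModel>(json);
    }
}
```

Keeping the header and deserialization logic in one helper means each endpoint method only has to supply the HTTP verb, the relative path and the request body.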
Add a new method to the class that will be used to upload a file from disk to the API.
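One possible shape for the upload step is sketched below. To keep the snippet readable on its own it is written as a free-standing static helper that takes an HttpClient, whereas in this walk-through the same logic would live in a method on AssemblyAIApiClient. The names UploadSketch, BuildUploadRequest and ParseUploadUrl are hypothetical, and the sketch assumes the upload endpoint lives at "upload" under the base URL, accepts the raw file bytes as the request body, and returns the file location in an upload_url field.

```csharp
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text.Json;
using System.Threading.Tasks;

public static class UploadSketch
{
    // Builds the POST request for the upload endpoint; the audio is sent as raw bytes.
    public static HttpRequestMessage BuildUploadRequest(string baseUrl, string apiToken, Stream audio)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, $"{baseUrl.TrimEnd('/')}/upload")
        {
            Content = new StreamContent(audio)
        };
        request.Content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        request.Headers.TryAddWithoutValidation("Authorization", apiToken);
        return request;
    }

    // Extracts the "upload_url" field from the JSON response body.
    public static string ParseUploadUrl(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        return doc.RootElement.GetProperty("upload_url").GetString();
    }

    // Streams the file from disk to the API and returns the URL of the uploaded file.
    public static async Task<string> UploadFileAsync(
        HttpClient http, string baseUrl, string apiToken, string filePath)
    {
        using FileStream audio = File.OpenRead(filePath);
        using HttpRequestMessage request = BuildUploadRequest(baseUrl, apiToken, audio);
        using HttpResponseMessage response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return ParseUploadUrl(await response.Content.ReadAsStringAsync());
    }
}
```

Streaming the file with StreamContent avoids loading the whole audio file into memory before sending it.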
Next, add a new method that will initiate the transcription process by submitting an audio file via its URL.
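The submit step could be sketched along these lines. As with any illustration it rests on assumptions: the endpoint is taken to be "transcript" under the base URL, the request body is a JSON object with a single audio_url field, and the response carries the transcription's id. SubmitSketch and its method names are hypothetical, and the helper takes an HttpClient only so the snippet stands on its own.

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class SubmitSketch
{
    // Serializes the request body: {"audio_url": "<URL returned by the upload step>"}.
    public static string BuildTranscriptBody(string audioUrl)
    {
        var body = new Dictionary<string, string> { ["audio_url"] = audioUrl };
        return JsonSerializer.Serialize(body);
    }

    // Submits the uploaded file for transcription and returns the transcription ID.
    public static async Task<string> SubmitAudioFileAsync(
        HttpClient http, string baseUrl, string apiToken, string audioUrl)
    {
        using var request = new HttpRequestMessage(HttpMethod.Post, $"{baseUrl.TrimEnd('/')}/transcript")
        {
            Content = new StringContent(BuildTranscriptBody(audioUrl), Encoding.UTF8, "application/json")
        };
        request.Headers.TryAddWithoutValidation("Authorization", apiToken);

        using HttpResponseMessage response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();

        using JsonDocument doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
        return doc.RootElement.GetProperty("id").GetString();
    }
}
```

The returned ID is the only value the next step needs, which is why this sketch returns a plain string rather than a full response object.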
The final method to add will retrieve a transcription by its unique identifier. The object returned from this method will indicate the status of the transcription as well as the result of the transcription once complete.
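A minimal sketch of the retrieval step follows, assuming a GET request to "transcript/{id}" under the base URL and a response with a status field (for example queued, processing, completed or error) plus a text field that is only populated once the transcription has completed. GetTranscriptSketch and its members are hypothetical names, and the tuple return type is used here purely to keep the example short.

```csharp
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

public static class GetTranscriptSketch
{
    // Pulls the status and (if present) the transcript text out of the JSON response.
    public static (string Status, string Text) ParseTranscript(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement root = doc.RootElement;
        string status = root.GetProperty("status").GetString();
        // "text" is null or absent until the transcription has completed.
        string text = root.TryGetProperty("text", out JsonElement t) && t.ValueKind == JsonValueKind.String
            ? t.GetString()
            : null;
        return (status, text);
    }

    // Fetches the current state of a transcription by its ID.
    public static async Task<(string Status, string Text)> GetTranscriptionAsync(
        HttpClient http, string baseUrl, string apiToken, string id)
    {
        using var request = new HttpRequestMessage(HttpMethod.Get, $"{baseUrl.TrimEnd('/')}/transcript/{id}");
        request.Headers.TryAddWithoutValidation("Authorization", apiToken);

        using HttpResponseMessage response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return ParseTranscript(await response.Content.ReadAsStringAsync());
    }
}
```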
Orchestrating the symphony
Now that our API client supports the three API endpoints we need, it's just a matter of sequencing the operations.
The remainder of the code will be updating Program.cs at the root of the project. Let's start by adding a few constants within Program.cs.
Make sure you add your API token in place of <YOUR_API_KEY> and specify the path to the audio file you want to transcribe.
Note: The audio file path can be relative or absolute. When using a relative path, it may be easier to add the audio file to the VS project and set its Build Action to "Content" and its Copy to Output Directory property to "Copy always".
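For illustration, the constants in Program.cs might look something like the following. The constant names, the base URL value and the sample file path are all assumptions; substitute your own token and path, and confirm the base URL against the API docs.

```csharp
using System;

internal static class Program
{
    // Replace with the API token from your AssemblyAI dashboard.
    private const string ApiKey = "<YOUR_API_KEY>";

    // Base URL of the REST API (an assumption; confirm against the API docs).
    private const string ApiUrl = "https://api.assemblyai.com/v2/";

    // Hypothetical relative path; point this at your own audio file.
    private const string AudioFilePath = "audio/sample.mp3";

    private static void Main()
    {
        // Placeholder body; the orchestration logic goes here.
        Console.WriteLine($"Transcribing {AudioFilePath} via {ApiUrl}");
    }
}
```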
The final step will be to add the orchestration logic to the body of the Main method. The orchestration code will:
- Upload the audio file
- Submit the file for transcription
- Poll the status of the transcription, until the transcription is complete.
- Perform post-processing (output some metadata about the transcription).
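The polling step in the list above is the only one with real control flow, so here is a small sketch of one way to write it. It is deliberately decoupled from the API client: getStatus is a hypothetical delegate that would wrap the client's get-transcription call, and the terminal status values "completed" and "error", the delay and the attempt limit are assumptions you may want to tune.

```csharp
using System;
using System.Threading.Tasks;

public static class PollingSketch
{
    // Calls getStatus repeatedly until it reports "completed" or "error",
    // waiting for the given delay between attempts.
    public static async Task<string> WaitForCompletionAsync(
        Func<Task<string>> getStatus, TimeSpan delay, int maxAttempts = 120)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            string status = await getStatus();
            if (status == "completed" || status == "error")
            {
                return status;
            }

            // Still queued or processing: wait before asking again.
            await Task.Delay(delay);
        }

        throw new TimeoutException($"Transcription did not finish after {maxAttempts} attempts.");
    }
}
```

In Main, the upload and submit calls would run first, the delegate passed to WaitForCompletionAsync would fetch the transcription by its ID and return its status, and the post-processing step would then print the transcript text and any metadata of interest. Capping the number of attempts keeps the program from polling forever if a transcription never reaches a terminal status.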
Running the application
If you'd like, you can download the sample file that was used in this walk-through. Download it in your browser, or with a tool such as cURL or wget, from the following address:
Assuming your project builds and runs, you should get output similar to the following:
And that is it! You've successfully uploaded and transcribed a file using the AssemblyAI API.
You can read more about the endpoints used here, as well as some of the more advanced features, in the AssemblyAI API Docs!