Tutorials

Transcribe audio to text on Cloudflare Workers with AssemblyAI and TypeScript

In this tutorial, you'll learn how to create an application that transcribes the audio files (and video files) to text. You'll create a TypeScript backend on top of Cloudflare Workers and use the AssemblyAI APIs to transcribe the audio.

Cloudflare logo

Prerequisites

You can find the source code for this project on GitHub and use the "Deploy with Workers" button inside the README file to quickly deploy the code to the Cloudflare Workers runtime.

Create the Cloudflare Worker

Create your worker project using the C3 (create-cloudflare-cli) tool:

npm create cloudflare@latest

Answer the prompts as follows:

  • In which directory do you want to create your application? ./audio-transcriber.
  • You can use lowercase letters and dashes, but no spaces or uppercase letters.
  • What type of application do you want to create? "Hello World" Worker
  • Do you want to use TypeScript? yes
  • Do you want to use git for version control? no or yes, up to you.
  • Do you want to deploy your application? no or yes, up to you.
using create-cloudflare version 2.0.14

╭ Create an application with Cloudflare Step 1 of 3
│
├ In which directory do you want to create your application?
│ dir ./audio-transcriber
│
├ What type of application do you want to create?
│ type "Hello World" Worker
│
├ Do you want to use TypeScript?
│ yes typescript
│
├ Copying files from "hello-world" template
│
├ Do you want to use TypeScript?
│ yes typescript
│
├ Retrieving current workerd compatibility date
│ compatibility date 2023-07-17
│
├ Do you want to use git for version control?
│ no git
│
╰ Application created

╭ Installing dependencies Step 2 of 3
│
├ Installing dependencies
│ installed via `npm install`
│
╰ Dependencies Installed

╭ Deploy with Cloudflare Step 3 of 3
│
├ Do you want to deploy your application?
│ yes deploy via `npm run deploy`
│
├ Logging into Cloudflare checking authentication status
│ logged in
│
├ Selecting Cloudflare account retrieving accounts
│ account ***@***.com's Account
│
┘ Deploying your application .
├ Deploying your application
│ deployed via `npm run deploy`
│
├  SUCCESS  View your deployed application at https://audio-transcriber.***.workers.dev
│
│ Navigate to the new directory cd audio-transcriber
│ Run the development server npm run start
│ Deploy your application npm run deploy
│ Read the documentation https://developers.cloudflare.com/workers
│ Stuck? Join us at https://discord.gg/cloudflaredev
│
├ Waiting for DNS to propagate
│ DNS propagation complete.
│
├ Waiting for deployment to become available
│ deployment is ready at: https://audio-transcriber.***.workers.dev
│
├ Opening browser
│
╰ See you again soon!

Change directories into your new project:

cd audio-transcriber

You just scaffolded a Cloudflare Workers project which has a lot of files, mostly configuration for TypeScript and Cloudflare. You can find the code for your Worker under src/worker.ts.

/**
 * Welcome to Cloudflare Workers! This is your first worker.
 *
 * - Run `npm run dev` in your terminal to start a development server
 * - Open a browser tab at http://localhost:8787/ to see your worker in action
 * - Run `npm run deploy` to publish your worker
 *
 * Learn more at https://developers.cloudflare.com/workers/
 */

export interface Env {
  // Example binding to KV. Learn more at https://developers.cloudflare.com/workers/runtime-apis/kv/
  // MY_KV_NAMESPACE: KVNamespace;
  //
  // Example binding to Durable Object. Learn more at https://developers.cloudflare.com/workers/runtime-apis/durable-objects/
  // MY_DURABLE_OBJECT: DurableObjectNamespace;
  //
  // Example binding to R2. Learn more at https://developers.cloudflare.com/workers/runtime-apis/r2/
  // MY_BUCKET: R2Bucket;
  //
  // Example binding to a Service. Learn more at https://developers.cloudflare.com/workers/runtime-apis/service-bindings/
  // MY_SERVICE: Fetcher;
  //
  // Example binding to a Queue. Learn more at https://developers.cloudflare.com/queues/javascript-apis/
  // MY_QUEUE: Queue;
}

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    return new Response('Hello World!');
  },
};

The code is very short, the fetch function receives an HTTP request and returns an HTTP response saying “Hello World!”.

Run the following command to start the worker:

npm run start

Look for the worker URL in the terminal output and use a browser to open it. You should see "Hello World!".

Info

If you look into the package.json file, you can find the different npm scripts that are available under the scripts property. There is the start script to run your worker locally, and the deploy script to deploy the worker to the Cloudflare Worker runtime. Both of these commands use the wrangler CLI which is a CLI for developers using Cloudflare.

The fetch function will handle all requests. In this project, you'll use the itty-router library to make it easier to execute different code depending on the path of the HTTP request. This library was originally made for Cloudflare Workers, but has expanded to different platforms too.

In a separate terminal, install the itty-router library using NPM.

npm install --save itty-router

Replace the code in worker.ts with the following:

import { IRequest, Router, error, html, json, text } from 'itty-router';

const router = Router();

export interface Env {
}

router.get('/hello/:name', ({ params: { name } }: IRequest) => html(`<!DOCTYPE html>
<body>
  Hello ${name}!
</body>`));

export default {
  fetch: (req: IRequest, ...args: any) => router
    .handle(req, ...args)
    .then(json)
    .catch(error)
};

This code sets up the router to handle all requests. If the return value of the selected route is not an HTTP response, it will convert the value to a JSON HTTP response using the json function. Errors will be caught by the error function.

Currently, there's only one route, /hello/:name, in which :name is a route parameter. This route will respond with HTML. The name parameter is passed into the HTML so it'll greet whichever name is set in the path.

Back in the browser, change the path to /hello/[YOURNAME] and replace [YOUR_NAME] with your name. You should be greeted with your name.

Create a file upload form

AssemblyAI can transcribe any audio file that's publicly available via URL, but if your audio file is on your local disk, you'll need to upload it somewhere to make it accessible to AssemblyAI.

Change the existing route and function so it returns HTML for a file upload form, and add another route that will accept the form submission using HTTP POST at path /upload-file:

router.get('/', () => html(`<!DOCTYPE html>
<body>
  <form action="/upload-file" method="post" enctype="multipart/form-data">
    <label for="file">Upload an audio or video file:</label> <br>
    <input type="file" name="file" id="file" /><br>
    <button type="submit">Submit</button>
  </form>
</body>`))
  .post('/upload-file', async (request, env: Env) => {
    const formData = await request.formData();
    const file = formData.get('file') as unknown as File;
    return text(`You uploaded ${file.name} (${file.size} bytes)`);
  });

The /upload-file route will respond with the file name and the size, just so you can verify that it's working.

Now that you can receive a file, you could upload the audio file to any file hosting service, including Cloudflare R2. To keep things as simple as possible, you will upload your audio file to AssemblyAI's CDN.

However, to interact with the AssemblyAI API, you'll first need to configure your API key.
Create a new file in the project root called .dev.vars and add your Assembly API key like this:

ASSEMBLYAI_API_KEY=[your_assemblyai_api_key]

Replace [your_assemblyai_api_key] with your AssemblyAI API key.

Warning

This API key will only be used during development. Make sure to keep this file and the API key a secret, and never check this file into source control.

Then, update the Env interface in the worker.ts file to include the API key:

export interface Env {
  ASSEMBLYAI_API_KEY: string;
}

Next, install the AssemblyAI JavaScript SDK using npm.

npm install assemblyai --save

For all the previous changes to take effect, you need to restart your worker. Press ctrl + c to stop the worker, and use npm run start to start it again.

Update src/worker.ts to import the AssemblyAI client from the assemblyai package.

import { IRequest, Router, error, html, json, text } from 'itty-router';
import { AssemblyAI } from 'assemblyai';

Then inside the /upload-file route, create an AssemblyAI instance passing in the API key to the constructor, then use the client.files.upload function to upload the file to AssemblyAI's CDN:

  .post('/upload-file', async (request, env: Env) => {
    const formData = await request.formData();
    const file = formData.get('file') as unknown as File;

    const client = new AssemblyAI({ apiKey: env.ASSEMBLYAI_API_KEY });
    const uploadUrl = await client.files.upload(file.stream());
    return text(`Uploaded file to ${uploadUrl}`);
  });

Note

Files uploaded to AssemblyAI are exclusively accessible to AssemblyAI's services. These files are encrypted in transit and at rest, and will be automatically deleted after a period of time.

Go back to the browser at the root path, and upload a new file. You should receive a response with the URL of the file uploaded to AssemblyAI's CDN.

Create a transcript

Now that you have a file that's accessible (only) to AssemblyAI’s services, you can create a transcript using the AssemblyAI API.

Update the /upload-file route function so it creates a transcript from the uploaded file, then create a redirect response to another route, which we will define next.

  .post('/upload-file', async (request, env: Env) => {
    const formData = await request.formData();
    const file = formData.get('file') as unknown as File;

    const client = new AssemblyAI({ apiKey: env.ASSEMBLYAI_API_KEY });
    const uploadUrl = await client.files.upload(file.stream());
    let transcript = await client.transcripts.submit({ audio: uploadUrl });

    const newUrl = new URL(`/transcript/${transcript.id}`, request.url);
    return Response.redirect(newUrl.toString(), 303);
  })

When you create a transcript, you will receive a transcript object back, with many empty properties such as the transcript text. That's because AssemblyAI queues the transcript and will then process the audio file. However, for now, you only need the id to redirect the user.

Shortcut

You can upload and create a transcript using a single function like this: await client.transcripts.submit({ audio: file.stream() }).

Get the transcript

While you can query the transcript object at any time, the transcript text, and many other properties will be empty until AssemblyAI is finished processing the audio file.

Create an HTTP GET route with the /transcripts/:id route pattern. This route will retrieve the transcript by ID and return the text if the status is completed. Otherwise, the status itself will be returned which can either be queued, processing, or error, along with a Refresh: 1 header which will automatically refresh the page after 1 seconds.

  .get('/transcript/:id', async (request: IRequest, env: Env) => {
    const id = request.params.id;
    const client = new AssemblyAI({ apiKey: env.ASSEMBLYAI_API_KEY });
    const transcript = await client.transcripts.get(id);
    if (transcript.status === 'completed') {
      return text(transcript.text);
    } else {
      return text(transcript.status, {
        headers: {
          'Refresh': '1' // refreshes the browser every 1 seconds
        }
      });
    }
  });

Now, head back to the browser, upload a file, and you should be redirected to the transcript route, which then returns the processing status, and eventually the text of the transcript.

Info

Instead of polling the transcript endpoint, you can use the webhook_url property to be notified when the transcript is completed, as documented in the AssemblyAI webhook guide.

Deploy Worker to Cloudflare

You can now deploy the worker to the Cloudflare Worker runtime. Use the npm deploy script to deploy the worker:

npm run deploy

The output will give you the URL where the worker is deployed. Earlier, you configured the AssemblyAI API key as an environment variable, but only locally. You also need to configure the key for your deployed worker. You can use the following wrangler command for this:

wrangler secret put ASSEMBLYAI_API_KEY

After running this command, you will be prompted to enter the value of the secret.

With the worker deployed and the secret configured, you and everyone else can now use the application just like you did locally.

Conclusion

You just learned how to create a Cloudflare Worker project for local development and how to deploy it to the runtime environment. You also learned how to use a router to execute different code depending on the URL of the incoming HTTP requests, how to handle a file upload. Finally, you used the AssemblyAI API to upload audio files and to transcribe those files.

Cloudflare has a lot more developer products that you can integrate into your application like queues, storage, databases, and more.

AssemblyAI can also do more than transcribe audio. AssemblyAI can summarize your transcript, create chapters with summaries, detect hate speech, identify speakers, ask LeMUR (AssemblyAI's LLM framework) any questions about long audio content, and more.