Transcribe streaming audio from a microphone in TypeScript
Learn how to transcribe streaming audio using Real-Time Transcription in TypeScript.
Overview
By the end of this tutorial, you'll be able to transcribe audio from your microphone in TypeScript.
Before you begin
To complete this tutorial, you need:
- Node.js installed.
- TypeScript installed.
- An AssemblyAI account with a credit card set up.
Here's the full sample code for what you'll build in this tutorial:
```typescript
import { AssemblyAI, RealtimeTranscript } from 'assemblyai'
import recorder from 'node-record-lpcm16'

const run = async () => {
  const client = new AssemblyAI({
    apiKey: 'YOUR_API_KEY'
  })

  const rt = client.realtime.createService({
    sampleRate: 16_000
  })

  rt.on('open', ({ sessionId }) => {
    console.log(`Session opened with ID: ${sessionId}`)
  })

  rt.on('error', (error: Error) => {
    console.error('Error:', error)
  })

  rt.on('close', (code: number, reason: string) =>
    console.log('Session closed:', code, reason)
  )

  rt.on('transcript', (transcript: RealtimeTranscript) => {
    if (!transcript.text) {
      return
    }
    if (transcript.message_type === 'PartialTranscript') {
      console.log('Partial:', transcript.text)
    } else {
      console.log('Final:', transcript.text)
    }
  })

  try {
    console.log('Connecting to real-time transcript service')
    await rt.connect()

    console.log('Starting recording')
    const recording = recorder.record({
      channels: 1,
      sampleRate: 16_000,
      audioType: 'wav' // Linear PCM
    })

    recording.stream().pipe(rt.stream())

    // Stop recording and close connection using Ctrl-C.
    process.on('SIGINT', async function () {
      console.log()
      console.log('Stopping recording')
      recording.stop()

      console.log('Closing real-time transcript connection')
      await rt.close()
      process.exit()
    })
  } catch (error) {
    console.error(error)
  }
}

run()
```
Step 1: Install the SDK
Install the package via NPM:

```bash
npm install assemblyai
```
Step 2: Configure the API key
In this step, you'll create an SDK client and configure it to use your API key.
- 1
Browse to your dashboard, and then click the text under Your API key to copy it.
- 2
Configure the SDK to use your API key. Replace `YOUR_API_KEY` with your copied API key.

```typescript
import { AssemblyAI } from 'assemblyai'

const client = new AssemblyAI({
  apiKey: 'YOUR_API_KEY'
})
```
Step 3: Create a real-time service
- 1
Create a new real-time service from the AssemblyAI client. If you don't set a sample rate, it defaults to 16 kHz.
```typescript
const rt = client.realtime.createService({
  sampleRate: 16_000
})
```

Sample rate

The `sampleRate` is the number of audio samples per second, measured in hertz (Hz). Higher sample rates result in higher quality audio, which may lead to better transcripts, but also more data being sent over the network.

We recommend the following sample rates:

- Minimum quality: `8_000` (8 kHz)
- Medium quality: `16_000` (16 kHz)
- Maximum quality: `48_000` (48 kHz)
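To get a feel for the bandwidth tradeoff, here's a quick sketch that computes the raw data rate for each recommended sample rate, assuming the single-channel, 16-bit linear PCM format used later in this tutorial. The `bytesPerSecond` helper is illustrative, not part of the SDK:

```typescript
// Raw data rate for uncompressed linear PCM audio:
// sampleRate samples/second × 2 bytes per 16-bit sample × channel count.
const bytesPerSecond = (sampleRate: number, channels = 1): number =>
  sampleRate * 2 * channels

console.log(bytesPerSecond(8_000))  // → 16000 bytes/s (minimum quality)
console.log(bytesPerSecond(16_000)) // → 32000 bytes/s (medium quality)
console.log(bytesPerSecond(48_000)) // → 96000 bytes/s (maximum quality)
```

So going from 16 kHz to 48 kHz triples the amount of audio data streamed over the WebSocket.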
- 2
Create functions to handle events from the real-time service.
```typescript
rt.on('open', ({ sessionId }) => {
  console.log(`Session opened with ID: ${sessionId}`)
})

rt.on('error', (error: Error) => {
  console.error('Error:', error)
})

rt.on('close', (code: number, reason: string) => {
  console.log('Session closed:', code, reason)
})
```

- 3
Create another function to handle transcripts. The real-time transcriber returns two types of transcripts: partial and final.
- Partial transcripts are returned as the audio is being streamed to AssemblyAI.
- Final transcripts are returned when the service detects a pause in speech.
```typescript
rt.on('transcript', (transcript: RealtimeTranscript) => {
  if (!transcript.text) {
    return
  }
  if (transcript.message_type === 'PartialTranscript') {
    console.log('Partial:', transcript.text)
  } else {
    console.log('Final:', transcript.text)
  }
})
```

tip

You can also use the `on("transcript.partial")` and `on("transcript.final")` callbacks to handle partial and final transcripts separately.
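The split can be sketched with plain functions that mirror the `message_type` check above. These helper names are hypothetical, and with the SDK you would register them via the separate callbacks rather than routing by hand:

```typescript
// Hypothetical helpers mirroring separate partial/final callbacks.
type MessageType = 'PartialTranscript' | 'FinalTranscript'

const handlePartial = (text: string): string => `Partial: ${text}`
const handleFinal = (text: string): string => `Final: ${text}`

// Route a transcript to the matching handler, skipping empty text.
const route = (messageType: MessageType, text: string): string | undefined => {
  if (!text) return undefined
  return messageType === 'PartialTranscript'
    ? handlePartial(text)
    : handleFinal(text)
}

console.log(route('PartialTranscript', 'hello wor')) // → Partial: hello wor
console.log(route('FinalTranscript', 'Hello world.')) // → Final: Hello world.
```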
Step 4: Connect the real-time service
Real-Time Transcription uses WebSockets to stream audio to AssemblyAI. This requires first establishing a connection to the API.
```typescript
await rt.connect()
```
Step 5: Record audio from microphone
In this step, you'll use the `node-record-lpcm16` package to record audio from your microphone.
- 1
Install `node-record-lpcm16`:

```bash
npm install node-record-lpcm16
```
- 2
`node-record-lpcm16` depends on SoX, a cross-platform audio library. Make sure SoX is installed and available on your system.
- 3
In the `on("open")` callback, create a new microphone stream. The `sampleRate` needs to be the same value as the real-time service settings.

```typescript
const recording = recorder.record({
  channels: 1,
  sampleRate: 16_000,
  audioType: 'wav' // Linear PCM
})
```

Audio data format

The `node-record-lpcm16` package formats the audio data for you. If you want to stream data from elsewhere, make sure that your audio data is in the following format:

- Single channel
- 16-bit
- Linear PCM
- 4
Pipe the recording stream to the real-time stream to send the audio for transcription.

```typescript
recording.stream().pipe(rt.stream())
```

Send audio buffers

If you don't use streams, you can also send buffers of audio data using `rt.sendAudio(buffer)`.
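If your audio comes from a source that produces 32-bit float samples (for example, the Web Audio API), you would need to down-convert it to the 16-bit linear PCM format described above before sending it. Here's a minimal sketch of such a conversion; the `floatTo16BitPCM` helper is hypothetical, not part of the SDK:

```typescript
// Convert 32-bit float samples in [-1, 1] to a 16-bit little-endian
// linear PCM buffer, clamping any out-of-range values first.
const floatTo16BitPCM = (samples: Float32Array): Buffer => {
  const buf = Buffer.alloc(samples.length * 2)
  samples.forEach((sample, i) => {
    const clamped = Math.max(-1, Math.min(1, sample))
    buf.writeInt16LE(Math.round(clamped * 32_767), i * 2)
  })
  return buf
}

const pcm = floatTo16BitPCM(new Float32Array([0, 0.5, -1, 2]))
console.log(pcm.length) // → 8 (four 16-bit samples, 2 bytes each)
```

A buffer produced this way could then be passed to `rt.sendAudio(pcm)`.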
Step 6: Disconnect the real-time service
When you are done, disconnect the transcriber to close the connection.
```typescript
process.on('SIGINT', async function () {
  console.log()
  console.log('Stopping recording')
  recording.stop()

  console.log('Closing real-time transcript connection')
  await rt.close()
  process.exit()
})
```
Need some help?
If you get stuck, or have any other questions, we'd love to help you out. Ask our support team in our Discord server.