Transcribe Your Zoom Meetings

This guide walks through a Node.js service that captures audio from Zoom Real-Time Media Streams (RTMS) and transcribes it with AssemblyAI, both live during the meeting and asynchronously after it ends.

Zoom RTMS Documentation

For complete Zoom RTMS documentation, visit https://developers.zoom.us/docs/rtms/

Features

  • Real-time Transcription: Live transcription during meetings using AssemblyAI's streaming API
  • Asynchronous Transcription: Complete post-meeting transcription with advanced features
  • Flexible Audio Modes:
    • Mixed stream (all participants combined)
    • Individual streams (one per participant, each transcribed separately)
  • Multichannel Audio Support: Separate channels for different participants
  • Configurable Processing: Enable/disable real-time or async transcription independently

Setup

Prerequisites

  • Node.js 16+
  • FFmpeg installed on your system
  • Zoom RTMS Developer Preview access
  • AssemblyAI API key
  • ngrok (for local development and testing)

Installation

  1. Clone the example repository and install dependencies:

    git clone https://github.com/zkleb-aai/assemblyai-zoom-rtms.git
    cd assemblyai-zoom-rtms
    npm install
  2. Configure environment variables:

    cp .env.example .env

Fill in your .env file:

# Zoom Configuration
ZM_CLIENT_ID=your_zoom_client_id
ZM_CLIENT_SECRET=your_zoom_client_secret
ZOOM_SECRET_TOKEN=your_webhook_secret_token

# AssemblyAI Configuration
ASSEMBLYAI_API_KEY=your_assemblyai_api_key

# Service Configuration
PORT=8080
REALTIME_ENABLED=true
REALTIME_MODE=mixed
ASYNC_ENABLED=true
AUDIO_CHANNELS=mono
AUDIO_SAMPLE_RATE=16000
TARGET_CHUNK_DURATION_MS=100

Local development with ngrok

For testing and development, you can use ngrok to expose your local server to the internet:

  1. Install ngrok: Download from ngrok.com or install via package manager:

    # macOS
    brew install ngrok

    # Windows (chocolatey)
    choco install ngrok

    # Or download directly from ngrok.com
  2. Start your local server:

    npm start
  3. In a separate terminal, start ngrok:

    ngrok http 8080
  4. Copy the ngrok URL: ngrok will display a forwarding URL like:

    Forwarding https://example-abc123.ngrok-free.app -> http://localhost:8080
  5. Use the ngrok URL in your Zoom app webhook configuration:

    https://example-abc123.ngrok-free.app/webhook

Configuration options

Real-time transcription

  • REALTIME_ENABLED: Enable/disable live transcription (default: true)
  • REALTIME_MODE:
    • mixed: Single stream with all participants combined
    • individual: Separate streams per participant

Audio settings

  • AUDIO_CHANNELS: mono or multichannel
  • AUDIO_SAMPLE_RATE: Audio sample rate in Hz (default: 16000)
  • TARGET_CHUNK_DURATION_MS: Audio chunk duration for streaming (default: 100)
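
To make the relationship between these settings concrete, the sketch below computes how many bytes one chunk holds and buffers incoming audio into chunks of that size. It assumes 16-bit PCM samples, which the guide does not state, and `chunkSizeBytes` and `ChunkBuffer` are hypothetical names rather than code from the repository.

```javascript
// Assumption: audio is 16-bit (2-byte) PCM. Adjust if the stream uses another format.
const BYTES_PER_SAMPLE = 2;

// Bytes in one chunk for a given sample rate, duration, and channel count.
function chunkSizeBytes(sampleRate, chunkDurationMs, channelCount = 1) {
  const samplesPerChunk = Math.round((sampleRate * chunkDurationMs) / 1000);
  return samplesPerChunk * BYTES_PER_SAMPLE * channelCount;
}

// Hypothetical helper: collects incoming audio and hands back fixed-size chunks.
class ChunkBuffer {
  constructor(chunkSize) {
    this.chunkSize = chunkSize;
    this.pending = Buffer.alloc(0);
  }

  // Append data and return every complete chunk that is now available.
  push(data) {
    this.pending = Buffer.concat([this.pending, data]);
    const chunks = [];
    while (this.pending.length >= this.chunkSize) {
      chunks.push(this.pending.subarray(0, this.chunkSize));
      this.pending = this.pending.subarray(this.chunkSize);
    }
    return chunks;
  }

  // Return the remainder (possibly shorter than a full chunk) and reset.
  flush() {
    const rest = this.pending;
    this.pending = Buffer.alloc(0);
    return rest;
  }
}
```

With the defaults (16000 Hz, 100 ms, mono), one chunk works out to 1600 samples, or 3200 bytes under the 16-bit assumption.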

Async transcription

  • ASYNC_ENABLED: Enable/disable post-meeting transcription (default: true)
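
As an illustration of how these options and their documented defaults might be read at startup, here is a minimal sketch. `loadConfig` and `parseBool` are hypothetical helpers, not the repository's code, and the `mono` default for AUDIO_CHANNELS is an assumption taken from the sample .env file.

```javascript
// Treats only the string "true" (any case) as true; missing values use the fallback.
function parseBool(value, fallback) {
  if (value === undefined || value === '') return fallback;
  return String(value).trim().toLowerCase() === 'true';
}

// Builds a config object from environment variables, applying the documented defaults.
function loadConfig(env = process.env) {
  const config = {
    port: Number(env.PORT || 8080),
    realtimeEnabled: parseBool(env.REALTIME_ENABLED, true),
    realtimeMode: env.REALTIME_MODE || 'mixed',
    asyncEnabled: parseBool(env.ASYNC_ENABLED, true),
    audioChannels: env.AUDIO_CHANNELS || 'mono', // assumed default
    audioSampleRate: Number(env.AUDIO_SAMPLE_RATE || 16000),
    targetChunkDurationMs: Number(env.TARGET_CHUNK_DURATION_MS || 100),
  };
  // Fail fast on values the guide does not list as valid.
  if (!['mixed', 'individual'].includes(config.realtimeMode)) {
    throw new Error(`Unsupported REALTIME_MODE: ${config.realtimeMode}`);
  }
  if (!['mono', 'multichannel'].includes(config.audioChannels)) {
    throw new Error(`Unsupported AUDIO_CHANNELS: ${config.audioChannels}`);
  }
  return config;
}
```

Validating the two enumerated options at startup surfaces a typo in .env immediately instead of partway through a meeting.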

Usage

Start the service

npm start

The service will start on the configured port (default: 8080) and display:

🎧 Zoom RTMS to AssemblyAI Transcription Service
📋 Configuration:
Real-time: ✅ (mixed)
Audio: mono @ 16000Hz
Async: ✅
🚀 Server running on port 8080
📡 Webhook endpoint: http://localhost:8080/webhook

Configure Zoom webhook

  1. In your Zoom App configuration, set the webhook endpoint to:

    # For production
    https://your-domain.com/webhook
    # For local development with ngrok
    https://example-abc123.ngrok-free.app/webhook
  2. Subscribe to these events:

    • meeting.rtms_started
    • meeting.rtms_stopped
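
A webhook handler typically branches on the `event` field of the request body. The sketch below shows one way to route the two subscribed events. `routeWebhookEvent` and the handler names are hypothetical, and the payload contents are passed through untouched because their exact shape should be taken from the Zoom RTMS documentation.

```javascript
// Dispatches a parsed webhook body to the matching handler.
// Returns a short label so callers can log what happened.
function routeWebhookEvent(body, handlers) {
  const { event, payload = {} } = body || {};
  switch (event) {
    case 'meeting.rtms_started':
      handlers.onStarted(payload);
      return 'started';
    case 'meeting.rtms_stopped':
      handlers.onStopped(payload);
      return 'stopped';
    default:
      // Events that were not subscribed to are ignored rather than treated as errors.
      return 'ignored';
  }
}
```

Keeping the routing in a plain function makes it easy to exercise without starting an HTTP server.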

Testing with ngrok

When using ngrok for testing:

  1. Keep ngrok running: The ngrok tunnel must remain active during testing
  2. Update webhook URL: If you restart ngrok, you'll get a new URL that needs to be updated in your Zoom app configuration
  3. Monitor ngrok logs: ngrok shows incoming webhook requests in its terminal output
  4. Free tier limitations: The free ngrok tier has some limitations; consider upgrading for heavy testing
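
Because the forwarding URL changes between ngrok sessions, it can help to read it programmatically. The ngrok agent serves a local inspection API, by default at http://127.0.0.1:4040/api/tunnels, whose JSON response lists each tunnel's `public_url`. The sketch below only parses such a response; `webhookUrlFromTunnels` is a hypothetical helper, and fetching the JSON is left to the caller.

```javascript
// Picks the first HTTPS tunnel from an ngrok /api/tunnels response
// and appends the webhook path. Returns null when no HTTPS tunnel exists.
function webhookUrlFromTunnels(apiResponse, path = '/webhook') {
  const tunnels = (apiResponse && apiResponse.tunnels) || [];
  const httpsTunnel = tunnels.find((t) => String(t.public_url).startsWith('https://'));
  if (!httpsTunnel) return null;
  return new URL(path, httpsTunnel.public_url).toString();
}

// Example with a response shaped like the forwarding line shown earlier.
const sample = {
  tunnels: [{ public_url: 'https://example-abc123.ngrok-free.app', proto: 'https' }],
};
console.log(webhookUrlFromTunnels(sample));
```

On Node.js 18 or later, the built-in `fetch` can retrieve the JSON from the local API; on Node.js 16, an HTTP client of your choice is needed.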

Real-time output

During meetings, you'll see live transcription:

🚀 AssemblyAI session started: [abc12345]
🎙️ [abc12345] Hello everyone, welcome to the meeting
📝 [abc12345] FINAL: Hello everyone, welcome to the meeting.

Post-meeting files

After each meeting, the service generates:

  • transcript_[meeting_uuid].json - Full AssemblyAI response with metadata
  • transcript_[meeting_uuid].txt - Plain text transcript
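
The naming scheme above can be sketched as follows. Zoom meeting UUIDs may contain characters such as `/` and `+` that are awkward in file names, so this version replaces them. That sanitizing step is an assumption, as are the helper names `transcriptPaths` and `saveTranscript`. The `text` property is the plain transcript field of an AssemblyAI transcript response.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Builds the two output paths for a meeting, replacing unsafe characters with "_".
function transcriptPaths(meetingUuid, outputDir = '.') {
  const safeId = String(meetingUuid).replace(/[^A-Za-z0-9_-]/g, '_');
  return {
    json: path.join(outputDir, `transcript_${safeId}.json`),
    text: path.join(outputDir, `transcript_${safeId}.txt`),
  };
}

// Writes the full response as JSON and its text field as a plain-text file.
function saveTranscript(transcript, meetingUuid, outputDir = '.') {
  const paths = transcriptPaths(meetingUuid, outputDir);
  fs.writeFileSync(paths.json, JSON.stringify(transcript, null, 2));
  fs.writeFileSync(paths.text, transcript.text || '');
  return paths;
}
```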

Advanced configuration

AssemblyAI features

Modify the ASYNC_CONFIG object in the code to enable additional features:

const ASYNC_CONFIG = {
  speaker_labels: true, // Speaker identification
  auto_chapters: true, // Automatic chapter detection
  sentiment_analysis: true, // Sentiment analysis
  entity_detection: true, // Named entity recognition
  redact_pii: true, // PII redaction
  summarization: true, // Auto-summarization
  auto_highlights: true, // Key highlights
};

See AssemblyAI's API documentation for all available options.
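
Not every combination of options is accepted together, so a quick check before submitting can save a failed request. The sketch below flags two combinations that are believed to be rejected or incomplete: enabling `auto_chapters` together with `summarization`, and enabling `redact_pii` without a `redact_pii_policies` list. Both rules are assumptions to confirm against the current API reference, and `checkAsyncConfig` is a hypothetical helper.

```javascript
// Returns a list of human-readable problems; an empty list means no known conflicts.
function checkAsyncConfig(config) {
  const problems = [];
  // Assumption: the API accepts only one of these two models per request.
  if (config.auto_chapters && config.summarization) {
    problems.push('auto_chapters and summarization cannot both be enabled');
  }
  // Assumption: PII redaction needs an explicit list of policies.
  if (config.redact_pii && !Array.isArray(config.redact_pii_policies)) {
    problems.push('redact_pii is enabled but redact_pii_policies is missing');
  }
  return problems;
}

// Example: a config that enables everything at once is flagged twice.
console.log(checkAsyncConfig({ auto_chapters: true, summarization: true, redact_pii: true }));
```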

Audio processing modes

Mixed mode (default)

  • Single audio stream combining all participants
  • Most efficient for general transcription
  • Best for meetings with clear speakers

Individual mode

  • Separate transcription stream per participant
  • Better speaker attribution
  • Higher resource usage

Multichannel audio

  • Separate audio channels for different participants
  • Enables advanced speaker separation
  • Requires AUDIO_CHANNELS=multichannel
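
To illustrate what a multichannel recording looks like at the sample level, the sketch below interleaves per-participant buffers into a single multichannel stream, one channel per participant. It assumes 16-bit PCM samples held in `Int16Array`s and pads shorter inputs with silence. `interleaveChannels` is a hypothetical helper, and the repository may build multichannel audio differently (for example with FFmpeg).

```javascript
// Interleaves N mono channels into one buffer laid out frame by frame:
// [ch0[0], ch1[0], ..., ch0[1], ch1[1], ...].
function interleaveChannels(channels) {
  const frameCount = Math.max(0, ...channels.map((c) => c.length));
  const out = new Int16Array(frameCount * channels.length);
  for (let frame = 0; frame < frameCount; frame++) {
    for (let ch = 0; ch < channels.length; ch++) {
      // Channels shorter than the longest one are padded with zeros (silence).
      const sample = frame < channels[ch].length ? channels[ch][frame] : 0;
      out[frame * channels.length + ch] = sample;
    }
  }
  return out;
}

// Two participants, the second with one fewer sample.
const mixed = interleaveChannels([Int16Array.of(1, 2, 3), Int16Array.of(10, 20)]);
console.log(Array.from(mixed)); // [ 1, 10, 2, 20, 3, 0 ]
```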

API endpoints

POST /webhook

Handles Zoom RTMS webhook events:

  • URL validation
  • Meeting start/stop events
  • Automatic RTMS connection setup
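
Zoom's URL validation sends an `endpoint.url_validation` event containing a `plainToken`, and expects a response with that token plus an `encryptedToken` computed as an HMAC-SHA256 hex digest keyed with the app's secret token. Regular events carry an `x-zm-signature` header that can be checked the same way. The sketch below shows both steps with Node's built-in crypto module. The function names are hypothetical, and the exact header and message formats should be confirmed against Zoom's webhook documentation.

```javascript
const crypto = require('node:crypto');

// HMAC-SHA256 of `message` keyed with `secret`, hex-encoded.
function hmacHex(secret, message) {
  return crypto.createHmac('sha256', secret).update(message).digest('hex');
}

// Response body for the endpoint.url_validation challenge.
function buildUrlValidationResponse(plainToken, secretToken) {
  return { plainToken, encryptedToken: hmacHex(secretToken, plainToken) };
}

// Compares the x-zm-signature header with a signature recomputed from
// the x-zm-request-timestamp header and the raw request body.
function isValidZoomSignature({ signature, timestamp, rawBody }, secretToken) {
  const expected = Buffer.from(`v0=${hmacHex(secretToken, `v0:${timestamp}:${rawBody}`)}`);
  const received = Buffer.from(String(signature));
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}
```

In this setup, `secretToken` would come from the ZOOM_SECRET_TOKEN environment variable.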

Error handling

The service handles the following failure and shutdown cases:

  • Automatic reconnection for dropped connections
  • Graceful cleanup on meeting end
  • Audio buffer flushing to prevent data loss
  • Temporary file cleanup
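
Automatic reconnection is commonly implemented as a retry loop with exponential backoff. The sketch below shows that general pattern. It is not the repository's implementation, and `backoffDelayMs`, `withReconnect`, and the default timings are illustrative assumptions.

```javascript
// Delay before the next attempt: base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 10000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `connect` until it succeeds or maxAttempts is reached,
// waiting longer between each failed attempt.
async function withReconnect(connect, { maxAttempts = 5, sleep = defaultSleep } = {}) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await connect(attempt);
    } catch (err) {
      lastError = err;
      // No need to wait after the final failed attempt.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

Passing `sleep` as an option keeps the loop testable without real delays.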

Monitoring

Real-time logs

  • Connection status updates
  • Audio processing statistics
  • Transcription progress
  • Error notifications

Example log output

📡 Connecting to Zoom signaling for meeting abc123
✅ Zoom signaling connected for meeting abc123
🎵 Connecting to Zoom media for meeting abc123
✅ Zoom media connected for meeting abc123
🚀 Started audio streaming for meeting abc123
🎵 [abc12345] 100 chunks, 32768 bytes, 10.2s
📝 [abc12345] FINAL: This is the final transcription.
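
The signaling connection in the first log lines starts with a signed handshake message. The sketch below shows how such a message might be built, following the pattern used in Zoom's published RTMS samples: an HMAC-SHA256 over the client ID, meeting UUID, and stream ID, keyed with the client secret. The field names, the `msg_type` value, and the function names are assumptions to verify against the Zoom RTMS documentation.

```javascript
const crypto = require('node:crypto');

// Signature over "clientId,meetingUuid,rtmsStreamId", keyed with the client secret.
function rtmsSignature(clientId, clientSecret, meetingUuid, rtmsStreamId) {
  const message = `${clientId},${meetingUuid},${rtmsStreamId}`;
  return crypto.createHmac('sha256', clientSecret).update(message).digest('hex');
}

// Builds the first message sent over the signaling WebSocket.
function buildSignalingHandshake({ clientId, clientSecret, meetingUuid, rtmsStreamId }) {
  return {
    msg_type: 1, // assumed value for a signaling handshake request
    protocol_version: 1, // assumed
    meeting_uuid: meetingUuid,
    rtms_stream_id: rtmsStreamId,
    signature: rtmsSignature(clientId, clientSecret, meetingUuid, rtmsStreamId),
  };
}
```

Here `clientId` and `clientSecret` would correspond to ZM_CLIENT_ID and ZM_CLIENT_SECRET, while the meeting UUID and stream ID arrive in the `meeting.rtms_started` webhook payload.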

Development workflow

  1. Start your local server: npm start
  2. Start ngrok in another terminal: ngrok http 8080
  3. Update your Zoom app webhook URL with the ngrok URL
  4. Test with Zoom meetings
  5. Monitor logs in both your app and ngrok terminals