Speech Understanding

Gain maximum value from voice data with audio intelligence models, and leverage LLM capabilities with LeMUR to extract insights, generate summaries, and more.

Summary

The customer, John, called Acme Corporation's customer service department to report a malfunction with his widget. Sarah, the representative, attempted troubleshooting but concluded that the widget needed repair under warranty.

Topic Detection

Customer Service, Product Support, Warranty, Troubleshooting, Repair

Key Phrases

Acme Corporation, Sarah, John, Acme Widget, malfunction, serial number, batteries, troubleshooting, repair, warranty, prepaid shipping label

Extract valuable insights from voice data

Audio Intelligence

AI models to summarize speech, redact personal information, detect hateful content, identify spoken topics, and more.

LeMUR

With a single API call, summarize meetings, generate call insights, recap action items, and more on over 100 hours of audio data.

LeMUR

Leverage LLM capabilities and take action on your audio data

Ask questions

Get instant answers to questions about your audio.

Create summaries

Summarize your audio data with key takeaways.

Extract data

Extract data such as topic tags from your audio to categorize and organize your audio data.

Generate content

Generate long-form or short-form written content using your audio data.

Ask questions

Input

What is runner's knee?

Output

Based on the transcript, runner's knee is a condition characterized by pain behind or around the kneecap. It is caused by overuse, muscle imbalance and inadequate stretching. Symptoms include pain under or around the kneecap and pain when walking.
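As a rough illustration of the question-and-answer flow above, the following sketch builds (but does not send) an HTTP request for a LeMUR question. It is written under assumptions rather than taken from this page: the endpoint path, the `transcript_ids` and `questions` field names, and the bare `Authorization` header should be checked against the current LeMUR API reference, `build_question_request` is a hypothetical helper, and the API key and transcript ID are placeholders.

```python
import json
import urllib.request

# Assumed endpoint; verify against the current LeMUR API reference.
LEMUR_QA_URL = "https://api.assemblyai.com/lemur/v3/generate/question-answer"


def build_question_request(api_key, transcript_ids, question, answer_format=None):
    """Build (but do not send) a LeMUR question-answer request.

    Hypothetical helper: the payload field names are assumptions.
    """
    item = {"question": question}
    if answer_format:
        item["answer_format"] = answer_format
    payload = {"transcript_ids": list(transcript_ids), "questions": [item]}
    return urllib.request.Request(
        LEMUR_QA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_question_request(
    api_key="YOUR_API_KEY",            # placeholder
    transcript_ids=["TRANSCRIPT_ID"],  # placeholder: ID of a completed transcript
    question="What is runner's knee?",
)
# To send it: urllib.request.urlopen(request), then JSON-decode the response body.
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any audio data is processed.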

Explore how people are using LeMUR

Rapidly ship high-quality Generative AI features with voice data

Unifies your AI stack for audio

LeMUR is a single API that connects all of the spoken data in your application to an LLM to build generative features in your product. No need to chain multiple technologies together to go from audio file to LLM output.

Powered by AssemblyAI’s speech recognition models

High-quality LLM outputs on audio data start with quality transcriptions. LeMUR operates over AssemblyAI’s state-of-the-art speech recognition models to ensure LLM outputs are top-notch.

Continuously updated with the latest in research

We are constantly experimenting with the latest research in LLMs and updating LeMUR with new techniques in retrieval, compression, prompt engineering, LLM performance, and more.

Launch quickly & effortlessly scale

Find product-market fit faster and ship new AI features at scale. The LeMUR API enables you to process over 200 hours of audio in a single API call, and handles over 1M tokens as input. The API is priced to scale as your audio data grows.

Speech-to-Text

Build on top of the most accurate Speech-to-Text model on the market with >92.5% accuracy.

Streaming Speech-to-Text

Transcribe audio streams synchronously with high accuracy and low latency.

START BUILDING WITH AI

Get started in seconds

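import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder
URL = "https://example.com/audio.mp3"  # placeholder audio URL
config = aai.TranscriptionConfig(speaker_labels=True)  # adds per-word speaker labels

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(URL, config)

print(transcript.json_response)  # raw API response as a dict
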
{
  "id": "6rlr37h8f4-e310-4e23-bbf3-ea5f347dc684",
  "language_code": "en_us",
  "status": "completed",
  "text": "Runner's knee is a condition characterized by pain behind or around the kneecap...",
  "confidence": 0.98122,
  "audio_duration": 3200,
  "words": [
    { "text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113 },
    { "text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417 }
  ]
}
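
The response above can be post-processed with plain Python. The sketch below works on a trimmed copy of that sample and assumes that word-level `start` and `end` values are milliseconds; `words_by_speaker` and `word_spans_seconds` are hypothetical helper names, not part of any SDK.

```python
from collections import defaultdict

# Trimmed copy of the sample response shown above.
response = {
    "status": "completed",
    "words": [
        {"text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113},
        {"text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417},
    ],
}


def words_by_speaker(resp):
    """Join each speaker's words, in order, into one string per speaker."""
    grouped = defaultdict(list)
    for word in resp["words"]:
        grouped[word["speaker"]].append(word["text"])
    return {speaker: " ".join(texts) for speaker, texts in grouped.items()}


def word_spans_seconds(resp):
    """Return (text, start, end) tuples with times converted to seconds.

    Assumes the word-level timestamps are in milliseconds.
    """
    return [(w["text"], w["start"] / 1000, w["end"] / 1000) for w in resp["words"]]


print(words_by_speaker(response))    # {'A': "Runner's knee"}
print(word_spans_seconds(response))  # [("Runner's", 0.0, 0.55), ('knee', 0.58, 1.13)]
```

Grouping by the `speaker` field is a simple starting point for per-speaker views of a call, such as separating the customer's words from the representative's.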