Introducing LeMUR, the easiest way to build LLM apps on spoken data. Search, summarize, ask questions, and generate new text, with knowledge of all your application’s spoken data. LeMUR performs intelligent retrieval to offer high-quality LLM responses with a single API call.


See LeMUR in action

Watch Patrick use LeMUR to summarize, answer questions, and generate action items in 2 minutes.

Spoken data – from meetings, phone calls, videos, podcasts, and more – is a critical input into Generative AI workflows and applications. In response to the increased desire to build these kinds of AI apps on audio data, we’re seeing the emergence of an “AI stack” to string together components including automatic transcription, prompt augmentation, compression strategies, retrieval techniques, language models, and structured outputs.

LeMUR offers this stack in a single API, enabling developers to reason over their spoken data with a few lines of code. We launched LeMUR Early Access in April, and starting today, LeMUR is available for everyone to use, with new endpoints, higher accuracy outputs, and higher input and output limits.

These are the kinds of apps our users have been building with LeMUR.


See our Prompt Guide for tips on how to obtain accurate and relevant outputs from LeMUR for your use case.

“LeMUR unlocks some amazing new possibilities that I never would have thought were possible just a few years ago. The ability to effortlessly extract valuable insights, such as identifying optimal actions, empowering agent scorecard and coaching, and discerning call outcomes like sales, appointments, or call purposes, feels truly magical.”

#Optimized for high accuracy on specific tasks

LeMUR is designed to be highly accurate on a core set of tasks that developers most commonly want to build with.

Custom Summary

Automatically summarize virtual meetings, phone calls, and more with LeMUR's flexible Summarization endpoint. You can supply additional context that is not explicitly referenced in the audio being analyzed, such as specific topics LeMUR should pay particular attention to when summarizing.

# requires the assemblyai Python SDK: pip install "assemblyai>=0.15"
import assemblyai as aai

URL = "https://storage.googleapis.com/aai-web-samples/meeting.mp4"
context = "A GitLab meeting to discuss logistics"

transcript = aai.Transcriber().transcribe(URL)
result = transcript.lemur.summarize(context)
print(result.response)

Question & Answer

Get answers to questions about your spoken data with LeMUR, whether you're asking about a customer's history in a call center or asking for an explanation of a concept mentioned in a podcast. LeMUR's answers can also include citations and reasoning.

# requires the assemblyai Python SDK: pip install "assemblyai>=0.15"
import assemblyai as aai

URL = "https://storage.googleapis.com/aai-web-samples/meeting.mp4"
questions = [
    aai.LemurQuestion(
        question="What are the top level KPIs for engineering?",
        answer_format="short sentence"),
    aai.LemurQuestion(
        question="How many days since the data team received updated metrics?",
        answer_options=[1, 2, 3, 4, 5, 6, 7, "more than 7"]),
]

transcript = aai.Transcriber().transcribe(URL)
result = transcript.lemur.question(questions)
for q in result.response:
    print(f"Question: {q.question}")
    print(f"Answer: {q.answer}")

Action Items

Automatically generate a list of action items from virtual meetings with LeMUR. You can provide a specific format to follow and add context on the speakers to assign action items to specific meeting attendees.

# requires the assemblyai Python SDK: pip install "assemblyai>=0.15"
import assemblyai as aai

URL = "https://storage.googleapis.com/aai-web-samples/meeting.mp4"
transcript = aai.Transcriber().transcribe(URL)

result = transcript.lemur.action_items(
    context="A GitLab meeting to discuss logistics",
    answer_format="**<topic header>**\n<relevant action items>\n",
)
print(result.response)


You can try LeMUR out yourself in a no-code way through our Playground, or see how LeMUR works under the hood in this Colab notebook.

#Extensible to any use case

LeMUR is built flexibly to allow you to define your own tasks and prompts with a customizable endpoint.

# requires the assemblyai Python SDK: pip install "assemblyai>=0.15"
import assemblyai as aai

URL = "https://storage.googleapis.com/aai-web-samples/meeting.mp4"
transcript = aai.Transcriber().transcribe(URL)

result = transcript.lemur.task(
    "You are a helpful coach. Provide an analysis of the transcript "
    "and offer areas to improve with exact quotes. Include no preamble. "
    "Start with an overall summary then get into the examples with feedback. "
    "Under each example, place the corresponding feedback in a bulleted list."
)
print(result.response)


You can try out using the custom task endpoint yourself in this Colab notebook.

“LeMUR works incredibly well out-of-the-box. It allowed us to focus on product instead of infrastructure. As a result, we were able to bring a transformative new product to market in half the time.”

#Try LeMUR today

LeMUR is accessible through our API today. The easiest way to try LeMUR is through our Playground, where you can upload a file or enter a YouTube link to experiment with LeMUR endpoints in just a few clicks.

You can also try out our API directly for free. Simply sign up to get a free API token, and head over to our Docs or Welcome Colab to be up and running in just a few minutes.

If you’re thinking about integrating LeMUR into your product, you can reach out to our Sales team with any questions you might have.

#FAQ


What is your pricing?

You can find our model pricing here.

What languages does LeMUR support?

To start, LeMUR is focused on supporting English, but anecdotally we’ve seen good results with LeMUR generating responses in other common languages.

How can I improve LeMUR outputs?

See our LeMUR Best Practices guide for advice on which endpoints to use, how to improve your prompts, and how to generate the output you want.

How many tokens can LeMUR handle in a single API call?

LeMUR can ingest over 1,000,000 tokens (~100 hours of audio) in a single API call.
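
As a rough sanity check on that figure, here is a back-of-the-envelope estimate relating audio duration to token count. The words-per-minute and tokens-per-word rates below are illustrative assumptions about typical speech, not LeMUR internals:

```python
def estimate_tokens(hours_of_audio, words_per_minute=150, tokens_per_word=1.3):
    """Rough transcript token estimate; rates are assumed averages for speech."""
    words = hours_of_audio * 60 * words_per_minute
    return int(words * tokens_per_word)

# Around 100 hours of audio lands in the neighborhood of the 1,000,000-token limit
print(estimate_tokens(100))  # 1170000
```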