Custom Formatting
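Supported languages
en, en_au, en_uk, en_us, es, fr, de, it, pt, nl, hi, ja, zh, fi, ko, pl, ru, tr, uk, vi, af, sq, am, ar, hy, as, az, ba, eu, be, bn, bs, br, bg, ca, hr, cs, da, et, fo, gl, ka, el, gu, ht, ha, haw, he, hu, is, id, jw, kn, kk, lo, la, lv, ln, lt, lb, mk, mg, ms, ml, mt, mi, mr, mn, ne, no, nn, oc, pa, ps, fa, ro, sa, sr, sn, sd, si, sk, sl, so, su, sw, sv, tl, tg, ta, tt, te, tk, ur, uz, cy, yi, yo
Supported models
slam-1, universal
Supported regions
US only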
Overview
The Custom Formatting feature automatically standardizes and formats specific types of information in your transcripts, ensuring consistency across dates, phone numbers, emails, and other data types. This eliminates the need for post-processing and provides clean, formatted output ready for your application.
Key capabilities:
- Format dates in your preferred style (US, European, ISO, etc.)
- Standardize phone number formats with custom patterns
- Control currency and decimal precision
- Convert spelled-out text into formatted patterns
- Format URLs as hyperlinks
- Apply multiple formatting rules simultaneously
Common use cases:
- Standardizing contact information in customer service transcripts
- Formatting financial data in earnings calls
- Preparing transcripts for CRM systems with specific format requirements
- Creating consistent documentation from meetings
- Processing legal or medical transcripts with strict formatting standards
Quickstart
There are two ways to use Custom Formatting:
- Transcribe and format in one request - Best when you’re starting a new transcription and want to automatically format the transcript text as part of that process
- Transcribe and format in separate requests - Best when you already have text that you would like to format or for more complicated workflows where you want to separate the transcription and formatting tasks
Method 1: Transcribe and format in one request
This method is ideal when you’re starting fresh and want both transcription and formatting in a single workflow.
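A minimal sketch of the single-request flow, assuming the AssemblyAI transcript endpoint at https://api.assemblyai.com/v2/transcript and a speech_understanding → request → custom_formatting nesting. The helper name, the placeholder audio URL, and the pattern string are illustrative assumptions rather than values confirmed by this page. The sketch only builds the request; sending it is left as a comment.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the API reference before relying on it.
TRANSCRIPT_URL = "https://api.assemblyai.com/v2/transcript"


def build_transcription_request(audio_url, custom_formatting, api_key):
    """Build (but do not send) a transcription request that also asks for formatting."""
    payload = {
        "audio_url": audio_url,
        # Assumed nesting: speech_understanding -> request -> custom_formatting.
        "speech_understanding": {
            "request": {"custom_formatting": custom_formatting}
        },
    }
    return urllib.request.Request(
        TRANSCRIPT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


request = build_transcription_request(
    audio_url="https://example.com/call.mp3",  # placeholder audio file
    custom_formatting={"date": "mm/dd/yyyy"},  # hypothetical pattern string
    api_key="YOUR_API_KEY",
)
# To submit: urllib.request.urlopen(request), then poll the transcript until it completes.
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any audio is processed.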
Method 2: Transcribe and format in separate requests
This method is useful when you already have text that you would like to format or for more complicated workflows where you want to separate the transcription and formatting tasks.
Expected output:
Output format
Data from the Custom Formatting API is returned in the custom_formatted object, which is contained in the speech_understanding object. The formatted_text key holds a formatted version of the transcript text.
If Speaker Diarization is used in the request, a formatted_utterances key is also returned, containing formatted utterances with preserved timestamps.
Example response structure:
Key features of the output:
- Formatted text: Formatted text can be found in the formatted_text key
- Formatted utterances: When format_utterances is enabled, speaker-separated segments in the formatted_utterances key include formatted text
- Preserved timestamps: All word-level timestamps in formatted_utterances remain intact after formatting, allowing you to maintain temporal alignment with the audio
- Mapping object: Shows exactly what transformations were applied (original → formatted)
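As a sketch of reading these keys from a response, the helper below searches a parsed JSON response for a key by name, so it does not depend on the exact nesting inside speech_understanding. The response fragment is hypothetical and exists only to make the example runnable; the shape of mapping (a flat original-to-formatted dictionary) is likewise an assumption.

```python
def find_key(node, key):
    """Depth-first search for the first value stored under `key` in nested dicts/lists."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


# Hypothetical response fragment; real responses carry many more fields.
response = {
    "id": "hypothetical-transcript-id",
    "speech_understanding": {
        "custom_formatted": {
            "formatted_text": "Call me at (555)123-4567.",
            "mapping": {
                "five five five one two three four five six seven": "(555)123-4567"
            },
        }
    },
}

formatted_text = find_key(response, "formatted_text")
mapping = find_key(response, "mapping") or {}
# Absent in this fragment, so the fallback empty list is used.
formatted_utterances = find_key(response, "formatted_utterances") or []
```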
Understanding the custom_formatting parameter
The custom_formatting parameter accepts an object with specific formatting rules for different data types in your transcript. Each property in the object defines how a particular type of information should be formatted.
Available formatting options
Example configuration:
When you include this configuration in your transcription request, the API will automatically detect and format dates, phone numbers, and emails in your transcript according to the specified patterns. With format_utterances enabled, the formatting is applied to both the main transcript text and individual speaker utterances while preserving all timing information.
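A configuration of the kind described above might look like the following sketch. The option names (date, phone_number, email), the pattern strings, and the placement of format_utterances inside the same object are assumptions for illustration; check the API reference for the actual names and accepted values.

```python
import json

# Hypothetical option names and pattern strings, shown only to illustrate the shape.
custom_formatting = {
    "date": "mm/dd/yyyy",
    "phone_number": "(xxx)xxx-xxxx",
    "email": "username@domain.com",
    "format_utterances": True,  # also apply formatting to per-speaker utterances
}

# The object is plain JSON, so it serializes without any custom handling.
encoded = json.dumps({"custom_formatting": custom_formatting}, sort_keys=True)
```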
Common formatting patterns
Date formats
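As a local illustration of how the US, European, and ISO styles named in the overview differ, the sketch below renders one date three ways. It uses Python's strftime codes, not the API's own pattern syntax, and the sample date is arbitrary.

```python
from datetime import date

# strftime equivalents of three common date styles.
DATE_STYLES = {
    "US": "%m/%d/%Y",
    "European": "%d/%m/%Y",
    "ISO 8601": "%Y-%m-%d",
}

spoken_date = date(2024, 3, 7)  # e.g. "March seventh, twenty twenty-four"
rendered = {name: spoken_date.strftime(fmt) for name, fmt in DATE_STYLES.items()}
for name, text in rendered.items():
    print(f"{name}: {text}")
# US: 03/07/2024
# European: 07/03/2024
# ISO 8601: 2024-03-07
```

The US and European renderings of the same date are easy to confuse, which is why choosing the format explicitly matters for international content.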
Phone number formats
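To show what a custom phone pattern does conceptually, here is a small local sketch that fills a pattern with digits. The x placeholder convention and the function name are assumptions for illustration; the API performs its own detection and formatting server-side.

```python
def apply_phone_pattern(digits, pattern, placeholder="x"):
    """Fill each placeholder in `pattern` with the next digit from `digits`."""
    only_digits = [ch for ch in digits if ch.isdigit()]
    if len(only_digits) != pattern.count(placeholder):
        raise ValueError("digit count does not match the pattern")
    remaining = iter(only_digits)
    return "".join(next(remaining) if ch == placeholder else ch for ch in pattern)


print(apply_phone_pattern("555 123 4567", "(xxx)xxx-xxxx"))  # (555)123-4567
print(apply_phone_pattern("5551234567", "xxx.xxx.xxxx"))     # 555.123.4567
```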
Email formats
Best practices
- Choose appropriate formats: Select formatting patterns that match your application’s requirements and regional standards.
- Combine formatting rules: You can apply multiple formatting rules simultaneously for comprehensive text standardization.
- Test with sample data: Verify your formatting patterns work correctly with representative audio samples before processing large batches.
- Review the mapping: Check the mapping object in the response to see exactly what was changed and verify the results.
- Consider regional differences: Be mindful of date and phone number format differences when processing international content.
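Reviewing the mapping can be partly automated. The sketch below assumes mapping is a flat dictionary of original strings to formatted strings (the exact shape is not shown on this page) and flags any formatted value that does not appear in the formatted text. All sample values are hypothetical.

```python
def review_mapping(mapping, formatted_text):
    """Return (original, formatted, found) rows for each reported transformation."""
    rows = []
    for original, formatted in mapping.items():
        # A formatted value missing from the text suggests the rule
        # did not apply the way the mapping claims.
        rows.append((original, formatted, formatted in formatted_text))
    return rows


# Hypothetical values for illustration.
formatted_text = "Email jane@example.com before 03/07/2024."
mapping = {
    "jane at example dot com": "jane@example.com",
    "march seventh twenty twenty four": "03/07/2024",
}
rows = review_mapping(mapping, formatted_text)
for original, formatted, found in rows:
    status = "ok" if found else "CHECK"
    print(f"[{status}] {original!r} -> {formatted!r}")
```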
API reference
Request
Method 1: Transcribe and format in one request
When creating a new transcription, include the speech_understanding parameter directly in your transcription request:
Method 2: Add formatting to existing transcripts
For existing transcripts, retrieve the completed transcript and send it to the Speech Understanding API:
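A hedged sketch of the request body for this second method follows. The transcript_id field name and the speech_understanding → request → custom_formatting nesting are assumptions, and the endpoint URL is deliberately omitted because this page does not state it. The pattern string is hypothetical.

```python
import json


def build_formatting_body(transcript_id, custom_formatting):
    """Return the JSON body for formatting an already-completed transcript."""
    if not transcript_id:
        raise ValueError("transcript_id is required")
    return {
        "transcript_id": transcript_id,  # assumed field name
        "speech_understanding": {
            "request": {"custom_formatting": custom_formatting}  # assumed nesting
        },
    }


body = build_formatting_body(
    "hypothetical-transcript-id",
    {"phone_number": "(xxx)xxx-xxxx"},  # hypothetical pattern string
)
print(json.dumps(body, indent=2))
```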
Response
The Custom Formatting API returns your original transcript response with formatting applied to the text field and additional formatting details in the speech_understanding object. When format_utterances is enabled, formatted utterances with preserved timestamps are also included.
Understanding formatted_utterances
When format_utterances is enabled, each object in the formatted_utterances array contains:
Important: All timestamps in both utterances and words are preserved exactly as they appear in the original transcription, ensuring perfect temporal alignment with the audio even after formatting is applied.
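One way to sanity-check this guarantee in your own pipeline is sketched below. It assumes each utterance and word carries start and end fields (the field names are an assumption here) and compares utterance boundaries between the original and formatted lists, then confirms each formatted word falls inside its utterance.

```python
def check_timing(utterances, formatted_utterances):
    """Return a list of problems; an empty list means the timings line up."""
    if len(utterances) != len(formatted_utterances):
        return ["utterance count differs"]
    problems = []
    for index, (orig, fmt) in enumerate(zip(utterances, formatted_utterances)):
        if (orig["start"], orig["end"]) != (fmt["start"], fmt["end"]):
            problems.append(f"utterance {index}: start/end changed")
        for word in fmt.get("words", []):
            if not (fmt["start"] <= word["start"] <= word["end"] <= fmt["end"]):
                problems.append(f"utterance {index}: word {word.get('text')!r} outside utterance")
    return problems


# Hypothetical data; times are arbitrary integers.
original = [{"start": 0, "end": 1500, "text": "call five five five"}]
formatted = [{
    "start": 0,
    "end": 1500,
    "text": "Call 555",
    "words": [
        {"text": "Call", "start": 0, "end": 400},
        {"text": "555", "start": 450, "end": 1500},
    ],
}]
problems = check_timing(original, formatted)
```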
Key differences from standard transcription
All other fields from the original transcript (words, utterances, confidence, etc.) remain unchanged. The formatted_utterances field provides an additional view of the data with formatting applied while maintaining complete timestamp fidelity.