Supported languages
Supported models
Supported regions
US & EU
Overview
The Custom Formatting feature automatically standardizes and formats specific types of information in your transcripts, ensuring consistency across dates, phone numbers, emails, and other data types. This eliminates the need for post-processing and provides clean, formatted output ready for your application.

Key capabilities:
- Format dates in your preferred style (US, European, ISO, etc.)
- Standardize phone number formats with custom patterns
- Control currency and decimal precision
- Convert spelled-out text into formatted patterns
- Format URLs as hyperlinks
- Apply multiple formatting rules simultaneously

Common use cases:
- Standardizing contact information in customer service transcripts
- Formatting financial data in earnings calls
- Preparing transcripts for CRM systems with specific format requirements
- Creating consistent documentation from meetings
- Processing legal or medical transcripts with strict formatting standards
Quickstart
There are two ways to use Custom Formatting:
- Transcribe and format in one request: Best when you’re starting a new transcription and want to format the transcript text automatically as part of that process.
- Transcribe and format in separate requests: Best when you already have text that you would like to format, or for more complicated workflows where you want to separate the transcription and formatting tasks.
Method 1: Transcribe and format in one request
This method is ideal when you’re starting fresh and want both transcription and formatting in a single workflow.
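A minimal sketch of the single-request flow in Python, using only the standard library. The nesting `speech_understanding.request.custom_formatting` follows the request table in the API reference; the endpoint URL, the `authorization` header, the `audio_url` field, and the helper names are assumptions for illustration and should be checked against the current API documentation.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the transcription URL from the API docs.
TRANSCRIPT_ENDPOINT = "https://api.example.com/v2/transcript"


def build_transcription_request(audio_url, date=None, phone_number=None,
                                email=None, format_utterances=False):
    """Build a request body that transcribes and formats in one call."""
    formatting = {"format_utterances": format_utterances}
    # Only include the patterns the caller actually set.
    for key, value in (("date", date), ("phone_number", phone_number), ("email", email)):
        if value is not None:
            formatting[key] = value
    return {
        "audio_url": audio_url,  # assumption: field name for the audio location
        "speech_understanding": {"request": {"custom_formatting": formatting}},
    }


def submit(body, api_key):
    """POST the body as JSON and return the decoded response (network call)."""
    req = urllib.request.Request(
        TRANSCRIPT_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        # Assumption: the API key is sent in an "authorization" header.
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


body = build_transcription_request(
    "https://example.com/audio.mp3",  # placeholder audio URL
    date="mm/dd/yyyy",
    phone_number="(xxx)xxx-xxxx",
)
print(json.dumps(body, indent=2))
```

Building the body separately from sending it keeps the formatting rules easy to inspect and test before any request is made; `submit(body, api_key)` is only called once the payload looks right.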
Method 2: Transcribe and format in separate requests
This method is useful when you already have text that you would like to format, or for more complicated workflows where you want to separate the transcription and formatting tasks.
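A minimal sketch of the formatting-only request body in Python. It assumes the Speech Understanding API identifies the existing transcript by an ID sent alongside the same `speech_understanding` object; the `transcript_id` field name and the helper function are hypothetical, so confirm the exact shape in the API documentation.

```python
import json


def build_formatting_request(transcript_id, custom_formatting):
    """Build a formatting-only request for a transcript that already exists."""
    if not custom_formatting:
        raise ValueError("custom_formatting must contain at least one rule")
    return {
        # Assumption: the existing transcript is referenced by its ID.
        "transcript_id": transcript_id,
        "speech_understanding": {
            "request": {"custom_formatting": dict(custom_formatting)}
        },
    }


body = build_formatting_request(
    "your-transcript-id",  # placeholder
    {"date": "yyyy-mm-dd", "phone_number": "xxx-xxx-xxxx", "format_utterances": True},
)
print(json.dumps(body, indent=2))
```

The resulting body would then be sent as JSON to the Speech Understanding endpoint once the transcript has completed.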
Output format
Data from the Custom Formatting API is returned in the custom_formatting object, which is contained in the speech_understanding object. The formatted_text key includes a formatted version of the transcript text.
If Speaker Diarization is used in the request, a formatted_utterances key is returned containing formatted utterances with preserved timestamps.
Example response structure:
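A sketch of that structure as a Python dictionary, with a small helper that reads the result safely. The key names follow the response table in the API reference; every value, including the spoken-form mapping keys, is an invented placeholder, and `get_formatting_result` is a hypothetical helper.

```python
# Placeholder values only; key names follow the response table in the API reference.
example_response = {
    "text": "Call me at (555)679-3466 on 09/19/1991.",
    "speech_understanding": {
        "request": {
            "custom_formatting": {"date": "mm/dd/yyyy", "phone_number": "(xxx)xxx-xxxx"}
        },
        "response": {
            "custom_formatting": {
                "formatted_text": "Call me at (555)679-3466 on 09/19/1991.",
                # Keys are original text, values are formatted text (invented here).
                "mapping": {
                    "555 679 3466": "(555)679-3466",
                    "September 19th, 1991": "09/19/1991",
                },
                "status": "success",
            }
        },
    },
}


def get_formatting_result(response):
    """Return the custom_formatting result object, or None if it is absent."""
    return (
        response.get("speech_understanding", {})
        .get("response", {})
        .get("custom_formatting")
    )


result = get_formatting_result(example_response)
print(result["formatted_text"])
```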
- Formatted text: Formatted text can be found in the formatted_text key.
- Formatted utterances: When format_utterances is enabled, speaker-separated segments in the formatted_utterances key include formatted text.
- Preserved timestamps: All word-level timestamps in formatted_utterances remain intact after formatting, allowing you to maintain temporal alignment with the audio.
- Mapping object: Shows exactly what transformations were applied (original → formatted).
Understanding the custom_formatting parameter
The custom_formatting parameter accepts an object with specific formatting rules for different data types in your transcript. Each property in the object defines how a particular type of information should be formatted.
Available formatting options
| Parameter | Type | Description | Example Values |
|---|---|---|---|
| date | string | Specifies the format pattern for dates in the transcript | "mm/dd/yyyy", "dd/mm/yyyy", "yyyy-mm-dd" |
| phone_number | string | Specifies the format pattern for phone numbers | "(xxx)xxx-xxxx", "xxx-xxx-xxxx", "xxx.xxx.xxxx" |
| email | string | Specifies the format pattern for email addresses | "username@domain.com", "firstname.lastname@domain.com" |
| format_utterances | boolean | When true, applies formatting to utterances in addition to the main text field. Preserves all word-level timestamps. | true, false (default: false) |

With format_utterances enabled, formatting is applied to both the main transcript text and the individual speaker utterances, and all timing information is preserved.
Common formatting patterns
Date formats
| Pattern | Example Output | Description |
|---|---|---|
| mm/dd/yyyy | 09/19/1991 | US format (month/day/year) |
| dd/mm/yyyy | 19/09/1991 | European format (day/month/year) |
| yyyy-mm-dd | 1991-09-19 | ISO 8601 format |
| mm-dd-yyyy | 09-19-1991 | US format with dashes |
| dd.mm.yyyy | 19.09.1991 | European format with dots |
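To make the pattern tokens concrete, the following Python sketch renders one date with each pattern from the table. It is a local illustration of what yyyy, mm, and dd stand for, not the service's implementation; `render_date` is a hypothetical helper.

```python
from datetime import date


def render_date(pattern, d):
    """Render a date with a pattern made of yyyy, mm, and dd tokens."""
    return (
        pattern.replace("yyyy", f"{d.year:04d}")
        .replace("mm", f"{d.month:02d}")
        .replace("dd", f"{d.day:02d}")
    )


sample = date(1991, 9, 19)
for pattern in ("mm/dd/yyyy", "dd/mm/yyyy", "yyyy-mm-dd", "mm-dd-yyyy", "dd.mm.yyyy"):
    print(pattern, "->", render_date(pattern, sample))
```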
Phone number formats
| Pattern | Example Output | Description |
|---|---|---|
| (xxx)xxx-xxxx | (555)679-3466 | Parentheses and dash |
| xxx-xxx-xxxx | 555-679-3466 | Dashes only |
| xxx.xxx.xxxx | 555.679.3466 | Dots separator |
| +x(xxx)xxx-xxxx | +1(555)679-3466 | International format |
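In these patterns each x is a placeholder for one digit and every other character is kept literally. The Python sketch below illustrates that reading locally; it is an assumption about how the patterns are interpreted, not the service's code, and `apply_phone_pattern` is a hypothetical helper.

```python
def apply_phone_pattern(pattern, digits):
    """Fill each 'x' in the pattern with the next digit, left to right."""
    digits = [c for c in digits if c.isdigit()]
    expected = pattern.count("x")
    if expected != len(digits):
        raise ValueError(f"pattern expects {expected} digits, got {len(digits)}")
    remaining = iter(digits)
    # Literal characters such as '(', '-', '.', and '+' pass through unchanged.
    return "".join(next(remaining) if ch == "x" else ch for ch in pattern)


print(apply_phone_pattern("(xxx)xxx-xxxx", "5556793466"))
print(apply_phone_pattern("+x(xxx)xxx-xxxx", "1 555 679 3466"))
```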
Email formats
| Pattern | Example Output |
|---|---|
| username@domain.com | john.doe@example.com |
| firstname.lastname@domain.com | john.doe@company.com |
Best practices
- Choose appropriate formats: Select formatting patterns that match your application’s requirements and regional standards.
- Combine formatting rules: You can apply multiple formatting rules simultaneously for comprehensive text standardization.
- Test with sample data: Verify your formatting patterns work correctly with representative audio samples before processing large batches.
- Review the mapping: Check the mapping object in the response to see exactly what was changed and verify the results.
- Consider regional differences: Be mindful of date and phone number format differences when processing international content.
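Reviewing the mapping can be partly automated. The Python sketch below lists each original and formatted pair and flags any formatted value that does not appear in the formatted text. The sample text and mapping are invented placeholders, and `review_mapping` is a hypothetical helper.

```python
def review_mapping(formatted_text, mapping):
    """Return (original, formatted, found) tuples for each mapping entry."""
    return [
        (original, formatted, formatted in formatted_text)
        for original, formatted in mapping.items()
    ]


# Invented sample data for illustration.
formatted_text = "Reach me at 555-679-3466 or john.doe@example.com."
mapping = {
    "555 679 3466": "555-679-3466",
    "john dot doe at example dot com": "john.doe@example.com",
}

for original, formatted, found in review_mapping(formatted_text, mapping):
    flag = "ok" if found else "MISSING"
    print(f"[{flag}] {original!r} -> {formatted!r}")
```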
API reference
Request
Method 1: Transcribe and format in one request
When creating a new transcription, include the speech_understanding parameter directly in your transcription request.
Method 2: Add formatting to existing transcripts
For existing transcripts, retrieve the completed transcript and send it to the Speech Understanding API.

| Key | Type | Required? | Description |
|---|---|---|---|
| speech_understanding | object | Yes | Container for speech understanding requests. |
| speech_understanding.request | object | Yes | The understanding request configuration. |
| speech_understanding.request.custom_formatting | object | Yes | Custom formatting configuration. |
| custom_formatting.date | string | No | Date format pattern. Common patterns: mm/dd/yyyy (US), dd/mm/yyyy (European), yyyy-mm-dd (ISO). |
| custom_formatting.phone_number | string | No | Phone number format pattern. Examples: (xxx)xxx-xxxx, xxx-xxx-xxxx, xxx.xxx.xxxx. |
| custom_formatting.email | string | No | Email format pattern. Example: username@domain.com. |
| custom_formatting.format_utterances | boolean | No | When true, applies formatting to speaker utterances in addition to the main text. Preserves word-level timestamps. Default: false. |
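A request body can be checked against this table before it is sent. The Python sketch below is a hypothetical client-side check, not part of the API: it verifies the required nesting and the listed types, and reports any key that the table does not list (the service may accept keys that are not documented here).

```python
# Types taken from the request table; keys outside this set are only reported.
LISTED_TYPES = {
    "date": str,
    "phone_number": str,
    "email": str,
    "format_utterances": bool,
}


def validate_request(body):
    """Return a list of problems found in the request body (empty if none)."""
    try:
        formatting = body["speech_understanding"]["request"]["custom_formatting"]
    except (KeyError, TypeError):
        return ["missing speech_understanding.request.custom_formatting"]
    if not isinstance(formatting, dict):
        return ["custom_formatting must be an object"]
    problems = []
    for key, value in formatting.items():
        expected = LISTED_TYPES.get(key)
        if expected is None:
            problems.append(f"{key} is not listed in the request table")
        elif not isinstance(value, expected):
            problems.append(f"{key} must be {expected.__name__}")
    return problems


good = {"speech_understanding": {"request": {"custom_formatting": {"date": "yyyy-mm-dd"}}}}
bad = {"speech_understanding": {"request": {"custom_formatting": {"format_utterances": "true"}}}}
print(validate_request(good))
print(validate_request(bad))
```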
Response
The Custom Formatting API returns your original transcript response with formatting applied to the text field and additional formatting details in the speech_understanding object. When format_utterances is enabled, formatted utterances with preserved timestamps are also included.
| Key | Type | Description |
|---|---|---|
| text | string | The transcript text with custom formatting applied. |
| speech_understanding | object | Container for speech understanding request and response information. |
| speech_understanding.request | object | The original custom formatting request configuration that was submitted. |
| speech_understanding.request.custom_formatting | object | The formatting parameters that were used. |
| speech_understanding.response | object | The response information from the formatting process. |
| speech_understanding.response.custom_formatting | object | Details about the formatting operation. |
| speech_understanding.response.custom_formatting.formatted_text | string | The complete transcript with custom formatting applied. Identical to the text field. |
| speech_understanding.response.custom_formatting.formatted_utterances | array | Array of speaker utterances with formatting applied. Only present when format_utterances is true. Each utterance includes speaker label, timestamps, confidence scores, formatted text, and word-level details with preserved timestamps. |
| speech_understanding.response.custom_formatting.mapping | object | An object showing the original text segments and their formatted versions. Keys are original text, values are formatted text. |
| speech_understanding.response.custom_formatting.status | string | The status of the formatting operation. Will be "success" when formatting completes successfully. |
Understanding formatted_utterances
When format_utterances is enabled, each object in the formatted_utterances array contains:
| Field | Type | Description |
|---|---|---|
| speaker | string | Speaker identifier (e.g., “A”, “B”) |
| start | integer | Start time of the utterance in milliseconds |
| end | integer | End time of the utterance in milliseconds |
| text | string | The utterance text with custom formatting applied |
| confidence | number | Confidence score for the utterance (0-1) |
| words | array | Array of word objects with individual timestamps, text (formatted), confidence scores, and speaker labels |
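As an illustration of working with these fields, the Python sketch below turns one utterance object into a readable line, converting the millisecond timestamps to seconds. The sample utterance is an invented placeholder, and `describe_utterance` is a hypothetical helper.

```python
def describe_utterance(utterance):
    """Format one formatted_utterances entry as a single readable line."""
    start_s = utterance["start"] / 1000  # milliseconds to seconds
    end_s = utterance["end"] / 1000
    return (
        f"[{start_s:.2f}s-{end_s:.2f}s] "
        f"Speaker {utterance['speaker']}: {utterance['text']}"
    )


# Invented sample utterance using the fields from the table above.
sample_utterance = {
    "speaker": "A",
    "start": 1200,
    "end": 4850,
    "text": "My number is (555)679-3466.",
    "confidence": 0.97,
    "words": [],
}
print(describe_utterance(sample_utterance))
```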
Key differences from standard transcription
| Field | Standard Transcription | With Custom Formatting | With format_utterances=true |
|---|---|---|---|
| text | Transcribed text with default formatting | Transcribed text with your custom formatting rules applied | Same as custom formatting |
| speech_understanding | Not present | Object containing formatting request, response, and mapping | Same, plus formatted_text and formatted_utterances |
| utterances | Speaker-separated segments with original text | Unchanged | Unchanged (original utterances remain) |
| Word timestamps | Original timestamps | Preserved exactly | Preserved exactly in formatted_utterances |

All other transcript fields (words, utterances, confidence, etc.) remain unchanged. The formatted_utterances field provides an additional view of the data with formatting applied while maintaining complete timestamp fidelity.
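The timestamp guarantee can be spot-checked in client code. The Python sketch below compares the original utterances with formatted_utterances and assumes the two arrays line up one-to-one in the same order, which this section implies but does not state outright. `timestamps_preserved` is a hypothetical helper and the sample data is invented.

```python
def timestamps_preserved(utterances, formatted_utterances):
    """Check that start, end, and speaker match pairwise in both arrays."""
    if len(utterances) != len(formatted_utterances):
        return False
    return all(
        (u["start"], u["end"], u["speaker"]) == (f["start"], f["end"], f["speaker"])
        for u, f in zip(utterances, formatted_utterances)
    )


# Invented sample data: only the text differs between the two views.
original = [{"speaker": "A", "start": 1200, "end": 4850, "text": "call 555 679 3466"}]
formatted = [{"speaker": "A", "start": 1200, "end": 4850, "text": "call (555)679-3466"}]
print(timestamps_preserved(original, formatted))
```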