Does your API return timestamps for individual words?
Yes! The response for a completed request includes start and end keys. These keys are timestamp values for when a given word, phrase, or sentence starts and ends. These values are in milliseconds and are accurate to within about 400 milliseconds.
To convert these timestamps from milliseconds to seconds, divide the timestamp value by 1000.
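As a rough illustration of that conversion, here is a minimal Python sketch. The sample data and the helper names (ms_to_seconds, words_in_seconds) are made up for this example, and the list-of-words shape with text, start, and end fields is an assumption; check the JSON response in the API reference for the actual structure.

```python
def ms_to_seconds(ms: int) -> float:
    """Convert a millisecond timestamp to seconds."""
    return ms / 1000


def words_in_seconds(words: list[dict]) -> list[dict]:
    """Return copies of the word entries with start/end expressed in seconds."""
    return [
        {**w, "start": ms_to_seconds(w["start"]), "end": ms_to_seconds(w["end"])}
        for w in words
    ]


# Hypothetical sample entries (not real API output); timestamps in milliseconds.
words = [
    {"text": "Hello", "start": 250, "end": 650},
    {"text": "world", "start": 700, "end": 1200},
]

converted = words_in_seconds(words)
for w in converted:
    print(f'{w["text"]}: {w["start"]:.3f}s to {w["end"]:.3f}s')
```

Because the timestamps are only accurate to within about 400 milliseconds, displaying more than one or two decimal places of a second will usually suggest more precision than the values carry.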
See this section of our API reference for an example of the JSON response for a completed transcript, which includes these start and end keys.