It’s important to know that your transcript quality can vary based on a handful of variables. Factors such as echo, background noise, accents, diction and audio compression all affect accuracy, and voice recording quality has a huge impact on word recognition rate.
Questions To Ask Yourself Before Testing
Are your files compressed? If you must use compressed files, we recommend 64kbps or higher for the best quality. Files at 8-32kbps show a significant decrease in transcript accuracy. Compression is measured in kbps (kilobits per second).
What sample rate are you sending? Higher is better. Traditional telephony commonly uses an 8kHz sample rate. Modern HD telephony uses a 16kHz sample rate. While VoiceBase’s speech recognition performs well on 8kHz audio, 16kHz will result in a more accurate transcript. Sample rate is measured in kHz (thousands of samples per second).
Is your recording in mono or stereo? Stereo recordings are always preferred. They give the best accuracy (stereo prevents cross-talk issues) and the best analytics for understanding who said what (the agent vs. the caller).
WAV vs. MP3: WAV files are typically uncompressed, while MP3s tend to be more compressed.
Recording codec: The best option is G.711. If your recording provider doesn't allow that, choose the codec with the highest sample rate and the highest bits per sample available.
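If you'd like to check these properties before sending a file, here is a minimal sketch using Python's standard wave module. It is only an illustration, not part of the VoiceBase API: the function name describe_wav and the file name in the usage line are hypothetical, and the wave module reads uncompressed PCM WAV files only, so compressed formats (MP3, or WAV files that contain G.711 audio) would need a different tool.

```python
import wave


def describe_wav(path):
    """Return basic recording properties of an uncompressed (PCM) WAV file.

    Hypothetical helper for illustration; not a VoiceBase function.
    """
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()             # 1 = mono, 2 = stereo
        sample_rate_hz = wav.getframerate()       # samples per second
        bits_per_sample = wav.getsampwidth() * 8  # sample width is in bytes

    if channels == 1:
        layout = "mono"
    elif channels == 2:
        layout = "stereo"
    else:
        layout = "multichannel"

    return {
        "channels": channels,
        "layout": layout,
        "sample_rate_khz": sample_rate_hz / 1000,
        "bits_per_sample": bits_per_sample,
        # Uncompressed bitrate: samples/sec x bits/sample x channels.
        "bitrate_kbps": sample_rate_hz * bits_per_sample * channels / 1000,
    }
```

For example, print(describe_wav("call.wav")) on one of your own recordings shows whether it is mono or stereo and whether it was captured at 8kHz or 16kHz.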
Not sure where to start? Just reach out to email@example.com and we'll walk you through best practices.
VoiceBase API Support
Have Questions? Just reply to this email for quick answers and friendly API support!