We’re on the verge of bringing some great improvements to our Speech Recognition API into production. We’ll be deploying new languages, updating the user interface, improving accessibility, adding better usage reporting and enhancing processing. On top of that, our new Zoom Media API 2.0 will be more scalable and more stable. So what exactly can you expect? Read on to find out, and check the teaser screenshot.
Improved interface and better accessibility
The new user interface will give users an immediate, clear overview of their monthly usage and processing status. Moreover, users can more easily navigate their transcripts, add real-time or batch processing tokens and update profile settings. We’ll also add the option to buy fixed bundles of hours so you can process content right away. No more hassle agreeing on prices and contracts with those busy sales reps, just pay as you go. Not the most technical person out there? Not to worry, the new manual upload option will be much easier to handle, enabling anyone to use our services.
To improve the reliability of the platform, all language processing is separated into microservices, which enables us to scale engines and processors up (or down) when needed. Also, in the new GUI, the state of your transcription session is immediately visible, so you can track the progress of your transcript sooner. Internally we track more metrics than before, which gives us insight into platform performance throughout the day and per language. With these metrics we are able to identify possible low-level issues in our models and/or engines. All in all, this should give us more insight into where we can make further improvements and help us spot potential issues early.
The Zoom Media Speech Recognition API only supports high-accuracy language models. We deploy a model when we are confident the results work for the user, meaning an accuracy of at least 90% with clear speech. We currently support Danish, Dutch, English (US), Finnish, Flemish, Norwegian and Swedish. All these languages are available as 16 kHz models. Our new API will also support a Dutch 8 kHz model, suitable for call center applications. Moreover, we’re working on a Filipino model, which will be deployed by the end of February. We already have some other languages on our roadmap for 2019 as well. If you have a specific use case for a language that isn’t listed, get in touch so we can discuss adding it.
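Not sure whether your recordings fit a 16 kHz or an 8 kHz model? Below is a minimal sketch, using only Python's standard `wave` module, that reads the sample rate of a WAV file and suggests a model family. The function names, the returned labels and the thresholds are our own illustration, not part of the Zoom Media API, and the mapping from sample rate to model is an assumption you should check against your own use case.

```python
import io
import wave


def wav_sample_rate(source) -> int:
    """Return the sample rate in Hz of a WAV file (a path or a file-like object)."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def suggest_model(sample_rate_hz: int) -> str:
    """Suggest a model family for a given sample rate.

    The labels are hypothetical and for illustration only. Assumption:
    audio at 16 kHz or above suits a 16 kHz model, and telephony-style
    audio from 8 kHz up to 16 kHz suits an 8 kHz model (announced for
    Dutch only).
    """
    if sample_rate_hz >= 16000:
        return "16 kHz"
    if sample_rate_hz >= 8000:
        return "8 kHz"
    return "sample rate too low"


def make_silent_wav(sample_rate_hz: int, seconds: int = 1) -> io.BytesIO:
    """Build a mono, 16-bit, silent WAV in memory so the sketch runs without files."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * sample_rate_hz * seconds)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for rate in (8000, 16000, 44100):
        detected = wav_sample_rate(make_silent_wav(rate))
        print(detected, "Hz ->", suggest_model(detected))
```

For a real recording, pass the path to `wav_sample_rate` instead of the in-memory buffer, for example `wav_sample_rate("call.wav")`, where `call.wav` is a placeholder file name.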