Image Credits: ElevenLabs
ElevenLabs, an AI startup that just raised a $180 million mega funding round, has been primarily known for its audio-generation prowess. The company took a step in another technical direction by launching its first stand-alone speech-to-text model, called Scribe.
The startup, valued at $3.3 billion, has helped many other companies provide text-to-speech services through its vast library of voices. However, the company is now looking to get into speech detection and compete with the likes of Gladia, Speechmatics, AssemblyAI, Deepgram, and OpenAI’s Whisper models.
ElevenLabs’ Scribe model supports over 99 languages at launch. The company places over 25 languages in its excellent accuracy category, where the word error rate is less than 5%. This list includes English (claimed accuracy rate of 97%), French, German, Hindi, Indonesian, Japanese, Kannada, Malayalam, Polish, Portuguese, Spanish, and Vietnamese. Other languages are ranked in different categories: high (5% to 10% word error rate), good (10% to 20% word error rate), and moderate (25% to 50% word error rate).
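For readers unfamiliar with the metric, word error rate is commonly computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition and of the accuracy bands quoted above. It is not ElevenLabs' evaluation code: the function names `word_error_rate` and `accuracy_tier` are hypothetical, the band edges at exactly 5%, 10%, and 20% are an assumption, and real benchmarks usually normalize casing and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so
    # far (one row back) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def accuracy_tier(wer: float) -> str:
    """Map a WER (as a fraction) to the bands quoted in the article.

    Assumption: boundary values fall into the lower-error band. The article
    does not name a band for 20% to 25% or for rates above 50%, so those
    are reported as "unclassified" here.
    """
    if wer < 0.05:
        return "excellent"
    if wer <= 0.10:
        return "high"
    if wer <= 0.20:
        return "good"
    if 0.25 <= wer <= 0.50:
        return "moderate"
    return "unclassified"


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"WER: {wer:.3f}, tier: {accuracy_tier(wer)}")
```

Under this definition, a claimed accuracy rate of 97% would correspond to a word error rate of roughly 3%, assuming accuracy is reported as one minus the word error rate.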
The company says that the model outperforms Google Gemini 2.0 Flash and Whisper Large V3 across multiple languages in FLEURS and Common Voice benchmark tests.
ElevenLabs had developed the speech-to-text component for its AI conversational agent platform, which was released last year. However, this is the first time the company is releasing a stand-alone speech detection model. In a conversation with TechCrunch last month, CEO Mati Staniszewski talked about improving speech detection models.
“We want to understand what’s being said by you in a conversation better. We are working on ways to move out from only generating content and understanding and transcribing speech,” Staniszewski said at that time. “Many people say that speech-to-text is a solved problem. But for many languages, it is pretty bad. We think we can build better speech detection models because we have in-house teams to annotate data and give us quick feedback.”
The model also has smart speaker diarization to tell you who is speaking, word-level timestamps for accurate subtitles, and auto-tagging of sound events like audience laughter. The startup is providing a way for customers to directly transcribe video content to add captions or subtitles in its studio.
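To illustrate why word-level timestamps matter for subtitles, here is a rough sketch that groups timed words into SubRip (.srt) cues. It does not call ElevenLabs' API or reflect Scribe's actual output format: the input shape, a list of `(text, start_seconds, end_seconds)` tuples, and the function names `srt_timestamp` and `words_to_srt` are assumptions made for the example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words, max_words=7):
    """Group (text, start, end) word tuples into numbered SRT cues.

    Assumption: words arrive in chronological order. Each cue spans from
    the start of its first word to the end of its last word.
    """
    cues = []
    for index in range(0, len(words), max_words):
        chunk = words[index:index + max_words]
        start = chunk[0][1]
        end = chunk[-1][2]
        text = " ".join(word[0] for word in chunk)
        number = index // max_words + 1
        cues.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    # Hypothetical word timings, not real model output.
    sample = [("Hello", 0.0, 0.4), ("world", 0.5, 0.9)]
    print(words_to_srt(sample, max_words=1))
```

Splitting on a fixed word count is the simplest possible choice. A production subtitler would more likely break cues on pauses, punctuation, or speaker changes, which diarization output could supply.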
Scribe currently only works with pre-recorded audio formats. The company said it will release a low-latency real-time version of the model soon. That means it is not yet suitable for meeting transcriptions or voice note-taking.
ElevenLabs is pricing Scribe at $0.40 for an hour of recorded audio. While the rate is competitive, some of its rivals offer a lower price for audio transcriptions at the moment, with some feature differentiation.