Speech Classification Models

InfoQ

OpenAI Introduces New Speech Models for Transcription and Voice Generation

Forbes

How Large Scale Speech Models Will Impact Voice AI

Gautam Jha is the Co-Founder & CTO of Kalpa Labs, an SF-based YC backed startup building large scale Foundational speech models. Voice is quickly becoming a primary interface for enterprise software, ...

VentureBeat

Meta Introduces Spirit LM open source model that combines text and speech inputs/outputs

Just in time for Halloween 2024, Meta has unveiled Meta Spirit LM, the company’s first open-source multimodal language model capable of seamlessly integrating text and speech inputs and outputs.

Geeky Gadgets

Top Text-to-Speech Models of 2026: Proprietary vs Open Source Compared

Text-to-speech (TTS) technology in 2026 has reached a level where synthesized voices can closely mimic human speech in both accuracy and expressiveness. Trelis Research examines this progress by ...

CU Boulder News & Events

Fine-tuning a Strong Language model to Enable Classroom Speech Recognition

Postdoctorate Viet Anh Trinh led a project within Strand 1 to develop a novel neural network architecture that can both recognize and generate speech. He has since moved on from iSAT to a role at ...

MSN

Top text-to-speech platforms

Did you know that in 2024, the global text-to-speech market was valued at just $4 billion, but is expected to reach $37.55 ...

ZDNet

I tested 3 text-to-speech AI models to see which is best - hear my results

There are several AI tools available that can generate humanlike speech. Some AI voices can whisper, laugh, and perform other expressive feats. TTS tools vary in terms of level of realism and their ...

Geeky Gadgets

OpenAI Launches New Speech-to-Text AI Audio Models API for Developers

OpenAI has today introduced a suite of advanced audio models and tools through its API, designed to empower developers in creating sophisticated, voice-driven applications. These updates include ...

9to5Mac

New Apple-backed AI model can generate sound and speech from silent videos

The new model, called VSSFlow, leverages a creative architecture to generate sounds and speech with a single unified system, with state-of-the-art results. Watch (and hear) some demos below. Currently ...

TechCrunch

ElevenLabs is launching its own speech-to-text model

ElevenLabs, an AI startup that just raised a $180 million mega-funding round, has been primarily known for its audio-generation prowess. The company took a step in another technological direction by ...
