Recipe ·
VibeVoice
VibeVoice is a speech-to-text AI model released by Microsoft under the permissive MIT license, combining transcription with speaker diarization for multi-speaker audio processing.
1
Techniques inside
3y
Median research → prod
3y
Fastest adoption
3y
Slowest adoption
Ingredient list
Invented by OpenAI · 2022-12 · Velocity 3y
“builds on the encoder-decoder architecture popularized by Whisper”
multimodalhigh
This recipe is part of the gentic.news Deployment Atlas. Every ingredient has an origin paper + evidence. Methodology is public. Dataset is CC BY 4.0.