gentic.news — AI News Intelligence Platform

VibeVoice

VibeVoice is a speech-to-text AI model released by Microsoft under the permissive MIT license, combining transcription with speaker diarization for multi-speaker audio processing.

Techniques inside: 1
Median research → prod: 3y
Fastest adoption: 3y
Slowest adoption: 3y

Ingredient list

  1. Invented by OpenAI · 2022-12 · Velocity 3y

    Builds on the encoder-decoder architecture popularized by Whisper.

    multimodal · high

This recipe is part of the gentic.news Deployment Atlas. Every ingredient has an origin paper + evidence. Methodology is public. Dataset is CC BY 4.0.
