voice cloning

26 articles about voice cloning in AI news

OpenVoice v2: Complete Voice Cloning Directory Launches on GitHub

A developer has compiled and released a comprehensive directory of open-source voice cloning tools and resources on GitHub. This centralizes access to models, datasets, and training code, lowering the barrier to entry for AI audio development.

Apr 16, 202685% relevant

ElevenLabs Voice Cloning API Priced from $5 to $1,320/Month

ElevenLabs' AI voice cloning service has published pricing tiers from $5 to $1,320 per month. This formalizes the cost structure for developers and businesses integrating synthetic speech.

Apr 10, 202687% relevant

Waves Audio Launches Lightning V3.1: 10-Second Voice Cloning with 44.1kHz Studio Quality

Waves Audio released Lightning V3.1, a voice cloning model that creates studio-quality voice replicas from just 10 seconds of audio with under 100ms latency. The update supports over 50 languages and targets real-time applications.

Mar 25, 202687% relevant

LuxTTS Democratizes Voice Cloning: High-Quality Synthesis Now Runs on Consumer Hardware

LuxTTS, a new open-source text-to-speech model, enables realistic voice cloning from just 3 seconds of audio using only 1GB of VRAM. The system operates 150x faster than real-time and produces 48kHz audio, challenging proprietary solutions like ElevenLabs.

Mar 11, 202695% relevant

Developer Builds AI Baby Monitor with Voice Cloning in Under 24 Hours Using DevKit

A developer created a working MVP of a smart baby monitor that clones a mother's voice to soothe a crying infant, completing the project in less than 24 hours after unboxing a new devkit.

Mar 20, 202685% relevant

VoxCPM2 Open-Source Voice AI Outperforms ElevenLabs on Key Benchmarks

Researchers from OpenBMB and Tsinghua University released VoxCPM2, a 2B-parameter open-source voice AI that clones voices from short clips and creates voices from text descriptions. It outperforms ElevenLabs on the Minimax-MLS benchmark and runs locally with no API costs.

Apr 10, 202695% relevant

GOLF.AI Launches 24/7 AI Concierge Agent for Golf Pro Shops, Voiced by Nick Faldo

GOLF.AI has introduced the GOLF.AI CONCIERGE Agent, an AI-powered voice assistant designed to serve as the primary contact for golf pro shops. It manages tee time bookings and answers customer queries around the clock, utilizing a licensed voice model of six-time major champion Sir Nick Faldo.

Apr 6, 202688% relevant

Neuralink & ElevenLabs Demo AI Voice Restoration for Brain Implant User

Neuralink and voice AI firm ElevenLabs demonstrated a system that generates speech for a Neuralink patient who lost their voice. The demo shows a brain-computer interface decoding intended speech into synthetic voice in real-time.

Apr 6, 202685% relevant

Mistral AI Releases Voxtral TTS: 4B-Parameter Open-Weight Model Clones Voices from 3-Second Audio in 9 Languages

Mistral AI has launched Voxtral TTS, its first open-weight text-to-speech model. The 4B-parameter model clones voices from three seconds of reference audio across nine languages, with a latency of 70ms, and scored higher on naturalness than ElevenLabs Flash v2.5 in human tests.

Mar 26, 202695% relevant

OpenAI's Audio Revolution: New Voice Models Signal Major AI Advancements

OpenAI appears poised to release new audio models that could significantly enhance voice interaction capabilities. This development follows recent trademark filings and suggests major improvements to voice mode technology.

Feb 23, 202685% relevant

OpenBMB's VoxCPM 2: 2B-Param Open-Source TTS for Multilingual Voice

OpenBMB launched VoxCPM 2, a 2-billion-parameter open-source text-to-speech model. It generates multilingual, emotionally expressive speech from text descriptions and runs on consumer-grade hardware.

Apr 13, 202697% relevant

Google Launches Gemini 3.1 Flash TTS with Prompt-Controlled Speech

Google has launched Gemini 3.1 Flash TTS, a text-to-speech model featuring prompt-based voice control and support for over 70 languages. This release expands Google's multimodal AI offerings directly to developers.

Apr 15, 202693% relevant

HeyGen Launches CLI Tool for AI Video Generation from Terminal

AI video platform HeyGen has launched a CLI tool, allowing users to generate videos with avatars, voice, and script via terminal commands. This moves video synthesis from a web dashboard into developer workflows.

Apr 13, 202685% relevant

Mistral AI Launches Voxtral TTS: 3B-Parameter Open-Source Model Claims 63% Win Rate Over ElevenLabs Flash v2.5

Mistral AI released Voxtral TTS, a 3-billion-parameter open-weights text-to-speech model. It reportedly outperforms ElevenLabs Flash v2.5 in human preference tests, runs on 3 GB RAM, and clones voices from 5 seconds of audio.

Mar 26, 202695% relevant

Tongyi Lab Releases World's First Open-Source Multi-Speaker AI Dubbing Model

Alibaba's Tongyi Lab has released the first open-source AI model capable of dubbing multi-speaker conversations, addressing one of the hardest problems in AI video generation. The model synchronizes voice with lip movements across multiple speakers in a single pass.

Mar 17, 202685% relevant

Miso One: 8B Open-Source TTS Hits 110ms Latency, Real Emotion

Miso One, an 8B open-source TTS model, achieves 110ms latency with emotional range. Weights are fully open-source for self-hosting, but no benchmark data is provided.

Jun 3, 202685% relevant

mlx-audio v0.4.3 Ships 6 New TTS Models, Slimmer Deps

mlx-audio v0.4.3 adds 6 TTS models, server concurrency, and slims dependencies, targeting Apple Silicon developers.

May 9, 202685% relevant

WeClone: Open-Source Tool Fine-Tunes AI Clones from Chat Logs

WeClone is an open-source tool that processes exported chat logs to fine-tune an LLM, creating a personalized AI clone. It has gained 16.4K stars on GitHub, offering a free, self-hosted alternative to commercial digital twin services.

Apr 13, 202685% relevant

Pika Labs Launches 'AI Self' Chatbot for Newsletter Creator Kimmonismus

Kimmonismus, who runs an AI newsletter with 225K+ readers, has launched a custom chatbot trained on his industry knowledge and opinions using Pika Labs' technology. The 'AI Self' is designed to handle reader inquiries at scale.

Apr 10, 202685% relevant

HeyGen Launches Avatar Engine, Open-Source Renderer & 175-Language Dubbing

HeyGen's major 2026 update includes a new avatar engine, an open-source video renderer, and 175-language dubbing capabilities, expanding its AI video generation platform for enterprise and creator use.

Apr 8, 202697% relevant

OpenBMB Launches VoxCPM 2, an Open-Source TTS Model Rivaling Qwen3-TTS

OpenBMB has launched VoxCPM 2, an open-source text-to-speech AI model from China. The release is positioned as a direct competitor to Alibaba's Qwen3-TTS, expanding the open-source TTS landscape.

Apr 7, 202691% relevant

Text-to-Speech Cost Plummets from $0.15/Word to Free Local Models Using 3GB RAM

High-quality text-to-speech has shifted from a $0.15 per word cloud service to free, local models requiring only 3GB of RAM in 12 months, signaling a broader price collapse in AI inference.

Mar 30, 202685% relevant

Fanvue Emerges as Primary Platform for AI-Generated Influencers, Explicitly Allowing Synthetic Creator Accounts

Fanvue, a subscription content platform, has positioned itself as the primary destination for AI-generated influencer accounts, explicitly permitting creators to monetize synthetic personas. This formalizes a niche market for AI-driven adult and influencer content.

Mar 29, 202685% relevant

POP.STORE Launches ECHO-ME: An Agentic AI Commerce Platform for Creators

POP.STORE announced ECHO-ME, an agentic AI platform designed to autonomously run a creator's business operations. It monitors social channels, detects brand deals, and converts fan interactions into revenue, launching with 15,000 creators. This represents a shift from task automation to full business operation for the solo creator economy.

Mar 18, 202682% relevant

Fish Audio S2 Enables Word-Level Speech Control with Positional Tags, Beats GPT-4o in Human Preference Tests

Fish Audio S2 introduces a 100% open-source TTS model that uses inline positional tags for word-level vocal control, achieving 8/10 wins against GPT-4o and Gemini in human preference tests while generating audio nearly 5x faster than real-time.

Mar 17, 202695% relevant

The Uncanny Valley of Truth: How AI Avatars Are Blurring Reality's Edge

AI avatars now replicate human speech patterns, facial expressions, and gestures with unsettling accuracy, creating synthetic personas indistinguishable from real people. This technological leap raises urgent questions about authenticity, trust, and the future of digital communication.

Feb 27, 202685% relevant

Explore More

AI Agents Large Language Models Claude Code OpenAI RAG MCP Fine-tuning Benchmarks Open Source AI AI Safety