video
30 articles about video in AI news
Google Launches $0.034 Image Model, Video API for Gemini
Google launched Nano Banana 2 Lite ($0.034/image, 4-second generation) and Gemini Omni Flash ($0.10/second video API), targeting high-throughput developer pipelines.
Bluezoo Launches AI Agent for In-Store Video Advertising
Bluezoo launched an AI agent for in-store video advertising that uses computer vision to analyze shopper engagement and optimize ad content in real time, promising improved ad effectiveness for retailers.
AI editor matches pro on 84% of video cuts in blind test
An AI editor matched a professional editor on 84% of video cuts in a blind test of a 4-hour project, suggesting that editorial judgment is partially automatable.
Mirage: Microsoft's 10.57x faster video gen skips RGB render loop
Microsoft's Mirage stores 3D scenes as latent tokens, achieving 10.57x faster video generation and 55x less memory, with SOTA WorldScore consistency.
LTX Studio Turns AI Video Clips Into Editable Scenes
LTX Studio with LTX-2.3 lets users edit AI video scenes, not just generate clips. This shifts AI video from a demo to a production tool.
Kling AI Video Enters Hollywood Production with 'House of David'
Kling AI video was used in 'House of David', reportedly the first Hollywood production to use AI video at industrial scale. The show reached 44M+ viewers and ranked #1 on Prime Video U.S.
HAVEN Benchmark Exposes MLLM Gap Between Fluency and Video Understanding
HAVEN benchmark tests MLLMs on hierarchical video understanding across frame, shot, and video levels. Results show top models lack grounded multimodal reasoning despite fluent text generation.
POV Shopping Videos Threaten Luxury Brand Control, BoF Warns
BoF warns that POV shopping videos put luxury brand exclusivity at risk by prioritizing authenticity over controlled imagery. No revenue impact has been disclosed.
Tavus Debuts AI Avatars Without Source Video Footage
Tavus announced AI avatars no longer need source video, enabling generation from images or text. The shift lowers barriers for enterprise video production.
Pollo AI Underprices Seedance 2.0 at $0.11/Video
Pollo AI offers Seedance 2.0 at $0.11/video, 5-10x below Seedance's API rates, signaling a pricing war in AI video generation.
Luma Labs Opens Uni-1.1 API for Production — Image, Not Video, and #1 ELO Comes With a Caveat
Luma Labs has shipped the Uni-1.1 API for production — an image-generation model (not video) with two REST endpoints, Python and JavaScript SDKs, and support for up to nine reference images per call. The widely-cited '#1 Human Preference ELO' is from Luma's own internal pairwise evaluation; on pure text-to-image Luma reports #2 behind Google Nano Banana. Pricing: ~$0.09 per 2K image, 10–30% below Nano Banana 2 / Pro.
UniVidX Generates Video From 1,000 Samples, SIGGRAPH 2026
UniVidX generates omni-directional video from fewer than 1,000 training samples, using diffusion priors with stochastic masking. The paper was accepted at SIGGRAPH 2026.
Google DeepMind Launches Real-Time Video AI Co-Clinician
Google DeepMind launched AI Co-Clinician, a real-time video analysis system for triadic care, claiming 30% fewer diagnostic errors in early tests.
NVIDIA Nemotron 3 Nano Omni: Open Multimodal Model Unifies Video, Audio, Image, Text
NVIDIA announced Nemotron 3 Nano Omni, an open multimodal model that processes video, audio, images, and text in a unified architecture, expanding accessibility for multimodal AI research.
Microsoft World-R1: RL Aligns Text-to-Video with 3D Physics
Microsoft's World-R1 framework applies reinforcement learning with feedback from pre-trained 3D foundation models to align text-to-video outputs with physical 3D constraints, improving structural coherence without modifying the underlying video diffusion architecture.
Mirage's Cappy Edits Video via Text Message with No App
Mirage launched Cappy, a text-based video editing service that delivers fully edited videos via SMS. This first-of-its-kind approach eliminates traditional editing interfaces entirely.
OpenAI Teases 'Not a Screenshot' AI Video Model
OpenAI posted a cryptic tweet stating 'This is not a screenshot' with a video link, strongly hinting at a new AI video generation model. This marks a direct move into a space currently led by rivals like Runway and Pika.
Uni-ViGU Unifies Video Generation & Understanding in Single Diffusion Model
A new paper introduces Uni-ViGU, a unified model that performs video generation and understanding within a single diffusion process via flow matching. This inverts the standard approach of separate models for each task.
ByteDance's OmniShow Unifies Text, Image, Audio, Pose for Video Gen
ByteDance introduced OmniShow, a unified multimodal framework for video generation that accepts text, reference images, audio, and pose inputs simultaneously. It claims state-of-the-art performance across diverse conditioning settings.
HeyGen Launches CLI Tool for AI Video Generation from Terminal
AI video platform HeyGen has launched a CLI tool, allowing users to generate videos with avatars, voice, and script via terminal commands. This moves video synthesis from a web dashboard into developer workflows.
Seedance 2.0 Generates Complex 'Mech Battle' Video from Text Prompt
Academic Ethan Mollick highlighted Seedance 2.0's ability to generate a coherent video for the complex prompt 'a mech battle between Neanderthal and Homo Sapiens'. This demonstrates the model's progress in multi-concept scene composition and temporal consistency.
LPM 1.0: 17B-Parameter Diffusion Model Generates 60K-Second AI Avatar Videos
Researchers introduced LPM 1.0, a 17B-parameter real-time diffusion model that generates infinite-length conversational videos with stable identity, achieving over 60,000 seconds of consistent character performance.
OpenMontage: Open-Source Agentic Video Production System Costs $0.69 Per Ad
OpenMontage, an open-source agentic video production system, has been released. It orchestrates 11 pipelines and 49 tools across multiple AI providers to autonomously script, generate assets, edit, and render videos from a plain language prompt.
Seedance 2 Video AI Launches on Lovart AI Platform
The Seedance 2 video generation model has launched on the Lovart AI platform. Early users report it can create complex cinematic sequences, like a spy transformation, from a single text prompt.
PixVerse V6 Launches: 15-Second 1080P Video with Full Audio
AI video startup PixVerse launched its V6 model, capable of generating 15-second, 1080p videos with full audio from text prompts. This marks a significant upgrade in output length and quality for the platform.
Massive Video Reasoning Dataset Released, Reportedly 1000x Larger Than Predecessors
An unverified report claims the release of a video reasoning dataset roughly 1000x larger than existing benchmarks. If true, it would be a significant resource for training next-generation video understanding models.
NemoVideo AI Automates Video Editing Based on Text Prompts
A video creator states NemoVideo AI now automates complex editing tasks like cuts and transitions from simple text descriptions, reducing a 5-hour manual process to a prompt-driven workflow.
OpenAI's GPT-Image-2 Model Reportedly Achieves Photorealistic Video Generation, Surpassing Prior Map-Generation Flaws
A social media user claims OpenAI's GPT-Image-2 model now produces video indistinguishable from reality, a significant leap from its predecessor's documented failure to generate coherent world maps.
Stanford's EgoNav Trains Robot Navigation on 5 Hours of Human Video, Enables Zero-Shot Control of Unitree G1
Stanford's EgoNav system uses a 5-hour egocentric video walk of campus to train a diffusion model that enables zero-shot navigation for a Unitree G1 humanoid robot, eliminating the need for robot-specific training data.
Elon Musk Predicts 'Vast Majority' of AI Compute Will Be for Real-Time Video
Elon Musk predicts that real-time video consumption and generation will account for most AI compute, pointing to a shift from text to video as the primary medium for AI processing.