video

30 articles about video in AI news

Google Launches $0.034 Image Model, Video API for Gemini

Google launched Nano Banana 2 Lite ($0.034/image, 4-second generation) and Gemini Omni Flash ($0.10/second video API), targeting high-throughput developer pipelines.

Jun 30, 2026 | 75% relevant

Bluezoo Launches AI Agent for In-Store Video Advertising

Bluezoo launched an AI agent for in-store video advertising that uses computer vision to analyze shopper engagement and optimize ad content in real time, promising improved ad effectiveness for retailers.

Jun 22, 2026 | 78% relevant

AI editor matches pro on 84% of video cuts in blind test

An AI editor matched a professional editor on 84% of video cuts in a blind test on a 4-hour project, suggesting that editorial judgment is partially automatable.

Jun 15, 2026 | 65% relevant

Mirage: Microsoft's 10.57x faster video gen skips RGB render loop

Microsoft's Mirage stores 3D scenes as latent tokens, achieving 10.57x faster video generation and 55x less memory, with SOTA WorldScore consistency.

Jun 9, 2026 | 92% relevant

LTX Studio Turns AI Video Clips Into Editable Scenes

LTX Studio + LTX-2.3 lets users edit AI video scenes, not just generate clips. This shifts AI video from demo to production tool.

Jun 5, 2026 | 75% relevant

Kling AI Video Enters Hollywood Production with 'House of David'

Kling AI video was used at industrial scale in 'House of David', a first for a Hollywood production. The show reached 44M+ viewers and ranked #1 on Prime Video U.S.

May 24, 2026 | 85% relevant

HAVEN Benchmark Exposes MLLM Gap Between Fluency and Video Understanding

HAVEN benchmark tests MLLMs on hierarchical video understanding across frame, shot, and video levels. Results show top models lack grounded multimodal reasoning despite fluent text generation.

May 21, 2026 | 85% relevant

POV Shopping Videos Threaten Luxury Brand Control, BoF Warns

BoF warns that POV shopping videos threaten luxury brand exclusivity by prioritizing authenticity over controlled imagery. No revenue impact has been disclosed.

May 18, 2026 | 98% relevant

Tavus Debuts AI Avatars Without Source Video Footage

Tavus announced AI avatars no longer need source video, enabling generation from images or text. The shift lowers barriers for enterprise video production.

May 15, 2026 | 85% relevant

Pollo AI Underprices Seedance 2.0 at $0.11/Video

Pollo AI offers Seedance 2.0 at $0.11/video, 5-10x below Seedance's API rates, signaling a pricing war in AI video generation.

May 9, 2026 | 75% relevant

Luma Labs Opens Uni-1.1 API for Production — Image, Not Video, and #1 ELO Comes With a Caveat

Luma Labs has shipped the Uni-1.1 API for production — an image-generation model (not video) with two REST endpoints, Python and JavaScript SDKs, and support for up to nine reference images per call. The widely-cited '#1 Human Preference ELO' is from Luma's own internal pairwise evaluation; on pure text-to-image Luma reports #2 behind Google Nano Banana. Pricing: ~$0.09 per 2K image, 10–30% below Nano Banana 2 / Pro.

May 6, 2026 | 91% relevant

UniVidX Generates Video From 1,000 Samples, SIGGRAPH 2026

UniVidX generates omni-directional video from fewer than 1,000 training samples, using diffusion priors with stochastic masking. The work was accepted at SIGGRAPH 2026.

May 4, 2026 | 85% relevant

Google DeepMind Launches Real-Time Video AI Co-Clinician

Google DeepMind launched AI Co-Clinician, a real-time video analysis system for triadic care, claiming 30% fewer diagnostic errors in early tests.

May 1, 2026 | 85% relevant

NVIDIA Nemotron 3 Nano Omni: Open Multimodal Model Unifies Video, Audio, Image, Text

NVIDIA announced Nemotron 3 Nano Omni, an open multimodal model that processes video, audio, images, and text in a unified architecture, expanding accessibility for multimodal AI research.

Apr 28, 2026 | 93% relevant

Microsoft World-R1: RL Aligns Text-to-Video with 3D Physics

Microsoft's World-R1 framework applies reinforcement learning with feedback from pre-trained 3D foundation models to align text-to-video outputs with physical 3D constraints, improving structural coherence without modifying the underlying video diffusion architecture.

Apr 28, 2026 | 85% relevant

Mirage's Cappy Edits Video via Text Message with No App

Mirage launched Cappy, a text-based video editing service that delivers fully edited videos via SMS. This first-of-its-kind approach eliminates traditional editing interfaces entirely.

Apr 23, 2026 | 75% relevant

OpenAI Teases 'Not a Screenshot' AI Video Model

OpenAI posted a cryptic tweet stating 'This is not a screenshot' with a video link, strongly hinting at a new AI video generation model. This marks a direct move into a space currently led by rivals like Runway and Pika.

Apr 21, 2026 | 85% relevant

Uni-ViGU Unifies Video Generation & Understanding in Single Diffusion Model

A new paper introduces Uni-ViGU, a unified model that performs video generation and understanding within a single diffusion process via flow matching. This inverts the standard approach of separate models for each task.

Apr 15, 2026 | 85% relevant

ByteDance's OmniShow Unifies Text, Image, Audio, Pose for Video Gen

ByteDance introduced OmniShow, a unified multimodal framework for video generation that accepts text, reference images, audio, and pose inputs simultaneously. It claims state-of-the-art performance across diverse conditioning settings.

Apr 14, 2026 | 85% relevant

HeyGen Launches CLI Tool for AI Video Generation from Terminal

AI video platform HeyGen has launched a CLI tool, allowing users to generate videos with avatars, voice, and script via terminal commands. This moves video synthesis from a web dashboard into developer workflows.

Apr 13, 2026 | 85% relevant

Seedance 2.0 Generates Complex 'Mech Battle' Video from Text Prompt

Academic Ethan Mollick highlighted Seedance 2.0's ability to generate a coherent video for the complex prompt 'a mech battle between Neanderthal and Homo Sapiens'. This demonstrates the model's progress in multi-concept scene composition and temporal consistency.

Apr 13, 2026 | 85% relevant

LPM 1.0: 17B-Parameter Diffusion Model Generates 60K-Second AI Avatar Videos

Researchers introduced LPM 1.0, a 17B-parameter real-time diffusion model that generates infinite-length conversational videos with stable identity, achieving over 60,000 seconds of consistent character performance.

Apr 12, 2026 | 95% relevant

OpenMontage: Open-Source Agentic Video Production System Costs $0.69 Per Ad

OpenMontage, an open-source agentic video production system, has been released. It orchestrates 11 pipelines and 49 tools across multiple AI providers to autonomously script, generate assets, edit, and render videos from a plain language prompt.

Apr 11, 2026 | 99% relevant

Seedance 2 Video AI Launches on Lovart AI Platform

The Seedance 2 video generation model has launched on the Lovart AI platform. Early users report it can create complex cinematic sequences, like a spy transformation, from a single text prompt.

Apr 10, 2026 | 85% relevant

PixVerse V6 Launches: 15-Second 1080P Video with Full Audio

AI video startup PixVerse launched its V6 model, capable of generating 15-second, 1080p videos with full audio from text prompts. This marks a significant upgrade in output length and quality for the platform.

Apr 9, 2026 | 89% relevant

Massive Video Reasoning Dataset Released, Reportedly 1000x Larger Than Predecessors

An unverified report claims the release of a video reasoning dataset roughly 1000x larger than existing benchmarks. If true, it would be a significant resource for training next-generation video understanding models.

Apr 8, 2026 | 99% relevant

NemoVideo AI Automates Video Editing Based on Text Prompts

A video creator states NemoVideo AI now automates complex editing tasks like cuts and transitions from simple text descriptions, reducing a 5-hour manual process to a prompt-driven workflow.

Apr 5, 2026 | 85% relevant

OpenAI's GPT-Image-2 Model Reportedly Achieves Photorealistic Video Generation, Surpassing Prior Map-Generation Flaws

A social media user claims OpenAI's GPT-Image-2 model now produces video indistinguishable from reality, a significant leap from its predecessor's documented failure to generate coherent world maps.

Apr 4, 2026 | 85% relevant

Stanford's EgoNav Trains Robot Navigation on 5 Hours of Human Video, Enables Zero-Shot Control of Unitree G1

Stanford's EgoNav system uses 5 hours of egocentric video from a walk around campus to train a diffusion model that enables zero-shot navigation for a Unitree G1 humanoid robot, eliminating the need for robot-specific training data.

Apr 3, 2026 | 99% relevant

Elon Musk Predicts 'Vast Majority' of AI Compute Will Be for Real-Time Video

Elon Musk states that real-time video consumption and generation will account for most AI compute, highlighting a shift from text to video as the primary medium for AI processing.

Mar 29, 2026 | 85% relevant
