motion capture
30 articles about motion capture in AI news
Bones Studio Demos Motion-Capture-to-Robot Pipeline for Home Tasks
Bones Studio released a demo showing its 'Captured → Labeled → Transferred' pipeline. It uses optical motion capture to record human tasks, then transfers the data for a humanoid robot to replicate the actions in simulation.
Unitree G1 humanoid robots mirror dancer in real time via motion capture
Unitree G1 humanoid robots mirrored a dancer in real time via motion capture at a Shanghai event, part of a 100-person tracking challenge.
NVIDIA Open-Sources Motion Diffusion Model for Humanoid Robots
NVIDIA open-sourced Kimodo, a motion diffusion model for humanoid robots, trained on 700 hours of motion capture data. It generates 3D human and robot motions from text prompts, supports keyframe and end-effector control, and runs on Unitree G1.
Miso One: 8B Open-Source TTS Hits 110ms Latency, Real Emotion
Miso One, an 8B open-source TTS model, achieves 110ms latency with emotional range. Weights are fully open-source for self-hosting, but no benchmark data is provided.
India's Human Motion Farms Train Humanoid Robots with First-Person Hand Data
Labs in India are capturing detailed human motion data—focusing on grip, force, and error recovery—to train AI models for humanoid robots. This addresses the critical bottleneck of acquiring physical intelligence data for robotics.
Maker 'Sword Man' Builds 5,000 kg Real-Time Motion-Tracking Robotic Hand
A Chinese maker known as Sword Man has built a 5,000 kg robotic hand from scratch. It uses a motion-tracking glove to mimic the operator's hand movements in real time.
UiPath Launches AI Agents for Retail Pricing, Promotions, and Stock Management
UiPath has announced new AI agents designed to autonomously handle core retail operations: dynamic pricing, promotional planning, and inventory gap resolution. This represents a significant move by a major automation player into agentic AI for retail.
New Research Improves Text-to-3D Motion Retrieval with Interpretable Fine-Grained Alignment
Researchers propose a novel method for retrieving 3D human motion sequences from text descriptions using joint-angle motion images and token-patch interaction. It outperforms state-of-the-art methods on standard benchmarks while offering interpretable correspondences.
Beyond Logic: How EMO-R3 Teaches AI to Reason About Human Emotions
Researchers have developed EMO-R3, a novel framework that enhances emotional reasoning in multimodal AI systems. Using reflective reinforcement learning, it enables AI to better understand and interpret human emotions in visual contexts, addressing a critical gap in current models.
The Dawn of Emotional AI Avatars: How Synthetic Humans Are Redefining Digital Interaction
New AI avatar technology creates emotionally responsive digital humans with realistic facial expressions, enabling natural conversations that could transform customer service, education, and social interaction.
GenRobot Launches 6-Camera Wearable for Embodied AI Data Capture
GenRobot launched DAS Ego, a wearable with six 2MP cameras for capturing zero-distortion, 270° FOV data. They also open-sourced the 'Gen Ego Data' dataset covering 200+ skills to train models on perception-action causality.
Tsinghua & Peking University Researchers Train Humanoid Robot to Play Tennis Using Scattered, Imperfect Human Motion Clips
A team from Tsinghua, Peking University, and other labs taught a humanoid robot to play tennis using short, imperfect human swing clips instead of perfect match data. The system uses a physics simulator to correct errors, lowering the barrier for teaching robots complex physical tasks.
Instacart's Semantic IDs: Product Understanding at Scale
Instacart's engineering team details a semantic ID system for product understanding at scale, using embeddings to create meaningful identifiers that enhance search and recommendations. This approach captures nuanced product relationships, improving relevance for grocery e-commerce.
OpenBMB's VoxCPM 2: 2B-Param Open-Source TTS for Multilingual Voice
OpenBMB launched VoxCPM 2, a 2-billion-parameter open-source text-to-speech model. It generates multilingual, emotionally expressive speech from text descriptions and runs on consumer-grade hardware.
Indian Factory Workers Wear Head Cams to Gather Embodied AI Training Data
To overcome the high cost of robot fleet data collection, companies are deploying head cameras on human factory workers. This first-person video captures the sequencing, posture, and micro-adjustments of real work, serving as a proxy for expensive robotic action data.
AI Model Decodes Silent Speech from Phone Sensors, No Microphone Needed
A new AI model can reconstruct speech by analyzing imperceptible facial movements captured by smartphone sensors, effectively enabling silent speech recognition without a microphone. This represents a significant leap in sensor fusion and on-device AI.
TikTok Shop's Real ROI: Why Brands Must Measure Cross-Platform Demand, Not Just In-App Sales
A case study of sun-care brand Carroten argues TikTok Shop's primary value is as a demand engine for Amazon and retail, not a standalone sales channel. The strategy reframes ROI measurement to capture the halo effect across the entire digital shelf.
Mood-Assisted Recommendation Systems Show Statistically Significant Improvement in Music Context
New research demonstrates that incorporating user mood input via the energy-valence spectrum leads to statistically significant improvements in music recommendation quality compared to baseline systems. This highlights the value of emotional context in personalization.
Kling AI Video Platform Goes Global: How 3.0 Release Redefines Accessible Cinematic AI
Kling AI has launched its 3.0 platform worldwide, offering 1080p cinematic video generation and advanced motion control. This marks a significant step toward professional-grade AI video tools becoming accessible to global creators.
NeuroSkill: MIT's Breakthrough AI Agent Reads Your Mind Before You Ask
MIT researchers have developed NeuroSkill, a revolutionary AI system that integrates brain-computer interfaces with foundation models to create proactive agents that respond to implicit human cognitive and emotional states, running fully offline on edge devices.
Robotics' Scaling Breakthrough: How SONIC's 42M-Parameter Model Achieves Perfect Real-World Transfer
Researchers have demonstrated that robotics can scale like language models, with SONIC training a 42M-parameter model on 100M human motion frames. The system achieved 100% success transferring to real robots without fine-tuning, marking a paradigm shift in robotic learning.
Pollo AI Underprices Seedance 2.0 at $0.11/Video
Pollo AI offers Seedance 2.0 at $0.11/video, 5-10x below Seedance's API rates, signaling a pricing war in AI video generation.
MIT Hackathon Team Builds Wearable AI for Physical Movement Guidance
An MIT hackathon team built a wearable AI system that gives real-time physical movement guidance using sensors and on-device inference, demoed by @kimmonismus.
Guerlain Launches First Paid Influencer Campaign After Viral TikTok
Guerlain reports the Vanille Planifolia extrait became its #1 best-selling product for five months after organic TikTok videos, leading to the brand’s first paid influencer campaign. Sales tripled despite the $660 price, and the fragrance sold out multiple times.
VLAF Framework Reveals Widespread Alignment Faking in Language Models
Researchers introduce VLAF, a diagnostic framework that reveals alignment faking is far more common than previously known, affecting models as small as 7B parameters. They also show a single contrastive steering vector can mitigate the behavior with minimal computational overhead.
OpenAI Teases 'Not a Screenshot' AI Video Model
OpenAI posted a cryptic tweet stating 'This is not a screenshot' with a video link, strongly hinting at a new AI video generation model. This marks a direct move into a space currently led by rivals like Runway and Pika.
John Ternus Takes Over Apple AI Leadership as Era Ends
Apple's AI leadership transitions to John Ternus, marking a new era following Steve Jobs' vision and Tim Cook's operational success. This comes as Apple accelerates its generative AI push with Apple Intelligence.
Webcam Head-Tracking Wallpaper Uses AI for Parallax Effect
A developer built a dynamic wallpaper that tracks a user's head via webcam to shift the background perspective in real time. It demonstrates a novel, accessible application of computer vision for interactive desktop environments.
Four Seasons Kuala Lumpur Deploys AI to Personalize Luxury Event Experiences
The Four Seasons Kuala Lumpur is introducing AI to create personalized event experiences, from tailored menus to dynamic ambiance. This is part of a broader trend where luxury hotels are testing AI as a tool for deeper guest engagement and service differentiation.
Ethan Mollick on AI's Impact: 'Everything Is Someone's Life Work' No Longer True
AI researcher Ethan Mollick notes the foundational assumption that 'everything around me is somebody's life work' is being invalidated by generative AI, signaling a profound shift in how we value human output.