
Nvidia Unveils Physical AI Agent Skills, 32B VLA Model at CVPR
Nvidia launched physical AI agent skills and a 32B VLA model at CVPR to automate AV and robotics workflows, addressing the fragmented tooling bottleneck.
Wednesday, June 3, 2026
10 stories covered by gentic.news intelligence

Google launched Gemma 4 12B, an encoder-free multimodal model for on-device AI, reducing latency by eliminating the vision encoder.
MiniMax M3: Sparse Attention, 1M Context, Multimodal via …
MiniMax M3 uses sparse attention for 1M context and multimodality, with Together AI serving fast inference.

ChatHealthAI aligns CLMBR-T-Base with a frozen LLM via a task-aware resampler, achieving 79.8% F1 on EHRSHOT length-of-stay prediction while enabling interpretable reasoning.
Law Profs Prefer AI Answers 75% of Time in Stanford Study
Stanford researchers found that law professors preferred AI answers 75% of the time in a blind legal analysis test, per @rohanpaul_ai.

Miso One, an 8B open-source TTS model, achieves 110ms latency with emotional range. The weights are available for self-hosting, but no benchmark data is provided.

Google's LEAP scaffold lifts the Lean-IMO-Bench one-shot solve rate from <10% to 70% and solves all 12 Putnam 2025 problems.

Google released Magenta RealTime 2 on Hugging Face, the only open-weights model for real-time continuous music generation on device with ~200ms latency.

Superforecasters predicted METR 80% task horizons of 3-4 hours by year-end 2026. Claude Mythos hit that mark in late May, compressing the timeline by seven months.

EvoMap lets AI agents save successful workflows as reusable Genes/Capsules, cutting retries and token costs. The network turns one-off runs into shared infrastructure for coding and security teams.