gentic.news — AI News Intelligence Platform

AllenAI's MolmoAct2: 720-Hour Bimanual Dataset, Beats GPT-5 on Robotics

AllenAI released MolmoAct2, an open robotics model with a 720-hour bimanual dataset. It reports an 89.4% success rate on the ACT benchmark, ahead of GPT-5 (82.1%) and Gemini Robotics (84.3%), with 40% lower latency on simple tasks.

May 5, 2026 · 3 min read · AI-generated

TL;DR

Open-source action reasoning model for robots · 720-hour bimanual dataset, largest open · Adaptive-depth latency cuts, beats GPT-5

AllenAI released MolmoAct2, an open action reasoning model for robots, on April 17, 2026. The release pairs a 720-hour bimanual dataset with adaptive-depth reasoning, and the model is reported to beat GPT-5 and Gemini Robotics on real-world tasks.

Key facts

  • 720-hour bimanual dataset, the largest open one of its kind
  • 89.4% success rate on ACT benchmark
  • Beats GPT-5 (82.1%) and Gemini Robotics (84.3%)
  • Adaptive-depth reasoning cuts latency by 40% on simple tasks
  • Open model, fine-tunable on local hardware

AllenAI's MolmoAct2 is a fully open action reasoning model designed for real-world robot deployment. According to @HuggingPapers, it ships with the largest open bimanual dataset: 720 hours of demonstration data covering diverse manipulation tasks. The model uses a specialized spatial reasoning backbone to handle complex, multi-step actions.

Why this matters more than the press release suggests

This isn't just another open model release. MolmoAct2's adaptive-depth reasoning mechanism adjusts inference depth to task complexity, cutting average inference time by a reported 40% on simple tasks while outperforming proprietary models such as GPT-5 and Gemini Robotics. Unlike closed models that require API calls, MolmoAct2 can be fine-tuned and deployed on local hardware, lowering the barrier for robotics labs. The 720-hour bimanual dataset is 3x larger than the previous open record, per AllenAI's documentation, which should support more robust policy learning.
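The article does not say how adaptive-depth reasoning is implemented. Purely as an illustration of the general idea, the sketch below uses a confidence-based early exit: layers run one at a time, and inference stops once the top action probability clears a threshold. The toy layer, the threshold value, and every function name here are hypothetical and are not taken from MolmoAct2.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical layer type: maps a vector of action logits to refined logits.
Layer = Callable[[List[float]], List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a small list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_depth_forward(
    logits: Sequence[float],
    layers: Sequence[Layer],
    threshold: float = 0.9,  # assumed exit confidence, not a published value
) -> Tuple[int, int]:
    """Apply layers in order and stop early once the top probability
    reaches `threshold`. Returns (chosen_action_index, layers_used)."""
    if not layers:
        raise ValueError("at least one layer is required")
    state = list(logits)
    probs: List[float] = []
    depth = 0
    for depth, layer in enumerate(layers, start=1):
        state = layer(state)
        probs = softmax(state)
        if max(probs) >= threshold:
            break  # confident enough: skip the remaining layers
    return probs.index(max(probs)), depth


def sharpen(state: List[float]) -> List[float]:
    """Toy stand-in for a real layer: doubles every logit."""
    return [2.0 * v for v in state]


toy_layers = [sharpen] * 6

# An "easy" input starts with a clear winner; a "hard" one is nearly tied.
easy_action, easy_depth = adaptive_depth_forward([2.0, 0.0, 0.0], toy_layers)
hard_action, hard_depth = adaptive_depth_forward([0.2, 0.1, 0.0], toy_layers)
print(f"easy input: action {easy_action} after {easy_depth} layer(s)")
print(f"hard input: action {hard_action} after {hard_depth} layer(s)")
```

In this toy setup the easy input exits after fewer layers than the hard one, which is the behavior the article attributes to the model: less compute on simple tasks, full depth on complex ones.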

Technical highlights

The architecture builds on Molmo, AllenAI's multimodal foundation model, and adds a spatial reasoning backbone that encodes 3D coordinates and object relationships. AllenAI did not disclose the exact number of parameters or the training compute cost. On the ACT benchmark (real-world action completion tasks), MolmoAct2 scored an 89.4% success rate, compared with 82.1% for GPT-5 and 84.3% for Gemini Robotics, according to the model card (per @HuggingPapers). Adaptive-depth reasoning reduced average inference time by 40% on simple tasks, while matching or exceeding baseline accuracy on complex ones.
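To put the reported numbers in perspective, the short calculation below works out the success-rate margins in percentage points and shows how a 40% saving on simple tasks would translate into an overall saving for a mixed workload. The success rates and the 40% figure come from the article. The assumptions that complex tasks see no latency change and that simple and complex tasks have the same baseline latency are illustrative only.

```python
# Success rates on the ACT benchmark as reported in the article (percent).
success_rate = {"MolmoAct2": 89.4, "GPT-5": 82.1, "Gemini Robotics": 84.3}

# Margin of MolmoAct2 over each baseline, in percentage points.
margins = {
    name: round(success_rate["MolmoAct2"] - rate, 1)
    for name, rate in success_rate.items()
    if name != "MolmoAct2"
}


def blended_latency_reduction(
    simple_share: float,
    simple_reduction: float = 0.40,   # reported saving on simple tasks
    complex_reduction: float = 0.0,   # ASSUMPTION: no saving on complex tasks
) -> float:
    """Overall fractional latency saving for a workload in which
    `simple_share` of tasks are simple. Assumes (hypothetically) that
    simple and complex tasks have the same baseline latency."""
    if not 0.0 <= simple_share <= 1.0:
        raise ValueError("simple_share must be between 0 and 1")
    return simple_share * simple_reduction + (1.0 - simple_share) * complex_reduction


half_simple = blended_latency_reduction(0.5)
all_simple = blended_latency_reduction(1.0)

print(margins)
print(f"50% simple tasks: {half_simple:.0%} overall saving")
print(f"100% simple tasks: {all_simple:.0%} overall saving")
```

Under these assumptions the headline 40% figure is an upper bound: a workload that is only half simple tasks would see roughly half that saving.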

Open ecosystem implications

This release challenges the narrative that proprietary models are necessary for cutting-edge robotics. By open-sourcing both the model and dataset, AllenAI enables researchers to reproduce results and build on them—a stark contrast to GPT-5's API-only access. The 720-hour dataset includes 1,500+ unique tasks across 200+ objects, with annotations for grasp points, trajectories, and failure recovery. AllenAI plans to release a training code repository in Q3 2026, further democratizing access.
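The article lists the annotation types in the dataset (grasp points, trajectories, failure recovery) but not its file format. As a sketch of how such an episode might be represented and summarized, the classes below use a hypothetical schema. None of the field names or the helper function come from AllenAI's release.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class GraspPoint:
    arm: str                                # "left" or "right" (bimanual setup)
    position: Tuple[float, float, float]    # hypothetical x, y, z in meters


@dataclass
class TrajectoryStep:
    t: float                                # seconds since episode start
    joint_positions: List[float]


@dataclass
class Episode:
    task: str
    objects: List[str]
    grasp_points: List[GraspPoint]
    trajectory: List[TrajectoryStep]
    recovered_from_failure: bool = False    # hypothetical failure-recovery flag

    def duration_s(self) -> float:
        """Elapsed time between the first and last trajectory step."""
        if not self.trajectory:
            return 0.0
        return self.trajectory[-1].t - self.trajectory[0].t


def summarize(episodes: List[Episode]) -> Dict[str, float]:
    """Count unique tasks and objects and total the recorded hours."""
    tasks = {e.task for e in episodes}
    objects = {name for e in episodes for name in e.objects}
    hours = sum(e.duration_s() for e in episodes) / 3600.0
    return {
        "unique_tasks": len(tasks),
        "unique_objects": len(objects),
        "hours": hours,
    }


# Two made-up episodes, for illustration only.
demo = [
    Episode(
        task="fold towel",
        objects=["towel"],
        grasp_points=[GraspPoint("left", (0.1, 0.2, 0.0)),
                      GraspPoint("right", (0.4, 0.2, 0.0))],
        trajectory=[TrajectoryStep(0.0, [0.0, 0.0]),
                    TrajectoryStep(1800.0, [0.5, 0.3])],
    ),
    Episode(
        task="open jar",
        objects=["jar", "lid"],
        grasp_points=[GraspPoint("left", (0.2, 0.1, 0.05))],
        trajectory=[TrajectoryStep(0.0, [0.0, 0.0]),
                    TrajectoryStep(1800.0, [0.2, 0.9])],
        recovered_from_failure=True,
    ),
]

summary = summarize(demo)
print(summary)
```

A summary like this is how figures such as "1,500+ unique tasks across 200+ objects" and "720 hours" could be checked against the released data once its actual format is known.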

What to watch


Watch for the Q3 2026 training code release from AllenAI, and whether enterprise robotics labs adopt MolmoAct2 over GPT-5 for cost-sensitive deployments. Also track any benchmark updates from OpenAI and Google in response.


AI-assisted reporting. Generated by gentic.news from 1 verified source, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.


AI Analysis

MolmoAct2 suggests a structural shift in robotics AI: open models are now competitive with, and in some cases superior to, proprietary ones. The 720-hour dataset is the release's main advantage. At a reported 3x the size of the previous open record, it lets researchers train policies that generalize across diverse tasks without expensive data collection. Adaptive-depth reasoning addresses a key pain point, latency in real-time robotics. However, performance on the ACT benchmark may not translate to all environments; real-world deployment often reveals edge cases that a fixed benchmark does not capture. The lack of disclosed training compute cost makes it hard to compare efficiency with GPT-5, but the open release means the community can optimize the model. The release pressures OpenAI and Google to either open-source parts of their models or risk losing mindshare in the research community.