AllenAI released MolmoAct2, an open action reasoning model for robots, on April 17, 2026. The release pairs the model with a 720-hour bimanual demonstration dataset and an adaptive-depth reasoning mechanism that beats GPT-5 and Gemini Robotics on real-world manipulation tasks.
Key facts
- 720-hour bimanual dataset, largest open
- 89.4% success rate on ACT benchmark
- Beats GPT-5 (82.1%) and Gemini Robotics (84.3%)
- Adaptive-depth reasoning cuts latency by 40% on simple tasks
- Open model, fine-tuneable on local hardware
AllenAI's MolmoAct2 is a fully open action reasoning model designed for real-world robot deployment. According to @HuggingPapers, it includes the largest open bimanual dataset—720 hours of demonstration data—which covers diverse manipulation tasks. The model employs a specialized spatial reasoning backbone to handle complex, multi-step actions with precision.
Why this matters more than the press release suggests
This isn't just another open model release. MolmoAct2's adaptive-depth reasoning mechanism dynamically adjusts inference depth based on task complexity, cutting latency by 40% on simple tasks while outperforming proprietary giants like GPT-5 and Gemini Robotics. Unlike closed models that require API calls, MolmoAct2 can be fine-tuned and deployed on local hardware, lowering the barrier for robotics labs. The 720-hour bimanual dataset is 3x larger than the previous open record, per AllenAI's documentation, enabling more robust policy learning.
Technical highlights
The architecture builds on Molmo, AllenAI's multimodal foundation model, but adds a spatial reasoning backbone that encodes 3D coordinates and object relationships. AllenAI did not disclose the exact parameter count or training compute cost. On the ACT benchmark (real-world action completion tasks), MolmoAct2 scored an 89.4% success rate, up from 82.1% for GPT-5 and 84.3% for Gemini Robotics, according to the model card cited by @HuggingPapers. Adaptive-depth reasoning reduced average inference time by 40% on simple tasks while matching or exceeding baseline accuracy on complex ones.
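AllenAI has not published the internals of the adaptive-depth mechanism, but the basic idea can be sketched as a complexity estimate gating how many reasoning passes the model runs. Everything below is an illustrative toy, not MolmoAct2's actual implementation; the heuristic weights and depth bounds are assumptions.

```python
# Toy sketch of adaptive-depth inference (assumed design, not MolmoAct2's):
# estimate task complexity, then scale the number of reasoning passes so
# simple tasks exit early and save latency.

def estimate_complexity(num_objects: int, num_steps: int) -> float:
    """Assumed heuristic: more objects and action steps -> higher complexity.

    Returns a score clamped to [0, 1]; the 0.1/0.15 weights are arbitrary.
    """
    return min(1.0, 0.1 * num_objects + 0.15 * num_steps)

def adaptive_depth(complexity: float, min_depth: int = 2, max_depth: int = 12) -> int:
    """Map a complexity score in [0, 1] to an integer reasoning depth."""
    return min_depth + round(complexity * (max_depth - min_depth))

# A simple single-object pick task gets shallow reasoning...
shallow = adaptive_depth(estimate_complexity(num_objects=1, num_steps=2))
# ...while a cluttered bimanual task saturates at full depth.
deep = adaptive_depth(estimate_complexity(num_objects=8, num_steps=6))
print(shallow, deep)  # prints "6 12"
```

Under this framing, the reported 40% latency cut on simple tasks would come from those tasks running far fewer passes than the fixed-depth worst case.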
Open ecosystem implications
This release challenges the narrative that proprietary models are necessary for cutting-edge robotics. By open-sourcing both the model and dataset, AllenAI enables researchers to reproduce results and build on them—a stark contrast to GPT-5's API-only access. The 720-hour dataset includes 1,500+ unique tasks across 200+ objects, with annotations for grasp points, trajectories, and failure recovery. AllenAI plans to release a training code repository in Q3 2026, further democratizing access.
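The annotation types listed above (grasp points, trajectories, failure recovery) suggest a per-episode record structure. AllenAI has not published the schema, so the field names and layout below are purely illustrative assumptions.

```python
# Hypothetical record layout for one demonstration episode. Field names
# are assumptions for illustration only, not the dataset's actual schema.
from dataclasses import dataclass, field

@dataclass
class GraspAnnotation:
    arm: str            # "left" or "right" in a bimanual setup
    point_xyz: tuple    # grasp point in the robot frame (metres)

@dataclass
class Episode:
    task: str                                  # one of the 1,500+ task labels
    objects: list                              # object names handled in the episode
    grasps: list = field(default_factory=list)
    trajectory: list = field(default_factory=list)  # end-effector poses over time
    recovered_from_failure: bool = False       # failure-recovery annotation

ep = Episode(
    task="fold towel",
    objects=["towel"],
    grasps=[GraspAnnotation("left", (0.42, -0.10, 0.05)),
            GraspAnnotation("right", (0.42, 0.10, 0.05))],
    recovered_from_failure=True,
)
print(ep.task, len(ep.grasps))  # prints "fold towel 2"
```

A flat per-episode record like this is what makes the dataset easy to filter by task, object, or failure-recovery status when training policies.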
What to watch
Watch for the Q3 2026 training code release from AllenAI, and whether enterprise robotics labs adopt MolmoAct2 over GPT-5 for cost-sensitive deployments. Also track any benchmark updates from OpenAI and Google in response.