In a candid statement that cuts to the core of the robotics field's biggest challenge, Figure CEO Brett Adcock declared that the primary obstacle to creating general-purpose robots is no longer a matter of algorithmic breakthroughs or hardware design, but one of pure data acquisition.
"If we could get a pile of data in the helix stack, we would solve general robotics right now," Adcock stated, as highlighted by AI commentator Rohan Pandey. The comment underscores a growing consensus in AI and robotics: the models and architectures capable of generalist robotic behavior likely exist, but they are starved for the massive, diverse, and high-quality datasets needed for training.
The Prohibitively Expensive Data Bottleneck
The foundational problem, as noted in the source, is that robotic data is "insanely expensive and brutal to collect." Unlike text or image data, which can be scraped from the internet, data for physical robots must be gathered in the real world or in highly realistic simulations. This involves:
- Physical Hardware Costs: Operating fleets of advanced robotic platforms like Figure's humanoid robot or similar systems from competitors.
- Time and Labor Intensity: Each demonstration or teleoperated task requires human time and expertise.
- Environmental Variety: To achieve generalization, data must be collected across a vast array of environments, objects, and lighting conditions—a logistical nightmare.
This data scarcity creates a chicken-and-egg problem: you need vast data to train a capable generalist robot, but collecting that data at scale requires a level of robotic capability you don't yet have.
The "Helix Stack" and the Path Forward
Adcock's reference to the "helix stack" points to Figure's internal AI infrastructure, likely a pipeline for processing and training on multimodal data (vision, language, proprioception, force/torque). The implication is that this technical stack is ready and waiting; the missing ingredient is the "pile of data" to feed it.
This framing shifts the competitive landscape. The race for general robotics is increasingly a race to acquire or generate proprietary data at scale. Strategies to overcome this bottleneck include:
- Massive Simulation: Using physics engines like NVIDIA Isaac Sim or others to generate synthetic data, though the 'sim-to-real' transfer gap remains significant.
- Fleet Learning: Deploying large numbers of robots (as companies like Covariant and Sanctuary AI are attempting) to collect parallel real-world experience.
- Cross-Modal Transfer: Leveraging data from other domains (e.g., video understanding of human activities, large language model knowledge) to bootstrap robotic understanding, an area of active research.
gentic.news Analysis
Adcock's statement is less a revelation and more a stark confirmation of the industry's pivotal constraint. It directly aligns with the trajectory we've been tracking: the center of gravity in AI has shifted from model architecture to data acquisition and curation. This was evident in the OpenAI o1 reasoning model launch, where sophisticated data synthesis and curation were highlighted as critical differentiators, and in the intense competition for video data preceding models like Sora.
For Figure, this public focus on data scarcity serves a dual purpose. It honestly frames the technical challenge while also positioning the company's massive real-world data collection efforts—through partnerships like the one with BMW Manufacturing—as a core, defensible moat. If data is the ultimate currency, then Figure's strategy of embedding its humanoids in complex industrial environments is a direct play to mint it.
However, Adcock's claim that data is the "only thing" holding back general robotics is provocative. It arguably downplays remaining challenges in energy efficiency, mechanical reliability, cost-effective actuation, and real-time safety guarantees—hardware and control problems that are non-trivial. The statement likely reflects an AI-centric worldview, where the embodied intelligence problem is reduced to a large-scale learning problem once sufficient data exists. The next 12-18 months will test this hypothesis, as companies like Figure, 1X Technologies, and Sanctuary AI scale their data collection and begin to demonstrate whether their "helix stacks" can indeed deliver on the promise of generalization.
Frequently Asked Questions
What is the "helix stack" mentioned by Figure's CEO?
While not publicly detailed, the "helix stack" is almost certainly Figure's proprietary end-to-end AI software pipeline. It likely ingests multimodal data (camera feeds, joint positions, force readings, language commands), processes it through neural network models (likely vision-language-action models), and outputs low-level control commands for the robot's actuators. Adcock's comment suggests this pipeline is architecturally sound but data-hungry.
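As a rough illustration of that data flow, and not a description of Figure's actual software, the Python sketch below shows the general shape of such a pipeline: one multimodal observation goes in, and a set of joint targets comes out. Every name here (`Observation`, `encode`, `policy`) is hypothetical, and the hand-written rule inside `policy` is only a stand-in for a trained vision-language-action model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    """One timestep of multimodal input (all field names are hypothetical)."""
    camera_pixels: List[float]    # flattened image, values in [0, 1]
    joint_positions: List[float]  # radians, one entry per actuator
    force_readings: List[float]   # newtons, e.g. from fingertip sensors
    command: str                  # natural-language instruction


def encode(obs: Observation) -> List[float]:
    """Stand-in for learned encoders: fuse all modalities into one vector."""
    if obs.camera_pixels:
        mean_pixel = sum(obs.camera_pixels) / len(obs.camera_pixels)
    else:
        mean_pixel = 0.0
    max_force = max(obs.force_readings, default=0.0)
    wants_stop = 1.0 if "stop" in obs.command.lower() else 0.0
    return [mean_pixel, max_force, wants_stop, *obs.joint_positions]


def policy(obs: Observation, max_step: float = 0.05) -> List[float]:
    """Placeholder for a trained model: return the next joint targets.

    Toy rule (an assumption for illustration): hold position on "stop",
    otherwise move each joint at most `max_step` radians toward zero.
    """
    features = encode(obs)
    if features[2] == 1.0:
        return list(obs.joint_positions)
    targets = []
    for q in obs.joint_positions:
        delta = max(-max_step, min(max_step, -q))  # clip the step size
        targets.append(q + delta)
    return targets
```

In a real system, `encode` and `policy` would be learned from data rather than written by hand, which is where the "pile of data" comes in.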
Why is robotic data so much more expensive than data for AI chatbots?
Data for chatbots (text) is abundant, existing digitally across the entire web. Robotic data is inherently physical and episodic. It requires a robot to physically interact with the world, which is slow, risks damage to the robot and environment, and needs human supervision or demonstration. Each successful or unsuccessful grasp, navigation path, or manipulation task is a costly data point.
Are simulations a viable solution to the robotic data problem?
Simulations are a crucial and widely used tool for generating large-scale, labeled data cheaply and safely. However, they are not a complete solution due to the "reality gap"—the differences between simulated physics and the messy real world. The most advanced teams use simulation for pre-training and broad skill acquisition, but still rely on real-world data for fine-tuning and closing the performance gap.
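One way to picture that split between simulated pre-training and real-world fine-tuning is a training schedule that starts on simulated episodes and gradually mixes in real ones. The Python sketch below is a minimal illustration under assumed numbers: the linear ramp, the 80% ceiling, and the function names are hypothetical choices, not a recipe attributed to any company named here.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def real_fraction(step: int, total_steps: int,
                  start: float = 0.0, end: float = 0.8) -> float:
    """Linearly ramp the share of real-world samples from `start` to `end`."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * progress


def sample_batch(sim: Sequence[T], real: Sequence[T], batch_size: int,
                 frac_real: float, rng: random.Random) -> List[T]:
    """Draw a mixed batch: `frac_real` of it from real data, the rest from sim."""
    n_real = round(batch_size * frac_real)
    batch = [rng.choice(real) for _ in range(n_real)]
    batch += [rng.choice(sim) for _ in range(batch_size - n_real)]
    rng.shuffle(batch)  # avoid ordering the batch by data source
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    sim_episodes = [f"sim{i}" for i in range(1000)]   # cheap and plentiful
    real_episodes = [f"real{i}" for i in range(20)]   # scarce and expensive
    for step in (0, 50, 100):
        frac = real_fraction(step, total_steps=100)
        batch = sample_batch(sim_episodes, real_episodes, 10, frac, rng)
        n_real = sum(item.startswith("real") for item in batch)
        print(step, n_real)
```

The imbalance in the example (1,000 simulated episodes against 20 real ones) mirrors the point above: the real data is the scarce resource, so it is spent where it matters most, late in training.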
What companies are leading in the collection of real-world robotic data?
Several companies are betting heavily on large-scale real-world data collection. This includes 1X Technologies with its Eve and Neo robots, Sanctuary AI with its Phoenix humanoid, and Covariant with its warehouse manipulation systems. Figure's partnership with BMW is a prime example of a strategy to gain access to a high-value, structured environment for data collection at scale.








