Ethan Mollick Uses GPT-4o Pro to Research Roman Aqueduct Labor Displacement, Finds Exponential Displacement Followed by S-Curve

Wharton professor Ethan Mollick had GPT-4o Pro research historical labor displacement from Roman aqueducts. The analysis found exponential growth with a characteristic doubling time, followed by S-curve saturation. The experiment suggests that AI can conduct historical economic analysis when paired with human verification.

Ala AYADI & AI Research Desk · Mar 16, 2026 · 3 min read · AI-Generated

Source: x.com via @emollick · Single Source

What Happened

Wharton professor and AI researcher Ethan Mollick conducted an experiment using OpenAI's GPT-4o Pro to research the historical impact of Roman aqueducts on labor displacement. Mollick prompted the AI to analyze this specific historical case study and produce a "METR-style graph" — referring to the visual style used by the AI research organization METR (formerly ARC Evals) in their AI capability forecasting reports.

The AI generated a graph showing labor displacement over time, with Mollick noting two key findings from the analysis:

The displacement followed an exponential pattern with a specific "doubling time" (the exact timeframe wasn't specified in the tweet)
The exponential growth eventually transitioned into an S-curve pattern as displacement saturated
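
The exponential-then-saturation shape described in these two findings is what a logistic curve produces. The following is a minimal Python sketch under stated assumptions, not the analysis Mollick or the model actually ran. The parameter values (CEILING, RATE, MIDPOINT) are hypothetical and chosen only for illustration, since the tweet did not give a doubling time. Far below the midpoint the curve grows almost exponentially with doubling time ln(2)/rate, and near the ceiling growth flattens out.

```python
import math


def logistic(t, ceiling, rate, midpoint):
    """Value of a logistic (S-curve) at time t."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


def early_doubling_time(rate):
    """Doubling time of the near-exponential early phase: ln(2) / rate."""
    return math.log(2) / rate


# Hypothetical parameters for illustration only; none come from Mollick's graph.
CEILING = 1.0     # saturation level (fraction of the work displaced)
RATE = 0.1        # growth rate per year in the early phase
MIDPOINT = 100.0  # year at which displacement reaches half the ceiling

if __name__ == "__main__":
    d = early_doubling_time(RATE)
    print(f"early doubling time: {d:.2f} years")
    for t in (0, 50, 100, 150, 200):
        now = logistic(t, CEILING, RATE, MIDPOINT)
        later = logistic(t + d, CEILING, RATE, MIDPOINT)
        # The ratio is close to 2 early on and falls toward 1 as the curve saturates.
        print(f"t={t:>3}  value={now:.5f}  growth over one doubling time: x{later / now:.3f}")
```

Running the script prints the growth ratio over one doubling time at several points. It stays near 2 while the curve is far below its midpoint and drifts toward 1 as displacement approaches the ceiling, which is the sense in which an exponential turns into an S-curve.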

Mollick added two interpretive lessons from this historical case:

"Displacing terrible work is good" — referring to the grueling manual water-carrying labor that aqueducts eliminated
"All exponentials become s-curves in the end" — noting that even rapid technological displacement eventually reaches saturation points

Context & Verification

Mollick explicitly noted that he performed "spot checks" on the AI's research and found it "seemed accurate." This verification step is significant — while the AI conducted the initial research and analysis, human expertise was still required to validate the findings.

The experiment builds on Mollick's ongoing work exploring how advanced AI models can augment research capabilities. As a professor at Wharton who frequently writes about AI's impact on work and education, Mollick has been testing frontier models' abilities to assist with complex analytical tasks that traditionally require specialized historical and economic expertise.

The reference to "METR-style" graphs connects this historical analysis to contemporary AI forecasting. METR (formerly ARC Evals) produces influential reports tracking AI capabilities, often using exponential growth curves to model progress. By applying this analytical framework to historical technology adoption, Mollick creates a bridge between past technological transitions and current AI development trajectories.

Technical Note on the Model

Mollick specified using "GPT-5.4 Pro" in his tweet, which appears to be a typographical error for GPT-4o Pro — OpenAI's current flagship multimodal model. The "o" in GPT-4o stands for "omni," referring to its ability to process and generate text, audio, and visual content. The Pro version offers higher rate limits and priority access to new features.

This model choice is significant because GPT-4o represents one of the most capable publicly available AI systems for complex reasoning tasks. Its ability to research historical economic patterns suggests growing competency in synthesizing information across domains — in this case, combining historical data, economic theory, and data visualization.


AI-assisted reporting. Generated by gentic.news from multiple verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala AYADI.


AI Analysis

Mollick's experiment demonstrates several important developments in AI capabilities. First, GPT-4o Pro can now conduct meaningful historical economic analysis that produces interpretable insights: it does not just retrieve facts, but identifies patterns (exponential growth transitioning to S-curves) and generates appropriate visualizations. This represents a step beyond basic research assistance toward genuine analytical partnership.

The human verification component ('spot checks seemed accurate') remains crucial. While AI can now structure research questions, analyze data, and present findings, domain expertise is still needed to validate conclusions. This suggests a near-future workflow where AI handles the heavy lifting of data gathering and initial analysis, while human experts focus on verification, interpretation, and applying nuanced judgment.

Practitioners should note the specific prompt engineering implied here: asking for a 'METR-style graph' immediately frames the analysis within a particular analytical tradition. This demonstrates how effective AI research assistance requires both domain knowledge (to ask the right questions) and familiarity with AI capabilities (to structure prompts effectively). The most powerful applications will come from experts who can bridge these domains.
