Runway's new Agent mode builds complex video stories from short text descriptions, as Ethan Mollick demonstrated on X. His one-shot attempt produced an impressive narrative, though not an error-free one.
Key facts
- Runway's Agent mode generates stories from short text
- Demonstrated by Ethan Mollick on X
- One-shot attempt produced complex narrative
- Competes with Sora, Pika, and Stable Video Diffusion
- No technical details or benchmarks released
Runway has introduced an Agent mode that generates story sequences from brief text prompts, according to a demonstration by Ethan Mollick. The feature creates multi-scene narratives, moving beyond simple text-to-video generation into automated storytelling. Mollick noted the output was 'quite impressive' for a single attempt, though it contained errors.
How It Works
Agent mode interprets a short description—such as a few sentences about a scene or plot—and produces a coherent video story. This contrasts with Runway's existing Gen-2 and Gen-3 models, which generate single clips from text or image inputs. The agent likely chains multiple generations, applying consistent characters and settings across scenes. The company has not disclosed the underlying model architecture or context window, but the results suggest planning capabilities beyond frame-by-frame generation.
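Since Runway has disclosed nothing about how Agent mode works, any implementation detail is speculation. The sketch below illustrates the chaining idea described above: decompose a prompt into scenes, then generate each clip with shared style and character references. Every function here (`plan_scenes`, `generate_clip`, `agent_generate_story`) is a hypothetical placeholder, not Runway's actual API.

```python
from dataclasses import dataclass

# Hypothetical sketch only: Runway has not disclosed Agent mode's design,
# and none of these functions correspond to a real Runway API.

@dataclass
class Scene:
    description: str       # what happens in this scene
    characters: list[str]  # character names reused across scenes

def plan_scenes(prompt: str) -> list[Scene]:
    # Stand-in for a planning step (e.g., an LLM expanding the short
    # prompt into an ordered shot list). Hard-coded here for illustration.
    return [
        Scene("A courier finds a sealed letter at dawn", ["courier"]),
        Scene("The courier crosses a crowded market", ["courier"]),
        Scene("The letter is delivered", ["courier", "recipient"]),
    ]

def generate_clip(scene: Scene, style_ref: str,
                  char_refs: dict[str, str]) -> str:
    # Stand-in for a text-to-video call, conditioned on a shared style
    # reference and per-character references so appearance stays
    # consistent across scenes. Returns a fake clip path.
    return f"clip_{abs(hash(scene.description)) % 10_000}.mp4"

def agent_generate_story(prompt: str) -> list[str]:
    scenes = plan_scenes(prompt)
    style_ref = "shared-style-token"   # assumed consistency mechanism
    char_refs: dict[str, str] = {}     # accumulated character references
    clips: list[str] = []
    for scene in scenes:
        clip = generate_clip(scene, style_ref, char_refs)
        for name in scene.characters:
            # Reuse a character's first appearance as its reference.
            char_refs.setdefault(name, clip)
        clips.append(clip)
    return clips  # a final step would stitch these into one video

print(agent_generate_story("A courier delivers a mysterious letter"))
```

The key design assumption is the plan-then-render loop: a planning pass fixes the story structure up front, and each render is conditioned on accumulated references, which is one plausible way to keep characters consistent across scenes.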
Implications for AI Video
Runway's Agent mode enters a competitive landscape dominated by OpenAI's Sora, Pika Labs, and Stability AI's Stable Video Diffusion. While Sora focuses on high-fidelity single clips, Runway's agent approach targets narrative coherence—a different axis of capability. Mollick's test used a single prompt without iterative refinement, suggesting the model can handle complex story arcs autonomously. However, the error rate remains unspecified, and Runway has not released benchmarks or pricing for the feature.
What's Missing
Runway has not published technical details, including training data, model size, or inference cost. The demonstration is a single data point from a third party, not a formal evaluation. Neither Runway's blog nor its documentation yet describes Agent mode, suggesting it may be in early beta or limited rollout. Users should expect variability in output quality until broader testing is possible.
What to watch
Watch for Runway's official documentation or blog post detailing Agent mode's architecture and availability. A public beta or API release would signal broader adoption. Also track independent evaluations comparing narrative coherence against Sora and Pika.