How to Use Claude Code to Build Game Bots and Test Real-Time Systems

A developer used Claude Code to build a bot for Ultima Online, revealing a powerful workflow for testing complex, stateful systems.

9h ago·4 min read·10 views·via hn_claude_code, medium_claude, simon_willison, medium_anthropic

The Technique — Building a Game Bot to Test Reasoning

A developer recently documented a fascinating project: using Claude Code to build an AI agent that could play the classic MMORPG, Ultima Online (UO). The goal wasn't just nostalgia; it was a stress test for autonomous reasoning in a complex, real-time environment. The game's sandbox world—with its economy, guilds, crafting, and open-ended player interactions—presents a perfect simulation of messy, stateful systems that are hard for AI to navigate.

The developer's initial architecture had Claude directly controlling the game client via simulated inputs. This proved challenging because UO is a real-time system. The latency between Claude's reasoning, the action execution, and the updated game state created a fragile loop where the agent's context was constantly stale.
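To make the stale-context problem concrete, here is a minimal sketch of how far behind a snapshot can fall while the model is still reasoning. The tick length, the age threshold, and the `Snapshot` type are illustrative assumptions, not values from the original project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """A game-state reading tagged with its capture time (hypothetical type)."""
    captured_at: float  # seconds on some shared clock
    health: int


def ticks_behind(snapshot: Snapshot, now: float, tick_s: float = 0.25) -> int:
    """How many game ticks have passed since the snapshot was taken.

    tick_s is an assumed tick length, chosen only for illustration.
    """
    return int((now - snapshot.captured_at) / tick_s)


def is_stale(snapshot: Snapshot, now: float, max_age_s: float = 0.5) -> bool:
    """True if the snapshot is too old to act on (threshold is an assumption)."""
    return (now - snapshot.captured_at) > max_age_s


# A snapshot taken at t=10.0 s; the model finishes reasoning at t=13.0 s.
snap = Snapshot(captured_at=10.0, health=80)
print(ticks_behind(snap, now=13.0), is_stale(snap, now=13.0))
```

Under these assumed numbers, a three-second reasoning step means the agent acts on a world that has already moved on by a dozen ticks, which is the fragility the direct-control design ran into.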

Why It Works — A Better Architecture with MCP

The breakthrough came from a better architecture, central to which was the Model Context Protocol (MCP). Instead of having Claude reason and act in one step, the system was split into distinct layers:

  1. A Perception Layer: A separate service, connected via MCP, continuously monitors the game state (player health, location, nearby objects). This provides Claude with a real-time, structured data feed.
  2. A Planning Layer: Claude Code, armed with this live data, reasons about high-level goals (e.g., "My health is low, so I should find a healer or recall to town").
  3. An Action Layer: Claude outputs structured commands (like "cast_spell": "Recall") to another MCP server that translates them into precise, low-level game inputs.

This decouples the slow, thoughtful reasoning from the fast, real-time requirements of the game client. Claude Code operates on a clean API of game state and high-level intents, not pixel colors or keystroke timing.
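The three layers can be sketched as three small functions with structured data passed between them. This is a simplified illustration, not the author's code: the field names, the intent names other than `cast_spell`, the low-level input strings, and the fixed rule standing in for Claude's reasoning are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class GameState:
    """Structured snapshot published by the perception layer (hypothetical fields)."""
    health: int
    max_health: int
    location: tuple
    nearby: list = field(default_factory=list)


def perceive(raw: dict) -> GameState:
    """Perception layer: turn raw client data into a clean snapshot."""
    return GameState(
        health=raw["hp"],
        max_health=raw["hp_max"],
        location=(raw["x"], raw["y"]),
        nearby=sorted(raw.get("objects", [])),
    )


def plan(state: GameState) -> dict:
    """Planning layer stand-in. In the described system Claude reasons here;
    a fixed rule is used only so the sketch runs without a model."""
    if state.health < 0.3 * state.max_health:
        if "healer" in state.nearby:
            return {"move_to": "healer"}
        return {"cast_spell": "Recall"}
    return {"wait": 1}


def act(command: dict) -> list:
    """Action layer: translate one high-level intent into low-level inputs.
    The input strings are made up for illustration."""
    intent, arg = next(iter(command.items()))
    if intent == "cast_spell":
        return [f"key:cast_macro:{arg}"]
    if intent == "move_to":
        return [f"click:target:{arg}"]
    if intent == "wait":
        return []
    raise ValueError(f"unknown intent: {intent}")


raw = {"hp": 20, "hp_max": 100, "x": 5, "y": 9, "objects": ["tree"]}
print(act(plan(perceive(raw))))
```

The planner only ever sees `GameState` and only ever emits small command dictionaries, so either side can be swapped out or tested on its own.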

How To Apply It — Testing Your Own Complex Systems

You don't need to build a game bot to use this pattern. It's a blueprint for using Claude Code to interact with any complex, stateful system. Think of it as a general-purpose testing and automation framework.

[Image: Claude commenting on the UI]

Here’s how you can adapt the approach:

1. Model Your System with MCP Servers:
For your application (a web service, a database, a local CLI tool), write a simple MCP server that exposes two key functions:

  • get_state(): Returns a structured snapshot (JSON) of the current system state.
  • execute_command(command): Takes a structured command from Claude and performs the operation.

2. Prompt Claude Code for Autonomous Testing:
With your MCP server configured in Claude Code, you can now delegate complex, multi-step testing scenarios.

# Example prompt to start an autonomous test session (-p runs Claude Code non-interactively)
claude -p "You are a QA agent for our API. Use the configured MCP server to:
1. Get the current health status of all service endpoints.
2. If the /users endpoint is down, check the database connection via the MCP command.
3. If the DB is up, restart the /users service.
4. Perform a smoke test on the restarted endpoint.
5. Report your findings and any failures."
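It can help to write the same five steps down as a deterministic procedure, so there is a baseline to compare the agent's report against. The sketch below assumes four hypothetical callables (`check_endpoints`, `check_db`, `restart`, `smoke_test`) that would wrap your real system; none of them come from the original article.

```python
def run_playbook(check_endpoints, check_db, restart, smoke_test) -> list:
    """Follow the prompt's steps in order and return a plain-text report."""
    report = []
    health = check_endpoints()                      # step 1
    report.append(f"endpoints: {health}")
    if health.get("/users") != "down":
        report.append("/users is healthy; no action taken")
        return report
    if not check_db():                              # step 2
        report.append("database unreachable; /users not restarted")
        return report
    restart("/users")                               # step 3
    report.append("restarted /users")
    passed = smoke_test("/users")                   # step 4
    report.append("smoke test passed" if passed else "smoke test FAILED")
    return report                                   # step 5


# Fake backend for illustration: /users is down, the database is up.
restarted = []
report = run_playbook(
    check_endpoints=lambda: {"/users": "down", "/orders": "up"},
    check_db=lambda: True,
    restart=restarted.append,
    smoke_test=lambda endpoint: endpoint in restarted,
)
print("\n".join(report))
```

If the agent's summary diverges from what this baseline produces for the same starting state, that difference is the first thing to investigate.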

3. Use CLAUDE.md for Reusable Agent Profiles:
Create a CLAUDE.md file in your project to define the agent's personality and goals for system interaction, making these tests repeatable.

<!-- CLAUDE.md -->
# System Reliability Agent

## Primary Goal
Autonomously monitor and maintain the health of the local development environment.

## Available Tools (via MCP)
- `env_check`: Get status of Docker containers, API ports, and database.
- `service_control`: Restart, stop, or view logs for any service.
- `run_test_suite`: Execute the integration test suite.

## Protocol
1. Always assess current state via `env_check` before acting.
2. Prefer restarting services over complex debugging during active development.
3. After any corrective action, run the relevant smoke tests.
4. Provide a concise summary of actions taken and system status.
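
One way to check that a session actually followed the protocol in the CLAUDE.md above is to audit the ordered list of tool calls it made. The sketch below is an assumption-laden illustration: it takes the tool names as a plain list (how you obtain that list depends on your setup) and treats every `service_control` call as a corrective action, which is a simplification since that tool can also just view logs.

```python
def audit_tool_calls(calls: list) -> list:
    """Return protocol violations found in an ordered list of tool names."""
    violations = []
    state_checked = False
    for i, name in enumerate(calls):
        if name == "env_check":
            state_checked = True
        elif name == "service_control":
            # Protocol rule 1: assess state before acting.
            if not state_checked:
                violations.append(f"step {i}: service_control before any env_check")
            # Protocol rule 3: run tests after any corrective action.
            if "run_test_suite" not in calls[i + 1:]:
                violations.append(
                    f"step {i}: service_control not followed by run_test_suite"
                )
    return violations


print(audit_tool_calls(["env_check", "service_control", "run_test_suite"]))
print(audit_tool_calls(["service_control"]))
```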

This project demonstrates that Claude Code's power isn't just in writing functions—it's in orchestrating them. By using MCP to give Claude clean interfaces to messy systems, you turn it into an autonomous engineer that can test, monitor, and interact with the real world of your applications.

AI Analysis

Claude Code users should view MCP not just as a way to connect tools, but as a protocol for creating **testable, stateful interfaces** for any system. The key takeaway is to stop trying to have Claude reason about raw, low-level data streams. Instead, build a thin MCP server that does three things: polls for state, structures it into JSON, and exposes a simple command API.

This pattern is immediately useful for:

  1. **Integration Testing:** Build an MCP server that wraps your test suite and application logs. Claude can read test results, analyze failures, and even attempt fixes by executing commands to modify code and re-run tests.
  2. **Local Dev Environment Monitoring:** Create an MCP server that checks Docker, processes, and ports. Claude can act as a live site reliability engineer (SRE) for your local setup, automatically restarting services that crash.
  3. **Legacy System Interaction:** Wrap a difficult-to-automate legacy CLI or GUI tool with an MCP server. Claude can then perform complex workflows by calling your predefined commands instead of navigating menus it could not reliably interpret directly.

Running `claude -p` with a task prompt (non-interactive mode) becomes your entry point for launching these autonomous agents. Pair it with a well-crafted `CLAUDE.md` that defines the agent's role, and you have a repeatable, well-scoped operator for your system.

Original source: usize.github.io
