
How Alex Finn Built a Complete Game in 1 Hour Using Codex's /goal Command

Alex Finn ran a single /goal prompt and let Codex build an extraction shooter game — assets included — over one autonomous hour. Here's how it worked.

MindStudio Team

Alex Finn Ran /goal for One Hour and Got a Complete Extraction Shooter

Alex Finn typed a single /goal command, walked away, and came back to a fully playable extraction shooter — complete with auto-generated assets, top-down sprites, and working game logic. The whole run took under an hour. No manual asset creation. No back-and-forth prompting. One command, one agentic loop, one finished game.

That demo, shared publicly and circulating among Codex users, is the clearest illustration yet of what the /goal command actually does: it activates what Codex internally calls “Ralph loops” — multi-hour (or multi-day) autonomous agentic runs where the model keeps working without waiting for your approval at each step. The game Alex built isn’t a tech demo with placeholder squares. The assets look deliberately stylized — static sprites, but coherent and intentional. The gameplay is legible. It’s the kind of thing that would have taken a solo developer a weekend.

This post walks through how that works mechanically, what you need to replicate it, and where the real failure modes are.


What the /goal Command Actually Does

Most Codex sessions are conversational: you prompt, it responds, you approve, it executes. The /goal command breaks that pattern entirely.


When you type /goal followed by a description, Codex enters an extended agentic loop. It plans, executes, evaluates its own output, and iterates — without surfacing approval requests unless it genuinely hits a blocker it can’t resolve. The loop can run for hours. In the Alex Finn demo, it ran for roughly 60 minutes and produced a complete game.

The mechanism behind this is what the Codex team calls Ralph loops. The name is internal, but the behavior is observable: the agent sets sub-goals, tracks progress against them, and keeps going. It’s closer to a background job than a chat session.

One important detail: /goal is not enabled by default. You have to edit your TOML configuration file to turn it on. It’s an early-access feature, which is probably why it’s gated — an agent running autonomously for hours with full file system access is a significant capability to hand someone without a confirmation step.


Before You Start

You need a few things in place before /goal will do anything useful for a game build.

A ChatGPT subscription. Codex is part of the ChatGPT ecosystem. The free tier gives you limited Codex access; the $20/month plan is the practical minimum. If you’re hitting rate limits quickly, the Pro tier is worth considering — the rate limits panel in Codex (visible under Settings) shows a 5-hour session reset and a weekly reset, with percentage remaining for each.

The Codex desktop app. You can use Codex via VS Code extension or terminal, but the desktop app gives you the in-app browser, the visual mouse cursor for QA, and the pet UI that shows what the agent is working on while you’re in another window. For a long autonomous run, being able to glance at the pet and see current status is genuinely useful.

TOML file edit to enable /goal. This is the step most people miss. The /goal command exists in the slash command list, but it won’t activate Ralph loops until you’ve edited the TOML config to enable it. The exact path depends on your OS, but it’s in your Codex configuration directory. Look for a [features] or [experimental] section and enable the goal/Ralph loop flag. If you’re not sure where the file lives, ask Codex directly — it knows its own config structure.
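The exact section and key names for this flag aren't something to take on faith, so treat the snippet below as a hypothetical sketch of what the edit looks like rather than the literal config; verify the real flag name in your own file or by asking Codex.

```toml
# ~/.codex/config.toml — hypothetical sketch; the actual section and key
# names may differ in your Codex version. Confirm before relying on this.
[features]
goal = true   # enable /goal and long-running Ralph loops
```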

The image generation skill enabled. This is what made Alex Finn’s demo work so well. Before running /goal, enable the image generation skill (accessible via /skill-creator or by installing it from the global skills directory at ~/.codex/skills/). With that skill active, Codex will generate game assets — sprites, backgrounds, UI elements — using GPT Image 2 as part of the autonomous build. Without it, you get placeholder geometry. With it, you get actual art.

Full access permissions. By default, Codex pauses to ask permission before taking certain actions. For a multi-hour autonomous run, that’s a problem — it’ll stall waiting for your approval. Go to Settings > General > Auto Review and toggle to Full Access. The UI flags this in orange because it’s a meaningful permission grant. For a game build in an isolated project folder, it’s fine. For anything touching production systems or sensitive files, think carefully.


Running the Build

Step 1: Create a project and set up agents.md

Open a new project in Codex and point it at a fresh directory. Before you do anything else, create an agents.md file. This is Codex’s equivalent of Claude Code’s CLAUDE.md — it gets read at the start of every new chat session and gives the agent context about what it’s building and why.

For a game build, your agents.md should include: the genre and core mechanic, the visual style you want, any constraints (tile size, color palette, top-down vs. side-scrolling), and the target output (playable in browser, specific framework, etc.). The more specific this file is, the more coherent the autonomous output will be.
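For the extraction-shooter example, a minimal agents.md covering those points might look like this. The contents are illustrative, not taken from Alex Finn's actual run:

```markdown
# agents.md — project context, read at the start of every session

## What we're building
A top-down extraction shooter, playable in the browser
(plain JavaScript + HTML canvas, served from a local dev server).

## Core mechanic
Collect loot scattered across the map and reach the extraction
point before the timer expires. Three enemy types with patrol AI.

## Visual style
Stylized static sprites, 32x32 tiles, limited muted palette.

## Constraints
- No external game engine; keep dependencies minimal
- All art generated with the image generation skill
```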

Now you have a project with persistent context that survives across sessions.

Step 2: Enable the image generation skill

Type /skill-creator in the chat and follow the prompts to create or install the image generation skill, or navigate to ~/.codex/skills/ and confirm the skill file is present. Skills in Codex are just markdown files — a recipe that tells the agent how to approach a specific task. The image generation skill tells Codex to use GPT Image 2 for asset creation during builds.
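The actual contents of the image generation skill will vary by install, but since a skill is just a markdown recipe, it's roughly shaped like this hypothetical sketch:

```markdown
# Skill: image-generation

## When to use
Whenever the build needs visual assets (sprites, backgrounds, UI).

## How
1. Derive an asset list from the game spec in agents.md.
2. Generate each asset with GPT Image 2, reusing one shared
   style prompt so the whole set stays visually coherent.
3. Save outputs to assets/ and reference them from the code.
```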

Also worth knowing: because skills are plain markdown files in a local directory, the same skill works across Claude Code, Codex, OpenClaw, and Cursor. If you’ve already built skills for other tools, they’re portable. The Claude Code skills system for content repurposing follows the same pattern — markdown recipes that any agent harness can read.

Now you have image generation available as a tool the agent can call during the build.

Step 3: Use Plan mode first

Before running /goal, toggle Plan mode on. This prevents Codex from executing anything — it brainstorms and asks clarifying questions only. Use this to nail down the game design before you hand it off to an autonomous loop.

Describe your game in detail: extraction shooter, top-down perspective, player collects loot and extracts before enemies close in, three enemy types, procedurally placed obstacles. The more specific your plan-mode conversation, the better the /goal run will go. Ask Codex to surface any ambiguities. Resolve them. Then turn Plan mode off.

Now you have a clear spec that the agent will execute against.
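None of the game code here comes from the demo, but it helps to see how concrete the spec already is: the win/lose rule it describes reduces to logic like this minimal Python sketch (all names hypothetical):

```python
from dataclasses import dataclass


@dataclass
class ExtractionState:
    """Minimal model of the spec's win/lose rule (hypothetical names)."""
    timer: float            # seconds left in the extraction window
    loot: int = 0           # items collected so far
    at_extraction: bool = False
    alive: bool = True

    def tick(self, dt: float) -> None:
        """Advance the clock; the timer never goes below zero."""
        self.timer = max(0.0, self.timer - dt)

    def outcome(self) -> str:
        if not self.alive:
            return "dead"
        if self.at_extraction and self.timer > 0:
            return "extracted"   # made it out with whatever loot we carry
        if self.timer == 0:
            return "timed_out"
        return "in_progress"


state = ExtractionState(timer=120.0)
state.loot += 3          # picked up some loot along the way
state.tick(90.0)         # 30 seconds remain
state.at_extraction = True
print(state.outcome())   # → extracted
```

A spec that pins down rules this precisely is exactly what lets an unsupervised loop self-evaluate: the agent can test its own build against unambiguous win/lose conditions instead of guessing at intent.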

Step 4: Run /goal

Type /goal followed by your game description. Something like:

/goal Build a complete top-down extraction shooter. Player spawns in a map, collects loot items scattered around the level, and must reach the extraction point before a timer expires. Three enemy types with basic patrol AI. Generate all visual assets using the image generation skill. Output should be playable in the browser via a local server.

Then wait. The agent will start planning, generating assets, writing game logic, testing, and iterating. The pet at the bottom of your screen will show you what it’s working on. The context window bar at the bottom of the chat will fill as the session progresses — watch it, but don’t panic until it’s near capacity.

For a game of the complexity Alex Finn built, expect 45–90 minutes. Don’t interrupt unless the pet has been idle for more than 10 minutes or you see an error the agent is clearly stuck on.

Now you have a running autonomous build session.

Step 5: Review and iterate

When the loop completes, Codex will surface what it built and where it’s running (typically a local server). Open the in-app browser and test it. If you want a more systematic QA pass, use the browser use feature — type something like “use your browser use skill to click through every game state and tell me what breaks.” You’ll see the visual mouse cursor move through the game, and Codex will document issues it finds.

For anything you want to fix, open a side chat (the “open side chat” button in the top bar) so you can ask questions or give instructions without interrupting the main session. This is the same as Claude Code’s / command for parallel conversations.

Now you have a tested, iterable game build.


Where This Goes Wrong

The TOML edit is easy to miss

If /goal doesn’t seem to activate an extended loop — if it just runs a normal session and stops — you probably haven’t edited the TOML file. This is the most common failure mode. The command exists in the slash menu regardless of whether the feature is enabled, which is confusing.

Automations default to GPT-5.2

This one burned at least one user for 40 minutes. If you set up an automation (via the Automations tab) to run your game build or any follow-up task, check the model setting manually. Codex automations silently default to GPT-5.2 instead of GPT-5.5. GPT-5.5 is significantly more capable and more token-efficient. A task that takes 20 minutes on 5.5 can stall or fail on 5.2. Always set the model explicitly in the automation settings — don’t trust the default.

Context window fills on complex builds

The context window bar at the bottom of the chat shows fill percentage. For a multi-hour /goal run on a complex game, you may see this approach capacity. Codex auto-compacts (similar to Claude Code’s behavior with Opus), but if you’re building something very large, consider breaking the goal into phases: first the core game loop, then assets, then enemy AI, then polish. Each phase as a separate /goal run with a fresh context.
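A phase split for the shooter might look like this, with each line run as its own /goal in a fresh session (wording illustrative):

```text
/goal Phase 1: core loop — movement, loot pickup, extraction timer, win/lose states. Placeholder art is fine.
/goal Phase 2: replace placeholders with generated assets via the image generation skill; keep gameplay unchanged.
/goal Phase 3: add the three enemy types with patrol AI and collision.
/goal Phase 4: polish — HUD, difficulty tuning, and a browser QA pass.
```

Later phases lean on agents.md for continuity, since each fresh session rereads it for project context.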

Full access + wrong directory

With Full Access enabled, Codex can touch any file on your system. Make sure your project directory is isolated before running a long autonomous session. Don’t run /goal from a directory that contains anything you’d be upset to have modified or deleted. The agent is generally careful, but “generally” isn’t “always.”

The image generation skill isn’t active

If your game comes back with placeholder rectangles instead of actual sprites, the image generation skill wasn’t loaded. Check ~/.codex/skills/ and confirm the skill file is there. You can also type /model to see what’s currently active and /skill-creator to add missing skills mid-session.


What to Build Next

The extraction shooter demo is a proof of concept for a broader pattern: give /goal a complete spec, enable the right skills, and let it run. The same approach works for other game genres, for web apps, for data pipelines.

The GitHub + Vercel deployment pipeline that Codex supports natively makes this more useful than it sounds. Once your game is built locally, you can push it to a GitHub repo and have Vercel auto-deploy it — Codex handles the git commands. The workflow is: build locally, test with browser use, push to GitHub, Vercel picks it up automatically. If you want to understand how that kind of autonomous pipeline connects to longer-running agent infrastructure, the post on building a 24/7 AI trading agent with Claude Code covers similar territory — scheduled runs, autonomous execution, and the failure modes that emerge when agents run without supervision.

For the game assets specifically, GPT Image 2 is doing real work here. It’s not just generating decorative images — it’s producing sprites that fit a consistent visual style because the image generation skill gives it context about the game’s aesthetic. If you want to understand the full range of what GPT Image 2 can do in a build context, the GPT Image 2 use cases breakdown is worth reading alongside this.

The /goal pattern also connects to what Andrej Karpathy has described as AutoResearch loops — agents that run experiments, measure results, and iterate without human checkpoints. The AutoResearch loop post applies that pattern to business optimization, but the underlying mechanism is the same as what Ralph loops do for software builds.

If you’re thinking about where this kind of autonomous build capability fits in a broader agent stack, MindStudio’s approach is relevant: 200+ models, 1,000+ integrations, and a visual builder for chaining agents and workflows — so the game-building agent can hand off to a deployment agent, a QA agent, or a notification agent without you writing the orchestration code.

One more thing worth knowing: the skills system that makes the image generation work in Codex is the same system that makes it portable. Those markdown recipe files in ~/.codex/skills/ are readable by Claude Code, Cursor, and OpenClaw. If you build a good game-development skill — one that encodes your preferred asset style, your code structure conventions, your testing approach — that skill travels with you across tools. The spec becomes the source of truth. Tools like Remy take that idea further: you write an annotated markdown spec and it compiles a complete full-stack application from it — TypeScript backend, database, auth, deployment — treating the spec as the artifact and the code as derived output.

The Alex Finn demo is a single data point, but it’s a specific one. One hour, one command, one complete game. The infrastructure to replicate it is available today, and the failure modes are known and fixable. That’s a reasonable place to start.

Presented by MindStudio
