Stuart Jeff — Writing

Introducing wood-fired-tasks

Wed, 27 May 2026 00:00:00 GMT

This week I open-sourced wood-fired-tasks. It is now public at github.com/Wood-Fired-Games/wood-fired-tasks and on npm. MIT-licensed. The release tag is v1.12. It is a self-hostable task-tracking system with three peer interfaces — a REST API, a CLI, and an MCP server — built specifically so AI coding agents can read and write the same backlog as the humans supervising them, without anyone stepping on anyone else.

I have been using it daily for three months and I want to tell you both why it exists and what it can do for you.

Where it came from

For a good bit of my twenty-five years in games I have been a creative director and a game designer, and in both of those roles the actual unit of work was often more about managing Jira tickets than writing code. I have created thousands of them. I have assigned thousands of them. I have reviewed the results of thousands of them. Every shipped game I have ever worked on existed as a column of tickets long before it existed as a build. The ticket is not paperwork around the work; for a creative lead, the ticket is often the shape the work takes.

When AI agents started getting genuinely useful in late 2025, the workflow I wanted to build was obvious to me. Take a ticket out of the tracker. Hand it to an agent. Let the agent do the work. Review the result the same way I would review a teammate's PR. Close the ticket. The unit of work would not change at all; the thing doing the work would. I spent the back half of 2025 trying to make that exact workflow run against the off-the-shelf trackers I had spent two decades inside — Jira, Trello, Notion, GitHub Issues — and the agents kept stumbling. They would hit the wrong API. They would lose track of which board they were operating on. They would fail to authenticate in ways the official integrations were not built to recover from. They would just decline to take an action without explaining why. They would ignore formatting requirements. The integrations existed. None of them were reliable enough to be load-bearing.

In early 2026 I gave up on bridging to the existing tools and decided to build a tracker that was agent-first from the foundation up. It needed to be CLI-friendly, because by then I was spending more time in terminals than I had since the early 2000s. And because my own day splits between a Windows machine where the front-end game development happens and a Linux box where the backend services and most of the AI experimentation live, the tracker needed to coordinate across machines too — across operating systems, across repos, across the agents themselves.

The first version of that coordination was undignified. I copy-pasted agent output between Slack threads. A Claude session on the Windows side would summarize what it had done; I would paste the summary into a thread on the Linux side and let a second Claude pick it up. The second version was a network file share where both machines could read the same JSON files. The third version was a recognition: a real task tracker, designed for this, was a better place to do these handoffs than any general-purpose messaging or filesystem layer. Tasks already have status. Tasks already have dependencies. Tasks already have the right shape for cross-machine, cross-repo, cross-agent work. They just needed an MCP server and a real claim protocol and the right CLI on top.

That is when the project started to compound. Around the same time I was learning from the GSD orchestration framework I run as a substrate (github.com/gsd-build/get-shit-done) that agentic work can be made systemic, verifiable, and auditable. So I pushed the tracker hard in that direction. Structured verdicts on every closed task. Read-only graders that physically cannot edit the code they are grading. Dependency-aware execution that refuses to dispatch a parallel loop against a cyclic graph. Generator/critic separation enforced by the agent definitions themselves. The result is what I open-sourced today.

The original vision — taking work directly out of an existing Jira or Trello board and dispatching it into agents — is still something I'm interested in exploring but is largely unrealized. What exists today is the substrate that vision will eventually run on top of, and it has been load-bearing enough to ship several thousand commits across the Wood Fired ecosystem in the past several months. That is the wedge the rest of this post unpacks.

What it does

Wood Fired Tasks is a single Node service that exposes three peer surfaces over one SQLite service layer:

A REST API with 47 route handlers (40 live in a default deploy).

A CLI with 31 commands: tasks create, tasks list, tasks claim, the usual shape.

An MCP server with 22 tools, so Claude Code, Cursor, Gemini, and Codex can all read and write tasks directly from inside an agent session.

All three surfaces hit the same data. Anything an agent does over MCP is immediately visible to the CLI; anything you do from the CLI is immediately visible to the agents. There is no "agent view" and no "human view." There is one project, one truth, and three ways to talk to it.

The coordination primitives are first-class.

Atomic task claiming with optimistic locking. When a task is unclaimed, any number of agents can race to claim it. Exactly one wins. Every other agent gets a 409-equivalent error and moves on to a different task. This is verified end-to-end in CI: twenty concurrent agents race for the same task, and the result is one success, nineteen conflicts, and zero unexpected errors. Stale claims auto-release after thirty minutes so a crashed agent does not lock the work forever.
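A minimal sketch of that claim race, assuming a single hypothetical tasks table with id, status, and assignee columns. The project's real schema and SQL are not shown in this post; the conditional UPDATE below is one common way to get "exactly one winner" out of SQLite, and the names init_db, claim_task, and race are illustrative only.

```python
import os
import sqlite3
import tempfile
import threading


def init_db(path):
    # Hypothetical minimal schema: one table holding one unclaimed task.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT NOT NULL, assignee TEXT)"
    )
    conn.execute("INSERT INTO tasks (id, status, assignee) VALUES (1, 'open', NULL)")
    conn.commit()
    conn.close()


def claim_task(path, task_id, agent):
    """Return True if `agent` won the claim, False if the task was already claimed."""
    conn = sqlite3.connect(path, timeout=30, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # take the write lock before touching the row
        cur = conn.execute(
            "UPDATE tasks SET assignee = ?, status = 'in_progress' "
            "WHERE id = ? AND assignee IS NULL",
            (agent, task_id),
        )
        won = cur.rowcount == 1  # zero rows changed means someone else got there first
        conn.execute("COMMIT")
        return won
    finally:
        conn.close()


def race(path, task_id, n_agents):
    """Let n_agents threads race for one task; return (wins, conflicts)."""
    results = []
    lock = threading.Lock()

    def worker(i):
        won = claim_task(path, task_id, f"agent-{i}")
        with lock:
            results.append(won)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_agents)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results.count(True), results.count(False)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db = os.path.join(tmp, "tasks.db")
        init_db(db)
        print(race(db, 1, 20))
```

With twenty threads the sketch should report one win and nineteen conflicts, because the WHERE clause only matches while assignee is still NULL. A caller would map the False result to the 409-equivalent error.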

Workflow automation. When a task closes, the system automatically unblocks every task whose blocked_by edge pointed at it. When the last subtask of a parent task closes, the parent auto-completes. You build the dependency graph once; the executor figures out the order.
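The two automation rules can be sketched in memory as follows. This is an illustration under stated assumptions, not the service's code: the dictionary layout, the status names ("open", "blocked", "done"), and the close_task helper are all hypothetical.

```python
def close_task(tasks, task_id):
    """Close a task, then apply the two automation rules described above."""
    task = tasks[task_id]
    task["status"] = "done"

    # Rule 1: remove the satisfied blocked_by edge; unblock tasks with no blockers left.
    for other in tasks.values():
        if task_id in other["blocked_by"]:
            other["blocked_by"].discard(task_id)
            if not other["blocked_by"] and other["status"] == "blocked":
                other["status"] = "open"

    # Rule 2: when the last subtask closes, the parent auto-completes.
    parent_id = task["parent"]
    if parent_id is not None and tasks[parent_id]["status"] != "done":
        siblings = [t for t in tasks.values() if t["parent"] == parent_id]
        if all(s["status"] == "done" for s in siblings):
            close_task(tasks, parent_id)  # recurse so the parent's close cascades too


# Task 1 is a parent with subtasks 2 and 3; task 3 waits on 2; task 4 waits on 1.
tasks = {
    1: {"status": "open", "parent": None, "blocked_by": set()},
    2: {"status": "open", "parent": 1, "blocked_by": set()},
    3: {"status": "blocked", "parent": 1, "blocked_by": {2}},
    4: {"status": "blocked", "parent": None, "blocked_by": {1}},
}

close_task(tasks, 2)
after_first = {tid: t["status"] for tid, t in tasks.items()}
close_task(tasks, 3)
after_second = {tid: t["status"] for tid, t in tasks.items()}
```

Closing task 2 opens task 3. Closing task 3 completes the parent, and the parent's close in turn opens task 4.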

Real-time SSE events. Every state change emits an event on a Server-Sent Events stream. Dashboards, second agents, and downstream automation all subscribe to one stream and react in real time. Closed tasks, status transitions, new dependencies, claim conflicts — all of it.
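For readers who have not consumed a Server-Sent Events stream before, here is a small sketch of parsing the wire format. The framing rules (event and data fields, blank-line dispatch, comment lines starting with a colon) follow the SSE standard; the event names and payload fields in the sample are assumptions, since this post does not list the tracker's actual event vocabulary.

```python
import json


def parse_sse(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # the format strips one leading space
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical sample stream; real event names may differ.
sample = [
    "event: task.claimed\n",
    'data: {"task_id": 337, "assignee": "agent-3"}\n',
    "\n",
    ": keep-alive\n",
    "event: task.closed\n",
    'data: {"task_id": 337}\n',
    "\n",
]

events = [(name, json.loads(payload)) for name, payload in parse_sse(sample)]
```

A dashboard or second agent would feed this parser from a streaming HTTP response instead of a list and react to each pair as it arrives.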

The most elegant of the 22 MCP tools, and the one that captures the design philosophy in a sentence, is claim_task:

Atomically claim an unassigned task, setting assignee and transitioning status to in_progress. Returns 409-equivalent error if already claimed.

That is one tool definition. That is the entire story of how N agents work the same backlog without colliding.

Capturing what the agent notices

There is a value pattern in this design that surprised me, and it might be the single best argument for putting a tracker behind an MCP server rather than behind any other interface.

When I give an agent a research task or a debugging task, that agent has a primary objective. It is trying to answer a specific question or fix a specific bug. The context window it is working in fills up fast with whatever serves that objective. Anything tangential — a smell in a neighboring file, an undocumented behavior, a brittle test, a missing migration, a TODO comment that has aged badly, a friction point in the build — would normally fall out of the agent's context the moment its focus shifts. Context churn eats observations like that for breakfast. The agent is doing exactly what you asked it to do; everything it noticed on the way is gone the next session.

But because the tracker is an MCP server, the agent can pause for one tool call, file the observation as a new task in the right project, tag it appropriately, link it as a follow-up to whatever it was actually working on, and keep going. No interruption to the primary work. No working-memory cost beyond a single round-trip. The observation lands in the database with a timestamp, the file reference, and the agent's reasoning preserved as the task body. I review the new tasks later, decide which ones to act on, and dispatch a worker against them when it makes sense.

That single pattern has become one of the most valuable things wood-fired-tasks does for me. Technical debt that would have been lost to context churn now accumulates as a tracked backlog. Bugs the agent noticed while doing something else become real bug reports instead of vanished neurons. The result is a queue I can plan against, contributed to by every agent session, without ever asking an agent to step outside its assigned scope. The tracker is doing classic backlog-grooming work, except the contributors filing the tickets are the agents themselves while they are nominally doing other things. That alone has made this a valuable tool.

The orchestration layer

A task tracker with twenty-two MCP tools and a REST API would already be useful. The thing that turns it from "tracker the agents can talk to" into "autonomous backlog drain" is the set of /tasks:* Claude Code skills shipped alongside the service.

/tasks:loop is the sequential autonomous executor. You point it at a project and it drains the backlog. For each task: pick the highest-priority open task, claim it, plan the validation depth, dispatch a fresh subagent to implement the fix, independently re-run build and test to verify the subagent's claim, commit and push only the named files, dispatch a separate tasks-verifier subagent to grade the closed task against its acceptance criteria, close the task only on PASS, and emit a kill-safe LOOP-RUN.md audit artifact. The mental model the skill teaches:

Think of yourself as the foreman, not the carpenter. Each task: hand a self-contained brief to a fresh subagent (the carpenter), then independently re-check the work before signing it off. Your context only holds the plan, summaries, and verification results — never raw build logs, file scans, or trial-and-error.

/tasks:loop-dag is the wave-by-wave parallel version. Same primitives, but it computes the dependency frontier — the set of open tasks whose blockers are all satisfied — and dispatches a worker subagent per frontier task in parallel under a configurable concurrency cap. When the wave finishes, it runs the verifier per worker, runs an integration-auditor per file overlap, then recomputes the frontier and dispatches the next wave. The mental model:

Think of yourself as a foreman scheduling a build crew across independent foundations on the same site. Each foundation (wave) is a set of tasks that have no remaining dependencies. While the wave's workers are pouring concrete in parallel, you (the orchestrator) plan the next wave. You never let a worker start before its supporting foundation has cured — that's what blocked_by enforces.
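Dispatching one wave under a concurrency cap can be sketched with a thread pool. This is a simplified stand-in: the real skill dispatches subagents rather than threads, and run_wave and fake_worker are hypothetical names used only for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_wave(task_ids, worker, concurrency):
    """Run worker(task_id) for each task in one wave, at most `concurrency` at a time.

    Returns {task_id: result} once every worker in the wave has finished.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(worker, task_ids))  # blocks until the whole wave is done
    return dict(zip(task_ids, results))


# Demo worker that records the highest number of workers running at once.
running = 0
peak = 0
lock = threading.Lock()


def fake_worker(task_id):
    global running, peak
    with lock:
        running += 1
        peak = max(peak, running)
    time.sleep(0.05)  # stand-in for a subagent doing real work
    with lock:
        running -= 1
    return f"task {task_id} done"


results = run_wave([334, 335, 336, 340], fake_worker, concurrency=2)
```

Because the call returns only after the whole wave has finished, the orchestrator has a natural point to run verification before it recomputes the frontier.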

Before either loop dispatches anything, it asks the topology_check MCP tool to classify the project. The tool walks the task_dependencies graph and returns one of three labels: FLAT (no dependency edges; parallel-safe; use /tasks:loop), DAG (acyclic with edges; use /tasks:loop-dag and let Kahn's algorithm order the waves), or DAG_CYCLIC (refuse to run; cycles must be broken first; no flag overrides this). The classifier is its own verification surface. You cannot accidentally run a parallel loop against a graph with a cycle.
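A sketch of a three-way classifier in that spirit. The labels come from the text above; everything else is an assumption, including the function signature, the (blocker, blocked) edge representation, and the choice to detect cycles with Kahn's algorithm, which this post only attributes to wave ordering.

```python
from collections import defaultdict, deque


def topology_check(task_ids, edges):
    """Classify a dependency graph as FLAT, DAG, or DAG_CYCLIC.

    edges is a list of (blocker, blocked) pairs.
    """
    if not edges:
        return "FLAT"

    indegree = {t: 0 for t in task_ids}
    children = defaultdict(list)
    for blocker, blocked in edges:
        indegree.setdefault(blocker, 0)
        indegree[blocked] = indegree.get(blocked, 0) + 1
        children[blocker].append(blocked)

    # Kahn's algorithm: repeatedly remove nodes that have no unmet blockers.
    queue = deque(t for t, d in indegree.items() if d == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Every node gets removed exactly when the graph has no cycle.
    return "DAG" if visited == len(indegree) else "DAG_CYCLIC"
```

A caller would refuse to dispatch on "DAG_CYCLIC" and pick the sequential or wave-based loop for the other two labels.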

The frontier correctness is fixture-tested. Edges {334→337, 335→337, 337→338, 337→339} must produce waves {334, 335}, then {337}, then {338, 339}. If that ordering ever breaks, CI fails and the change is rejected.
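The fixture above can be reproduced with a short frontier computation. This is a sketch, not the project's test: frontier_waves is a hypothetical name, and edges are assumed to be (blocker, blocked) pairs.

```python
def frontier_waves(task_ids, edges):
    """Group tasks into waves; each wave holds the tasks whose blockers are all done."""
    blockers = {t: set() for t in task_ids}
    for blocker, blocked in edges:
        blockers.setdefault(blocker, set())
        blockers.setdefault(blocked, set()).add(blocker)

    done, waves = set(), []
    remaining = set(blockers)
    while remaining:
        # The frontier: remaining tasks with no unmet blockers.
        wave = sorted(t for t in remaining if blockers[t] <= done)
        if not wave:
            raise ValueError("cycle detected: no remaining task is unblocked")
        waves.append(wave)
        done.update(wave)
        remaining.difference_update(wave)
    return waves


fixture_edges = [(334, 337), (335, 337), (337, 338), (337, 339)]
waves = frontier_waves([334, 335, 337, 338, 339], fixture_edges)
```

On the fixture edges this yields [[334, 335], [337], [338, 339]], matching the ordering the text requires. A cyclic input raises instead of looping forever.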

The graders that grade the graders

The orchestration discipline that makes the loops trustworthy is generator/critic separation. The agent that wrote the code never grades the code. A separate, read-only agent grades it.

tasks-verifier is dispatched after every closed task. Its tool list is deliberately restricted:

tools: Read, Grep, Glob, Bash,
  mcp__wood-fired-tasks__get_task,
  mcp__wood-fired-tasks__get_comments,
  mcp__wood-fired-tasks__get_dependencies,
  mcp__wood-fired-tasks__list_tasks,
  mcp__wood-fired-tasks__list_projects

No Edit, no Write, no mutating MCP tool. The verifier physically cannot alter the code it is grading or change the task it is grading against. Bash is allowlisted to read-only commands — git log, git diff, git show, the project's test and build commands, cat, head, tail, sqlite3 SELECT-only — and explicitly denies everything that could mutate state. It runs with hard bounds: thirty tool calls and five minutes per task. The verdict is a structured JSON object — PASS, FAIL, PARTIAL, or NOT_VERIFIED — with cited evidence per acceptance criterion. The forbidden-evidence rule, paraphrased from the skill file:

FORBIDDEN evidence: "looks good", "appears to satisfy", "the worker said so", any paraphrase that does not cite a file, command, or commit.

If the verifier emits anything ungrounded, a static gate rejects the verdict before the close sticks.
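A sketch of what such a static gate might check. The four verdict values and the forbidden phrases come from the text above; the JSON field names ("verdict", "criteria", "evidence"), the citation heuristic, and the sample paths and hash are assumptions made for illustration.

```python
import re

ALLOWED_VERDICTS = {"PASS", "FAIL", "PARTIAL", "NOT_VERIFIED"}
FORBIDDEN_PHRASES = ("looks good", "appears to satisfy", "the worker said so")

# Hypothetical idea of a citation: a file path (optionally with :line),
# a short or full commit hash, or a command quoted in backticks.
CITATION = re.compile(r"[\w./-]+\.\w+(:\d+)?|\b[0-9a-f]{7,40}\b|`[^`]+`")


def gate(verdict):
    """Return a list of problems; an empty list means the verdict may stick."""
    problems = []
    if verdict.get("verdict") not in ALLOWED_VERDICTS:
        problems.append(f"verdict {verdict.get('verdict')!r} is outside the allowed enum")
    for i, check in enumerate(verdict.get("criteria", [])):
        evidence = check.get("evidence", "")
        if any(phrase in evidence.lower() for phrase in FORBIDDEN_PHRASES):
            problems.append(f"criterion {i}: forbidden evidence phrase")
        elif not CITATION.search(evidence):
            problems.append(f"criterion {i}: evidence cites no file, command, or commit")
    return problems


# Hypothetical verdicts; the path and hash are made up for the example.
grounded = {
    "verdict": "PASS",
    "criteria": [{"evidence": "tests/claim.test.ts:41 passes; commit abc1234"}],
}
ungrounded = {
    "verdict": "PASS",
    "criteria": [{"evidence": "Looks good to me"}, {"evidence": "All criteria were met"}],
}
```

Here gate(grounded) returns an empty list, while gate(ungrounded) reports one problem per criterion. A pattern-based check like this is only a first filter; it cannot tell whether a cited file actually supports the claim.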

integration-auditor is the second grader, and it catches a failure mode the per-task verifier physically cannot see. When two worker subagents in the same loop run touch the same file, the auditor is dispatched once per overlap to grade that one file × two-hunk seam:

You are the falsifiable gate that surfaces composition bugs the per-task verifier cannot see — because per-task verifier sees only one task's diff against HEAD~, never the union of two workers' edits to the same symbol. Without this gate, ten green tasks can compose into a broken system and the loop never notices.
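Finding the overlaps that trigger an audit can be sketched as a pairwise set intersection. The input shape (worker id mapped to the set of files it touched), the file_overlaps name, and the sample worker ids and paths are all assumptions for the example.

```python
from itertools import combinations


def file_overlaps(touched):
    """Return sorted (path, worker_a, worker_b) triples, one per shared file.

    touched maps a worker id to the set of file paths that worker edited.
    """
    overlaps = []
    for a, b in combinations(sorted(touched), 2):
        for path in sorted(touched[a] & touched[b]):
            overlaps.append((path, a, b))
    return sorted(overlaps)


# Hypothetical wave: two workers both edited src/claims.ts.
touched = {
    "worker-337": {"src/claims.ts", "src/db.ts"},
    "worker-338": {"src/claims.ts"},
    "worker-339": {"README.md"},
}
overlaps = file_overlaps(touched)
```

Each triple corresponds to one auditor dispatch: one file, two workers' hunks. A wave with no shared files produces an empty list and no audits.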

The auditor's verdict surface is SAFE / RISKY / BROKEN, with tighter bounds than the verifier (fifteen tool calls, three minutes, because the scope is one file). It is allowed to mark something BROKEN only if it can cite a concrete file:line referent; otherwise it falls back to RISKY. The auditor exists because of an incident on 2026-05-23 where the verifier emitted an invalid per-check status enum and the orchestrator silently upgraded the run from PARTIAL to PASS based on its own observation. The commit that hardened the no-upgrade rule, 6b26fc5, is in the public history:

"the orchestrator silently upgraded both runs from PARTIAL to PASS based on its own observation, violating the Generator/critic separation rule."

The fix made the upgrade impossible to repeat. A static gate now refuses any verdict that promotes a per-check status outside the allowed enum. That is the kind of bug that does not exist if you do not separate the generator from the critic. It is also the kind of bug that, when you do separate them, you find at design time instead of at run time.

A note on attribution. The generator/critic-separation pattern and the frontier-wave execution model are not my invention. They come from GSD, the third-party orchestration framework I credited earlier. What is mine is the implementation of those patterns inside a tasks-loop executor — the verifier and auditor agent definitions, the loop skills, the topology classifier, the database schema that persists verifier verdicts on the task row, the MCP server that exposes it all. The shipped skills are vendor-neutral by design; the commit that introduced /tasks:loop-dag explicitly removed cross-references to other agent-focused tooling from the skill text. The lineage is real and credited; the implementation is independent.

How I'm using it

I orchestrate everything I ship through wood-fired-tasks now. Three concrete examples from the last week, all part of the public commit history of this very repo.

When I want to grade a TypeScript codebase against current community standards, I open a separate Codex session and tell it the truth: I have not read this codebase, I did not write a single line of it, please produce a structured improvement plan and enter it as tasks in the production database. Codex emits a multi-phase roadmap. A follow-on Codex session turns the roadmap into a tracked project with created_by: codex stamped on every task. Then I run /tasks:loop project N and Claude Opus 4.7 implements the plan while Codex sleeps. Two frontier models from two different vendors grade each other through the task system without ever meeting inside the same CLI session.

When I want to harden a service before a public release, I dispatch a Codex audit and an Opus audit independently, paste one's findings into the other, ask Opus to merge both views into a plan, run /tasks:decompose to turn the plan into a structured project, then /tasks:loop-dag project N to drain it in parallel waves with the verifier and integration-auditor running between waves. That is how the final pre-launch sweep on this repo happened the night of May 25. Every task closed before morning.

When I want to clear a backlog overnight, I leave /tasks:loop-dag running with --max-waves 3 --concurrency 4 and check the LOOP-RUN.md artifact when I wake up. Twenty tasks, dispatched in three waves, every one independently verified by an agent that cannot edit the code it is grading. Every claim conflict, every verifier verdict, every integration audit landed in the database as searchable history.

The thing I want every reader to take away from those three patterns is that the orchestrator is not me. The orchestrator is a skill file. I am the project planner and the appellate court. The execution happens because the primitives are first-class.

Extending it

I have extended wood-fired-tasks for my own use with a Grafana dashboard suite that visualizes loop runs and per-workflow cost in real time, a set of session summarizers that auto-generate commit messages and release notes from the telemetry stream, and an attribution hook that tags every agent transaction back to the task it was working on. That extension layer is personal scaffolding I happen to find useful; it is out of scope for this post and I will write it up separately. What matters for a reader installing the tracker today is that the SSE event stream and the structured task rows in the SQLite schema are designed to be exactly the integration points for that kind of observability. Whether you wire it into Grafana, Datadog, a homemade dashboard, or nothing at all, the primitives are there. The point of open-sourcing the tracker is that the parts you build on top of it are yours.

Try it

git clone https://github.com/Wood-Fired-Games/wood-fired-tasks.git
cd wood-fired-tasks && npm install && npm run build
export API_KEYS="your-api-key-here"        # replace with your own key
export DATABASE_PATH="./data/tasks.db"     # path to the SQLite database file
npm run migrate && npm start
# If npm start holds the terminal, run the next command from a second terminal.
tasks create --title "My first task" --project 1 --created-by "me"

That is the entire start. The included install.sh (and install.ps1 on Windows) registers the MCP server in ~/.claude.json, copies the /tasks:* skill files to ~/.claude/commands/tasks/, and copies the verifier and auditor agent definitions to ~/.claude/agents/. Restart Claude Code and the autonomous loops are wired up.

The README at the repo root is the full reference. The smallest interesting thing you can do once it is installed is open Claude Code, type /tasks:loop 1, and watch it run. The smallest interesting thing you can do without an agent at all is tasks list --project 1 and start filing real work into a tracker that an agent can pick up later.

Why this is open source

The honest answer is that wood-fired-tasks is load-bearing inside my own development practice. The verifier and auditor lanes that grade every closed task before the close sticks are most of the reason I can ship at the volume I ship. Hiding the executor while publishing posts about how I use it would have been dishonest. Open-sourcing it is the cost of being credible about the practice.

It is MIT-licensed. Use it for your own backlogs. Fork it. Replace the verifier with your own grader. Wire your own dashboards into the SSE stream. If you do, let me know — the part of this I most want to learn from is what other people's orchestration patterns look like when they have a real task layer underneath them.

A companion post going up next week describes how I came to trust the validation infrastructure enough to ship this release without reading the code. The two posts are meant to be read together.

Repository: github.com/Wood-Fired-Games/wood-fired-tasks

npm: npm install wood-fired-tasks

From workstation buildout to AI in the loop

Wed, 20 May 2026 00:00:00 GMT

The second post in this series ended on the last days of 2024. The first publishing contract was over. The platform was owned outright by Wood Fired Games. I had ten months of daily AI-assisted work already on my hands, an expanded toolchain I had stood up in the last quarter, and a hypothesis about what all of it might enable. The hypothesis was that AI, used correctly, might be exactly the thing an independent studio needed to bring the larger vision into a form one person could actually ship.

The hypothesis was right. The studio I am operating today is not the studio I was operating eighteen months ago. The throughput of the studio I am operating today is not the throughput of one person.

This post is the practice arc that produced that change, told in three acts. AI as advisor, the long quiet period from early 2024 through the summer of 2025 when AI was a daily expert reference inside my IDE and the code was still entirely mine. AI as collaborator, the autumn-of-2025 transition when Claude Code arrived with the ability to actually edit my files and I had to learn how to delegate without losing control. AI as workforce, the 2026 era of agent orchestration, validation discipline, and observability that I am still inside today. The measurable outputs changed by an order of magnitude across those three acts. The practice that produced them changed even more.

Act I — AI as advisor

The first paid AI subscription I bought was JetBrains AI Pro in early March 2024. It lived inside Rider, the IDE I had been working in for years. It did not have access to my filesystem. It could not edit my code. It could read whatever I pasted in and answer whatever I asked. That was the entire surface. And it was enough to change my daily practice in ways I did not appreciate at the time.

I had been a senior engineer for a long time before this. What changed was not what I could do. What changed was how cheap it became to ask questions about things I was already capable of figuring out but could not justify the time to investigate. Previously, I would have spent half a day reading Docker documentation to set up a service the way I wanted it. With the assistant inside Rider, I would have an answer specific to my project in minutes, plus a reference link, plus an explanation of why that answer was right. The same compression happened with SQL schema design, with CI configurations, with a hundred other surfaces where I had been competent but slow. The expertise was already in my head. The activation energy to apply it had dropped sharply.

This was also where I learned what to ask AI and what not to. The dominant patterns of my prompts across this whole period were explain this concept, debug this error, review this code. Generating new code from scratch was a small minority of what I asked for. I was the one writing the code; the AI was the one I went to when I needed an expert sitting next to me. And I corrected the model whenever it hallucinated, which was constantly. No, that isn't a property of that class. No, that method doesn't exist. The discipline of verifying the advisor — of treating the AI as a knowledgeable colleague who is sometimes confidently wrong — became reflex.

The contrast with ChatGPT was instructive. I added a ChatGPT Plus subscription in late summer of 2024, and within a few weeks my honest assessment of it was that it still felt like a toy. I was paying for it. I was not really leaning on it. The reason was scope. JetBrains AI Pro was deliberately scoped to programming — it would refuse to answer questions that fell outside that lane, which was annoying at the time but turned out to be evidence of a product opinion I now respect. A scoped, file-attachable, code-only assistant was producing real daily value. A general-purpose chat assistant on the same dataset was producing curiosities.

The deeper lesson, the one that took me months to recognize as a lesson, was that AI's job in this era was to compress my learning curve, not to take work off my plate. The work itself was still bottlenecked by my typing speed and my attention. What had changed was that I was reaching competence in new infrastructure tooling maybe an order of magnitude faster than before, and I was using that compression to pick up things I had been circling for years (Docker, modern CI, Blazor, OAuth flows, persistent backend patterns) without having to break the publisher project I was actively shipping.

The asymmetry that defined my expectations heading into the second half of 2025 was about where AI's reach actually extended. The closer the work was to server, DevOps, and SaaS — Docker configurations, CI pipelines, REST APIs, OAuth integrations, SQL schemas, Blazor — the better the AI was at it. The closer the work got to game development proper — Unity editor tools, simulation logic, ECS scaffolding, the gameplay code that actually defines the player's experience — the more uneven the AI's performance became. I did not yet have a framing for why. I have one now, and it is entirely a training-data story. The open-source corpus the models learned from is heavily skewed toward web and backend work. There is an enormous body of public source for the kinds of services I was now finally able to stand up confidently. There is comparatively very little public source for game development proper, because the tooling is often graphical, the engines are mostly proprietary, and the patterns the industry actually uses rarely show up in public repositories. The AI was not failing at game development because it could not reason about games. It was failing because almost no one had ever taught it how the work is actually done.

That shaped what I reached for AI to do across the rest of the year. The DevOps and platform work I had been circling for a decade — Docker, CI, OAuth flows, persistence patterns — I could now pick up at something approaching senior-engineer-with-a-mentor speed. The game-side work stayed mine. I would lean on the assistant for explanations and reviews on the game code, but I would not yet trust it to write it for me. AI could clearly do some kinds of work autonomously. The work I cared about most was not yet on that list.

In November of 2024 the practice expanded sharply in scope. I stood up a full local-AI workstation — Ollama with a 70B model, CUDA, a code-specific 32B model, two agent frameworks, Microsoft's TinyTroupe LLM simulation library — all in one week. I installed Cursor, the filesystem-aware code editor, and pointed it at my repository. I added a Claude.ai Pro subscription. And, on the same machine in the same week, something quieter shifted inside Rider: my prompts to JetBrains AI Pro started carrying explicit file attachments. "How would I convert this file to use ASP.NET?" "Add all of the source code in this namespace to your corpus." The discipline of deciding what context the AI sees, before the AI could see anything on its own, started to become deliberate. That practice would later acquire a name — context engineering — but in November of 2024 I was just doing it.

By the time the publisher project ended at the close of 2024, I had ten months of daily practice with AI as an advisor, practice that the surface git history does not begin to capture. The measurable output of the studio in that period looked roughly like one fast senior developer's output. Inside, the practice was already changing in ways the next year would reveal.

Act II — AI as collaborator

The August 5, 2025 commit message in my project-viking repository reads, in full:

"just making a checkpoint before I go all vibe coder."

I had picked up Claude Code's research preview earlier that summer and held off using it seriously while I wrapped a publisher pivot. By early August I was ready to take it seriously. The vibe-coder line was a self-aware joke about what I was about to attempt: pointing an agentic tool at my codebase and letting it write something.

Roughly twenty-eight hours later, the first Co-Authored-By: Claude trailer in any of my projects landed. Two trailers, actually, thirty-nine seconds apart, in project-viking and wood-fired-platform. The work in question was a complete CLI command system for MECSEditor (interfaces, command registry, the full component and message and entity and query command set, JSON converters, a round-trip test project), all of it produced in one focused session by an AI that could read my codebase, plan against it, and edit my files directly. That single session shipped a deliverable I would never have been able to justify even attempting a year earlier.

The thing that had changed in August 2025 was not filesystem access. Cursor had given me filesystem access since November of the previous year. The thing that had changed was autonomous multi-step execution. Claude Code could be given a goal, read the codebase to plan against it, edit a dozen files coherently in one pass, run the tests, and present me a coherent end state for review. JetBrains AI had helped me write code faster. Claude Code was the first tool I had used that could produce code I had not written myself, at a scope where the code was structural rather than illustrative.

The first 48 hours of that capability did not go smoothly. The new MECSEditor CLI had a systematic bug in how it generated components — most of the components I asked Claude to create came out with their data fields stripped, leaving only the required identity bytes. We worked through it across one long day. I would describe the symptom; Claude would propose a fix; I would try it; we would find a different failure mode; I would write down what I had learned and feed it back into the next prompt. By the end of the day the system was producing correct components and I had a small library of context documents — what the engine's component contract required, how the database serialization worked, what the conventions were — that the next session could use as bounded context.

That recovery was the moment the practice I had been doing in primitive form for nine months became deliberate, daily, and intensive. The pattern was simple in description. Treat the AI like a senior engineer who has just joined the team. Give it documentation of the conventions, the constraints, the patterns, the antipatterns. Iterate on the docs when the AI gets something wrong, the way you would iterate on onboarding docs after a confusing first week of a new hire. Every mistake the AI made in those early sessions became a paragraph in a context document that prevented the same mistake from recurring. The docs were the leverage point. The AI was the thing that exposed the gaps in them. And — this is the part the industry took another six months to start saying out loud — the docs were also genuinely useful documentation for human collaborators, because the discipline of writing them honestly forced the conventions into a form anyone could read.

The first major feature I let Claude author end-to-end and reviewed at the diff level was the persistent-entity system for the platform — a hybrid persistence layer that gave the simulation a durable identity model that survived server restarts. I designed the architectural shape. I wrote the context that described the contract. I asked Claude to implement against it. I reviewed every diff. That experiment crystallized the pattern I would lean on through the rest of 2025 and into early 2026. Write the context. Let the AI propose the implementation. Review every diff.

What followed in the last two months of 2025 was the dense first output of a fully matured collaboration. The platform got its persistence layer, an OAuth-backed identity surface across Steam and Google and Apple, a Docker deployment that containerized every service with health checks, and a real CI pipeline catching regressions on every push. Four bodies of work I had been circling for a decade, all shipped in five weeks, all under the context-engineering-and-manual-review discipline that the August disaster had taught me to take seriously.

The measurable shift across this transition is the cleanest evidence I have of what changed. The pre-AI baseline — every month I worked on the wood-fired stack before the first co-author trailer landed in August 2025 — averaged about twenty-one thousand source lines of code changed per month. That is what twenty-five years of being a fast senior engineer looks like as data, and it is not a small number. The months following the Claude Code adoption began running consistently north of one hundred thousand source lines per month. Roughly five times faster, sustained, with my own review still in the loop on everything.

What had changed was not me. I was still the engineer. I had not gotten faster at typing or thinking. What had changed was the bottleneck: I was no longer rate-limited by what I could write. I was rate-limited by what I could review. That distinction is what shaped everything that came after.

Act III — AI as workforce

If your bottleneck is review, the design question becomes: how do you make the work easier to review, and how do you scale review when the rate of output keeps climbing? The 2026 answer I converged on has three layers. Redesign the engine so the language itself catches the mistakes that would otherwise show up in review. Delegate work to specialized agents with declared scopes and discipline layers, so what hits human review is already pre-validated. And instrument the entire stack — every AI call across every vendor across every tool — so what gets merged is backed by evidence rather than vibes.

The first layer was born out of frustration with how much friction AI encountered trying to follow my existing workflow. My pre-AI authoring pipeline was built for a human. Start with a gameplay concept, mentally decompose it into the messages, components, systems, and interpreters it would require, open MECSEditor's GUI to declare each of those types into the asset database, let the tool generate the C# scaffolding the runtime would compile, then jump to the IDE to hand-write the function bodies. It worked because the human driving it carried the gameplay concept across each tool boundary and applied judgement at every stage. When I pointed Claude at the same pipeline, the work fell apart. Claude could not coherently shuttle between the GUI tool and the IDE; the handoffs that were invisible to me — concept to database row, database row to generated code, generated code to function body — were friction points the AI lost information across.

So, as any tools engineer would, I rethought the tool from the perspective of its new primary user. The answer was a fundamental inversion of the authoring direction. In the original pipeline, the asset database was the source of truth and code was generated from it. In the new pipeline, the code became the source of truth and the asset database was generated from it. Components, systems, queries, and interpreters became attribute-annotated C# types in the codebase. A Roslyn source generator read those annotations and emitted the runtime scaffolding at compile time. The asset database now sat downstream of the code rather than upstream of it, which meant Claude — working entirely in C# files — could now drive the whole pipeline from a single surface. Over a hundred hand-coded boilerplate files retired in one pass.
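The inversion described above can be illustrated with a small sketch. It is written in Python rather than the attribute-annotated C# and Roslyn generator the post describes, and every name in it (the `component` decorator, `ASSET_DATABASE`, the two example components) is hypothetical. The only point it makes is the direction of flow: the type is declared once in code, and the database row is derived from it.

```python
from dataclasses import dataclass, fields

# Hypothetical stand-in for the generated asset database.
# Nothing writes to it by hand; rows are derived from the code below.
ASSET_DATABASE: dict[str, dict] = {}


def component(cls):
    """Mark a class as a component and derive its database row from the code."""
    cls = dataclass(cls)
    ASSET_DATABASE[cls.__name__] = {
        "kind": "component",
        "fields": {
            f.name: getattr(f.type, "__name__", str(f.type)) for f in fields(cls)
        },
    }
    return cls


@component
class Position:
    x: float
    y: float


@component
class Health:
    current: int
    maximum: int
```

After import, `ASSET_DATABASE["Position"]` holds the field names and type names for `Position`, with no separate declaration step in a GUI tool.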

The compile-time diagnostics that shipped alongside the generator — sixteen of them, covering ECS constraint violations the framework can now catch before the assembly is built — were a benefit that came along for the ride. That is the part I would later describe as making the engine AI-legible. The benefit is not that AI sees prettier code. The benefit is that AI cannot quietly produce a malformed system, because the compiler refuses. Review converges on the things that require human judgement instead of catching mechanical mistakes the language could catch on its own.
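A rough sketch of what "the compiler refuses" means in practice, again in Python and under stated assumptions. The post does not list the sixteen diagnostics, so the two rules below (a system may not reference an undeclared component, and may not declare the same component as both read and write) are invented for illustration, as are `SystemDecl` and `diagnose`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemDecl:
    """Hypothetical declaration of what a system reads and writes."""
    name: str
    reads: frozenset
    writes: frozenset


def diagnose(components: set, systems: list) -> list:
    """Return diagnostics; an empty list means the build may proceed."""
    problems = []
    for s in systems:
        # Illustrative rule 1: every referenced component must be declared.
        for c in sorted((s.reads | s.writes) - components):
            problems.append(f"{s.name}: unknown component '{c}'")
        # Illustrative rule 2: a component is either read or written, not both.
        for c in sorted(s.reads & s.writes):
            problems.append(f"{s.name}: '{c}' declared as both read and write")
    return problems
```

A build step would call `diagnose` before producing any output and stop if the list is non-empty, so a malformed system never reaches review.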

The second layer was orchestration on top of the third-party GSD framework (github.com/gsd-build/get-shit-done), the work of an author other than me. I credit the substrate every time I describe what I have built on top of it. What I added to GSD was a discipline layer: specialized agents with one scope and one job each, a hook layer that enforces validation patterns at agent-boundary handoff, and an orchestration agent that dispatches work to them and presents me a coherent end state rather than a stream of half-finished intermediate outputs. The pattern is the same pattern I had built into MECS itself. Declare what each component owns. Declare what it depends on. Run them in topological order. Let the framework verify nobody stepped outside their lane.
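The four-step pattern at the end of that paragraph (declare ownership, declare dependencies, run in topological order, verify lanes) can be sketched with Python's standard-library `graphlib`. The agent names, the path prefixes they own, and the `out_of_lane` check are all hypothetical; this is not the GSD framework's API or the actual hook layer.

```python
from graphlib import TopologicalSorter

# Hypothetical agent declarations: what each owns and what it depends on.
AGENTS = {
    "spec-writer": {"owns": {"docs/"}, "depends_on": set()},
    "implementer": {"owns": {"src/"}, "depends_on": {"spec-writer"}},
    "test-author": {"owns": {"tests/"}, "depends_on": {"spec-writer"}},
    "verifier": {"owns": {"reports/"}, "depends_on": {"implementer", "test-author"}},
}


def dispatch_order(agents):
    """Order agents so each runs only after everything it depends on."""
    graph = {name: decl["depends_on"] for name, decl in agents.items()}
    return list(TopologicalSorter(graph).static_order())


def out_of_lane(agents, agent, touched_paths):
    """Return the paths an agent touched that fall outside what it owns."""
    owned = agents[agent]["owns"]
    return [
        path for path in touched_paths
        if not any(path.startswith(prefix) for prefix in owned)
    ]
```

`dispatch_order(AGENTS)` always places `spec-writer` first and `verifier` last; the relative order of the two middle agents is not significant. A handoff hook could reject any agent run for which `out_of_lane` returns a non-empty list.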

Then March came, and the wheels nearly came off. The orchestration was producing far more code than I could review. Hallucinations slipped through. Subtle regressions accumulated. Architectural drift crept in at the edges. The structural symptom was that I kept having to revisit work the orchestrator should not have shipped in the first place. I had raised the ceiling on what I could attempt. I had not yet built the matching floor on what I let through.

That gap forced me to look honestly at what the senior practitioner's job had become. For twenty-five years my job had been to read code. I read what other engineers wrote. I read what I had written. I read what I was about to land. Code review was the surface where the human caught what the AI — or the junior engineer, or the late-night version of myself — would otherwise have shipped wrong. With the orchestration running at the volume it was running at, that surface had stopped scaling. The code was no longer the artifact a human could realistically validate.

So I moved my attention up a layer. The artifact I now validate is the validation infrastructure itself — the test harness that proves the code works, the CI configuration that runs the harness on every push, the static-analysis rules and the cross-vendor audits that grade each other, the telemetry stack that captures every AI call and lets me query what every model did against my repository. Comprehensive automated testing went in everywhere. Every CI workflow I have today was written by an agent following a specification I wrote in plain English; I do not actually know how to configure a modern CI pipeline by hand anymore, and it would be dishonest to imply I do. The cross-vendor AI observability stack — a telemetry daemon, an outbound proxy, dashboards — captures every AI call I make across every vendor and every tool, so the question of which AI runs to trust becomes a question of evidence rather than instinct.

There is a recursion in this practice that took me a while to notice. The validation infrastructure that grades the AI-written code is itself AI-written code I never typed by hand. If trust were the right frame, that would be two leaps of faith stacked on each other. The thing that makes it tractable is that the layers grade each other. Tests describe behavior. Behavior is independent fact. The failure of a test is true regardless of who wrote either side. The CI runs the tests. A cross-vendor audit grades the CI. I read the verdicts, not the diffs. Trust is a feeling. Validation engineering is a system. The first scales with attention; the second scales with infrastructure. Over the course of this spring I migrated almost entirely from the first to the second.

The measurable output across this period is the part of the story that still surprises me when I look at it. Across the seventy-five months of source-control history I have, I have shipped roughly 1.96 million source lines of code. Of that, 44.8% carries an AI co-author trailer — and every line of that 44.8% was written in the last ten months. The last ten months shipped more source code than the entire five and a half years before them combined. The peak month, March of 2026, was 309,000 lines, and that month was a net deletion in raw bytes because the work was a source-generator migration replacing hand-coded boilerplate with generated equivalents. The volume of work the studio is doing is no longer one person's volume. It has not been one person's volume for ten months.
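The figures in that paragraph can be cross-checked with a few lines of arithmetic. The inputs are the rounded numbers quoted above; the last value is only an implication of taking two of the statements together at face value, not a measurement.

```python
total_lines = 1_960_000      # "roughly 1.96 million source lines"
ai_fraction = 0.448          # share carrying an AI co-author trailer
history_months = 75
recent_months = 10

# Lines carrying the trailer, all written in the last ten months.
ai_lines = round(total_lines * ai_fraction)

# The period before the last ten months, in months and years.
earlier_months = history_months - recent_months
earlier_years = earlier_months / 12

# For the last ten months to out-ship everything before them, they must
# account for more than half the total. The shortfall beyond the trailer-tagged
# lines would have to be recent lines without a trailer.
half_of_total = total_lines // 2
min_untagged_recent = half_of_total - ai_lines
```

With these inputs, `ai_lines` is 878,080, the earlier period is 65 months (about 5.4 years), and the last ten months would need more than roughly 102,000 additional lines without a trailer for both claims to hold at once.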

I want to put that number in its place before I say anything else about it. Lines of code is a vanity metric, and every senior engineer in this industry knows it. The number above is the easiest measurement to make, not the most meaningful one. Watched as change-over-time it is not entirely useless — it tells you whether the studio is moving — but it tells you very little about whether the movement is in the right direction. What actually matters is the shipped value: tickets closed and verified, features that land and stick, bugs that stay fixed, audits that come back clean. The reason I built the cross-vendor observability stack alongside the task-orchestration system I am preparing to release is to measure exactly that. The tasks system tracks the work I want done. The telemetry tracks the AI doing it. Together they let me ask the question LOC cannot answer: of the work the AI produced this week, how much survived validation and shipped, and how much got reverted, rewritten, or quietly broke something downstream? That is the question that matters, and the infrastructure to answer it at scale is what I am still building.

Even taken at face value, the number above does not mean I have become five times the engineer I was. I am the same engineer. What has changed is where my attention lives in the loop. The code is now produced and validated downstream of me. I write specifications. I review test coverage, CI configurations, audit findings, telemetry trends. I look at what the validators are catching, what they are missing, and whether the gap is widening or closing. The orchestration is doing the typing. The validation infrastructure is doing the checking. I am doing the architectural decisions, the design judgement, and the review of the validators themselves — the parts of the job that, in retrospect, were always what actually mattered.

The artifacts of that practice are starting to leave the studio. The next step is the open-source release of wood-fired-tasks — the externalized version of the orchestration discipline I run on my own work, with the task graph, the verifier and integration-auditor agents, and the validation hooks at agent-boundary handoff all packaged up for general use. The observability stack is the externalized version of the telemetry I depend on internally. Underneath both is the Wood Fired Agent Operations Platform, the productized governance layer this whole practice has been pointing at. The work I do for my own studio is the same work other organizations need. A discipline that an independent studio of one person can run is a discipline a team can adopt. The journey is the resume.

Where this leaves me

Two and a half years ago I bought my first AI subscription because I found value in having an expert sitting inside my IDE. Ten months ago an agentic tool first produced code in my repository that I had not myself written. The studio I am operating today, with AI in the loop on every line of every project, is producing output that would have taken a team of dozens to produce twenty years ago. The bottleneck has moved from writing, to reviewing, to governing — to building the engineering substrate that lets the AI work and the human catch what matters.

The vision the studio has been quietly aiming at since the 2020 thesis — an independent operation shipping the kind of multiplayer technology that historically required a team of dozens — is no longer aspirational. It is the practice I am running today. Both engines had to be running. Anthropic kept shipping more capable models. I kept finding harder problems to point them at, and harder questions to ask of how I was directing them. The compound result is something I can now hand to someone else.

This is the third post in a series. The first covered the six years of work and the twenty-three years of thinking that produced MECS as an engine. The second covered the platform that grew up around the engine and made it operable as a studio. This third post covered the practice that turned the studio into something I can operate at a throughput one person should not be able to sustain. The series ends here for now. The work continues every day.

From engine to platform

Wed, 13 May 2026 00:00:00 GMT

The first post in this series ended at the close of 2023: MECS had survived roughly ten Unity prototypes, the architectural decisions that came out of two decades of multiplayer-game work had been re-expressed in modern C#, the late-2022 unmanaged rewrite had landed, and the framework was a battle-tested engine looking for a contract that would force the parts of the architecture that had been designed against a network layer to actually meet one. None of the prototypes had shipped. None had carried real UDP datagrams over a real wire yet — though, as I will get to below, the multiplayer architecture itself had been validated for years through a model that ran inside a single process.

That changed in October 2023. Wood Fired Games signed its first contract as an operational studio: a three-month prototyping engagement in which I had to prove that I — and by extension Wood Fired Games — could actually deliver. The studio had been a dormant LLC since 2018, waking back up in early 2021 when I went independent and started running the engine work seriously. By late 2023 the studio was operational again, and the engine was about to cross from an in-process model of an authoritative-server sim to the real thing operating over the open Internet.

This post is what happened next. The in-process two-thread model that had been validating MECS's multiplayer architecture for years before any networking code was written. The November 2023 work that replaced the model's in-process queues with real UDP datagrams. The 2024 backend stand-up that grew up around the engine. The five-layer architecture that crystallized in April of that year. The moment MECS stopped being shared game code and became a platform.

Before the contract: an in-process model of the wire

A point I want to make clearly before the contract enters the story. The fact that MECS had not actually moved bytes across a network by late 2023 is not the same as the fact that MECS had not been designed for one. Those are different statements, and conflating them gets the engine arc wrong.

Iteration speed is the dominant variable in game development. The number of times a designer can change a mechanic and see what it feels like is the single biggest determinant of whether a game is fun. Splitting your binary into a separate client and a separate server is one of the most expensive things you can do to that variable — every change touches two executables, every debugging session has to attach to two processes, and the surface area for stale-build mismatches roughly doubles. So the rule I had learned over twenty years of multiplayer work was clear: do not split the binary until the last possible moment, and only when not splitting poses more risk to shipping than splitting does.

But you can buy most of the architectural integrity of a split binary without paying the iteration cost, and that is what I had been doing inside MECS since the framework was first capable of running a simulation. The construction is simple in description. Run the client-side MECS service in Unity's normal frame update, exactly as it will run in the eventually-shipped client. Run a second MECS service — the would-be server — on a separate thread inside the same process, with the thread itself controlling the simulation update interval rather than Unity. Connect the two services with a pair of ConcurrentQueue<T> instances and treat each queue exactly as if it were a UDP socket. The client-side service writes messages to the outbound queue using the same Packet<T> framing it would later use to write to a real socket; the server-side thread reads from that queue, processes those messages, and writes replies to the second queue, which the client-side service drains on its next frame. Forbid any other communication between the two threads. No shared state, no static singletons, no direct method calls — anything that needs to cross has to go through the queues.
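A minimal sketch of the construction, in Python rather than C# and Unity. Two `queue.Queue` instances stand in for the pair of `ConcurrentQueue<T>` instances, and JSON-encoded bytes stand in for the `Packet<T>` framing, whose layout the post does not describe. The function names and message fields are hypothetical.

```python
import json
import queue
import threading


def encode(message: dict) -> bytes:
    """Serialize to bytes so no object reference can cross the boundary."""
    return json.dumps(message).encode("utf-8")


def decode(packet: bytes) -> dict:
    return json.loads(packet.decode("utf-8"))


def run_server(inbound, outbound, stop):
    """Would-be server: keeps private state and talks only through the queues."""
    tick = 0
    while not stop.is_set():
        try:
            packet = inbound.get(timeout=0.05)
        except queue.Empty:
            continue
        tick += 1
        request = decode(packet)
        outbound.put(encode({"type": "ack", "seq": request["seq"], "tick": tick}))


def demo():
    """Client side: write to one queue, drain replies from the other."""
    to_server, to_client = queue.Queue(), queue.Queue()
    stop = threading.Event()
    server = threading.Thread(
        target=run_server, args=(to_server, to_client, stop), daemon=True
    )
    server.start()
    for seq in range(3):
        to_server.put(encode({"type": "input", "seq": seq}))
    replies = [decode(to_client.get(timeout=2.0)) for _ in range(3)]
    stop.set()
    server.join(timeout=2.0)
    return replies
```

Because only bytes cross the queues, swapping them for a real socket later changes the transport and nothing else, which is the property the paragraph above relies on.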

The result is a fairly accurate model of a split-binary client-server architecture without the iteration burden of actually splitting. Single-process debugging. One build to compile. Unity's hot-reload still working. And the exact message types, the exact serialization paths, the exact authoritative-server semantics that the real network layer would inherit, because the queues are the message bus the real transport would later replace one-for-one. Anything that worked in the in-process model would work over UDP. Anything that did not work in the in-process model — race conditions, ordering assumptions that crossed the queue boundary, mutable references held across the simulation/view firewall — would not have worked over UDP either, and would have shown up later at a much higher cost.

To stay honest about what real networking would feel like, I added artificial latency on both queues with a parameterized noise floor — messages sat in their queue for a configurable delay plus jitter before the other side could drain them. That turned the queues into a believable approximation of an actual socket. Delivery had a distribution. Frame-rate-sensitive code on the Unity side could not assume server responses were instant. Any timing or ordering assumption that would have broken over a real network broke over the queues first and got fixed years before any UDP code was ever written.
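One way to sketch a queue with a configurable delay plus jitter is a priority queue keyed on delivery time. The class name, the uniform jitter distribution, and the explicit `now` parameter are assumptions made so the example is deterministic and easy to test; the post does not say how the original was implemented.

```python
import heapq
import random


class LaggyQueue:
    """Holds each message until base_delay plus random jitter has elapsed."""

    def __init__(self, base_delay, jitter, seed=0):
        self.base_delay = base_delay
        self.jitter = jitter
        self._rng = random.Random(seed)
        self._heap = []
        self._sequence = 0  # tie-breaker so payloads are never compared

    def send(self, now, payload):
        deliver_at = now + self.base_delay + self._rng.uniform(0.0, self.jitter)
        heapq.heappush(self._heap, (deliver_at, self._sequence, payload))
        self._sequence += 1

    def drain(self, now):
        """Return every message whose delivery time has passed, earliest first."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[2])
        return ready
```

Because each message draws its own jitter, two messages sent close together can be delivered out of order, which is exactly the kind of assumption-breaking behavior a real UDP path would produce.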

That model is also where MECS's hard 512-byte component size limit comes from. The limit looks at first like an arbitrary constraint and is sometimes read as one in the framework's docs, but it is not. It is the size below which a component is guaranteed to fit inside a single UDP datagram on every commodity network path the engine targets, and therefore the size below which a component can never be fragmented in transit. Fragment reassembly is one of the harder things to get right on top of UDP, especially when paired with reliable in-order delivery; the easiest way to make fragment reassembly bug-free is to make it impossible for the case to arise. The component size cap was set in 2022 with that property in mind, well before any real UDP code was written in November 2023. The whole framework had been designed against a wire it would eventually meet.
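The post does not say which MTU figures the cap was derived from, so the following is only a budget check under common assumptions: a 1500-byte Ethernet MTU, the 1280-byte minimum link MTU that IPv6 requires, a 20-byte IPv4 header without options, a 40-byte IPv6 header, and an 8-byte UDP header. The `envelope_bytes` parameter is a hypothetical per-component header.

```python
UDP_HEADER = 8
IPV4_HEADER_MIN = 20      # IPv4 header without options
IPV6_HEADER = 40
ETHERNET_MTU = 1500
IPV6_MIN_MTU = 1280       # minimum link MTU required by IPv6
MAX_COMPONENT_BYTES = 512  # the cap described in the post


def udp_payload_budget(mtu, ip_header):
    """Bytes left for application data in one unfragmented packet."""
    return mtu - ip_header - UDP_HEADER


def components_per_datagram(mtu, ip_header, envelope_bytes=0):
    """How many maximum-size components fit in one unfragmented datagram."""
    return udp_payload_budget(mtu, ip_header) // (MAX_COMPONENT_BYTES + envelope_bytes)
```

Under these assumptions the payload budget is 1472 bytes on an Ethernet-sized IPv4 path and 1232 bytes on a minimum-MTU IPv6 path, so a maximum-size component plus a small envelope fits with room to spare in either case.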

October 2023: the prototyping contract

I am not going to name the project, the publisher, or the target platform. The shape of the engagement is what matters for the engine arc. It was a three-month prototyping contract — not a shipping-game contract — through which I had to prove that I (and by extension Wood Fired Games as an operational studio) could actually deliver. The specific bar I was being asked to clear was a cloud-hosted authoritative server with players able to join ad-hoc from anywhere in the United States into a single server instance, end-to-end, in front of the customer. That bar was cleared.

A prototyping contract sounds smaller than a publishing contract and in some ways it is. The thing it is not smaller in is technical risk to the engine, because a three-month window to demonstrate a working multiplayer prototype is exactly the window in which an engine that has not yet faced real network traffic gets to find out whether the architectural choices that worked in-process work over a wire. From my perspective the contract was the box-check moment for MECS's multiplayer architecture. I had been running the in-process two-thread model for years. I now had three months to make it real on a socket, in front of a customer, all while building a game that was actually fun to play.

The first practical change was that MECS started being consumed as a compiled MECS.dll rather than as embedded Unity source. The earlier prototypes had all carried MECS in-source under WoodFired.MECS.asmdef — every game project had its own copy of the engine, edited in-place when the engine needed changes. That works for cross-project hardening, where the engine is allowed to change underneath you. It does not work for a project that is going to be evaluated against external deliverables. The contracted game needed a stable engine binary it could depend on. So MECS got built out into a compiled dependency for the first time, with versioned releases, and the game consumed the binary instead of the source.

That seems like a small organizational change. It was not. It forced the engine's surface area to become a real public API. Internal types had to be marked internal. Public types had to be designed against the use cases of consumers who could no longer modify the engine to suit their game. Breaking changes had to be considered rather than just made. The engine had to start behaving like a library, with the discipline that implies.

The prototype itself was where MECS first proved authoritative-server, drop-in/drop-out multiplayer over real sockets rather than across in-process queues. The simulation ran on a dedicated server process hosted in the cloud. The clients ran the same MECS framework but in a viewer role — receiving component-delta broadcasts from the server, applying them to a local replica of the simulation state, rendering against the replica. The sim/view separation principle from the Rise of Nations tri-class (covered in the previous post) was finally doing the work it had been designed for. The server was the simulation; the clients were the views; the framework that powered both was the same framework on both sides; the queues that had been carrying the messages had been replaced by a UDP socket.

November 2023: the UDP transport gets built

By the time the prototyping contract was signed, the multiplayer architecture had been validated for years in-process, but the actual transport — real UDP framing, real fragment reassembly, real reliable-resend, real heartbeating — had not been written. November 2023 is when it was.

The window was as short as it was because there was nothing left to design. UDP plus an application-layer reliability layer for low-latency multiplayer was a conclusion I had reached in the early 2000s — reinforced rather than challenged by the WebSocket-and-TCP educational platform I had just come off of — and the wire-protocol shape was familiar; I had written variants since college and had authored the wire protocols for both Rise of Nations and Rise of Legends at Big Huge Games. The server was always going to be pure .NET rather than Unity, which ruled out every Unity-side networking library before any of them could be evaluated. The question in November was not what to build but how quickly the existing message and component model could be re-expressed over a real socket.

The first transport commit lands on November 11, 2023:

"...getting closer to making it so players can share their world state."

That commit introduces NetworkSocket.cs — a raw UDP socket wrapped in a state-broadcast model. There is no TCP layer underneath. There is no networking library being pulled in. There is a Socket opened in Dgram mode, a send loop, a receive loop, and the framing code that turns a list of component deltas into a sequence of UDP packets the other side can reassemble. The same managed-versus-unmanaged discipline that runs through MECS runs through the transport. The wire format is the in-memory format. The component bytes that go onto the wire are the component bytes that live in the unmanaged pool — and they fit in a single datagram because that was decided two years earlier.
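The framing step, turning a list of component deltas into datagram-sized packets, might look something like the sketch below. The 8-byte record header (component type, component id, payload length) and the 1200-byte payload budget are assumptions for illustration; the actual wire format in NetworkSocket.cs is not given in the post.

```python
import struct

# Hypothetical record header: component type (u16), component id (u32),
# payload length (u16), little-endian with no padding.
DELTA_HEADER = struct.Struct("<HIH")
MAX_DATAGRAM_PAYLOAD = 1200  # assumed budget, not a figure from the post


def pack_deltas(deltas, limit=MAX_DATAGRAM_PAYLOAD):
    """Greedily pack (ctype, cid, payload) deltas into datagram-sized byte strings."""
    datagrams, current = [], bytearray()
    for ctype, cid, payload in deltas:
        record = DELTA_HEADER.pack(ctype, cid, len(payload)) + payload
        if len(record) > limit:
            raise ValueError("a single delta must fit in one datagram")
        if len(current) + len(record) > limit:
            datagrams.append(bytes(current))
            current = bytearray()
        current += record
    if current:
        datagrams.append(bytes(current))
    return datagrams


def unpack_deltas(datagram):
    """Inverse of pack_deltas for a single datagram."""
    deltas, offset = [], 0
    while offset < len(datagram):
        ctype, cid, length = DELTA_HEADER.unpack_from(datagram, offset)
        offset += DELTA_HEADER.size
        deltas.append((ctype, cid, datagram[offset:offset + length]))
        offset += length
    return deltas
```

Each resulting byte string could be handed to a UDP socket's send call as one datagram. Because no record is ever split across two datagrams, the receiver can decode each one independently of the others.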

Five days later, on November 16:

"I finally got the dedicated server working!"

That commit renames NetworkSocket to NetworkInterface and adds the things that turn a raw UDP socket into a viable game transport: compression of large state broadcasts, fragment reassembly for any messages that did exceed a single MTU (the BitArray2048 family that landed during the unmanaged rewrite a year earlier finally earns its keep as the backing for fragment-tracking bitmasks), heartbeating to detect dead connections, and a reliable resend queue with exponential backoff for the messages that need to arrive in order. The first dedicated server run was a few hours after that commit.
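To make the fragment-tracking bitmask idea concrete, here is a small sketch in which a Python integer plays the role the post assigns to the BitArray2048 family. The class and method names are hypothetical, and real fragment headers, timeouts, and resend logic are omitted.

```python
class FragmentAssembler:
    """Tracks which fragments of one message have arrived, using a bitmask."""

    def __init__(self, fragment_count):
        if fragment_count <= 0:
            raise ValueError("fragment_count must be positive")
        self.fragment_count = fragment_count
        self.received_mask = 0
        self.parts = [b""] * fragment_count

    def add(self, index, data):
        """Store one fragment; duplicates are ignored. Returns True when complete."""
        if not 0 <= index < self.fragment_count:
            raise IndexError("fragment index out of range")
        bit = 1 << index
        if not (self.received_mask & bit):
            self.received_mask |= bit
            self.parts[index] = data
        return self.is_complete()

    def is_complete(self):
        return self.received_mask == (1 << self.fragment_count) - 1

    def missing(self):
        """Indices not yet received, e.g. to request a resend."""
        return [
            i for i in range(self.fragment_count)
            if not ((self.received_mask >> i) & 1)
        ]

    def payload(self):
        if not self.is_complete():
            raise ValueError("message is not complete yet")
        return b"".join(self.parts)


def split(message, size):
    """Cut a message into fragments of at most `size` bytes."""
    return [message[i:i + size] for i in range(0, len(message), size)]
```

Fragments can arrive in any order and more than once; the mask makes both the duplicate check and the completeness check a single integer operation.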

Two weeks of focused work, two load-bearing commits, one functioning multiplayer game-server transport. The honest framing of that timeline is that it was expedient rather than heroic. The architecture had been waiting for the transport. The transport just had to be written.

NetworkInterface is still the transport MECS uses today. The class has been rewritten in pieces — the reliable resend queue got hardened in 2024, the fragment reassembly got more careful about ordering in 2025, the heartbeating got smarter about distinguishing dead connections from briefly silent ones — but the basic shape has not changed since November 2023.

Q1 2024: the legal lag and a C dogfood

The prototype concluded successfully in late 2023. The next phase — a longer-form agreement covering the full game — required everyone involved to sign, which meant the first quarter of 2024 disappeared into legal. I had time on my hands that I had not had since the contract started. Two pieces of pre-platform work fall in that window.

The first was that a new requirement landed partway through Q1. The client would not be an installed binary; it would run inside the cloud service itself, with the rendered output streamed to the player's device. The game we were building would be the first multiplayer title to ship on that service, and at the time the only multiplayer title in development on it. That requirement is what forced the platform stand-up I describe in the next section. It is worth flagging here because it changes the shape of the April work from "build the services this game happens to need" to "build the services any multiplayer game on this platform will need," and the second framing is what I actually started executing against.

The second was that I used the available cycles to confront a question I had been carrying since 2020. Was C# really the right language for MECS? I had defended the choice every time it had come up — the case for it is laid out in the previous post — but I had never actually built MECS in another language and lived inside it, so the case was theoretical. In late February 2024 I started a separate repository, project-cfirelight, and rebuilt the engine from scratch in pure C. No managed runtime. No garbage collector. No reflection. No Roslyn. The exercise was a deliberate dogfood: I wanted to know what my best effort at a C MECS would actually feel like to live inside, day to day, and whether after a few weeks in that environment I would still prefer C#.

The branch ran from February 26 to March 29, 2024 — roughly five weeks. By the end of it cfirelight had a working entity manager, component pools implemented as hand-rolled arena allocators where the caller passes in a block of memory and the framework computes layouts inside it, a query system that walked component-type predicates to match entities, message headers, and a system-execution layer with per-system controls and depends_on arrays. The mapping from the C# concepts to the C concepts was nearly one-to-one. EntityID became eid. ComponentHandle — the ComponentType plus ComponentID pair that is the delta currency in the C# engine — became cref { ctype; cid }. UniqueID became a 16-byte uuid with a spec-conformant generator. The Roslyn incremental generator became a small C codegen/main.c tool serving the same role. The change-driven system skip — the one optimization I claim as my own architectural contribution — ported directly: mecs_system_update builds a types_changed set from the frame's deltas and uses if (!execute_system) continue; exactly the way the 2021 BitArray version does in the C# engine.
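The change-driven system skip mentioned at the end of that paragraph is easy to show in miniature. This Python sketch borrows the names `types_changed` and `execute_system` from the description above; the `System` class, the `watches` field, and the assumption that the systems list is already in dependency order are hypothetical.

```python
class System:
    """A system that only cares about certain component types."""

    def __init__(self, name, watches):
        self.name = name
        self.watches = frozenset(watches)


def update(systems, deltas):
    """Run only the systems whose watched component types changed this frame.

    `deltas` is a list of (component_type, component_id) pairs. `systems` is
    assumed to be sorted in dependency order already.
    """
    types_changed = {ctype for ctype, _cid in deltas}
    ran = []
    for system in systems:
        execute_system = bool(system.watches & types_changed)
        if not execute_system:
            continue  # nothing this system cares about changed; skip it
        ran.append(system.name)  # a real engine would invoke the system here
    return ran
```

On a frame where only `Position` components changed, a system watching `Health` is skipped without being invoked at all, and on a frame with no deltas nothing runs.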

The honest verdict was closer than I had expected. C felt better than I had been carrying in my head. The architectural shape of MECS survived the language change cleanly enough that the C version felt like the same engine, which I had not been certain it would. But after five weeks of working in it, I still wanted to be back in the C# codebase. The reasons were small things that added up — boilerplate I had been letting Roslyn generate now had to be written or codegen'd by hand; the type-safety I had been taking for granted was a discipline I now had to enforce manually; the iteration loop with a separate codegen pass and a fresh compile felt slower than the Unity-driven inner loop on the C# side. None of it was painful enough to argue against working in C if I had to. But by the end of March I had answered the question I had set out to answer. C# was the right choice not because it had been the easy choice, but because I had now lived inside the alternative and still preferred to be back. The decision was earned rather than assumed.

cfirelight did not continue past March 2024. The platform work was about to start, the .NET engine had the gravity of the surrounding Unity and platform code, and the question the dogfood was meant to answer had been answered. The branch stays in the repository as the C reference implementation of the MECS architecture — not because I intend to ship anything on it, but because building it was the experience that let the rest of the year proceed without me second-guessing the language under it.

March 2024: the platform appears

The new client-in-cloud requirement that landed during Q1 had a consequence I had not fully appreciated until I read the platform's developer documentation carefully. The cloud service that would host the streamed client offered no server-side infrastructure. None. No matchmaker, no account system, no instance manager, no shared state primitive — nothing. The game I was building derived a significant part of its identity from its multiplayer features. Cutting them was not a difficult choice; it was an impossible one. Which meant I was not only building the engine, the game-server simulation, and the streamed client. I was also building the entire platform layer underneath all three.

That realization is what made me pick up the phone and call Richard Jose.

Richard — or Jose, as everyone at Big Huge Games called him — is an old friend from the BHG days. After BHG he became CTO at Backflip Studios and then at Scopely, where he has spent the last decade holding the technology stack for some of the largest free-to-play games on mobile. Monopoly Go is the one most people would recognize. The shape of platform work I was about to do was the shape of work Jose had been doing at scale for years. I reached out for advice. We had one phone call. It crystallized the foundation of what I needed to build.

The single most concrete piece of advice from that call was Redis. Jose was emphatic about Redis as the right shared-state primitive for the platform layer I was describing — fast, well-understood, mature client libraries in every language, and at the right level of abstraction for session state, leaderboards, instance directories, and the cross-service publish-and-subscribe patterns that hold a multiplayer backend together. Almost everything I built across the next month sits on top of that recommendation. The Authentication service, the GameDirectory, the asset and type databases, the analytics short-path — all of it leans on Redis in roughly the shape Jose described. He was a huge help. We did not need a second call.

The piece I brought to the conversation, which I want to credit my time inside MECS for, was the insight that the game-server instances themselves could be modeled as platform services. I already had most of the building blocks. MECS had a Service abstraction that ran a deterministic frame loop; it had a message-routing system that already crossed a wire; it had a serialization model that produced the same bytes for the network and for disk. What I did not yet have was a way to consistently package, serve, and update game-server binaries and the asset databases they consumed, at scale, against a fleet of cloud-hosted simulations. That is the gap the GameInstance service and the Assets service were built to close. The platform was, from day one, designed to host arbitrary game-server processes the way a SaaS backend hosts arbitrary worker pods.

I will admit one more thing about that month. Underneath the immediate need — get the platform stood up for one specific game on one specific publishing deadline — was a quieter ambition I could not quite suppress. I was building services I knew I wanted to be the foundation under every Wood Fired Games title, now and in the future. Multiplayer infrastructure the next game would not have to reinvent. The publisher's deadline was the forcing function. The architecture I was reaching for was longer than that.

The wood-fired-platform repository is created on March 30, 2024. The first commit reads, in its entirety:

"initial project setup"

Three days later, on April 2, the Authentication service lands:

"Set up a basic login system with a client and server component."

It is a Redis-backed AccountDatabase with anonymous login, account creation, and token issuance. It does not need to be more sophisticated than that for what the contracted game needs at this stage; the architecture is the point, not the feature set. The Authentication service becomes the central identity service over the next two years (Steam, Google, and Apple OAuth integrations land in late 2025), but the day-one version is the right level of complexity for the day-one game.

Two days after that, on April 4, the Analytics service lands:

"I got an analytics service rolling."

That commit introduces the first PostgreSQL use anywhere in the platform. The choice of database matters: Redis is fine for short-lived session state, but durable analytics needs a real relational store, and Postgres is the obvious default. The Analytics service has stayed mostly the same since this commit — AnalyticsDatabase writes AnalyticsEvent rows to PostgreSQL on a queue, services across the platform publish events asynchronously, dashboards and reporting consume from the same store.

By mid-April the platform has GameDirectory (April 13: a Redis-backed catalog of available game instances), an Assets service (also April 13: versioned asset manifests delivered before session launch), a GameInstance service (April 18, "I stood up a service that runs a game server": the project that spins up authoritative game-server processes and tracks their lifecycle), and a PlatformServices shared library that holds the cross-cutting types every service depends on. Counting Authentication and Analytics, that makes five distinct microservices, plus a shared library, plus a Redis instance, plus a Postgres instance, all stood up across about three weeks by one person who had committed to a single principle: that the same codebase that defines the simulation should define the services around the simulation.

This is the moment MECS stops being shared game code and starts being a platform.

April 2024: the five-layer architecture

The engine repository as a distinct artifact is created on April 9, 2024:

"I'm starting a larger MECS library that I intend to link in future Unity projects. This is also the beginning of a full asset pipeline."

That commit introduces wood-fired-engine as a standalone repo and lays down the architectural split that organizes the codebase today. The framework is divided into five layers, each with a clear boundary and a defined direction of dependency:

Core — the unmanaged primitives. UniqueID, Timestamp, ComponentType, the BitArray* family, FastList<T>, the unsafe Bitwise helpers. Zero dependencies on anything else. The bottom of the stack.
MECS — the simulation framework itself. Service, EntityLibrary, ComponentFactory, the entity-component-system primitives, the message-routing system, the delta-broadcast model. Depends on Core. The engine.
Generate — the code-generation pipeline. Reads database-defined type entries and emits the C# scaffolding that wires components, systems, queries, and interpreters into the runtime. Depends on MECS and Core.
AssetManagement — the database that defines what types exist and what game content is loaded. AssetDatabase, TypeDatabase, the MessagePack formatters that serialize unmanaged types to and from disk. Depends on MECS and Core. Independent of the platform layer above it.
Platform — the server-side services that run on top of the simulation. Authentication, Analytics, GameDirectory, GameInstance, the persistence layer. Depends on everything below it but on nothing above it (because there is nothing above it).

The constraint that holds the architecture together is that dependencies flow strictly downward. Core never imports anything from MECS or Platform. MECS never imports anything from AssetManagement or Platform. The Platform layer can use anything underneath it, but the engine layers do not know the platform layer exists. That allows a game running on the Unity client side to consume Core, MECS, Generate, and AssetManagement without dragging in any of the server-side service code. It also allows the server to run the same engine layers plus the platform services, with full type-system compatibility between the two sides.
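The downward-only rule is mechanical enough to check in code. What follows is a minimal sketch, not the project's actual tooling: the layer names and their allowed dependencies are transcribed from the list above, while `allowed_dependencies`, `find_violations`, and the idea of feeding in a list of (from, to) project references are assumptions made for illustration.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Layer = std::string;
using Reference = std::pair<Layer, Layer>;  // (from, to)

// Allowed dependencies per layer, transcribed from the layer descriptions.
// "Platform depends on everything below it" is expanded to the four engine layers.
const std::map<Layer, std::set<Layer>>& allowed_dependencies() {
    static const std::map<Layer, std::set<Layer>> table = {
        {"Core", {}},
        {"MECS", {"Core"}},
        {"Generate", {"Core", "MECS"}},
        {"AssetManagement", {"Core", "MECS"}},
        {"Platform", {"Core", "MECS", "Generate", "AssetManagement"}},
    };
    return table;
}

// Returns every reference the table does not permit.
// `references` is a hypothetical input, e.g. gathered from project files by another tool.
std::vector<Reference> find_violations(const std::vector<Reference>& references) {
    std::vector<Reference> bad;
    for (const Reference& ref : references) {
        auto it = allowed_dependencies().find(ref.first);
        // Unknown source layers and disallowed targets are both reported.
        if (it == allowed_dependencies().end() || it->second.count(ref.second) == 0) {
            bad.push_back(ref);
        }
    }
    return bad;
}
```

Run against a reference list, a check like this would flag Core referencing MECS or MECS referencing Platform, and would accept Platform referencing any of the four engine layers.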

This split has stayed essentially unchanged from April 9, 2024 to the present. The boundaries it draws are the same boundaries the NuGet package extraction would later carve along (more on that in the next post). The five-layer architecture is not a refactoring; it is the original shape of the engine as a library rather than as a game's source code.

Two weeks after the engine repo is created, on April 23, MECSEditor first appears:

"Adding a new project that gives me a gui that runs in ssh that I can use to edit the asset database."

That commit introduces what is now the central tool for editing the asset and type databases that MECS uses. The original implementation was a TUI — a text-mode interface that could run inside an SSH session, which was important because a lot of the asset-database editing was going to happen on remote servers where a graphical editor was not an option. The tool has grown since then (it is now a full CLI, a dotnet global tool, and an MCP server that lets agents call its commands directly), but the kernel of the design is the same: define types and content in a database, let the code generator emit the scaffolding, edit the database through tooling rather than through manually-maintained C# files.

The final publishing agreement was signed around May 2024. By the time the ink was dry I had the engine repo, the platform repo, the five-layer split, an authoritative-server transport on real sockets, five microservices and a shared library running against Redis and Postgres, and a TUI for editing the type and asset databases. I also had a new expanded vision for what the platform was going to become. The contract was for one game. The architecture I had just stood up was for every game I was going to build after it.

What the platform layer answers

This is the part of the story where the threads from the previous post tie together. The frustration that came out of the education-MMO project — the SaaS-shaped platform team and the games-shaped game team unable to combine cleanly into a shared multiplayer experience — has been hanging over the engine work since late 2020. The five-layer architecture is the answer to that frustration, made concrete.

The simulation and the platform services are one codebase. They share a type system. They share a serialization format. They share an identity model. A change to a component definition propagates from the database through the code generator to both the game-server simulation and the platform-side service that persists the component to durable storage. No cross-team translation layer is required because there is no cross-team boundary. The team building the persistence service and the team building the combat system would be looking at the same mecs-types.registry file. They are not different stacks. They are different services in the same stack.

That property is what made the platform layer worth building from scratch rather than adopting off the shelf. PlayFab, Nakama, GameSparks and the rest would have given me 80% of what the game on the immediate publishing deadline needed, in a week. I built it myself for three reasons that compounded. First, the cloud platform offered no server infrastructure, so the choice was not between "build" and "adopt"; it was between "build" and "do not ship multiplayer," and multiplayer was non-negotiable for the game. Second, the architectural commitment that the server be pure .NET (not Unity) ruled out the off-the-shelf options that expect a Unity runtime on both ends. And third, the one I would have built for even if the first two had not been there, I wanted the deep type-system compatibility between the simulation and the platform services that no off-the-shelf backend can offer. That last property is what closes the gap that had been the most expensive thing about the education-MMO project. The first two are what made the question moot in practice.

I would build the platform layer again. I would build it sooner.

Late 2024: the project ends, the platform does not

The publisher had an internal re-org in the second half of 2024. The details are not mine to share. The outcome that mattered for this story is that the project the platform had been built around was not going to continue. The proof point I had been longing for — the one that would have put the platform under real player load on a real publishing schedule — was going to have to wait.

The structural detail that mattered for what came next is that I had retained all rights to the technology stack. The engine, the platform, the asset pipeline, the transport, every service I had been building since March — all of it stayed with Wood Fired Games. The contract had been structured deliberately to produce that outcome if it came to it. It came to it, and the foresight paid off. I was not walking away from the platform. The platform was walking away from one specific game and toward whatever I pointed it at next.

The decision I made through the end of 2024 was to use the year ahead to harden the platform until it was capable of being put in front of any publisher I might bring to the table — rather than continuing to optimize it against the operational constraints of the first one. The first publisher's requirements had shaped a lot of the original 2024 work. The 2025 work was going to be about turning the platform into something publisher-agnostic, fully owned by Wood Fired Games, and ready for whatever partner came next.

There was a second decision I made in roughly the same window, and it turned out to matter much more than I expected at the time. JetBrains AI Pro — the AI Assistant feature inside Rider — had been a daily tool for me since February 2024. I had been using it as an expert reference: error explanations, build-system advice, second opinions on code I had already written. Through most of the year that was where the practice stopped, because the publisher deadline owned my attention and the AI was a way to move faster inside Rider rather than a thing I was investing in for its own sake. Once the deadline went away in the second half of 2024, I had the room to actually expand the practice. The November expansion is documented in detail in the next post, but the relevant beat for this one is the decision I made at the end of the year: that the AI tooling I had been using as a productivity assistant was worth taking seriously as a strategic investment.

I was not yet letting AI write code for me. I was the one typing every line. The agentic-coding category did not exist yet; Claude Code did not exist yet. What existed was a scoped chat assistant inside the IDE, plus — by the end of November — a local-AI workstation, a filesystem-aware editor (Cursor), and two cloud chat subscriptions (ChatGPT Plus and Claude.ai Pro). The activation energy for picking up new infrastructure tooling — the long list of things I had known about but never committed to learning, Docker and proper CI prominent among them — had dropped enough that I could begin to see a different studio underneath my hands than the one I had been operating all year. The more carefully I looked at the value that practice was delivering, the more I suspected I could lean on it much harder than I had been letting myself.

Where this leaves us

That is where this post ends, on the last days of 2024. The first publishing contract was over. The platform (engine, asset pipeline, transport, services, and the years of architectural conviction underneath all of them) was owned outright by Wood Fired Games. I had no immediate revenue replacing the publisher and a year of hardening work ahead of me. I also had ten months of daily AI-assisted-work practice behind me, an expanded toolchain I had stood up in the last quarter, and a hypothesis about what all of it might enable.

The hypothesis was that AI, used correctly, might be exactly the thing I needed to bring the larger vision the studio had been quietly aiming at since the 2020 thesis into a form an independent studio could actually ship. I had spent two decades watching engine work, multiplayer work, and platform work get done by teams of dozens. I had just spent a year doing all three by myself. And at the end of that year I could see, in something close to peripheral vision, that the bottleneck was about to change.

What that hypothesis has turned into across 2025 and 2026 — the evolution of how I work with AI, the impact it has had on me, and the impact it has had on every line of code I have shipped since — is the subject of the next post in this series.

Six years of an engine

Wed, 06 May 2026 00:00:00 GMT

MECS — the Modular Entity Component System I have been building since 2021 — is the engine underneath every game project I have shipped or attempted in the last five years. It is also the substrate underneath the AI orchestration work I have been writing about elsewhere. This post is the first in a short series on it. The other two will cover the platform that grew up around the engine in 2024 and the AI-era developments — source generators, compile-time diagnostics, the NuGet breakout — that landed earlier this year.

But the engine is six years old as a codebase, give or take. The thinking that produced it is twenty-three.

The honest version of the MECS origin story does not start when the first commit was pushed. It starts in 2003, in a building outside Baltimore, with me trying to figure out how to keep a 28.8-modem-connected eight-player real-time strategy game in sync across machines that had wildly different graphics cards and processors. It runs through nearly two decades of multiplayer game work — most of it at studios that share a lot of personnel and a strong engineering tradition — before it arrives at MECS as code. Without that pre-history, the design choices in MECS look arbitrary. With it, they look almost inevitable.

So let me start where the through-line actually begins.

Building a multiplayer RTS in 2003

I joined Big Huge Games when I was twenty-four. The company was Brian Reynolds's studio at the time, and the project was Rise of Nations. I was, for the duration of the original game's development, the multiplayer programmer — the only multiplayer programmer. RTS games were selling in big numbers in those years. The engineering substrate for building them was, in retrospect, almost charmingly thin.

There were no off-the-shelf engines you could license and build an RTS on top of. Every studio wrote their own. There were no networking libraries — not in any meaningful sense; the Berkeley sockets API and WinSock were what you had, and "high level" usually meant somebody else's wrapper around select(). NATs were a new enough concept that connection establishment between two players behind home routers was actually a research problem. People still played on 28.8 and 56k modems. GPUs were just starting to be a thing the average consumer would buy. The Pentium III was the dominant CPU. Cloud compute did not exist as a category; "deploying a server" meant putting a physical machine in a colocation cage.

The only published references I can remember consulting back then were Tim Sweeney's writings about the early Unreal engine (at that point still the codebase behind the original Unreal, not yet the platform it would become) and Mark Terrano and Paul Bettner's GDC 2001 paper "1500 Archers on a 28.8: Network Programming in Age of Empires and Beyond." We knew the team at Ensemble; the RTS world in those years was a small enough subculture that the studios building games in the genre talked to one another.

The networking architecture that we and Ensemble both used was peer-to-peer lockstep simulation. The idea is straightforward in description and unforgiving in implementation. Every player runs the full simulation locally. Each player broadcasts their input — their moves, in the parlance of the time — at a fixed turn cadence. I remember ours running at roughly six turns per second. At the end of each turn, every machine applies every player's collected input to its local simulation state, advances by one tick, and emits a checksum of the resulting state along with the next turn's moves. If any client's checksum ever fails to match any other client's checksum, the simulations have diverged. The game is out of sync. Anyone who played multiplayer RTS in the early 2000s remembers those words. They ruined a lot of weekends.
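One turn of that loop can be sketched as follows. This is an illustration of the general lockstep shape under stated assumptions, not the Rise of Nations code: the `Move` and `SimState` layouts, the word-wise FNV-style mixing function, and every function name here are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// One player's input for one turn. A real game carries richer commands.
struct Move {
    std::uint32_t player;
    std::int32_t dx;
    std::int32_t dy;
};

// Integer-only state, so every machine computes identical bits.
struct SimState {
    std::uint32_t turn = 0;
    std::vector<std::int32_t> x;
    std::vector<std::int32_t> y;
};

SimState make_state(std::size_t players) {
    SimState s;
    s.x.assign(players, 0);
    s.y.assign(players, 0);
    return s;
}

// Apply every collected move, then advance one turn. Sorting by player index
// removes any dependence on the order in which the network delivered the moves.
void advance_turn(SimState& s, std::vector<Move> moves) {
    std::stable_sort(moves.begin(), moves.end(),
                     [](const Move& a, const Move& b) { return a.player < b.player; });
    for (const Move& m : moves) {
        if (m.player < s.x.size()) {
            s.x[m.player] += m.dx;
            s.y[m.player] += m.dy;
        }
    }
    ++s.turn;
}

// Word-wise FNV-style mix over the whole state (a stand-in for a real checksum).
std::uint32_t checksum(const SimState& s) {
    std::uint32_t h = 2166136261u;
    auto mix = [&h](std::uint32_t v) { h = (h ^ v) * 16777619u; };
    mix(s.turn);
    for (std::size_t i = 0; i < s.x.size(); ++i) {
        mix(static_cast<std::uint32_t>(s.x[i]));
        mix(static_cast<std::uint32_t>(s.y[i]));
    }
    return h;
}

// The game is in sync only if every client reported the same checksum.
bool in_sync(const std::vector<std::uint32_t>& checksums) {
    if (checksums.empty()) return true;
    for (std::uint32_t c : checksums) {
        if (c != checksums.front()) return false;
    }
    return true;
}
```

Two clients that receive the same moves in different arrival orders end the turn with matching checksums. A single stray write to one client's state makes `in_sync` return false on the next comparison.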

Lockstep imposed a discipline that has stayed with me. Every interaction that touched simulation state had to be deterministic. Every random number had to come from a seeded generator the entire connected set of clients agreed on. Every floating-point operation had to be carefully scrutinized, because IEEE 754 arithmetic is reproducible only when every machine performs the same operations in the same order at the same precision. Floating-point addition is not associative, so a compiler or code path that reorders a sum on one machine produces a different result. The simulation could not be allowed to read the wall clock. It could not be allowed to read any input the network had not delivered yet. The simulation had to be a pure function of its prior state and the input sequence. If any line of code violated that purity, the game went out of sync, and we found out about it in the worst possible way: at a milestone playtest, in front of the producer, with a checksum log that pointed at five hundred possible culprits.
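The floating-point hazard is easy to reproduce. A minimal sketch, assuming ordinary 64-bit `double` evaluation; the function name `sum_in_order` is made up for the example:

```cpp
#include <vector>

// Adds the values strictly left to right, the way a simple update loop would.
double sum_in_order(const std::vector<double>& values) {
    double total = 0.0;
    for (double v : values) {
        total += v;
    }
    return total;
}
```

Summing `{1e16, 1.0, -1e16}` gives 0.0, because the 1.0 is absorbed when it is added to 1e16. Summing the same three values as `{1e16, -1e16, 1.0}` gives 1.0. Two machines that walk the same data in different orders therefore disagree, and in lockstep that disagreement is a desync.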

Two further insights shaped the architecture for me, both during Rise of Nations.

Not all state is game state. People in 2003 were buying actual graphics cards — meaningful money, for the time — and they wanted to see the results on screen. A high-end machine could render at sixty frames per second; a min-spec machine struggled at fifteen. But the simulation had to advance at the same rate on every machine, because lockstep. That meant the render frame rate and the simulation tick rate could not be the same number. The simulation needed to advance at its fixed deterministic cadence; the render layer needed to interpolate between simulation samples and update at whatever pace the local machine could sustain.
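The render-side half of that split is commonly written as a fixed-step accumulator plus linear interpolation. The sketch below shows that general technique rather than the original code; `FixedStepClock`, `interpolate`, and the numbers in the usage note are assumptions. In a lockstep game the simulation tick is also gated on turn delivery, which is left out here.

```cpp
// Tracks how much render time has accumulated against a fixed simulation step.
class FixedStepClock {
public:
    explicit FixedStepClock(double tick_seconds) : tick_seconds_(tick_seconds) {}

    // Feed the duration of one render frame; returns how many simulation
    // ticks are now due. The remainder stays in the accumulator.
    int consume(double frame_seconds) {
        accumulator_ += frame_seconds;
        int ticks = 0;
        while (accumulator_ >= tick_seconds_) {
            accumulator_ -= tick_seconds_;
            ++ticks;
        }
        return ticks;
    }

    // Fraction of the way from the last simulation sample to the next, in [0, 1).
    double alpha() const { return accumulator_ / tick_seconds_; }

private:
    double tick_seconds_;
    double accumulator_ = 0.0;
};

// Render state is derived from two simulation samples and is never written back.
double interpolate(double previous, double current, double alpha) {
    return previous + (current - previous) * alpha;
}
```

With a 0.25-second tick and 0.0625-second frames, the first three frames report zero ticks due while `alpha()` climbs through 0.25, 0.5, and 0.75. The fourth frame reports one tick and `alpha()` returns to 0. A unit that moved from 10 to 20 between samples is drawn at 12.5 when `alpha()` is 0.25.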

The implication of that, once you start designing against it, is that you cannot have one state — you need at least two: simulation state and render state. And the moment you have two states, you have a problem. In a large codebase, with twelve programmers, those two states will be mixed within hours of the rule being announced. Somebody will store a particle position in the simulation state because it was convenient. Somebody will compute a damage roll from a render-frame-rate timer because they did not know better. The simulations will diverge, and you will spend two weeks finding the offending line.

The way we solved this on Rise of Nations was to make the C++ compiler enforce the firewall. Every game object was actually three class instances. One held simulation state. One held render state. The third was a bridge — the only object that could write to either of the first two, and only through an input-command queue. The bridge accepted commands that would be processed at a later tick to advance both state sets. There was no other entry point. If a programmer tried to mutate simulation state directly from render code, the compiler refused. The three-class pattern (we called it the tri-class) did not solve the entire problem — bugs still crept in — but it solved enough of it that we could ship the game.
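In C++ terms, a firewall of that kind can be built from private members and `friend` declarations. This is a reconstruction of the idea only, with hypothetical names and fields (`SimState`, `RenderState`, `Bridge`, `DamageCommand`); the real tri-class is not shown in the post.

```cpp
#include <cstdint>
#include <queue>

class Bridge;  // the only type allowed to mutate either state

class SimState {
public:
    std::int32_t health() const { return health_; }
private:
    friend class Bridge;
    std::int32_t health_ = 100;
};

class RenderState {
public:
    float hit_flash() const { return hit_flash_; }
private:
    friend class Bridge;
    float hit_flash_ = 0.0f;
};

// A queued request to change state at a known later tick.
struct DamageCommand {
    std::uint32_t apply_on_tick;
    std::int32_t amount;
};

class Bridge {
public:
    void enqueue(const DamageCommand& command) { pending_.push(command); }

    // Applies every command scheduled at or before `tick`.
    // Assumes commands are enqueued in non-decreasing tick order.
    void process(std::uint32_t tick) {
        while (!pending_.empty() && pending_.front().apply_on_tick <= tick) {
            const DamageCommand command = pending_.front();
            pending_.pop();
            sim_.health_ -= command.amount;  // simulation-side effect
            render_.hit_flash_ = 1.0f;       // render-side effect
        }
    }

    // Everyone else gets read-only views.
    const SimState& sim() const { return sim_; }
    const RenderState& render() const { return render_; }

private:
    SimState sim_;
    RenderState render_;
    std::queue<DamageCommand> pending_;
};
```

Render code that holds a `Bridge` can read `sim().health()`, but a line such as `bridge.sim().health_ = 0;` does not compile, because the member is private and the view is const. The only way to change health is to enqueue a command and wait for its tick.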

That architectural choice — a hard, compiler-enforced separation between simulation state and render state, with all mutation flowing through a queued command system — has been the foundation of every multiplayer game I have worked on since. The exact mechanism has evolved many times. Peer-to-peer lockstep gave way to authoritative-server architectures once bandwidth and cloud compute made server-side simulation viable. The tri-class became, in later projects, a two-class split with the bridge implemented as a separate subsystem. The C++ inheritance hierarchies became component arrays. But the principle has not changed. The simulation and the view never share mutable references. Every change is a command issued to a queue and processed at a known later moment.

Cache coherence is everything once you have hundreds of units. Rise of Nations had hundreds of animated units on the screen at once, plus arrows, plus particles, plus weather, plus a world map. We needed it to run on min-spec hardware. A cache miss on a Pentium III in 2003 was roughly a thousand times more expensive than the average instruction operating on data already in cache. A thousand times. If you missed the cache once per unit per update, you did not have a real-time game. You had a slideshow.

The optimization that mattered most was structural. The naive approach is array-of-structs: an array of Unit objects, each one carrying every field a unit might need — position, health, animation state, AI state, rendering state, owning player, current command. When the update loop walks the units to do, say, a position-update pass, it loads the entire Unit struct into cache for each unit, uses three of the fifty fields, and evicts the rest. The cache hit rate is terrible.

The fix is to invert the layout. Instead of an array of Unit, you have one array per field. An array of positions. An array of healths. An array of animation states. A given unit's data is no longer contiguous in memory — its position is in one array, its health is in another — but the position update loop now walks one tight contiguous array of position data, fits dozens of units in cache at a time, and runs in a fraction of the time. When the render loop wants animation state, it reads only the animation-state array. Each phase of the update touches only the data it needs.
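Side by side, the two layouts look roughly like this. The field names are illustrative and far fewer than a real unit would carry; `UnitAoS`, `UnitsSoA`, and `integrate_positions` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structs: one record per unit. A position pass drags every field
// of every unit through the cache.
struct UnitAoS {
    float x, y;
    float vx, vy;
    int health;
    int anim_frame;
    int owner;
};

// Struct-of-arrays: one array per field. Index i identifies a unit across arrays.
struct UnitsSoA {
    std::vector<float> x, y, vx, vy;
    std::vector<int> health, anim_frame, owner;

    std::size_t add(float px, float py, float velx, float vely, int hp, int player) {
        x.push_back(px);
        y.push_back(py);
        vx.push_back(velx);
        vy.push_back(vely);
        health.push_back(hp);
        anim_frame.push_back(0);
        owner.push_back(player);
        return x.size() - 1;
    }

    std::size_t size() const { return x.size(); }
};

// The position pass reads and writes only the four position/velocity arrays.
void integrate_positions(UnitsSoA& units, float dt) {
    for (std::size_t i = 0; i < units.size(); ++i) {
        units.x[i] += units.vx[i] * dt;
        units.y[i] += units.vy[i] * dt;
    }
}
```

The cost of the inversion is bookkeeping: adding or removing a unit has to touch every array in step, which is the kind of wiring code that later becomes worth generating.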

This pattern has a name now. People call it data-oriented design, or sometimes Entity Component System architecture, with components stored as columns in a table rather than rows. In 2003, I am not certain we had a name for it. We were not consciously building an ECS; we did not yet think of ourselves as building one. We were trying to keep the game playable on a Pentium III with sixty units fighting on screen. The struct-of-arrays optimization was a means to that end. It was, in retrospect, the primordial form of what would become ECS as the term was later codified. We were doing it because the cache architecture of the early 2000s would not let us do anything else.

So by the time Rise of Nations shipped, I had absorbed two ideas that would shape every engine I worked on after: a hard simulation/view separation enforced at the language level, and a data layout that organized state by use-pattern rather than by entity identity. I did not know they would still be the foundational decisions of an engine I was building twenty-plus years later. I just knew they worked.

Sparkypants and the Dropzone engine

In 2011 I helped start Sparkypants Studios with Jason Coleman, one of the original founders of Big Huge Games. There was substantial personnel overlap between the two studios. The engineering culture was continuous — the same emphasis on craftsmanship, the same comfort with low-level systems work, the same instinct that hard things should be built well rather than glued together. That shared culture was what made the next chapter of engine work possible.

Around 2014, a number of us at Sparkypants set out to build a new engine for the game that would become Dropzone. The ECS pattern was crystallizing in the industry by that point. Adam Martin's earlier blog posts had pushed it into the conversation. Game-engine architects were talking about it at GDC. Several teams were converging on data-oriented entity systems independently of one another. We committed to that shape.

I am going to be deliberately non-specific about the architectural details of that engine in this post. It is not mine to describe in public, and the details are not what mattered for the eventual arc into MECS. What mattered, for the purposes of this story, was the experience of building games on top of it. The engine was, in honest engineering terms, extremely efficient at runtime — the kind of efficiency that changes the operational math of running multiplayer game servers at scale. It was also unforgiving to iterate in. The same characteristics that made the engine fast meant that implementing gameplay logic required carrying a much larger mental model than most programmers were used to. The engineering and gameplay teams Sparkypants had were genuinely world-class, and that fact is the only reason Dropzone shipped weekly live-ops updates for months after its Steam early access launch in February 2017 without major regressions. A less experienced team would have been overwhelmed.

By the time Dropzone wound down and I started looking ahead, I was left with two impressions of the engine in roughly equal force. The first was awe — the runtime characteristics were everything I had hoped for and more. The second was reluctance. I did not want to build my next game on that codebase. Not because it was wrong, but because it asked a lot of the people building games on it, and I could see myself burning out trying to find the fun in new mechanics while simultaneously fighting the implementation friction of a deeply technical engine. The engine was a marvel. It was also a forcing function for an engineering-first culture, and I was no longer sure that was the right tradeoff for my own next thing.

A studio, and the education-MMO project

I left Sparkypants in 2018 and incorporated Wood Fired Games LLC shortly after, originally as the contracting vehicle for a Design Director engagement at Bad Robot Games. Wood Fired received its first check made out to the company that December — a contractor payment from an educational gaming company I had started doing some work for in parallel. The studio existed as a legal entity from that point on. It would not become operationally active for another two and a half years.

In 2019 I joined the educational gaming company full-time as EVP of internal game development. Wood Fired Games went dormant. The new company operated a platform that delivered short-form educational games to students through schools, run by one division. My division ran the team responsible for the marquee long-form game experiences the platform's audience played regularly.

In late 2019 the team I led decided to rework the core student experience into a shared multiplayer world — effectively an MMO for an educational audience. The premise was good. Kids working through educational content in a shared persistent space, encountering each other, building together, would produce engagement profiles the existing short-session model could not. The team was capable. The pieces were there.

The execution ran into a problem I had not seen at the studios I had come from. The platform side of the company — the systems delivering the educational sessions, the user identity infrastructure, the analytics — was built by engineers from a SaaS background. Modern web stacks, request/response patterns, database-backed session state, the whole vocabulary of web infrastructure. My team came out of the games industry — engines, simulations, frame-rate-sensitive update loops, networked authoritative state. Both teams were strong. The cultures and the methodologies were different in ways the org charts did not name.

When we tried to combine them into a shared multiplayer world — the kind that required SaaS-shaped identity and analytics on top of game-engine-shaped real-time simulation — the gaps between the teams' default assumptions manifested as gaps in what was technically deliverable to the students. We did our best. The students still got an experience they enjoyed. But the experience we delivered was constrained, in real and visible ways, by what the two teams together could build with the technology each of them had natively.

That bothered me, more than I expected it would. By late 2020 I had decided to move on from the full-time role, and I could not stop thinking about what the same project would have looked like running on top of the kind of engine we had built at Sparkypants for Dropzone — but with the iteration friction of that engine somehow solved. The infrastructure cost would have been lower by an order of magnitude. The simulation throughput would have been higher. The platform-versus-game cultural divide would have been narrower because the game engine and the platform services would have been one codebase. The constraints that limited what we delivered would have been different constraints, and most of them would have been smaller.

I did not, at the time, write any code in response to that line of thought. I just kept thinking about it.

The thesis, late November 2020

By Thanksgiving 2020 I had turned my attention to what Wood Fired Games would build next. The studio had been a dormant LLC for two years at that point — incorporated, capable of receiving income, but operationally idle while I was working full-time elsewhere. With the decision to move on, the studio was about to wake up, and the question was what it should wake up to. The shape of the project came out of two specific frustrations I had been carrying:

The first was Dropzone's iteration friction. I had seen what an ECS engine in C could do operationally. I had also seen what it cost to maintain development velocity on top of it. I wanted to find a way to keep the runtime characteristics — the per-host instance density, the cache-coherent update loop, the deterministic execution — while making it possible for a small team to iterate on gameplay mechanics in something close to the rhythm of a Unity project.

The second was the education project's cross-team technical limit. I wanted the simulation and the platform services around it to be one codebase, written in one language, sharing one type system. If the team building the player-state service did not need to learn a different stack from the team building the game's combat system, the things that limited what we shipped would be smaller things.

Out of those two frustrations came three goals I committed to in writing:

Take the operational efficiency of a real ECS engine. Cache-coherent component storage. Authoritative-server multiplayer at high per-host instance density. Deterministic execution. The same shape Dropzone had.
Multiplayer-first. Build the networking model into the architectural foundation. Not as a bolted-on layer. My background gave me an unfair advantage in the networking side of things, and I wanted the engine to lean into that rather than treat multiplayer as a feature flag.
Iterate fast. Preserve development momentum. The most important ingredient in finding the fun in a game mechanic is the number of times you can iterate on it before you ship. If the engine costs you iterations, the games on top of it will not be as good as the engine's runtime numbers say they should be.

Three goals will not get you to code on their own. I also needed to know which design ingredients would make them simultaneously achievable. My initial guesses, written down at the time, were these:

Modularity with carefully limited side effects. Code reuse across systems, but with state changes flowing through declared paths, not through arbitrary mutation. The lineage from the Rise of Nations tri-class architecture is direct. The command-queue pattern was the right answer in 2003, and I expected it to be the right answer now, generalized to a system layer.

No object-oriented programming. No inheritance hierarchies. No virtual dispatch. No this pointer. Components as data, systems as logic that operates over component queries, no classes that bundle both. The Dropzone engine had taught me that the OO ceremony was costing more than it was paying for, and the parts of it that did pay — encapsulation, polymorphism — could be replaced with explicit type systems and dispatch tables.

Auto-generated boilerplate. The single biggest friction on an ECS engine of the Dropzone shape was the wiring code. Every new component needed factory code, serialization code, network code, and registration code: all of it formulaic, all of it manually maintained, all of it a tax on iteration. If I could generate that automatically from a high-level type definition, the iteration cost would drop dramatically and the engine would start to feel as easy to work in as a Unity project, without losing any of the runtime characteristics.

That was the thesis. ECS efficiency, multiplayer-first, easy iteration. Modularity, no OOP, auto-generated boilerplate. I did not have code. I had a set of decisions I felt I could defend.

The first commits

I did not start writing code immediately after the thesis. The thesis was written in late November 2020. Through the winter and into 2021 I iterated locally on what would become the framework — sketching component models, trying out message-routing patterns, testing serialization approaches — without putting any of it into version control. I was operating Wood Fired Games on my own at that point, splitting time between picking up contractor work (including ongoing engagements with the same educational company I had recently left full-time) and the engine work I was beginning to take seriously. I had not yet built the discipline of frequent commits. The work lived on local disk and migrated between machines as I refined it.

The first commit that survives in git is dated April 17, 2021, on a repository I created for a roguelike side project I was using as the testbed. That commit landed as a substantial volume of accumulated work rather than a fresh start — months of local iteration becoming version-controlled in one push. The repo stayed under continuous iteration for the rest of 2021. By mid-May the simulation was already "reporting the full state to the view" — the sim/view separation principle from the Rise of Nations tri-class, instantiated for the first time in this codebase.

The structural milestone people tend to point at when they walk this repo's history later is December 2, 2021. That commit message reads, in full:

"Built an ECS framework and started building out the game using it."

It is tempting to call that the moment MECS was born. It is more accurate to say it is the moment the framework reached the shape I had been working toward all year. The five primitives I had been iterating on through the spring and summer (Sim, EntityLibrary, ComponentFactory, Component, Entity) landed in their named, recognizable form. Under various renames, those five primitives are still the core of MECS today.

Two weeks later, on December 17, the topological sort landed:

"...make the systems overlap properly and have a dependency hierarchy."

Five days after that, on December 22, the same plumbing was extended to thread the component-delta record through the systems loop and make systems skippable when none of their input component types had changed during the frame. The code comment I left inline at the time states the invariant precisely:

"if nothing has changed then there is no reason to execute the system at all. Systems are ordered such that later systems always depend on components that are controlled by earlier systems. That's required or this optimization won't work."

That comment encodes the contract between the topological sort from December 17 and the skip mechanism from December 22: the optimization is correct only because the systems are already ordered such that producers run before consumers. That is the change-driven skip optimization I describe in detail later in this post, and it is, in retrospect, the architectural moment of the entire year. The same _executionOrder walk and shouldExecute skip are still load-bearing in 2026; the line if (!shouldExecute && !executeAllSystems) continue; runs in every frame of every game I build on MECS today. The next day, on December 23, the unsafe Bitwise helpers showed up:

"...low level Bitwise unsafe code that should be fast."

By early January 2022 there was a code-generation tool ("...a code generation tool to make it quick and easy to add new messages and components") and end-to-end message routing was working ("Got the messages going all the way through the ECS setup and a response sent back to the bot console...").

On January 11, 2022, I pulled the framework out of the roguelike project into two new repositories: WoodFiredFramework and WoodFiredEditor. The framework repo carried the simulation engine, the entity library, the message routing, and the component infrastructure. The editor repo carried the code-generation tool that turned database-defined type entries into runtime C# scaffolding. The day of the extraction, I renamed Sim to Simulation, because once the code was a library other code could consume, the name needed to be a noun rather than an abbreviation.

The early architecture was managed C#. Components were abstract managed classes. The entity library was a ConcurrentDictionary-backed store. The component pools were ObjectPool<T> over managed types. It was correct for a roguelike with hundreds of entities. It was not going to scale to the per-host instance density I had committed to in the thesis. The managed path had to go.

The name appears

On December 13, 2022, a new project (project-space-rts) started, and the name MECS appeared for the first time as a folder under WoodFired/. It was not a side project; it was the canvas for a wholesale rewrite. By Christmas the new structure was in place:

Components became unmanaged C# structs with explicit [FieldOffset] layouts, capped at 512 bytes, no logic — data only.
ComponentFactory<T> where T : unmanaged allocated MemoryOwner<T> blocks instead of pooling managed objects. Component IDs became ushort indices into those blocks, recycled on destruction.
ComponentHandle — a readonly struct pairing a ComponentType and a ComponentID — became the lightweight reference passed between systems.
EntityQuery got a proper API, returning iterable views over the entity table filtered by component-mask predicates.
Messages became ref struct Packet<T> where T : unmanaged, with a MessageHeader for wire framing.
UniqueID showed up as a 16-byte GUID wrapper, the primary key for every registry entry.
The BitArray* family — BitArray32, BitArray64, all the way up to BitArray4096 — landed as the backing for component-presence bitmasks and network fragment tracking.
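
To make the first few items concrete, here is a minimal sketch of what a data-only component with explicit offsets and a handle into its storage block might look like. It is not the MECS source: the Position component, its 12-byte size, the two ushort fields on the handle, and the plain array standing in for a MemoryOwner<T> block are all assumptions made for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

// A plain array stands in for the pooled memory block that backs one component type.
var block = new Position[4];
var handle = new ComponentHandle(type: 7, id: 2);   // type 7 is an arbitrary example value
block[handle.Id] = new Position { X = 1f, Y = 2f, Z = 3f };

// Because the struct is unmanaged, its slot can be viewed as raw bytes with no copy.
ReadOnlySpan<byte> bytes = MemoryMarshal.AsBytes(block.AsSpan(handle.Id, 1));
Position roundTripped = MemoryMarshal.Read<Position>(bytes);
Console.WriteLine($"{bytes.Length} bytes, Y = {roundTripped.Y}");   // 12 bytes, Y = 2

// Hypothetical component: data only, every field at an explicit offset.
[StructLayout(LayoutKind.Explicit, Size = 12)]
public struct Position
{
    [FieldOffset(0)] public float X;
    [FieldOffset(4)] public float Y;
    [FieldOffset(8)] public float Z;
}

// Pairs a component type with a slot index into that type's block.
public readonly struct ComponentHandle
{
    public readonly ushort Type;
    public readonly ushort Id;

    public ComponentHandle(ushort type, ushort id)
    {
        Type = type;
        Id = id;
    }
}
```

The byte view is the point of the exercise: once a component is a fixed-layout struct, reading it out of its block and writing it into a buffer are the same memory operation.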

The acronym is what the framework is. Modular Entity Component System. The modularity is the side-effect-limited composition that came out of the 2020 thesis. The entity-component-system is the formalized ECS shape that Dropzone had codified and Rise of Nations had foreshadowed. The naming was not the important part.

The important part was the performance unlock. The early framework had been managed C# — abstract component classes, dictionary-backed entity stores, object pools over managed types. That shape was correct for a roguelike with a few hundred entities, and it had been hitting a ceiling I could not push past while staying in the managed lane. The unmanaged rewrite was the experiment that found a way through. Once components became unmanaged structs in explicit memory layouts and Span<T> was used aggressively across the hot paths, the framework's performance closed most of the gap with native C++. A side benefit was that the wire format and the in-memory format converged — component state could be serialized directly to the network without any managed boxing path — which made authoritative-server multiplayer feasible later. But the convergence was a consequence of the rewrite, not its motivation. The motivation was that I had found a way to make C# go fast enough.

Cross-project hardening

Through 2023 MECS lived as Unity source code embedded across a series of game prototypes. None of them shipped. All of them mattered. project-empire started in April. project-zombies in May. project-flame, project-vampire, and project-dwarves followed across the summer and into September. Each one stress-tested a different shape of game and surfaced a different class of bug in the framework. By the end of 2023, MECS had survived roughly ten Unity prototypes spanning strategy, simulation, action, and roguelike-adjacent genres. The framework's surface area at that point was: a per-tick execution model with topologically-ordered systems, a message-routing system with Send and Loopback modes, a code-generator that ate database-defined types and emitted C# wiring, an unmanaged-struct component pool, an entity store with component-mask queries, and a delta-broadcast mechanism that would later become the basis of authoritative-server multiplayer.

The framework had still not served as a production backend or run an authoritative-server multiplayer game in earnest. That changed in October 2023. But that is the next post.

What ended up mattering

The design decisions that survived all the way from the original 2020 thesis to the codebase as it stands today are these:

Sim/view separation, enforced at the language level. The simulation runs deterministically as a function of input. The view consumes deltas. They never share mutable references. The lineage from the Rise of Nations tri-class is direct — same principle, different mechanism. This single decision is why MECS games can ship with multiple clients (terminal, graphical, AI-driven) running against the same authoritative server.

Command-queued state changes. All mutation flows through messages that are dispatched to interpreters at a known later tick. Systems do not reach into the entity library and mutate components in place. This is the second decision that comes straight from 2003. It costs an indirection. It makes the entire state machine inspectable, replayable, and serializable.
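
A small sketch of the idea, under heavy simplification: one hypothetical command type (SetHealth), one interpreter, and a dictionary standing in for component storage. None of these names come from MECS. What it shows is the property the paragraph above describes: because the command log is the only way state changes, replaying the log against a fresh simulation reproduces the same state.

```csharp
using System;
using System.Collections.Generic;

var sim = new Sim();
sim.Enqueue(new SetHealth(entity: 1, value: 90, applyAtTick: 1));
sim.Enqueue(new SetHealth(entity: 1, value: 75, applyAtTick: 3));
for (int i = 0; i < 3; i++) sim.Tick();
Console.WriteLine(sim.Health[1]);      // 75

// Replaying the recorded log against a fresh simulation reproduces the same state.
var replay = new Sim();
foreach (var command in sim.Log) replay.Enqueue(command);
for (int i = 0; i < 3; i++) replay.Tick();
Console.WriteLine(replay.Health[1]);   // 75

// Hypothetical command message: plain data describing one requested mutation.
public readonly struct SetHealth
{
    public readonly int Entity;
    public readonly int Value;
    public readonly int ApplyAtTick;

    public SetHealth(int entity, int value, int applyAtTick)
    {
        Entity = entity;
        Value = value;
        ApplyAtTick = applyAtTick;
    }
}

public sealed class Sim
{
    private readonly List<SetHealth> _pending = new List<SetHealth>();
    private readonly List<SetHealth> _log = new List<SetHealth>();
    private readonly Dictionary<int, int> _health = new Dictionary<int, int>();

    public int CurrentTick { get; private set; }
    public IReadOnlyDictionary<int, int> Health => _health;
    public IReadOnlyList<SetHealth> Log => _log;

    // Callers request a change here instead of touching the state directly.
    public void Enqueue(SetHealth command)
    {
        _pending.Add(command);
        _log.Add(command);
    }

    // Advance one tick, then apply every queued command that is now due.
    public void Tick()
    {
        CurrentTick++;
        var stillPending = new List<SetHealth>();
        foreach (var command in _pending)
        {
            if (command.ApplyAtTick <= CurrentTick)
                _health[command.Entity] = command.Value;   // the only place state mutates
            else
                stillPending.Add(command);
        }
        _pending.Clear();
        _pending.AddRange(stillPending);
    }
}
```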

Unmanaged components only, no logic. Components are pure data in explicit memory layouts. Behavior lives in systems. The cost is that components cannot carry strings, lists, or managed references. The payoff is performance approaching native — dense contiguous memory, zero allocation in the hot paths, and aggressive Span<T> use throughout. Wire-format alignment is a side benefit; the performance is the goal.

Database-driven type definitions. New components, systems, queries, and messages start as rows in an asset database, not as new C# files. The runtime knows about them because the code generator reads the database and emits the wiring. This decouples type definition from build cycles and answers the auto-generated-boilerplate part of the 2020 thesis.

Network-first messages. Every message is an unmanaged struct with explicit serialization, including messages that only flow inside the same process. The same framing that goes over UDP goes through the in-process Loopback queue. Any local message can become a network message later without rewriting.
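
As an illustration of why that works, the sketch below frames a message into bytes and passes it through an in-process queue. The header fields, the MoveOrder payload, and the Framing helper are assumptions for the example, not the MECS wire format; the Queue<byte[]> is only a stand-in for the Loopback path. The receiving side never learns whether the frame came from the queue or from a socket.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

var loopback = new Queue<byte[]>();   // in-process stand-in for a transport

// Sender: frame the message exactly as it would be framed for a socket.
loopback.Enqueue(Framing.Pack(messageType: 42, new MoveOrder { Entity = 5, X = 1.5f, Y = -2f }));

// Receiver: the same bytes could have arrived from a UDP receive call instead.
byte[] frame = loopback.Dequeue();
MessageHeader header = Framing.ReadHeader(frame);
MoveOrder order = Framing.ReadPayload<MoveOrder>(frame);
Console.WriteLine($"type {header.MessageType}, {header.PayloadBytes} payload bytes, entity {order.Entity}");
// type 42, 12 payload bytes, entity 5

// Hypothetical header layout: message type, then payload length in bytes.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct MessageHeader
{
    public ushort MessageType;
    public ushort PayloadBytes;
}

// Hypothetical payload: an unmanaged struct with no managed references.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct MoveOrder
{
    public int Entity;
    public float X;
    public float Y;
}

public static class Framing
{
    private static readonly int HeaderSize = Marshal.SizeOf<MessageHeader>();

    // Copy the header bytes, then the payload bytes, into one contiguous frame.
    public static byte[] Pack<T>(ushort messageType, T payload) where T : unmanaged
    {
        ReadOnlySpan<byte> body = MemoryMarshal.AsBytes(MemoryMarshal.CreateReadOnlySpan(ref payload, 1));
        var header = new MessageHeader { MessageType = messageType, PayloadBytes = (ushort)body.Length };
        ReadOnlySpan<byte> head = MemoryMarshal.AsBytes(MemoryMarshal.CreateReadOnlySpan(ref header, 1));

        var frame = new byte[head.Length + body.Length];
        head.CopyTo(frame);
        body.CopyTo(frame.AsSpan(head.Length));
        return frame;
    }

    public static MessageHeader ReadHeader(ReadOnlySpan<byte> frame)
        => MemoryMarshal.Read<MessageHeader>(frame);

    public static T ReadPayload<T>(ReadOnlySpan<byte> frame) where T : unmanaged
        => MemoryMarshal.Read<T>(frame.Slice(HeaderSize));
}
```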

Topologically-ordered, change-driven system execution. Each system declares two things as part of its type definition: the set of component types it reads from, and the set of component types it writes to. The single hard constraint underneath those declarations is that only one system in the service is allowed to claim write access to any given component type. From those declarations the framework builds a DAG, validates the absence of cycles, and produces a startup execution order. So far this is a standard ECS pattern.

The non-standard piece — and the one architectural shape in MECS I am willing to claim as my own contribution rather than as my application of an existing idea — is that the same declarations make systems skippable at runtime. Because every component mutation flows through the delta broadcast, and because each system has a known set of read inputs, the framework can ask before executing a system whether any of that system's input component types have actually changed since the last tick. If none of them have, the system has no new input to process and therefore no new output to produce, and it does not run. Most frames have a substantial fraction of systems that meet this condition. The result is a fixed-cost startup-time topological sort paired with a per-frame change-detection pass, and the runtime ends up spending cycles only on systems that actually have work to do. The performance gain from that single design choice has been significant on every game I have run on this engine.
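
The two halves of that mechanism can be sketched in a few dozen lines. This is an illustration of the contract, not the MECS implementation: SystemDef, Scheduler, and the string component names are hypothetical, and the sketch assumes that a system which runs changes everything it writes. Only the shouldExecute and executeAllSystems names mirror the line quoted earlier in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Declared deliberately out of order; the sort puts producers before consumers.
var systems = new List<SystemDef>
{
    new SystemDef("Collision", reads: new[] { "Position" }, writes: new[] { "Contact" }),
    new SystemDef("Movement",  reads: new[] { "Velocity" }, writes: new[] { "Position" }),
    new SystemDef("Input",     reads: new[] { "Command" },  writes: new[] { "Velocity" }),
    new SystemDef("Cooldown",  reads: new[] { "Timer" },    writes: new[] { "Ability" }),
};
List<SystemDef> order = Scheduler.BuildOrder(systems);
Console.WriteLine(string.Join(" -> ", order.Select(s => s.Name)));
// Input -> Movement -> Collision -> Cooldown

// A frame where only Command changed: Cooldown has no new input and is skipped.
var ran = Scheduler.RunFrame(order, new HashSet<string> { "Command" }, executeAllSystems: false);
Console.WriteLine(string.Join(", ", ran));
// Input, Movement, Collision

public sealed class SystemDef
{
    public string Name { get; }
    public string[] Reads { get; }
    public string[] Writes { get; }

    public SystemDef(string name, string[] reads, string[] writes)
    {
        Name = name;
        Reads = reads;
        Writes = writes;
    }
}

public static class Scheduler
{
    // Startup cost: enforce the single-writer rule, then order writers before readers.
    public static List<SystemDef> BuildOrder(IEnumerable<SystemDef> systems)
    {
        var all = systems.ToList();
        var writerOf = new Dictionary<string, SystemDef>();
        foreach (var system in all)
            foreach (var component in system.Writes)
                if (!writerOf.TryAdd(component, system))
                    throw new InvalidOperationException($"{component} has more than one writer");

        var order = new List<SystemDef>();
        var visiting = new HashSet<SystemDef>();
        var done = new HashSet<SystemDef>();

        // Depth-first walk: a system is emitted only after every producer it reads from.
        void Visit(SystemDef system)
        {
            if (done.Contains(system)) return;
            if (!visiting.Add(system))
                throw new InvalidOperationException($"cycle through {system.Name}");
            foreach (var component in system.Reads)
                if (writerOf.TryGetValue(component, out var producer) && producer != system)
                    Visit(producer);
            visiting.Remove(system);
            done.Add(system);
            order.Add(system);
        }

        foreach (var system in all) Visit(system);
        return order;
    }

    // Per-frame cost: run a system only if one of its inputs changed this frame.
    public static List<string> RunFrame(IEnumerable<SystemDef> order, HashSet<string> changed, bool executeAllSystems)
    {
        var ran = new List<string>();
        foreach (var system in order)
        {
            bool shouldExecute = system.Reads.Any(component => changed.Contains(component));
            if (!shouldExecute && !executeAllSystems) continue;
            ran.Add(system.Name);
            foreach (var component in system.Writes) changed.Add(component);   // simplification
        }
        return ran;
    }
}
```

The skip is only safe because of the ordering. If Movement could run before Input, Movement would check for a Velocity change before Input had produced one, and the frame would silently drop work.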

The same DAG carries one more piece of information I have not yet exploited: independent branches of the graph could be executed in parallel without contention, because the single-writer rule guarantees there is no shared mutable state between them. Parallel system execution is a direction I have not yet taken. The change-driven skip is what runs today.

Most of those patterns are not original to me. The games industry has been using versions of them for decades. The one I would actually claim as my own contribution is the change-driven skip optimization made possible by the single-writer-per-component-type rule described above — a shape I have not seen elsewhere and that produces a real per-frame performance win. What is otherwise mine is the integration: a single C# codebase that runs the same deterministic simulation across Unity, headless .NET, terminal clients, and AI-driven clients, with authoritative-server multiplayer baked in from the architectural level, on top of a set of design choices that have stayed stable since 2020.

Why a custom ECS at all

The obvious question is whether any of this needed to be custom. Unity has DOTS. Bevy has its own ECS. Entitas exists. Several smaller frameworks exist. Why not pick one and ship?

The short answer is that I needed the engine to run identically on the Unity client and on a .NET backend, with the same component schemas, the same system code, and the same wire format. None of the existing options solved that cleanly enough at the start. The simulation logic, the type system, the system ordering, and the network framing had to be one codebase running in two environments. That requirement narrowed the search until building it myself was the most defensible option.

The longer answer is the thesis from late 2020. The engine I wanted is opinionated in ways a general-purpose ECS cannot be. Components are unmanaged-only because the messages that carry them are unmanaged-only, and the messages are unmanaged-only because the wire format requires it. Systems declare control and dependency relationships because the topological sort needs them. The whole framework leans into a specific shape of game: authoritative-server multiplayer at high per-host density, deterministic simulation, viewer-independent state, fast iteration cycles. That shape rules out a lot of other shapes. A general ECS has to support all the shapes. I only need to support one well, and the cost of not being general is paid back in the parts of the framework I do not have to write.

I would build it again.

Why C#?

The other obvious skeptical question is the language choice. C# is not the language you pick if your primary criterion is raw performance. It carries garbage collection, JIT compilation, and a runtime heavier than the C and C++ stacks most performance-critical multiplayer engines run on. I picked it anyway, for reasons that started partly emotional and became defensible engineering reasons over time.

The starting reasons were three. First, I enjoy writing C#. After decades of professional work that included substantial time in C++, I had come to prefer the language's syntax, its tooling, and the day-to-day experience of operating in it. Personal preference is not an engineering argument, but it is the argument that determines whether I will still be working on this framework five years later. Second, Unity uses C#. If MECS was going to run on the Unity client side and on a .NET backend as one codebase, picking the language Unity already supports was the obvious move — and I had no interest in committing to writing a renderer of my own when Unity, Godot, MonoGame, and FNA all existed. Third, a .NET backend is materially cheaper and easier to operate at scale than a native-code backend. Build pipelines, deployment, debugging, observability, the surrounding ecosystem — all of it is more mature for managed languages now than it was a decade ago, and most of the cost of operating a multiplayer game lives in that ecosystem rather than in raw CPU cycles.

The honest counter-argument is the performance gap. C# code naively written runs slower than C++ code naively written, and for some workloads the gap is large. That gap was the original motivation for the late-2022 unmanaged rewrite described earlier in this post. Once components became unmanaged structs in explicit memory layouts and Span<T> was used aggressively across the hot paths, most of the gap closed. The framework approaches native performance not because C# is fast by default but because C# can be fast if you stay in a narrow lane of the language.

The deep-dive resource that pushed me over the threshold on "can C# really run this fast" was a personal website maintained by Jackson Dunstan — jacksondunstan.com. Dunstan publishes a body of careful, benchmark-driven writing on optimizing C# performance, almost all of it backed by hard data. I encountered it in late 2022 and read everything he had written. I came out the other side convinced C# was a defensible choice for the kind of engine I was building. If you are evaluating C# for high-performance work and the conventional wisdom in your head says C# is slow, that site will rebuild your priors. Credit where it is due.

Where this leaves us

By the end of 2023, MECS was a battle-tested engine in search of a production environment to prove itself in. The architectural decisions that came out of two decades of multiplayer game work had been re-expressed inside a modern C# codebase, the unmanaged rewrite had landed, and the cross-project hardening had been done. What was missing was a real game with real network traffic against a real production backend.

That arrived in October 2023, with a publishing contract for a real-time multiplayer game on a cloud gaming platform. The engine ran into the network layer at production scale for the first time. The next post in this series picks up there: the UDP transport, the backend services that grew around the engine, the asset pipeline, the year MECS stopped being shared game code and became a platform.

The post after that turns to the AI-era developments — the source-generator migration, the compile-time diagnostics, the NuGet breakout that happened earlier this year — and the argument that making a framework AI-legible is itself an engineering discipline worth doing deliberately.

This first post is the foundation. The engine has been twenty-three years in the making and six years on disk. The rest of the series is what those decisions enabled.