article
Stop talking to your agents.
The dominant way people work with coding agents is conversational — describe, correct, re-describe. That mode has a hard ceiling. The more durable lever is to constrain the environment so the agent physically cannot produce the wrong thing. In C#, the place to do that is the compiler. Here is how I flipped my engine's pipeline from human-first to agent-first, and how Roslyn turned a class of rules I had been writing into prose for years into build errors that neither a human nor an agent can miss.
There is a default posture people fall into with coding agents, and it is conversational. You describe what you want. The agent does most of it and gets a piece wrong. You explain the piece it got wrong. It fixes that and breaks something adjacent. You explain again. The loop is familiar to anyone who has spent real hours with these tools, and it works — up to a point. The point where it stops working is the point where the rule you keep having to restate is unusual. Unusual relative to what, is the question that matters, and the answer is: relative to the entire corpus of public code the model was trained on. When your codebase enforces a convention that almost no other codebase in the world enforces, no amount of conversation makes it stick. The model regresses to the mean of everything it has ever seen, and the mean does not include your rule.
I have spent the better part of two years learning this the expensive way, and the lesson I have landed on is blunt enough to put in a title. Stop talking to your agents. Or rather — stop only talking to them. Conversation is a probabilistic lever on a probabilistic system, and there is a more durable lever available, which is to change the environment the agent operates in so that the wrong answer is no longer reachable. In a C# codebase that lever has a specific name. It is the compiler.
The pipeline was built for me, and the AI hated it
To explain why I ended up here I have to describe a pipeline, because the whole argument grew out of watching that pipeline collide with an agent.
For years my engine — MECS, the networked entity-component-system I have been building since 2021 — has been authored through tools rather than through raw code. The workflow was the one most engine developers would recognize. I take a gameplay idea and decompose it into the systems and components needed to express it. I specify those elements in the engine editor, using tools designed to keep me from making mistakes — the editor will not let me define a component that violates the engine’s invariants, the same way a good form will not let you submit a malformed date. The editor then takes those definitions and generates the C# that actually runs at runtime. If you have used Unreal’s Blueprint, the shape is not so different: a structured authoring surface on top, generated executable artifacts underneath, and a body of data in the middle that describes everything the tools know about your types.
That middle layer of data is the part that matters for this story, so hold onto it. The generated code is downstream of a registry of type definitions. Those definitions are assets in their own right — they drive the editor, the database serialization, the network layer, the tooling. The generated .cs files are not the source of truth. The data is. The code is an artifact derived from the data.
When I started pointing a coding agent at this engine, it could not follow the pipeline, and the reason was structural rather than a matter of capability. The agent had been trained on millions of repositories in which you express an idea by writing code. It had been trained on essentially zero repositories in which you express an idea by populating a proprietary editor’s registry so that a code generator can emit the implementation. The sequence I had spent years refining — concept, to tool-authored definition, to generated code — was so far outside the distribution of how software is normally built that the agent kept trying to shortcut it. It wanted to go straight from concept to hand-written code, because that is what every example it had ever seen does.
I knew the agent could produce code equivalent to what my tools generated. That was never in doubt. The problem was that if I let it do that, I would lose the data layer — the registry, the assets, everything downstream of them. The generated code without the data behind it is a car with no factory. You can drive the one you have; you cannot make another.
So I had two bad options and went looking for a third. Option one: force the agent to drive the editor. I tried; it fought me, and an agent fighting its training data is a slow, lossy, expensive thing to supervise. Option two: let the agent write code directly and abandon the data layer. Unacceptable. The third option, the one I took, was to invert the relationship between the code and the data so that the agent could work the way it wanted to — by writing code — while the data layer got reconstructed from that code automatically.
Attributes, so the agent can write code and I keep my data
The mechanism is custom C# attributes. The agent writes ordinary-looking C# and decorates it with engine attributes, and a build-time pass reads those attributes and produces the data assets that used to be authored by hand in the editor. The pipeline went from concept → tool-authored definitions → generated code to concept → agent-written code with attributes → generated data assets. Same destination. The data layer survives. But the human-hostile-to-agents authoring step in the middle got replaced by the single most in-distribution activity a coding model has: writing a class and putting an attribute on it.
In practice this looks as plain as it sounds. A component is a struct with a marker attribute:
[MECSComponent]
public partial struct CreatureStats
{
[Key(0)][FieldOffset(0)] public EntityID OwnerID;
[Key(1)][FieldOffset(8)] public int Health;
[Key(2)][FieldOffset(12)] public int Strength;
}
A system declares, in attributes, exactly which component types it controls and which it depends on:
[MECSSystem]
[Controls(typeof(CreatureStats))]
[DependsOn(typeof(TurnClock))]
public sealed partial class CombatSystem { /* ... */ }
There is nothing exotic on the surface. That is the entire point — the surface is deliberately ordinary so the agent can produce it fluently. The exotic part lives underneath, where a build-time pass walks every type carrying one of these attributes and regenerates the registry, the serialization glue, the identity boilerplate, and the rest of the data layer that used to be hand-authored. When I migrated the engine to this model, hand-maintained generated files came out by the dozens across the game repos — they were no longer authored, they were derived. The agent writes the in-distribution thing; the build reconstitutes the out-of-distribution thing it would never have produced on its own.
That alone would have been a good trade. It is not the part I want to convince you of.
The compiler is a place to put your rules
Once you have an attribute surface, you have given the compiler something to reason about, and in C# the toolchain that reasons about it is Roslyn — the compiler-as-a-platform that lets you run your own analysis as part of every build. The attributes I added so the agent could author code turned out to be the foothold for something I valued more: a place to enforce the rules of my engine that had previously lived only in prose.
Every engine has a pile of these rules. They are the conventions that experienced people on the team know and newcomers violate — “components must start with their owner field,” “a system can’t depend on a type it exclusively owns,” “this thing must fit in a UDP datagram.” Historically those rules live in documentation, in code review, in the head of whoever has been here longest. They are real, they are load-bearing, and they leak constantly, because a rule that is only written in prose is enforced only by attention, and attention is the most exhaustible resource on any team.
Roslyn lets you take a rule like that and make it a property of the build. I now ship a set of compile-time diagnostics with the engine, and each one takes a failure mode that used to be discovered at runtime — or worse, in a multiplayer desync three weeks later — and moves it to the moment of compilation. A few concrete ones, because the specifics are the argument:
A component in MECS must fit in 512 bytes. The diagnostic itself states the proximate reason — it has to fit in the component buffer — but the deeper reason lives one layer down, in the network code, where state is fragmented into messages of at most 512 bytes and the socket is opened with fragmentation disabled. A 512-byte ceiling is therefore exactly the size at which a component can never be split across UDP datagrams on the wire. Violate it and the component crashes at runtime during buffer allocation. That is now a build error:
error MECS0009: Component 'WorldSnapshot' has size 640 bytes, exceeding the
512-byte limit. Components must be <= 512 bytes to fit in the component buffer.
A system that writes to a component it never declared access to used to crash deep inside query construction at runtime. It is now caught the instant you type it, live in the editor, by an analyzer rather than the generator:
error MECS0012: GetWritable<CreatureStats> used in 'CombatSystem' but
'CreatureStats' is not declared in Controls or Writes attributes. Undeclared
write access crashes at runtime during EntityQuery construction.
And my favorite, because the message does the teaching, is the one that fires when a system declares it both controls a type and depends on changes to that same type — a contradiction that used to produce a silent ordering bug in the system dependency graph:
error MECS0008: Self-edge detected: 'CreatureStats' appears in both Controls
and DependsOn. A system cannot depend on changes to a type it exclusively
controls — move the type to Reads if you only read it, or remove it from
DependsOn if you truly own it.
Read that message again and notice what it is. It is not an error. It is a prompt. It states the rule, explains why the rule exists, and tells you the two specific ways to satisfy it — and it does all of that at the exact moment the mistake is made, addressed to whoever made it, whether that is a person or an agent. Sixteen of these now ship with the engine. Not all of them are hard errors — a couple are warnings, and the lowest-numbered ones are really registry bookkeeping rather than logic bugs — but the load-bearing majority do the same job the three above do: they convert a category of bug that used to surface late, at runtime or in a desync, into a build failure raised the instant the offending code is written.
The rule that would not stick
Here is the one that made me a believer, told in simplified form because the real implementation carries more engine-specific weight than the point needs.
Systems in my engine are supposed to be stateless. That is a determinism requirement with a long lineage — it is the same discipline I learned writing lockstep RTS networking in 2003, where any state a system carried across simulation steps was a desync waiting to happen. But “stateless” stated baldly is too strong, because in practice it is genuinely useful to cache some component data in a field on the system for the duration of a single frame. The problem was never that a field exists on a system. The problem is precisely and only when the value in that field is allowed to survive into the next frame. Within a frame: fine, useful, encouraged. Across frames: a determinism bug that will eventually cost someone a weekend.
That is a subtle rule. It is “this field may hold state, but only transiently, and the boundary is the frame.” I had it written down in the obvious places — context files, the engine’s conventions docs, the subagents I run, the skills they load. And it still leaked into real code on a semi-regular basis, from agents and, if I am honest, from me. It leaked for exactly the reason I opened with: through the lens of all the code in the world, a field that holds state between frames is the most normal thing imaginable. Nearly every class in nearly every codebase does it. The rule asks the model to suppress its single strongest prior about how objects work, on the basis of a paragraph it read twenty thousand tokens ago. Prose was never going to win that fight.
So I stopped writing the rule down and started building it in. It took a second Roslyn analyzer to do it — a sibling of the diagnostics above rather than literally the same one, living in its own architecture-analysis assembly — but the principle is identical: a field that may legitimately cache within a frame is marked as such, and the analyzer flags any field whose value is allowed to outlive the frame. The genuinely dangerous shapes are error-severity and stop the build outright; a broader catch-all rides along as a warning to keep the surface honest. When one of the error-level invariants is violated the build fails, with a message that explains the frame-boundary rule in the same teaching voice as the self-edge error. And the behavior I observed after that was the whole reason I am writing this post. The leak stopped. Not mostly stopped — stopped. Because now, on every violation, the agent receives the rule restated at full strength at precisely the moment it matters, in a channel it cannot ignore, and it fixes the code correctly every single time. The context that used to sit dormant in a docs file three thousand lines away gets reactivated by the build failure exactly when it is relevant, and the fix follows immediately.
Better context only goes so far
The general claim I want to leave you with is this. The reliability of a coding agent is not primarily a function of how well you describe what you want. It is a function of how small you can make the space of wrong answers it is able to produce. Context — prompts, docs, skills, examples — shrinks that space probabilistically. It nudges the distribution. It is necessary and I am not telling anyone to stop writing it. But it degrades over a long session, it competes with ten thousand training examples pulling the other way, and it can always, in principle, be missed. A compiler error cannot be missed. It is not a nudge on a distribution; it is a wall. The wrong answer does not get generated and then caught — it does not compile, so the loop does not advance until it is right.
The reframe that follows is that the most valuable thing you can do for an agent’s reliability is often not to write a better prompt but to build a better failure. Take the rule that keeps leaking, the one that is unusual relative to the rest of the world’s code, the one your context files keep failing to enforce — and find a way to make violating it a build error with a descriptive message. The message is the prompt, except it is delivered with perfect timing, perfect relevance, and no possibility of being skipped. You are no longer hoping the agent remembers the rule. You have made the rule a precondition of the code existing at all.
And the quiet bonus, the thing that makes me confident this is not just an AI trick: it is exactly as good for the humans. The stateless-state rule was hard for me to remember too. The 512-byte limit caught my mistakes before it ever caught an agent’s. Every rule I have moved from prose into the compiler has made the codebase more legible to every contributor, carbon or silicon, because the knowledge stopped living in someone’s memory and started living in the build. That has always been the right direction for a codebase to evolve. Agents did not create the principle. They just raised the stakes high enough that I finally did the work.
The conversational mode is not going away, and it should not — it is how you do the open-ended, exploratory, genuinely novel work where no rule exists yet. But for the parts of your system that have hard invariants, stop relitigating them in prose on every session. Encode them. Let the compiler hold the line, for the agent and for you.
Companion post: the longer arc of how AI moved from a teacher to a worker in my studio.
next