← Writing

5 February 20268 min read

Writing system prompts and CLAUDE.md files that actually work

Persistent instructions for AI tools are a craft. Most engineers write them once, badly, and never revisit them. Here is how to write system prompts and CLAUDE.md files that compound over time.

ClaudeSystem promptsAI toolsWorkflowAgentic dev

The neglected lever

Every engineer who uses Claude Code has a CLAUDE.md file — or should. Every team that deploys an AI assistant has a system prompt. These files are the persistent instructions that shape every interaction the model has in that context.

Most are written once, in the first hour of setup, and never revisited. Most are too generic to do useful work.

The engineers getting the most leverage from their AI tools are the ones who treat their system prompts and CLAUDE.md files as living documents — refined continuously based on where the default model behavior diverges from what they need.

What belongs in a CLAUDE.md file

A CLAUDE.md file is not documentation. It is an operating brief. The distinction matters: documentation describes what exists; an operating brief describes how to work in this context.

What to include:

Architecture that is not obvious from the code. The conventions, the non-obvious constraints, the decisions that were made for reasons that are not visible in the files. "We use optimistic locking here because this module is accessed from three separate services" is not visible in the code. The model needs to know it.

Explicit scope constraints. Which files and directories are in scope for modification. Which must not be touched. What the boundaries of a typical task are in this codebase. These constraints prevent the most common Claude Code failure mode: touching things that were out of scope.

Style and convention. Not generic style guide content — the specific deviations from defaults in this codebase. Where you use a pattern that differs from the standard. Where you have explicitly chosen consistency over correctness.

The test contract. What the test suite covers, what it does not, and what must pass before any change is considered complete. The model cannot infer your quality bar from the tests alone.

What not to include:

Generic best practices. The model already knows them. Instructions to "write clean code" or "handle errors properly" add noise without signal. Be specific about what is non-standard in your context; trust the model's defaults for everything else.

Iterating a system prompt

A system prompt is wrong on first write. This is not a failure — it is the starting point.

The iteration discipline: every time the model produces an output that requires significant correction, ask why. Usually the answer is one of:

A system prompt that has been through fifty debugging cycles is categorically different from one written in an afternoon. The former is a compressed record of every non-obvious thing about working in that context. The latter is a list of general aspirations.

The specificity principle

Vague instructions produce vague compliance. Specific instructions produce specific compliance.

Vague: "Be careful with database operations."

Specific: "Never run DELETE or UPDATE statements without a WHERE clause. Never modify the users or payments tables without an explicit instruction to do so. Always use transactions for operations that modify more than one table."

The specific version is checkable. You can verify whether the model followed it. The vague version is not — "careful" is undefined.

Rewrite every vague instruction in your system prompt as a checkable constraint. If you cannot state it as something specific enough to be verifiable, you do not yet know precisely what you want.

The layered architecture

For teams, system prompts have a layered architecture worth making explicit:

Global layer: conventions that apply across all uses of the model in all contexts. Language preferences, tone, output format defaults, the team's definition of "done."

Project layer: constraints specific to this codebase. Architecture, scope limits, the test contract.

Task layer: constraints for a specific session or task. The current feature, the files in scope, the specific outcome needed.

The global and project layers live in files that persist. The task layer is written fresh for each session.

Most engineers conflate all three into a single system prompt and get results that are mediocre for every context. Separating the layers produces prompts that are precise for the contexts they govern.

The maintenance discipline

Stale instructions are worse than no instructions. A CLAUDE.md that describes an architecture you refactored three months ago actively misleads the model.

When you make a significant architectural change, update the CLAUDE.md as part of the change — not afterward. Treat it as a file that is part of the codebase, with the same discipline about keeping it current as you apply to any other documentation.

Two questions to ask when reviewing a system prompt:

  1. Is there anything here that is no longer true?
  2. Is there anything I have corrected the model for repeatedly that is not captured here?

The second question is where the compounding value lives. Every correction that gets absorbed into the prompt is a correction you will never have to make again.

The compounding return

A system prompt or CLAUDE.md file written with care and maintained over time becomes one of the highest-leverage assets in an AI-assisted workflow. It encodes the hard-won knowledge of what makes the model useful in your specific context.

The initial investment is a few hours. The maintenance is minutes per significant change. The return is every session where the model starts in the right frame, within the right constraints, without needing to be corrected for things it should already know.

Most engineers treat this as setup. The ones getting disproportionate results treat it as craft.

← All writingWork with me →