The Agent Harness: Everything Around the Model

Swap Claude for a different frontier model inside a coding agent and, for a huge share of tasks, you won't notice much difference in how the agent behaves day to day. Swap out the harness underneath an unchanged model — its system prompt, its tool set, its permission rules, the skills and hooks wired into it — and the same model can go from cautious and slow to fast and reckless, or from generically helpful to precisely tuned for your team's codebase. That gap is the whole reason "harness engineering" has become a real discipline instead of a footnote under prompt engineering.

The model is the engine. It supplies raw reasoning, language, and judgment. But an engine bolted to a workbench doesn't drive anywhere — it needs a chassis, a steering system, brakes, a fuel line, and a dashboard telling the driver what's happening. The harness is that whole vehicle: the system prompt that gives the model an identity and a set of house rules, the tools and MCP servers it's allowed to call, the skills and reference material it can load, the hooks and permission checks that constrain what it's allowed to do unsupervised, and the settings that decide whether it's running interactively next to a human or headless inside a CI job. Everything in this post is about that vehicle, not the engine inside it.

The model is the engine; the harness is the car

It's worth being precise about where the boundary sits, because it changes what you should actually spend your tuning effort on. Model weights determine how well the system reasons about a given problem in isolation — code comprehension, multi-step planning, following an instruction precisely. None of that changes when you edit a CLAUDE.md file or add a new tool definition. What does change, dramatically, is what the model is even given the opportunity to reason about, and what it's permitted to do once it reaches a conclusion.

Two agents built on the exact same model checkpoint can feel like entirely different products depending on their harness. One reads code but can't write it, asks a human before every non-trivial change, and has no memory between sessions. The other has write and deploy access, runs unattended overnight, and carries forward a project's history through a set of loaded reference files. Both are "the same model." Neither behaves like it. If you're debugging why an agent is too timid, too aggressive, or simply blind to context it should have — look at the harness first. The model is rarely the bottleneck; the scaffolding around it usually is.

The five layers of a harness

A harness isn't one setting — it's a stack of layers, each answering a different question about the agent, and each wrapping the model a little further from "raw weights" and a little closer to "a coworker who can actually get something done." It's useful to name them individually, because most harness bugs turn out to be a problem localized to exactly one of these layers.

Figure 1: Anatomy of a harness. The model core doesn't change. Everything that makes it behave like a working agent instead of a chat window is layered around it, with identity closest to the core and automation furthest out.

Identity: who the agent thinks it is

The innermost layer is the system prompt and whatever project-level instructions get loaded alongside it — a CLAUDE.md at the repo root, a persona description, a list of house conventions. This layer answers "what is this agent for, and what does it already know without being told?" It's the cheapest layer to change and, because it's loaded on every single turn, also the one where bloat costs you the most — every paragraph here is a paragraph competing for attention on every request the agent handles.

Capability: what it's allowed to reach for

The capability layer is the tool set: file read/write, shell execution, web fetch, and whatever domain-specific tools are wired in through MCP servers — a database client, a ticketing system, an internal API. This is where the harness decides what's even possible, independent of whether it's wise. A model can't deploy to production if there's no deploy tool in its capability layer, no matter how confident it is that it should.

Knowledge: what it can load when it needs to

Skills, reference documents, and command templates make up the knowledge layer — the packaged expertise a harness can pull in on demand rather than keep resident at all times. This is the layer this series covered first, for a reason: it's usually the highest-leverage place to invest, because a well-scoped skill upgrades the agent's behavior on a specific class of task without paying a context tax on every other task.

Control: what it's allowed to do unsupervised

Permissions and hooks form the control layer: which tool calls require human approval, which file paths are off-limits, which actions trigger a pre- or post-hook that can block, log, or modify what the agent was about to do. Control is the layer that turns "capable of doing X" into "actually allowed to do X right now, under these conditions." A capable, unsupervised agent without a control layer isn't bold — it's just underspecified.

Automation: how it runs without a human in the loop

The outermost layer decides whether the agent is even running interactively at all. Headless invocations, scheduled runs, CI integrations — this layer determines the triggering conditions and the reporting path back to a human when there's no one watching the terminal in real time. It's also where loop engineering (the subject of the next post in this series) actually lives: automation is what makes an agent something that can run while you sleep, for better or worse.

Same model, different harness

The clearest way to see the harness's effect is to hold the model constant and vary only the surrounding configuration. Feed one model into a harness with read-only tools, a policy of asking before every write, and no hooks configured, and you get a careful reviewer that flags problems without touching anything. Feed the identical model into a harness with write and deploy tools, an auto-approve policy on edits, and a pre-commit hook that runs the test suite automatically, and you get an agent that ships pull requests on its own. Nothing about the model changed between those two setups — only the vehicle it was placed in.

Figure 2: Same model, different harness. Identical weights, two harnesses, two personalities. The difference in behavior comes entirely from the tools, permissions, and hooks each harness wraps around the model, not from anything the model itself decides on its own.
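The contrast can be made concrete with a toy model of a harness. This is only an illustrative sketch: the `Harness` class, its fields, and the tool names are hypothetical and do not correspond to any real agent framework. It shows two configurations sharing one model identifier while giving different verdicts for the same tool request.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Harness:
    """Hypothetical harness: one model plus the configuration wrapped around it."""
    model: str
    tools: frozenset                      # capability layer
    auto_approve: frozenset = frozenset() # control layer: tools that skip approval
    hooks: tuple = ()                     # control layer: named hooks

    def verdict(self, tool: str) -> str:
        """Return 'unavailable', 'auto', or 'ask' for a requested tool."""
        if tool not in self.tools:
            return "unavailable"
        return "auto" if tool in self.auto_approve else "ask"

MODEL = "same-model-checkpoint"  # placeholder identifier, not a real model name
READ_TOOLS = frozenset({"Read", "Grep", "Glob"})

# Careful reviewer: read-only tools, no hooks.
reviewer = Harness(model=MODEL, tools=READ_TOOLS, auto_approve=READ_TOOLS)

# Shipping agent: write and deploy tools, edits auto-approved, a test hook.
shipper = Harness(
    model=MODEL,
    tools=READ_TOOLS | {"Edit", "Write", "Deploy"},
    auto_approve=READ_TOOLS | {"Edit"},
    hooks=("pre-commit: run test suite",),
)
```

Asking both harnesses about `"Edit"` gives `"unavailable"` for the reviewer and `"auto"` for the shipper, even though `reviewer.model == shipper.model`.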

This is a genuinely practical lever, not just a thought experiment. If an agent in production is behaving too conservatively — asking for approval on changes that are obviously safe — the fix is almost never "get a smarter model." It's usually "loosen the control layer for that specific class of change, and give the capability layer the tool it's missing." Conversely, if an agent is doing things you didn't want, the fix is to tighten permissions and add a hook that gates the specific action, not to write a longer, more emphatic system prompt begging it not to.

Prompting is a weak substitute for control

Telling a model "please ask before deleting files" in a system prompt is advisory. A permission rule that requires explicit approval before any rm-equivalent tool call is enforced. When behavior really matters, push the constraint down from the identity layer into the control layer — instructions can be misread or deprioritized under pressure; a permission gate cannot.
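A minimal sketch of what "enforced" means here, assuming a hypothetical rule set of glob patterns matched against `Tool(argument)` strings. The `decide` function and `RULES` table are illustrative only and are not the matching logic of any particular agent product. The point is that the verdict is computed outside the model, so it does not depend on the model remembering an instruction.

```python
from fnmatch import fnmatchcase

# Hypothetical rule set: glob patterns matched against "Tool(argument)" strings.
RULES = {
    "deny": ["Bash(rm *)"],
    "ask": ["Edit(*)", "Write(*)"],
    "allow": ["Read(*)", "Grep(*)"],
}

def decide(tool: str, argument: str, rules: dict = RULES) -> str:
    """Return 'deny', 'ask', or 'allow' for a proposed tool call.

    Deny is checked first, then ask, then allow. Anything unmatched
    falls back to 'ask' so an unknown tool never runs silently.
    """
    call = f"{tool}({argument})"
    for verdict in ("deny", "ask", "allow"):
        if any(fnmatchcase(call, pattern) for pattern in rules.get(verdict, [])):
            return verdict
    return "ask"
```

Checking deny before allow is a deliberate choice in this sketch: a broad allow rule added later cannot accidentally re-enable something that was explicitly denied.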

The same lever works for adding capability, not just constraining it. An agent that keeps stalling on "I don't have a way to check the current staging deploy status" usually isn't missing intelligence — it's missing an MCP server that would let it actually query the deploy pipeline. Wire that server into the capability layer, pair it with a control-layer rule scoping which environments it can query, and the exact same model stops guessing and starts checking. It's tempting to read a gap like that as a model limitation and go looking for a better one; far more often, the model was reasoning correctly about a tool it simply didn't have.

Harness-as-code: versioned, shared, reproducible

Once a harness is doing real work, treating its configuration as a one-off local setup stops being an option. The same discipline that made infrastructure-as-code the default for servers applies directly here: a harness's system prompt, tool wiring, skills, hooks, and permission rules should live in the repository, get reviewed in pull requests, and be identical whether they're running on your laptop, a teammate's laptop, or a CI runner at 3 a.m. with nobody watching.

Figure 3: Harness-as-code flow. One versioned config in the repo fans out identically to every machine and runner that reads it: no drift, no 'works on my agent' bugs.

In practice this means a project directory that looks a lot like any other configuration-as-code setup: a root-level context file, a skills directory, a commands directory, and a settings file that declares tool permissions and hooks explicitly rather than relying on whatever defaults happen to be active on a given machine.

.claude/settings.json

{
  "permissions": {
    "allow": ["Read", "Grep", "Glob"],
    "ask": ["Edit", "Write"],
    "deny": ["Bash(rm -rf *)"]
  },
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [{ "type": "command", "command": "scripts/audit-command.sh" }]
      }
    ]
  }
}

Nothing about that file is exotic — it's just declarative configuration, checked into version control like anything else that governs how a system behaves. The payoff is that harness changes go through the same review process as code changes, and a regression in agent behavior can be traced to a specific commit instead of "something changed on someone's machine." Teams that get this right end up thinking about harness changes the same way they think about API changes: something to propose, review, and roll out deliberately, not something that accumulates as ad hoc local tweaks nobody remembers making.

Worked example: a harness for an on-call triage agent

Layers are easiest to feel, not just describe, by watching one get built up around a real task. Take a common one: an agent that investigates a production alert and proposes a fix, but must never touch the deploy pipeline itself. Below is the harness end to end — and, more usefully, what the same agent does differently as each layer gets added on top of the exact same underlying model.

The identity layer starts with a project-level context file that gives the agent its mandate and the one rule that matters most: investigate, don't deploy.

CLAUDE.md

# On-call triage agent

You are the on-call triage assistant for the payments service. Given an
alert, investigate root cause and propose a fix. You do not deploy fixes.

## Rules
- Never run a rollout, restart, or scale command yourself — draft a PR instead.
- If the alert mentions "refund", read `docs/runbooks/refunds.md` before
  touching anything under `payments/`.
- Summarize root cause in plain language before proposing any code change.

Capability comes next: a scoped agent definition that wires in only the tools this task actually needs — read access, log queries, and nothing that can mutate the running system.

.claude/agents/triage.md

---
name: triage
description: Investigates a production alert using logs and runbooks, then drafts a root-cause summary and a proposed fix as a pull request. Never deploys or restarts anything.
tools: Read, Grep, Glob, "Bash(kubectl logs*)", "Bash(kubectl get*)"
---

Investigate the alert described in $ARGUMENTS. Read the relevant runbook
first if one matches. Pull logs for the affected service, identify root
cause, and open a draft PR with the fix. Do not run any command that
changes cluster state.

Control locks in what the rules above only asked for nicely. The ask and deny entries turn "never run a rollout" from a request into something the harness itself refuses to execute, and a hook adds a paper trail independent of anything the agent decides to report.

.claude/settings.json (triage agent)

{
  "permissions": {
    "allow": ["Read", "Grep", "Glob", "Bash(kubectl logs*)", "Bash(kubectl get*)"],
    "ask": ["Edit", "Write", "Bash(git push*)"],
    "deny": ["Bash(kubectl rollout*)", "Bash(kubectl delete*)", "Bash(kubectl scale*)"]
  },
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [{ "type": "command", "command": "scripts/audit-kubectl.sh" }]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Write",
        "hooks": [{ "type": "command", "command": "scripts/notify-oncall-channel.sh" }]
      }
    ]
  }
}
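The audit hook itself is not shown in the post. The following is one possible sketch of it in Python rather than shell, under two stated assumptions about the hook contract: the harness passes the pending tool call as JSON on stdin with `tool_name` and `tool_input.command` fields, and a hook exiting with code 2 blocks the call. The log path and the `review` helper are hypothetical.

```python
import json
import shlex
import sys
from datetime import datetime, timezone

BLOCKED_SUBCOMMANDS = {"rollout", "delete", "scale"}  # mirrors the deny list
AUDIT_LOG = "kubectl-audit.log"  # hypothetical log location

def review(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one hook event. Pure, so it is testable."""
    if event.get("tool_name") != "Bash":
        return 0, ""
    command = event.get("tool_input", {}).get("command", "")
    try:
        argv = shlex.split(command)
    except ValueError:
        return 2, "Unparseable shell command; refusing to run it."
    if "kubectl" not in argv:
        return 0, ""
    # Conservative: any blocked word after "kubectl" counts, even as an argument.
    after = argv[argv.index("kubectl") + 1:]
    blocked = BLOCKED_SUBCOMMANDS.intersection(after)
    if blocked:
        return 2, f"Blocked kubectl subcommand: {sorted(blocked)[0]}"
    return 0, f"audited: {command}"

def main() -> int:
    """Read one event from stdin, append to the audit log, return the exit code."""
    code, message = review(json.load(sys.stdin))
    if message:
        stamp = datetime.now(timezone.utc).isoformat()
        with open(AUDIT_LOG, "a", encoding="utf-8") as log:
            log.write(f"{stamp} exit={code} {message}\n")
    if code != 0:
        print(message, file=sys.stderr)  # assumed to be surfaced to the agent
    return code

# When installed as a hook script, end the file with: sys.exit(main())
```

Keeping the decision in `review` and the side effects in `main` means the blocking logic can be unit-tested without touching stdin or the log file.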

Automation, finally, decides whether a human has to remember to run this at all. Wired to fire automatically on a PagerDuty webhook instead of waiting for someone to invoke it by hand, the same harness becomes something that's already investigating by the time an on-call engineer opens their laptop.
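As a rough sketch of that wiring, a webhook handler could translate an alert into a non-interactive agent run. This assumes an agent CLI named `claude` that accepts a one-shot prompt via `-p` and a `--output-format json` flag. The alert fields `title` and `service` are hypothetical and are not PagerDuty's actual payload schema.

```python
import subprocess

def build_triage_command(alert: dict) -> list[str]:
    """Build the argv for a headless triage run from a (hypothetical) alert dict."""
    title = alert.get("title", "unknown alert")
    service = alert.get("service", "unknown service")
    prompt = (
        "Use the triage agent to investigate this alert: "
        f"{title} (service: {service})"
    )
    # Passing a list (not a shell string) keeps alert text from being
    # interpreted by a shell.
    return ["claude", "-p", prompt, "--output-format", "json"]

def run_triage(alert: dict, timeout_s: int = 900) -> subprocess.CompletedProcess:
    """Run the agent headless; the caller decides what to do with the result."""
    return subprocess.run(
        build_triage_command(alert),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=False,
    )
```

The timeout is there because an unattended run has nobody to interrupt it; the value is arbitrary.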

Watch what changes, and what doesn't, as each layer is added on top of the identical model checkpoint:

behavior by layer

Layer added       | What actually changes
------------------|---------------------------------------------------
none (bare model) | Suggests a fix in chat, sometimes including the
                  | exact "kubectl rollout restart" command: advice
                  | with no enforcement behind it.
+ Identity        | States "I won't deploy this" correctly, but
                  | nothing stops it from calling a deploy tool if it
                  | reconsiders mid-task. The rule is still just prose.
+ Capability      | Cannot run "kubectl rollout" even if it decides
                  | to. The tool simply isn't wired into its scope.
+ Control         | Every "kubectl" call is logged by the hook before
                  | it runs, and a deny pattern blocks rollout/scale/
                  | delete even if a future capability change adds
                  | a broader kubectl tool by accident.
+ Automation      | Runs unattended off a PagerDuty webhook at 3 a.m.,
                  | opens a draft PR, and posts to the on-call channel,
                  | with every layer above already in force.

None of that progression required a different model or a cleverer prompt. The identity layer alone got the agent to say the right thing; it took capability and control to make the right thing the only thing it was able to do.

Anti-patterns that make a harness fight itself

Prompting around a missing control. If you find yourself adding another paragraph to the system prompt asking the model to be careful about something, stop and ask whether that constraint belongs in permissions or a hook instead. Prose is not enforcement.
Capability without control. Wiring in a powerful tool — a database client with write access, a deploy command — without a matching permission rule or hook is the single most common way a harness ends up doing more than anyone intended. Capability and control should be added together, not capability first and control "later."
An identity layer that's really a knowledge dump. System prompts that grow to include every edge case the team has ever hit are trying to be a skill and failing at being an identity layer. If it's not relevant to every task, it belongs in a skill's references/, not in the text loaded on every single turn.
Untracked local overrides. A harness that only works because of a setting one engineer changed locally, and never committed, is a harness that will quietly break the next time someone else runs the same task from a clean checkout.
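The "capability without control" anti-pattern in particular lends itself to a mechanical check. Below is a hypothetical lint, assuming a settings dictionary shaped like the examples in this post; the `MUTATING_TOOLS` set and the `ungated_tools` name are illustrative. It flags any mutating tool that is allowed outright with no ask rule, deny rule, or pre-call hook attached to it.

```python
MUTATING_TOOLS = {"Bash", "Edit", "Write"}  # hypothetical list of risky tools

def _tool_name(rule: str) -> str:
    """'Bash(kubectl get*)' -> 'Bash'; 'Read' -> 'Read'."""
    return rule.split("(", 1)[0]

def ungated_tools(settings: dict) -> list[str]:
    """Return mutating tools that are allowed but have no ask/deny rule or hook."""
    permissions = settings.get("permissions", {})
    allowed = {_tool_name(r) for r in permissions.get("allow", [])}
    gated = {
        _tool_name(r)
        for r in permissions.get("ask", []) + permissions.get("deny", [])
    }
    hooked = {
        entry.get("matcher")
        for entry in settings.get("hooks", {}).get("PreToolUse", [])
    }
    return sorted(
        tool for tool in allowed & MUTATING_TOOLS
        if tool not in gated and tool not in hooked
    )
```

Run in review or CI, a non-empty result is a prompt to add the control rule in the same change that adds the capability.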

A permissive automation layer amplifies every other layer's mistakes

A misconfigured control layer is annoying when a human is watching and can interrupt. The same misconfiguration inside the automation layer — a scheduled or CI-triggered run with nobody reviewing individual actions — can do real damage before anyone notices. Tighten permissions and hooks before you widen the automation layer, never the other way around.

How this fails in practice

Most harness incidents trace back to one of a handful of recurring shapes. Naming them makes them faster to recognize the second time.

The agent asks for approval on the same safe action, every run

Symptom: an engineer approves the identical low-risk edit for the fifth day in a row and starts approving on autopilot without reading the diff. Cause: the control layer's ask bucket is too coarse — it gates an entire tool ("ask before any Edit") instead of the specific risky subset of that tool's uses. Once a human is rubber-stamping approvals out of habit, the gate has already stopped doing its job. Fix: narrow the rule to the actual risk — ask only for edits under a sensitive path like payments/, allow the rest outright, and let a logging hook (not a blocking one) cover everything in between. A gate that fires constantly trains the human watching it to stop watching.
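The narrowed rule can be sketched as a path check. This is a minimal illustration, assuming repository-relative POSIX paths and a hypothetical `SENSITIVE_ROOTS` list; it is not how any specific harness evaluates edit permissions.

```python
from pathlib import PurePosixPath

SENSITIVE_ROOTS = (PurePosixPath("payments"),)  # hypothetical sensitive paths

def edit_verdict(path: str) -> str:
    """Return 'ask' for edits under a sensitive root, 'allow' otherwise."""
    target = PurePosixPath(path)
    if target.is_absolute() or ".." in target.parts:
        # Cannot reason about it safely, so keep a human in the loop.
        return "ask"
    for root in SENSITIVE_ROOTS:
        if target == root or root in target.parents:
            return "ask"
    return "allow"
```

Matching on path components rather than string prefixes means `docs/payments.md` is allowed while `payments/refunds/handler.py` still asks.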

The agent behaves differently on your laptop than in CI

Symptom: a task passes locally and fails — or does something unexpected — the moment it runs headless in a CI job. Cause: harness drift. A personal, uncommitted ~/.claude/settings.json override, a local environment variable, or a skill installed on one machine but not synced to the repo, is quietly changing what the harness allows. The committed config and the effective config have diverged, and nobody noticed because the divergence only shows up as a behavior difference, not an error. Fix: treat any local override as scratch space that never survives past a debugging session, and add a CI step that diffs the effective settings against the checked-in config so drift fails loudly instead of surfacing as a mystery bug three weeks later.

The context file keeps growing and the agent starts ignoring parts of it

Symptom: a rule that used to work reliably starts getting missed, and the fix that keeps getting reached for is adding it again, more emphatically, near the top. Cause: the identity layer has absorbed content that belongs in the knowledge layer, with every edge case the team has ever hit bolted onto CLAUDE.md instead of scoped into a skill that loads only when relevant. Past a certain length, everything in the file competes for the same attention, and the rule that matters most is no longer distinguishable from forty others. Fix: audit the file for anything that's only relevant to a specific class of task, and move it into a skill. CLAUDE.md should read like a short onboarding note, not an incident archive.

A deny rule doesn't actually stop the thing it names

Symptom: a command the team was sure was blocked runs anyway — usually because it was invoked through a wrapper script, an alias, or a slightly different flag order than the pattern anticipated. Cause: a glob-style deny pattern like Bash(rm -rf *) matches the literal string, not the intent — a script that calls rm with the flags reordered, or through a Makefile target, slips right past it. Fix: pair pattern-based denies with a hook that actually inspects the resolved command before it runs, and prefer removing the capability outright (no destructive tool wired in at all) over trying to enumerate every way a destructive command could be spelled.
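The gap between matching a string and matching intent is easy to demonstrate. The sketch below compares a glob check against a hypothetical hook-side inspection of the parsed arguments. `pattern_blocks` and `is_recursive_force_rm` are illustrative names, and the glob semantics here are Python's `fnmatch`, which may differ from a given harness's pattern rules.

```python
import shlex
from fnmatch import fnmatchcase

DENY_PATTERN = "rm -rf *"

def pattern_blocks(command: str) -> bool:
    """Literal glob match, the way a naive deny pattern would behave."""
    return fnmatchcase(command, DENY_PATTERN)

def is_recursive_force_rm(command: str) -> bool:
    """Inspect parsed arguments so flag order and spelling don't matter."""
    argv = shlex.split(command)
    if not argv or argv[0] != "rm":
        return False
    short_flags, long_flags = set(), set()
    for arg in argv[1:]:
        if arg == "--":          # everything after "--" is a file operand
            break
        if arg.startswith("--"):
            long_flags.add(arg)
        elif arg.startswith("-") and len(arg) > 1:
            short_flags.update(arg[1:])  # "-fr" -> {"f", "r"}
    recursive = bool({"r", "R"} & short_flags) or "--recursive" in long_flags
    force = "f" in short_flags or "--force" in long_flags
    return recursive and force
```

For `rm -fr build` the pattern check returns `False` while the argument check returns `True`. Neither catches `make clean`, because the destructive call is hidden inside the Makefile, which is why removing the capability outright remains the stronger option.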

Trade-offs: what to tune first, and how many harnesses to maintain

Once the five layers exist, the harder question is which one to reach for when behavior isn't what you wanted — and how many separate harnesses a team should actually be maintaining.

Tune control and capability before you touch the prompt

The instinct when an agent misbehaves is to edit the words in CLAUDE.md: add a sentence, make it more emphatic, capitalize the important part. That's usually the wrong lever to reach for first, because prose is advisory and easy for the model to deprioritize under a long, complicated task. Capability and control are structural: a tool that isn't wired in can't be called regardless of what the model decides, and a deny rule doesn't depend on the model remembering anything. The practical order is capability and control first (remove or gate the specific action that went wrong) and identity last, for the genuinely soft judgment calls where enforcement isn't the right tool (tone, prioritization, style preferences). If you find yourself tuning the prompt for something a permission rule could enforce instead, that's a sign you're solving the problem in the wrong layer, not that the prompt needs more work.

Detecting drift before it becomes an incident

Harness-as-code solves drift only if something actually checks for it. A settings file sitting in the repo doesn't guarantee the agent running on a given machine is using it — a stale local copy, a personal override, or a skill that was manually installed outside the repo can all silently diverge from what's committed. The cheap fix is a CI job that runs the harness's configuration loader and asserts the effective settings match the checked-in file byte for byte, failing the build if they don't. That single check turns "works on my machine" from a debugging session into a build failure, which is a far cheaper place to catch it.
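One way such a check might look, as a sketch: it assumes the effective settings can be dumped to a JSON file by whatever loader the harness exposes, which is not specified here. It also compares parsed JSON rather than raw bytes, a looser test than the byte-for-byte assertion described above, chosen so the failure message can name the key that drifted. `find_drift` and `load` are hypothetical helpers.

```python
import json

def find_drift(committed: dict, effective: dict, prefix: str = "") -> list[str]:
    """List dotted key paths where the two configs differ."""
    drift = []
    for key in sorted(set(committed) | set(effective)):
        path = f"{prefix}{key}"
        if key not in effective:
            drift.append(f"{path}: missing from effective config")
        elif key not in committed:
            drift.append(f"{path}: not in committed config")
        elif isinstance(committed[key], dict) and isinstance(effective[key], dict):
            drift.extend(find_drift(committed[key], effective[key], f"{path}."))
        elif committed[key] != effective[key]:
            drift.append(f"{path}: {committed[key]!r} != {effective[key]!r}")
    return drift

def load(path: str) -> dict:
    """Read one JSON settings file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A CI step would call `find_drift(load(committed_path), load(effective_path))` and fail the build on any non-empty result, printing each line so the divergent key is visible in the log.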

One shared harness, or one per project?

Neither extreme is right by default. A single org-wide harness forces every project into the same tool set and permission model, which is efficient to maintain but wrong the moment one project legitimately needs broader deploy access than another. A fully bespoke harness per project avoids that mismatch but multiplies the maintenance burden by the number of repos, and a fix for a real problem in one harness has to be manually ported to every other one. The pattern that scales is a shared base — common identity conventions, a default-deny control layer, the skills every team uses — with project-level deltas layered on top for anything genuinely specific to that repo's risk profile. That's the same base-plus-overrides shape most teams already use for shared CI configuration, applied to the harness instead.
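The base-plus-overrides shape can be sketched as a merge function over permission buckets. The merge policy here is an assumption made for illustration, not a documented behavior of any harness: project deltas may add rules, deny rules only accumulate, and anything denied by either layer is stripped from allow and ask.

```python
def merge_permissions(base: dict, delta: dict) -> dict:
    """Layer a project delta over a shared base permission set."""
    merged = {}
    for bucket in ("allow", "ask", "deny"):
        combined = list(base.get(bucket, [])) + list(delta.get(bucket, []))
        merged[bucket] = list(dict.fromkeys(combined))  # de-duplicate, keep order
    # Deny wins: a project cannot re-allow what the base (or itself) denies.
    denied = set(merged["deny"])
    for bucket in ("allow", "ask"):
        merged[bucket] = [rule for rule in merged[bucket] if rule not in denied]
    return merged
```

With a base that denies `Bash(kubectl delete*)`, a project delta that tries to allow the same rule ends up with it still denied, while its other additions go through. Whether a project should ever be able to loosen a base deny is a policy decision; this sketch picks the stricter answer.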

Key takeaways

The model is the engine; the harness — identity, capability, knowledge, control, automation — is the whole vehicle, and it's usually the higher-leverage place to debug or tune agent behavior.
Identical model weights can produce wildly different agent behavior depending on tools, permissions, and hooks alone — swap the harness before you assume you need a different model.
Push real constraints down into the control layer (permissions, hooks) rather than up into the identity layer (prose in a system prompt). Enforcement beats instructions, and it's the layer to tune first.
Treat harness configuration as code: versioned, reviewed, and identical across every machine and CI runner that runs the agent — and add a CI check that catches drift instead of just hoping there isn't any.
Default to a shared base harness with project-level deltas, not one harness per repo or one harness for the whole org — the base-plus-overrides shape that already works for CI config works here too.

Every Noddle Deck persona pack is, in effect, a slice of a pre-built harness — a curated capability and knowledge layer (skills and commands scoped to a role) that drops straight into your existing identity, control, and automation setup rather than replacing it.

bash

noddle-deck pack install developer

Installing one is a fast way to see a well-scoped capability and knowledge layer in practice before you build the equivalent for your own team's specific harness.