Improve Your Agent Practice

In short

Agent-assisted work starts to feel off in recognizable ways: the agent keeps doing the thing you corrected last week, permission prompts fire for the same commands every session, your CLAUDE.md has become a wall of rules that nobody reads carefully anymore.

The fix is to treat your own practice the way you treat code — read the evidence, identify what’s actually recurring, and make a precise change. Export a week or two of transcripts, co-read them with the agent, and look for patterns: corrections that cluster, prompts that repeat, rules that aren’t catching. Then pick one or two changes with clear evidence — a CLAUDE.md rule, a narrowly-scoped permission entry, a workflow adjustment — and make them. Verify over the next week of sessions. Document why.

This is the iteration cycle turned inward, and it’s a recurring practice. Your setup drifts; the agent’s defaults shift with model updates. Run this on a cadence — monthly, or after any stretch where sessions felt harder than they should.

A sample session

A developer keeps hitting the same friction every session: permission prompts for two test commands and a build command, and the agent reformatting files it wasn’t asked to touch. They want to read recent sessions and figure out what to actually fix.

You: Let’s review my last couple weeks of sessions — I keep hitting the same friction and I want to figure out what to actually change. Two things happen almost every session: I get prompted to approve npm run test, npm run test:watch, and npm run build even though I always approve them, and the agent reformats files it wasn’t asked to touch. Here are my settings files. Read them, then let’s look at the transcripts.

Agent: I’ve read your settings. A few things to flag before we get into the transcripts. Your project .claude/settings.json has Bash(npm:*) as an allowlist entry — that’s a broad pattern meant to cover your test and build commands, but Claude Code matches it literally against the full command string. npm run test doesn’t match npm:*; npm:* would match a bare npm invocation. So the allowlist entry you added to fix the prompt friction isn’t actually matching. That’s probably why the prompts are still firing.

On reformatting: your CLAUDE.md has a rule against it, but the phrasing is vague — “don’t reformat files unnecessarily.” That’s judgment-dependent. A more precise rule would help. Let me read the transcripts now.

[the agent reads two weeks of transcripts, clustering recurring prompts and reformatting corrections]

Agent: I need to correct my initial count. I tallied 34 reformatting incidents, but that conflated two different things: the agent adding trailing newlines or fixing indentation in files it was editing anyway (incidental, not the issue), and the agent opening and reformatting files it had no other reason to touch (the real pattern). The real pattern accounts for about 11 instances, not 34, and all of them share one situation: the agent opens a file to read a value, then writes it back with normalized formatting.

That’s actually a more useful finding. It’s not a general reformatting problem — it’s specifically that the agent treats “open to read” and “open to edit” as the same operation.

You: That’s exactly what I’ve been seeing. What about the permission prompts?

Agent: Consistent with what the settings analysis suggested. The three commands — npm run test, npm run test:watch, npm run build — appear in approval prompts across 9 of the last 14 sessions. Always approved, never denied. The allowlist pattern npm:* exists in your settings but isn’t matching them.

There’s also a smaller pattern: git status and git diff prompted 6 times over the two weeks. Those are read-only and likely don’t need to prompt either.

So: two distinct problems, both fixable. The allowlist pattern uses the wrong syntax. The reformatting needs a more precise rule.

You: Let’s decide the changes. What are the exact patterns and the exact rule?

Agent: For the allowlist, replace Bash(npm:*) with three narrow entries: Bash(npm run test:*), Bash(npm run test:watch:*), Bash(npm run build:*). Add Bash(git status:*) and Bash(git diff:*) for the git reads. All five go in the project .claude/settings.json.

For the reformatting rule, replace the current vague phrase with this: “When reading a file to extract a value, use the Read tool — do not open the file for editing. Only write a file if you are making a change the user asked for. Do not normalize formatting, fix indentation, or adjust whitespace in files you were not explicitly asked to edit.”

That’s the full scope — five allowlist entries, one rule replacement. Want me to make the edits now?

You: Yes. Make the edits.

[the agent updates .claude/settings.json with the five allowlist entries and edits the CLAUDE.md rule; both files validated after the change]

Agent: Done. .claude/settings.json updated — the broad npm:* entry removed, five narrow patterns added. CLAUDE.md rule replaced. One thing I want to flag: the npm run test:watch:* pattern will match npm run test:watch but also any variant like npm run test:watch:ci if you have one. If you want to restrict it to exactly npm run test:watch, the pattern is Bash(npm run test:watch) with no wildcard. Your call.

You: Leave the wildcard — that’s fine. Let’s run the debrief.

[the agent runs the three-pass debrief: documents the session, checks for stale references, updates the journal]

Agent: Debrief complete. Two things from Pass 2. First, the CLAUDE.md rule I replaced had a cross-reference to a section called “File Editing Conventions” that no longer exists — it was removed two months ago. The old rule was pointing at a dead anchor; I’ve cleaned up the reference. Second, the journal entry is in.

Trust calibration: steady. The initial overcount (34 → 11 reformatting instances) was caught and corrected before it drove any decisions, which is the right shape. Worth carrying forward: when delegating a count to a transcript scan, sanity-check the categorization before reporting it.

You: Did you check the backend project’s settings file? It runs the same test commands.

Agent: No — and that’s a miss. I scoped to the project where the prompts were reported and didn’t check whether other projects with the same toolchain had the same broken pattern. The backend project has npm:* in its .claude/settings.json too — same wrong pattern. Adding the corrected entries there now. Journal updated to record the gap.

The full guide

You’re getting value from agent-assisted development, but something feels off. The agent keeps doing the thing you had to correct last week. You keep hitting permission prompts for the same shell commands. Your CLAUDE.md has grown into a wall of rules that nobody reads carefully anymore.

You want to improve — but improve what, exactly, and how do you know what to change?

The answer is in your session transcripts. They’re a record of what actually happened: where the agent drifted, what corrections you made, what friction recurred. Reading them — with the agent as a co-reader — is the Explore phase of a cycle you can run on your own practice.

The iteration cycle, turned inward

Improving your agent practice is a genuine application of the iteration cycle. The subject is how you work with the agent. The evidence is your session history. The deliverables are the changes you make to your setup, your rules, your habits.

Explore → Brainstorm → Plan → Execute → Verify → Document

This is a recurring cycle — not a one-off cleanup. Your practice drifts; the agent’s defaults shift with model updates; new friction appears. Build the habit of running this cycle on a cadence — monthly, or after any stretch where sessions felt harder than they should.

Explore — read your transcripts

Claude Code sessions can be exported as transcripts; session-export is one tool for this. Export a week or two of sessions, then read them with the agent. Reviewing real transcripts together lets you and the agent reflect on what’s working and what isn’t in context, against actual examples, instead of from memory. Without the transcripts, you’re guessing.

Read them looking for patterns:

Where did you have to correct the agent? What was the correction?
What permission prompts appeared more than once?
Where did the agent drift from your instructions or conventions?
What did you have to re-explain that you’d explained before?

Corrections that cluster around the same behavior are a signal: that behavior isn’t covered well in your setup, or is covered but not catching.
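
The clustering itself is simple to sketch once the events are out of the transcripts. The example below assumes you (or the agent) have already pulled prompts and corrections into plain tuples; the tuple shape, the session ids, and the sample rows are all hypothetical, and nothing here reads a real export format.

```python
from collections import defaultdict

# Hypothetical, hand-extracted events: (session_id, kind, detail).
EVENTS = [
    ("s01", "prompt", "npm run test"),
    ("s01", "correction", "reformatted a file it only needed to read"),
    ("s02", "prompt", "npm run test"),
    ("s02", "prompt", "npm run build"),
    ("s03", "prompt", "npm run test"),
    ("s03", "correction", "reformatted a file it only needed to read"),
    ("s03", "correction", "renamed a variable without being asked"),
]


def sessions_per_pattern(events):
    """Map each (kind, detail) pair to the set of sessions it appeared in."""
    seen = defaultdict(set)
    for session_id, kind, detail in events:
        seen[(kind, detail)].add(session_id)
    return seen


def recurring(events, min_sessions=2):
    """Return patterns seen in at least min_sessions sessions, most frequent first."""
    counts = {key: len(ids) for key, ids in sessions_per_pattern(events).items()}
    hits = [(key, n) for key, n in counts.items() if n >= min_sessions]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


for (kind, detail), n in recurring(EVENTS):
    print(f"{n} sessions  {kind}: {detail}")
```

Counting distinct sessions rather than raw occurrences keeps one noisy session from looking like a pattern.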

Michi’s own machinery feeds this phase. The /michi-debrief skill has a process-observations section; if you’re running debriefs, those observations are already a partial record of what went wrong. /michi-sustainability has a process-retrospective sub-mode that surfaces similar patterns. These are shortcuts to the data; they don’t replace reading the sessions themselves when the friction is real.

Brainstorm — decide what to change

Once you’ve seen what actually happened, figure out what’s worth changing.

Not everything that went wrong needs a fix. Some corrections are one-offs. Some friction is the agent doing the right thing cautiously. The question is: what recurred?

A correction that appeared once — note it. A correction that appeared three times — that’s a missing rule or a poorly stated one. A permission prompt that appeared every day for two weeks — that’s a tuning problem, not normal behavior.
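
As a rough sketch of that triage, assuming you already have a count per correction: the threshold of three comes from the paragraph above, while the function name, bucket names, and sample counts are illustrative only.

```python
from collections import Counter


def triage(correction_counts, rule_threshold=3):
    """Split corrections into one-offs to note and recurring ones that need a rule."""
    buckets = {"note": [], "needs_rule": []}
    for correction, count in sorted(correction_counts.items()):
        key = "needs_rule" if count >= rule_threshold else "note"
        buckets[key].append(correction)
    return buckets


# Hypothetical counts from a two-week transcript read.
counts = Counter({"reformats files it only read": 11, "renamed a variable": 1})
print(triage(counts))
```

The threshold is a judgment call, not a fixed rule; lower it if a correction is expensive each time it happens.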

The changes that typically emerge:

A CLAUDE.md rule — the agent keeps doing something you’ve corrected; the correction should become a rule
A permission allowlist entry — a safe shell command or tool you’re being prompted for repeatedly
An extensions.md override — a Michi skill default that doesn’t fit your project or team
A workflow habit — something you’re doing (or not doing) that’s generating friction downstream

Don’t try to fix everything at once. Pick the one or two changes with the clearest evidence.

Plan — decide the change precisely

Vague rules produce vague behavior. Before you write anything, get specific:

If it’s a CLAUDE.md rule: state what the agent should do, what it should not do, and (if it’s not obvious) why. Rules without rationale get ignored or misapplied.
If it’s a permission allowlist entry: write the narrowest pattern that covers the real case. For example, prefer Bash(npm run test:*) over Bash(npm:*).
If it’s an extensions.md override: state which skill default you’re overriding and what replaces it.

Write the change before making it. Seeing it written down catches scope creep and vague reasoning.
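
One way to write an allowlist change down before making it is to generate the entries from the commands you actually approved. This is a minimal sketch: the Bash(<command>:*) form is the one used throughout this guide, the permissions.allow layout is an assumption about the settings file, and the helper names are made up for illustration.

```python
import json


def allow_entry(command):
    """Wrap one approved command in the Bash(<command>:*) form used in this guide."""
    return f"Bash({command.strip()}:*)"


def settings_fragment(approved_commands):
    """Build a settings fragment with one narrow entry per approved command."""
    entries = sorted({allow_entry(cmd) for cmd in approved_commands})
    # Assumption: allow entries are stored under permissions.allow.
    return {"permissions": {"allow": entries}}


# Commands taken from the approval prompts you saw, duplicates included.
approved = ["npm run test", "npm run build", "git status", "git diff", "npm run test"]
print(json.dumps(settings_fragment(approved), indent=2))
```

Reading the printed fragment before touching the real file is the point: a command you don’t recognize in the list is scope creep.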

Execute — make the change

CLAUDE.md and settings.json live in your project or in ~/.claude/. Make the edit. For allowlist changes, ~/.claude/settings.json (user scope) covers all projects; .claude/settings.json (project scope) covers one.
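
If you prefer to script the edit, a sketch like the following merges entries into either scope without disturbing other settings. The two paths are the ones named above; the permissions.allow key and both function names are assumptions for illustration, so check them against your own settings file first.

```python
import json
from pathlib import Path


def settings_file(scope, project_root="."):
    """Return the settings path for 'user' scope (all projects) or 'project' scope (one)."""
    if scope == "user":
        return Path.home() / ".claude" / "settings.json"
    if scope == "project":
        return Path(project_root) / ".claude" / "settings.json"
    raise ValueError(f"unknown scope: {scope!r}")


def add_allow_entries(settings_path, entries):
    """Merge entries into permissions.allow, keeping existing settings and order."""
    path = Path(settings_path)
    settings = json.loads(path.read_text()) if path.exists() else {}
    # Assumption: allow entries are stored under permissions.allow.
    allow = settings.setdefault("permissions", {}).setdefault("allow", [])
    for entry in entries:
        if entry not in allow:
            allow.append(entry)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n")
    return allow
```

A call such as add_allow_entries(settings_file("project"), ["Bash(npm run test:*)"]) would then touch only the current project. Running it twice is safe because existing entries are skipped.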

For team projects, extensions.md carries overrides that apply when any agent works in that project. See /michi-bootstrap customize — it scaffolds the file and wires the @-ref so the skill picks it up automatically.

Verify — run the next sessions

The test is the next week of work. Did the friction drop? Did the correction stop recurring?

This is not a unit test — it’s behavioral confirmation over time. You’re watching for the absence of the thing you fixed.

If the friction persists after a change, the change didn’t address the root cause. Go back to Explore: read more sessions, more closely.
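
To make “did the friction drop?” concrete, you can compare prompt counts from the sessions before and after the change. The sketch below assumes prompts have already been extracted as (session_id, command) pairs; that shape and the sample data are hypothetical.

```python
from collections import Counter


def prompt_counts(prompts):
    """Count how often each command appeared in an approval prompt."""
    return Counter(command for _session, command in prompts)


def unresolved(before, after, fixed_commands):
    """Return fixed commands that still prompted after the change, as (before, after) counts."""
    before_counts, after_counts = prompt_counts(before), prompt_counts(after)
    return {
        cmd: (before_counts[cmd], after_counts[cmd])
        for cmd in fixed_commands
        if after_counts[cmd] > 0
    }


# Hypothetical (session_id, command) pairs pulled from transcripts.
BEFORE = [("s01", "npm run test"), ("s02", "npm run test"), ("s02", "npm run build")]
AFTER = [("s10", "npm run build")]

print(unresolved(BEFORE, AFTER, ["npm run test", "npm run build"]))
```

An empty result means the fixed commands stopped prompting; anything left in it is a reason to return to Explore.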

Document — record what changed and why

Write a brief note — a journal entry, a comment in your CLAUDE.md — capturing:

What you changed
What you saw that prompted the change
What you’re watching for

This is lightweight but important. In three months, you won’t remember why a rule exists. Future you — and any teammates working in the same project — need the why, not just the what.
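
If you want the notes to stay consistent, a tiny template is enough. This sketch mirrors the three items listed above; the class name, field names, date, and output layout are invented for illustration and are not a Michi or Claude Code format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ChangeNote:
    changed: str
    evidence: str
    watching_for: str
    when: date

    def render(self):
        """Format the note as four plain lines suitable for a journal file."""
        return "\n".join([
            f"{self.when.isoformat()} practice change",
            f"Changed: {self.changed}",
            f"Prompted by: {self.evidence}",
            f"Watching for: {self.watching_for}",
        ])


note = ChangeNote(
    changed="replaced Bash(npm:*) with narrow test and build entries",
    evidence="same three commands prompted in 9 of 14 sessions",
    watching_for="any approval prompt for npm run test or npm run build",
    when=date(2025, 1, 6),  # placeholder date
)
print(note.render())
```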

Sub-cases

Learning from your sessions

The core case: you export transcripts and co-read them with the agent. Ask the agent to identify correction patterns, recurring prompts, and places where your instructions and the agent’s behavior diverged. The agent is good at pattern-matching over text; use it.

For ongoing capture, /michi-debrief includes a process-observations section in its output. The habit of running a debrief after every milestone means the observations accumulate in your docs rather than evaporating with the session.

Tuning permissions

Permission prompts are one of the most concrete sources of friction, and one of the most fixable.

The pattern: read your sessions (or run the fewer-permission-prompts skill if you have it), find the commands you’re consistently approving, and add narrowly-scoped patterns to your allowlist.

The anti-pattern: blanket-allowing a tool class. Adding Bash(git:*) to stop git prompts means any git command, including destructive ones, runs without a prompt. Narrow the pattern to what you actually approved: Bash(git status:*), Bash(git log:*), Bash(git diff:*).

When in doubt, err toward narrower and revisit. An allowlist that’s too narrow produces friction. One that’s too broad produces risk. Read the session record before deciding.
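
A quick audit can flag entries that deserve a second look. The check below is a heuristic of my own, not Claude Code’s matching logic: it treats an entry as blanket when it names a whole tool or puts the wildcard directly after a single top-level command. The regular expression and function name are assumptions.

```python
import re

# Matches "Tool" or "Tool(spec)"; the spec is whatever sits inside the parentheses.
ENTRY = re.compile(r"^(?P<tool>\w+)(?:\((?P<spec>.*)\))?$")


def is_blanket(entry):
    """Heuristic: True when an entry allows a whole tool or a whole top-level command."""
    match = ENTRY.match(entry.strip())
    if match is None:
        raise ValueError(f"unrecognized entry: {entry!r}")
    spec = match.group("spec")
    if spec is None or spec in ("", "*"):
        return True  # bare tool name, e.g. "Bash"
    if spec.endswith(":*"):
        # A wildcard after a single word (no subcommand) covers every subcommand.
        return len(spec[:-2].split()) <= 1
    return False


allowlist = ["Bash(git:*)", "Bash(git status:*)", "Bash(npm run test:*)"]
print([entry for entry in allowlist if is_blanket(entry)])
```

A flagged entry is not automatically wrong (a single-word read-only command can be fine); it is a prompt to check the session record before keeping it.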

Adjusting conventions

Your CLAUDE.md rules and extensions.md overrides are living documents. They should evolve as your practice evolves.

Common reasons to revisit conventions:

A rule is being ignored — often because it’s vague, buried, or contradicted by a Michi skill default
A rule is no longer accurate — the project has changed
A Michi skill default doesn’t fit your team’s norms — override it in extensions.md rather than working around it

When you update a rule, update the rationale alongside it. Rules without rationale drift back to ambiguity over time.

Pitfalls

Improving in the abstract. It’s tempting to think through what might be going wrong without looking at what actually went wrong. The transcript read is not optional. Your intuition about what’s causing friction is often right — but it’s also often one level removed from the actual root cause.

Over-broad permission grants. Blanket-allowing a tool class to eliminate prompts trades friction for risk. The permission prompt exists for a reason. Narrow the pattern to what you actually need.

Treating it as one-and-done. Your practice drifts. The agent’s behavior shifts with model updates. New friction appears as you take on different kinds of work. Build the habit of reviewing on a cadence — not because something broke, but because the practice needs maintenance.

Fixing symptoms instead of causes. A correction you make three times in one week is a symptom. The cause might be a poorly stated convention, a missing rationale, or a Michi skill default that overrides your intent. Look one level deeper before writing the fix.

A worked example

A developer working on a backend service has been getting permission prompts for npm run test, npm run build, and a handful of git read operations (git status, git log, git diff) almost every session. They also notice that the agent keeps adding TypeScript-style type annotations to their JavaScript files, which they correct every time.

They export two weeks of sessions and read them with the agent.

The permission pattern is clear: the same five commands appear in approval prompts in almost every session. The type annotation pattern is also clear: it shows up in every session that touches a new file.

Plan:

Add Bash(npm run test:*), Bash(npm run build:*), Bash(git status:*), Bash(git log:*), Bash(git diff:*) to the project’s .claude/settings.json allowlist.
Add a rule to CLAUDE.md: “This is a JavaScript project. Do not add TypeScript type annotations, JSDoc type tags, or .ts / .tsx files. The project does not use TypeScript.”

Execute: Both changes take five minutes.

Verify: The developer runs sessions for the next two weeks. The permission prompts for those five commands stop. The type annotation corrections stop.

Document: A journal entry notes when the allowlist entries were added to .claude/settings.json and why (standard JSON has no comment syntax, so the note lives outside the file). A sentence in the CLAUDE.md rule explains the project context.

A month later, the developer does a brief transcript review and finds a new pattern — the agent has been checking in mid-implementation to ask for scope confirmation in situations where the scope was already clear from the plan doc. The cycle runs again.

The core tools: iteration cycle, principles, /michi-debrief, /michi-sustainability.