michi-debrief
The debrief skill runs after an implementation session completes. It is how Michi iterates spirally rather than in circles — each debrief improves the next session by capturing learnings, evolving verification scenarios, and calibrating how much autonomy the agent has earned. It can run in the same session (benefits from retained context) or a fresh session (clean perspective, no compaction bias).
When to use
- After completing a /michi-session milestone
- After multiple milestones have accumulated without review
- When you want to assess what was delivered versus what was planned
- When decisions from a session need human review before the next milestone
What it does
Delivery assessment. The agent compares what was planned against what was delivered, checking each acceptance criterion explicitly. It tallies results (test counts for code, exit criteria for non-code) and verifies that all milestones are committed.
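The delivery assessment can be pictured as a simple tally. The sketch below is illustrative only: `Criterion`, `assess_delivery`, and the sample criteria are hypothetical names invented for this example, not part of the skill itself.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One acceptance criterion from a milestone plan (hypothetical structure)."""
    description: str
    met: bool


def assess_delivery(criteria: list[Criterion], all_committed: bool) -> dict:
    """Check every criterion explicitly and tally the result."""
    unmet = [c.description for c in criteria if not c.met]
    return {
        "met": len(criteria) - len(unmet),
        "total": len(criteria),
        "unmet": unmet,
        # Assumption: delivery is complete only when nothing is unmet
        # and every milestone has been committed.
        "complete": not unmet and all_committed,
    }


# Invented sample data, for illustration only.
report = assess_delivery(
    [
        Criterion("CLI accepts a --dry-run flag", met=True),
        Criterion("Schema docs updated", met=False),
    ],
    all_committed=True,
)
print(report)
```

Listing the unmet criteria by name, rather than reporting only a count, keeps the gap between plan and delivery visible in the journal entry.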
Decision and discussion review. The agent reads the Decisions and Discussion sections from each milestone’s plan doc. For decisions: were they reasonable? Should any be reversed or codified as patterns? For discussion items: resolve now, defer with rationale, or promote to a project-level question.
Bug and gap analysis. For each bug found during the session, the agent asks what scenario would have caught it and writes that scenario in Given-When-Then format. This is how the verification set evolves — driven by actual failures, not imagined ones. Cross-package gaps get specific attention (missed schema updates, hardcoded assumptions).
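As a rough illustration of turning a bug into a Given-When-Then scenario, the sketch below models a scenario as a small record. The `Scenario` class, its fields, and the example bug are all hypothetical; the actual catalog format may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """A verification scenario in Given-When-Then form (hypothetical structure)."""
    title: str
    given: str
    when: str
    then: str
    source_bug: str  # the failure that motivated this scenario

    def render(self) -> str:
        """Format the scenario as indented Given-When-Then text."""
        return "\n".join([
            f"Scenario: {self.title}",
            f"  Given {self.given}",
            f"  When {self.when}",
            f"  Then {self.then}",
        ])


# Invented example of a cross-package gap: a missed schema update.
scenario = Scenario(
    title="Export schema stays in sync with the API model",
    given="a new optional field was added to the API model",
    when="the export package serializes a record",
    then="the new field appears in the exported output",
    source_bug="export silently dropped the new field",
)
print(scenario.render())
```

Keeping `source_bug` alongside the scenario records why it exists, which helps later when deciding whether to update or retire it.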
Process observations. What worked, what didn’t, where the skill guidance helped or hindered. Scenario quality gets its own assessment — were Level A scenarios useful? Too granular or too vague? Did the agent execute them faithfully? The debrief also determines the lifecycle of verification artifacts: promote new scenarios to the catalog, update scenarios broken by intentional changes, retire ones no longer relevant.
Knowledge capture. Domain learnings go to the epic’s journal. Process learnings go to the patterns file (docs/reference/patterns.md). Human interventions on code quality get captured as applied examples in docs/reference/code-style.md. Memory-worthy content (collaboration patterns, corrections, confirmed approaches) updates docs/memory.md.
Trust calibration. The agent assesses signals that trust increased (all criteria met, decisions well-documented, no post-completion bugs) versus signals it decreased (premature “done” claims, skipped verification, unescalated decisions). It recommends an autonomy level for the next session.
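One way to picture trust calibration is as a mapping from observed signals to a recommendation. The sketch below uses the signals named above, but the signal identifiers, the three-level `Autonomy` scale, and the decision rule are assumptions made for illustration; the skill's actual judgment is qualitative and may weigh signals differently.

```python
from enum import Enum


class Autonomy(Enum):
    """Hypothetical three-step recommendation scale."""
    REDUCE = "reduce"
    HOLD = "hold"
    INCREASE = "increase"


# Signals taken from the text; the identifiers themselves are invented.
POSITIVE = {"all_criteria_met", "decisions_documented", "no_post_completion_bugs"}
NEGATIVE = {"premature_done_claim", "skipped_verification", "unescalated_decision"}


def recommend_autonomy(observed: set[str]) -> Autonomy:
    """Assumed rule: any negative signal reduces autonomy; increasing it
    requires every positive signal; anything else holds steady."""
    unknown = observed - POSITIVE - NEGATIVE
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    if observed & NEGATIVE:
        return Autonomy.REDUCE
    if observed >= POSITIVE:
        return Autonomy.INCREASE
    return Autonomy.HOLD


print(recommend_autonomy({"all_criteria_met", "skipped_verification"}).value)
```

The asymmetry is deliberate in this sketch: a single trust-decreasing signal outweighs any number of positive ones, so autonomy is earned slowly and lost quickly.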
What it produces
- Journal entry in the epic’s journal.md: session summary, metrics, findings, learnings
- Pattern updates to docs/reference/patterns.md: new patterns with high confidence
- CLAUDE.md updates: durable, broadly applicable rules surfaced by the session
- STATUS.md update reflecting current state and what’s next
- Scenario catalog updates: new scenarios from error analysis, updated or retired existing ones
- Code-style updates to docs/reference/code-style.md: applied examples from human interventions
- Memory update to docs/memory.md: collaboration patterns and mental model changes
- Trust recommendation for the next session’s autonomy level
Key things to know
- Long sessions benefit from a fresh-session debrief: the agent’s compacted context may have lost nuance from early in the session.
- The most valuable scenarios come from actual failures. Error analysis during debrief is how the verification set gets stronger over time.
- The debrief is where human code-quality interventions get captured. If you corrected the agent’s approach during the session, the debrief ensures that calibration persists.
- The natural next step is /michi-planning for the next milestone, or /michi-sustainability if accumulated work needs a broader health check.
For the full agent instructions, see the SKILL.md source.