Troubleshooting
Purpose: Catalogue the recovery moves for when a journey stalls or the output looks wrong, so the team unsticks fast without recreating work items.
Reading time: ~10 minutes · Audience: BA / developer / EL coach · The cheat-sheet table is at the top, recovery techniques in the middle, the per-pattern diagnostic detail at the bottom.
Cheat sheet
| # | Symptom | Common cause | Recovery |
|---|---|---|---|
| 1 | "Create API" WI stalls on first run | Routed to swifter-backend-architect instead of swifter-backend-analyst | Re-run with swifter-backend-analyst:generate-scoped-specifications |
| 2 | Backend dev run fails repeatedly | New developer_run_02 chat opened | Continue in same chat with structured restart |
| 3 | First-session ABANDONED on origin | Session-naming artefact — normal | Treat "Origin ABANDONED → development DONE" as success |
| 4 | Storybook broken after rename | No build gate on .stories.ts / .mdx | Re-run with mandatory BuildStorybook step |
| 5 | Agent uses raw library instead of project component | Skipped component-discovery step | Add explicit reuse hint to WI; re-run |
| 6 | Spec and implementation diverge | No spec-drift diff after implementation | Run spec-drift diff; reconcile via chat |
| 7 | Defect routed through analyst gets stuck | Wrong entry agent | Restart via /swifter-default:start (autonomous triage) |
| 8 | Agent says "done" but output looks wrong | No self-check ran | Ask the agent to verify against its <verification_gates> — see Ask the agent to verify |
| 9 | Same WI failing repeatedly with different errors | LLM hallucination on a hard WI | Run several parallel sessions with the same WI description — see Parallel sessions |
| 10 | Large WI hits context limit mid-session | One WI doing too much | Fork to a new chat at the spec → implementation boundary — see Fork at risk boundaries |
| 11 | Agent visibly stuck and rephrasing won't help | Unclear what to do next | Ask the agent to explain its work, then to produce a detailed plan before re-running — see Ask the agent to explain |
Most stalls fall into one of these buckets. The recovery is usually the same: don't recreate the WI; fix the routing or the description and re-run in the same chat. The single most expensive anti-pattern observed in production is spawning <title>-attempt-2 and -attempt-3 WIs in place of amending the original.
Recovery techniques
When an LLM-driven workflow fails, the recovery is rarely "try harder on the same thing". It is one of the techniques below, applied in roughly the order they appear. As good as Swifter and the LLMs behind it are, LLMs sometimes hallucinate; the techniques are how the team recovers gracefully rather than treating each hallucination as a platform failure.
Before you blame the platform
Three quick checks before deeper debugging — the issue is often upstream:
- The requirements. If the BRD is ambiguous, the WI will be ambiguous, and the agent's output will be ambiguous in the same way. Fix the source.
- The WI description. Does it meet Work-Item Description Composition? Atomic scope, explicit scope-out, Figma frame (not file), conditional logic captured, Implementation Hints where the agent has stumbled before. Most "platform" stalls trace back to a description that's missing one of these.
- The workflow order. Was the right agent chosen at Start? Were the analyst gates approved before the developer was invoked? Out-of-order workflows produce surprising outputs.
If all three are clean, the recovery techniques below apply.
Ask the agent to verify
Every Swifter command embeds a <verification_gates> block — named Yes/No checklists the command must satisfy before proceeding (see e.g. swifter-frontend-analyst/commands/onboard-component.md). These gates run automatically inside the command, but the agent does not always self-audit thoroughly. When output looks wrong despite no error, the cheapest diagnostic is to ask the agent to walk its own gates in the same chat.
The phrasing to use:
Walk through every `<verification_gates>` block in the command you just ran. For each gate, answer Yes/No against the actual files on disk. Report any No items and fix them.
If the symptom is narrow, name the specific gate ("Verify the before_metadata gate from /swifter-frontend-analyst:onboard-component."). If the WI carried Implementation Hints, also ask:
Walk every line of the work-item description. For each instruction or hint, report whether the current implementation follows it, and where the implementation lives (file path + line range).
When to reach for this first: when the build is green and the agent reports done, but the artefact doesn't look right. When the build is red or an explicit error is raised, follow the resume command from the failing step first.
Ask the agent to explain
If verification gates come back clean but the output is still wrong, ask the agent to explain — describe what it did, in what order, and why. Asking for an explanation makes the agent re-traverse its own reasoning, and the gaps it skipped surface as inconsistencies in the explanation. The phrasing:
Explain step by step what you just did, in the order you did it. For each step, name the file you wrote (or modified) and why you wrote it that way. Don't summarise — be exhaustive.
Ask for a detailed plan before re-running
When a re-run is needed and the previous attempt was vague, ask for a plan before execution:
Before re-running, write out the plan. List, in order: each step you intend to take, the file or artefact each step produces, and where it will be written. Do not execute until I approve the plan.
This forces the agent to commit to a specific shape ahead of time, which is far easier to course-correct than mid-execution prose. Especially useful when the same problem has occurred more than once on the same WI.
Investigation WI
When the issue is suspected but not understood, open a small investigation WI in parallel — the description is the question being asked. Example: "Verify that the Figma frame at fetch-figma-data to read end-to-end; report the node count and identify any nodes the cache would truncate."
The investigation WI runs in its own chat, the findings come back as the agent's report, and the original WI's session stays clean. Investigation WIs are cheap to spawn and cheaper to discard once their question is answered.
Stop vibe-coding inside one WI
When the same WI has been hand-edited and re-run several times, the agent's context starts conflicting with itself — partial fixes from earlier prompts clash with the latest instruction. The discipline:
- Once the root cause is identified, stop iterating in the broken WI. Close it (or leave it as-is) and open a new WI with the lesson learned written into the description from the start.
- Non-functional findings belong in Implementation Hints — that section is exactly where workarounds and architectural pins live. A growing Implementation Hints section across many WIs is a signal that the platform / guidelines need an update.
- Functional findings belong in the new WI's body — scope-out, conditional-logic, sibling reference.
This protects the agent from accumulated context-window pressure inside a single broken session.
Parallel sessions
For a hard WI (especially anything spec-heavy), open multiple parallel sessions against the same WI. The LLM is non-deterministic; running the same prompt N times typically produces N-1 hallucinations and one usable output. Variability beats trying to argue a single confused session back to sanity.
The discipline:
- Run 3–5 parallel chats against the same WI body when one attempt has failed twice.
- Each session opens on its own branch; let each one finish independently.
- At the end, pick the cleanest output and discard the rest. Closed-without-merge sessions cost nothing.
Combine with the fork pattern below for very large WIs.
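The case for several sessions can be made concrete with a little arithmetic. The sketch below assumes each session produces a usable output independently and with the same probability `p`; the independence, the function name, and the value `0.3` are all illustrative assumptions, not measured Swifter data.

```typescript
// Chance that at least one of n sessions yields a usable output, assuming each
// session succeeds independently with probability p (an illustrative value,
// not a measured Swifter figure).
function chanceOfAtLeastOneUsable(p: number, n: number): number {
  if (p < 0 || p > 1) {
    throw new RangeError("p must be between 0 and 1");
  }
  if (n < 0 || Math.floor(n) !== n) {
    throw new RangeError("n must be a non-negative integer");
  }
  return 1 - Math.pow(1 - p, n);
}

// Compare a single attempt with the 3 to 5 parallel chats suggested above.
const assumedSuccessRate = 0.3; // hypothetical per-session success rate
for (const sessions of [1, 3, 5]) {
  const chance = chanceOfAtLeastOneUsable(assumedSuccessRate, sessions);
  console.log(`${sessions} session(s): ${(chance * 100).toFixed(1)}%`);
}
```

Under that assumed rate, one session succeeds about 30% of the time, three sessions about 66%, and five about 83%, which is why a handful of cheap parallel chats tends to beat arguing with one confused session.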
Fork at risk boundaries
When the team has reached a clean intermediate state (specs approved, page wrapper built, mapping table verified) and the next phase carries risk, fork to a new chat instead of continuing in the current one. The current chat carries everything from the previous phase — Figma cache, style mapping, analyst's working notes — which is mostly noise for the next phase and which the LLM has to keep paging through.
When to fork:
- After the analyst pass on a complex page, before invoking the developer.
- After the developer pass, before invoking the QA tester.
- After Storybook is green, before opening the FE-BE integration.
Forks are cheap; they reset the working context to exactly what the next phase needs and they expose new natural risk boundaries where parallel forks can be tried. (E.g. after specs are done, fork once for the developer and once for a parallel developer run — compare the two outcomes.)
Manual fix as the escape hatch
In extreme cases — a one-character typo, a misformatted YAML, an import that Swifter cannot infer — fix the code directly in the project repository. Then immediately onboard the fix so Swifter's specifications absorb what changed: run the relevant import-component-specs or sync-component-specs (UI) or re-run the analyst (generate-scoped-specifications) on the affected backend slice. A manual fix that is not onboarded becomes a silent divergence between the project code and the Swifter specs, which the next WI of the same type will overwrite.
Routing Mistakes
The largest cluster of stuck WIs at project start traces to the wrong agent picking up the item. The routing rules are not symmetric — different WI shapes go to different entry agents, and the autonomous-triage default works well for some shapes and not others.
| WI shape | Correct agent | Wrong agent (don't use) |
|---|---|---|
| Defect (UI single-control visual bug) | swifter-default:start → routes to swifter-frontend-developer | swifter-frontend-analyst |
| Sync / maintenance / import | swifter-frontend-analyst (sync-component-specs or import-component-specs) | autonomous triage may misroute |
| "Create API" (new endpoint) | swifter-backend-analyst | swifter-backend-architect — common mistake at project start |
| Architecture guidelines / coding conventions | swifter-backend-architect | analyst or developer |
| Frontend component or page | swifter-frontend-analyst (then developer) | autonomous triage on first run |
| Failed backend dev run, retry | continue same chat, structured restart | open new developer_run_02 chat |
Diagnostic move. Open the session's chat header and confirm which agent code is attached. If it's swifter-backend-architect on a "Create API" item, that's the bug; close the chat without saving artefacts and re-Start with the backend analyst explicitly selected. If it's the analyst on a defect, the agent will spin trying to write a spec for a one-line fix; restart via /swifter-default:start.
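The routing table can also be held as a small lookup so the chat-header check becomes mechanical. This is a minimal sketch: the agent codes come from the table above, while the shape keys, type names, and `checkRouting` are hypothetical helpers, not part of Swifter.

```typescript
// WI shapes from the routing table; the string keys are hypothetical labels.
type WiShape =
  | "defect"
  | "sync-maintenance-import"
  | "create-api"
  | "architecture-guidelines"
  | "frontend-component-or-page";

// Agent expected in the chat header for each shape, per the table above.
// A defect enters via swifter-default:start and lands on the frontend developer.
const expectedAgentByShape: Record<WiShape, string> = {
  "defect": "swifter-frontend-developer",
  "sync-maintenance-import": "swifter-frontend-analyst",
  "create-api": "swifter-backend-analyst",
  "architecture-guidelines": "swifter-backend-architect",
  "frontend-component-or-page": "swifter-frontend-analyst",
};

interface RoutingCheck {
  misrouted: boolean;
  expected: string;
  attached: string;
}

// Compare the agent code shown in the chat header with the expected one.
function checkRouting(shape: WiShape, attachedAgent: string): RoutingCheck {
  const expected = expectedAgentByShape[shape];
  return { misrouted: attachedAgent !== expected, expected, attached: attachedAgent };
}

// The classic project-start mistake: a "Create API" item on the architect.
const createApiCheck = checkRouting("create-api", "swifter-backend-architect");
console.log(createApiCheck.misrouted, "expected:", createApiCheck.expected);
```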
Re-Run Patterns
When the backend developer's first run fails — red gates, missing files, an unfinished implementation — the temptation is to open a new chat with a fresh name. Don't. The strongest contraindication from production was a backend WI that accumulated developer_run_02 through developer_run_05 chats and never reached Done; the same WI was later salvaged by reopening the original chat and supplying a structured restart message.
A structured restart message includes:
- What was attempted in the previous run (one sentence).
- What failed, specifically: file name, gate name, error message. Paste the error.
- What changed, if anything, between the previous run and this one: refined WI, updated spec, new contract.
- The explicit ask: "continue from current branch state, fix the failing test in `XService.spec.ts`, do not regenerate the controller."
This keeps the agent's understanding of session state intact and lets it resume from the same branch rather than re-deriving everything. The same pattern applies to frontend developer failures — same chat, structured restart, don't open developer_run_02.
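The four parts of a structured restart message can be captured in a small template so none of them is forgotten under pressure. This is a sketch only: the field names, line labels, and `buildRestartMessage` are hypothetical, and Swifter does not require any particular wording.

```typescript
interface RestartParts {
  attempted: string; // what the previous run attempted, in one sentence
  failure: string;   // file name, gate name, and the pasted error message
  changes?: string;  // what changed since the previous run, if anything
  ask: string;       // the explicit ask for this run
}

// Assemble the restart message; the labels are illustrative, not a Swifter format.
function buildRestartMessage(parts: RestartParts): string {
  if (parts.attempted.trim() === "" || parts.failure.trim() === "" || parts.ask.trim() === "") {
    throw new Error("attempted, failure and ask are all required");
  }
  return [
    `Previous run: ${parts.attempted}`,
    `What failed: ${parts.failure}`,
    `What changed: ${parts.changes !== undefined ? parts.changes : "nothing"}`,
    `Ask: ${parts.ask}`,
  ].join("\n");
}

// Example with placeholder content; paste the real error text in practice.
console.log(
  buildRestartMessage({
    attempted: "Implemented the service and its unit tests.",
    failure: "XService.spec.ts fails; error text pasted here.",
    ask: "Continue from current branch state, fix the failing test in XService.spec.ts, do not regenerate the controller.",
  })
);
```

Paste the result into the original chat rather than a new one, so the agent resumes from the same branch state.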
Convention Violations
The most pervasive class of issue across real engagements was agents violating documented project conventions and not self-auditing — wrong cast patterns, leftover metadata comments, wrong test-framework mocks, incorrect lifecycle hooks. The root cause is consistent: the agent generates plausible-looking code without re-reading framework_knowledge.md after writing, and declares done.
Typical violations to scan for in the Dev tab:
- Cast patterns: project uses `as Foo`, agent emits `<Foo>x` (or vice versa).
- Metadata comments: leftover `// TODO: from generation` or analyst-internal markers in production code.
- Test mocks: project standardises on one mocking library; the agent imports another it knows from training data.
- Lifecycle hooks: Angular `ngOnInit` vs React `useEffect` mixed across files in a project that has a clear preference.
- State management: bypassing the project's chosen store to reach into the service directly.
The platform-level fix is a mandatory post-generation compliance review that re-reads framework_knowledge.md; until that fix is universally enforced, treat the Dev-tab review as the place where these are caught, and add the specific violation as a literal hint to the WI when re-running — "do not import from @angular/forms; use the project's FormsModule re-export at src/app/shared/forms."
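Until a compliance review is enforced, part of the Dev-tab scan can be scripted. The sketch below is a line-based heuristic under stated assumptions: the two rules mirror examples from this section, while the rule list, type names, and `scanForViolations` are hypothetical and would need to be adapted to the project's own conventions.

```typescript
interface ConventionRule {
  label: string;
  pattern: RegExp; // keep these free of the "g" flag so test() stays stateless
}

interface ConventionHit {
  label: string;
  line: number; // 1-based line number in the scanned source
  text: string;
}

// Example rules drawn from the violations listed above; extend per project.
const exampleConventionRules: ConventionRule[] = [
  { label: "leftover TODO comment", pattern: /\/\/\s*TODO:/ },
  { label: "direct @angular/forms import", pattern: /from\s+["']@angular\/forms["']/ },
];

// Report every line of the source that matches one of the rules.
function scanForViolations(source: string, rules: ConventionRule[]): ConventionHit[] {
  const hits: ConventionHit[] = [];
  source.split("\n").forEach((text, index) => {
    for (const rule of rules) {
      if (rule.pattern.test(text)) {
        hits.push({ label: rule.label, line: index + 1, text: text.trim() });
      }
    }
  });
  return hits;
}

// Scan a small generated snippet.
const generatedSnippet = [
  'import { FormsModule } from "@angular/forms";',
  "const total = 0; // TODO: from generation",
].join("\n");
console.log(scanForViolations(generatedSnippet, exampleConventionRules));
```

A regex scan only catches literal patterns; it supplements the Dev-tab review and does not replace it.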
Storybook and Build Gates
Storybook builds break silently after renames and refactors — broken import paths in .stories.ts, orphaned MDX references, missing default exports — because the agent declares done without running the build. The failure mode is asymmetric: the component itself compiles, Preview shows the new screen, but Storybook on main is broken.
Build-gate failure modes to expect:
- Renamed component, stale story import. `Foo.stories.ts` still imports from `./Foo` after `Foo` moved to `./components/Foo`.
- Orphaned MDX. Documentation MDX references a story that no longer exists; Storybook fails on missing addon.
- Default export drift. Story file's default export `title` no longer matches the project's Storybook hierarchy convention.
- Snapshot drift. Storybook test-runner snapshots out of date after a non-visual rename.
Treat BuildStorybook as a blocking final step in any skill that writes .stories.ts or .mdx. When it fails, do not open a new WI; share the build log in the existing chat and let the agent fix the references.
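The stale-import failure mode can be spotted before the build runs. The following is a rough pre-check, not a substitute for the BuildStorybook step: it assumes the caller already knows which relative module specifiers resolve from the story file's folder, and the function names are hypothetical.

```typescript
// Collect relative import specifiers ("./Foo", "../shared/Bar") from source text.
function relativeImportSpecifiers(source: string): string[] {
  const pattern = /from\s+["'](\.{1,2}\/[^"']+)["']/g;
  const found: string[] = [];
  let match: RegExpExecArray | null = pattern.exec(source);
  while (match !== null) {
    const specifier = match[1];
    if (specifier !== undefined) {
      found.push(specifier);
    }
    match = pattern.exec(source);
  }
  return found;
}

// Return the relative imports that are not in the list of resolvable specifiers.
function staleStoryImports(storySource: string, resolvable: string[]): string[] {
  return relativeImportSpecifiers(storySource).filter(
    (specifier) => resolvable.indexOf(specifier) === -1
  );
}

// Foo moved to ./components/Foo, but the story still imports ./Foo.
const fooStorySource = [
  'import { Foo } from "./Foo";',
  'import type { Meta } from "@storybook/angular";',
].join("\n");
console.log(staleStoryImports(fooStorySource, ["./components/Foo"]));
```

Package imports such as `@storybook/angular` are ignored on purpose; only relative paths go stale after a rename.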
Unauthorised Writes
A recurring class of issue is agents writing files they weren't authorised to write — installing npm packages, modifying yarn.lock, creating stub directories at the project root, or touching shared configuration without surfacing the change in chat. The pattern is corrosive because the diffs land on the branch alongside the WI's intended changes and slip through casual review.
Triggers and mitigations:
- `yarn add` / `npm install` invocations. Hard guardrail: require explicit operator confirmation in chat before any package mutation. If the agent claims a new dependency is needed, that becomes a discussion, not a fait accompli.
- `yarn.lock` / `package-lock.json` modifications. Any change to the lockfile without a corresponding `package.json` change is automatically suspect.
- Project-root config files (`.eslintrc`, `tsconfig.json`, `angular.json`). These are owned by the project, not the WI; touching them needs an explicit reuse hint and a confirmation gate.
- New top-level directories. A stub `lib/` or `tools/` at the repo root for a single-component WI is a sign the agent has misread the project structure.
Diff review in the Dev tab is the catch point. If unauthorised writes appear, do not merge the PR — return to chat, ask the agent to revert the unauthorised files, and re-run only the parts it was authorised to write.
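The triggers above translate into a simple filter over the list of changed paths, which can make the Dev-tab review less reliant on eyeballing. This is a sketch under assumptions: the file names come from this section, but `flagSuspectWrites`, its inputs, and the reason strings are hypothetical, and how the changed-path list is obtained is left to the team's tooling.

```typescript
interface SuspectWrite {
  path: string;
  reason: string;
}

// File names come from the list above; everything else here is hypothetical.
const lockfileNames = ["yarn.lock", "package-lock.json"];
const rootConfigNames = [".eslintrc", "tsconfig.json", "angular.json"];

// changedPaths: repo-relative paths from the branch diff.
// knownTopLevelDirs: top-level directories that already exist on main.
function flagSuspectWrites(changedPaths: string[], knownTopLevelDirs: string[]): SuspectWrite[] {
  const findings: SuspectWrite[] = [];
  const manifestChanged = changedPaths.indexOf("package.json") !== -1;
  const reportedDirs: string[] = [];

  for (const path of changedPaths) {
    if (path === "package.json") {
      findings.push({ path, reason: "package mutation needs operator confirmation" });
    } else if (lockfileNames.indexOf(path) !== -1 && !manifestChanged) {
      findings.push({ path, reason: "lockfile changed without package.json" });
    } else if (rootConfigNames.indexOf(path) !== -1) {
      findings.push({ path, reason: "project-root config file touched" });
    }

    // Anything under a top-level directory that main does not have yet.
    const slash = path.indexOf("/");
    if (slash > 0) {
      const topDir = path.slice(0, slash);
      if (knownTopLevelDirs.indexOf(topDir) === -1 && reportedDirs.indexOf(topDir) === -1) {
        reportedDirs.push(topDir);
        findings.push({ path: topDir + "/", reason: "new top-level directory" });
      }
    }
  }
  return findings;
}

// A single-component WI that also touched the lockfile and created tools/.
console.log(flagSuspectWrites(["yarn.lock", "src/app/foo.component.ts", "tools/stub.ts"], ["src"]));
```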
Spec-Implementation Drift
Outputs, field types, and function signatures silently diverge between the analyst's YAML spec and the developer's TypeScript until the operator audits the two side by side. The drift is often small — a renamed field, a changed return type, a function that takes one argument in the spec and two in the code — and the cumulative effect is integration breakage downstream.
The remediation is to run a spec-drift diff after every implementation. Concretely:
- Compare the analyst's `data_model.yaml` and logic specs against the developer's generated TypeScript types and function signatures.
- Flag any field whose name, type, or cardinality changed; any function whose signature changed; any return type that was widened or narrowed silently.
- Reconcile via chat: either update the spec to match the implementation (rare; the spec is the contract), or update the implementation to match the spec (typical).
The discipline of "the spec is the contract" is what keeps integration WIs cheap. Drift that isn't caught at the page WI shows up as field-name mismatches at integration time and costs a session to unwind.
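The core of a spec-drift diff is a comparison of two field maps. The sketch below assumes the spec and the generated types have already been reduced to plain field-name-to-type maps; parsing `data_model.yaml` or TypeScript sources is out of scope, and the type and function names are hypothetical.

```typescript
// Field name -> type name, for example { amount: "number" }.
type FieldTypeMap = { [field: string]: string };

interface DriftItem {
  field: string;
  kind: "type-changed" | "missing-in-code" | "extra-in-code";
  specType: string | null;
  codeType: string | null;
}

function hasField(map: FieldTypeMap, field: string): boolean {
  return Object.prototype.hasOwnProperty.call(map, field);
}

// List every field whose presence or type differs between spec and code.
function diffFieldMaps(spec: FieldTypeMap, code: FieldTypeMap): DriftItem[] {
  const drift: DriftItem[] = [];
  for (const field of Object.keys(spec)) {
    if (!hasField(code, field)) {
      drift.push({ field, kind: "missing-in-code", specType: spec[field], codeType: null });
    } else if (spec[field] !== code[field]) {
      drift.push({ field, kind: "type-changed", specType: spec[field], codeType: code[field] });
    }
  }
  for (const field of Object.keys(code)) {
    if (!hasField(spec, field)) {
      drift.push({ field, kind: "extra-in-code", specType: null, codeType: code[field] });
    }
  }
  return drift;
}

// A changed type plus a renamed field (settledAt became settled_at).
console.log(
  diffFieldMaps(
    { id: "string", amount: "number", settledAt: "string" },
    { id: "string", amount: "string", settled_at: "string" }
  )
);
```

A renamed field surfaces as one missing and one extra entry, which is usually enough of a prompt to reconcile in chat.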
Session-Naming Artefact
A persistent ~47% first-session-ABANDONED rate across pilot data is the project baseline, not a failure. The cause is purely mechanical: session 1 (named origin or analyst) completes the specification work, session 2 (named development) opens to do the implementation, and when the new session opens the previous one flips to ABANDONED because its chat is no longer active. The WI itself reaches Done via session 2.
How to read the metrics correctly:
- Origin ABANDONED → development DONE is a normal, successful arc — typical for page WIs.
- Origin DONE alone is normal for sync items, single-control defects, and simple cosmetic components that finish in one session.
- All sessions ABANDONED, no Done session is the real failure — that's a stuck WI.
- Same WI re-spawned as `<title>-attempt-2` is the worst signal; the underlying issue was a description problem that wasn't fixed.
Communicate this to team leads early. Dashboards that count ABANDONED sessions as failures will mislead the team into thinking the platform is broken when it's just naming.
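A dashboard that follows the reading rules above would count outcomes per work item rather than per session. Here is a minimal sketch of that classification; the two status values come from this section, while the type names, the outcome labels, and both functions are hypothetical, and the logic assumes it is applied only to WIs with no session still open.

```typescript
type WiSessionStatus = "DONE" | "ABANDONED";
type WiOutcome = "success" | "stuck" | "no-sessions";

// A WI succeeds if any session reached DONE; it is stuck only if none did.
function classifyWorkItem(sessionStatuses: WiSessionStatus[]): WiOutcome {
  if (sessionStatuses.length === 0) {
    return "no-sessions";
  }
  return sessionStatuses.indexOf("DONE") !== -1 ? "success" : "stuck";
}

// Share of work items that are genuinely stuck, counted per WI, not per session.
function stuckShare(workItems: WiSessionStatus[][]): number {
  if (workItems.length === 0) {
    return 0;
  }
  const stuckCount = workItems.filter((statuses) => classifyWorkItem(statuses) === "stuck").length;
  return stuckCount / workItems.length;
}

// Origin ABANDONED then development DONE is a success, not a failure.
console.log(classifyWorkItem(["ABANDONED", "DONE"]));
console.log(classifyWorkItem(["ABANDONED", "ABANDONED"]));
```

Counting this way, a page WI with an abandoned origin session and a finished development session contributes nothing to the failure figure.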
Over-Fragmentation
The anti-pattern of spawning <title>-attempt-2, <title>-attempt-3 WIs in place of amending the original is the single highest-leverage thing to forbid on a new engagement. Production data shows a Reconciliation API spread across nine attempt-named WIs (the description body was correct each time, but the mock data source kept changing); the work would have completed faster if the team had amended the original WI's data section in place.
Amend-in-place discipline:
- Same data shape, different details: edit the original WI's description; do not create a new one.
- Truly different scope: split into a new WI, but rename to reflect the new scope (`Add settlement endpoint`), not the attempt (`Reconciliation API attempt 2`).
- Recurring failure on the same description: the description is the bug; fix it. Recreating the WI does not fix descriptions.
- Different agent attempt: same WI, different agent selection at Start. Do not duplicate.
If the team finds itself reaching for -attempt-2, that is the signal to stop, re-read the WI, and apply the lessons from Routing Mistakes, Re-Run Patterns, or Convention Violations above before spending another session.
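Attempt-named WIs are easy to detect from titles alone, which makes the anti-pattern auditable on a backlog export. The sketch below assumes the suffix looks like `-attempt-2` or ` attempt 2`; the pattern and the function names are hypothetical and may need widening for a team's actual naming habits.

```typescript
// Matches a trailing "-attempt-2", " attempt 3", and similar suffixes.
const attemptSuffixPattern = /[-\s]+attempt[-\s]*\d+\s*$/i;

function isAttemptNamed(title: string): boolean {
  return attemptSuffixPattern.test(title);
}

// Strip the attempt suffix to recover the original WI title.
function baseWiTitle(title: string): string {
  return title.replace(attemptSuffixPattern, "").trim();
}

// Count how many WIs share each base title; more than one suggests re-spawning.
function countByBaseTitle(titles: string[]): { [base: string]: number } {
  const counts: { [base: string]: number } = {};
  for (const title of titles) {
    const base = baseWiTitle(title);
    counts[base] = (counts[base] || 0) + 1;
  }
  return counts;
}

console.log(
  countByBaseTitle([
    "Reconciliation API",
    "Reconciliation API-attempt-2",
    "Reconciliation API attempt 3",
    "Add settlement endpoint",
  ])
);
```

Any base title with a count above one is a candidate for amending the original WI in place.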