Resilience
Designing for Exceptions, Not the Happy Path
April 2026
The Reality Split
Happy Path: 20% of reality
Exceptions: 80% of reality
The happy path describes what was planned. The exception paths reveal what the system is.
This is a familiar distributed-systems stance: design for partial failure, retries, timeouts, missing dependencies, and ambiguous state. In AI systems, exceptions dominate because the world is adversarial, noisy, and underspecified.
Why Exceptions Dominate in AI
In AI systems, exceptions aren’t edge cases. They’re the primary operating condition.
Ambiguous inputs
Users omit details, use slang, or provide contradictory information. What looks like a clear question to a human is often deeply underspecified.
Model uncertainty
Models are uncertain by nature and occasionally fail in ways that are hard to predict. Fluency masks wrongness: a wrong answer reads as smoothly as a right one.
Retrieval failure
Retrieval can return no relevant documents, stale documents, or badly chunked passages. RAG doesn’t guarantee the right context reaches the model.
System failures
Upstream and downstream services fail: timeouts, rate limits, outages. The AI layer inherits every fragility beneath it.
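Some of the failure modes listed above can be detected before the model is ever called. The following is a minimal sketch for the retrieval case, assuming each retrieved document carries a relevance score and a last-updated timestamp. The `Doc` fields, the `min_score` threshold, and the `max_age` window are illustrative assumptions, not values from any particular retrieval stack.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class RetrievalStatus(Enum):
    OK = "ok"
    EMPTY = "empty"   # no relevant documents found
    STALE = "stale"   # relevant documents exist, but all are too old


@dataclass
class Doc:
    text: str
    score: float          # hypothetical relevance score in [0, 1]
    updated_at: datetime  # when the source document was last updated


def check_retrieval(docs, now, min_score=0.5, max_age=timedelta(days=180)):
    """Classify a retrieval result before any generation happens."""
    relevant = [d for d in docs if d.score >= min_score]
    if not relevant:
        return RetrievalStatus.EMPTY, []
    fresh = [d for d in relevant if now - d.updated_at <= max_age]
    if not fresh:
        # Hand back the stale documents so the caller can decide what to do.
        return RetrievalStatus.STALE, relevant
    return RetrievalStatus.OK, fresh
```

A caller would branch on the returned status: generate only on `OK`, and treat `EMPTY` and `STALE` as explicit exception states rather than letting the model answer from whatever context happened to arrive.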
Case Studies
Three different domains, same lesson.
Drive-thru ordering is an exception-dense environment
Noise, accents, interruptions, menu complexity, and tight latency expectations make every order a potential exception. McDonald's ended the pilot, not because the AI didn't work in demos, but because demos aren't drive-thrus.
Diagnosis
The happy path (clear voice, simple order, quiet environment) is the minority case. Real conditions are dominated by exceptions the system wasn't designed for.
Transferable Pattern
Uncertainty-first dialogue: explicit confirmation steps for low-confidence intents, seamless handoff to humans as a designed state, metrics tied to order correctness.
A policy contradiction the chatbot couldn't handle
When a customer asked about bereavement fares, the chatbot gave information that contradicted the airline's actual policy. The exception — a policy edge case — triggered legal, trust, and cost exposure.
Diagnosis
The system had no way to recognize that it was in uncertain territory. Instead of escalating or expressing uncertainty, it generated a confident answer.
Transferable Pattern
Confidence gating: route uncertain answers to escalation rather than improvisation. Treat 'I can't verify this' as a first-class output state.
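One way to make "I can't verify this" a first-class output state is to encode it in the return type, so callers cannot mistake it for an answer. The sketch below assumes answers may only come from a policy store. The `POLICIES` dictionary, its contents, and the source labels are hypothetical placeholders, not any airline's real policy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AnswerState(Enum):
    VERIFIED = "verified"
    CANNOT_VERIFY = "cannot_verify"


@dataclass
class Answer:
    state: AnswerState
    text: str
    source: Optional[str] = None  # where the answer came from, if verified


# Hypothetical policy store: topic -> (policy text, source reference).
POLICIES = {
    "baggage": ("One carry-on bag is included.", "policy/baggage"),
}


def answer_policy_question(topic: str) -> Answer:
    """Answer only from the policy store; otherwise say so and escalate."""
    entry = POLICIES.get(topic)
    if entry is None:
        # No verified policy text: do not improvise an answer.
        return Answer(
            AnswerState.CANNOT_VERIFY,
            "I can't verify this. Let me connect you with an agent.",
        )
    text, source = entry
    return Answer(AnswerState.VERIFIED, text, source)
```

Because the state is part of the type, downstream code has to handle `CANNOT_VERIFY` explicitly, for example by opening an escalation ticket instead of rendering the text as policy.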
Exceptions and nuance as a feature, not a bug
Their 'expert-in-the-loop' approach treats taste, tone, and context as primary realities — not edge cases to be smoothed over. Human judgment and algorithmic generation are intentionally combined.
Diagnosis
Instead of trying to eliminate exceptions, they designed the system to embrace them. Domain experts define quality criteria and review outputs.
Transferable Pattern
Expert-in-the-loop: humans define quality, review outputs, and feed failure examples back into evaluation suites. Iterate on the definition of 'good' rather than trying to automate it away.
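The feedback loop described above, where expert-flagged failures become evaluation cases, can be sketched as a small regression suite. All names here (`EvalCase`, `EvalSuite`, `add_failure`) are hypothetical, and the quality check is assumed to be a simple function an expert can express or approve.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # expert-defined criterion: True means acceptable
    note: str = ""                # the expert's reason for flagging the failure


@dataclass
class EvalSuite:
    cases: List[EvalCase] = field(default_factory=list)

    def add_failure(self, prompt: str, check: Callable[[str], bool], note: str = "") -> None:
        """Record an expert-flagged failure as a permanent regression case."""
        self.cases.append(EvalCase(prompt, check, note))

    def run(self, generate: Callable[[str], str]) -> List[EvalCase]:
        """Return the cases the generator still fails."""
        return [c for c in self.cases if not c.check(generate(c.prompt))]


suite = EvalSuite()
suite.add_failure(
    "reply to a complaint",
    lambda out: "!" not in out,
    note="expert: tone too cheerful",
)
still_failing = suite.run(lambda prompt: "So sorry about that!")
```

In this usage, `still_failing` holds the one recorded case, because the stand-in generator still produces an exclamation mark. The definition of "good" lives in the expert-written checks, and it can be revised as the experts' criteria change.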
Exception-First Design Patterns
Four patterns that shift the design stance from “handle exceptions when they happen” to “design for exceptions first.”
Fallback design
Always maintain a deterministic baseline for core workflows. If the AI path fails, the user must still be able to complete the job. This is the non-negotiable minimum.
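A minimal sketch of this pattern wraps the AI path so that any failure routes to the deterministic baseline. It assumes the AI client surfaces timeouts and errors as exceptions. The function names and the keyword-based baseline are illustrative only.

```python
import logging

logger = logging.getLogger("fallback")


def complete_with_fallback(request, ai_path, baseline):
    """Try the AI path; on any failure, finish the job deterministically."""
    try:
        result = ai_path(request)
        if result is None:
            # Treat an empty result as a failure, not as an answer.
            raise ValueError("AI path returned no result")
        return {"result": result, "path": "ai"}
    except Exception as exc:
        logger.warning("AI path failed, using baseline: %s", exc)
        return {"result": baseline(request), "path": "baseline"}


def broken_model(request):
    # Stand-in for a model call that times out.
    raise TimeoutError("model timed out")


def keyword_route(request):
    # Deterministic baseline: plain keyword routing, no model involved.
    return "billing" if "invoice" in request else "general"


outcome = complete_with_fallback("my invoice is missing", broken_model, keyword_route)
```

Recording which path served the request (`"ai"` or `"baseline"`) keeps the fallback rate measurable, which matters when the baseline quietly starts handling most of the traffic.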
Uncertainty-first dialogue
Explicit confirmation steps for low-confidence intents. The system communicates what it’s unsure about rather than guessing. ‘I’m not confident about this’ is a better output than a wrong answer.
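As a rough illustration, a dialogue turn can branch on intent confidence and ask for confirmation instead of acting. This sketch assumes an upstream classifier supplies a confidence score. The `Intent` shape and the `CONFIRM_BELOW` threshold are hypothetical and would need tuning against real traffic.

```python
from dataclasses import dataclass


@dataclass
class Intent:
    name: str
    slots: dict
    confidence: float  # assumed to come from an upstream intent classifier


CONFIRM_BELOW = 0.8  # hypothetical threshold


def next_turn(intent: Intent) -> dict:
    """Execute confident intents; ask before acting on uncertain ones."""
    if intent.confidence >= CONFIRM_BELOW:
        return {"action": "execute", "intent": intent.name, "slots": intent.slots}
    # Say what the system thinks it heard, so the user can correct it.
    details = ", ".join(f"{k}: {v}" for k, v in intent.slots.items())
    return {
        "action": "confirm",
        "prompt": f"I'm not sure I got that right. Did you mean {intent.name} ({details})?",
    }
```

The confirmation prompt repeats the interpreted slots back to the user, which makes the uncertainty specific rather than a generic "please repeat".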
Human-in-the-loop
Design human oversight as part of the system, not as an escalation hack. Give reviewers the interface tools, competence, authority, and ability to intervene or stop the system.
Confidence gating
Route outputs through confidence thresholds. High confidence → deliver. Medium → deliver with caveats. Low → escalate. Never deliver low-confidence outputs as if they’re certain.
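The three tiers above map directly onto a small routing function. This is a sketch under stated assumptions: the confidence score is taken to be a calibrated value in [0, 1], and the `HIGH` and `LOW` cutoffs are placeholder numbers, not recommendations.

```python
from enum import Enum


class Route(Enum):
    DELIVER = "deliver"
    DELIVER_WITH_CAVEAT = "deliver_with_caveat"
    ESCALATE = "escalate"


HIGH = 0.85  # hypothetical cutoff for delivering as-is
LOW = 0.50   # hypothetical cutoff below which a human takes over


def gate(confidence: float) -> Route:
    """Map a confidence score to one of the three delivery routes."""
    if not 0.0 <= confidence <= 1.0:
        # Malformed scores (including NaN) are treated as low confidence.
        return Route.ESCALATE
    if confidence >= HIGH:
        return Route.DELIVER
    if confidence >= LOW:
        return Route.DELIVER_WITH_CAVEAT
    return Route.ESCALATE
```

Sending out-of-range scores to `ESCALATE` is a deliberate choice here: a broken confidence signal is itself an exception, and the safe default is a human, not delivery.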
UX for Exceptions
The most important AI UX isn’t the “magic answer.” It’s the recovery UI.
Communicate uncertainty and boundaries
State what the system can and can’t do. Set expectations before failure, not after. Users who understand limits trust the system more than users who discover limits through errors.
Show provenance for high-stakes claims
Show which documents were used and when they were last updated. Prevent overtrust by making the system’s evidence visible.
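One possible shape for this is an answer object that always travels with its sources and renders them alongside the text. The `Source` and `SourcedAnswer` types and the wording of the no-source warning are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Source:
    title: str
    last_updated: date


@dataclass
class SourcedAnswer:
    text: str
    sources: List[Source] = field(default_factory=list)

    def render(self) -> str:
        """Render the answer with its sources, or flag that it has none."""
        if not self.sources:
            return (
                f"{self.text}\n"
                "(No supporting documents found. Please verify before relying on this.)"
            )
        lines = [self.text, "Sources:"]
        lines += [
            f"- {s.title} (last updated {s.last_updated.isoformat()})"
            for s in self.sources
        ]
        return "\n".join(lines)
```

An answer with no sources is still rendered, but with an explicit warning, so the absence of provenance is visible to the user instead of silently dropped.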
Provide controls
Offer undo, edit, confirm, and escalate controls, so the user is never trapped in an AI-driven path. Every automated decision has a manual override.
If you can only design one thing well, design the recovery path. That's where trust is won or lost.
Exceptions are where systems prove themselves.
If your AI system only works on the happy path, it doesn't work yet.