Big Freight Life

Designing the System Behind Your AI

AI is a participant inside a designed system, not the system itself. Most teams have it backwards.

2026

Ray Butler, AI Systems Architect

AI amplifies whatever system it operates inside.

Most teams are using AI backwards. They start with prompts. They build features. They hope the system holds. It doesn’t.

If the system is unclear, the output is unreliable.

AI doesn’t replace systems. It exposes whether you have one.

The Problem

AI has made execution cheap. Code can be generated. Interfaces can be assembled. Features can be shipped faster than ever.

But something hasn’t changed: judgment.

Most teams are still designing at the surface:

  • prompting instead of structuring

  • reacting instead of defining

  • building outputs instead of systems

The result:

  • unpredictable behavior

  • brittle workflows

  • no clear ownership of decisions

The Insight

So instead of asking “What can AI build?” the better question is:

What system should AI operate inside?

The Approach

I designed a starter system that forces AI to operate inside defined boundaries, explicit contracts, measurable evaluation loops, and human oversight. Not as a tool. As a participant in a system.

At its core, the system separates concerns clearly. Each layer has a responsibility. Nothing overlaps. Nothing is implied.

  • .ai/ → system rules and contracts

  • src/agents/ → bounded AI workers

  • orchestrator → coordination layer

  • approvals → human-in-the-loop

  • evaluation → measurable behavior

  • DESIGN.md → visual system

  • CLAUDE.md → AI memory layer
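One way to picture an "explicit contract" for a bounded worker is a typed boundary that is checked at runtime on both input and output. The following is a minimal sketch under assumptions: `AgentContract`, `runBounded`, and the label set are hypothetical names chosen for illustration, not the format the starter repo actually uses.

```typescript
// Hypothetical contract shape: these names are illustrative, not taken from the repo.
interface AgentContract<In, Out> {
  name: string;
  // Rejects inputs the agent is not allowed to handle.
  acceptsInput: (input: In) => boolean;
  // Rejects outputs that fall outside the agreed boundary.
  acceptsOutput: (output: Out) => boolean;
}

type BoundedResult<Out> =
  | { ok: true; output: Out }
  | { ok: false; reason: string };

// Runs an agent only through its contract; anything out of bounds is refused.
function runBounded<In, Out>(
  contract: AgentContract<In, Out>,
  agent: (input: In) => Out,
  input: In,
): BoundedResult<Out> {
  if (!contract.acceptsInput(input)) {
    return { ok: false, reason: `${contract.name}: input rejected by contract` };
  }
  const output = agent(input);
  if (!contract.acceptsOutput(output)) {
    return { ok: false, reason: `${contract.name}: output rejected by contract` };
  }
  return { ok: true, output };
}

// Example: a labeler that may only emit one of three known labels (assumed set).
const LABELS = ["billing", "support", "other"];

const labelContract: AgentContract<string, string> = {
  name: "labeler",
  acceptsInput: (text) => text.trim().length > 0,
  acceptsOutput: (label) => LABELS.some((known) => known === label),
};

// Two stand-in agents: one stays in bounds, one does not.
const inBounds = runBounded(labelContract, () => "billing", "Refund my invoice");
const outOfBounds = runBounded(labelContract, () => "something-else", "Refund my invoice");
```

The point of the design is that the agent function never returns directly to the caller. Every result passes through the contract, so an out-of-bounds answer becomes a refused result with a reason rather than silent output.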

Multi-agent by design.

This isn’t theoretical. The system demonstrates a real multi-agent pattern where agents check each other before anything ships.

Input
↓
Classifier Agent
↓
Validator Agent
↓
Orchestrator
↓
Human Approval (if needed)
↓
Outcome

One agent proposes. One agent critiques. The system decides what happens next.

This is where most “agent systems” fail. They generate. They don’t govern.
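The propose, critique, decide flow described above can be sketched in a few functions. This is an illustration under assumptions, not the repo's code: the keyword rules stand in for model calls, the `AUTO_THRESHOLD` cutoff is invented, every name is hypothetical, and the agents are synchronous only to keep the sketch short.

```typescript
// All names here are hypothetical; this mirrors the described flow, not the repo's code.
type Category = "billing" | "support" | "unknown";

interface Proposal {
  category: Category;
  confidence: number; // 0..1, reported by the classifier
  reasoning: string;
}

interface Critique {
  agrees: boolean;
  notes: string;
}

type Outcome =
  | { status: "auto"; category: Category }
  | { status: "needs_human"; proposal: Proposal; critique: Critique };

// Classifier agent: proposes a label. A keyword rule stands in for a model call.
function classify(input: string): Proposal {
  if (/invoice|refund/i.test(input)) {
    return { category: "billing", confidence: 0.9, reasoning: "mentions invoice or refund" };
  }
  if (/error|crash/i.test(input)) {
    return { category: "support", confidence: 0.6, reasoning: "mentions error or crash" };
  }
  return { category: "unknown", confidence: 0.2, reasoning: "no known keyword" };
}

// Validator agent: critiques the proposal instead of producing its own answer.
function validate(input: string, proposal: Proposal): Critique {
  if (proposal.category === "unknown") {
    return { agrees: false, notes: "classifier could not decide" };
  }
  if (input.trim().length < 10) {
    return { agrees: false, notes: "input too short to support any label" };
  }
  return { agrees: true, notes: "label is supported by the input" };
}

// Assumed cutoff: below this, a human decides even if the validator agrees.
const AUTO_THRESHOLD = 0.8;

// Orchestrator: the system, not either agent, decides what happens next.
function orchestrate(input: string): Outcome {
  const proposal = classify(input);
  const critique = validate(input, proposal);
  if (critique.agrees && proposal.confidence >= AUTO_THRESHOLD) {
    return { status: "auto", category: proposal.category };
  }
  return { status: "needs_human", proposal, critique };
}
```

In this sketch, `orchestrate("Please refund my last invoice")` resolves automatically, while a low-confidence or disputed proposal is returned with both the proposal and the critique attached so a reviewer sees why it was escalated.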

Design as a contract.

Most systems treat design as decoration. This system treats it as infrastructure. DESIGN.md defines colors, typography, layout, component behavior, and AI interface patterns, not as guidelines but as a source of truth AI must follow.

AI interaction is explicit:

  • Recommendations are labeled.

  • Confidence is visible.

  • Reasoning is exposed.

  • Human approval is explicit.
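The four interface rules above can be expressed as a data shape that the UI has to render in full. The sketch below is an assumption about how that might look: the `Recommendation` type, its field names, and the text rendering are hypothetical and are not taken from DESIGN.md.

```typescript
// Hypothetical shape: field names are illustrative, not taken from DESIGN.md.
type Approval =
  | { state: "pending" }
  | { state: "approved"; by: string }
  | { state: "rejected"; by: string };

interface Recommendation {
  kind: "ai_recommendation"; // explicit label: never presented as a plain fact
  summary: string;
  confidence: number; // 0..1, always shown to the reader
  reasoning: string[]; // exposed, not hidden behind the answer
  approval: Approval; // explicit human decision
}

// Renders every required element; nothing is optional in the output.
function describe(rec: Recommendation): string {
  const percent = Math.round(rec.confidence * 100);
  const lines = [
    `[AI recommendation] ${rec.summary}`,
    `Confidence: ${percent}%`,
    ...rec.reasoning.map((reason) => `Because: ${reason}`),
    `Approval: ${rec.approval.state}`,
  ];
  return lines.join("\n");
}

// A recommendation can only take effect once a human has approved it.
function canApply(rec: Recommendation): boolean {
  return rec.approval.state === "approved";
}

// Returns a new record; the original pending recommendation is left untouched.
function approve(rec: Recommendation, reviewer: string): Recommendation {
  return { ...rec, approval: { state: "approved", by: reviewer } };
}

const pending: Recommendation = {
  kind: "ai_recommendation",
  summary: "Route this request to billing",
  confidence: 0.72,
  reasoning: ["mentions an invoice number"],
  approval: { state: "pending" },
};
```

Because confidence, reasoning, and approval are required fields rather than optional ones, a recommendation that omits any of them does not typecheck, which is one way to make the interface pattern enforceable rather than advisory.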

Validation is built in.

Every system claim is testable. AI behavior isn’t assumed. It’s measured.

npm run validate
  → typechecking
  → agent evaluation
  → pipeline evaluation
  → dependency audit
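What "agent evaluation" means in practice is not spelled out here, so the following is only a sketch of one plausible form: run an agent against labeled cases and gate on a minimum accuracy. The names (`EvalCase`, `evaluateAgent`, `meetsThreshold`), the sample cases, and the stand-in agent are all assumptions.

```typescript
// Hypothetical evaluation harness; not the repo's actual validate step.
interface EvalCase {
  input: string;
  expected: string;
}

interface EvalReport {
  total: number;
  passed: number;
  accuracy: number; // passed / total, or 0 when there are no cases
  failures: EvalCase[];
}

// Runs the agent on every case and records which ones it got wrong.
function evaluateAgent(
  agent: (input: string) => string,
  cases: EvalCase[],
): EvalReport {
  const failures = cases.filter((c) => agent(c.input) !== c.expected);
  const passed = cases.length - failures.length;
  const accuracy = cases.length === 0 ? 0 : passed / cases.length;
  return { total: cases.length, passed, accuracy, failures };
}

// The gate: behavior is measured against a stated bar, not assumed.
function meetsThreshold(report: EvalReport, minAccuracy: number): boolean {
  return report.accuracy >= minAccuracy;
}

// Stand-in agent: keyword rules where a model call would normally go.
const keywordAgent = (input: string): string => {
  if (/invoice|refund/i.test(input)) return "billing";
  if (/crash/i.test(input)) return "support";
  return "unknown";
};

// Assumed sample cases; the last one is deliberately outside the agent's rules.
const sampleCases: EvalCase[] = [
  { input: "Refund my invoice", expected: "billing" },
  { input: "The app crashed", expected: "support" },
  { input: "Invoice is wrong", expected: "billing" },
  { input: "Cannot log in", expected: "support" },
];

const report = evaluateAgent(keywordAgent, sampleCases);
```

A validation script built on this would typically fail the run when `meetsThreshold` returns false, and print `report.failures` so the failing inputs are visible rather than averaged away.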

Security by default.

The system intentionally avoids premature dependencies. No model SDK by default. No vendor lock-in. No hidden risk surface. Model providers are added only when justified.
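One way to keep model SDKs out of the default install is to depend on a small interface and ship only a local stub behind it. The sketch below assumes that approach. `ModelProvider`, `selectProvider`, and the written-justification rule are hypothetical illustrations of "added only when justified", and the call is synchronous only for brevity (a real provider call would be asynchronous).

```typescript
// Hypothetical seam: the rest of the system depends on this, never on a vendor SDK.
interface ModelProvider {
  name: string;
  complete(prompt: string): string;
}

// Default provider: deterministic, local, and free of external dependencies.
const localStub: ModelProvider = {
  name: "local-stub",
  complete: (prompt) => `stub response for ${prompt.length} characters`,
};

// A real provider must arrive with a recorded reason for adding it.
interface ProviderRegistration {
  provider: ModelProvider;
  justification: string;
}

// Falls back to the stub unless a provider is registered with a justification.
function selectProvider(registration?: ProviderRegistration): ModelProvider {
  if (!registration) {
    return localStub;
  }
  if (registration.justification.trim().length === 0) {
    throw new Error(
      `provider "${registration.provider.name}" needs a written justification`,
    );
  }
  return registration.provider;
}

// Example registration; "example-vendor" is a placeholder, not a real SDK.
const exampleVendor: ModelProvider = {
  name: "example-vendor",
  complete: (prompt) => `vendor response to: ${prompt}`,
};

const chosen = selectProvider({
  provider: exampleVendor,
  justification: "stub cannot handle free-form plan text",
});
```

Swapping vendors then means writing one adapter that satisfies `ModelProvider`, which keeps the lock-in surface to a single file.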

Claude as a system participant.

With CLAUDE.md, AI is no longer just a tool. It becomes a team member with constraints — reading system context, following contracts, executing within defined rules, and using repeatable commands.

/aisys-review
/aisys-validate
/aisys-new-agent

AI doesn’t improvise. It operates inside the system.

Evolution.

This system didn’t start complete. It was refined through constraint and correction. Each step reduced ambiguity. Each step increased system integrity.

  • v0.6: Removed AI SDK. Control over convenience.

  • v0.8: Introduced DESIGN.md. Design becomes enforceable.

  • v0.9: Proved multi-agent pattern. Not theoretical: a working classifier-validator-orchestrator loop.

  • v0.9.1: Fixed runtime assumptions. Node types bug. Verification caught it before production.

  • v0.10: Claude-native workflows. AI becomes a collaborator, not a wrapper.

Real-world application.

This isn’t a sandbox exercise. The pattern maps directly to anywhere decisions matter:

  • AI-assisted plan review (PlanFlow)

  • regulated workflows

  • enterprise automation systems

  • multi-agent decision pipelines

Outcome.

A system that:

  • is runnable on day one

  • enforces structure before output

  • scales without collapsing

  • keeps humans in control

Download the starter repo on GitHub.

Clone it, run npm run validate, and you’ll have a working multi-agent system on day one.


Few are designing systems for AI.

Most teams are building with AI. That’s the difference. Design the system behind your AI: not just what it produces, but how it behaves.