AI Systems Architect | Big Freight Life

AI amplifies whatever system it operates inside.

Most teams are using AI backwards. They start with prompts. They build features. They hope the system holds. It doesn’t.

If the system is unclear, the output is unreliable.

AI doesn’t replace systems. It exposes whether you have one.

The Problem

AI has made execution cheap. Code can be generated. Interfaces can be assembled. Features can be shipped faster than ever.

But something hasn’t changed: judgment.

Most teams are still designing at the surface:

prompting instead of structuring
reacting instead of defining
building outputs instead of systems

The result:

unpredictable behavior
brittle workflows
no clear ownership of decisions

The Insight

So instead of asking “What can AI build?” the better question is:

What system should AI operate inside?

The Approach

I designed a starter system that forces AI to operate inside defined boundaries, explicit contracts, measurable evaluation loops, and human oversight. Not as a tool. As a participant in a system.

At its core, the system separates concerns clearly. Each layer has a responsibility. Nothing overlaps. Nothing is implied.

.ai/ → system rules and contracts
src/agents/ → bounded AI workers
orchestrator → coordination layer
approvals → human-in-the-loop
evaluation → measurable behavior
DESIGN.md → visual system
CLAUDE.md → AI memory layer
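To make "explicit contracts" concrete, here is a minimal sketch of what a bounded worker could look like in TypeScript. The names AgentContract, runBounded, and labelAgent are hypothetical and are not taken from the starter repo; the point is only that input and output are both checked against a declared boundary, so nothing is implied.

```typescript
// Hypothetical contract shape: every bounded worker declares what it accepts
// and what it is allowed to return. Names are illustrative assumptions.
interface AgentContract<I, O> {
  name: string;
  // Rejects malformed input before the agent runs.
  acceptsInput(input: unknown): input is I;
  run(input: I): O;
  // Rejects output that falls outside the agreed boundary.
  acceptsOutput(output: O): boolean;
}

type BoundedResult<O> =
  | { ok: true; agent: string; output: O }
  | { ok: false; agent: string; reason: string };

// Runs an agent only inside its contract; violations are returned, not thrown.
function runBounded<I, O>(agent: AgentContract<I, O>, input: unknown): BoundedResult<O> {
  if (!agent.acceptsInput(input)) {
    return { ok: false, agent: agent.name, reason: "input violates contract" };
  }
  const output = agent.run(input);
  if (!agent.acceptsOutput(output)) {
    return { ok: false, agent: agent.name, reason: "output violates contract" };
  }
  return { ok: true, agent: agent.name, output };
}

// Example worker (a keyword stub, no model involved): only two labels are permitted.
const labelAgent: AgentContract<{ text: string }, string> = {
  name: "label-agent",
  acceptsInput(input: unknown): input is { text: string } {
    return (
      typeof input === "object" &&
      input !== null &&
      typeof (input as { text?: unknown }).text === "string"
    );
  },
  run: (input) => (/refund/i.test(input.text) ? "billing" : "general"),
  acceptsOutput: (output) => output === "billing" || output === "general",
};

const accepted = runBounded(labelAgent, { text: "refund please" });
const rejected = runBounded(labelAgent, 42);
```

Here accepted carries the label, while rejected carries a reason instead of a guess. Returning violations as data rather than throwing keeps the decision about what happens next with the coordination layer.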

Multi-agent by design.

This isn’t theoretical. The system demonstrates a real multi-agent pattern where agents check each other before anything ships.

Input
↓
Classifier Agent
↓
Validator Agent
↓
Orchestrator
↓
Human Approval (if needed)
↓
Outcome

One agent proposes. One agent critiques. The system decides what happens next.
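The propose, critique, decide flow can be sketched in a few lines. This is an illustration under assumptions, not the repo's implementation: the function names, the 0.7 confidence threshold, and the keyword-based classifier stub are all made up for the example.

```typescript
// All names and thresholds below are illustrative assumptions.
type Proposal = { label: string; confidence: number; reasoning: string };
type Critique = { approved: boolean; issues: string[] };
type PipelineOutcome =
  | { status: "accepted"; label: string }
  | { status: "needs_human"; proposal: Proposal; issues: string[] }
  | { status: "rejected"; issues: string[] };

const ALLOWED_LABELS = ["billing", "general"];

// Classifier agent: proposes a label. Stubbed with a keyword rule, no model SDK.
function classify(text: string): Proposal {
  const isBilling = /invoice|refund/i.test(text);
  return {
    label: isBilling ? "billing" : "general",
    confidence: isBilling ? 0.9 : 0.55,
    reasoning: isBilling ? "matched a billing keyword" : "no keyword matched",
  };
}

// Validator agent: critiques the proposal; it never produces a label of its own.
function validate(proposal: Proposal): Critique {
  const issues: string[] = [];
  if (ALLOWED_LABELS.indexOf(proposal.label) === -1) issues.push("unknown label");
  if (proposal.confidence < 0.7) issues.push("confidence below 0.7");
  return { approved: issues.length === 0, issues };
}

// Orchestrator: the only place that decides what happens next.
function orchestrate(text: string): PipelineOutcome {
  const proposal = classify(text);
  const critique = validate(proposal);
  if (critique.approved) return { status: "accepted", label: proposal.label };
  if (critique.issues.indexOf("unknown label") !== -1) {
    // A contract violation is rejected outright rather than escalated.
    return { status: "rejected", issues: critique.issues };
  }
  // Anything the validator doubts, but that is still in bounds, goes to a human.
  return { status: "needs_human", proposal, issues: critique.issues };
}

const clear = orchestrate("Please refund my invoice");
const unclear = orchestrate("Hello there");
```

In this sketch clear is accepted automatically, and unclear is routed to human approval because the validator flags low confidence. Neither agent makes the final call; the orchestrator does.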

This is where most “agent systems” fail. They generate. They don’t govern.

Design as a contract.

Most systems treat design as decoration. This system treats it as infrastructure. DESIGN.md defines colors, typography, layout, component behavior, and AI interface patterns. Not as guidelines, but as the source of truth AI must follow.

AI interaction is explicit:

Recommendations are labeled.
Confidence is visible.
Reasoning is exposed.
Human approval is explicit.
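Those four rules can be encoded in a data shape so that the interface cannot skip any of them. The sketch below is an assumption about how that might look; AiRecommendation, approve, and renderRecommendation are hypothetical names, not patterns quoted from DESIGN.md.

```typescript
// Illustrative shape: each field maps to one of the four rules above.
type ApprovalState = "pending" | "approved" | "rejected";

interface AiRecommendation {
  source: "ai";            // recommendations are labeled
  summary: string;
  confidence: number;      // confidence is visible (0 to 1)
  reasoning: string[];     // reasoning is exposed
  approval: ApprovalState; // human approval is explicit
  approvedBy?: string;
}

// Approval is a separate, named human action and never a default.
function approve(rec: AiRecommendation, reviewer: string): AiRecommendation {
  if (reviewer.trim() === "") {
    throw new Error("approval requires a named reviewer");
  }
  return { ...rec, approval: "approved", approvedBy: reviewer };
}

// Renders every field; no part of the recommendation is hidden from the reader.
function renderRecommendation(rec: AiRecommendation): string {
  const pct = Math.round(rec.confidence * 100);
  const status =
    rec.approval === "approved" ? `approved by ${rec.approvedBy}` : rec.approval;
  return [
    `[AI recommendation] ${rec.summary}`,
    `Confidence: ${pct}%`,
    `Reasoning: ${rec.reasoning.join("; ")}`,
    `Human approval: ${status}`,
  ].join("\n");
}

const draftRec: AiRecommendation = {
  source: "ai",
  summary: "Route ticket to billing",
  confidence: 0.82,
  reasoning: ["mentions a refund", "customer has an open invoice"],
  approval: "pending",
};

const renderedDraft = renderRecommendation(draftRec);
const renderedApproved = renderRecommendation(approve(draftRec, "reviewer-1"));
```

Because approval starts as "pending" and only changes through approve, an unreviewed recommendation can never render as approved.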

Validation is built in.

Every system claim is testable. AI behavior isn’t assumed. It’s measured.

npm run validate
→ typechecking
→ agent evaluation
→ pipeline evaluation
→ dependency audit
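As a rough idea of what an agent evaluation step might involve, the sketch below runs a stub agent against fixed cases and gates on a pass rate. It is not the script behind npm run validate; evaluateAgent, meetsThreshold, the cases, and the thresholds are assumptions for illustration.

```typescript
// Illustrative evaluation loop: fixed cases in, a measurable report out.
type EvalCase = { input: string; expected: string };
type EvalReport = {
  total: number;
  passed: number;
  passRate: number;
  failures: EvalCase[];
};

function evaluateAgent(
  agent: (input: string) => string,
  cases: EvalCase[],
): EvalReport {
  const failures = cases.filter((c) => agent(c.input) !== c.expected);
  const passed = cases.length - failures.length;
  const passRate = cases.length === 0 ? 0 : passed / cases.length;
  return { total: cases.length, passed, passRate, failures };
}

// The gate: an empty suite never passes, so "no tests" cannot look like success.
function meetsThreshold(report: EvalReport, minPassRate: number): boolean {
  return report.total > 0 && report.passRate >= minPassRate;
}

// Stub agent under test (keyword rule standing in for a real classifier).
const stubClassifier = (input: string): string =>
  /invoice|refund/i.test(input) ? "billing" : "general";

const evalCases: EvalCase[] = [
  { input: "Refund my order", expected: "billing" },
  { input: "Where is my invoice?", expected: "billing" },
  { input: "What are your hours?", expected: "general" },
  // The stub misses this one: no keyword, but it is a billing question.
  { input: "I was charged twice", expected: "billing" },
];

const evalReport = evaluateAgent(stubClassifier, evalCases);
const shipsAtNinety = meetsThreshold(evalReport, 0.9);
```

The stub passes three of four cases, so a 0.9 gate blocks it and the failing case is listed in evalReport.failures. The behavior is measured rather than assumed.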

Security by default.

The system intentionally avoids premature dependencies. No model SDK by default. No vendor lock-in. No hidden risk surface. Model providers are added only when justified.
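One way to keep providers optional is to code against a small interface and ship only a deterministic stub. The sketch below assumes hypothetical names (ModelProvider, ProviderRegistry) and a simplified synchronous complete call; the repo may handle this differently.

```typescript
// Illustrative boundary: the system depends on this interface, not on a vendor SDK.
interface ModelProvider {
  readonly id: string;
  // Kept synchronous for brevity; a real provider call would likely be async.
  complete(prompt: string): string;
}

// Default provider: deterministic, offline, no dependencies.
const stubProvider: ModelProvider = {
  id: "stub",
  complete: (prompt) => `stub:${prompt.length}`,
};

class ProviderRegistry {
  private providers: { [id: string]: ModelProvider } = { stub: stubProvider };
  private justifications: { [id: string]: string } = {};

  // Adding a real provider is a deliberate act that must be justified in writing.
  register(provider: ModelProvider, justification: string): void {
    if (justification.trim().length < 10) {
      throw new Error(`provider "${provider.id}" needs a written justification`);
    }
    this.providers[provider.id] = provider;
    this.justifications[provider.id] = justification.trim();
  }

  // Unknown providers fail loudly instead of silently falling back.
  get(id: string): ModelProvider {
    const provider = this.providers[id];
    if (!provider) {
      throw new Error(`provider "${id}" is not registered`);
    }
    return provider;
  }
}

const registry = new ProviderRegistry();
const stubReply = registry.get("stub").complete("hello");
```

Swapping in a real model later means writing one adapter that satisfies ModelProvider and registering it with a reason, while the rest of the system stays unchanged.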

Claude as a system participant.

With CLAUDE.md, AI is no longer just a tool. It becomes a team member with constraints — reading system context, following contracts, executing within defined rules, and using repeatable commands.

/aisys-review
/aisys-validate
/aisys-new-agent

AI doesn’t improvise. It operates inside the system.

Evolution.

This system didn’t start complete. It was refined through constraint and correction. Each step reduced ambiguity. Each step increased system integrity.

v0.6

Removed AI SDK

Control over convenience.

v0.8

Introduced DESIGN.md

Design becomes enforceable.

v0.9

Proved multi-agent pattern

Not theoretical: a working classifier-validator-orchestrator loop.

v0.9.1

Fixed runtime assumptions

Node types bug. Verification caught it before production.

v0.10

Claude-native workflows

AI becomes a collaborator, not a wrapper.

Real-world application.

This isn’t a sandbox exercise. The pattern maps directly to anywhere decisions matter:

AI-assisted plan review (PlanFlow)
regulated workflows
enterprise automation systems
multi-agent decision pipelines

Outcome.

A system that:

is runnable on day one
enforces structure before output
scales without collapsing
keeps humans in control

Download the starter repo on GitHub.

Clone it, run npm run validate, and you’ll have a working multi-agent system on day one.

View on GitHub

Big Freight Life

Designing the System Behind Your AI

Few are designing systems for AI.
