The most common failure mode I've seen in agentic development adoption: someone builds an impressive demo with an AI agent, tries to scale it to a team, and discovers that "impressive demo" and "reliable team infrastructure" are very different engineering problems.

The demo works because the conditions are controlled. The infrastructure has to work when conditions are not controlled — when the request is ambiguous, the external API is flaky, the junior developer triggers a governance review by accident, and the on-call engineer is trying to understand what the agent did at midnight.

Layered architecture is the answer to that. Build the foundation right first. Each layer adds capability without requiring the layers below it to change. Skip a layer if you don't need it yet; come back to it when you do.

The seven layers

Layer 1: Foundation

AWS setup, GitHub organisation configuration, secrets management. This isn't exciting, but it's where things go wrong. Secrets in environment variables instead of SSM. IAM policies that are too broad. GitHub webhooks without signature verification. Layer 1 is about closing these gaps before anything else runs on the platform.
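Webhook signature verification in particular is only a few lines to get right. A minimal Python sketch using GitHub's documented X-Hub-Signature-256 scheme (the function name is mine, not part of any library):

```python
import hashlib
import hmac

def verify_github_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Verify a GitHub webhook payload against its X-Hub-Signature-256 header."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking information via timing differences.
    return hmac.compare_digest(expected, signature_header)
```

Reject the request outright when this returns false; a webhook endpoint that processes unverified payloads is an open door.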

Layer 2: Governance personas

Deploy the governance personas: Sentinel (security), Auditor (cost), Architect (design), Tester (quality). Each persona has a defined focus, a list of review criteria, and an output schema. They're not chatbots — they have specific jobs and produce structured outputs.
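To make "structured outputs" concrete, here's a minimal Python sketch of what a persona's output schema might look like. The field names and severity levels are illustrative assumptions, not the actual schema:

```python
from dataclasses import dataclass, field

# Hypothetical schema: names and severity levels are illustrative.
@dataclass
class GovernanceFinding:
    severity: str          # "info" | "warning" | "blocking"
    summary: str
    recommendation: str

@dataclass
class PersonaReview:
    persona: str           # e.g. "sentinel"
    verdict: str           # "approve" | "request_changes"
    findings: list[GovernanceFinding] = field(default_factory=list)

    def is_blocking(self) -> bool:
        # Any single blocking finding makes the whole review blocking.
        return any(f.severity == "blocking" for f in self.findings)
```

The point of the schema isn't the specific fields — it's that downstream layers (CI gates, dashboards, audit trails) can consume persona output programmatically instead of parsing prose.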

The key decision at Layer 2: make personas reachable via @mention in both Slack and GitHub issues/PRs. This means the team doesn't need to change how they work to get governance input — they just @mention the persona in the conversation they're already having.

# In a GitHub issue or PR comment:
@sentinel review this change for security implications
@auditor what's the cost impact of adding a NAT gateway here?
@architect is this design consistent with our event-driven pattern?

Layer 3: Execution engine

Parallel task execution with governance at every stage. Tasks are submitted to a queue, workers pick them up, governance reviews happen at defined checkpoints, and results are stored with full audit trails. This is the layer that moves work from "persona gave us advice" to "persona executed a task and produced an artefact."
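A stripped-down sketch of that flow in Python. The checkpoint and audit-trail shapes here are illustrative, not the production design:

```python
from dataclasses import dataclass, field
import queue

@dataclass
class Task:
    task_id: str
    payload: dict
    audit_trail: list[str] = field(default_factory=list)

def run_worker(tasks: "queue.Queue[Task]", review, execute) -> list[Task]:
    """Drain the queue, applying a governance review checkpoint before execution."""
    done = []
    while not tasks.empty():
        task = tasks.get()
        task.audit_trail.append("picked_up")
        verdict = review(task)                       # governance checkpoint
        task.audit_trail.append(f"review:{verdict}")
        if verdict == "approve":
            execute(task)
            task.audit_trail.append("executed")
        done.append(task)                            # result kept with its audit trail
    return done
```

The essential property is that the audit trail is written by the engine itself, not by the task — every task that passes through has a record of what was reviewed and what ran, whether or not it succeeded.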

The "governance dial" at this layer is one of the most useful design choices: you can run the execution engine in a fully supervised mode (every action requires human approval), a semi-supervised mode (humans approve significant actions, routine tasks run unattended), or an autonomous mode (human review happens post-execution, not pre-execution). Different teams need different settings; the dial makes this explicit rather than encoding it in policy documents nobody reads.
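The dial itself can be a few lines of code. A sketch, with the mode names assumed rather than taken from any real configuration:

```python
from enum import Enum

class GovernanceMode(Enum):
    SUPERVISED = "supervised"        # every action requires pre-approval
    SEMI_SUPERVISED = "semi"         # only significant actions need pre-approval
    AUTONOMOUS = "autonomous"        # review happens post-execution

def needs_pre_approval(mode: GovernanceMode, significant: bool) -> bool:
    """Decide whether an action must wait for a human before executing."""
    if mode is GovernanceMode.SUPERVISED:
        return True
    if mode is GovernanceMode.SEMI_SUPERVISED:
        return significant
    return False  # autonomous: execute first, review after
```

Encoding the dial as an explicit value means the current setting is inspectable and auditable — you can log which mode every action ran under, which a policy document can't give you.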

Layer 4: GitHub integration

Governance gates in CI. Every PR can trigger a governance review — security, cost, architecture — as a GitHub Actions check. The check passes or fails based on what the governance personas find. This is where "governance as code" becomes "governance as CI": it runs automatically, it blocks merges when it needs to, and it produces a structured report that appears in the PR.
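The gate's decision step can be as simple as "exit nonzero when a blocking finding exists" — GitHub Actions treats a nonzero exit code as a failed check. A hypothetical Python sketch of that step, with the finding format assumed:

```python
import json

def governance_gate(findings: list[dict]) -> int:
    """Return the exit code for a CI governance check: 0 passes, 1 blocks the merge."""
    blocking = [f for f in findings if f.get("severity") == "blocking"]
    # A structured one-line report; CI surfaces this in the PR check output.
    print(json.dumps({"blocking": len(blocking), "total": len(findings)}))
    return 1 if blocking else 0
```

The workflow step would call this with the personas' findings and pass its return value to the process exit code; everything else (blocking the merge, showing the report) is standard CI behaviour.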

Layer 5: Knowledge and context

Persistent memory for personas. A Sentinel persona that remembers the last ten security reviews it did is more useful than one that starts fresh every time. This layer adds a knowledge store — architectural decisions, past findings, open issues, team conventions — that personas can query to produce contextually aware output rather than generic advice.
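Even a naive keyword-scored lookup illustrates the shape of the query a persona makes against that store. A sketch — the store format is assumed, and a real implementation would use embeddings or full-text search rather than substring matching:

```python
def query_knowledge(store: list[dict], topic: str, limit: int = 3) -> list[dict]:
    """Return the most relevant past entries for a topic (naive keyword scoring)."""
    scored = [
        (sum(word in entry["text"].lower() for word in topic.lower().split()), entry)
        for entry in store
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop entries that matched nothing; cap at the requested limit.
    return [entry for score, entry in scored[:limit] if score > 0]
```

Whatever the retrieval mechanism, the contract is the same: the persona asks "what do we already know about X?" before producing a review, and the answer shapes the output.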

Layer 6: Observability

You can't run a self-healing system without knowing what's healthy. This layer adds structured logging, metrics, and dashboards — not just for the application, but for the agentic layer itself. How many tasks completed successfully? How many governance gates were triggered? How long did the last execution cycle take? If the self-healing layer later needs to detect and respond to anomalies, it needs these signals to be reliable.
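A minimal sketch of agentic-layer metrics emitted as structured log lines. The metric names are assumptions for illustration:

```python
import json
import time
from collections import Counter

class AgentMetrics:
    """Minimal structured-metrics sketch for the agentic layer itself."""

    def __init__(self):
        self.counters = Counter()

    def record_task(self, succeeded: bool) -> None:
        self.counters["tasks_total"] += 1
        self.counters["tasks_succeeded" if succeeded else "tasks_failed"] += 1

    def emit(self) -> str:
        # One JSON line per snapshot, consumable by a log-based metrics pipeline.
        return json.dumps({"ts": time.time(), **self.counters})
```

Structured JSON lines are deliberately boring: CloudWatch, Loki, or any log pipeline can turn them into the dashboards and alarms the self-healing layer will later depend on.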

Layer 7: Self-healing

The layer that monitors the system and fixes what it finds. When the observability layer surfaces an anomaly — a failing health check, a Lambda that's timing out, a DynamoDB table approaching capacity — the self-healing layer classifies the issue, selects a remediation strategy, executes it (with appropriate governance approval based on severity), and records what it did.

Self-healing doesn't mean the system never needs humans. It means routine, well-understood failures are resolved without waking someone up. Novel failures — ones the system hasn't seen before — still escalate to humans, with full context about what was tried and what the system observed.
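Putting the classify/remediate/escalate flow into code makes the escalation boundary explicit. A sketch with a hypothetical playbook — anomaly types and strategy names are invented for illustration:

```python
# Hypothetical playbook mapping well-understood anomaly types to remediations.
PLAYBOOK = {
    "lambda_timeout": "increase_timeout",
    "table_capacity": "raise_provisioned_capacity",
}

def handle_anomaly(anomaly_type: str, severity: str, approve) -> dict:
    """Remediate known anomalies; escalate novel ones (or denied ones) to a human."""
    strategy = PLAYBOOK.get(anomaly_type)
    if strategy is None:
        # Novel failure: escalate with context rather than improvise a fix.
        return {"action": "escalate", "reason": "unknown anomaly", "type": anomaly_type}
    if severity == "high" and not approve(strategy):
        # Severity-based governance: high-severity fixes need explicit approval.
        return {"action": "escalate", "reason": "approval denied", "type": anomaly_type}
    return {"action": "remediate", "strategy": strategy, "type": anomaly_type}
```

The escalation paths matter as much as the happy path: everything the system can't confidently fix comes back to a human as a structured record, not a silent failure.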

The @mention interface

I want to say more about the @mention pattern because it's been the most unexpectedly high-value design decision in the whole stack.

The friction of switching context — leaving the code review to open a different tool, fill in a form, wait for output, then come back — is exactly the friction that makes governance feel like bureaucracy. When @sentinel is just a comment in the PR you're already reading, governance becomes a natural part of the workflow rather than an interruption to it.

The same pattern works in Slack: you're discussing an architecture decision in a channel, you @architect the question you're debating, and the response appears in the thread. No context switch. No special interface. The governance system meets people where they are.

Layers aren't sequential requirements

The stack is designed so each layer is useful independently. You can deploy Layer 2 (governance personas via @mention) without building Layers 3–7. A team that just wants AI-assisted code review can stop there. A team that wants autonomous task execution can add Layer 3. A team doing incident response automation might jump to Layer 7 before finishing Layer 5.

The important constraint is that each layer depends on the ones below it being stable. Don't build self-healing on top of flaky observability. Don't add an execution engine before governance is working reliably. The discipline is in resisting the temptation to skip ahead to the impressive bits before the foundations are solid.

A note on pace: Each layer takes real engineering time to do properly. Rushing through Layer 2 to get to Layer 7 produces a system that looks impressive in a demo and fails in production. I've seen this. The layers are not a checklist; they're a migration path. Take the time each one deserves.


Related tools and articles

→ OMNI: building a self-aware capability mesh
→ Brood: worker queues for governance agents
→ Governance as code

ticketyboo brings governed AI development to your pull request workflow. 5 governance runs free, one-time welcome grant. No card required.
