designtheagent.com · a planning canvas & coach

Don't prompt.
Design.

Most agentic AI pilots are lost before a line of code is written, in the assumptions nobody put on paper.

Agentic AI design is org design. A one-page canvas for the nine decisions you make before you build, and a coach that reviews it before you commit.

95% of enterprise generative-AI initiatives showed no measurable impact on profit and loss (MIT Project NANDA, 2025).
~5% of companies are capturing value from AI at scale (BCG, 2025).
40%+ of agentic-AI projects are forecast to be cancelled by the end of 2027 (Gartner, 2025).

The diagnosis

The model keeps improving.
The failure rate does not.

The cause is not in the technology. An un-designed agent is not un-designed; it is designed by accident, in code, by whoever shipped first. The briefing nobody wrote. The boundaries nobody drew. The oversight nobody owns. A person you hire restrains themselves, remembers, and can be held to account; an agent brings none of it. Every design decision you skip to close that gap is a debt, and model capability is the interest rate.

A brilliant new hire
The worker
An LLM is capable on day one: trained, articulate, and not yet trustworthy. It optimises for plausibility, not truth. None of the context, rules, or tools it needs arrives in the box.
40+ errors
What un-design looks like
In April 2026, Sullivan & Cromwell, the Wall Street firm that also advises OpenAI on safe AI, filed an emergency motion in federal bankruptcy court with fabricated citations and misquotes. The model was state of the art; the system around it was not. Bloomberg ↗
9 decisions
The fix
Everything you'd give any capable hire: a briefing pack, a job description, the rules, the tools, a team. All designed on paper, before the worker shows up.
Layer 01 · provides cognition

Worker · the LLM

Capable, articulate, probabilistic. It generates the next plausible word, and it is brilliant the way a new hire is brilliant on day one.

Layer 02 · provides control

Harness · the operating system around it

The part you control. Five things, each already familiar from managing people:

Briefing pack · Professional code · Job description · Memory · Discretion

Layer 03 · provides reach

Tools · the systems it can act on

A web search, a database query, an email sent. The worker issues a command, the harness catches it, the system executes. An LLM in a loop, with tools.
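
The loop described above can be sketched in a few lines. This is a minimal illustration rather than a real framework: `fake_worker`, `TOOLS`, `ALLOWED` and `run_agent` are hypothetical names, the worker is a hard-coded stub standing in for an LLM call, and the allowlist is just one assumed form the harness's control could take.

```python
from typing import Callable

# Hypothetical tools: the systems the agent can act on.
TOOLS: dict[str, Callable[[str], str]] = {
    "web_search": lambda q: f"results for {q!r}",
    "send_email": lambda body: f"sent: {body}",
}

# Harness rule (an assumption for this sketch): only allowlisted tools run.
ALLOWED = {"web_search"}


def fake_worker(history: list[str]) -> dict:
    """Stand-in for an LLM: asks for a search, then an email, then stops."""
    if len(history) == 0:
        return {"tool": "web_search", "arg": "agent design canvas"}
    if len(history) == 1:
        return {"tool": "send_email", "arg": "summary"}
    return {"tool": None, "arg": "done"}


def run_agent(worker: Callable[[list[str]], dict], max_steps: int = 5) -> list[str]:
    """Worker issues a command, the harness catches it, the system executes."""
    history: list[str] = []
    for _ in range(max_steps):
        command = worker(history)
        if command["tool"] is None:  # worker signals it is finished
            break
        if command["tool"] not in ALLOWED:  # harness catches the command
            history.append(f"blocked: {command['tool']}")
            continue
        history.append(TOOLS[command["tool"]](command["arg"]))  # system executes
    return history


log = run_agent(fake_worker)
```

Here the search runs and the email is blocked, because the harness, not the worker, decides what reaches a real system.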

The Agent Operating Model

Every agent has an anatomy.

Three layers sit inside every agentic system, and each supplies one term of the same equation. The worker brings cognition. The harness imposes control. The tools grant reach. Most failures trace to the same blind spot: a worker chosen with care, wrapped in a harness and a governance regime that were never designed at all.

power = cognition × control × reach

It is a product, not a sum. A brilliant worker with wide reach and no control isn't powerful; it is a liability. Add a calendar and the worker becomes a scheduler. Give it the authority to move money and it becomes a fiduciary actor. The more reach you grant, the more control the harness has to hold.
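
To see why the product matters, score each layer from 0 to 1. The scale and the scores below are assumptions made for illustration; the page gives the equation, not a scoring method.

```python
def power(cognition: float, control: float, reach: float) -> float:
    """power = cognition x control x reach, each scored 0..1 (assumed scale)."""
    return cognition * control * reach


def naive_sum(cognition: float, control: float, reach: float) -> float:
    """The additive reading the text rejects, shown only for contrast."""
    return cognition + control + reach


# A brilliant worker with wide reach and no control.
uncontrolled = power(cognition=0.9, control=0.0, reach=0.9)
misleading = naive_sum(cognition=0.9, control=0.0, reach=0.9)

# The same worker and reach inside a designed harness.
designed = power(cognition=0.9, control=0.8, reach=0.9)
```

Under the sum, the uncontrolled agent still scores well. Under the product, zero control takes the whole result to zero, however capable the worker.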

The canvas

Nine decisions, before the build.

One page, three colour groups, nine cells: the Agent Operating Model turned into choices you make on paper, before anyone writes a line of code. This is the canvas itself. Start with Purpose. Always.

Agentic AI Design Canvas

North Star

What business outcome must this agent ultimately achieve?

01

Target Workflow & Success Metrics

Which workflow will it improve, and which metrics will show it worked?

02

Users & Stakeholders

Who uses it directly, and who is affected by what it does?

03

AI Role & Autonomy

What role should it play, and where should its autonomy stop? (assist, advise, draft, decide, execute)

04

Performance Needs

What accuracy, explainability, speed, and cost standards must it meet?

05

Context & Knowledge

What information is available to support its work, and which sources should it trust (e.g. policies, cases, records, reports, communications)?

06

Tools & Action Channels

Which tools or systems can it read from, write to, or act through (e.g. email, calendar, file storage, databases, enterprise systems, APIs)?

07

Rules & Boundaries

What must it always do, and what must it never do, even when asked?

08

Memory & Learning

What should it remember, what must it forget, and how will it improve over time?

09

Oversight & Accountability

Who oversees it, when must it hand off to a person, and who owns the outcome?

Purpose · why & for whom
Capability · how it works
Governance · Control & Trust
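
For teams that want the canvas in a machine-readable form, the nine cells fit in a small mapping. This is a sketch under stated assumptions: `CELLS`, `empty_cells`, the sample answers and the JSON layout are all hypothetical, and nothing here describes the coach's actual export format.

```python
import json

# The nine cell titles, numbered as on the canvas.
CELLS = {
    "01": "Target Workflow & Success Metrics",
    "02": "Users & Stakeholders",
    "03": "AI Role & Autonomy",
    "04": "Performance Needs",
    "05": "Context & Knowledge",
    "06": "Tools & Action Channels",
    "07": "Rules & Boundaries",
    "08": "Memory & Learning",
    "09": "Oversight & Accountability",
}


def empty_cells(answers: dict[str, str]) -> list[str]:
    """Return the titles of cells with no answer or only whitespace."""
    return [title for num, title in CELLS.items() if not answers.get(num, "").strip()]


# Invented partial canvas: cells 07 to 09 left blank.
answers = {
    "01": "Invoice matching; hours per batch",
    "02": "Finance clerks; suppliers",
    "03": "draft",
    "04": "Every match traceable to a source record",
    "05": "Purchase orders, supplier contracts",
    "06": "Ledger (read only)",
}

gaps = empty_cells(answers)
# Assumed JSON layout, for illustration only.
snapshot = json.dumps({"north_star": "", "cells": answers}, indent=2)
```

The list of gaps is the useful output: the cells still owed a decision before the build starts.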

Want to run it in the room? Download the A1 canvas →

A worked example: the AI in Kirin's boardroom

Kirin Holdings gave its executive committee a thirteenth participant: CoreMate, an AI adviser that stress-tests proposals before the room decides. See the whole agent on one canvas, reconstructed from public sources and run through the coach.


Two canvases, two outcomes

The same nine questions decided both.

An un-designed agent is designed by accident. A designed one is designed on purpose. The answers ran in opposite directions.

Designed by accident

The compass set to the wrong north.

  • Cell 01: Measured cost saved and agents replaced, not the quality of the outcome.
  • Cell 03: Placed at execute: handling two-thirds of all interactions, end to end.
  • Cells 07–09: Blank. No never-do list, no way to detect a distressed customer, no owner.

A little over a year later, the company narrowed what the agent was allowed to own and hired people back for the conversations that needed them.

Bloomberg · May 2025 ↗
Designed on purpose

Every cell filled.

  • Cell 01: Named workflows: contract review, due diligence, legal research.
  • Cell 03: Autonomy at draft: the agent drafts, a qualified lawyer signs.
  • Cells 07–09: Explicit: no training on client data, expert review of every output, an audit trail.

A&O Shearman cut contract-review time sharply, then co-developed with Harvey and redesigned its business model around the agent.

A&O Shearman · April 2025 ↗

The canvas coach

A coach, not a grader.

Fill the nine cells, then let a Claude-backed coach press on the gaps a good workshop facilitator would: cell by cell, then across cells for contradictions, ending with an overall readiness read.

  • Socratic, per-cell feedback. It surfaces gaps, vague answers and danger signals, and never invents facts about your company.
  • Cross-cell contradiction flags. A "decide"-level autonomy sitting above an empty Rules cell is a post-mortem waiting to happen, caught now.
  • Autonomy-aware. It judges every cell in proportion to the rung you choose in Cell 03.
  • Built for the room. Fill from a photo of a hand-drawn sheet, iterate in rounds, export JSON, print a clean PDF handout.
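
The cross-cell idea can be illustrated with a plain rule check. This is not how the coach works internally, which the page does not describe. It is a minimal sketch that assumes cells 07 to 09 form the Control & Trust group, and that "decide" and above is the rung at which empty governance cells become a contradiction. `RUNGS` and `contradiction_flags` are hypothetical names.

```python
RUNGS = ["assist", "advise", "draft", "decide", "execute"]  # order from Cell 03

GOVERNANCE = (
    ("07", "Rules & Boundaries"),
    ("08", "Memory & Learning"),
    ("09", "Oversight & Accountability"),
)


def contradiction_flags(cells: dict[str, str]) -> list[str]:
    """Flag high-autonomy canvases whose governance cells are empty."""
    rung = cells.get("03", "").strip().lower()
    if rung not in RUNGS:
        return ["Cell 03: autonomy rung not set"]
    flags = []
    # Assumed threshold: 'decide' and above need all three governance cells filled.
    if RUNGS.index(rung) >= RUNGS.index("decide"):
        for num, name in GOVERNANCE:
            if not cells.get(num, "").strip():
                flags.append(f"Cell 03 is '{rung}' but Cell {num} ({name}) is empty")
    return flags


risky = contradiction_flags({"03": "execute", "07": "", "09": "Head of Support"})
safe = contradiction_flags({"03": "draft"})
```

The same empty cells pass at "draft" and fail at "execute": the check scales with the rung, which is the sense of autonomy-aware used above.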

How to run it

Five moves. Half a day.

The canvas is built to be worked, not read. Answer its nine questions in half a day now, or discover them the hard way after you ship. In real cases, that has meant months of rework, a public reversal, or an apology to a federal judge.

01

Print it large, fill the room

Put it on an A1 sheet. Gather product, engineering, design, legal or risk, and someone who does the work the agent will touch.

02

Start with Purpose. Always.

Anchor on the workflow and the people it serves before anyone discusses what the AI does. Teams that start in Capability design a clever agent in search of a job.

03

Design the role, then its boundaries

Place the agent on the autonomy spectrum, then decide which tools, knowledge and rules that rung requires. Autonomy is the pivot.

04

End with Control & Trust, out loud

What must never happen? What persists, what is forgotten? Who owns the outcome? The cells teams skip are where the failures cluster.

05

Walk the canvas, hunt the gaps

The empty cells and the contradictions are the deliverable. Surface them now, in a room, for the price of an afternoon.

The LLM provides cognition.
You provide control.

For the first time, you have to write it all down before the worker shows up. And the better the model becomes, the more that control is worth. Start on the canvas.