Briefing · Agentic AI Systems · Four levels of autonomy

Four levels of agentic AI.

From a chatbot in a browser tab to a coordinated system running an entire operation — and how each step moves more autonomy to the model while the structure around it grows up.

The model is replaceable. The harness is your engineering asset.

Level 01 · Chatbots | Level 02 · Workflows | Level 03 · Agentic | Level 04 · Systems
01 / 09 · Title
02 / 09
01 · The ladder · four steps, one shape

The leap is not more code — it's more structure.

Each step moves more decision-making to the model and gives it more of your business in context. Underneath, "harness engineering" is just files in folders feeding the right information at the right moment.

Level 01 · Advice
01
Chatbots

A passive assistant in a browser tab. Knows nothing about your business until you paste it in.

Autonomy ~18%
Decides: You
Tools: ChatGPT · Claude · Gemini
Limit: Static context, no execution
Level 02 · Automation
02
AI Workflows

A pipeline that fires on a trigger. AI fills gaps inside steps you defined — same order, every time.

Autonomy ~42%
Decides: You define every step
Tools: n8n · Zapier · Make
Limit: Cannot re-route or judge
Level 03 · Reasoning
03
Agentic Workflows

You give a goal — the model picks the path. Reason → Act → Observe → Iterate inside a harness.

Autonomy ~72%
Decides: The model
Tools: Claude Code · Codex · Cursor
Limit: One agent, one goal, no memory
Level 04 · Production
04
Agentic AI Systems

A coordinated team of skills, MCPs and shared memory running an entire operation — humans in the loop where it counts.

Autonomy ~96%
Decides: The system, you review
Tools: Skills · MCPs · Memory · Harness
Limit: You sit at the gates

The core idea: the leap from a smart chatbot to a production-grade operation is structure, not code — skills, memory, and MCPs are just markdown files and folders that feed the right context to the right agent at the right time.

03 / 09
02 · The spectrum · what changes between levels

Autonomy grows. The structure around it grows up too.

18%
Level 01 · Chatbot
Advice, no action
42%
Level 02 · Workflow
Automation, not judgment
72%
Level 03 · Agentic
Goals, not steps
96%
Level 04 · System
Operations, not tasks
Autonomy: 0% → 100% · Advice → Operational
Level 01 · Chatbot | Level 02 · Workflow | Level 03 · Agentic | Level 04 · System

Who decides the path
You | You define every step | The model | The system, you review

Business context
None, paste each time | Static prompt template | Files the agent can read | Skills + memory + tools

Can it act?
No, chat only | Inside pre-defined steps | Yes, via tools | Yes, across systems

When it tops out
No execution, no context | Can't think or re-route | One goal, no memory | (the new baseline)
01
04 / 09
Level 01 of 04 · Chatbots

Chatbots — advice, no action.

A passive assistant that lives in a browser tab. Smart on the surface, powerless beneath it: no business context, no execution, no awareness of last week's work.

Example · Content repurposing

"Write me a LinkedIn post."

Here's the transcript of this week's video. Write me a LinkedIn post about it.
Generic draft
"Excited to share my latest insights! AI is transforming how we work, and here are 5 takeaways that will change the game…"
— emoji-heavy, doesn't sound like you, no idea your carousels outperformed text last month.

Projects, Gems, and custom GPTs help — but it's still static context you paste in manually.

Why it tops out

Two structural limits

  • ×
    Doesn't know your business.

    Brand voice, audience, post history, what worked last week — none of it is in the model unless you paste it in every time.

  • ×
    Passive by design.

    Waits for you to prompt. Won't pull anything, run anything, or check anything on its own.

  • Where it shines.

    First drafts, brainstorms, explanations — anywhere a thinking partner is enough and execution isn't required.

Autonomy
~18%
Who decides the path: 100% you
Typical tools
ChatGPT · Claude · Gemini
02
05 / 09
Level 02 of 04 · AI Workflows

Workflows — automation, not judgment.

A pipeline that fires on a trigger. The AI fills gaps inside steps you defined — same order, every time. Magical at first, brittle at scale.

Example · The "auto-LinkedIn" pipeline

Trigger → Transcript → Draft → Schedule

1. YouTube publishes a new video · trigger fires the workflow automatically (trigger)
2. Pull the transcript · via an API node into the workflow runtime (fetch)
3. Send it to the AI with a hardcoded voice prompt · voice guidelines pasted into a template months ago (ai · llm)
4. Drop the draft into the scheduler · you review and click publish (publish)
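The four steps above can be sketched as a fixed pipeline. This is a toy sketch, not a real n8n/Zapier export: every function name and string is an illustrative stand-in for the corresponding node.

```python
# A minimal sketch of the Level 02 pipeline: fixed steps, fixed order, no judgment.
# All names and strings are illustrative stand-ins for the real nodes.

VOICE_PROMPT = "Write in a direct, first-person voice. No emoji."  # hardcoded months ago

def fetch_transcript(video_id: str) -> str:
    # Stand-in for the API node that pulls the transcript.
    return f"Transcript of {video_id}: three takeaways about harness engineering."

def draft_post(transcript: str) -> str:
    # Stand-in for the LLM node: template prompt + transcript, one shot, no iteration.
    prompt = f"{VOICE_PROMPT}\n\nSource:\n{transcript}\n\nWrite a LinkedIn post."
    return f"[draft generated from a {len(prompt)}-char prompt]"

def schedule_for_review(draft: str) -> dict:
    # Stand-in for the scheduler node; a human still clicks publish.
    return {"status": "queued_for_review", "draft": draft}

def run_pipeline(video_id: str) -> dict:
    # The trigger fires this; the order never changes and nothing re-routes.
    transcript = fetch_transcript(video_id)
    draft = draft_post(transcript)
    return schedule_for_review(draft)

result = run_pipeline("vid_001")
print(result["status"])  # queued_for_review
```

Note what is missing: there is no branch anywhere. If this week's topic belongs on Twitter instead, the same three calls run in the same order regardless.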
Why it tops out

The workflow can't think

  • ×
    No judgment calls.

    If this week's topic is better as a Twitter thread than a LinkedIn post, the workflow can't tell — it runs the same steps in the same order.

  • ×
    Static prompts go stale.

    The voice file you wrote three months ago doesn't know your carousels are outperforming text right now.

  • ×
    Output not good enough? You rewrite the prompt.

    The workflow can't iterate on its own.

  • Where it shines.

    Repeatable, well-defined work where the steps genuinely never change.

Autonomy
~42%
Who decides the path: You, every step
Typical tools
n8n · Zapier · Make.com
03
06 / 09
Level 03 of 04 · Agentic Workflows

Agentic — goals, not steps.

You give a goal — the model picks the path. This is where AI starts thinking, not just doing. The shift from "follow my recipe" to "figure it out".

Example · One prompt, full content set

"Turn this video into LinkedIn, Twitter and a newsletter."

Turn this week's video into content for LinkedIn, Twitter, and my newsletter.
Reasoning trace
Pulling transcript… reading brand voice file… topic suits visual storytelling → drafting LinkedIn carousel… contrarian angle present → drafting X thread… running both through style guide… rewriting two posts that failed quality bar… saving for review.

You didn't write those steps. The model decided them based on the goal you gave it.

The core mechanic

The ReAct loop, inside a harness

↺ Iterate until the goal is reached (the LLM drives the loop):
① Reason · plan the next move
② Act · call a tool
③ Observe · read the result
④ Iterate · refine and retry until the evidence is enough
→ Answer: 3 drafts + a thread, reviewed against brand voice, saved for review

A harness is the infrastructure that turns thinking into doing — read files, run tools, check its own work. Without one, you have a chatbot in a tab.
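The ReAct loop fits in a few lines once you see its shape. A minimal sketch, assuming a `plan_next_action` stand-in for the model call and two toy tools (none of these names come from a real framework):

```python
# A minimal Reason → Act → Observe → Iterate loop.
# plan_next_action stands in for the LLM; the tools are toys.

def plan_next_action(goal: str, observations: list) -> dict:
    # Stand-in for the "reason" step: pick the next tool from what's known so far.
    if not observations:
        return {"tool": "read_file", "arg": "brand-voice.md"}
    if len(observations) == 1:
        return {"tool": "draft", "arg": goal}
    return {"tool": "done", "arg": observations[-1]}  # evidence enough → answer

TOOLS = {
    "read_file": lambda arg: f"contents of {arg}",
    "draft": lambda arg: f"draft for: {arg}",
}

def react_loop(goal: str, max_steps: int = 10) -> str:
    observations = []
    for _ in range(max_steps):                          # iterate until done
        action = plan_next_action(goal, observations)   # ① reason
        if action["tool"] == "done":
            return action["arg"]                        # → answer
        result = TOOLS[action["tool"]](action["arg"])   # ② act
        observations.append(result)                     # ③ observe, ④ iterate
    raise RuntimeError("step budget exhausted")

print(react_loop("LinkedIn post from this week's video"))
# draft for: LinkedIn post from this week's video
```

The step budget and the tool registry are the harness doing its job: the model chooses the path, but only through tools you handed it, and only for so many turns.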

Autonomy
~72%
Who decides the path: The model
Typical tools
Claude Code · Codex · Cursor
04
07 / 09
Level 04 of 04 · Agentic AI Systems · Production

Systems — a team that runs operations.

Not one agent doing one job — a coordinated system running an entire operation. Multiple skills, shared memory, real tools, with you in the loop where it counts.

Example · From one video to a published week

One command, the content engine runs

  • Clip extractor skill ranks short-form moments by virality criteria.
  • Carousel skill builds platform-specific decks with on-brand dimensions and copy.
  • Newsletter skill drafts the weekly from the key takeaways.
  • Ad-copy skill generates angles based on what's performed before.
  • Scheduler MCP queues everything for review — nothing publishes without you.
The building blocks · harness engineering

Six primitives — all just files in folders

Coordinator · routes the work
Skills · folders of instructions
MCPs · plug into your tools
Memory · cross-session context
Human-in-loop · review at the gates
Rules · how it behaves
Hooks · checkpoints & gates
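"Folders of instructions" can be taken literally. A toy sketch: each skill is a directory holding an `instructions.md`, and the coordinator routes by matching the task to a folder. The skill names, paths, and the `route` helper are illustrative, not a real product layout.

```python
# A toy version of "skills are just folders of instructions": each skill is a
# directory with an instructions.md the coordinator reads before routing work.
# Paths, skill names, and the route() helper are illustrative.
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp()) / "harness"
for skill in ("clip-extractor", "carousel", "newsletter"):
    skill_dir = root / "skills" / skill
    skill_dir.mkdir(parents=True)
    (skill_dir / "instructions.md").write_text(f"# {skill}\nHow to do {skill} work.\n")

def route(task: str) -> str:
    # The coordinator: match the task to a skill folder, hand over its instructions.
    for skill_dir in (root / "skills").iterdir():
        if skill_dir.name in task:
            return (skill_dir / "instructions.md").read_text()
    return "no matching skill — escalate to the human at the gate"

print(route("build a carousel from this transcript").splitlines()[0])  # # carousel
```

Because the "skill" is a markdown file, improving it is an edit and a diff, not a redeploy.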
Autonomy
~96%
Who decides the path: The system · you review
Building blocks
Skills · MCPs · Memory · Hooks
08 / 09
03 · The pattern · what the harness actually is

Underneath the terminology — it's just files in folders.

Closer to organising a Notion workspace than writing code. Which is why the audience is business owners and knowledge workers — not just developers.

It's not more code — it's more structure: skills, memory and MCPs are markdown files and folders feeding the right context to the right agent at the right time.

— the thesis behind every Jentrix engagement
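"Feeding the right context at the right time" can be sketched in a few lines: before any model call, the harness concatenates a handful of markdown files into the prompt. The file names mirror the ones in this deck; the contents and the `build_context` helper are illustrative stand-ins.

```python
# A toy sketch of context assembly: the prompt is a concatenation of the same
# markdown files a new joiner would read. Contents are illustrative stand-ins.
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
(root / "brand-voice.md").write_text("Direct, first person, no emoji.")
(root / "memory.md").write_text("Carousels outperformed text posts last month.")

def build_context(goal: str) -> str:
    # Voice + memory go in front of every call; the goal goes last.
    files = (root / "brand-voice.md", root / "memory.md")
    return "\n\n".join([p.read_text() for p in files] + [f"Goal: {goal}"])

context = build_context("repurpose this week's video")
print(context.splitlines()[0])  # Direct, first person, no emoji.
```

Update `memory.md` and the very next call already knows carousels are winning, which is exactly what the static Level 02 prompt could not do.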

Versioned in your repo

Skills, memory, rules — diffed, reviewed, CI-checked. Improvable like any other engineering asset.

Reviewable by humans

A new joiner reads brand-voice.md the same way they'd read a Notion doc.

Replaceable model

Drop in a better model next quarter; your harness — the asset — keeps shipping.

Human at the gates

~96% autonomous, but you sit at the input and the publish step. Where it actually matters.

09 / 09
04 · Closing · start where the cost of staying a pilot is highest

Start with one process.

Bring the workflow you're not sure about. We'll spend 30 minutes mapping whether a harness pays off — and you'll leave with a written recommendation either way.

The model is replaceable. The harness is your engineering asset.

~18% · Chatbot ~42% · Workflow ~72% · Agentic ~96% · System
What happens next

A working session, an honest read.

30 minutes with the people who own the workflow. We map roles, rules, memory, validation — and where humans stay in the loop.

  1. A 30-minute working session. Your workflow, our questions. Bring whoever owns the outcome.
  2. We map the harness. Roles, rules, memory, validation — and exactly where AI fits, where humans stay in the loop.
  3. You get a written recommendation — including "don't build it yet" if that's the honest call. Yours to keep.