Anatomy of an Autonomous Agent

2026-05-07

Section 1 · TaxonomyWhere the architecture changes

The Autonomous Agent: a loop with a goal.

Four components turn a model into an agent. Each one is conceptually simple. Each one also introduces a failure class your existing controls were not designed for.

Goal

A persistent objective that survives across many tool calls. The agent keeps working until the goal is satisfied — or it gives up.

New failure class

Goal mis-specification. The agent satisfies the literal goal in a way no human would have endorsed. (“Reduce ticket backlog” → mass-close everything.)

Planner

Decomposes the goal into sub-tasks, sequences them, and decides what to do next based on intermediate results.

New failure class

Plan drift. Each step seems locally rational; the trajectory ends somewhere the original plan never contemplated.

Memory

Short-term scratchpad for the current task; long-term store for facts, prior runs, and learned patterns.

New failure class

Context poisoning. Hostile input written into memory by an earlier interaction influences a later, higher-stakes decision.

Reflection

The model evaluates its own output, catches mistakes, retries with adjustments. Closes the loop on its own work.

New failure class

Confident wrongness. The reflection step rationalizes a bad decision rather than rejecting it. The agent argues itself into the wrong answer.

What changes architecturally

The system is no longer stateless. Decisions made in step 3 depend on memory written in step 1. Replay, audit, and rollback all become first-class concerns.

What changes operationally

You are no longer reviewing answers. You are reviewing trajectories — the sequence of decisions an agent made on its way to an outcome.