Section 2 Takeaway: Trust is Structural

2026-05-07

Section 2 · Wrap · End of trust architecture

Trust is structural, not behavioral.

You cannot prompt your way to safety. You cannot fine-tune your way to compliance. A trustworthy agent is one running inside a system designed so that any reasonable model — and many unreasonable ones — produces only acceptable outcomes.

Authority follows blast radius, not capability.

What an agent is allowed to do is a function of how recoverable the action is — not how clever the model behind it appears to be in a demo.
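One way to picture this rule is as a policy lookup keyed on recoverability alone. The sketch below is illustrative only: the tier names, the oversight labels, and the `required_oversight` function are hypothetical, not taken from any particular framework. The model's benchmark score is accepted as a parameter and then deliberately ignored, to make the point visible in code.

```python
from enum import IntEnum


class Recoverability(IntEnum):
    """Hypothetical tiers, ordered from easiest to hardest to undo."""
    REVERSIBLE = 1     # e.g. editing a draft
    COMPENSABLE = 2    # e.g. a charge that can be refunded
    IRREVERSIBLE = 3   # e.g. deleting production data


# Hypothetical policy table: oversight depends only on the action's tier.
POLICY = {
    Recoverability.REVERSIBLE: "autonomous",
    Recoverability.COMPENSABLE: "autonomous_with_audit",
    Recoverability.IRREVERSIBLE: "human_approval",
}


def required_oversight(tier: Recoverability, model_benchmark_score: float) -> str:
    """Return the oversight level for an action.

    model_benchmark_score is intentionally unused: how capable the model
    looks does not change what it is allowed to do.
    """
    return POLICY[tier]


if __name__ == "__main__":
    print(required_oversight(Recoverability.IRREVERSIBLE, model_benchmark_score=0.99))
    print(required_oversight(Recoverability.REVERSIBLE, model_benchmark_score=0.10))
```

Under these assumptions, a strong model still needs human approval for an irreversible action, and a weak one can still act alone on a reversible one.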

Defense in depth means three phases, not three vendors.

Pre-flight, in-flight, and post-flight validation catch non-overlapping failure classes. They are not substitutes for each other.
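A minimal sketch of why the three phases do not substitute for each other. The specific checks here (a tool allowlist, a step budget, a scan for a "SECRET" marker in the output) are assumptions chosen for illustration, as are all the names. Each check needs information that only exists at its own phase: the plan, the running state, or the finished output.

```python
from dataclasses import dataclass, field

ALLOWED_TOOLS = {"search", "read_file"}  # hypothetical allowlist
MAX_STEPS = 3                            # hypothetical step budget


@dataclass
class Run:
    planned_tools: list[str]
    steps_executed: int = 0
    output: str = ""
    violations: list[str] = field(default_factory=list)


def pre_flight(run: Run) -> bool:
    """Before execution: reject plans that name tools outside the allowlist."""
    bad = [t for t in run.planned_tools if t not in ALLOWED_TOOLS]
    if bad:
        run.violations.append(f"pre-flight: disallowed tools {bad}")
    return not bad


def in_flight(run: Run) -> bool:
    """During execution: halt a run that exceeds its step budget."""
    if run.steps_executed > MAX_STEPS:
        run.violations.append("in-flight: step budget exceeded")
        return False
    return True


def post_flight(run: Run) -> bool:
    """After execution: inspect the output, which no earlier phase can see."""
    if "SECRET" in run.output:
        run.violations.append("post-flight: output contains a secret marker")
        return False
    return True


def validate(run: Run, execute_step) -> bool:
    """Drive one run through all three phases; execute_step is caller-supplied."""
    if not pre_flight(run):
        return False
    for tool in run.planned_tools:
        run.steps_executed += 1
        if not in_flight(run):
            return False
        run.output += execute_step(tool)
    return post_flight(run)
```

In this sketch a plan made only of allowed tools passes pre-flight and can still overrun its budget or leak in its output, so removing any one phase leaves a failure class uncovered.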

A kill switch is a designed control, not a slogan.

Scoped, triggered, operated, and tested. If your team cannot describe all four, you do not have a kill switch.
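The four-part test can be written down as a checklist that a team either can or cannot fill in. This is a rough sketch; the `KillSwitchSpec` record, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class KillSwitchSpec:
    scope: str        # what exactly it stops (one agent, one tool, everything)
    trigger: str      # the condition that fires it
    operator: str     # who or what is able to pull it
    last_tested: str  # when it was last exercised, e.g. an ISO date


def missing_parts(spec: KillSwitchSpec) -> list[str]:
    """Return the names of any of the four parts left blank."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name).strip()]


if __name__ == "__main__":
    spec = KillSwitchSpec(
        scope="all outbound payment calls",
        trigger="error rate above threshold",
        operator="on-call engineer",
        last_tested="",  # never exercised
    )
    print(missing_parts(spec))
```

A non-empty result is the condition the text describes: a part the team cannot state, and so no kill switch.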

The hand-off is a UX problem disguised as a policy problem.

Approval fatigue and automation complacency are the enemies. The interface determines whether oversight is real or theater.

Now: even if the architecture is right, the management model has to change. That is the subject of Section 3.
