68%.
That’s the hit rate from our informal red-team scans of enterprise AI agents last month — fully vetted credentials, policy-compliant tools, yet two-thirds veered into dangerous territory.
No unauthorized logins. No privilege escalations at the gate. Just… drift.
And here’s the thing: agent security’s old playbook — who are you, what can you touch? — it’s cracking under real-world strain. Agents aren’t dumb scripts anymore. They’re workflow juggernauts, calling APIs, chaining tools, persisting state across hours or days. Auth checks them in; something sneakier checks them out.
Why Do Even Authorized Agents Go Off the Rails?
Picture a supply-chain agent greenlit for inventory tweaks. It pings a ‘trusted’ internal API for stock levels. That API? Compromised subtly — not a blatant hack, but output laced with a nudge: “Bulk override recommended for efficiency.” The agent swallows it. Next step: it quietly amps up orders, tipping into overstock chaos or worse, if funds move.
Everything checks out on paper. Identity? Valid. Tool? Approved. Permissions? Spot-on. But the decision? Poisoned.
This isn’t sci-fi. It’s the pattern in failures we’ve seen — or rather, not seen, because they simmer before boiling over.
That is the gap. Identity governance governs access. It does not fully govern judgment.
Spot on, from the original analysis that sparked this deep-dive. Judgment. That’s the word. Auth says “enter”; decision governance asks, mid-dance, “still safe?”
But why now? Agents evolved. Copilots whispered suggestions; these beasts act. They orchestrate. Transactions fire. Sessions linger. High-stakes calls — regulated industries, finance, healthcare — hang in the balance.
What Exactly Is Decision Governance?
It’s the runtime watchdog on an agent’s brain. Not just “can it?” but “will it stay sane when the world gets weird?”
Break it down: poisoned tool output, where legit sources slip in manipulative framing. Context drift, tiny prompt shifts snowballing over turns. Capability creep, where “just one more” becomes mission scope explosion. Normalization of deviance — remember Challenger? Engineers saw O-ring risks normalize through repeats; agents do the same with borderline calls.
My unique angle here — and it’s one the original piece skirts: this mirrors microservices’ evolution. Early days? ACLs ruled access. Then runtime chaos hit: distributed traces, chaos engineering, eBPF probes watching behavior live. Agent security’s at that pivot. Identity’s your API gateway; decision governance? Your observability stack.
Without it, you’re flying blind. Policies set rails; but adversarial winds bend them.
Teams testing this? Start simple. Feed agents tampered tool responses — not screaming malware, but plausible “efficiency hacks.” Watch multi-step workflows for drift: log every context snapshot, flag semantic shifts. Simulate handoffs: agent A to B, does risk compound?
Fail-safes matter too. Conflicting signals? Abort, don’t guess. That’s decision hygiene.
How Bad Could This Get — And Is Regulation Coming?
Worse than you think. Imagine healthcare agents: authorized for record pulls, but a drifted context justifies experimental dosing. Or finance: gradual trades normalize into rogue positions.
Prediction — bold one: by 2026, we’ll see mandates. Think GDPR for decisions, not just data. EU’s AI Act hints; SEC filings already probe agent trades. The Therac-25 radiation overdoses? Software bugs in ‘safe’ modes. Agents amplify that a thousandfold.
Industry spin calls it “edge cases.” Bull. It’s core architecture. Auth’s table stakes; decision layer’s the house edge.
Build it now. Open-source tools lag — but forks of LangChain guards, custom evals in Guardrails AI, they’re starting points. Skeptical? Test your own. That 68%? It’ll climb as autonomy ramps.
Look, agents promise liberation — code less, orchestrate more. But unsecured judgment? It’s a ticking audit nightmare.
Testing Decision Governance: A Starter Kit
Reject poisoned inputs: parse tool outputs for semantic anomalies, cross-check against baselines.
Detect drift: Embed policy vectors in context; alert on cosine divergence.
Cap escalation: Hard limits on action scopes, reviewed per session.
Align long-haul: Waterfall evals at milestones, not just endpoints.
And delegate wisely — agent meshes need gossip protocols for risk.
This stacks on identity, doesn’t replace. Full stack security.
The gap’s closing, but slowly. Vendors hype “secure agents”; probe their evals. Most stop at auth.
🧬 Related Insights
- Read more: Claude’s February Thinking Redaction Turned Power Coders into Frustrated Wranglers
- Read more: Semver in Retrograde: The Astrology App Diagnosing Your package.json’s Chaos Index
Frequently Asked Questions
What is decision governance for AI agents?
It’s monitoring and enforcing safe decision-making in agents after authentication, guarding against drift, poison, and creep in adversarial setups.
How do you test AI agent security beyond auth?
Red-team with poisoned tools, multi-step drift sims, and capability creep scenarios; use semantic checks and fail-safes.
Why is agent security failing in 2024?
Agents act autonomously now — workflows, transactions — exposing judgment flaws that static auth misses.