Runtime Policy
Every tool call your agent makes passes through Warden before it executes. The evaluation is deterministic for safety rules and heuristic for session guidance. Each call receives one of five decisions:
| Decision | What happens | Example |
|---|---|---|
| Pass | Command proceeds silently. Zero overhead. | Normal work — the agent doesn’t know Warden is there. |
| Teach | Command runs, agent receives a targeted hint. | “4 files edited since last build — consider running tests.” |
| Apply | Warden rewrites the command to a safer or more efficient form. | grep rewritten to rg with equivalent flags. |
| Require Structure | Agent is asked to restructure its approach before proceeding. | Session is looping — Warden asks for a different strategy. |
| Deny | Command is blocked. Agent sees why and what to do instead. | rm -rf /, credential exposure, reverse shell patterns. |
This is not post-hoc analysis. The decision happens on the live path, before the command reaches your environment.
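A minimal sketch of what "on the live path" means, assuming a hypothetical list of check functions that each return a decision or nothing. The names `Decision`, `evaluate`, and `run_tool` are illustrative, not Warden's actual API; the point is only that a denied command never reaches the executor.

```python
from enum import Enum
from typing import Callable, List, Optional, Tuple


class Decision(Enum):
    """The five decisions from the table above."""
    PASS = "pass"
    TEACH = "teach"
    APPLY = "apply"
    REQUIRE_STRUCTURE = "require_structure"
    DENY = "deny"


# Hypothetical check signature: command string in, Decision or None out.
Check = Callable[[str], Optional[Decision]]


def evaluate(command: str, checks: List[Check]) -> Decision:
    """Return the first decision any check produces; otherwise PASS."""
    for check in checks:
        decision = check(command)
        if decision is not None:
            return decision
    return Decision.PASS


def deny_root_delete(command: str) -> Optional[Decision]:
    """Toy safety check used only for this example."""
    return Decision.DENY if command.strip() == "rm -rf /" else None


def run_tool(
    command: str, checks: List[Check], execute: Callable[[str], str]
) -> Tuple[Decision, Optional[str]]:
    """Evaluate first, execute second. A DENY skips execution entirely."""
    decision = evaluate(command, checks)
    if decision is Decision.DENY:
        return decision, None
    return decision, execute(command)
```

Because `execute` is only called after `evaluate` returns, there is no window in which a blocked command can run.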
Why Runtime Policy Matters
Static rules in a CLAUDE.md or system prompt are suggestions. The agent can ignore them, misinterpret them, or forget them after context compaction. They also consume context budget on every turn.
Runtime policy operates outside the model’s context window, on the actual tool call. The agent cannot bypass it because the evaluation happens before the command executes. A rule against rm -rf / in a static rule file is a polite request. The same rule enforced by runtime policy is a wall.
Deterministic vs. Heuristic
Not everything Warden does carries the same guarantee. Knowing which behaviors are guaranteed and which are advisory tells you what you can rely on:
Deterministic — hard guarantees, same input always produces same output:
- Safety deny/allow decisions (pattern matching against compiled rules)
- Tool substitution rewrites
- Output compression where filter rules exist
- Rule merge behavior (compiled → global → packs → project)
Heuristic — bounded signals, advisory in nature:
- Drift detection and focus scoring
- Loop detection and verification debt tracking
- Session phase classification
- Advisory injection timing and budget
Deterministic protections are Warden’s floor — they work regardless of model, session length, or context state. Heuristic guidance is Warden’s ceiling — it improves sessions but degrades gracefully if signals are ambiguous.
Decision Types in Detail
Deny blocks the command entirely. The agent receives a structured explanation:
BLOCKED: rm -rf on broad paths. Remove specific files by name.
Rule: safety.0 | Pattern: \brm\s+-rf?\s+[~*/.]
Most agents self-correct immediately after a deny.
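The deny message above can be reproduced with a few lines of pattern matching. This is a sketch using Python's `re` module with the pattern, rule id, and message text copied from the example; the function name `check_deny` is an assumption, and Warden's own matcher may work differently.

```python
import re
from typing import Optional

# Pattern copied verbatim from the deny message above.
RM_BROAD = re.compile(r"\brm\s+-rf?\s+[~*/.]")


def check_deny(command: str) -> Optional[str]:
    """Return a structured explanation if the command is blocked, else None."""
    if RM_BROAD.search(command):
        return (
            "BLOCKED: rm -rf on broad paths. Remove specific files by name.\n"
            f"Rule: safety.0 | Pattern: {RM_BROAD.pattern}"
        )
    return None
```

The same input always yields the same result, which is what makes this class of decision deterministic. As written, the pattern also matches any target that merely begins with `.`, such as `./tmp`, while `rm -rf build/cache` is not matched.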
Apply rewrites the command transparently. The agent intended grep -r "pattern" src/; Warden runs rg "pattern" src/ instead. The result is faster, produces less output, and the agent learns the substitution for the rest of the session.
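A substitution like this can be sketched as a small argv rewrite. The version below is deliberately narrow and is an illustration, not Warden's rule: it handles only the plain `grep -r PATTERN PATH...` form and declines anything else. The function name `rewrite_grep` is hypothetical.

```python
import shlex
from typing import Optional


def rewrite_grep(command: str) -> Optional[str]:
    """Rewrite `grep -r PATTERN PATH...` to `rg PATTERN PATH...`.

    Returns None when the command is not in the one form handled here.
    """
    try:
        argv = shlex.split(command)
    except ValueError:
        return None  # unbalanced quotes: leave the command alone
    if len(argv) < 3 or argv[0] != "grep" or argv[1] != "-r":
        return None
    rest = argv[2:]
    if any(arg.startswith("-") for arg in rest):
        return None  # further flags would need a real flag-mapping table
    # rg searches directories recursively by default, so -r is dropped.
    return shlex.join(["rg", *rest])
```

Returning `None` for unrecognized forms keeps the rewrite safe: the original command passes through unchanged. Note that rg skips hidden and ignored files by default, so a production rule would have to decide whether that difference is acceptable.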
Teach lets the command run but injects a hint alongside the output. Advisories are budgeted — healthy sessions get fewer, struggling sessions get more. Safety signals always fire regardless of budget.
Require Structure is for deeper session issues. When the agent is stuck in a loop or has accumulated significant verification debt, Warden asks it to step back and restructure rather than retry the same approach.
Pass produces zero overhead. In a healthy session, the vast majority of tool calls hit this path. The agent works as if Warden isn’t there.
How Policies Compose
Rules come from four sources, merged at startup from lowest to highest precedence:
- Compiled defaults: shipped with Warden. Safety rules, substitutions, hallucination detection. Core safety rules cannot be disabled.
- Global rules (~/.warden/rules.toml): your personal overrides that apply to all projects.
- Installed packs (~/.warden/packs/*.toml): community or official rule packs for specific domains (database, infrastructure, frontend, etc.).
- Project rules (.warden/rules.toml): project-specific overrides. A monorepo might need different thresholds than a small library.
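Each layer can append rules or use replace = true to fully override defaults for that category. Project rules are merged last, so they take precedence over every other layer, with the exception of core safety rules, which cannot be disabled.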
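The append-or-replace behavior can be sketched as a fold over the layers in precedence order. The data shape here (a category name mapped to an optional `replace` flag and a list of `rules`) is an assumption for illustration and is not Warden's actual TOML schema; the rule names are placeholders. The protection of core safety rules is not modeled.

```python
from typing import Dict, List, TypedDict


class Category(TypedDict, total=False):
    replace: bool      # True: discard rules inherited from earlier layers
    rules: List[str]   # placeholder rule identifiers


Layer = Dict[str, Category]


def merge_layers(layers: List[Layer]) -> Dict[str, List[str]]:
    """Merge layers given from lowest to highest precedence."""
    merged: Dict[str, List[str]] = {}
    for layer in layers:
        for name, category in layer.items():
            rules = list(category.get("rules", []))
            if category.get("replace", False):
                merged[name] = rules                      # full override
            else:
                merged.setdefault(name, []).extend(rules)  # append
    return merged


# Hypothetical layers: compiled -> global -> pack -> project.
compiled: Layer = {"substitutions": {"rules": ["rule-a"]}, "advisories": {"rules": ["adv-a"]}}
global_rules: Layer = {"substitutions": {"rules": ["rule-b"]}}
pack: Layer = {"advisories": {"rules": ["adv-b"]}}
project: Layer = {"substitutions": {"replace": True, "rules": ["rule-c"]}}

merged = merge_layers([compiled, global_rules, pack, project])
```

In this example the project layer replaces the `substitutions` category outright, while `advisories` accumulates rules from the compiled defaults and the pack.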
Advisory Budget
Not every advisory is worth injecting. Each one consumes context tokens, and excessive injection degrades the session it’s trying to help.
Advisories are ranked by impact. Session health determines how many can fire per turn — healthy sessions hear very little, struggling sessions get comprehensive guidance. Safety signals are never budgeted; they always fire.
The net effect: when everything is going well, Warden is invisible. As a session degrades, Warden becomes more vocal — but only with the most impactful corrections.
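One way to picture the budget: rank the candidate advisories by impact, keep only as many as the session's health allows, and let safety signals through unconditionally. The health score, thresholds, and slot counts below are invented for illustration; only the overall shape follows the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Advisory:
    message: str
    impact: float         # higher means more valuable to inject
    safety: bool = False  # safety signals bypass the budget


def budget_for(health: float) -> int:
    """Hypothetical mapping from session health (0.0 to 1.0) to slots per turn."""
    if health >= 0.8:
        return 0
    if health >= 0.5:
        return 1
    return 3


def select_advisories(candidates: List[Advisory], health: float) -> List[Advisory]:
    """Safety signals always fire; the rest compete for budgeted slots."""
    safety = [a for a in candidates if a.safety]
    ranked = sorted(
        (a for a in candidates if not a.safety),
        key=lambda a: a.impact,
        reverse=True,
    )
    return safety + ranked[: budget_for(health)]
```

With these made-up thresholds, a session at health 0.9 receives only safety signals, while a session at 0.2 receives safety signals plus the three highest-impact advisories.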