The policy gate every agent needs before you go to production
Every tool your agent calls needs a policy gate before it runs. Here's the fail-closed pattern: consultBefore, the three outcomes, and the reactive approval trigger.
Every tool your agent calls should pass through a policy check before it runs. Not most tools. Not trusted tools. Every tool.
I learned this the hard way. I was building a customer support agent that could send emails, update CRM records, and issue refunds. The agent was sharp: it diagnosed problems correctly, it crafted good responses. And then a test case triggered a bug in the routing logic and it sent a refund email to the wrong person.
The agent had the capability. It didn’t have a gate.
A policy gate is the difference between an agent that’s powerful in demo and an agent that survives production. Here’s what it looks like in practice.
TL;DR: A policy gate checks every tool call before it runs: allow, deny, or pause for human approval. Fail-closed means if the policy service is unreachable, the call is denied. The reactive approval trigger lets a human resolve decisions without rebuilding your state machine. You can implement this in under 50 lines of Python with no framework.
Key takeaways:
- Every tool call goes through one chokepoint: the policy gate
- Fail-closed: if the gate can’t run, the call is denied, not allowed through
- Three outcomes: allow (proceed), deny (blocked), needs_approval (park and wait)
- The reactive trigger pattern (turn::on_approval) means one trigger handles every session’s approvals: no per-call resume functions
- You can implement this without any framework: just a function and a YAML ruleset
The gate is a single chokepoint
The pattern is simple. Every tool call, no matter what the agent decided to call, goes through one function before it executes. That function is your policy gate.
def consultBefore(tool_call: dict, timeout: float = 5.0) -> dict:
    """
    Check a tool call against policy before execution.
    Returns: {'decision': 'allow'} | {'decision': 'deny', 'reason': '..'} | {'decision': 'needs_approval'}
    """
    # Load policy rules
    policy = load_policy_rules("iii-permissions.yaml")
    # Match the tool's function_id against the rule set
    function_id = tool_call.get("function_id") or tool_call.get("name")
    matched_rule = policy.match(function_id)
    # No matching rule (assumes match() returns None): deny by default
    if matched_rule is None:
        return {"decision": "deny", "reason": "no matching rule found"}
    # Return the outcome; compare the rule's decision field, not the rule object
    if matched_rule.decision == "allow":
        return {"decision": "allow"}
    elif matched_rule.decision == "deny":
        return {"decision": "deny", "reason": matched_rule.reason}
    elif matched_rule.decision == "needs_approval":
        return {"decision": "needs_approval", "reason": matched_rule.reason, "parked_at": matched_rule.checkpoint}
    else:
        # Unrecognized decision value: deny rather than guess
        return {"decision": "deny", "reason": "unrecognized decision in rule"}
That’s the entire gate. Everything else is about what happens at each outcome.
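One way to make "single chokepoint" literal is to keep tool implementations in a registry that is only reachable through one dispatch function. The sketch below is an illustration, not the iii harness API: the `TOOLS` registry, the `tool` decorator, `call_tool`, and `demo_gate` are all hypothetical names.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: tool name -> implementation.
TOOLS: Dict[str, Callable[..., Any]] = {}

def tool(name: str) -> Callable:
    """Register a function as a tool; callers reach it only via call_tool."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        return fn
    return register

def call_tool(gate: Callable[[dict], dict], name: str, **arguments: Any) -> dict:
    """The single chokepoint: consult the gate, then dispatch or short-circuit."""
    tool_call = {"function_id": name, "arguments": arguments}
    outcome = gate(tool_call)
    if outcome.get("decision") != "allow":
        # deny and needs_approval both stop here; nothing executes
        return {"executed": False, **outcome}
    return {"executed": True, "result": TOOLS[name](**arguments)}

@tool("send_email")
def send_email(to: str, body: str) -> str:
    return f"sent to {to}"

def demo_gate(tool_call: dict) -> dict:
    # Stand-in policy for the example: only send_email is allowed
    if tool_call["function_id"] == "send_email":
        return {"decision": "allow"}
    return {"decision": "deny", "reason": "no matching rule found"}
```

Because the gate runs before the registry lookup, a call to a tool the policy does not know about is denied without ever touching an implementation.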
The three outcomes
When consultBefore returns, you get one of three decisions:
allow: the tool call dispatches normally. The orchestrator triggers the target function and writes the result. Nothing special.
deny: the tool call short-circuits. Instead of executing, the result becomes a structured denial record:
{
  "decision": "deny",
  "envelope": "gate_unavailable",
  "reason": "policy: delete_file tool not allowed outside /tmp",
  "tool_call_id": "call_abc123"
}
The agent receives the denial and decides what to do next: try a different approach, ask the user, or report the block. You don’t hide the denial from the agent. The agent should know it was blocked and why.
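The orchestrator needs a helper that produces that record and hands it back to the agent. Here is a minimal sketch of one. It assumes the call carries its ID under an `id` key, copies the `envelope` default from the record above, and uses a generic chat-style tool message; `as_tool_message` and the message shape are assumptions, so adapt them to whatever your model API expects.

```python
import json

def build_denial_envelope(call: dict, reason: str, envelope: str = "gate_unavailable") -> dict:
    """Build the structured denial record for a blocked tool call."""
    return {
        "decision": "deny",
        "envelope": envelope,
        "reason": reason,
        "tool_call_id": call.get("id"),  # assumed key for the call's ID
    }

def as_tool_message(denial: dict) -> dict:
    """Wrap the denial as a tool-result message so the agent can read it."""
    return {
        "role": "tool",
        "tool_call_id": denial["tool_call_id"],
        "content": json.dumps(denial),
    }
```

The denial travels in the same slot a successful result would, so the agent sees the block and its reason on its next step.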
needs_approval: the tool call parks. The rest of the batch keeps dispatching. The turn transitions to a waiting state only when one or more calls are pending. Nothing blocks unnecessarily.
# In the orchestrator
def dispatchWithHook(tool_calls: list):
    results = []
    awaiting_approval = []
    for call in tool_calls:
        outcome = consultBefore(call, timeout=5.0)
        if outcome["decision"] == "allow":
            result = execute_tool(call)
            results.append(result)
        elif outcome["decision"] == "deny":
            results.append(build_denial_envelope(call, outcome["reason"]))
        elif outcome["decision"] == "needs_approval":
            awaiting_approval.append({"call": call, "reason": outcome.get("reason")})
    # Park only after the whole batch has been dispatched
    if awaiting_approval:
        transition_to("function_awaiting_approval", parked_calls=awaiting_approval)
    return results
Fail-closed: the most important property
Here’s the property that makes the difference between a policy gate that protects you and one that fails open dangerously:
If the policy service is unreachable or the check times out, the call is denied.
Not allowed through. Not retried. Denied.
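import signal

class PolicyTimeout(Exception):
    """Raised when the policy check exceeds its time budget."""

def consultBefore_with_timeout(tool_call: dict, timeout: float = 5.0) -> dict:
    def timeout_handler(signum, frame):
        raise PolicyTimeout("Policy check timed out")

    # SIGALRM is Unix-only and works only in the main thread.
    # setitimer accepts fractional seconds; signal.alarm(int(timeout)) would truncate them.
    previous = signal.signal(signal.SIGALRM, timeout_handler)
    signal.setitimer(signal.ITIMER_REAL, timeout)
    try:
        return do_consult_before(tool_call)
    except PolicyTimeout:
        # Fail-closed: if we can't check in time, we deny
        return {"decision": "deny", "reason": "policy_timeout", "envelope": "gate_unavailable"}
    except Exception as exc:
        # Fail-closed: an unreachable or crashing policy service also denies
        return {"decision": "deny", "reason": f"policy_error: {exc}", "envelope": "gate_unavailable"}
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # Cancel the timer
        signal.signal(signal.SIGALRM, previous)  # Restore the previous handler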
This is the part most implementations get wrong. They write the happy path (allow, deny, needs_approval) and forget what happens when the policy service itself is down. The answer has to be: deny everything until the service is back.
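Timeouts are only one failure mode. A gate can also raise, or return something that is not one of the three decisions. A small wrapper can turn every one of those cases into a deny. This is a sketch under assumptions: `fail_closed`, the `gate_error` and `malformed_gate_response` reason strings, and the two demo gates are hypothetical names made up for the example.

```python
import functools
from typing import Callable

def fail_closed(check: Callable[[dict], dict]) -> Callable[[dict], dict]:
    """Wrap a policy check so any failure or malformed answer becomes a deny."""
    valid = {"allow", "deny", "needs_approval"}

    @functools.wraps(check)
    def guarded(tool_call: dict) -> dict:
        try:
            outcome = check(tool_call)
        except Exception as exc:  # unreachable service, bad config, bug in the gate
            return {
                "decision": "deny",
                "reason": f"gate_error: {type(exc).__name__}",
                "envelope": "gate_unavailable",
            }
        if not isinstance(outcome, dict) or outcome.get("decision") not in valid:
            return {
                "decision": "deny",
                "reason": "malformed_gate_response",
                "envelope": "gate_unavailable",
            }
        return outcome

    return guarded

@fail_closed
def flaky_gate(tool_call: dict) -> dict:
    # Simulates a policy service that is down
    raise ConnectionError("policy service unreachable")

@fail_closed
def confused_gate(tool_call: dict) -> dict:
    # Simulates a gate that returns an unknown decision
    return {"decision": "maybe"}
```

The useful property is that "allow" is the only path that requires the gate to work correctly; every other path ends in a deny.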
The approval wake: one reactive trigger
When a tool call needs human approval, the turn pauses. Someone (a manager, a customer, your Slack channel) makes a decision. The decision has to get back into the right turn.
The naive approach: register a per-call resume function when you park the call. Store it somewhere. Re-scan on startup to recover pending calls.
The better approach: one reactive trigger that fires on a shared topic.
approval::resolve(session_id, function_call_id, decision, reason)
↓
writes to: approvals/<sid>/<cid> = {decision, reason}
↓
fires: turn::on_approval (scope: approvals)
↓
advances the right session: whichever one owns that call
The orchestrator registers exactly one state trigger on scope approvals. When approval::resolve writes the decision to shared state, that write fires the trigger and the right session wakes up. No per-call resume registration. No startup re-scan to recover pending approvals.
# In the orchestrator: one trigger, handles everything
def register_approval_trigger():
    engine.on(
        trigger="turn::on_approval",
        scope="approvals",
        handler=lambda event: advance_session(event.session_id, event.approvals)
    )

def advance_session(session_id: str, approvals: list):
    # Look up the calls this session parked (helper assumed; keyed by call ID)
    parked_calls = load_parked_calls(session_id)
    # Read only the decisions that just landed
    for approval in approvals:
        call_id = approval["function_call_id"]
        decision = approval["decision"]
        if decision == "allow":
            # Pre-approved dispatch: proceed exactly as if the gate said allow
            result = execute_tool(parked_calls[call_id])
            emit_result(call_id, result)
        elif decision in ("deny", "aborted"):
            # Synthetic denial: the human said no or timed out
            emit_denial(call_id, reason=approval.get("reason", "human_denied"))
    # When awaiting list is empty, transition back to running
    if not pending_approvals(session_id):
        transition_to("running")
The key property: one trigger covers every session. Adding a new approval surface (Slack, email, your internal tool) means writing a worker that calls approval::resolve. You don’t touch the orchestrator. You don’t modify the turn state machine. You add a new way to resolve and the trigger handles the rest.
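To see why a single trigger is enough, here is a toy in-memory version of the flow. It is a sketch, not the iii engine: `StateStore`, `on_write`, `approval_resolve`, and the `parked` dictionary are hypothetical stand-ins, and only the `approvals/<sid>/<cid>` key layout comes from the diagram above.

```python
from collections import defaultdict
from typing import Callable, Dict, List

class StateStore:
    """Toy key-value store that fires registered handlers on every write to a scope."""

    def __init__(self) -> None:
        self.data: Dict[str, dict] = {}
        self.handlers: Dict[str, List[Callable[[str, dict], None]]] = defaultdict(list)

    def on_write(self, scope: str, handler: Callable[[str, dict], None]) -> None:
        self.handlers[scope].append(handler)

    def write(self, key: str, value: dict) -> None:
        self.data[key] = value
        scope = key.split("/", 1)[0]  # "approvals/s1/c1" -> "approvals"
        for handler in self.handlers[scope]:
            handler(key, value)

store = StateStore()
# Parked calls from two different sessions, keyed by (session_id, call_id)
parked = {("s1", "c1"): {"name": "issue_refund"}, ("s2", "c9"): {"name": "db_write"}}
log = []

def on_approval(key: str, value: dict) -> None:
    # Key layout follows the diagram: approvals/<sid>/<cid>
    _, session_id, call_id = key.split("/")
    call = parked.pop((session_id, call_id), None)
    if call is None:
        return  # unknown or already-resolved call: ignore the write
    log.append((session_id, call_id, value["decision"]))

def approval_resolve(session_id: str, call_id: str, decision: str, reason: str = "") -> None:
    store.write(f"approvals/{session_id}/{call_id}", {"decision": decision, "reason": reason})

store.on_write("approvals", on_approval)  # registered once, covers every session
```

Popping the parked call before acting on it also makes a duplicate resolve harmless: the second write finds nothing parked and is ignored.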
The YAML ruleset
Your policy rules live in a versioned config file. This is what iii-permissions.yaml looks like in the iii harness:
rules:
  - id: "allow_email_internal"
    function_id: "send_email"
    condition: "recipient_domain == 'company.com'"
    decision: "allow"
  - id: "deny_prod_db_write"
    function_id: "db_write"
    condition: "target_environment == 'production'"
    decision: "deny"
    reason: "Production database writes require manager approval"
  - id: "approval_refund_over_100"
    function_id: "issue_refund"
    condition: "amount > 100"
    decision: "needs_approval"
    reason: "Refunds over $100 need finance approval"
The rules are readable, auditable, and version-controlled. You can see exactly what each rule covers and why each decision was made. When someone asks “why did the refund get blocked?”, you show them the YAML.
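The `condition` strings have to be evaluated somewhere, and `eval` on a config file is not a good idea for a security gate. Below is one possible evaluator that accepts only the `<name> <op> <literal>` shape used in the rules above and denies on anything else. It is a sketch under assumptions: that condition variables are looked up in the call's `arguments`, and that `condition_holds` and `decide` are hypothetical helpers, not part of the iii harness.

```python
import ast
import operator

_OPS = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Lt: operator.lt, ast.LtE: operator.le,
}

def condition_holds(condition: str, arguments: dict) -> bool:
    """Evaluate '<name> <op> <literal>' against the call's arguments.

    Anything outside that shape raises ValueError so the caller can deny.
    """
    tree = ast.parse(condition, mode="eval").body
    if not (isinstance(tree, ast.Compare) and len(tree.ops) == 1
            and isinstance(tree.left, ast.Name)
            and isinstance(tree.comparators[0], ast.Constant)
            and type(tree.ops[0]) in _OPS):
        raise ValueError(f"unsupported condition: {condition!r}")
    if tree.left.id not in arguments:
        raise ValueError(f"missing argument: {tree.left.id}")
    return _OPS[type(tree.ops[0])](arguments[tree.left.id], tree.comparators[0].value)

def decide(rules: list, tool_call: dict) -> dict:
    """Return the first rule whose function_id and condition both match."""
    arguments = tool_call.get("arguments", {})
    for rule in rules:
        if rule["function_id"] != tool_call.get("function_id"):
            continue
        try:
            if "condition" in rule and not condition_holds(rule["condition"], arguments):
                continue
        except (ValueError, SyntaxError, TypeError) as exc:
            # A condition we cannot evaluate is treated as a deny, not a skip
            return {"decision": "deny", "reason": f"condition_error: {exc}"}
        return {"decision": rule["decision"], "reason": rule.get("reason", "")}
    return {"decision": "deny", "reason": "no_matching_rule"}

RULES = [
    {"function_id": "send_email", "condition": "recipient_domain == 'company.com'",
     "decision": "allow"},
    {"function_id": "issue_refund", "condition": "amount > 100",
     "decision": "needs_approval", "reason": "Refunds over $100 need finance approval"},
]
```

One consequence worth noticing: with the ruleset as written, a refund of $100 or less matches no rule and is denied by default, so you would add an explicit allow rule for small refunds if that is what you want.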
What this prevents
Without a policy gate, your agent can do everything it’s capable of doing. With one, it can only do what your rules permit.
The scenarios this catches that seem innocuous until they aren’t:
- A user asks “delete all my data” and the agent calls delete_user without checking the user’s identity first
- The routing logic sends an email to the wrong customer record because the CRM ID was wrong
- A refund gets triggered by a prompt injection attack in the user’s message
- A developer testing the agent accidentally runs drop table on the production schema
None of these are hypothetical. Teams that ship agents without a policy gate tend to hit at least one of them early in production.
The 5-second rule in practice
Set a 5-second timeout on every policy check. If the check doesn’t return in 5 seconds, deny the call. This keeps a slow or hung policy service from blocking the entire agent.
5 seconds is long enough for most policy evaluations: you’re checking a ruleset, not running a simulation. It’s short enough that a runaway loop doesn’t burn through budget while waiting for a decision that isn’t coming.
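import concurrent.futures

POLICY_TIMEOUT = 5.0  # seconds
_policy_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def consultBefore(tool_call: dict) -> dict:
    # Run the check in a worker thread so a hung policy service cannot block the caller.
    # Measuring elapsed time after evaluate_policy returns would never catch a hang.
    future = _policy_pool.submit(evaluate_policy, tool_call)
    try:
        return future.result(timeout=POLICY_TIMEOUT)
    except concurrent.futures.TimeoutError:
        # The worker may still be running; we stop waiting and deny
        return {
            "decision": "deny",
            "reason": "policy_timeout",
            "envelope": "gate_unavailable",
            "elapsed_seconds": POLICY_TIMEOUT
        }
    except Exception as exc:
        # Fail-closed on any other gate failure as well
        return {"decision": "deny", "reason": f"policy_error: {exc}", "envelope": "gate_unavailable"}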
Adding a new approval surface
The reactive trigger pattern means any system can become an approval surface.
Want to approve tool calls from Slack? Write a worker that listens for /approve <call_id> and /deny <call_id> slash commands, then calls:
iii.trigger('approval::resolve', {
    "session_id": session_id,
    "function_call_id": call_id,
    "decision": "allow",  # or "deny"
    "reason": "approved via Slack"
})
The orchestrator never knows the difference. The turn-orchestrator’s turn::on_approval trigger picks up the write and advances the session. You added a new worker; you didn’t replace the existing one.
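The worker's own logic is mostly parsing. Here is a sketch of the part that turns a slash command into the payload for approval::resolve. It assumes the worker already knows which session the command belongs to; `parse_approval_command` and the "denied via Slack" reason string are hypothetical, and the Slack wiring itself is left out.

```python
from typing import Optional

def parse_approval_command(text: str, session_id: str) -> Optional[dict]:
    """Turn '/approve <call_id>' or '/deny <call_id>' into an approval::resolve payload."""
    parts = text.strip().split()
    if len(parts) != 2 or parts[0] not in ("/approve", "/deny"):
        return None  # not an approval command; the worker should ignore it
    command, call_id = parts
    approved = command == "/approve"
    return {
        "session_id": session_id,
        "function_call_id": call_id,
        "decision": "allow" if approved else "deny",
        "reason": "approved via Slack" if approved else "denied via Slack",
    }
```

Returning `None` for anything that is not a well-formed command keeps the worker from resolving a call on a typo.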
The minimum implementation
If you want to implement this without any framework, here’s the smallest version that works:
import yaml
import time

class PolicyGate:
    def __init__(self, rules_path: str):
        with open(rules_path) as f:
            self.rules = yaml.safe_load(f)["rules"]

    def check(self, tool_call: dict, timeout: float = 5.0) -> dict:
        function_id = tool_call.get("function_id") or tool_call.get("name")
        start = time.monotonic()
        for rule in self.rules:
            # Deny if walking the ruleset itself takes too long
            if time.monotonic() - start > timeout:
                return {"decision": "deny", "reason": "policy_timeout"}
            if rule["function_id"] != function_id:
                continue
            if "condition" in rule:
                # This minimal version has no condition evaluator, so it cannot tell
                # whether a conditional rule applies. Skip it and fall through to deny.
                continue
            return {"decision": rule["decision"], "reason": rule.get("reason", "")}
        # No matching rule: deny by default (fail-closed)
        return {"decision": "deny", "reason": "no_matching_rule"}
That’s it. Load the YAML, check each tool call, return the decision. Everything else (the timeout handling, the approval parking, the reactive trigger) is additive on top of this core.
Agent mode: The policy gate pattern is the foundation of secure agent execution. Every serious production agent harness needs one. The pattern is the same whether you’re using LangGraph, a custom Python loop, or the iii engine: the gate is the single chokepoint that keeps your agent from doing things it shouldn’t.
Related Posts
Read AI agent error handling patterns for how to handle the downstream errors when a tool call is denied: retry strategies, fallback behaviors, and structured error responses.
Read AI agent multi-step workflows for how workflow orchestration and policy gates work together: sequential chains, conditional branching, and the human-in-the-loop checkpoint pattern.
Read AI agent logging and monitoring for what to log when a policy gate denies a call: decision logging, denial reasons, and how to replay a blocked run for debugging.
Galileo AI’s roundup of agent guardrails solutions compares 8 tools for policy enforcement and safety monitoring. An analysis of AI agent security in 2026 (https://agatsoftware.com/blog/ai-agent-security-enterprise-2026/) covers common enterprise security gaps and policy gate patterns.
This article was published on Agentic Up (https://agenticup.dev): practical guides for developers and founders building with AI agents. Reach me at [email protected].