Risk Engine
The risk engine evaluates every tool call against your policies and assigns risk scores. It runs synchronously: each tool call is evaluated in real time before the agent proceeds.
How it works
Tool call received
          │
          ▼
┌───────────────────┐
│ Load all enabled  │
│ policies          │
└─────────┬─────────┘
          │
          ▼
┌───────────────────┐
│ Match conditions  │──→ No match → ALLOWED (no risk)
│ against tool call │
└─────────┬─────────┘
          │ One or more match
          ▼
┌───────────────────┐
│ Apply strictest   │
│ action wins       │
└─────────┬─────────┘
          │
          ├─→ block → Tool call BLOCKED, risk event created
          ├─→ require_approval → Tool call PAUSED, approval request created
          ├─→ warn → Tool call ALLOWED, risk event flagged
          └─→ allow → Tool call ALLOWED
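The flow above can be sketched in a few lines of Python. This is a minimal illustration, not the engine's implementation: the `Policy` class, the `evaluate` and `lookup` functions, the dotted-path condition format, and the exact-match semantics are all assumptions made for the example. Only the four action names and their outcomes come from the diagram.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple

# Strictness order from the diagram: block > require_approval > warn > allow.
STRICTNESS = {"allow": 0, "warn": 1, "require_approval": 2, "block": 3}

# Outcome for each action, as described in the diagram.
OUTCOME = {
    "allow": "ALLOWED",
    "warn": "ALLOWED",
    "require_approval": "PAUSED",
    "block": "BLOCKED",
}


@dataclass
class Policy:
    """Hypothetical policy shape: every condition must equal the tool call's value."""
    id: str
    conditions: Dict[str, Any]  # dotted path -> expected value (assumed format)
    action: str
    enabled: bool = True


def lookup(tool_call: Dict[str, Any], path: str) -> Any:
    """Resolve a dotted path such as 'inputPayload.env'; None if it is missing."""
    value: Any = tool_call
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def evaluate(
    tool_call: Dict[str, Any], policies: List[Policy]
) -> Tuple[str, Optional[str]]:
    """Return (outcome, winning_action); winning_action is None when nothing matches."""
    matched = [
        p for p in policies
        if p.enabled
        and all(lookup(tool_call, path) == want for path, want in p.conditions.items())
    ]
    if not matched:
        return "ALLOWED", None  # no match: allowed, no risk
    # Strictest action among all matching policies wins.
    action = max((p.action for p in matched), key=STRICTNESS.__getitem__)
    return OUTCOME[action], action
```

For instance, with one `warn` policy on `toolName=deploy` and one `require_approval` policy on `inputPayload.env=production`, a production deploy call would come back as `("PAUSED", "require_approval")` under these assumptions.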
Risk severity levels
| Level | Description | Example |
|---|---|---|
| low | Minor concern, informational | Read access to non-sensitive files |
| medium | Moderate concern, worth monitoring | Write operations to staging |
| high | Significant concern, may need review | Access to sensitive files |
| critical | Immediate attention required | Destructive commands, production deploys |
Risk events
When a policy matches, a risk event is created with:
- Severity — based on the policy action
- Policy ID — which policy triggered
- Tool call details — what the agent tried to do
- Action taken — block, warn, or require_approval
- Timestamp — when it was evaluated
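As a rough picture of such a record, the fields above could be modeled as follows. The class name, field names, types, and ISO 8601 timestamp format are hypothetical; the actual schema may differ. The severity is passed in by the caller here because the mapping from policy action to severity is not specified in this section.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict

SEVERITIES = ("low", "medium", "high", "critical")
ACTIONS = ("block", "warn", "require_approval")


@dataclass
class RiskEvent:
    """Hypothetical risk event record mirroring the fields listed above."""
    severity: str
    policy_id: str
    tool_call: Dict[str, Any]
    action_taken: str
    # Evaluation time in UTC, ISO 8601 (assumed format).
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def __post_init__(self) -> None:
        # Reject values outside the documented severity levels and actions.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if self.action_taken not in ACTIONS:
            raise ValueError(f"unknown action: {self.action_taken!r}")


event = RiskEvent(
    severity="critical",
    policy_id="policy-c",
    tool_call={"toolName": "deploy", "inputPayload": {"env": "production"}},
    action_taken="block",
)
print(json.dumps(asdict(event), indent=2))
```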
Risk events are visible in:
- The Runs detail page (timeline view)
- The Dashboard overview (risk distribution chart)
- The Metrics API (GET /v1/metrics/risk)
Strictest-action-wins
When multiple policies match the same tool call, the most restrictive action applies:
Policy A: toolName=deploy → warn
Policy B: inputPayload.env=production → require_approval
Policy C: agentId=untrusted-agent → block
Result: BLOCKED (block > require_approval > warn)
This ensures that a permissive policy can never override a stricter one.
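One way to see why this holds: if actions are ranked and the result is the maximum rank among the matches, adding another matching policy can only keep or raise the result. The sketch below reproduces the example with a hypothetical `Action` enum and `resolve` helper; neither is part of the product's API.

```python
from enum import IntEnum
from typing import Iterable


class Action(IntEnum):
    # Higher value = more restrictive.
    ALLOW = 0
    WARN = 1
    REQUIRE_APPROVAL = 2
    BLOCK = 3


def resolve(matched: Iterable[Action]) -> Action:
    """Pick the most restrictive action; no matches means the call is allowed."""
    return max(matched, default=Action.ALLOW)


# Policies A, B and C from the example all match the same tool call.
result = resolve([Action.WARN, Action.REQUIRE_APPROVAL, Action.BLOCK])
print(result.name)  # BLOCK

# Adding a permissive policy never loosens the outcome.
assert resolve([Action.BLOCK, Action.ALLOW]) is Action.BLOCK
```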
Metrics
The risk engine feeds into the metrics system:
- Risk distribution — breakdown by severity level over time
- Top risky tools — which tools trigger the most risk events
- Agent risk profiles — which agents have the highest risk activity
- Policy match rates — how often each policy fires
Access these metrics via GET /v1/metrics/risk or the dashboard analytics page.
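To make these metrics concrete, here is a small sketch that computes three of them from a list of risk events held in memory. The event field names (`severity`, `toolName`, `policyId`) and the `summarize` helper are assumptions for illustration; the response shape of GET /v1/metrics/risk is not described here and may differ. The sketch reports raw match counts per policy rather than rates.

```python
from collections import Counter
from typing import Dict, List, Tuple


def summarize(events: List[Dict[str, str]]) -> Dict[str, List[Tuple[str, int]]]:
    """Count risk events by severity, tool, and policy (most frequent first)."""
    return {
        "risk_distribution": Counter(e["severity"] for e in events).most_common(),
        "top_risky_tools": Counter(e["toolName"] for e in events).most_common(),
        "policy_match_counts": Counter(e["policyId"] for e in events).most_common(),
    }


# Made-up sample events, for illustration only.
events = [
    {"severity": "critical", "toolName": "deploy", "policyId": "policy-c"},
    {"severity": "high", "toolName": "read_file", "policyId": "policy-d"},
    {"severity": "critical", "toolName": "deploy", "policyId": "policy-b"},
]
summary = summarize(events)
print(summary["top_risky_tools"])  # [('deploy', 2), ('read_file', 1)]
```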