Risk Engine

The risk engine evaluates every tool call against your policies and assigns a risk score. It runs synchronously: each tool call is evaluated in real time, before the agent proceeds.

How it works

Tool call received
         │
         ▼
┌──────────────────┐
│ Load all enabled │
│ policies         │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Match conditions │──→ No match → ALLOWED (no risk)
│ against tool call│
└────────┬─────────┘
         │ One or more match
         ▼
┌──────────────────┐
│ Apply strictest  │
│ action wins      │
└────────┬─────────┘
         │
         ├─→ block → Tool call BLOCKED, risk event created
         ├─→ require_approval → Tool call PAUSED, approval request created
         ├─→ warn → Tool call ALLOWED, risk event flagged
         └─→ allow → Tool call ALLOWED
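
The flow above can be sketched in a few lines of Python. This is an illustrative sketch, not the engine's actual implementation: the `Policy` class, its `matches` predicate, and the `evaluate` function are hypothetical names, and the numeric strictness ranks are an assumption derived from the diagram.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Assumed strictness ranks, from least to most restrictive.
STRICTNESS = {"allow": 0, "warn": 1, "require_approval": 2, "block": 3}

# Outcome of the tool call for each action, following the flow diagram.
OUTCOMES = {
    "allow": "ALLOWED",
    "warn": "ALLOWED",             # allowed, but a risk event is flagged
    "require_approval": "PAUSED",  # waits on an approval request
    "block": "BLOCKED",
}


@dataclass
class Policy:
    id: str
    action: str
    enabled: bool
    matches: Callable[[dict], bool]  # hypothetical condition predicate


def evaluate(tool_call: dict, policies: List[Policy]) -> Tuple[str, Optional[str]]:
    """Return (outcome, id of the deciding policy or None if nothing matched)."""
    # Step 1: only enabled policies take part. Step 2: match conditions.
    matched = [p for p in policies if p.enabled and p.matches(tool_call)]
    if not matched:
        return "ALLOWED", None
    # Step 3: the strictest action among the matches decides the outcome.
    strictest = max(matched, key=lambda p: STRICTNESS[p.action])
    return OUTCOMES[strictest.action], strictest.id


policies = [
    Policy("p-warn", "warn", True, lambda call: call["toolName"] == "deploy"),
    Policy("p-block", "block", False, lambda call: True),  # disabled, so ignored
]
print(evaluate({"toolName": "deploy"}, policies))     # ('ALLOWED', 'p-warn')
print(evaluate({"toolName": "read_file"}, policies))  # ('ALLOWED', None)
```

Because evaluation is synchronous, a caller in this sketch would wait for `evaluate` to return before letting the agent continue.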

Risk severity levels

| Level    | Description                          | Example                                 |
|----------|--------------------------------------|-----------------------------------------|
| low      | Minor concern, informational         | Read access to non-sensitive files      |
| medium   | Moderate concern, worth monitoring   | Write operations to staging             |
| high     | Significant concern, may need review | Access to sensitive files               |
| critical | Immediate attention required         | Destructive commands, production deploys |

Risk events

When a policy matches, a risk event is created with:

  • Severity — based on the policy action
  • Policy ID — which policy triggered
  • Tool call details — what the agent tried to do
  • Action taken — block, warn, or require_approval
  • Timestamp — when it was evaluated
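
As a rough illustration, the fields listed above could be modelled as a record like the one below. The class name `RiskEvent` and the attribute names are hypothetical; the real event schema may use different names or carry additional fields. How severity is derived from the policy action is not specified here, so the sketch takes it as an input.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

SEVERITIES = ("low", "medium", "high", "critical")
ACTIONS = ("block", "warn", "require_approval")


@dataclass
class RiskEvent:
    severity: str      # one of SEVERITIES
    policy_id: str     # which policy triggered
    tool_call: dict    # what the agent tried to do
    action_taken: str  # one of ACTIONS
    # When the call was evaluated; defaults to the current time in UTC.
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        # Reject values outside the documented sets.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if self.action_taken not in ACTIONS:
            raise ValueError(f"unknown action: {self.action_taken!r}")


event = RiskEvent(
    severity="critical",
    policy_id="policy-c",  # hypothetical ID
    tool_call={"toolName": "deploy", "inputPayload": {"env": "production"}},
    action_taken="block",
)
print(event.action_taken, event.timestamp.tzinfo)
```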

Risk events are visible in:

  • The Runs detail page (timeline view)
  • The Dashboard overview (risk distribution chart)
  • The Metrics API (GET /v1/metrics/risk)

Strictest-action-wins

When multiple policies match the same tool call, the most restrictive action applies:

Policy A: toolName=deploy → warn
Policy B: inputPayload.env=production → require_approval
Policy C: agentId=untrusted-agent → block

Result: BLOCKED (block > require_approval > warn)

This ensures that a permissive policy can never override a stricter one.
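
The example above can be reproduced with a short sketch. The policy representation (a `field` / `equals` / `action` dictionary) and the `lookup` and `strictest_action` helpers are assumptions made for illustration; only the field names `toolName`, `inputPayload.env`, and `agentId` and the ordering of actions come from the example itself.

```python
# Actions ordered from least to most restrictive.
STRICTNESS = ["allow", "warn", "require_approval", "block"]


def lookup(obj, dotted_path):
    """Resolve a dotted path such as 'inputPayload.env' against nested dicts."""
    for key in dotted_path.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


def strictest_action(tool_call, policies):
    """Return the most restrictive action among matching policies, or None."""
    actions = [
        p["action"]
        for p in policies
        if lookup(tool_call, p["field"]) == p["equals"]
    ]
    if not actions:
        return None
    return max(actions, key=STRICTNESS.index)


policies = [
    {"name": "Policy A", "field": "toolName", "equals": "deploy", "action": "warn"},
    {"name": "Policy B", "field": "inputPayload.env", "equals": "production",
     "action": "require_approval"},
    {"name": "Policy C", "field": "agentId", "equals": "untrusted-agent",
     "action": "block"},
]

tool_call = {
    "toolName": "deploy",
    "agentId": "untrusted-agent",
    "inputPayload": {"env": "production"},
}

print(strictest_action(tool_call, policies))  # block
```

If the same call came from a different agent, only Policies A and B would match and the result would be `require_approval`.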

Metrics

The risk engine feeds into the metrics system:

  • Risk distribution — breakdown by severity level over time
  • Top risky tools — which tools trigger the most risk events
  • Agent risk profiles — which agents have the highest risk activity
  • Policy match rates — how often each policy fires

Access these metrics via GET /v1/metrics/risk or the dashboard analytics page.
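
A minimal sketch of calling the endpoint from Python, using only the standard library. Only the path `GET /v1/metrics/risk` comes from this page; the base URL, the bearer-token `Authorization` header, and the assumption that the response body is JSON are all placeholders to check against your deployment.

```python
import json
import urllib.request


def build_risk_metrics_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for the risk metrics endpoint."""
    url = base_url.rstrip("/") + "/v1/metrics/risk"
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Accept": "application/json",
        },
    )


def fetch_risk_metrics(base_url: str, api_key: str, timeout: float = 10.0):
    """Send the request and decode the body, assuming it is JSON."""
    request = build_risk_metrics_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request performs no network I/O.
request = build_risk_metrics_request("https://api.example.com", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Calling `fetch_risk_metrics` with a real base URL and key would perform the request. The shape of the returned data is not described here, so the sketch returns the decoded body as-is.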