Zero Trust AI agent governance built on industry standards
VeriRFP implements all 25 requirements of the Cloud Security Alliance Agentic Trust Framework. Every AI agent operates under structured governance across identity, behavior, data, segmentation, and incident response.
- 25/25 ATF requirements implemented
- 20 governed agent types with capability manifests
- HMAC-SHA256 signed audit chains
- Ed25519 signed compliance heartbeats
- Redis-backed workspace kill switch checked at every agent gate
- 4-level autonomy model with automatic demotion
What is AI governance?
AI governance is the discipline of applying structured controls to AI agents and models so their behavior stays within approved boundaries. It covers identity and capability declaration, behavioral monitoring, data handling, segmentation of permitted actions, and incident response with rollback and kill-switch paths. Governance frameworks such as the Cloud Security Alliance Agentic Trust Framework define concrete requirements organizations can implement and attest to.
VeriRFP governs every AI agent through cryptographically signed audit trails, behavioral baseline monitoring with statistical anomaly detection, OPA policy evaluation with OpenFGA authorization, multi-stage LLM security scanning, Redis-backed circuit breakers with workspace-level kill switches, and a four-level autonomy model that automatically demotes agents on critical incidents. This governance stack is aligned to the Cloud Security Alliance Agentic Trust Framework at the Compatible (self-attestation) level.
Identity Management
Every agent is uniquely identified, cryptographically signed, and capability-declared.
| Req | Requirement | Status | Implementation |
|---|---|---|---|
| I-1 | Unique immutable agent ID | Met | 20 governed agent types with per-execution trace_id and trace_span_id for full correlation. |
| I-2 | Credential binding | Met | HMAC-SHA256-v2 signed audit chains ensure tamper-evident execution records. Ed25519 signed compliance heartbeats verify ongoing system integrity. |
| I-3 | Ownership chain documented | Met | Every governance decision records workspace_id, actor identity, role, approval context, and target resource. |
| I-4 | Purpose declaration | Met | Machine-readable capability manifests declare each agent's purpose, permitted actions, and data access scope. |
| I-5 | Capability manifest | Met | Typed manifests enumerate permitted actions, data read/write boundaries, external call targets, and maximum impact scope per agent. |
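To make the manifest requirements (I-4, I-5) concrete, here is a minimal TypeScript sketch of what a capability manifest could look like. The field names and example values are illustrative assumptions, not VeriRFP's exact schema.

```typescript
// Hypothetical sketch of a capability manifest; field names are illustrative,
// not the exact VeriRFP schema.
interface CapabilityManifest {
  agentType: string;                 // one of the 20 governed agent types
  purpose: string;                   // machine-readable purpose declaration (I-4)
  permittedActions: string[];        // explicit action allowlist (I-5)
  dataAccess: {
    read: string[];                  // entity types the agent may read
    write: string[];                 // entity types the agent may write
  };
  externalCallTargets: string[];     // allowed outbound call targets
  maxImpactScope: "entity" | "workspace"; // maximum blast radius
}

// Example manifest for a draft-style agent (values are made up for illustration).
const draftPipelineManifest: CapabilityManifest = {
  agentType: "draft_pipeline",
  purpose: "Generate answer drafts from workspace evidence",
  permittedActions: ["retrieve_evidence", "draft_answer"],
  dataAccess: { read: ["evidence", "question"], write: ["draft"] },
  externalCallTargets: ["llm_gateway"],
  maxImpactScope: "entity",
};
```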
Behavioral Monitoring
Rolling baselines flag agents that drift more than two standard deviations from their 30-day profile.
| Req | Requirement | Status | Implementation |
|---|---|---|---|
| B-1 | Structured logging | Met | Append-only agent audit log with structured entry schema covering action, entity, confidence, duration, and outcome. |
| B-2 | Action attribution | Met | Every audit entry records agent type, action performed, entity affected, trace correlation, and execution outcome. |
| B-3 | Behavioral baseline | Met | Rolling 30-day baselines computed per agent type per workspace. Metrics include average duration, error rate, request frequency, and p95 latency. |
| B-4 | Anomaly detection | Met | Statistical deviation detection triggers alerts when behavior deviates more than 2 sigma from the baseline. Worker processor runs hourly for active workspaces. |
| B-5 | Explainability | Met | Audit entries include input_summary, output_summary, confidence_score, and reasoning traces for draft pipeline agents. |
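As an illustration of the 2-sigma check behind B-3 and B-4, the minimal sketch below compares a recent observation against a rolling baseline. The baseline shape and function names are assumptions, not the actual worker implementation.

```typescript
// Illustrative 2-sigma deviation check; baseline fields and names are assumed.
interface Baseline {
  mean: number;    // e.g. 30-day average duration in ms
  stdDev: number;  // standard deviation over the same window
}

// Returns true when a recent observation drifts more than `threshold`
// standard deviations from the rolling baseline.
function isAnomalous(observed: number, baseline: Baseline, threshold = 2): boolean {
  if (baseline.stdDev === 0) return false; // no variance recorded yet
  const zScore = Math.abs(observed - baseline.mean) / baseline.stdDev;
  return zScore > threshold;
}

// Example: with a 1200ms ± 150ms duration baseline, a 1700ms run is flagged (z ≈ 3.3).
const flagged = isAnomalous(1700, { mean: 1200, stdDev: 150 }); // true
```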
Data Governance
Multi-stage security pipeline with injection scanning, PII masking, and evidence lineage.
| Req | Requirement | Status | Implementation |
|---|---|---|---|
| D-1 | Schema validation | Met | Zod schemas enforce typed contracts for all tool inputs, governance surfaces, and API boundaries. |
| D-2 | Injection prevention | Met | Prompt injection scanner with 10+ detection patterns runs in the LLM gateway security pipeline with configurable enforcement modes. |
| D-3 | PII/PHI detection and masking | Met | Stage 2 DLP with PII detector and redaction engine processes all LLM responses before delivery. |
| D-4 | Output validation | Met | VeriScore confidence grading and retrieval adequacy evaluation score every AI-generated response against evidence. |
| D-5 | Data lineage | Met | Evidence graph with OSCAL control mappings, framework crosswalks, and GraphRAG link methods maintains full provenance. |
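For D-1, the Zod-based contracts can be pictured with a short sketch. The schema fields below are hypothetical; the parse-at-the-boundary pattern is the point.

```typescript
import { z } from "zod";

// Hypothetical tool-input schema; field names are illustrative only.
const DraftRequestSchema = z.object({
  workspaceId: z.string().uuid(),
  questionId: z.string().uuid(),
  maxEvidenceDocs: z.number().int().positive().max(50),
  tone: z.enum(["formal", "concise"]).default("formal"),
});

type DraftRequest = z.infer<typeof DraftRequestSchema>;

// Validate untrusted input at the boundary; parse() throws on schema violations.
function handleDraftRequest(raw: unknown): DraftRequest {
  return DraftRequestSchema.parse(raw);
}
```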
Segmentation
Relationship-based access control with policy-evaluated action boundaries.
| Req | Requirement | Status | Implementation |
|---|---|---|---|
| S-1 | Resource allowlist | Met | OpenFGA relationship-based authorization model with explicit tuple-based access grants per workspace. |
| S-2 | Action boundaries | Met | OPA policy engine evaluates 22+ governance decisions with configurable posture levels per workspace. |
| S-3 | Rate limiting | Met | Per-workspace rate limiting enforced at the MCP server layer with configurable thresholds. |
| S-4 | Transaction limits | Met | Billing enforcement with per-plan quotas for AI drafts, evidence documents, trust centers, and seats. |
| S-5 | Blast radius containment | Met | Four-level autonomy model constrains agent impact scope. Workspace isolation prevents cross-tenant data access. |
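As a simplified stand-in for the S-1 and S-2 checks, the sketch below combines a tuple-based grant with a per-agent action allowlist in plain TypeScript. It does not use the OpenFGA or OPA APIs and is illustrative only.

```typescript
// Simplified stand-in for relationship-based authorization plus an action allowlist.
// Illustrative only; the real checks go through OpenFGA and OPA.
interface RelationTuple {
  user: string;      // e.g. "agent:draft_pipeline"
  relation: string;  // e.g. "can_write"
  object: string;    // e.g. "workspace:acme/draft"
}

const grantedTuples: RelationTuple[] = [
  { user: "agent:draft_pipeline", relation: "can_write", object: "workspace:acme/draft" },
];

const actionAllowlist: Record<string, string[]> = {
  draft_pipeline: ["retrieve_evidence", "draft_answer"],
};

// Deny by default: the action must be on the agent's allowlist AND an explicit
// relationship tuple must grant access to the target resource.
function isPermitted(agent: string, action: string, request: RelationTuple): boolean {
  const actionAllowed = (actionAllowlist[agent] ?? []).includes(action);
  const tupleGranted = grantedTuples.some(
    (t) => t.user === request.user && t.relation === request.relation && t.object === request.object
  );
  return actionAllowed && tupleGranted;
}
```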
Incident Response
Circuit breakers, kill switches, and automatic demotion within one second.
| Req | Requirement | Status | Implementation |
|---|---|---|---|
| R-1 | Circuit breaker | Met | Redis-backed circuit breakers track consecutive failures per agent per workspace. Three failures trip the circuit, blocking execution until cooldown. |
| R-2 | Kill switch | Met | Workspace-level kill switch terminates all agent execution within one Redis TTL cycle. Requires admin role and is logged as a critical governance event. |
| R-3 | Session revocation | Met | Workspace secret rotation invalidates HMAC signing keys. OAuth tokens are scoped and individually revocable. |
| R-4 | State rollback | Met | Complete audit trail enables reconstruction of pre-incident state. Agent actions are traceable through structured input/output summaries. |
| R-5 | Graceful degradation | Met | Four-level autonomy model with automatic demotion on incidents. Circuit breaker trips demote to Level 1 (read-only). Kill switch demotes all agents. |
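A rough sketch of the R-1 pattern follows: counting consecutive failures in Redis and tripping a TTL-based cooldown. Key names, thresholds, and the cooldown window here are assumptions for illustration, not the actual breaker code.

```typescript
import Redis from "ioredis";

const redis = new Redis(); // connection config omitted for brevity

const FAILURE_THRESHOLD = 3;  // three consecutive failures trip the circuit
const COOLDOWN_SECONDS = 300; // execution stays blocked until this key expires

// Hypothetical key layout: per workspace, per agent type.
const failKey = (ws: string, agent: string) => `cb:failures:${ws}:${agent}`;
const openKey = (ws: string, agent: string) => `cb:open:${ws}:${agent}`;

export async function recordFailure(ws: string, agent: string): Promise<void> {
  const failures = await redis.incr(failKey(ws, agent));
  if (failures >= FAILURE_THRESHOLD) {
    // Trip the breaker: block further execution until the cooldown expires.
    await redis.set(openKey(ws, agent), "1", "EX", COOLDOWN_SECONDS);
    await redis.del(failKey(ws, agent));
  }
}

export async function recordSuccess(ws: string, agent: string): Promise<void> {
  await redis.del(failKey(ws, agent)); // consecutive-failure count resets on success
}

export async function isBlocked(ws: string, agent: string): Promise<boolean> {
  return (await redis.exists(openKey(ws, agent))) === 1;
}
```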
AI Governance Dashboard
Business and Enterprise workspaces include the AI Governance Dashboard, providing real-time visibility into agent health, quality trends, and compliance status.
- Agent health overview with per-agent autonomy level, circuit breaker state, and latest evaluation score
- Quality trend scoring that tracks response accuracy, evidence citation, and hallucination detection over 30 days
- Anomaly detection alerts with severity classification and resolution tracking
- ATF compliance status organized by element with real-time requirement satisfaction
- Searchable audit trail of recent agent executions with confidence scores and durations
Continuous agent evaluation
Every AI agent is scored against golden evaluation benchmarks on a recurring cadence. The evaluation engine runs natively in TypeScript with no external service dependencies.
Scoring criteria include response accuracy, evidence citation coverage, hallucination detection, tool trajectory correctness, safety compliance, latency budget adherence, and confidence calibration. Score declines greater than 10% trigger automated anomaly alerts.
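The 10% decline trigger can be pictured with a short sketch; the threshold constant and function name below are illustrative, not the evaluation engine's actual code.

```typescript
// Illustrative check for evaluation score regressions; names are hypothetical.
const DECLINE_THRESHOLD = 0.10; // alert on a >10% drop versus the prior score

function scoreDeclined(previousScore: number, currentScore: number): boolean {
  if (previousScore <= 0) return false; // nothing to compare against
  const decline = (previousScore - currentScore) / previousScore;
  return decline > DECLINE_THRESHOLD;
}

// Example: a drop from 0.92 to 0.80 is a ~13% decline and would raise an alert.
scoreDeclined(0.92, 0.80); // true
```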
OpenTelemetry instrumented
Every LLM call through VeriRFP's gateway emits OpenTelemetry spans following the gen_ai.* semantic conventions. Spans capture provider, model, token usage, latency, security scan results, and workspace correlation. Telemetry data feeds both the continuous evaluation engine and customer-accessible dashboards.
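A minimal sketch of such an instrumented call using the OpenTelemetry JS API is shown below. The span name, model value, and gateway client shape are assumptions, and the gen_ai.* attribute names follow the OpenTelemetry GenAI semantic conventions rather than VeriRFP's exact attribute set.

```typescript
import { trace } from "@opentelemetry/api";

// Hypothetical gateway client shape; VeriRFP's internal client is not shown here.
interface GatewayResponse {
  text: string;
  inputTokens: number;
  outputTokens: number;
}
type GatewayClient = (prompt: string) => Promise<GatewayResponse>;

const tracer = trace.getTracer("llm-gateway"); // tracer name is illustrative

// Wrap an LLM call in a span carrying gen_ai.* attributes.
async function tracedCompletion(client: GatewayClient, prompt: string): Promise<string> {
  return tracer.startActiveSpan("gen_ai.chat", async (span) => {
    span.setAttribute("gen_ai.operation.name", "chat");
    span.setAttribute("gen_ai.request.model", "gpt-4o"); // model name illustrative
    try {
      const response = await client(prompt);
      span.setAttribute("gen_ai.usage.input_tokens", response.inputTokens);
      span.setAttribute("gen_ai.usage.output_tokens", response.outputTokens);
      return response.text;
    } finally {
      span.end();
    }
  });
}
```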
Frequently asked questions
What is the CSA Agentic Trust Framework?
The Agentic Trust Framework (ATF) is a Zero Trust governance specification for AI agents published by the Cloud Security Alliance in February 2026. It defines 25 requirements across five elements: identity management, behavioral monitoring, data governance, segmentation, and incident response. ATF provides a structured way for organizations to evaluate whether AI agents operate within governed boundaries.
Is VeriRFP ATF certified?
VeriRFP implements ATF at the Compatible (self-attestation) level. The CSA does not currently offer formal ATF certification programs. Our conformance matrix on this page maps each of the 25 requirements to specific implementations in the VeriRFP codebase, providing auditable evidence for security teams evaluating our platform.
Which AI agents does VeriRFP govern?
VeriRFP governs 20 specialized agent types including draft pipeline, autopilot, confidence gate, connector sync, compliance heartbeat, smart requestionnaire, quality check, retrieval evaluator, trust center chatbot, crosswalk adapter, and others. Each agent has a typed capability manifest declaring its purpose, permitted actions, data access, and maximum impact scope.
What happens when an AI agent fails?
VeriRFP implements a three-layer response. First, the circuit breaker tracks consecutive failures and automatically blocks the agent after three failures, entering a cooldown period. Second, administrators can activate a workspace-level kill switch that immediately halts all agent execution. Third, the four-level autonomy model automatically demotes agents to read-only mode on critical incidents, requiring manual promotion after review.
Can I control AI agent autonomy levels?
Yes. VeriRFP implements a four-level autonomy model aligned to the ATF maturity framework: Level 1 (Intern) is read-only, Level 2 (Junior) can recommend with human approval, Level 3 (Senior) acts autonomously with guardrails, and Level 4 (Principal) is fully autonomous within its domain. Workspace administrators can set autonomy levels per agent type, and the system automatically demotes agents on incidents.
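For illustration, the four levels and incident-driven demotion could be modeled roughly as in the sketch below; the enum and demotion rule are a simplified assumption, not VeriRFP's exact types.

```typescript
// Illustrative model of the four autonomy levels and incident-driven demotion.
enum AutonomyLevel {
  Intern = 1,    // read-only
  Junior = 2,    // recommends, requires human approval
  Senior = 3,    // acts autonomously with guardrails
  Principal = 4, // fully autonomous within its domain
}

// On a critical incident, drop the agent back to read-only until a human
// reviews and manually promotes it again.
function demoteOnCriticalIncident(current: AutonomyLevel): AutonomyLevel {
  return current > AutonomyLevel.Intern ? AutonomyLevel.Intern : current;
}
```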
How does behavioral anomaly detection work?
VeriRFP computes rolling 30-day behavioral baselines per agent type per workspace, tracking metrics including average duration, error rate, request frequency, and p95 latency. When an agent's recent behavior deviates more than 2 standard deviations from its baseline, an anomaly alert is generated. The system processes baselines hourly for active workspaces and emits a system event the moment a deviation is detected.