Beyond CAEP: Building Continuous Access Evaluation That Scales
CAEP defines the protocol, but building continuous access evaluation that works at enterprise scale requires more — event correlation, decision caching, and graceful degradation. Lessons from processing 10 billion access decisions.
In January, we published an introduction to CAEP and the Shared Signals Framework. The response was overwhelming — but the most common follow-up question was: "How do you actually build this at scale?"
CAEP (the Continuous Access Evaluation Profile, also widely called the Continuous Access Evaluation Protocol) defines the security event types, such as session revocation and credential change, that trigger session re-evaluation. The Shared Signals Framework (SSF) standardizes how those events are delivered between systems. Together, they're the foundation.
But the protocol alone doesn't solve the engineering challenges of evaluating millions of access decisions per second, correlating signals from dozens of sources, and maintaining sub-50ms latency at 99.99% availability. This article covers what we've learned building TigerIdentity's evaluation engine to handle 10 billion+ access decisions.
What CAEP Doesn't Cover
CAEP is a specification, not an architecture. Together with the Shared Signals Framework, it defines how security events are exchanged between systems, but it leaves the hard engineering problems to the implementer. Here are the four challenges we had to solve beyond the spec:
Event Correlation Across Sources
A single identity threat involves signals from multiple sources — your IdP reports a suspicious login, your EDR flags malware, your SIEM detects data exfiltration. Each signal alone might be low-confidence. Together, they're a confirmed incident.
Challenge: Correlating events across time windows, matching identities across systems with different naming conventions, and scoring multi-source signals without generating false positives.
Decision Caching and Consistency
What happens when the policy engine is unreachable for 500ms? Do you fail open (a security risk) or fail closed (an availability risk)? Caching decisions helps, but cached decisions go stale, especially during active threats.
Challenge: Cache invalidation when policies or context change. Tiered caching strategies based on resource sensitivity. Consistency guarantees across distributed decision points.
Graceful Degradation
Not all resources need the same evaluation rigor. A request to view a public wiki page can use a cached decision. A request to delete production data must be evaluated in real time, every time.
Challenge: Defining degradation tiers, implementing circuit breakers whose fallback behavior depends on resource sensitivity, and ensuring degraded mode doesn't become a security bypass.
Multi-Tenant Isolation
One tenant's event storm shouldn't degrade another tenant's evaluation latency. A security incident at Company A — generating thousands of events per second — must not affect Company B's access decisions.
Challenge: Per-tenant rate limiting, resource isolation, and priority queuing without over-provisioning.
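One common way to keep a single tenant's event storm from consuming shared capacity is a token bucket per tenant. The sketch below is illustrative only: the class names, the rate, and the capacity are assumptions, not TigerIdentity's actual limiter.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity` tokens."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class TenantLimiter:
    """One bucket per tenant, so tenants cannot spend each other's budget."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, tenant_id, now=None):
        bucket = self.buckets.get(tenant_id)
        if bucket is None:
            bucket = self.buckets[tenant_id] = TokenBucket(self.rate, self.capacity, now)
        return bucket.allow(now)
```

Events rejected by `allow` would typically be queued or shed rather than dropped silently. That policy is a separate design decision and is not shown here.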
Architecture for Scale
TigerIdentity's evaluation architecture is designed around four key layers, each optimized for its specific role in the decision pipeline:
Event Ingestion Layer
NATS JetStream provides the messaging backbone with at-least-once delivery guarantees. Events from IdPs, EDR agents, SIEMs, and HR systems flow into topic-based streams with per-tenant partitioning.
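At-least-once delivery means a consumer can receive the same event more than once, so event handlers need to be idempotent. The sketch below shows one way to do that with a bounded set of recently seen event IDs. The `events.<tenant>.<source>` subject scheme, the event field names, and the `Deduplicator` class are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class Deduplicator:
    """Remembers the most recent `max_ids` keys so redelivered events are skipped."""

    def __init__(self, max_ids=10000):
        self.max_ids = max_ids
        self._seen = OrderedDict()

    def first_time(self, key):
        if key in self._seen:
            self._seen.move_to_end(key)
            return False
        self._seen[key] = None
        if len(self._seen) > self.max_ids:
            self._seen.popitem(last=False)  # evict the oldest key
        return True


def subject_for(tenant_id, source):
    # Hypothetical per-tenant subject naming; NATS subjects are dot-separated tokens.
    return f"events.{tenant_id}.{source}"


def handle(events, dedup, apply):
    """Apply each event at most once; returns how many events were applied."""
    applied = 0
    for event in events:
        key = (event["tenant"], event["id"])  # IDs are scoped per tenant
        if dedup.first_time(key):
            apply(event)
            applied += 1
    return applied
```

Scoping the key by tenant keeps two tenants with colliding event IDs from suppressing each other's events.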
Decision Engine
Policies are compiled from a YAML DSL to executable rules at deploy time rather than interpreted at evaluation time. The engine runs entirely in memory with pre-indexed policy lookups. Cache hits resolve in under 5ms.
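To make "compiled, not interpreted" concrete, here is a minimal sketch that turns already-parsed policy dictionaries into predicates once, then indexes them by resource and action. The policy schema (`resource`, `action`, `effect`, `when`), the operator names, and the deny-overrides rule are all assumptions; the actual TigerIdentity DSL is not shown in this article.

```python
import operator

OPS = {"eq": operator.eq, "lt": operator.lt, "gte": operator.ge,
       "in": lambda value, allowed: value in allowed}

# Hypothetical policies, as they might look after YAML parsing.
POLICIES = [
    {"resource": "repo", "action": "read", "effect": "allow",
     "when": [{"attr": "team", "op": "in", "value": ["eng", "sec"]}]},
    {"resource": "repo", "action": "read", "effect": "deny",
     "when": [{"attr": "risk", "op": "gte", "value": 70}]},
]


def compile_policy(policy):
    """Resolve operators once, returning a predicate over a context dict."""
    checks = [(c["attr"], OPS[c["op"]], c["value"]) for c in policy["when"]]

    def predicate(context):
        return all(attr in context and op(context[attr], value)
                   for attr, op, value in checks)

    return predicate


def build_index(policies):
    """Group compiled policies by (resource, action) for direct lookup."""
    index = {}
    for policy in policies:
        key = (policy["resource"], policy["action"])
        index.setdefault(key, []).append((policy["effect"], compile_policy(policy)))
    return index


def evaluate(index, resource, action, context):
    """Deny overrides allow; no matching policy means deny."""
    decision = "deny"
    for effect, predicate in index.get((resource, action), []):
        if predicate(context):
            if effect == "deny":
                return "deny"
            decision = "allow"
    return decision
```

Calling `build_index(POLICIES)` at deploy time means each request only pays for a dictionary lookup plus the predicates registered for that resource and action.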
Signal Correlation
Sliding window analysis correlates signals from multiple sources into composite risk scores. A suspicious login alone might score 30. Add a concurrent EDR alert and it scores 85. Requiring agreement across sources reduces false positives.
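The scoring formula itself is not given here, so the following is only one plausible shape for it. It keeps the strongest signal per source inside the window, applies exponential decay by age, and sums across sources. The base scores (30 for a suspicious login, 55 for an EDR malware alert) and the 150-second half-life are assumptions chosen so the example reproduces the 30 and 85 figures above.

```python
import math

WINDOW_S = 300.0       # 5-minute sliding window
HALF_LIFE_S = 150.0    # assumed decay half-life
BASE_SCORES = {        # assumed per-signal base scores
    ("idp", "suspicious_login"): 30.0,
    ("edr", "malware_alert"): 55.0,
}


def composite_risk(signals, now):
    """signals: iterable of (source, kind, timestamp). Returns a score in 0..100."""
    strongest = {}
    for source, kind, ts in signals:
        age = now - ts
        if age < 0 or age > WINDOW_S:
            continue  # outside the sliding window
        base = BASE_SCORES.get((source, kind), 0.0)
        decayed = base * math.pow(0.5, age / HALF_LIFE_S)
        # Count each source once, so a noisy source cannot inflate the score alone.
        strongest[source] = max(strongest.get(source, 0.0), decayed)
    return min(100.0, sum(strongest.values(), 0.0))
```

Taking the maximum per source, rather than summing every event, is what makes three repeated login alerts score the same as one while a second, independent source raises the total.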
Session Management
A distributed session store backed by Redis, with push-based revocation. When a session is revoked, CAEP events are published to all connected relying parties, with a propagation target of 2 seconds. Relying parties never need to poll.
Patterns That Work
Tiered Evaluation
Not every access decision needs real-time policy evaluation. We tier requests by resource sensitivity:
- Tier 1 (public/internal): Cached decisions, 5ms response, re-evaluated when the cache TTL expires (5 minutes for public, 2 minutes for internal)
- Tier 2 (confidential): Real-time evaluation, <50ms response, continuous monitoring
- Tier 3 (restricted): Real-time evaluation + step-up auth, full context check every request
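The tiers above can be sketched as a cache whose TTL depends on the resource's sensitivity label. The TTL values mirror the configuration later in this article; the `DecisionCache` class, its method names, and the injected `evaluate` callback are hypothetical.

```python
import time

# Seconds a decision may be reused, by sensitivity. 0 means never cached.
TTL_BY_SENSITIVITY = {"public": 300, "internal": 120, "confidential": 0, "restricted": 0}


class DecisionCache:
    def __init__(self, evaluate, clock=time.monotonic):
        self._evaluate = evaluate  # (subject, resource) -> decision
        self._clock = clock
        self._entries = {}         # (subject, resource) -> (decision, expires_at)

    def decide(self, subject, resource, sensitivity):
        # Unknown labels get TTL 0, i.e. they are treated as most sensitive.
        ttl = TTL_BY_SENSITIVITY.get(sensitivity, 0)
        key = (subject, resource)
        now = self._clock()
        if ttl > 0:
            hit = self._entries.get(key)
            if hit is not None and now < hit[1]:
                return hit[0]
        decision = self._evaluate(subject, resource)
        if ttl > 0:
            self._entries[key] = (decision, now + ttl)
        return decision

    def invalidate_subject(self, subject):
        """Drop every cached decision for a subject, e.g. after a risk change."""
        for key in [k for k in self._entries if k[0] == subject]:
            del self._entries[key]
```

`invalidate_subject` is the hook a CAEP event handler would call, so that a cached Tier 1 decision does not outlive a change in the user's context.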
Pre-Computed Access Grants
For predictable access patterns, we evaluate policies ahead of access requests. When a user's context changes (team assignment, risk score update), we recompute their access grants for commonly accessed resources and cache the results.
This turns most access checks into cache lookups — sub-5ms with no policy evaluation overhead.
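A minimal sketch of the pre-computation idea, assuming a hypothetical `GrantStore` that is told which resources each user commonly accesses. When the user's context changes, the store discards that user's old grants and recomputes them. Anything outside the pre-computed set falls back to live evaluation.

```python
class GrantStore:
    def __init__(self, evaluate, hot_resources):
        self._evaluate = evaluate   # (context, resource) -> bool
        self._hot = hot_resources   # user -> list of commonly accessed resources
        self._grants = {}           # (user, resource) -> bool

    def on_context_change(self, user, context):
        """Recompute grants when team assignment, risk score, etc. change."""
        # Drop the old grants first so no stale entry survives the change.
        for key in [k for k in self._grants if k[0] == user]:
            del self._grants[key]
        for resource in self._hot.get(user, []):
            self._grants[(user, resource)] = self._evaluate(context, resource)

    def check(self, user, context, resource):
        cached = self._grants.get((user, resource))
        if cached is not None:
            return cached                          # pre-computed: plain lookup
        return self._evaluate(context, resource)   # cold path: evaluate live
```

The trade-off is write amplification: every context change triggers one evaluation per commonly accessed resource, which only pays off when reads far outnumber context changes.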
Circuit Breakers
If a signal source goes down (EDR agent unreachable, IdP timeout), the circuit breaker opens. Access decisions continue with degraded context, and the fallback depends on resource sensitivity: Tier 1 resources use cached decisions, while Tier 3 resources fail closed until the signal source recovers.
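The behavior described above might look like the following sketch. The threshold of 5 failures, the 30-second recovery timeout, and the fallback names come from the configuration later in this article. The `CircuitBreaker` class, the `decide` function, and the choice to deny when no cached decision exists are assumptions. Alerting for Tier 2 is omitted.

```python
import time

FALLBACK_BY_TIER = {1: "use_cached_decision", 2: "use_cached_with_alert", 3: "fail_closed"}


class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def is_open(self):
        if self._opened_at is None:
            return False
        # After the recovery timeout, let a probe request through (half-open).
        return self._clock() - self._opened_at < self.recovery_timeout

    def record_success(self):
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


def decide(breaker, tier, fetch_signal, evaluate, cached_decision):
    """Evaluate with fresh signals when possible, else degrade by tier."""
    if not breaker.is_open():
        try:
            signal = fetch_signal()
        except (ConnectionError, TimeoutError):
            breaker.record_failure()
        else:
            breaker.record_success()
            return evaluate(signal)
    # Degraded path: the signal source is unavailable for this request.
    if FALLBACK_BY_TIER[tier] == "fail_closed" or cached_decision is None:
        return "deny"
    return cached_decision
```

Denying when there is no cached decision keeps degraded mode from turning into a bypass: the fallback can only repeat a decision that was already made with full context.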
Eventual Consistency for Analytics
Access decisions must be strongly consistent. Analytics can be eventually consistent. We use PostgreSQL (OLTP) for real-time decision state and ClickHouse (OLAP) for analytics, compliance reports, and trend analysis. Events replicate from PostgreSQL to ClickHouse asynchronously.
Event Processing Pipeline
# TigerIdentity evaluation engine configuration
evaluation:
  engine:
    compiled_policies: true
    cache:
      enabled: true
      ttl_by_sensitivity:
        public: 300s        # 5 minutes
        internal: 120s      # 2 minutes
        confidential: 0s    # No caching, always real-time
        restricted: 0s      # No caching, always real-time
      max_entries: 100000
  signal_correlation:
    window: 300s            # 5-minute sliding window
    decay: exponential
    min_confidence: 0.7
    sources:
      - idp                 # Identity provider events
      - edr                 # Endpoint detection
      - siem                # Security events
      - hr_system           # Employment status changes
      - device_mgmt         # Device posture signals
  degradation:
    circuit_breaker:
      failure_threshold: 5
      recovery_timeout: 30s
    fallback_by_tier:
      tier_1: use_cached_decision
      tier_2: use_cached_with_alert
      tier_3: fail_closed
  messaging:
    provider: nats_jetstream
    delivery: at_least_once
    partitioning: per_tenant
    retention: 7d
  session_revocation:
    method: caep_push
    propagation_target: 2s
    storage: redis_cluster
    audit_backend: clickhouse
Build on Proven Scale
TigerIdentity's evaluation engine has processed 10 billion+ access decisions at sub-50ms latency. Deploy continuous access evaluation that scales with your enterprise.