Pre-execution enforcement for AI agents

Stop dangerous
agent actions
before they run.

AI agents can delete infrastructure, modify IAM roles, and exfiltrate data — in seconds. Agent Sentinel intercepts every action before AWS is called. Not after.

See it in action → View on GitHub

The problem

Agents act faster than
humans can review.

Existing safety mechanisms live in the system prompt. They fail when an agent is prompt-injected, misconfigured, or simply wrong. Observability tools only tell you what went wrong after the damage is done.

// current architecture

Agent decides → AWS API called → action executed → damage done → you find out in the logs.

// with agent sentinel

Agent decides → Sentinel intercepts → policy evaluated → risk scored → decision enforced → AWS called only if safe.

// system prompt safety fails

A capable agent can reason around any system prompt instruction. It cannot reason around an architectural enforcement boundary.

// the dangerous cases

Rarely single API calls. Usually sequences: modify IAM → export data → delete logs. Sentinel detects the chain, not just individual actions.
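That enforcement boundary can be sketched as a client-side guard. Everything here is illustrative — `guarded_call`, the `evaluate` callable, and the verdict field names are assumptions standing in for the real Sentinel API, not its actual schema:

```python
# Sketch of the enforcement boundary: every agent action passes through
# Sentinel before the AWS SDK is touched. `evaluate` stands in for the
# Sentinel API call; the field names are assumptions, not the real schema.

def guarded_call(evaluate, request, execute):
    """Run `execute` only if Sentinel's verdict on `request` is ALLOW."""
    verdict = evaluate(request)
    if verdict["decision"] != "ALLOW":
        # BLOCK or HUMAN_REQUIRED: AWS is never called.
        raise PermissionError(f"Sentinel {verdict['decision']}: {verdict['reason']}")
    return execute()


# Stubbed verdict logic, standing in for the live Sentinel service.
def stub_evaluate(request):
    if request["action"] == "delete" and request["environment"] == "prod":
        return {"decision": "BLOCK", "reason": "Risk score exceeds critical threshold"}
    return {"decision": "ALLOW", "reason": "Action within acceptable parameters"}
```

The point of the pattern is that the dangerous call sits behind the guard: unless the verdict is ALLOW, the `execute` callable is never invoked, so there is nothing for the agent to reason around.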

Watch Sentinel decide
in real time.

These are live responses from the Sentinel API running on AWS right now.

terraform-agent → prod
# agent tries to delete production S3 bucket

{
  "action": "delete",
  "resource": "prod-s3-bucket",
  "environment": "prod"
}

→ Sentinel evaluates...

{
  "decision": "BLOCK",
  "risk_score": 1.0,
  "reason": "Risk score exceeds critical threshold"
}

# AWS was never called.
# The bucket is safe.

monitoring-agent → dev
# agent reads dev logs — routine operation

{
  "action": "read",
  "resource": "dev-logs",
  "environment": "dev"
}

→ Sentinel evaluates...

{
  "decision": "ALLOW",
  "risk_score": 0.05,
  "reason": "Action within acceptable parameters"
}

# Sentinel gets out of the way
# for safe actions.

ALLOW
Risk score < 0.4 — safe to execute
HUMAN REQUIRED
Risk score 0.4–0.75 — needs approval
BLOCK
Risk score > 0.75 — rejected instantly
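The three bands reduce to a single threshold function. The cutoffs are the ones shown above; the function name is illustrative:

```python
def decide(risk_score: float) -> str:
    """Map a 0.0-1.0 risk score onto Sentinel's three decision bands."""
    if risk_score > 0.75:
        return "BLOCK"           # rejected instantly
    if risk_score >= 0.4:
        return "HUMAN_REQUIRED"  # needs human approval
    return "ALLOW"               # safe to execute
```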

How it works

Three-layer enforcement.
Every decision auditable.

LAYER 01

Policy Engine

Explicit versioned rules compiled from natural language instructions. Hard stops for actions that violate policy — regardless of risk score. Deterministic, testable, explainable.
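The hard-stop semantics can be sketched as a rule match that short-circuits before any risk score is computed. The rule shape below is an assumption for illustration; the real store is versioned, DynamoDB-backed, and hash-verified:

```python
# Hypothetical rule shape -- only the deterministic hard-stop semantics
# are shown here, not the real policy store format.
POLICY_RULES = [
    {"action": "delete", "environment": "prod", "effect": "BLOCK"},
]

def policy_check(request, rules=POLICY_RULES):
    """Return a hard-stop decision if a rule matches, else None (defer to risk scoring)."""
    for rule in rules:
        conditions = {k: v for k, v in rule.items() if k != "effect"}
        if all(request.get(k) == v for k, v in conditions.items()):
            return rule["effect"]  # hard stop: the risk score is never consulted
    return None
```

Because the check is a plain match against explicit rules, the same request always yields the same decision — which is what makes the layer testable and explainable.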

LAYER 02

Risk Scoring

Every action scored 0.0 to 1.0 based on action type, environment sensitivity, and resource criticality. Delete in prod scores 1.0. Read in dev scores 0.05. Context matters.
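A weighted sum is one way such a score could be composed. The weights and lookup tables below are purely illustrative — chosen so that the two examples quoted in the text (delete in prod → 1.0, read in dev → 0.05) fall out of the formula — and are not Sentinel's actual model:

```python
# Illustrative weights only: picked so the two examples in the text
# (delete in prod scores 1.0, read in dev scores 0.05) reproduce exactly.
ACTION_SEVERITY = {"read": 0.0, "write": 0.5, "modify": 0.8, "delete": 1.0}
ENV_SENSITIVITY = {"dev": 0.1, "staging": 0.5, "prod": 1.0}

def risk_score(action, environment, resource_criticality):
    """Weighted 0.0-1.0 score from action type, environment, and resource criticality."""
    score = (0.6 * ACTION_SEVERITY.get(action, 0.5)
             + 0.3 * ENV_SENSITIVITY.get(environment, 0.5)
             + 0.1 * resource_criticality)
    return round(min(score, 1.0), 2)

risk_score("delete", "prod", 1.0)  # -> 1.0
risk_score("read", "dev", 0.2)     # -> 0.05
```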

LAYER 03

Sequence Analysis

Tracks multi-step agent behavior across a session. Individually safe actions that form a dangerous chain — modify IAM → export data → delete logs — are caught and escalated.
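One way to catch such a chain is to keep a sliding window of the session's actions and test for the known-dangerous subsequence. The `(action, resource-class)` pattern representation and the window size here are assumptions, not the real detector:

```python
from collections import deque

# One known-dangerous chain from the text: modify IAM -> export data -> delete logs.
# Representing chains as ordered (action, resource-class) pairs is an assumption.
DANGEROUS_CHAIN = [("modify", "iam"), ("export", "data"), ("delete", "logs")]

class SessionTracker:
    def __init__(self, window=10):
        self.history = deque(maxlen=window)  # recent actions in this agent session

    def observe(self, action, resource_class):
        """Record an action; return True if a dangerous chain has now completed."""
        self.history.append((action, resource_class))
        return self._contains(DANGEROUS_CHAIN)

    def _contains(self, chain):
        # The chain must appear in order, but unrelated actions may be interleaved.
        it = iter(self.history)
        return all(step in it for step in chain)
```

Note that each step may look safe on its own — only the completed sequence trips the detector, which is why this layer escalates where per-action scoring would not.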

Project status

In active development.
Open for design partners.

Component                  Status        Notes
Action evaluation API      Live          Running on AWS, structured ALLOW/BLOCK/HUMAN_REQUIRED response
Risk scoring engine        Live          Weighted 0.0–1.0 score per action
Policy store (versioned)   Live          DynamoDB-backed, idempotent, hash-verified
Audit log                  Live          Every decision logged with action ID and timestamp
LLM policy compiler        In progress   Natural language → structured policy via Bedrock
Sequence analysis          In progress   Multi-step chain detection across agent sessions
Human approval webhook     Planned       Slack / dashboard notification on HUMAN_REQUIRED
Sentinel SDK               Planned       LangChain and LlamaIndex wrapper

Get involved

Building agents that touch real infrastructure?

We are looking for design partners to test Agent Sentinel on real agent workloads. Early access is free.

Get in touch → View on GitHub