Pre-execution enforcement for AI agents

Stop dangerous
agent actions
before they run.

AI agents can delete infrastructure, modify IAM roles, and exfiltrate data — in seconds. Agent Sentinel intercepts every action before AWS is called. Not after.

See it in action → View on GitHub

The problem

Agents act faster than
humans can review.

Existing safety mechanisms live in the system prompt. They fail when an agent is prompt-injected, misconfigured, or simply wrong. Observability tools only tell you what went wrong after the damage is done.

// current architecture

Agent decides → AWS API called → action executed → damage done → you find out in the logs.

// with agent sentinel

Agent decides → Sentinel intercepts → policy evaluated → risk scored → decision enforced → AWS called only if safe.

// system prompt safety fails

A capable agent can reason around any system prompt instruction. It cannot reason around an architectural enforcement boundary.

// the dangerous cases

Rarely single API calls. Usually sequences: modify IAM → export data → delete logs. Sentinel detects the chain, not just individual actions.
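That enforcement boundary can be sketched as a client-side guard. Everything here is illustrative — `guarded_call`, the `evaluate` callable, and the verdict field names are assumptions standing in for the real Sentinel API, not its actual schema:

```python
# Sketch of the enforcement boundary: every agent action passes through
# Sentinel before the AWS SDK is touched. `evaluate` stands in for the
# Sentinel API call; the field names are assumptions, not the real schema.

def guarded_call(evaluate, request, execute):
    """Run `execute` only if Sentinel's verdict on `request` is ALLOW."""
    verdict = evaluate(request)
    if verdict["decision"] != "ALLOW":
        # BLOCK or HUMAN_REQUIRED: AWS is never called.
        raise PermissionError(f"Sentinel {verdict['decision']}: {verdict['reason']}")
    return execute()


# Stubbed verdict logic, standing in for the live Sentinel service.
def stub_evaluate(request):
    if request["action"] == "delete" and request["environment"] == "prod":
        return {"decision": "BLOCK", "reason": "Risk score exceeds critical threshold"}
    return {"decision": "ALLOW", "reason": "Action within acceptable parameters"}
```

The point of the pattern is that the dangerous call sits behind the guard: unless the verdict is ALLOW, the `execute` callable is never invoked, so there is nothing for the agent to reason around.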

Watch Sentinel decide
in real time.

These are live responses from the Sentinel API running on AWS right now.

terraform-agent → prod
# agent tries to delete production S3 bucket

{
  "action": "delete",
  "resource": "prod-s3-bucket",
  "environment": "prod"
}

→ Sentinel evaluates...

{
  "decision": "BLOCK",
  "risk_score": 1.0,
  "reason": "Risk score exceeds critical threshold"
}

# AWS was never called.
# The bucket is safe.

monitoring-agent → dev
# agent reads dev logs — routine operation

{
  "action": "read",
  "resource": "dev-logs",
  "environment": "dev"
}

→ Sentinel evaluates...

{
  "decision": "ALLOW",
  "risk_score": 0.05,
  "reason": "Action within acceptable parameters"
}

# Sentinel gets out of the way
# for safe actions.

ALLOW
Risk score < 0.4 — safe to execute
HUMAN REQUIRED
Risk score 0.4–0.75 — needs approval
BLOCK
Risk score > 0.75 — rejected instantly
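The three bands reduce to a single threshold function. The cutoffs are the ones shown above; the function name is illustrative:

```python
def decide(risk_score: float) -> str:
    """Map a 0.0-1.0 risk score onto Sentinel's three decision bands."""
    if risk_score > 0.75:
        return "BLOCK"           # rejected instantly
    if risk_score >= 0.4:
        return "HUMAN_REQUIRED"  # needs human approval
    return "ALLOW"               # safe to execute
```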

How it works

Three-layer enforcement.
Every decision auditable.

LAYER 01

Policy Engine

Explicit versioned rules compiled from natural language instructions. Hard stops for actions that violate policy — regardless of risk score. Deterministic, testable, explainable.
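The hard-stop semantics can be sketched as a rule match that short-circuits before any risk score is computed. The rule shape below is an assumption for illustration; the real store is versioned, DynamoDB-backed, and hash-verified:

```python
# Hypothetical rule shape -- only the deterministic hard-stop semantics
# are shown here, not the real policy store format.
POLICY_RULES = [
    {"action": "delete", "environment": "prod", "effect": "BLOCK"},
]

def policy_check(request, rules=POLICY_RULES):
    """Return a hard-stop decision if a rule matches, else None (defer to risk scoring)."""
    for rule in rules:
        conditions = {k: v for k, v in rule.items() if k != "effect"}
        if all(request.get(k) == v for k, v in conditions.items()):
            return rule["effect"]  # hard stop: the risk score is never consulted
    return None
```

Because the check is a plain match against explicit rules, the same request always yields the same decision — which is what makes the layer testable and explainable.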

LAYER 02

Risk Scoring

Every action scored 0.0 to 1.0 based on action type, environment sensitivity, and resource criticality. Delete in prod scores 1.0. Read in dev scores 0.05. Context matters.
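A weighted sum is one way such a score could be composed. The weights and lookup tables below are purely illustrative — chosen so that the two examples quoted in the text (delete in prod → 1.0, read in dev → 0.05) fall out of the formula — and are not Sentinel's actual model:

```python
# Illustrative weights only: picked so the two examples in the text
# (delete in prod scores 1.0, read in dev scores 0.05) reproduce exactly.
ACTION_SEVERITY = {"read": 0.0, "write": 0.5, "modify": 0.8, "delete": 1.0}
ENV_SENSITIVITY = {"dev": 0.1, "staging": 0.5, "prod": 1.0}

def risk_score(action, environment, resource_criticality):
    """Weighted 0.0-1.0 score from action type, environment, and resource criticality."""
    score = (0.6 * ACTION_SEVERITY.get(action, 0.5)
             + 0.3 * ENV_SENSITIVITY.get(environment, 0.5)
             + 0.1 * resource_criticality)
    return round(min(score, 1.0), 2)

risk_score("delete", "prod", 1.0)  # -> 1.0
risk_score("read", "dev", 0.2)     # -> 0.05
```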

LAYER 03

Sequence Analysis

Tracks multi-step agent behavior across a session. Individually safe actions that form a dangerous chain — modify IAM → export data → delete logs — are caught and escalated.
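One way to catch such a chain is to keep a sliding window of the session's actions and test for the known-dangerous subsequence. The `(action, resource-class)` pattern representation and the window size here are assumptions, not the real detector:

```python
from collections import deque

# One known-dangerous chain from the text: modify IAM -> export data -> delete logs.
# Representing chains as ordered (action, resource-class) pairs is an assumption.
DANGEROUS_CHAIN = [("modify", "iam"), ("export", "data"), ("delete", "logs")]

class SessionTracker:
    def __init__(self, window=10):
        self.history = deque(maxlen=window)  # recent actions in this agent session

    def observe(self, action, resource_class):
        """Record an action; return True if a dangerous chain has now completed."""
        self.history.append((action, resource_class))
        return self._contains(DANGEROUS_CHAIN)

    def _contains(self, chain):
        # The chain must appear in order, but unrelated actions may be interleaved.
        it = iter(self.history)
        return all(step in it for step in chain)
```

Note that each step may look safe on its own — only the completed sequence trips the detector, which is why this layer escalates where per-action scoring would not.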

Project status

In active development.
Open for design partners.

Component                  Status        Notes
Action evaluation API      Live          Running on AWS, structured ALLOW/BLOCK/HUMAN_REQUIRED response
Risk scoring engine        Live          Weighted 0.0–1.0 score per action
Policy store (versioned)   Live          DynamoDB-backed, idempotent, hash-verified
Audit log                  Live          Every decision logged with action ID and timestamp
LLM policy compiler        In progress   Natural language → structured policy via Bedrock
Sequence analysis          In progress   Multi-step chain detection across agent sessions
Human approval webhook     Planned       Slack / dashboard notification on HUMAN_REQUIRED
Sentinel SDK               Planned       LangChain and LlamaIndex wrapper

Get involved

Building agents that touch real infrastructure?

We are looking for design partners to test Agent Sentinel on real agent workloads. Early access is free.

Get in touch → View on GitHub