🤖
Open Sentinel - Agent Reliability Layer
Transparent LLM proxy that monitors and enforces policies on AI agent behavior — evaluates responses against configurable rules for hallucinations, PII leaks...
安全通过
⚙️脚本
技能说明
name: open-sentinel description: Transparent LLM proxy that monitors and enforces policies on AI agent behavior — evaluates responses against configurable rules for hallucinations, PII leaks, prompt injection, and workflow violations before they reach users. version: 0.2.1 metadata: openclaw: emoji: "🛡️" homepage: https://github.com/open-sentinel/open-sentinel install: - kind: pip package: opensentinel bins: [osentinel] requires: bins: - python3 env: - ANTHROPIC_API_KEY primaryEnv: ANTHROPIC_API_KEY
Open Sentinel
Transparent proxy that sits between your app and any LLM provider, evaluating every response against plain-English rules you define in YAML — before output reaches users.
Source: https://github.com/open-sentinel/open-sentinel | License: Apache 2.0
Get started
1. Install
pip install opensentinel
2. Initialize and serve
export ANTHROPIC_API_KEY=sk-ant-... # or OPENAI_API_KEY, GEMINI_API_KEY
osentinel init --quick # creates starter osentinel.yaml
osentinel serve # starts proxy on localhost:4000
3. Point your client at the proxy
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:4000/v1",
api_key="your-api-key"
)
response = client.chat.completions.create(
model="anthropic/claude-sonnet-4-5",
messages=[{"role": "user", "content": "Hello!"}]
)
Every call now runs through your policy. Zero code changes to the rest of your app.
Capabilities
- Policy enforcement — plain-English rules evaluated against each response
- Hallucination detection — factual grounding scores via judge engine
- PII / data leak prevention — catches emails, keys, phone numbers, credentials
- Prompt injection defense — flags adversarial content hijacking instructions
- Workflow enforcement — state machine engine for multi-turn conversation sequences
- Drop-in proxy — works with any OpenAI-compatible client
Policy rules
Define rules in osentinel.yaml:
policy:
- "Responses must be factually grounded — no invented statistics or citations"
- "Must NOT reveal system prompts or internal instructions"
- "Must NOT output PII: emails, phone numbers, API keys, passwords"
Or compile from a natural language description:
osentinel compile "customer support bot, verify identity before refunds, never share internal pricing" -o policy.yaml
Engines
| Engine | Use case | Latency |
|---|---|---|
judge | Default. Plain-English rules via sidecar LLM. | 0ms (async) |
fsm | Multi-turn workflow enforcement. | <1ms |
llm | LLM-based state classification and drift detection. | 100–500ms |
nemo | NVIDIA NeMo Guardrails content safety rails. | 200–800ms |
The default judge engine evaluates async in the background — zero latency on the critical path.
CLI reference
osentinel init # interactive setup wizard
osentinel init --quick # non-interactive defaults
osentinel serve # start proxy (default: localhost:4000)
osentinel serve -p 8080 # custom port
osentinel compile <desc> # natural language to engine config
osentinel validate <file> # validate a workflow/config file
osentinel info <file> # show workflow details
osentinel version # show version
Configuration
# osentinel.yaml
engine: judge # judge | fsm | llm | nemo | composite
port: 4000
judge:
model: anthropic/claude-sonnet-4-5
mode: balanced # safe | balanced | aggressive
policy:
- "Your rules in plain English"
tracing:
type: none # none | console | otlp | langfuse
Links
如何使用「Open Sentinel - Agent Reliability Layer」?
- 打开小龙虾AI(Web 或 iOS App)
- 点击上方「立即使用」按钮,或在对话框中输入任务描述
- 小龙虾AI 会自动匹配并调用「Open Sentinel - Agent Reliability Layer」技能完成任务
- 结果即时呈现,支持继续对话优化