Ship AI agents with a security harness, not hope.
Ultra13 gives engineering teams offensive agent testing, context-aware runtime controls, and regression validation for agents that use RAG, memory, MCP, APIs, code execution, browsers, and internal tools.
Enforcement where it actually matters.
The firewall is not a wrapper around the prompt. It sits at every boundary the agent crosses.
Prompt / context assembly
Decide what the model is allowed to see for this request.
Retrieved context
Treat RAG chunks as untrusted evidence with per-source policy.
Memory reads / writes
Gate what becomes persistent, and what it’s allowed to authorize.
MCP responses
Validate tool descriptions and responses; catch rug-pulls and shadowing.
Tool calls
Inspect name, args, identity, tenant, data class, and side effects.
Outbound actions
Control egress, exports, webhooks, shell, and code execution.
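One way to picture the MCP check above is to pin a fingerprint of each tool's description at review time and compare it on every later listing. The sketch below is illustrative only and is not Ultra13's implementation: `fingerprint`, `check_tools`, and the `pinned` mapping are hypothetical names, and the only fields hashed are the `name`, `description`, and `inputSchema` that an MCP tool listing carries.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """Stable hash of the fields the model sees for one tool."""
    canonical = json.dumps(
        {k: tool.get(k) for k in ("name", "description", "inputSchema")},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_tools(pinned: dict, server: str, tools: list) -> list:
    """Compare one server's tool listing against pinned fingerprints.

    `pinned` maps (server, tool_name) -> fingerprint approved at review time
    (a hypothetical store for this sketch).
    """
    findings = []
    for tool in tools:
        name = tool["name"]
        # Shadowing: another server already owns a pinned tool with this name.
        if any(n == name and s != server for (s, n) in pinned):
            findings.append(("shadowing", name))
        key = (server, name)
        if key not in pinned:
            findings.append(("unpinned", name))
        elif pinned[key] != fingerprint(tool):
            # Rug-pull: the description or schema changed after approval.
            findings.append(("rug_pull", name))
    return findings
```

A listing that returns no findings matches what was reviewed; any finding is a reason to hold the tool back until someone re-approves it.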
Which context can influence which action.
External web content can answer a question — but it cannot authorize a database export, a shell command, or an outbound webhook. You declare the boundary; the firewall enforces it.
allow sink=answer from={user, rag, web, mcp}
deny sink=db.export from={web, rag, mcp, memory}
deny sink=shell.exec from=* unless approved
deny sink=webhook.post from={web, rag} # block egress
require_approval sink=*.write when data_class=PII
treat retrieved_context as evidence not instruction
Inspect the call before it runs.
Tool name, arguments, target resource, user identity, tenant, data classification, side effects, and the trust of the context that requested it — evaluated against policy, then a decision.
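As a rough illustration of that evaluation, the sketch below hard-codes the four deny and approval rules from the policy example and returns a decision for one call. It is a simplified model, not the product's engine: `ToolCall`, `decide`, and the `default` argument are hypothetical, `from=` is read as "any of these sources influenced the call", and allowing calls that match no rule is an assumption made only for this example.

```python
from dataclasses import dataclass
from fnmatch import fnmatch

# Sources that may not drive a database export (from the policy example).
BLOCKED_FOR_EXPORT = frozenset({"web", "rag", "mcp", "memory"})
# Sources that may not drive an outbound webhook.
BLOCKED_FOR_WEBHOOK = frozenset({"web", "rag"})


@dataclass(frozen=True)
class ToolCall:
    sink: str                  # e.g. "db.export", "shell.exec", "crm.write"
    sources: frozenset         # trust labels of the context behind the call
    data_class: str = "public"
    approved: bool = False     # set once a human has approved this call


def decide(call: ToolCall, default: str = "allow") -> str:
    """Return "allow", "deny", or "require_approval" for one call."""
    if call.sink == "db.export" and call.sources & BLOCKED_FOR_EXPORT:
        return "deny"
    if call.sink == "shell.exec" and not call.approved:
        return "deny"
    if call.sink == "webhook.post" and call.sources & BLOCKED_FOR_WEBHOOK:
        return "deny"
    if fnmatch(call.sink, "*.write") and call.data_class == "PII" and not call.approved:
        return "require_approval"
    # Assumption: calls matching no rule fall through to the default.
    return default
```

For instance, `decide(ToolCall("db.export", frozenset({"user", "web"})))` yields `"deny"` because web content contributed to the request, while the same export requested from user context alone falls through to the default.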
Poisoned memory never becomes standing instruction.
Untrusted memory is quarantined and trust-labelled, with TTLs on what can persist. A note an attacker planted last session can’t silently authorize an action in this one.
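A minimal sketch of that idea, assuming each memory entry records where it came from and when it was written. `MemoryEntry`, `TRUSTED_ORIGINS`, `readable`, and `can_authorize` are hypothetical names for illustration; which origins count as trusted is an assumption.

```python
from dataclasses import dataclass

# Assumption: only these origins may back an action.
TRUSTED_ORIGINS = frozenset({"user", "operator"})


@dataclass(frozen=True)
class MemoryEntry:
    text: str
    origin: str          # where the content came from, e.g. "user", "web"
    written_at: float    # seconds since some epoch
    ttl_seconds: float   # how long the entry may persist

    def trust(self) -> str:
        """Label the entry by origin: trusted or quarantined."""
        return "trusted" if self.origin in TRUSTED_ORIGINS else "quarantined"

    def expired(self, now: float) -> bool:
        return now - self.written_at >= self.ttl_seconds


def readable(entries: list, now: float) -> list:
    """Entries the model may still read as evidence (unexpired only)."""
    return [e for e in entries if not e.expired(now)]


def can_authorize(entry: MemoryEntry, now: float) -> bool:
    """Only unexpired, trusted entries may support an action."""
    return entry.trust() == "trusted" and not entry.expired(now)
```

The split between `readable` and `can_authorize` is the point: a quarantined note can still inform an answer, but it never counts as permission.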
Regression tests for agent security.
Re-run the offensive suite whenever the model, prompt, retrieval source, tool, or policy changes. Every change resets the risk posture — so every change gets re-tested.
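In a CI pipeline, the trigger could be as simple as diffing the components named above between the last tested configuration and the current one. This is a sketch under assumptions: the config is taken to be a plain dict, and the key names in `TRACKED` and the functions `changed_components` and `should_rerun_suite` are invented for the example.

```python
# Components whose change should trigger a re-run (keys are hypothetical).
TRACKED = ("model", "prompt", "retrieval_sources", "tools", "policy")


def changed_components(previous: dict, current: dict) -> list:
    """Names of tracked components that differ between two configs."""
    return [k for k in TRACKED if previous.get(k) != current.get(k)]


def should_rerun_suite(previous: dict, current: dict) -> bool:
    """True when any tracked component changed since the last tested config."""
    return bool(changed_components(previous, current))
```

Reporting which component changed, rather than a bare yes or no, makes it easier to tell whether a new finding came from the model swap or the policy edit.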
Framework-neutral by design.
Ultra13 wraps the boundaries, not your framework. Bring whatever stack your agents already run on.
Inline, without inheriting new risk.
Drop it between your agent and the world without a new model dependency, a new token bill, or a new region problem.
Deterministic-first screening
Fast deterministic detectors run first, then a purpose-built specialist classifier checks what passes. There is no generative LLM judge in the hot path. The two tiers fail open: if the classifier is unavailable, screening falls back to the detectors alone and never quarantines clean traffic.
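The control flow can be sketched in a few lines. Everything here is illustrative rather than the shipped detectors: the single regex in `PATTERNS` stands in for a real detector set, and `screen` takes the classifier as a plain callable so an outage can be simulated.

```python
import re

# Stand-in detector set; a real deployment would carry many more patterns.
PATTERNS = [re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)]


def detectors_flag(text: str) -> bool:
    """Tier 1: cheap deterministic checks."""
    return any(p.search(text) for p in PATTERNS)


def screen(text: str, classifier=None) -> str:
    """Return "block" or "pass" for one piece of boundary traffic."""
    if detectors_flag(text):
        return "block"
    if classifier is None:
        return "pass"
    try:
        # Tier 2: the specialist classifier sees only what tier 1 passed.
        flagged = classifier(text)
    except Exception:
        # Fail open: on a classifier outage the detector verdict stands,
        # so clean traffic is not quarantined.
        return "pass"
    return "block" if flagged else "pass"
```

Because tier 1 has already run by the time the classifier is called, failing open here never means skipping screening entirely.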
Your model stays yours
The firewall fronts your agent and screens the boundary — bring your own model, keys, and provider bill. We screen the traffic; we never proxy-resell tokens or replace your model.
Choose your region
Run the managed service in the USA, EU, or UK for data residency — or self-host the data plane in your own VPC for full control.
Fast enough to leave in the request path.
The deterministic screening tier is CPU-cheap — measured on a single core, not estimated.
Inline SDK adds only screening CPU — no network hop. A quota rule adds one ~0.1–0.2 ms Redis round-trip; the opt-in classifier and OCR tiers add inference only when enabled.
Give us one workflow and one toolchain. We’ll show you where it breaks.
A technical teardown, context firewall controls, and a regression harness for the agent you’re shipping.