Ultra13
Research · Agent Attack Surface

Prompt injection is only the first failure mode.

Modern AI agents read context, call tools, update memory, use MCP servers, browse the web, write code, and trigger workflows. The missing layer is a Context Firewall between what they read and what they do.

Then vs now

The model you secured is not the model you shipped.

Context is now an attack surface. Every retrieved document, memory item, tool response, and MCP description can become instruction.

The old model
User promptLLMresponse

One input, one output. Guardrails on the prompt could catch most of the obvious abuse.

The agentic model
Goalplanretrieve contextread memorycall toolsobserve resultupdate stateact again

Every step in the loop reads untrusted content and can take a real action. The attack surface is the whole workflow, not the prompt.

Attack classes

The failure modes that live past the prompt.

Prompt injection is the entry point, not the whole threat model. These are the classes we replay against real agent workflows.

Indirect prompt injection
Malicious instructions hidden in docs, tickets, web pages, emails, or MCP responses.
Tool hijacking
The agent is manipulated into calling a tool with dangerous arguments.
Memory poisoning
Bad context persists and steers future sessions.
RAG poisoning
Retrieved content becomes instruction instead of evidence.
Confused deputy
The agent uses a privileged identity for an attacker-controlled goal.
Output handling abuse
Model output becomes shell, SQL, code, markdown, or workflow config.
OAST exfiltration
The agent sends sensitive data to an external callback.
Human approval spoofing
The agent misrepresents a dangerous action as safe.

Built for teams searching: prompt injection protection · AI agent security · MCP security · agentic AI red teaming · context firewall · LLM firewall.

Stop treating agent security as a prompt problem.

Guardrails catch the obvious. Ultra13 controls the context-to-action boundary.