Scan API
The Scan API runs STACK's prompt-injection detector against arbitrary content you supply. Use it on retrieved data — emails, documents, web pages, calendar invites, API responses, anything an agent might pull from a non-user source — BEFORE feeding the content into your LLM. The endpoint returns a verdict so your agent can refuse to proceed when the content carries an injection payload.
Most production prompt injections are indirect — the user is benign, the agent retrieves a doc/email, and the doc/email contains the injection. POST /v1/scan exists specifically for this attack class. Available on every tier within the monthly scan allowance.
Make a Scan Request
Send the content to scan. The detector runs the same three-layer chain that fires on /v1/proxy and /v1/skills/:id/invoke:
- L1 - regex pattern catalog (instruction overrides, role-play breakouts, system-prompt extraction, jailbreak names, safety bypass)
- L2 - encoding-aware normalization that decodes base64, URL, hex, leetspeak, unicode homoglyphs, zero-width characters, ROT13, separator obfuscation, and reversed text before re-running L1 against each variant
- L3 - LLM-classifier funnel (Haiku 4.5 via OpenRouter) for semantically novel attacks the regex layers cannot see: paraphrased overrides, authority impersonation, polite-format wrappers, indirect injection in retrieved content, multilingual variants. A heuristic gate fires L3 only when the body is ≥ 60 chars and the L1+L2 verdict is non-critical; a 3-second timeout degrades gracefully to the L1+L2 result on failure
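The L3 gate described above can be sketched as a small predicate. This is an illustrative reconstruction from the documented heuristic (the 60-char threshold and the non-critical check); the function name is ours, not part of the SDK:

```javascript
// Illustrative sketch of the L3 gate: the LLM classifier only runs when the
// body is long enough and the regex layers did not already decide.
const L3_MIN_BODY_LENGTH = 60;

function shouldRunL3(body, l1l2Verdict) {
  // Skip the LLM call for short bodies (too little room for a semantic
  // attack) and when L1+L2 already returned a critical verdict.
  return body.length >= L3_MIN_BODY_LENGTH && l1l2Verdict !== 'critical';
}
```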
curl -X POST https://api.getstack.run/v1/scan \
-H "Authorization: Bearer sk_live_your_key" \
-H "Content-Type: application/json" \
-d '{
"content": "<the email body, document text, or scraped webpage>",
"context": "email",
"source": "sender@external.com"
}'
Request Fields
- content (string | object, required) - the content to scan. Strings scanned directly; objects walked one level deep, same as proxy bodies.
- context (enum, optional) - one of "email" | "document" | "webpage" | "chat_log" | "api_response" | "calendar" | "other". Recorded in security event metadata.
- source (string, optional) - source identifier (URL / sender / filename). Truncated to 256 chars in event metadata.
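In JavaScript, the same request can be assembled with fetch. A minimal sketch of the fields above; the helper name is illustrative and not part of any SDK:

```javascript
// Build the POST /v1/scan request from the documented fields. Pure helper
// so the payload shape is easy to inspect; send it with fetch (Node 18+).
function buildScanRequest(apiKey, content, context, source) {
  return {
    url: 'https://api.getstack.run/v1/scan',
    options: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ content, context, source }),
    },
  };
}

// Usage:
// const { url, options } = buildScanRequest(key, emailBody, 'email', sender);
// const result = await (await fetch(url, options)).json();
```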
Optional Headers
- X-Passport-Token - if present, the scan is attributed to the passport JWT (agent_id + passport_jti recorded on any security event).
Response
{
"verdict": "critical",
"scan_id": "ses_abc123def456",
"duration_ms": 2120,
"match": {
"pattern_id": "override_ignore_previous",
"severity": "critical",
"field_path": "content",
"matched_excerpt": "ignore all previous instructions",
"encoding": "base64_decode"
},
"llm": {
"verdict": "critical",
"confidence": 92,
"reasoning": "Authority impersonation attempt with embedded directive...",
"model": "anthropic/claude-haiku-4.5"
},
"l3_degraded": null
}
- verdict ("clean" | "suspicious" | "critical") - clean = no L1/L2/L3 hit; suspicious = warning-severity match; critical = critical-severity match
- scan_id - links to the security event row when verdict is non-clean
- duration_ms - end-to-end scan time. L1+L2 alone is 1-10ms; full L1+L2+L3 chain is 1-3s when the LLM gate fires
- match - null when no L1/L2 pattern fired. On a hit: pattern_id, severity, field path, truncated matched text (lowercased), and encoding if L2 normalization revealed the match
- llm - null when L3 was skipped (body too short, L1/L2 already critical, no OpenRouter credential), or when L3 ran but degraded. On a successful run: verdict, confidence (0-100), one-sentence reasoning, model identifier
- l3_degraded - null when L3 ran successfully OR was skipped. Set to { reason, duration_ms } when L3 ran but failed (timeout, network, parse error). Callers can lower their trust in a clean verdict accordingly
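The field semantics above can be folded into one decision helper. A sketch under the documented semantics; the function and the action labels are our invention, not SDK names:

```javascript
// Map a scan response to an action. A clean verdict with l3_degraded set
// only reflects L1+L2 (the LLM layer failed), so it carries less weight.
function decideAction(scan) {
  if (scan.verdict === 'critical') return 'refuse';
  if (scan.verdict === 'suspicious') return 'proceed_with_caution';
  // verdict === 'clean'
  return scan.l3_degraded ? 'proceed_low_confidence' : 'proceed';
}
```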
Encoding Attribution
When L2 normalization reveals a match the raw text didn't expose, the response carries an encoding field on the match. Possible values:
- base64_decode - found in a base64-encoded substring
- url_decode - %XX-encoded payload
- hex_decode - hex-encoded payload
- leetspeak - digit/symbol substitution
- homoglyph - Cyrillic / Greek / mathematical-bold lookalikes
- zero_width_strip / zero_width_as_space - invisible char obfuscation
- separator_collapse - i.g.n.o.r.e style
- spaced_letters - i g n o r e style
- identifier_casing - camelCase / snake_case / kebab-case
- rot13 - ROT13-rotated text
- reverse - character-reversed text
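To make two of these concrete, here are toy versions of the zero_width_strip and separator_collapse normalizations. These are simplified illustrations, not the detector's actual code; the real L2 layer re-runs the full L1 pattern catalog against each decoded variant:

```javascript
// Toy zero_width_strip: remove zero-width space/joiner/non-joiner and BOM.
function stripZeroWidth(text) {
  return text.replace(/[\u200B\u200C\u200D\uFEFF]/g, '');
}

// Toy separator_collapse: turn "i.g.n.o.r.e" into "ignore" by dropping
// single punctuation characters sandwiched between letters.
function collapseSeparators(text) {
  return text.replace(/(?<=[a-zA-Z])[.\-_*](?=[a-zA-Z])/g, '');
}
```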
Recommended Agent Flow
// Pseudo-code: scan retrieved content before feeding it to the LLM
async function summarizeEmail(emailBody, sender, passport) {
  const result = await stack.scan.scan({
    content: emailBody,
    context: 'email',
    source: sender,
  }, { passportToken: passport });

  if (result.verdict === 'critical') {
    // Refuse: the email contains an injection. Surface it to the user.
    return { error: 'Email body contains a prompt-injection payload; refusing to summarize.', match: result.match };
  }

  if (result.verdict === 'suspicious') {
    // Warn but proceed with extra caution. Optionally lower trust.
    log.warn('scan suspicious', result.match);
  }

  return await llm.summarize(emailBody);
}
Rate Limiting + Pricing
Scan calls are metered per operator on a monthly basis. Allowances:
- Free - 10,000 scans / month
- Developer - 50,000 scans / month
- Studio - 500,000 scans / month
- Enterprise - unlimited
Above the monthly allowance, calls are debited from the operator's wallet at $0.0001 per scan (cheaper than proxy because no upstream forward). If the wallet balance is insufficient, the call returns 402 Payment Required.
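The overage math works out as follows. A sketch using the documented $0.0001 rate; the helper name is illustrative:

```javascript
// Scans beyond the monthly allowance are debited from the wallet at
// $0.0001 per scan; within the allowance nothing is charged.
const OVERAGE_PRICE_USD = 0.0001;

function overageCostUsd(scansUsed, monthlyAllowance) {
  const overage = Math.max(0, scansUsed - monthlyAllowance);
  return overage * OVERAGE_PRICE_USD;
}
```

For example, a Developer-tier operator (50,000 scans/month) who runs 60,000 scans pays for the 10,000 overage, about $1.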
Check Scan Usage
curl https://api.getstack.run/v1/scan/usage \
-H "Authorization: Bearer sk_live_your_key"
{
"tier": "developer",
"limit": 50000,
"used": 1523,
"remaining": 48477,
"period": "2026-04"
}
Security Events
Every match (warning or critical) records a prompt_injection security event with:
- pattern_id - which L1 pattern caught it
- field_path - where in the content
- matched_excerpt - first ~80 chars of the trigger, lowercased (PII protection)
- encoding - which L2 normalization revealed it (if any)
- scan_context - the context tag you supplied
- scan_source - the source identifier you supplied (truncated to 256 chars)
The event is severity-tagged exactly the same as proxy/skill scans: warning for soft phrasing, critical for strong instruction-override or jailbreak-name matches.
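For illustration, a critical hit on the email from the earlier example might record metadata shaped like this (the field names come from the list above; the values are illustrative):

```json
{
  "pattern_id": "override_ignore_previous",
  "field_path": "content",
  "matched_excerpt": "ignore all previous instructions",
  "encoding": "base64_decode",
  "scan_context": "email",
  "scan_source": "sender@external.com"
}
```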
Honest Scope Statement
Benchmark numbers on a 1087-sample corpus (deepset/prompt-injections + STACK curated supplement + AgentDojo extracted): full L1+L2+L3 chain achieves F1 0.86, precision 0.98, recall 0.77. L1 alone reaches F1 0.43; L1+L2 reaches 0.49. L3 is the largest single contributor.
The detector is strong but not infallible. Bear in mind: (1) L3 sees the content you scan — it leaves STACK's infrastructure to reach the OpenRouter-hosted Haiku model. For especially sensitive data, consider scanning with L1+L2 only (future per-detector enforcement controls will expose this). (2) Adversarial attacks targeted specifically at LLM classifiers (e.g. prompts carefully crafted to confuse the classifier) may evade L3. (3) Indirect attacks that carry no directive on the surface of the content (e.g. data poisoning that only manifests later) are out of scope for input-side scanning. Use the scan verdict as one strong signal in your overall trust model, not as a complete guarantee.