darkfield

Your AI agents can be jailbroken,
exfiltrate data, and hallucinate credentials.

Darkfield finds these vulnerabilities before attackers do.

darkfield scan
$ darkfield scan agent-skill.mdCRITICAL prompt-injection System prompt override via user input concatenation L:14HIGH network-exfil Unrestricted outbound HTTP to external endpoints L:28HIGH credential-access API key read from environment without scope guard L:413 findings (1 critical, 2 high) in 0.04s
$pip install darkfield
SCAN

Find vulnerabilities in AI agent skills

-Prompt injection, system prompt leaks, jailbreak vectors

-Network exfiltration, credential access, privilege escalation

-Unsafe code execution, path traversal, data poisoning

darkfield scan-dir
$ darkfield scan-dir skills/ --format textScanning 12 skill files...skills/web-search.md PASS 0 findingsskills/code-executor.md FAIL 4 findings (2 critical)skills/data-fetcher.md WARN 2 findings (1 high)skills/summarizer.md PASS 0 findings12 files scanned, 6 findings in 0.12s

12 static pattern rules. 8 semantic analyzers. SARIF output.

SCREEN

Behavioral risk scoring

Extract persona trait vectors from LLMs via contrastive activation collection. Project inputs to quantify behavioral drift.

$darkfield screen data.jsonl
EXPLOIT

Red-team generation

Adversarial prompts via persona vector inversion. Stealth obfuscation, validation, success metrics, batch library building.

$darkfield exploit generate
ENCODE

17 encoding transforms

Reversible text transforms across classic, unicode, invisible, and structural categories. Offense and defense in one toolkit.

$darkfield encode "payload"
$pip install darkfield
View on GitHub