Architecture Overview

CloneGuard operates as an independent trust boundary between the AI coding agent and the commands it executes. It cannot be disabled by repository content because it runs at the hook layer, outside the agent's control.

Defense Layers

Repository files
       |
  Layer 0: Pre-execution scan (before agent launches)
       |
  Agent starts
       |
  Layer 1: InstructionsLoaded (scans CLAUDE.md, rules files)
       |
  Agent works...
       |
  Layer 2: PostToolUse (scans tool output for injected instructions)
       |
  Layer 3: PreToolUse (gates writes, builds, config changes)
       |
  Tool executes (or is blocked/constrained)

Layer 0: Pre-Execution Wrapper

Scans all high-priority and medium-priority files in the repository before the agent starts. Blocks agent launch if critical findings are detected.

Cannot be disabled by repository content -- executes before the agent reads any files.
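The Layer 0 scan can be sketched as a simple pass over priority-tiered files. This is a minimal illustration, not CloneGuard's actual scanner: the priority list, pattern set, and function names are assumptions.

```python
# Illustrative Layer 0 pre-execution scan. The priority tier and the
# two critical patterns below are invented for this sketch; the real
# scanner covers high- and medium-priority files with its full rule set.
import re

HIGH_PRIORITY = ("CLAUDE.md", ".cursorrules", "Makefile")
CRITICAL_PATTERNS = [re.compile(p, re.I) for p in (
    r"ignore (all )?previous instructions",
    r"curl .*\|\s*(ba)?sh",
)]

def pre_execution_scan(files):
    """Return paths with critical findings; any hit blocks agent launch."""
    findings = []
    for path, content in files.items():
        if path not in HIGH_PRIORITY:
            continue  # the real scanner also covers medium-priority files
        if any(p.search(content) for p in CRITICAL_PATTERNS):
            findings.append(path)
    return findings
```

Because this runs before the agent process starts, nothing the repository contains can alter the scan's behavior.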

Layer 1: InstructionsLoaded

Scans instruction files (CLAUDE.md, .claude/rules/*.md, .cursorrules) when the agent loads them. Uses STRICT scan mode -- HIGH severity findings result in a hard block.
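A STRICT-mode check might look like the following sketch. The rule list and severity labels are invented for illustration; only the hard-block-on-HIGH behavior mirrors the description above.

```python
# Hypothetical STRICT-mode scan of an instruction file. The two rules
# here are placeholders, not CloneGuard's actual rule set.
RULES = [
    ("exfiltrate", "HIGH"),
    ("base64 -d", "MEDIUM"),
]

def scan_instruction_file(content, mode="STRICT"):
    """In STRICT mode, any HIGH-severity finding is a hard block."""
    severities = [sev for needle, sev in RULES if needle in content]
    if mode == "STRICT" and "HIGH" in severities:
        return "block"
    return "warn" if severities else "allow"
```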

Layer 2: PostToolUse

Scans all tool output for injection patterns. Catches payloads injected via tool results -- for example, a cat command reading a malicious file that contains prompt injection.
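The cat example above can be sketched as an output-side check; the marker strings and return shape are illustrative assumptions, not CloneGuard's real detection logic.

```python
# Illustrative PostToolUse check: the agent ran a command (e.g. cat on a
# repository file) and the output carries an injected instruction.
def post_tool_use(tool, output):
    markers = ("ignore previous instructions", "you are now in developer mode")
    if any(m in output.lower() for m in markers):
        return {"decision": "block", "reason": "injection in tool output"}
    return {"decision": "allow"}
```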

Layer 3: PreToolUse

Gates dangerous operations before execution:

  • Blocks writes to protected paths (~/.claude/settings.json, trust stores)
  • Scans content being written to sensitive files (package.json, Makefile)
  • Warns on build commands (npm install, pip install, cargo build)
  • Blocks allowlist manipulation (cloneguard allow in Bash)
  • Blocks bypass attempts (cloneguard --bypass, claude --bypass)

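The gates above can be sketched as a single pre-execution check. The protected path, build-command prefixes, and bypass strings shown are examples drawn from the list, not the full rule set, and the function signature is an assumption.

```python
# Illustrative PreToolUse gate over the operations listed above.
PROTECTED = ("~/.claude/settings.json",)
BUILD_CMDS = ("npm install", "pip install", "cargo build")

def pre_tool_use(tool, args):
    if tool == "Write" and args["path"] in PROTECTED:
        return "block"          # protected path
    if tool == "Bash":
        cmd = args["command"]
        if "cloneguard allow" in cmd or "--bypass" in cmd:
            return "block"      # allowlist manipulation / bypass attempt
        if any(cmd.startswith(b) for b in BUILD_CMDS):
            return "warn"       # build commands run arbitrary scripts
    return "allow"
```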
Detection Signals

Three independent signals are computed for each tool call:

  Signal       Method                          Speed           Strengths
  ----------   -----------------------------   -------------   -----------------------------------------------
  Pattern      240 compiled regex rules        <50ms           Fast, predictable, catches known patterns
  Semantic     MiniLM-L6-v2 ONNX classifier    ~16ms/sample    Catches synonym substitution, encoding evasion,
                                                               social engineering
  Behavioral   CaMeL-lite session-wide         <0.5ms/event    Catches multi-step sequences (read credentials,
               monitoring                                      then exfiltrate)

Signals are evaluated independently -- any signal crossing its threshold is sufficient to raise a detection. See Detection Engine for details.
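Independent evaluation means there is no weighted fusion: each signal is compared against its own threshold, and any single crossing raises a detection. The scores and thresholds in this sketch are made up for illustration.

```python
# Minimal sketch of independent signal evaluation: any one signal
# crossing its own threshold is sufficient. Threshold values invented.
THRESHOLDS = {"pattern": 1.0, "semantic": 0.8, "behavioral": 0.7}

def evaluate(scores):
    """scores: dict mapping signal name -> score."""
    hits = [name for name, score in scores.items()
            if score >= THRESHOLDS[name]]
    return {"detected": bool(hits), "signals": hits}
```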

Enforcement Pipeline

Detection signals (evaluated independently)
       |
  Verdict: SAFE / SUSPICIOUS / MALICIOUS
       |
  Policy engine (YAML rules, per-tool overrides)
       |
  Action: allow / constrain (sandbox) / block

When enforcement is enabled, SUSPICIOUS verdicts can constrain the tool call via OS-level sandboxing (Landlock on Linux, Seatbelt on macOS) without affecting CloneGuard itself. See Enforcement for details.

The default mode is dry-run: detections are logged but no enforcement action is taken.
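The verdict-to-action mapping and dry-run behavior can be sketched as follows. The policy values mirror the pipeline above, but the override mechanism and function signature are illustrative assumptions, not CloneGuard's actual YAML schema.

```python
# Illustrative verdict -> action mapping with per-tool overrides and
# dry-run semantics. Rule format is invented for this sketch.
POLICY = {"SAFE": "allow", "SUSPICIOUS": "constrain", "MALICIOUS": "block"}

def enforce(verdict, dry_run=True, overrides=None):
    """Overrides take precedence; dry-run logs but never enforces."""
    action = (overrides or {}).get(verdict, POLICY[verdict])
    if dry_run and action != "allow":
        return f"log-only (would {action})"
    return action
```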

Audit Trail

Every detection event emits structured NDJSON logs -- one line per event, machine-readable. See Audit for details.
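An NDJSON audit line is simply one compact JSON object followed by a newline, so logs can be appended and streamed line by line. The field names in this sketch are assumptions; CloneGuard's actual event schema may differ.

```python
# Sketch of emitting one NDJSON audit line per detection event.
# Field names (ts, layer, verdict, action) are illustrative.
import json

def audit_line(event):
    return json.dumps(event, separators=(",", ":")) + "\n"

line = audit_line({"ts": "2024-01-01T00:00:00Z", "layer": "PostToolUse",
                   "verdict": "SUSPICIOUS", "action": "log-only"})
```

Because each line is independent JSON, a log file can be processed with ordinary line-oriented tools without parsing the whole file.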