Integration
Crawl4ai
OpenClaw plugin that packages Crawl4AI self-host as local MCP tools with proxy-aware Docker and release verification.
Install
npm install
npm
README
# OpenClaw Crawl4AI Plugin
`crawl4ai` packages a local [Crawl4AI](https://github.com/unclecode/crawl4ai) self-host deployment as an OpenClaw plugin without replacing `web_fetch`.
This repository is the standalone source for the plugin. It is meant to be cloned, packed into a `.tgz`, and installed into OpenClaw as a trusted local plugin.
## What You Get
- `openclaw crawl4ai up`
- `openclaw crawl4ai down`
- `openclaw crawl4ai status`
- `openclaw crawl4ai doctor`
- Automatic `mcp.servers.crawl4ai` wiring through a local `stdio` bridge
- Proxy-first Docker defaults for WSL/Linux environments in mainland China
## Supported Scope
- WSL/Linux
- Docker on `PATH`
- Host proxy available for non-mainland-China internet access
- OpenClaw 2026.4.15 or newer recommended
This plugin does not modify the OpenClaw installation tree and does not replace `web_fetch`.
## Quick Start
From the repository root, the one-command installer is:
```bash
bash ./scripts/install-and-verify.sh
```
That command:
- packs the current repository into a fresh `.tgz`
- installs it into OpenClaw as a trusted local plugin
- runs `openclaw crawl4ai up`
- runs `openclaw crawl4ai doctor`
If you already have a built archive, you can point the script at it:
```bash
bash ./scripts/install-and-verify.sh ./openclaw-crawl4ai-plugin-<version>.tgz
```
If you prefer manual steps:
```bash
npm install
npm test
npm pack
openclaw plugins install --dangerously-force-unsafe-install ./openclaw-crawl4ai-plugin-<version>.tgz
openclaw crawl4ai up
openclaw crawl4ai doctor
```
## Why Trusted Install Is Required
OpenClaw's plugin installer scans packaged code for dangerous patterns before install.
This plugin is expected to trigger that scan because it:
- shells out to Docker and OpenClaw CLI helpers
- reads environment variables
- performs live network probes for `status` and `doctor`
The unsafe-install flag is an explicit trust acknowledgement for a local plugin whose job is to manage a container runtime and verify live network access.
## Runtime Defaults
- Crawl4AI image: `unclecode/crawl4ai:latest`
- Container name: `crawl4ai`
- Upstream service: `http://127.0.0.1:11235`
- Bridge mode: local `stdio`
- Default proxy: `http://127.0.0.1:7897`
- Default `NO_PROXY`: `127.0.0.1,localhost`
When the proxy host is loopback, the plugin starts Crawl4AI with Docker host networking so the container can reach the host proxy directly.
## Docs
- [Installation guide](./docs/INSTALL.md)
- [Troubleshooting guide](./docs/TROUBLESHOOTING.md)
- [One-shot AI deployment prompt](./docs/AI_DEPLOY_PROMPT.md)
- [Suggested upstream outreach notes](./docs/UPSTREAM_OUTREACH.md)
## Repository Layout
- `src/`: CLI, bridge, and runtime logic
- `tests/`: unit and integration-style tests
- `docs/`: install, troubleshooting, and deployment assets
- `openclaw.plugin.json`: OpenClaw plugin manifest
## Release Flow
```bash
npm run pack:release
npm run release:verify
```
`pack:release` runs the unit test suite and emits a new `openclaw-crawl4ai-plugin-<version>.tgz`.
`release:verify` is the pre-publish smoke test. It runs:
- `npm ci`
- `npm test`
- `npm pack`
- isolated tgz install into a temporary OpenClaw state
- `openclaw crawl4ai up|status|doctor|down`
- live `md`, `html`, and `crawl` calls through the installed bridge
- cleanup rechecks for the isolated config, extension directory, Docker container, and `/tmp/openclaw-crawl4ai-*`
By default it uses balanced cleanup: it deletes temporary state and runtime residue, but keeps the `.tgz`, a JSON summary, and a command log under `artifacts/release-verification/`.
If you are doing failure forensics and want to preserve the temporary state directory, use:
```bash
npm run release:verify:keep-evidence
```
If a shared host-network Crawl4AI runtime is already active on the machine, the verifier reuses it for non-destructive checks and skips the destructive `down` step for that shared container.
## License
[MIT](./LICENSE)
integration
Comments
Sign in to leave a comment