Tools
Page Agent
OpenClaw plugin for Page Agent - GUI browser automation
Configuration Example
{
"plugins": {
"load": {
"paths": [
"/path/to/.openclaw/workspace/plugins/page-agent"
]
},
"entries": {
"page-agent": {
"enabled": true,
"config": {
"baseURL": "https://api.openai.com/v1",
"apiKey": "sk-your-api-key",
"model": "gpt-5.2",
"extensionToken": "your-extension-token-from-page-agent"
}
}
}
}
}
README
# Page Agent Plugin for OpenClaw
Integration with [Page Agent](https://alibaba.github.io/page-agent/) Chrome Extension - The GUI Agent Living in Your Webpage.
## Features
- ✅ **Natural Language Browser Automation**: Control your browser using natural language
- ✅ **Multi-page Support**: Navigate across tabs and websites
- ✅ **Smart DOM Analysis**: Pure text-based DOM analysis, no visual recognition
- ✅ **Custom LLM Configuration**: Use your preferred LLM (OpenAI, Qwen, Claude, etc.)
## Installation
### 1. Install Page Agent Chrome Extension
First, install the Page Agent Chrome Extension from:
- Chrome Web Store (TBD)
- Or build from source: https://github.com/alibaba/page-agent
### 2. Get Extension Token
1. Open the Page Agent extension settings
2. Copy your auth token
3. Set it in localStorage: `localStorage.setItem('PageAgentExtUserAuthToken', '<your-token>')`
### 3. Configure OpenClaw
Add the plugin to your OpenClaw configuration:
```json
{
"plugins": {
"load": {
"paths": [
"/path/to/.openclaw/workspace/plugins/page-agent"
]
},
"entries": {
"page-agent": {
"enabled": true,
"config": {
"baseURL": "https://api.openai.com/v1",
"apiKey": "sk-your-api-key",
"model": "gpt-5.2",
"extensionToken": "your-extension-token-from-page-agent"
}
}
}
}
}
```
### 4. Restart OpenClaw Gateway
```bash
openclaw gateway restart
```
## Available Tools
### `page_agent_execute`
Execute a natural language task using Page Agent Chrome Extension.
**Parameters:**
- `instruction` (required): Natural language instruction for Page Agent
- `baseURL` (optional): Override LLM API base URL
- `apiKey` (optional): Override LLM API key
- `model` (optional): Override LLM model
**Example:**
```javascript
await page_agent_execute({
instruction: "Search for 'page-agent' on GitHub and open the first result"
})
```
### `page_agent_navigate`
Navigate to a URL using Page Agent.
**Parameters:**
- `url` (required): The URL to navigate to
### `page_agent_click`
Click on an element using Page Agent.
**Parameters:**
- `selector` (required): CSS selector or element description
### `page_agent_type`
Type text into an input using Page Agent.
**Parameters:**
- `selector` (required): CSS selector or element description
- `text` (required): Text to type
### `page_agent_screenshot`
Take a screenshot using Page Agent.
**Parameters:**
- `name` (optional): Optional name for the screenshot
### `page_agent_extract`
Extract content from the page using Page Agent.
**Parameters:**
- `instruction` (required): What to extract from the page
## Supported LLM Models
Page Agent works with any model that follows OpenAI API format and supports tool calls:
| Provider | Recommended Models |
|----------|-------------------|
| Alibaba Qwen | qwen3.5-plus, qwen3.5-flash, qwen3-coder-next |
| OpenAI | gpt-5.2, gpt-5.1, gpt-4.1-mini |
| Anthropic | claude-haiku-4.5, claude-sonnet-4.5 |
| DeepSeek | deepseek-3.2 |
| Google | gemini-3-pro, gemini-3-flash |
| and more... | |
## Use Cases
### 1. Customer Service Automation
Let your support agent directly operate the page for users:
```javascript
await page_agent_execute({
instruction: "Help the user submit a support ticket by filling out the form"
})
```
### 2. Business Process Automation
Guide new employees through complex workflows:
```javascript
await page_agent_execute({
instruction: "Walk me through the customer onboarding process step by step"
})
```
### 3. Personal Productivity
Automate repetitive tasks across websites:
```javascript
await page_agent_execute({
instruction: "Check my calendar and schedule a meeting room for tomorrow afternoon"
})
```
### 4. DevOps Automation
Operate management panels using natural language:
```javascript
await page_agent_execute({
instruction: "Restart the production server and show me the latest logs"
})
```
## Current Status
⚠️ **Placeholder Implementation**: This plugin currently provides the tool definitions and configuration schema, but the actual browser integration is not yet implemented.
To fully integrate with Page Agent Chrome Extension, you would need:
1. A way to communicate with the Chrome extension (WebSocket, Native Messaging, etc.)
2. Or use Playwright/Puppeteer to drive the browser programmatically
3. Or build a custom bridge between OpenClaw and the browser
## Links
- [Page Agent Documentation](https://alibaba.github.io/page-agent/)
- [Page Agent GitHub](https://github.com/alibaba/page-agent)
- [OpenClaw Documentation](https://docs.openclaw.ai/)
tools
Comments
Sign in to leave a comment