
Page Agent

By zxsaqwz

OpenClaw plugin for Page Agent - GUI browser automation

README

# Page Agent Plugin for OpenClaw

Integrates OpenClaw with the [Page Agent](https://alibaba.github.io/page-agent/) Chrome Extension, the GUI agent living in your webpage.

## Features

- ✅ **Natural Language Browser Automation**: Control your browser using natural language
- ✅ **Multi-page Support**: Navigate across tabs and websites
- ✅ **Smart DOM Analysis**: Pure text-based DOM analysis, no visual recognition
- ✅ **Custom LLM Configuration**: Use your preferred LLM (OpenAI, Qwen, Claude, etc.)

## Installation

### 1. Install Page Agent Chrome Extension

First, install the Page Agent Chrome Extension from:
- Chrome Web Store (TBD)
- Or build from source: https://github.com/alibaba/page-agent

### 2. Get Extension Token

1. Open the Page Agent extension settings
2. Copy your auth token
3. Set it in localStorage: `localStorage.setItem('PageAgentExtUserAuthToken', '<your-token>')`

### 3. Configure OpenClaw

Add the plugin to your OpenClaw configuration:

```json
{
  "plugins": {
    "load": {
      "paths": [
        "/path/to/.openclaw/workspace/plugins/page-agent"
      ]
    },
    "entries": {
      "page-agent": {
        "enabled": true,
        "config": {
          "baseURL": "https://api.openai.com/v1",
          "apiKey": "sk-your-api-key",
          "model": "gpt-5.2",
          "extensionToken": "your-extension-token-from-page-agent"
        }
      }
    }
  }
}
```
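
Before restarting the gateway, it can help to check that the entry is complete. The following is a minimal sketch, assuming all four keys shown above are required (the README does not say which are optional); `findMissingConfigKeys` is a hypothetical helper, not part of OpenClaw or this plugin.

```javascript
// Hypothetical helper, not part of OpenClaw or the plugin.
// Assumption: every key shown in the example config is required.
const REQUIRED_KEYS = ["baseURL", "apiKey", "model", "extensionToken"];

function findMissingConfigKeys(openclawConfig) {
  const entry = openclawConfig?.plugins?.entries?.["page-agent"];
  if (!entry || !entry.enabled) {
    return ["plugins.entries.page-agent (missing or disabled)"];
  }
  const config = entry.config ?? {};
  // A key counts as missing when it is absent, not a string, or blank.
  return REQUIRED_KEYS.filter(
    (key) => typeof config[key] !== "string" || config[key].trim() === ""
  );
}

const sample = {
  plugins: {
    entries: {
      "page-agent": {
        enabled: true,
        config: {
          baseURL: "https://api.openai.com/v1",
          apiKey: "sk-your-api-key",
          model: "gpt-5.2",
          // extensionToken intentionally left out
        },
      },
    },
  },
};

console.log(findMissingConfigKeys(sample)); // [ 'extensionToken' ]
```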

### 4. Restart OpenClaw Gateway

```bash
openclaw gateway restart
```

## Available Tools

### `page_agent_execute`
Execute a natural language task using the Page Agent Chrome Extension.

**Parameters:**
- `instruction` (required): Natural language instruction for Page Agent
- `baseURL` (optional): Override LLM API base URL
- `apiKey` (optional): Override LLM API key
- `model` (optional): Override LLM model

**Example:**
```javascript
await page_agent_execute({
  instruction: "Search for 'page-agent' on GitHub and open the first result"
})
```
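
The optional `baseURL`, `apiKey`, and `model` parameters override the plugin configuration for a single call. One plausible way to resolve them is sketched below, assuming per-call values take precedence and omitted ones fall back to the configured defaults; `resolveLlmSettings` is a hypothetical name and the plugin's actual precedence rules are not documented here.

```javascript
// Assumption: a per-call value wins; an omitted value falls back to plugin config.
function resolveLlmSettings(pluginConfig, callParams) {
  const resolved = {};
  for (const key of ["baseURL", "apiKey", "model"]) {
    resolved[key] = callParams[key] ?? pluginConfig[key];
  }
  return resolved;
}

const pluginConfig = {
  baseURL: "https://api.openai.com/v1",
  apiKey: "sk-your-api-key",
  model: "gpt-5.2",
};

const settings = resolveLlmSettings(pluginConfig, {
  instruction: "Open the first search result",
  model: "gpt-4.1-mini", // override only the model for this call
});

console.log(settings.model);   // gpt-4.1-mini
console.log(settings.baseURL); // https://api.openai.com/v1
```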

### `page_agent_navigate`
Navigate to a URL using Page Agent.

**Parameters:**
- `url` (required): The URL to navigate to

### `page_agent_click`
Click on an element using Page Agent.

**Parameters:**
- `selector` (required): CSS selector or element description

### `page_agent_type`
Type text into an input using Page Agent.

**Parameters:**
- `selector` (required): CSS selector or element description
- `text` (required): Text to type

### `page_agent_screenshot`
Take a screenshot using Page Agent.

**Parameters:**
- `name` (optional): Name for the screenshot

### `page_agent_extract`
Extract content from the page using Page Agent.

**Parameters:**
- `instruction` (required): What to extract from the page
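
The single-purpose tools above can be chained when a task needs explicit steps rather than one `page_agent_execute` instruction. The sketch below uses recording stubs in place of the real tools, since the browser integration is not implemented yet. `createRecordingTools` and `searchAndExtract` are hypothetical names, and the selector strings are placeholders rather than verified selectors.

```javascript
// Stand-in stubs: each tool only records its call and returns { ok: true }.
// The real tools would drive the browser; their return shape is an assumption.
function createRecordingTools() {
  const calls = [];
  const record = (tool) => async (params) => {
    calls.push({ tool, params });
    return { ok: true };
  };
  return {
    calls,
    page_agent_navigate: record("page_agent_navigate"),
    page_agent_type: record("page_agent_type"),
    page_agent_click: record("page_agent_click"),
    page_agent_extract: record("page_agent_extract"),
  };
}

// Navigate, fill a search box, submit, then extract results.
async function searchAndExtract(tools, query) {
  await tools.page_agent_navigate({ url: "https://github.com/search" });
  await tools.page_agent_type({ selector: "input[name='q']", text: query }); // placeholder selector
  await tools.page_agent_click({ selector: "the search submit button" });    // element description
  return tools.page_agent_extract({
    instruction: "List the repository names on this page",
  });
}

const tools = createRecordingTools();
searchAndExtract(tools, "page-agent").then(() => {
  console.log(tools.calls.map((call) => call.tool).join(" -> "));
});
```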

## Supported LLM Models

Page Agent works with any model that follows the OpenAI API format and supports tool calls:

| Provider | Recommended Models |
|----------|-------------------|
| Alibaba Qwen | qwen3.5-plus, qwen3.5-flash, qwen3-coder-next |
| OpenAI | gpt-5.2, gpt-5.1, gpt-4.1-mini |
| Anthropic | claude-haiku-4.5, claude-sonnet-4.5 |
| DeepSeek | deepseek-3.2 |
| Google | gemini-3-pro, gemini-3-flash |
| and more... | |
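
The compatibility requirement above (OpenAI API format plus tool calls) refers to request bodies shaped like the one below. This is a minimal sketch, assuming the Chat Completions path `/chat/completions` under the configured `baseURL`. `buildChatRequest` and `navigateTool` are illustrative names, and how Page Agent itself assembles its requests is not documented here. The sketch only builds the request and sends nothing.

```javascript
// Tool definition in the OpenAI "function" tool format, built from the
// documented parameters of page_agent_navigate.
const navigateTool = {
  type: "function",
  function: {
    name: "page_agent_navigate",
    description: "Navigate to a URL using Page Agent.",
    parameters: {
      type: "object",
      properties: {
        url: { type: "string", description: "The URL to navigate to" },
      },
      required: ["url"],
    },
  },
};

// Assemble (but do not send) a chat request that offers the tool to the model.
function buildChatRequest(config, userMessage) {
  return {
    url: `${config.baseURL.replace(/\/+$/, "")}/chat/completions`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${config.apiKey}`,
    },
    body: {
      model: config.model,
      messages: [{ role: "user", content: userMessage }],
      tools: [navigateTool],
    },
  };
}

const request = buildChatRequest(
  { baseURL: "https://api.openai.com/v1", apiKey: "sk-your-api-key", model: "gpt-5.2" },
  "Open the Page Agent documentation"
);
console.log(request.url); // https://api.openai.com/v1/chat/completions
```

A provider that rejects the `tools` field or never returns tool calls would not meet the requirement, whatever its model name.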

## Use Cases

### 1. Customer Service Automation
Let your support agent directly operate the page for users:
```javascript
await page_agent_execute({
  instruction: "Help the user submit a support ticket by filling out the form"
})
```

### 2. Business Process Automation
Guide new employees through complex workflows:
```javascript
await page_agent_execute({
  instruction: "Walk me through the customer onboarding process step by step"
})
```

### 3. Personal Productivity
Automate repetitive tasks across websites:
```javascript
await page_agent_execute({
  instruction: "Check my calendar and schedule a meeting room for tomorrow afternoon"
})
```

### 4. DevOps Automation
Operate management panels using natural language:
```javascript
await page_agent_execute({
  instruction: "Restart the production server and show me the latest logs"
})
```

## Current Status

⚠️ **Placeholder Implementation**: This plugin currently provides the tool definitions and configuration schema, but the actual browser integration is not yet implemented.

To fully integrate with the Page Agent Chrome Extension, the plugin would need one of the following:
1. A communication channel to the Chrome extension (WebSocket, Native Messaging, etc.)
2. Playwright or Puppeteer to drive the browser programmatically
3. A custom bridge between OpenClaw and the browser
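
For the first option, a message channel to the extension, the two sides would need to agree on a message envelope. The sketch below shows one possible shape and is entirely hypothetical: the plugin defines no wire protocol yet, and the field names (`id`, `token`, `tool`, `params`, `result`, `error`) and both function names are assumptions made for illustration.

```javascript
// Hypothetical wire format. Nothing here is defined by the plugin or by Page Agent.
let nextId = 1;

// Serialize a tool call into a JSON message tagged with a request id.
function encodeToolRequest(tool, params, extensionToken) {
  const id = nextId++;
  return { id, payload: JSON.stringify({ id, token: extensionToken, tool, params }) };
}

// Parse a response and return its result, rejecting mismatched ids and errors.
function decodeToolResponse(raw, expectedId) {
  const message = JSON.parse(raw);
  if (message.id !== expectedId) {
    throw new Error(`Response id ${message.id} does not match request id ${expectedId}`);
  }
  if (message.error) {
    throw new Error(`Tool call failed: ${message.error}`);
  }
  return message.result;
}

// Simulated round trip: no socket is opened, the reply is hand-written.
const outgoing = encodeToolRequest(
  "page_agent_navigate",
  { url: "https://alibaba.github.io/page-agent/" },
  "your-extension-token-from-page-agent"
);
const simulatedReply = JSON.stringify({ id: outgoing.id, result: { ok: true } });
console.log(decodeToolResponse(simulatedReply, outgoing.id)); // { ok: true }
```

Tagging each request with an id lets the caller match replies to requests when several tool calls are in flight over one connection.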

## Links

- [Page Agent Documentation](https://alibaba.github.io/page-agent/)
- [Page Agent GitHub](https://github.com/alibaba/page-agent)
- [OpenClaw Documentation](https://docs.openclaw.ai/)