Browser
computer-use
Full desktop computer use for headless Linux servers and VPS.
---
name: computer-use
description: Full desktop computer use for headless Linux servers and VPS. Creates a virtual display (Xvfb + XFCE) to control GUI applications without a physical monitor. Screenshots, mouse clicks, keyboard input, scrolling, dragging β all 17 standard actions. Includes flicker-free VNC setup for live remote viewing. Model-agnostic, works with any LLM.
version: 1.2.0
---
# Computer Use Skill
Full desktop GUI control for headless Linux servers. Creates a virtual display (Xvfb + XFCE) so you can run and control desktop applications on VPS/cloud instances without a physical monitor.
## Environment
- **Display**: `:99`
- **Resolution**: 1024x768 (XGA, Anthropic recommended)
- **Desktop**: XFCE4 (minimal β xfwm4 + panel only)
## Quick Setup
Run the setup script to install everything (systemd services, flicker-free VNC):
```bash
./scripts/setup-vnc.sh
```
This installs:
- Xvfb virtual display on `:99`
- Minimal XFCE desktop (xfwm4 + panel, no xfdesktop)
- x11vnc with stability flags
- noVNC for browser access
All services auto-start on boot and auto-restart on crash.
## Actions Reference
| Action | Script | Arguments | Description |
|--------|--------|-----------|-------------|
| screenshot | `screenshot.sh` | β | Capture screen β base64 PNG |
| cursor_position | `cursor_position.sh` | β | Get current mouse X,Y |
| mouse_move | `mouse_move.sh` | x y | Move mouse to coordinates |
| left_click | `click.sh` | x y left | Left click at coordinates |
| right_click | `click.sh` | x y right | Right click |
| middle_click | `click.sh` | x y middle | Middle click |
| double_click | `click.sh` | x y double | Double click |
| triple_click | `click.sh` | x y triple | Triple click (select line) |
| left_click_drag | `drag.sh` | x1 y1 x2 y2 | Drag from start to end |
| left_mouse_down | `mouse_down.sh` | β | Press mouse button |
| left_mouse_up | `mouse_up.sh` | β | Release mouse button |
| type | `type_text.sh` | "text" | Type text (50 char chunks, 12ms delay) |
| key | `key.sh` | "combo" | Press key (Return, ctrl+c, alt+F4) |
| hold_key | `hold_key.sh` | "key" secs | Hold key for duration |
| scroll | `scroll.sh` | dir amt [x y] | Scroll up/down/left/right |
| wait | `wait.sh` | seconds | Wait then screenshot |
| zoom | `zoom.sh` | x1 y1 x2 y2 | Cropped region screenshot |
## Usage Examples
```bash
export DISPLAY=:99
# Take screenshot
./scripts/screenshot.sh
# Click at coordinates
./scripts/click.sh 512 384 left
# Type text
./scripts/type_text.sh "Hello world"
# Press key combo
./scripts/key.sh "ctrl+s"
# Scroll down
./scripts/scroll.sh down 5
```
## Workflow Pattern
1. **Screenshot** β Always start by seeing the screen
2. **Analyze** β Identify UI elements and coordinates
3. **Act** β Click, type, scroll
4. **Screenshot** β Verify result
5. **Repeat**
## Tips
- Screen is 1024x768, origin (0,0) at top-left
- Click to focus before typing in text fields
- Use `ctrl+End` to jump to page bottom in browsers
- Most actions auto-screenshot after 2 sec delay
- Long text is chunked (50 chars) with 12ms keystroke delay
## Live Desktop Viewing (VNC)
Watch the desktop in real-time via browser or VNC client.
### Connect via Browser
```bash
# SSH tunnel (run on your local machine)
ssh -L 6080:localhost:6080 your-server
# Open in browser
http://localhost:6080/vnc.html
```
### Connect via VNC Client
```bash
# SSH tunnel
ssh -L 5900:localhost:5900 your-server
# Connect VNC client to localhost:5900
```
### SSH Config (recommended)
Add to `~/.ssh/config` for automatic tunneling:
```
Host your-server
HostName your.server.ip
User your-user
LocalForward 6080 127.0.0.1:6080
LocalForward 5900 127.0.0.1:5900
```
Then just `ssh your-server` and VNC is available.
## System Services
```bash
# Check status
systemctl status xvfb xfce-minimal x11vnc novnc
# Restart if needed
sudo systemctl restart xvfb xfce-minimal x11vnc novnc
```
### Service Chain
```
xvfb β xfce-minimal β x11vnc β novnc
```
- **xvfb**: Virtual display :99 (1024x768x24)
- **xfce-minimal**: Watchdog that runs xfwm4+panel, kills xfdesktop
- **x11vnc**: VNC server with `-noxdamage` for stability
- **novnc**: WebSocket proxy with heartbeat for connection stability
## Opening Applications
```bash
export DISPLAY=:99
google-chrome --no-sandbox & # Chrome (recommended)
xfce4-terminal & # Terminal
thunar & # File manager
```
**Note**: Snap browsers (Firefox, Chromium) have sandbox issues on headless servers. Use Chrome `.deb` instead:
```bash
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo dpkg -i google-chrome-stable_current_amd64.deb
sudo apt-get install -f
```
## Manual Setup
If you prefer manual setup instead of `setup-vnc.sh`:
```bash
# Install packages
sudo apt install -y xvfb xfce4 xfce4-terminal xdotool scrot imagemagick dbus-x11 x11vnc novnc websockify
# Copy service files
sudo cp systemd/*.service /etc/systemd/system/
# Edit xfce-minimal.service: replace %USER% and %SCRIPT_PATH%
sudo nano /etc/systemd/system/xfce-minimal.service
# Mask xfdesktop (prevents VNC flickering)
sudo mv /usr/bin/xfdesktop /usr/bin/xfdesktop.real
echo -e '#!/bin/bash\nexit 0' | sudo tee /usr/bin/xfdesktop
sudo chmod +x /usr/bin/xfdesktop
# Enable and start
sudo systemctl daemon-reload
sudo systemctl enable --now xvfb xfce-minimal x11vnc novnc
```
## Troubleshooting
### VNC shows black screen
- Check if xfwm4 is running: `pgrep xfwm4`
- Restart desktop: `sudo systemctl restart xfce-minimal`
### VNC flickering/flashing
- Ensure xfdesktop is masked (check `/usr/bin/xfdesktop`)
- xfdesktop causes flicker due to clearβdraw cycles on Xvfb
### VNC disconnects frequently
- Check noVNC has `--heartbeat 30` flag
- Check x11vnc has `-noxdamage` flag
### x11vnc crashes (SIGSEGV)
- Add `-noxdamage -noxfixes` flags
- The DAMAGE extension causes crashes on Xvfb
## Requirements
Installed by `setup-vnc.sh`:
```bash
xvfb xfce4 xfce4-terminal xdotool scrot imagemagick dbus-x11 x11vnc novnc websockify
```
browser
By
Comments
Sign in to leave a comment