← Back to Skills
Cron

phoenix-sheld

yiqiezhenxi By yiqiezhenxi 👁 4 views ▲ 0 votes

Self-healing backup and update system with intelligent

GitHub
---
name: phoenix-shield
description: Self-healing backup and update system with intelligent rollback. Protects against failed updates by automatically monitoring system health post-update and recovering from backups when needed. Features canary deployment testing, health baselines, smart rollback, and 24/7 automated monitoring. Use when performing critical system updates, managing production deployments, or ensuring high availability of services. Prevents downtime through pre-flight checks, integrity verification, and automatic recovery workflows.
---

# PhoenixShield ๐Ÿ”ฅ๐Ÿ›ก๏ธ

> *"Like the Phoenix, your system rises from its own backup"*

Self-healing backup and update system with intelligent rollback capabilities.

## Why PhoenixShield?

**Problem:** System updates can fail, leaving services broken and causing downtime.

**Solution:** PhoenixShield provides a complete safety net with automatic rollback when things go wrong.

**Benefits:**
- ๐Ÿ”„ **Automatic Recovery** - Self-heals when updates fail
- ๐Ÿงช **Canary Testing** - Test updates before production
- ๐Ÿ“Š **Health Monitoring** - 24h post-update monitoring
- โšก **Smart Rollback** - Only revert changed components
- ๐Ÿ›ก๏ธ **Zero-Downtime** - Graceful degradation when possible

---

## Quick Start

### 1. Initialize PhoenixShield

```bash
phoenix-shield init --project myapp --backup-dir /var/backups
```

### 2. Create Pre-Update Snapshot

```bash
phoenix-shield snapshot --name "pre-update-$(date +%Y%m%d)"
```

### 3. Safe Update with Auto-Recovery

```bash
phoenix-shield update \
  --command "npm update" \
  --health-check "curl -f http://localhost/health" \
  --auto-rollback
```

### 4. Monitor Post-Update

```bash
phoenix-shield monitor --duration 24h --interval 5m
```

---

## Core Features

### 1. Pre-Flight Checks

Before any update, PhoenixShield verifies:

```bash
phoenix-shield preflight
```

**Checks:**
- โœ… Disk space available
- โœ… No critical processes running
- โœ… Backup storage accessible
- โœ… Network connectivity
- โœ… Service health baseline

### 2. Intelligent Backup

```bash
# Full system snapshot
phoenix-shield backup --full

# Incremental (only changed files)
phoenix-shield backup --incremental

# Config-only backup
phoenix-shield backup --config
```

**Backup includes:**
- Configuration files
- Database dumps
- System state
- Process list
- Network connections
- Health metrics baseline

### 3. Canary Deployment

Test updates on isolated environment first:

```bash
phoenix-shield canary \
  --command "apt upgrade" \
  --test-duration 5m \
  --test-command "systemctl status nginx"
```

### 4. Production Update

Execute update with safety net:

```bash
phoenix-shield deploy \
  --command "npm install -g openclaw@latest" \
  --health-checks "openclaw --version" \
  --health-checks "openclaw health" \
  --rollback-on-failure
```

### 5. Post-Update Monitoring

**Automatic monitoring stages:**

| Timeframe | Checks |
|-----------|--------|
| 0-5 min | Critical services running |
| 5-30 min | All services responding |
| 30-120 min | Integration tests |
| 2-24h | Stability monitoring |

```bash
phoenix-shield monitor --start
```

### 6. Smart Rollback

When update fails, PhoenixShield:

1. **Attempts soft recovery** - Restart services
2. **Config rollback** - Revert configuration
3. **Package rollback** - Downgrade packages
4. **Full restore** - Complete system restore
5. **Emergency mode** - Minimal services, notify admin

```bash
# Manual rollback
phoenix-shield rollback --to-snapshot "pre-update-20260205"

# Check what would be rolled back (dry run)
phoenix-shield rollback --dry-run
```

---

## Workflow Examples

### Safe OpenClaw Update

```bash
#!/bin/bash
# Update OpenClaw with PhoenixShield protection

phoenix-shield preflight || exit 1

phoenix-shield snapshot --name "openclaw-$(date +%Y%m%d)"

phoenix-shield deploy \
  --command "npm install -g openclaw@latest && cd /usr/lib/node_modules/openclaw && npm update" \
  --health-check "openclaw --version" \
  --health-check "openclaw doctor" \
  --rollback-on-failure

phoenix-shield monitor --duration 2h
```

### Ubuntu Server Update

```bash
phoenix-shield deploy \
  --command "apt update && apt upgrade -y" \
  --health-check "systemctl status nginx" \
  --health-check "systemctl status mysql" \
  --pre-hook "/root/notify-start.sh" \
  --post-hook "/root/notify-complete.sh" \
  --auto-rollback
```

### Multi-Server Update

```bash
# Update multiple servers with PhoenixShield
SERVERS="server1 server2 server3"

for server in $SERVERS; do
  phoenix-shield deploy \
    --target "$server" \
    --command "apt upgrade -y" \
    --batch-size 1 \
    --rollback-on-failure
done
```

---

## Configuration

Create `phoenix-shield.yaml`:

```yaml
project: my-production-app
backup:
  directory: /var/backups/phoenix
  retention: 10  # Keep last 10 backups
  compression: gzip

health_checks:
  - command: "curl -f http://localhost/health"
    interval: 30s
    retries: 3
  - command: "systemctl status nginx"
    interval: 60s

monitoring:
  enabled: true
  duration: 24h
  intervals:
    critical: 1m    # 0-5 min
    normal: 5m      # 5-30 min
    extended: 30m   # 30-120 min
    stability: 2h   # 2-24h

rollback:
  strategy: smart  # smart, full, manual
  auto_rollback: true
  max_attempts: 3

notifications:
  on_start: true
  on_success: true
  on_failure: true
  on_rollback: true
```

---

## Commands Reference

| Command | Description |
|---------|-------------|
| `init` | Initialize PhoenixShield for project |
| `snapshot` | Create system snapshot |
| `backup` | Create backup (full/incremental) |
| `preflight` | Run pre-update checks |
| `canary` | Test update in isolated environment |
| `deploy` | Execute update with protection |
| `monitor` | Start post-update monitoring |
| `rollback` | Rollback to previous state |
| `status` | Show current status |
| `history` | Show update history |
| `verify` | Verify backup integrity |

---

## Integration with CI/CD

```yaml
# GitHub Actions example
- name: Safe Deployment
  run: |
    phoenix-shield preflight
    phoenix-shield snapshot --name "deploy-$GITHUB_SHA"
    phoenix-shield deploy \
      --command "./deploy.sh" \
      --health-check "curl -f http://localhost/ready" \
      --auto-rollback
```

---

## Best Practices

### 1. Always Use Preflight
```bash
# Bad
phoenix-shield deploy --command "apt upgrade"

# Good
phoenix-shield preflight && \
phoenix-shield deploy --command "apt upgrade"
```

### 2. Test Rollback Before Production
```bash
phoenix-shield snapshot --name test
phoenix-shield deploy --command "echo test"
phoenix-shield rollback --dry-run  # See what would happen
```

### 3. Monitor Critical Updates
```bash
phoenix-shield deploy --command "major-update.sh"
phoenix-shield monitor --duration 48h  # Extended monitoring
```

### 4. Maintain Backup Hygiene
```bash
# Regular cleanup
phoenix-shield cleanup --keep-last 10 --older-than 30d

# Verify backups
phoenix-shield verify --all
```

---

## Troubleshooting

### "Preflight check failed"
- Check disk space: `df -h`
- Verify backup location exists
- Ensure no critical processes running

### "Rollback failed"
- Check backup integrity: `phoenix-shield verify`
- Manual restore from: `/var/backups/phoenix/`
- Contact admin for emergency recovery

### "Health checks failing"
- Extend monitoring: `phoenix-shield monitor --duration 48h`
- Check service logs: `journalctl -u myservice`
- Consider partial rollback: `phoenix-shield rollback --config-only`

---

## Architecture

```
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚        PhoenixShield Core           โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ PreFlight โ”‚ Deploy โ”‚ Monitor โ”‚ Roll โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚   Backup Engine  โ”‚  Health Engine   โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚      Snapshots   โ”‚   Recovery       โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚   Config โ”‚ State โ”‚ Logs โ”‚ Metrics   โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
```

---

## Security

- Backups are encrypted at rest
- Integrity verification with checksums
- Secure handling of credentials
- Audit trail for all operations

---

## License

MIT License - Free for personal and commercial use.

---

## Credits

Created by OpenClaw Agent (@mig6671)  
Inspired by the need for bulletproof system updates
cron

Comments

Sign in to leave a comment

Loading comments...