# Agent Observability Dashboard
Unified observability for OpenClaw agents – metrics, traces, and performance insights.
## What It Does
OpenClaw agents need production-grade visibility. Several observability platforms exist (Langfuse, LangSmith, AgentOps), but none provides a unified view of OpenClaw agents.
**Agent Observability Dashboard** provides:
- **Metrics tracking** – Latency, success rate, token usage, error counts
- **Trace visualization** – Tool chains, decision flows, session timelines
- **Cross-agent aggregation** – Compare performance across multiple agents/sessions
- **Exportable reports** – JSON, CSV, Markdown for human review
- **Alert thresholds** – Notify when a metric exceeds its limit (see the sketch after this list)
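Conceptually, an alert threshold is just a comparison of an observed metric against a configured limit. A minimal sketch in Python (the `AlertRule` class and `check` helper are illustrative names, not the shipped API):

```python
# Minimal sketch of an alert-threshold check.
# AlertRule and check() are hypothetical names for illustration only.
from dataclasses import dataclass

@dataclass
class AlertRule:
    metric: str       # e.g. "latency"
    threshold: float  # alert fires when the observed value exceeds this

def check(rule: AlertRule, observed: float) -> bool:
    """Return True if the observed value breaches the rule's threshold."""
    return observed > rule.threshold

rule = AlertRule(metric="latency", threshold=5.0)
if check(rule, observed=6.2):
    print(f"ALERT: {rule.metric} exceeded threshold {rule.threshold}")
```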
## Problem It Solves
- No centralized view of OpenClaw agent performance
- Hard to debug across multiple tool calls
- No way to compare agents or track regressions
- Production services already get enterprise-grade monitoring; agents need the same
## Usage
```bash
# Start dashboard server
python3 scripts/observability.py --dashboard
# Record metrics from a session
python3 scripts/observability.py --record --session agent:main --latency 1.5 --success true
# View session trace
python3 scripts/observability.py --trace --session agent:main:12345
# Get performance report
python3 scripts/observability.py --report --period 24h
# Export to CSV
python3 scripts/observability.py --export metrics.csv
# Set alert thresholds
python3 scripts/observability.py --alert --metric latency --threshold 5.0
```
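Agents can also record metrics from their own code by shelling out to the CLI above. A minimal sketch, assuming the flags behave as in the usage examples and that `--latency` takes seconds (to match `--latency 1.5` above); the `timed_call` wrapper is hypothetical:

```python
# Sketch: wrap a tool call and record its latency/outcome via the CLI.
# Assumes the --record flags shown in the usage section; latency is
# assumed to be in seconds, matching "--latency 1.5" above.
import subprocess
import time

def timed_call(tool_fn, *args, session="agent:main"):
    """Run a tool call, then record its latency and success/failure."""
    start = time.monotonic()
    try:
        result = tool_fn(*args)
        ok = True
    except Exception:
        result, ok = None, False
    latency_s = time.monotonic() - start
    subprocess.run([
        "python3", "scripts/observability.py", "--record",
        "--session", session,
        "--latency", f"{latency_s:.2f}",
        "--success", str(ok).lower(),
    ], check=False)
    return result
```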
## Metrics Tracked
| Category | Metric | Description |
|-----------|---------|-------------|
| **Performance** | Latency | Tool call latency (ms) |
| | Throughput | Calls per second |
| **Success** | Success Rate | % of successful tool calls |
| | Error Count | Failed operations |
| **Cost** | Token Usage | Input + output tokens |
| | API Cost | Estimated cost in USD |
| **Quality** | Hallucinations | Detected false outputs |
| | Corrections Needed | User corrections |
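As an illustration of how the Cost row might be derived from token counts, here is a small estimate. The per-token prices are placeholders, not real provider pricing:

```python
# Sketch: deriving the "API Cost" metric from token usage.
# The prices below are placeholders, not actual provider rates.
INPUT_PRICE_PER_1K = 0.003   # placeholder USD per 1K input tokens
OUTPUT_PRICE_PER_1K = 0.015  # placeholder USD per 1K output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate API cost in USD from input/output token counts."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

print(f"${estimate_cost(1200, 400):.4f}")  # -> $0.0096
```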
## Trace Format
Each tool call is logged with:
- Timestamp
- Agent session ID
- Tool name + parameters
- Latency
- Success/failure
- Token usage
- Error details (if failed)
Example trace:
```json
{
"session_id": "agent:main:12345",
"trace": [
{
"timestamp": "2026-01-31T14:00:00Z",
"tool": "web_search",
"params": {"query": "agent observability"},
"latency_ms": 1234,
"success": true,
"tokens_used": 150
},
{
"timestamp": "2026-01-31T14:00:02Z",
"tool": "memory_write",
"params": {"content": "..."},
"latency_ms": 45,
"success": true,
"tokens_used": 0
}
]
}
```
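A trace entry like the ones above can be assembled in a few lines. This sketch mirrors the field names from the example; the `trace_entry` helper itself is illustrative, not part of the tool:

```python
# Sketch: building one trace entry in the format shown above.
# Field names match the example; trace_entry() is a hypothetical helper.
import json
from datetime import datetime, timezone

def trace_entry(tool, params, latency_ms, success, tokens_used, error=None):
    entry = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "tool": tool,
        "params": params,
        "latency_ms": latency_ms,
        "success": success,
        "tokens_used": tokens_used,
    }
    if error is not None:
        entry["error"] = error  # error details included only on failure
    return entry

print(json.dumps(
    trace_entry("web_search", {"query": "agent observability"}, 1234, True, 150),
    indent=2,
))
```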
## Architecture
```
┌─────────────────┐
│ Instrumentation │ ← Auto-capture from OpenClaw logs
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Metrics Store  │ ← SQLite/InfluxDB for time-series
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    Analytics    │ ← Aggregations, trends, anomalies
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Dashboard UI   │ ← Web interface (Flask/FastAPI)
└─────────────────┘
```
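The Metrics Store layer could be backed by a single SQLite table. A minimal sketch (the schema below is an assumption; the project may lay out its tables differently):

```python
# Sketch of the Metrics Store layer using SQLite.
# The table layout is an assumption for illustration, not the shipped schema.
import sqlite3

conn = sqlite3.connect("observability.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS metrics (
        ts          TEXT NOT NULL,   -- ISO-8601 timestamp
        session_id  TEXT NOT NULL,   -- e.g. agent:main:12345
        tool        TEXT NOT NULL,
        latency_ms  REAL,
        success     INTEGER,         -- 1 = success, 0 = failure
        tokens_used INTEGER
    )
""")
conn.execute(
    "INSERT INTO metrics VALUES (?, ?, ?, ?, ?, ?)",
    ("2026-01-31T14:00:00Z", "agent:main:12345", "web_search", 1234.0, 1, 150),
)
conn.commit()
conn.close()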
## Requirements
- Python 3.9+
- flask (for dashboard web UI)
- pandas (for analytics)
- influxdb-client (optional, for production storage)
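To show where pandas fits in the analytics layer, here is one way a per-tool report could be aggregated. Column names follow the trace format above; the code is illustrative, not the shipped implementation:

```python
# Sketch: aggregating recorded metrics into a per-tool report with pandas.
# Column names mirror the trace format; sample rows are made up.
import pandas as pd

df = pd.DataFrame([
    {"tool": "web_search",   "latency_ms": 1234, "success": True,  "tokens_used": 150},
    {"tool": "memory_write", "latency_ms": 45,   "success": True,  "tokens_used": 0},
    {"tool": "web_search",   "latency_ms": 2100, "success": False, "tokens_used": 90},
])

report = df.groupby("tool").agg(
    calls=("tool", "size"),
    avg_latency_ms=("latency_ms", "mean"),
    success_rate=("success", "mean"),
    total_tokens=("tokens_used", "sum"),
)
print(report)
report.to_csv("metrics.csv")  # same shape of output as the --export example
```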
## Installation
```bash
# Clone repo
git clone https://github.com/orosha-ai/agent-observability-dashboard
cd agent-observability-dashboard
# Install dependencies
pip install flask pandas influxdb-client
# Run dashboard
python3 scripts/observability.py --dashboard
# Open http://localhost:5000
```
## Inspiration
- **Dynatrace AI Observability App** – Enterprise-grade unified observability
- **Langfuse vs AgentOps benchmarks** – Comparison of platforms
- **Microsoft .NET tracing guide** – Practical implementation patterns
- **OpenLLMetry** – OpenTelemetry integration for LLMs
## Local-Only Promise
- Metrics stored locally (SQLite/InfluxDB)
- Dashboard runs locally
- No data sent to external services
## Version History
- **v0.1** – MVP: metrics tracking, trace visualization, dashboard UI
- Roadmap: InfluxDB integration, anomaly detection, multi-agent comparison