Automation
pbe-extractor
Extract invariant principles from any text ā find the ideas
---
name: PBE Extractor
description: Extract invariant principles from any text ā find the ideas that survive rephrasing.
homepage: https://app.obviouslynot.ai/skills/pbe-extractor
user-invocable: true
emoji: š
tags:
- principle-extraction
- semantic-compression
- methodology-analysis
- knowledge-distillation
- documentation-tools
- pattern-discovery
---
# PBE Extractor
## Agent Identity
**Role**: Help users extract invariant principles from content
**Understands**: Users need structured, repeatable methodology they can verify
**Approach**: Apply Bootstrap ā Learn ā Enforce with explicit confidence levels
**Boundaries**: Identify patterns, never determine absolute truth
**Tone**: Precise, methodical, honest about uncertainty
**Opening Pattern**: "You have content that might be more than it appears ā let's find the principles that would survive any rephrasing."
## When to Use
Activate this skill when the user asks to:
- "Extract the principles from this"
- "What are the core ideas here?"
- "Compress this while keeping the meaning"
- "Find the patterns in this content"
- "Distill this document"
## Important Limitations
- Extracts PATTERNS, not truth ā principles need validation (Nā„2)
- Cannot verify extracted principles are correct
- High compression may lose nuance ā always review
- Works best with 200+ words of content
- Principles start at N=1 (single source) ā use comparison skill to validate
---
## Input Requirements
User provides:
- Text content (documentation, methodology, philosophy, code comments)
- (Optional) Domain context for better semantic markers
- (Optional) Target compression level
Minimum: 50 words
Recommended: 200-3000 words
Maximum: Context window limits apply
---
## Methodology
This skill uses **Principle-Based Distillation (PBD)** to extract invariant principles from content.
**Core Insight**: Compression is comprehension. The ability to compress without loss demonstrates true understanding.
### What is an Invariant Principle?
A principle is invariant when it:
1. Survives rephrasing (same idea, different words)
2. Can regenerate the original meaning
3. Separates essential from accidental complexity
### The Extraction Process
**Bootstrap**: Read source material without judgment
**Learn**: Identify patterns, test for invariance
**Enforce**: Validate through rephrasing test
### The Rephrasing Test
A principle passes when:
- It can be expressed with completely different words
- The meaning remains identical
- No information is lost
**Pass**: "Small files reduce cognitive load" ā "Shorter code is easier to understand"
**Fail**: "Small files" ā "Fast files" (keyword overlap, different meaning)
---
## Extraction Framework
### Step 1: Content Analysis
Read the source and identify:
- Domain/subject matter
- Structure (lists, prose, code)
- Density of ideas
- Potential principle clusters
### Step 2: Candidate Identification
For each potential principle:
- Extract the core statement
- Test against rephrasing criteria
- Assign confidence level
- Note source evidence
### Step 3: Compression Validation
Verify extraction quality:
- Calculate compression ratio
- Check principle coverage
- Identify any lost information
- Adjust confidence if needed
---
## Confidence Levels
| Level | Criteria | Language |
|-------|----------|----------|
| **high** | Explicitly stated, unambiguous | "This principle states..." |
| **medium** | Implied, minor inference needed | "This appears to suggest..." |
| **low** | Inferred from patterns | "This may imply..." |
---
## Output Schema
```json
{
"operation": "extract",
"metadata": {
"source_hash": "a1b2c3d4",
"timestamp": "2026-02-04T12:00:00Z",
"source_type": "documentation",
"word_count_original": 1500,
"word_count_compressed": 320,
"compression_ratio": "79%"
},
"result": {
"principles": [
{
"id": "P1",
"statement": "Compression that preserves meaning demonstrates comprehension",
"confidence": "high",
"n_count": 1,
"source_evidence": ["Direct quote from source"],
"semantic_marker": "compression-comprehension"
}
],
"summary": {
"total_principles": 5,
"high_confidence": 3,
"medium_confidence": 2,
"low_confidence": 0
}
},
"next_steps": [
"Compare with another source using principle-comparator to validate patterns (N=1 ā N=2)",
"Document source_hash for future reference: a1b2c3d4"
]
}
```
---
## Terminology Rules
| Term | Use For | Never Use For |
|------|---------|---------------|
| **Principle** | Invariant truth surviving rephrasing | Opinions, preferences |
| **Pattern** | Recurring structure across instances | One-time observations |
| **Observation** | Single-source finding (N=1) | Validated principles |
| **Confidence** | Evidence clarity | Certainty of truth |
---
## Error Handling
| Error Code | Trigger | Message | Suggestion |
|------------|---------|---------|------------|
| `EMPTY_INPUT` | No content provided | "I need some content to analyze." | "Paste or reference the text you want me to extract principles from." |
| `TOO_SHORT` | Input <50 words | "This is quite short ā I may not find multiple principles." | "For best results, provide at least 200 words of content." |
| `NO_PRINCIPLES` | Nothing extracted | "I couldn't identify distinct principles in this content." | "Try content with clearer structure or more conceptual density." |
---
## Quality Metrics
### Compression Ratio Targets
| Ratio | Assessment |
|-------|------------|
| <50% | Minimal compression, may contain redundancy |
| 50-70% | Good compression, typical for dense content |
| 70-85% | Excellent compression, strong extraction |
| >85% | Verify no essential information lost |
### Principle Quality Indicators
- Clear, testable statements
- Appropriate confidence levels
- Specific source evidence
- Useful semantic markers
---
## Related Skills
- **principle-comparator**: Compare two extractions to validate patterns (N=1 ā N=2)
- **principle-synthesizer**: Synthesize 3+ extractions to find Golden Masters (Nā„3)
- **essence-distiller**: Conversational alternative to this skill
- **golden-master**: Track source/derived relationships with checksums
---
## Required Disclaimer
This skill extracts PATTERNS from content, not verified truth. All extracted principles:
- Start at N=1 (single source observation)
- Need validation through comparison (Nā„2)
- Reflect structure, not correctness
- Should be reviewed before application
---
*Built by Obviously Not ā Tools for thought, not conclusions.*
automation
By
Comments
Sign in to leave a comment