Media
tldw
**too long; didn't watch**
# tldw - YouTube Video Summarizer
**too long; didn't watch**
Extract and summarize YouTube video transcripts quickly and efficiently.
## Overview
The tldw skill takes a YouTube URL, extracts the video transcript, and provides a comprehensive summary of the content. This allows you to quickly understand video content without watching the entire video.
### Purpose
This skill solves the problem of information overload from video content. Instead of spending 10-60 minutes watching a video, you can get the key points, main arguments, and conclusions in a concise summary within seconds.
### When to Use
Use this skill when:
- A user provides a YouTube video URL and asks for a summary
- You need to quickly understand video content without watching it
- You want to analyze or reference specific video content
- You need to extract information from educational, news, or documentary videos
### How It Works
1. **Extraction**: Uses yt-dlp to download video transcripts (captions/subtitles)
2. **Cleaning**: Applies deduplication to remove artifacts from auto-generated captions
3. **Processing**: Analyzes the cleaned transcript directly in the main agent session
4. **Summary**: Returns a structured summary with main points, key arguments, and conclusions
### Key Features
- **Caching**: Downloaded transcripts are cached locally to avoid re-downloading
- **Deduplication**: Removes duplicate lines common in auto-generated captions
- **Multi-format support**: Works with VTT, SRT, and JSON caption formats
- **Cookie support**: Can access age-restricted content with a cookie file
- **Comprehensive summaries**: Provides thesis, key examples, comparisons, and conclusions
- **Fast processing**: Typical videos summarized in seconds
### Attribution
This skill is based on the [tldw project](https://github.com/rmusser01/tldw) by stong. Full attribution and licensing details are available in [ATTRIBUTION.md](ATTRIBUTION.md).
---
## Requirements
### System Requirements
- **Python**: 3.8 or higher
- **Disk space**: ~60MB for virtual environment and dependencies, plus additional space for transcript cache
### Required Dependencies
The skill uses a Python virtual environment with the following dependencies:
- **yt-dlp**: Video transcript downloader (installed via pip)
- **Python standard library**: json, re, argparse (built-in)
All dependencies are installed in the local virtual environment at `venv/`.
### Optional Dependencies
- **Cookie file**: For accessing age-restricted or members-only content
- Format: Netscape cookie format (can be exported from browser)
- Place in skill directory and reference with `--cookies` flag
### Directory Structure
```
tldw/
├── SKILL.md # This documentation
├── ATTRIBUTION.md # Credit to original project
├── LICENSE # AGPL-3.0 license
├── scripts/
│ └── extract_transcript.py # Main extraction script
├── cache/ # Cached transcripts (auto-created)
└── venv/ # Python virtual environment
├── bin/
│ └── yt-dlp # Video transcript downloader
└── lib/ # Python packages
```
### Setup
Follow these steps to set up the tldw skill:
1. **Navigate to the skill directory:**
```bash
cd tldw/
```
2. **Create a Python virtual environment:**
```bash
python3 -m venv venv
```
3. **Install dependencies:**
```bash
venv/bin/pip install yt-dlp webvtt-py
```
4. **Verify installation:**
```bash
venv/bin/yt-dlp --version
```
5. **The cache directory will be created automatically on first use**
The skill is now ready to use!
---
## Usage Instructions
### Basic Usage
When a user provides a YouTube URL and requests a summary, use the following workflow:
1. Extract the transcript using the extraction script
2. Parse the JSON output to get the cleaned transcript
3. Summarize the transcript directly (do not use sub-agents for large transcripts)
4. Return a structured summary to the user
### Command Syntax
```bash
cd tldw/ && \
venv/bin/python scripts/extract_transcript.py \
--json --cache-dir cache "YOUTUBE_URL"
```
### Processing the Output
The script returns JSON with the following structure:
```json
{
"transcript": "Full cleaned transcript text...",
"video_id": "video_id_here",
"title": "Video Title",
"description": "Video description...",
"duration": 1234,
"uploader": "Channel Name",
"upload_date": "20260101",
"view_count": 12345,
"webpage_url": "https://www.youtube.com/watch?v=..."
}
```
Extract the `transcript` field and process it directly to create a comprehensive summary.
### Command Options
- `--json`: Output in JSON format (recommended for parsing)
- `--cache-dir <path>`: Specify cache directory (default: `cache/`)
- `--cookies <file>`: Path to Netscape-format cookie file for age-restricted content
### Example Workflow
```bash
# 1. Extract transcript
cd tldw/ && \
venv/bin/python scripts/extract_transcript.py \
--json --cache-dir cache "https://www.youtube.com/watch?v=VIDEO_ID"
# 2. Parse the JSON output and extract the transcript field
# 3. Summarize the transcript directly (include main points, key arguments, conclusions)
# 4. Return formatted summary to user
```
### Accessing Age-Restricted Content
For age-restricted or members-only videos, export cookies from your browser:
1. Install a browser extension like "Get cookies.txt LOCALLY"
2. Navigate to YouTube while logged in
3. Export cookies in Netscape format
4. Save to the tldw directory (e.g., `youtube_cookies.txt`)
5. Use with: `--cookies youtube_cookies.txt`
---
## Error Handling
### No Captions Available
**Error message:** `"No subtitles/captions found"`
**What it means:** The video has no auto-generated or manual captions available.
**Solution:** Inform the user that the video cannot be transcribed because it lacks captions.
### Invalid URL
**Error message:** `"ERROR: unable to download video data"`
**What it means:** The URL is malformed, the video doesn't exist, or it's private/deleted.
**Solution:** Verify the URL is correct and check if the video is publicly accessible.
### Age-Restricted Content
**Error message:** `"Sign in to confirm your age"` or similar authentication errors
**What it means:** The video requires age verification or YouTube login.
**Solution:** Use the `--cookies` flag with exported browser cookies (see "Accessing Age-Restricted Content" above).
### Network/Connection Errors
**Error messages:** `"Unable to download"`, `"Connection timeout"`, extraction failures
**What it means:** Network issues, YouTube blocking the request, or **outdated yt-dlp** that's incompatible with current YouTube.
**Solution:**
1. **First, update yt-dlp:**
```bash
cd tldw/ && \
venv/bin/pip install --upgrade yt-dlp
```
2. Retry the extraction
3. If still failing: check internet connection or wait and try later
YouTube frequently changes their API, so keeping yt-dlp updated is essential.
### Cache Issues
**Symptoms:** Permission errors, disk full errors
**What it means:** The cache directory has permission problems or insufficient disk space.
**Solution:** Check available disk space with `df -h` and verify write permissions on the `cache/` directory.
### Large Transcript Handling
**Note:** Transcripts over 50,000 characters may take longer to process.
**Best practice:** Process large transcripts directly in the main agent session. Do not delegate to sub-agents, as they have been found unreliable with large payloads.
### Debugging
To see full error output (not just the last 100 lines):
```bash
cd tldw/ && \
venv/bin/python scripts/extract_transcript.py \
--json --cache-dir cache "YOUTUBE_URL"
```
To inspect cached transcripts:
```bash
ls -lh tldw/cache/
```
---
## Limitations
### Caption Dependency
- The skill **only works with videos that have captions/subtitles** available
- Cannot transcribe videos with only audio (no built-in speech-to-text capability)
- Auto-generated captions may contain errors, typos, or timing artifacts
- Deduplication helps clean up auto-generated caption issues but isn't perfect
### Language Support
- Depends on available caption languages provided by YouTube
- The script extracts whatever captions are available (auto-generated or manual)
- Non-English captions work, but summarization quality depends on the language model's capabilities
- English captions typically provide the best results
### Video Length
- Very long videos (2+ hours) may produce extremely large transcripts (100k+ characters)
- Processing time increases proportionally with transcript length
- No hard limit, but practical considerations for context window and processing time apply
### YouTube-Only
- Currently **only supports YouTube URLs**
- Does not work with other video platforms (Vimeo, Dailymotion, TikTok, etc.)
- While yt-dlp supports many platforms, this script is optimized specifically for YouTube
### Private/Restricted Content
- Cannot access truly private videos (not shared publicly)
- Members-only or channel membership content requires cookies from an authenticated session
- Live streams may not have captions available until after the stream has ended
- Some geographic restrictions cannot be bypassed even with cookies
### Deduplication Limitations
- The deduplication logic removes **consecutive duplicate lines**
- May occasionally remove legitimate repeated phrases or refrains
- Designed primarily for auto-generated caption artifacts, not all repetition scenarios
- Manual captions typically don't need deduplication
### No Audio Extraction
- This skill extracts **text transcripts only**, not audio files
- For audio extraction or processing, other tools (like yt-dlp directly with audio flags) would be needed
- Focus is on text-based content analysis, not media files
---
media
By
Comments
Sign in to leave a comment