amazon-competitor-analyzer

By phheng

Scrapes Amazon product data from ASINs

---
name: amazon-competitor-analyzer
description: Scrapes Amazon product data from ASINs using the browseract.com automation API and performs in-depth competitive analysis. Compares specifications, pricing, review quality, and visual strategies to identify competitor moats and vulnerabilities.
---

# Amazon Competitor Analyzer

This skill scrapes Amazon product data from user-provided ASINs using browseract.com's browser automation API and performs deep competitive analysis. It compares specifications, pricing, review quality, and visual strategies to identify competitor moats and vulnerabilities.

## When to Use This Skill

- Competitive research: Input multiple ASINs to understand market landscape
- Pricing strategy analysis: Compare price bands across similar products
- Specification benchmarking: Deep dive into technical specs and feature differences
- Review insights: Analyze review quality, quantity, and sentiment patterns
- Visual strategy research: Evaluate main images, A+ content, and brand visuals
- Market opportunity discovery: Identify gaps and potential threats
- Product optimization: Develop optimization strategies based on competitor analysis
- New product research: Support new product development with market data

## What This Skill Does

1. **ASIN Data Collection**: Automatically extract product title, price, rating, review count, images, and core data using BrowserAct workflow templates
2. **Specification Extraction**: Deep extraction of technical specs, features, and materials
3. **Review Quality Analysis**: Analyze review patterns, keywords, and sentiment
4. **Visual Strategy Assessment**: Evaluate main images, A+ page design, and brand consistency
5. **Multi-Dimensional Comparison**: Side-by-side comparison of key metrics across products
6. **Moat Identification**: Identify core competitive advantages and barriers
7. **Vulnerability Discovery**: Find competitor weaknesses and market opportunities
8. **Structured Output**: Generate JSON and Markdown analysis reports
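The multi-dimensional comparison in steps 5 and 8 can be sketched as a small helper that renders scraped product records side by side. This is an illustrative sketch only: the field names below (`asin`, `title`, `price`, `rating`, `review_count`) are assumptions, not the template's actual output schema.

```python
def comparison_table(products):
    """Render scraped product dicts as a side-by-side Markdown comparison table.

    Assumes each dict carries 'asin', 'title', 'price', 'rating', and
    'review_count' keys; adapt the field names to the real template output.
    """
    headers = ["ASIN", "Title", "Price", "Rating", "Reviews"]
    rows = ["| " + " | ".join(headers) + " |",
            "| " + " | ".join("---" for _ in headers) + " |"]
    for p in products:
        rows.append("| {asin} | {title} | {price} | {rating} | {review_count} |".format(**p))
    return "\n".join(rows)
```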

## Features

1. **Stable, accurate extraction with no hallucinations**: Pre-set workflows return deterministic data rather than AI-generated guesses.
2. **No CAPTCHA challenges**: Built-in bypass mechanisms eliminate the need to handle reCAPTCHA or other verification challenges.
3. **No IP restrictions or geofencing**: Overcomes geographic IP limitations for stable global access.
4. **Faster execution**: Tasks complete more quickly than purely AI-driven browser automation solutions.
5. **Exceptional cost efficiency**: Significantly reduces data acquisition costs compared to token-intensive AI solutions.

## Prerequisites

### 1. BrowserAct.com Account Setup

You need a BrowserAct.com account and API key:

1. Visit [browseract.com](https://browseract.com)
2. Sign up for an account
3. Navigate to API settings
4. Generate an API key
5. Store your API key securely (environment variables recommended)

### 2. Environment Configuration

Set your API key as an environment variable:

```bash
export BROWSERACT_API_KEY="your-api-key-here"
```

Or create a `.env` file:

```
BROWSERACT_API_KEY=your-api-key-here
```
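A minimal, stdlib-only sketch of reading the key at runtime (if you use a `.env` file, a loader such as `python-dotenv` can populate the environment first):

```python
import os

def get_api_key():
    """Fetch the BrowserAct API key from the environment; fail fast if missing."""
    key = os.getenv("BROWSERACT_API_KEY")
    if not key:
        raise RuntimeError("BROWSERACT_API_KEY is not set; see Prerequisites")
    return key
```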

## How to Use

### Basic Competitor Analysis

```
Analyze the following Amazon ASIN: B09XYZ1234
```

```
Compare these three products: B07ABC1111, B07DEF2222, B07GHI3333
```

### Deep Specification Comparison

```
Analyze the technical specification differences: B09XYZ1234, B09ABC1111
```

### Review Quality Analysis

```
Analyze review quality and feedback: B09XYZ1234, B07DEF2222
```

### Visual Strategy Research

```
Research main image and visual presentation strategies: B09XYZ1234, B09ABC1111
```

### Complete Competitive Analysis

```
Analyze competitor landscape: B09XYZ1234, B07DEF2222, B07GHI3333, B09JKL4444
```

## Instructions

When a user requests Amazon competitor analysis:

### 1. ASIN Identification and Validation

Identify ASINs from user input:

- **ASIN Format**: 10-character alphanumeric identifier (e.g., B09XYZ1234)
- **Validation**: Check format compliance with Amazon ASIN standards
- **URL Parsing**: Extract ASIN from Amazon product URLs
- **Error Handling**: Prompt user to correct invalid ASINs
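The validation and URL-parsing steps above can be sketched with a small helper. The regex and URL patterns below reflect common Amazon URL conventions (`/dp/` and `/gp/product/`), not an official specification:

```python
import re

# 10 uppercase alphanumeric characters, per Amazon's ASIN format
ASIN_RE = re.compile(r"^[A-Z0-9]{10}$")

def extract_asin(text):
    """Return a validated ASIN from a bare ID or an Amazon product URL, else None."""
    text = text.strip()
    if ASIN_RE.match(text):
        return text
    # Product URLs usually embed the ASIN after /dp/ or /gp/product/
    m = re.search(r"/(?:dp|gp/product)/([A-Z0-9]{10})", text)
    return m.group(1) if m else None
```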

### 2. BrowserAct API Implementation

```python
"""
BrowserAct API - Run Template Task and Wait for Completion
Scenarios for beginners - Synchronous task execution with official templates
"""
import os
import time
import traceback
import json
import requests

# ============ Configuration Area ============
# API Key - Get from: https://www.browseract.com/reception/integrations
API_KEY = os.getenv("BROWSERACT_API_KEY", "your-api-key-here")

# Workflow Template ID for Amazon product scraping
# You can get it from:
# - Run: python Workflow-Python/11.list_official_workflow_templates.py
# - Or visit: https://www.browseract.com/template?platformType=0
WORKFLOW_TEMPLATE_ID = "77814333389670716"

# Polling configuration
POLL_INTERVAL = 5  # Check task status every 5 seconds
MAX_WAIT_TIME = 1800  # Maximum wait time: 30 minutes (1800 seconds)

API_BASE_URL = "https://api.browseract.com/v2/workflow"


def create_input_parameters(asins):
    """Create input parameters for the workflow template"""
    return [
        {
            "name": "ASIN",
            "value": asin.strip()
        }
        for asin in asins if asin.strip()
    ]


def run_task_by_template(workflow_template_id, input_parameters):
    """Start a task using template"""
    headers = {
        "Authorization": f"Bearer {API_KEY}"
    }
    
    data = {
        "workflow_template_id": workflow_template_id,
        "input_parameters": input_parameters,
    }
    
    api_url = f"{API_BASE_URL}/run-task-by-template"
    response = requests.post(api_url, json=data, headers=headers)
    
    if response.status_code == 200:
        result = response.json()
        task_id = result["id"]
        print(f"Task started successfully, Task ID: {task_id}")
        if "profileId" in result:
            print(f"   Profile ID: {result['profileId']}")
        return task_id
    else:
        print(f"Failed to start task: {response.json()}")
        return None


def get_task_status(task_id):
    """Get task status"""
    headers = {
        "Authorization": f"Bearer {API_KEY}"
    }
    
    api_url = f"{API_BASE_URL}/get-task-status?task_id={task_id}"
    try:
        response = requests.get(api_url, headers=headers, timeout=30)
        
        if response.status_code == 200:
            return response.json().get("status")
        else:
            print(f"Failed to get task status: {response.json()}")
            return None
    except (requests.exceptions.SSLError, requests.exceptions.ConnectionError, 
            requests.exceptions.Timeout, requests.exceptions.RequestException) as e:
        # Network error, will retry in next polling cycle
        return None


def get_task(task_id):
    """Get detailed task information and results"""
    headers = {
        "Authorization": f"Bearer {API_KEY}"
    }
    
    api_url = f"{API_BASE_URL}/get-task?task_id={task_id}"
    try:
        response = requests.get(api_url, headers=headers, timeout=30)
        
        if response.status_code == 200:
            return response.json()
        else:
            print(f"Failed to get task details: {response.json()}")
            return None
    except (requests.exceptions.SSLError, requests.exceptions.ConnectionError, 
            requests.exceptions.Timeout, requests.exceptions.RequestException) as e:
        print(f"Network error while getting task details: {type(e).__name__}")
        return None


def wait_for_task_completion(task_id):
    """Wait for task completion with progress updates"""
    start_time = time.time()
    previous_status = None
    
    print(f"Waiting for task completion (max wait time: {MAX_WAIT_TIME // 60} minutes)...")
    
    while True:
        # Check if timeout
        elapsed_time = time.time() - start_time
        if elapsed_time > MAX_WAIT_TIME:
            print(f"Wait timeout (waited {elapsed_time:.0f} seconds)")
            return None
        
        # Get task status
        status = get_task_status(task_id)
        
        if status is None:
            # Network error or API error, continue waiting
            elapsed = int(elapsed_time)
            print(f"   Network error, retrying... (waited {elapsed} seconds)", end="\r")
        elif status == "finished":
            print(f"Task completed successfully!")
            return "finished"
        elif status == "failed":
            print(f"Task execution failed")
            return "failed"
        elif status == "canceled":
            print(f"Task canceled")
            return "canceled"
        else:
            # Task still in progress (running, created, paused, etc.)
            elapsed = int(elapsed_time)
            print(f"   Status: {status} (waited {elapsed} seconds)", end="\r")
            previous_status = status
        
        # Wait before checking again
        time.sleep(POLL_INTERVAL)


def scrape_amazon_products(asins):
    """
    Main function to scrape Amazon product data
    
    Args:
        asins: List of Amazon ASINs to scrape
        
    Returns:
        dict: Task result containing product data
    """
    if not asins:
        raise ValueError("No ASINs provided for scraping")
    
    # Create input parameters
    input_parameters = create_input_parameters(asins)
    
    print(f"Starting Amazon product scraping for {len(asins)} ASIN(s)...")
    print(f"ASINs: {[p['value'] for p in input_parameters]}")
    
    # Step 1: Start task using template
    task_id = run_task_by_template(WORKFLOW_TEMPLATE_ID, input_parameters)
    
    if task_id is None:
        raise Exception("Unable to start scraping task")
    
    # Step 2: Wait for task completion
    final_status = wait_for_task_completion(task_id)
    
    if final_status != "finished":
        raise Exception(f"Scraping task did not finish (status: {final_status})")

    # Step 3: Retrieve full task details and extracted product data
    task = get_task(task_id)
    if task is None:
        raise Exception("Unable to retrieve task results")
    return task


if __name__ == "__main__":
    # Example invocation; replace with real ASINs
    results = scrape_amazon_products(["B09XYZ1234"])
    print(json.dumps(results, indent=2, ensure_ascii=False))
```