Professional, enterprise-grade web scraping engine built with Go. It works both as a dead-simple CLI tool for quick scrapes and as a powerful Distributed API Service for large-scale automation.
- CLI Tool: Perfect for developers. No setup, no database, just run and get data.
- API Service: Scalable REST API with a job queue, PostgreSQL persistence, and worker pools.

For the CLI, just download or clone the repository and run. No database or Docker required.

Scrape a page:

    go run cmd/cli/main.go --url "https://example.com"

Extract specific elements by CSS selector:

    go run cmd/cli/main.go --url "https://example.com" --extract "h1, .price, p"

Build a binary and run it:

    go build -o scrap.exe ./cmd/cli/main.go
    ./scrap.exe --url "https://google.com"

The API service is ideal for production, background jobs, and distributed environments. Start it with Docker Compose:

    docker-compose up -d

You can now submit a job with just a URL. The system will automatically navigate and extract the page content for you.

    curl -X POST http://localhost:8080/api/v1/jobs \
      -H "Content-Type: application/json" \
      -d '{ "url": "https://example.com" }'

Fetch the result:

    curl http://localhost:8080/api/v1/jobs/{job-id}/result

- Headless Chrome: Uses real browser rendering via `chromedp` (handles SPA/React/Vue).
- Anti-Detection: Built-in User-Agent rotation and human-like interaction.
- Job Queue: Distributed worker pool handles massive job loads.
- Persistence: Auto-saves everything to PostgreSQL (GORM).
- Swagger UI: Interactive API testing at `http://localhost:8080/swagger/`.
- Flexible Actions: Custom flows (navigate -> wait -> click -> type -> extract).
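
The two API calls shown earlier with curl can also be made from Go. The following is a minimal sketch, not the project's own client: it only assumes the two endpoints and the `url` field from the curl examples. The helper names `newJobRequest` and `resultURL` are hypothetical, and because the response schema is not documented here, the response body is printed as-is rather than parsed.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"time"
)

// baseURL matches the address used in the curl examples.
const baseURL = "http://localhost:8080"

// newJobRequest builds the POST /api/v1/jobs request for a target page.
// (Hypothetical helper; only the "url" field is taken from the README.)
func newJobRequest(base, target string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"url": target})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, base+"/api/v1/jobs", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// resultURL returns the result endpoint for a given job ID.
func resultURL(base, jobID string) string {
	return base + "/api/v1/jobs/" + url.PathEscape(jobID) + "/result"
}

func main() {
	client := &http.Client{Timeout: 10 * time.Second}

	var (
		resp *http.Response
		err  error
	)
	if len(os.Args) > 1 {
		// A job ID was passed on the command line: fetch its result.
		resp, err = client.Get(resultURL(baseURL, os.Args[1]))
	} else {
		// No arguments: submit a new job for example.com.
		var req *http.Request
		req, err = newJobRequest(baseURL, "https://example.com")
		if err == nil {
			resp, err = client.Do(req)
		}
	}
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()

	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read response:", err)
		return
	}
	// The response format is not specified in this README, so print it raw.
	fmt.Println(resp.Status)
	fmt.Println(string(raw))
}
```

Run it without arguments to submit a job, then run it again with the job ID from the response as the first argument to fetch the result.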
The components fit together as follows:

    ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
    │    CLI / API    │───▶│    Job Queue    │───▶│ Browser Manager │
    │                 │    │                 │    │                 │
    │ • Easy Commands │    │ • Worker Pool   │    │ • chromedp      │
    │ • JSON Output   │    │ • Retry Logic   │    │ • Session Pool  │
    └─────────────────┘    └─────────────────┘    └─────────────────┘
             │                      │                      │
             ▼                      ▼                      ▼
    ┌───────────────────────────────────────────────────────────────┐
    │                   PostgreSQL (Persistence)                    │
    └───────────────────────────────────────────────────────────────┘
| Action | Description | Parameters |
|---|---|---|
| `navigate` | Open a URL | target: URL |
| `click` | Click an element | target: CSS selector |
| `type` | Input text | target: CSS selector, value: text |
| `wait` | Pause | value: seconds |
| `extract` | Get text | target: CSS selector, value: key name |
| `screenshot` | Take photo | - |
| `scroll` | Scroll down | options: {"x": 0, "y": 500} |
- Docker: `docker build -t scraper .`
MIT License. Created with ❤️ by Kyyril.