# proxy-pool
An async proxy pool backend that discovers, validates, and serves free proxy servers. Built with FastAPI, SQLAlchemy 2.0, asyncpg, ARQ, and Redis.
## What it does
Proxy Pool automates the full lifecycle of free proxy management:
- Discovers proxies by scraping configurable sources on a schedule
- Validates proxies through a multi-stage pipeline (TCP liveness, HTTP anonymity detection, exit IP identification)
- Serves working proxies through a query API with filtering by protocol, country, anonymity level, score, and latency
- Manages access through API key authentication and a credit-based acquisition system with exclusive leases
## Architecture
The system runs as two processes sharing a PostgreSQL database and Redis instance:
- API server — FastAPI application handling all HTTP requests
- ARQ worker — Background task processor running scrape, validation, and cleanup jobs on cron schedules
A plugin system allows extending functionality without modifying core code. Plugins can add new proxy list parsers, validation methods, and notification channels.
See docs/01-architecture.md for the full architecture overview.
## Quick start

### Prerequisites
- Python 3.12+
- uv
- Docker and Docker Compose
### 1. Clone and install

```bash
git clone <repo-url> proxy-pool
cd proxy-pool
uv sync --all-extras
```
### 2. Configure

```bash
cp .env.example .env
```

Edit `.env` with your settings. For local development with Docker infrastructure:

```env
DB_URL=postgresql+asyncpg://proxypool:proxypool@localhost:5432/proxypool
REDIS_URL=redis://localhost:6379/0
SECRET_KEY=change-me-to-something-random
```
### 3. Start infrastructure

```bash
docker compose up -d postgres redis
```
### 4. Run migrations

```bash
uv run alembic upgrade head
```
### 5. Start the application

```bash
# API server (with hot reload)
uv run uvicorn proxy_pool.app:create_app --factory --reload --port 8000

# In a separate terminal: background worker
uv run arq proxy_pool.worker.settings.WorkerSettings
```

The API is available at http://localhost:8000. Interactive docs are at http://localhost:8000/docs.
### 6. Add a proxy source

```bash
curl -X POST http://localhost:8000/sources \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://raw.githubusercontent.com/TheSpeedX/PROXY-List/master/http.txt",
    "parser_name": "plaintext"
  }'
```

Trigger an immediate scrape:

```bash
curl -X POST http://localhost:8000/sources/{source_id}/scrape
```
### 7. Query the pool

```bash
curl "http://localhost:8000/proxies?status=active&min_score=0.5&limit=10"
```
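For programmatic consumers, the same filters can be assembled into a query URL. A minimal sketch in Python; the filter names beyond `status`, `min_score`, and `limit` shown above (e.g. `protocol`, `country`, `anonymity`, `max_latency_ms`) are assumptions based on the filters listed in this README, not a confirmed parameter list:

```python
from urllib.parse import urlencode

# Names not shown in the curl example above (protocol, country,
# anonymity, max_latency_ms) are assumptions for illustration.
ALLOWED_FILTERS = {
    "protocol", "country", "anonymity",
    "status", "min_score", "max_latency_ms", "limit",
}

def build_proxy_query(base_url: str, **filters) -> str:
    """Build a /proxies query URL, dropping unknown or empty filters."""
    params = {k: v for k, v in filters.items()
              if k in ALLOWED_FILTERS and v is not None}
    return f"{base_url}/proxies?{urlencode(sorted(params.items()))}"

url = build_proxy_query("http://localhost:8000",
                        status="active", min_score=0.5, limit=10)
```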
## Docker deployment

Run the full stack in Docker:

```bash
make docker-up
```

This builds the images, runs migrations, and starts the API and worker. Services are available at http://localhost:8000.

To stop:

```bash
docker compose down
```
## API overview

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check (Postgres + Redis status) |
| `/sources` | GET, POST | List and create proxy sources |
| `/sources/{id}` | GET, PATCH, DELETE | Manage individual sources |
| `/sources/{id}/scrape` | POST | Trigger immediate scrape |
| `/proxies` | GET | Query proxies with filtering and sorting |
| `/proxies/{id}` | GET | Get proxy details |
| `/proxies/acquire` | POST | Acquire a proxy with an exclusive lease (requires auth) |
| `/proxies/acquire/{id}/release` | POST | Release a lease early |
| `/auth/register` | POST | Create account and get API key |
| `/auth/keys` | GET, POST | List and create API keys |
| `/auth/keys/{id}` | DELETE | Revoke an API key |
| `/account` | GET | Account info |
| `/account/credits` | GET | Credit balance and history |

See docs/04-api-reference.md for full request/response documentation.
## Plugin system
Proxy Pool uses a plugin architecture with three extension points:
- Source parsers — Extract proxies from different list formats
- Proxy checkers — Validate proxies in a staged pipeline
- Notifiers — React to system events (pool health, credit alerts)
Plugins implement Python Protocol classes — no inheritance required. Drop a `.py` file in `plugins/contrib/` with a `create_plugin(settings)` function and it's auto-discovered at startup.
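As an illustration, a contrib parser might look like the sketch below. The actual Protocol signatures live in the project's plugin module; the method and attribute names here (`parse`, `name`) and the hypothetical `comma_separated` format are assumptions for illustration only:

```python
from typing import Protocol

class SourceParser(Protocol):
    """Assumed shape of a parser plugin (names are illustrative)."""
    name: str
    def parse(self, body: str) -> list[tuple[str, int]]: ...

class CommaSeparatedParser:
    """Hypothetical contrib parser: one `ip,port` pair per line."""
    name = "comma_separated"

    def parse(self, body: str) -> list[tuple[str, int]]:
        proxies = []
        for line in body.splitlines():
            line = line.strip()
            if not line or "," not in line:
                continue  # skip blanks and lines that don't fit the format
            ip, _, port = line.partition(",")
            if port.strip().isdigit():
                proxies.append((ip.strip(), int(port)))
        return proxies

def create_plugin(settings) -> CommaSeparatedParser:
    # Entry point the registry looks for at startup (per the README).
    return CommaSeparatedParser()
```

Since discovery keys off `create_plugin(settings)`, the module needs no registration code beyond living in `plugins/contrib/`.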
### Built-in plugins

| Plugin | Type | Description |
|---|---|---|
| `plaintext` | Parser | One `ip:port` per line |
| `protocol_prefix` | Parser | `protocol://ip:port` format (HTTP, SOCKS4/5) |
| `tcp_connect` | Checker (stage 1) | TCP liveness check |
| `http_anonymity` | Checker (stage 2) | Exit IP detection, anonymity classification |
| `smtp` | Notifier | Email alerts for pool health and credit events |

See docs/02-plugin-system.md for the full plugin development guide.
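Stage-2 anonymity classification commonly works by having the proxy fetch an echo endpoint and inspecting what arrived: if your real IP leaked through, the proxy is transparent; if proxy-identifying headers appear but your IP is hidden, it is anonymous; if neither, elite. A sketch of that general scheme (the actual rules of the `http_anonymity` checker may differ):

```python
# Headers that reveal a proxy is in the path.
PROXY_HEADERS = {"via", "x-forwarded-for", "forwarded", "x-real-ip", "proxy-connection"}

def classify_anonymity(headers: dict[str, str], client_ip: str) -> str:
    """Classify a proxy from the headers an echo endpoint received.

    - transparent: the client's real IP leaked through a proxy header
    - anonymous:   proxy-identifying headers present, but real IP hidden
    - elite:       no evidence a proxy was involved at all
    """
    values = {k.lower(): v for k, v in headers.items()}
    if any(client_ip in v for k, v in values.items() if k in PROXY_HEADERS):
        return "transparent"
    if PROXY_HEADERS & values.keys():
        return "anonymous"
    return "elite"
```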
## Contributing

### Development setup

1. Fork and clone the repository.

2. Install dependencies:

   ```bash
   uv sync --all-extras
   ```

3. Install pre-commit hooks:

   ```bash
   uv tool install pre-commit
   pre-commit install
   pre-commit install --hook-type commit-msg
   ```

   This sets up:
   - ruff for linting and formatting on every commit
   - Conventional Commits validation on commit messages

4. Start infrastructure:

   ```bash
   docker compose up -d postgres redis
   ```

5. Run migrations:

   ```bash
   uv run alembic upgrade head
   ```
### Commit conventions

This project uses Conventional Commits. Every commit message must follow the format:

```
type: description

Optional body explaining the change.
```

Valid types: `feat`, `fix`, `docs`, `refactor`, `test`, `chore`, `build`, `ci`.
Examples:

```
feat: add SOCKS5 proxy support to HTTP checker
fix: catch CancelledError in validation pipeline
docs: update API reference for acquire endpoint
test: add integration tests for credit ledger
refactor: extract proxy scoring into service layer
chore: bump httpx dependency to 0.28
```
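A commit-msg hook enforcing this format boils down to a small pattern check. A sketch, not the hook's actual implementation; the optional scope and `!` syntax is a general Conventional Commits extension and an assumption here:

```python
import re

# Types from the list above; "(scope)" and "!" are assumed extensions.
COMMIT_RE = re.compile(
    r"^(feat|fix|docs|refactor|test|chore|build|ci)"
    r"(\([a-z0-9-]+\))?!?: .+"
)

def is_valid_commit_subject(subject: str) -> bool:
    """Check a commit subject line against the Conventional Commits shape."""
    return bool(COMMIT_RE.match(subject))
```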
### Branch workflow

All work happens on feature branches. Branch names mirror commit types:

```bash
git checkout -b feat/geoip-checker
git checkout -b fix/validation-timeout
git checkout -b docs/deployment-guide
```
When done, rebase onto master and fast-forward merge:

```bash
git checkout feat/your-feature
git rebase master
git checkout master
git merge --ff-only feat/your-feature
git branch -d feat/your-feature
```
### Running tests

```bash
# Unit tests only (fast, no infrastructure needed)
make test-unit

# Full test suite (requires Postgres + Redis running)
make test

# Full suite in Docker (completely isolated, clean database)
make test-docker
```

Integration tests require Postgres and Redis. Either run them locally with Docker infrastructure or use the Docker test stack, which provides an ephemeral database.

To reset your local database between test runs:

```bash
make reset-db
```
### Adding a database migration

```bash
# Auto-generate from model changes
uv run alembic revision --autogenerate -m "description of change"

# Always review the generated migration before applying
uv run alembic upgrade head
```
### Code quality

```bash
# Lint and format check
make lint

# Auto-fix lint issues
make lint-fix

# Type checking
make typecheck
```

Ruff is configured for Python 3.12 with rules for import sorting, naming conventions, bugbear checks, and code simplification. Mypy runs in strict mode with the Pydantic plugin.
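Such a setup typically lives in `pyproject.toml`. A sketch of what the relevant sections might look like; the rule selection and option values here are assumptions, not the project's actual configuration:

```toml
[tool.ruff]
target-version = "py312"

[tool.ruff.lint]
# E/F (pycodestyle/pyflakes), I (import sorting), N (naming),
# B (bugbear), SIM (simplification)
select = ["E", "F", "I", "N", "B", "SIM"]

[tool.mypy]
strict = true
plugins = ["pydantic.mypy"]
```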
## Project structure

```
proxy-pool/
├── src/proxy_pool/           # Application source code
│   ├── app.py                # FastAPI app factory
│   ├── config.py             # Settings (env-driven, grouped)
│   ├── common/               # Shared dependencies, schemas
│   ├── db/                   # SQLAlchemy base, session factory
│   ├── proxy/                # Proxy domain (models, routes, service)
│   ├── accounts/             # Accounts domain (auth, credits, keys)
│   ├── plugins/              # Plugin system + built-in plugins
│   └── worker/               # ARQ task definitions
├── tests/                    # Test suite (unit, integration, plugins)
├── alembic/                  # Database migrations
├── docs/                     # Architecture and reference documentation
├── Dockerfile                # Production image
├── Dockerfile.test           # Test image with dev dependencies
├── docker-compose.yml        # Full stack (API, worker, Postgres, Redis)
└── docker-compose.test.yml   # Test override (ephemeral database)
```
## Makefile reference

| Command | Description |
|---|---|
| `make dev` | Start API with hot reload |
| `make worker` | Start ARQ worker |
| `make test` | Run full test suite locally |
| `make test-unit` | Run unit tests only |
| `make test-docker` | Run tests in Docker (clean DB) |
| `make lint` | Check linting and formatting |
| `make lint-fix` | Auto-fix lint issues |
| `make typecheck` | Run mypy |
| `make migrate` | Apply database migrations |
| `make reset-db` | Drop and recreate local database |
| `make docker-up` | Build and start full Docker stack |
| `make docker-down` | Stop Docker stack |
| `make docker-logs` | Tail API and worker logs |
## Documentation

Detailed documentation lives in the `docs/` directory:

| Document | Contents |
|---|---|
| `01-architecture.md` | System overview, components, data flow, deployment |
| `02-plugin-system.md` | Plugin protocols, registry, discovery, how to write plugins |
| `03-database-schema.md` | Every table, column, index with design rationale |
| `04-api-reference.md` | All endpoints with request/response examples |
| `05-worker-tasks.md` | Background tasks, schedules, retry behavior |
| `06-development-guide.md` | Setup, workflow, common tasks |
| `07-operations-guide.md` | Configuration, monitoring, troubleshooting |
## License
TBD