proxy-pool/docs/07-operations-guide.md


# Operations guide

## Deployment

### Docker Compose (single-server)

The simplest deployment for small-to-medium workloads. All services run on a single machine.

```bash
# Clone and configure
git clone <repo-url> proxy-pool && cd proxy-pool
cp .env.example .env
# Edit .env with production values

# Build and start
docker compose build
docker compose --profile migrate up -d migrate   # Run migrations
docker compose up -d api worker                  # Start services
```

### Production considerations

**API scaling:** Run multiple API instances behind a load balancer. The API is stateless — any instance can handle any request. In Docker Compose, use `docker compose up -d --scale api=3`.

**Worker scaling:** Typically 1-2 worker instances are sufficient. ARQ deduplicates jobs via Redis, so multiple workers don't cause duplicate work. Scale workers if validation throughput is a bottleneck.

**Database:** Use a managed PostgreSQL service (AWS RDS, GCP Cloud SQL, etc.) in production. Enable connection pooling (PgBouncer) if running more than ~10 API instances.

**Redis:** A single Redis instance is sufficient for most workloads. Enable persistence (AOF or RDB snapshots) if you want lease state to survive Redis restarts. For high availability, use Redis Sentinel or a managed Redis service.

## Configuration reference

All configuration is via environment variables, parsed by `pydantic-settings`.
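As an illustration of how those variables map onto typed settings, here is a minimal stdlib-only sketch — the real project defines a `pydantic-settings` class, so the class and field names below are assumptions drawn from the tables that follow:

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    """Illustrative subset of the app settings (not the real class)."""
    database_url: str
    redis_url: str
    log_level: str = "INFO"
    default_credits: int = 100

    @classmethod
    def from_env(cls) -> "Settings":
        # Required variables raise KeyError if missing, failing fast at startup.
        return cls(
            database_url=os.environ["DATABASE_URL"],
            redis_url=os.environ["REDIS_URL"],
            log_level=os.getenv("LOG_LEVEL", "INFO"),
            default_credits=int(os.getenv("DEFAULT_CREDITS", "100")),
        )
```

`pydantic-settings` adds validation and type coercion on top of this pattern, but the env-to-field mapping is the same idea.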

### Required

| Variable | Description | Example |
| --- | --- | --- |
| `DATABASE_URL` | PostgreSQL connection string | `postgresql+asyncpg://user:pass@host:5432/db` |
| `REDIS_URL` | Redis connection string | `redis://host:6379/0` |
| `SECRET_KEY` | Used for internal signing (API key generation) | Random string of 64+ characters |
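`SECRET_KEY` just needs to be long and random; the stdlib `secrets` module is one way to generate a suitable value:

```python
import secrets

# 48 random bytes encode to a 64-character URL-safe string,
# which satisfies the 64+ character recommendation above.
key = secrets.token_urlsafe(48)
print(key)       # different every run
print(len(key))  # 64
```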

### Application

| Variable | Default | Description |
| --- | --- | --- |
| `APP_NAME` | `proxy-pool` | Application name (appears in logs, OpenAPI docs) |
| `LOG_LEVEL` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR` |
| `CORS_ORIGINS` | `[]` | Comma-separated list of allowed CORS origins |
| `API_KEY_PREFIX` | `pp_` | Prefix for generated API keys |

### Proxy pipeline

| Variable | Default | Description |
| --- | --- | --- |
| `SCRAPE_TIMEOUT_SECONDS` | `30` | HTTP timeout when fetching proxy sources |
| `SCRAPE_USER_AGENT` | `ProxyPool/0.1` | User-Agent header for scrape requests |
| `CHECK_TCP_TIMEOUT` | `5.0` | Timeout (seconds) for TCP connect checks |
| `CHECK_HTTP_TIMEOUT` | `10.0` | Timeout (seconds) for HTTP-level checks |
| `CHECK_PIPELINE_TIMEOUT` | `120` | Overall pipeline timeout (seconds) per proxy |
| `JUDGE_URL` | `http://httpbin.org/ip` | URL used by the HTTP anonymity checker to determine the exit IP |
| `REVALIDATE_ACTIVE_INTERVAL_MINUTES` | `10` | How often active proxies are re-checked |
| `REVALIDATE_DEAD_INTERVAL_HOURS` | `6` | How often dead proxies are re-checked |
| `REVALIDATE_BATCH_SIZE` | `200` | Max proxies per revalidation sweep |
| `POOL_LOW_THRESHOLD` | `100` | Emit a `proxy.pool_low` event when the active count drops below this |

### Accounts

| Variable | Default | Description |
| --- | --- | --- |
| `DEFAULT_CREDITS` | `100` | Credits granted to new accounts |
| `MAX_LEASE_DURATION_SECONDS` | `3600` | Maximum allowed lease duration |
| `DEFAULT_LEASE_DURATION_SECONDS` | `300` | Default lease duration if not specified |
| `CREDIT_LOW_THRESHOLD` | `10` | Emit `credits.low_balance` when a balance drops below this |

### Cleanup

| Variable | Default | Description |
| --- | --- | --- |
| `PRUNE_DEAD_AFTER_DAYS` | `30` | Delete dead proxies older than this |
| `PRUNE_CHECKS_AFTER_DAYS` | `7` | Delete check history older than this |
| `PRUNE_CHECKS_KEEP_LAST` | `100` | Always keep at least this many checks per proxy |

### Notifications

| Variable | Default | Description |
| --- | --- | --- |
| `SMTP_HOST` | (empty) | SMTP server. If empty, the SMTP notifier is disabled. |
| `SMTP_PORT` | `587` | SMTP port |
| `SMTP_USER` | (empty) | SMTP username |
| `SMTP_PASSWORD` | (empty) | SMTP password |
| `ALERT_EMAIL` | (empty) | Recipient for alert emails |
| `WEBHOOK_URL` | (empty) | Webhook URL. If empty, the webhook notifier is disabled. |

### Redis cache

| Variable | Default | Description |
| --- | --- | --- |
| `CACHE_PROXY_LIST_TTL` | `60` | TTL in seconds for cached proxy query results |
| `CACHE_CREDIT_BALANCE_TTL` | `300` | TTL in seconds for cached credit balances |

## Monitoring

### Health check

```bash
curl http://localhost:8000/health
```

Returns 200 with connection status for PostgreSQL and Redis. Use this as a Docker/Kubernetes health check and load balancer target.
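For container healthchecks where `curl` may not be installed in the image, a tiny probe along these lines works too — a sketch, with the `/health` path and port taken from the example above:

```python
import urllib.request
from urllib.error import URLError


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if the health endpoint answers 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Connection refused, DNS failure, timeout: all count as unhealthy.
        return False
```

A wrapper script can call `sys.exit(0 if is_healthy("http://localhost:8000/health") else 1)` so Docker or Kubernetes marks the container unhealthy on failure.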

### Key metrics to watch

**Pool health** (`GET /stats/pool`):

- `by_status.active` — the number of working proxies. If this drops suddenly, investigate source failures or upstream blocks.
- `last_scrape_at` — if this is stale, the worker may be down or the scrape task is failing.
- `last_validation_at` — if this is stale, validation is backed up or the worker is stuck.

**Plugin health** (`GET /stats/plugins`):

- Check `notifiers[].healthy` — if a notifier is unhealthy, alerts won't be delivered.

**Worker job queue:** Monitor the Redis keys `arq:queue:default` (pending jobs) and `arq:result:*` (completed/failed jobs). A growing queue indicates the worker can't keep up.

### Log format

Logs are structured JSON in production (`LOG_LEVEL=INFO`):

```json
{
  "timestamp": "2025-01-15T10:30:00Z",
  "level": "INFO",
  "message": "scrape_source completed",
  "source_id": "abc-123",
  "proxies_new": 23,
  "duration_ms": 1540
}
```
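If you need the same shape in ad-hoc tooling, a stdlib `logging` formatter produces it — a sketch only; the project's actual formatter and the `extra_fields` convention here are assumptions:

```python
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    """Render records as single-line JSON, roughly matching the shape above."""

    converter = time.gmtime  # timestamps in UTC, matching the trailing "Z"

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Structured fields passed via logger.info(..., extra={"extra_fields": {...}})
        entry.update(getattr(record, "extra_fields", {}))
        return json.dumps(entry)
```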

## Alerting

The built-in notification system handles operational alerts:

- `proxy.pool_low` — active proxy count is below the threshold. Action: add more sources or investigate why proxies are dying.
- `source.failed` — a scrape failed. Usually transient (upstream 503). Investigate if persistent.
- `source.stale` — a source hasn't produced results in N hours. The source may be dead or blocking your scraper.
- `credits.low_balance` / `credits.exhausted` — user account alerts. No operational action needed unless it's your own account.

## Troubleshooting

### Proxies are all dying

**Symptoms:** `by_status.active` dropping, `by_status.dead` increasing.

Possible causes:

- The judge URL (`JUDGE_URL`) is down or rate-limiting you. Check whether `httpbin.org/ip` is accessible from your server.
- Your server's IP is blocked by proxy providers. Try from a different IP, or use a self-hosted judge endpoint.
- Proxy sources are returning stale lists. Check `last_scraped_at` on sources.

**Fix:** Self-host a simple judge endpoint (a Flask/FastAPI app that returns `{"ip": request.remote_addr}`) to eliminate the dependency on httpbin.
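A stdlib-only version of that judge is small enough to inline — a sketch that returns `{"ip": ...}` to match the `JUDGE_URL` contract described above (the port is an arbitrary choice):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class JudgeHandler(BaseHTTPRequestHandler):
    """Minimal judge endpoint: echo the caller's IP back as JSON."""

    def do_GET(self):
        body = json.dumps({"ip": self.client_address[0]}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep stdout quiet; real deployments should log properly


def run(port: int = 8080) -> None:
    """Serve the judge forever on the given port."""
    HTTPServer(("0.0.0.0", port), JudgeHandler).serve_forever()
```

Point `JUDGE_URL` at wherever you host this; requests arriving through a proxy will then report the proxy's exit IP.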

### Worker is not processing jobs

**Symptoms:** `last_scrape_at` and `last_validation_at` are stale. The Redis queue is growing.

Check:

```bash
docker compose logs worker --tail=50
docker compose exec redis redis-cli LLEN arq:queue:default
```

Possible causes:

- Worker process crashed. Restart it: `docker compose restart worker`.
- Redis connection lost. Check Redis health: `docker compose exec redis redis-cli ping`.
- A task is stuck (infinite loop or hung network call). Check `CHECK_PIPELINE_TIMEOUT`.

### Database connections exhausted

**Symptoms:** `asyncpg.exceptions.TooManyConnectionsError` or slow queries.

**Fix:** Reduce the connection pool size in `DATABASE_URL` parameters, or deploy PgBouncer. The default asyncpg pool size is 10 connections per process — with 3 API instances and 1 worker, that's 40 connections. PostgreSQL's default limit is 100.

```bash
# In DATABASE_URL or via SQLAlchemy pool config
DATABASE_POOL_SIZE=5
DATABASE_MAX_OVERFLOW=10
```
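The arithmetic generalizes; a helper like this (illustrative, not part of the codebase) makes the worst case explicit so you can size it against PostgreSQL's `max_connections`:

```python
def max_pg_connections(api_instances: int, workers: int,
                       pool_size: int = 10, max_overflow: int = 0) -> int:
    """Worst-case PostgreSQL connections held by the app tier:
    every process can open up to pool_size + max_overflow connections."""
    return (api_instances + workers) * (pool_size + max_overflow)


# 3 API instances + 1 worker at the default pool size of 10:
print(max_pg_connections(3, 1))  # 40
```

Note that overflow connections count toward the worst case too, so a small `pool_size` with a large `max_overflow` only helps if bursts across processes don't coincide.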

### Redis memory growing

**Symptoms:** Redis memory usage increasing over time.

Possible causes:

- ARQ job results not expiring. Check the `keep_result` setting.
- Proxy cache not being invalidated. Verify `CACHE_PROXY_LIST_TTL` is set.
- Lease keys not expiring (they should auto-expire via TTL).

**Fix:** Set a Redis `maxmemory` policy:

```conf
maxmemory 256mb
maxmemory-policy allkeys-lru
```

Be aware that `allkeys-lru` can evict keys that have no TTL, including pending ARQ queue entries. If the same Redis instance holds the job queue, `volatile-lru` (or `noeviction` plus fixing the underlying leak) is the safer choice.

### Migration failed

**Symptoms:** `alembic upgrade head` errors.

Steps:

1. Check the current state: `uv run alembic current`.
2. Look at the error — usually a constraint violation or type mismatch.
3. If the migration is partially applied, you may need to manually fix the state: `uv run alembic stamp <revision>`.
4. For production, always test migrations against a copy of the production database first.

## Backup and recovery

### Database backup

```bash
# Dump
docker compose exec postgres pg_dump -U proxypool proxypool > backup.sql

# Restore
docker compose exec -T postgres psql -U proxypool proxypool < backup.sql
```
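When the dump runs from cron, a small rotation helper keeps disk usage bounded — a sketch; the filename pattern and retention count are assumptions, not project conventions:

```python
import datetime
import pathlib


def backup_filename(prefix: str = "proxypool") -> str:
    """Timestamped dump name, e.g. proxypool-20250115T103000Z.sql."""
    now = datetime.datetime.now(datetime.timezone.utc)
    return f"{prefix}-{now:%Y%m%dT%H%M%SZ}.sql"


def prune_backups(directory: pathlib.Path, keep: int = 7) -> list[pathlib.Path]:
    """Delete all but the newest `keep` .sql dumps; return what was removed.
    Timestamped names sort lexicographically, newest first after reverse."""
    dumps = sorted(directory.glob("*.sql"), key=lambda p: p.name, reverse=True)
    removed = dumps[keep:]
    for path in removed:
        path.unlink()
    return removed
```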

### Redis

For proxy-pool, Redis data is ephemeral (cache + queue). Losing Redis state means:

- Cached proxy lists are rebuilt on the next query (minor latency spike).
- Active leases are lost (the `expire_leases` task will clean up PostgreSQL state).
- Pending ARQ jobs are lost (the next cron cycle will re-enqueue them).

If lease integrity is critical, enable Redis persistence (AOF recommended):

```conf
appendonly yes
appendfsync everysec
```