From c69336a8230384bb6c0db76dd1fe25b892cc2774 Mon Sep 17 00:00:00 2001
From: agatha
Date: Sun, 15 Mar 2026 18:00:35 -0400
Subject: [PATCH] docs: update README.md
---

# proxy-pool

An async proxy pool backend that discovers, validates, and serves free proxy servers. Built with FastAPI, SQLAlchemy 2.0, asyncpg, ARQ, and Redis.

## What it does

Proxy Pool automates the full lifecycle of free proxy management:

- **Discovers** proxies by scraping configurable sources on a schedule
- **Validates** proxies through a multi-stage pipeline (TCP liveness, HTTP anonymity detection, exit IP identification)
- **Serves** working proxies through a query API with filtering by protocol, country, anonymity level, score, and latency
- **Manages access** through API key authentication and a credit-based acquisition system with exclusive leases

## Architecture

The system runs as two processes sharing a PostgreSQL database and Redis instance:

- **API server** — FastAPI application handling all HTTP requests
- **ARQ worker** — Background task processor running scrape, validation, and cleanup jobs on cron schedules

A plugin system allows extending functionality without modifying core code. Plugins can add new proxy list parsers, validation methods, and notification channels.

See `docs/01-architecture.md` for the full architecture overview.
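The stage-1 liveness check in the validation pipeline can be sketched in a few lines of asyncio. This is an illustrative sketch only, not the project's actual `tcp_connect` checker; the function name, timeout, and return convention are assumptions:

```python
import asyncio


async def tcp_alive(host: str, port: int, timeout: float = 3.0) -> bool:
    """Stage-1 liveness probe: can a TCP connection be opened at all?

    Proxies that fail here are dropped before the more expensive
    HTTP anonymity stage ever runs.
    """
    try:
        _reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout
        )
    except (OSError, asyncio.TimeoutError):
        # Refused, unreachable, or too slow: the proxy is considered dead.
        return False
    writer.close()
    await writer.wait_closed()
    return True


async def main() -> None:
    alive = await tcp_alive("127.0.0.1", 8000)
    print("alive" if alive else "dead")


if __name__ == "__main__":
    asyncio.run(main())
```

A real checker would record latency alongside the boolean so the scoring layer can rank proxies, but the shape of the probe is the same.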

## Quick start

### Prerequisites

- Python 3.12+
- [uv](https://docs.astral.sh/uv/)
- Docker and Docker Compose

### 1. Clone and install

```bash
git clone <repository-url> proxy-pool
cd proxy-pool
uv sync --all-extras
```

### 2. Configure

```bash
cp .env.example .env
```

Edit `.env` with your settings. For local development with Docker infrastructure:

```env
DB_URL=postgresql+asyncpg://proxypool:proxypool@localhost:5432/proxypool
REDIS_URL=redis://localhost:6379/0
SECRET_KEY=change-me-to-something-random
```

### 3. Start infrastructure

```bash
docker compose up -d postgres redis
```

### 4. Run migrations

```bash
uv run alembic upgrade head
```

### 5. Start the application

```bash
# API server (with hot reload)
uv run uvicorn proxy_pool.app:create_app --factory --reload --port 8000

# In a separate terminal: background worker
uv run arq proxy_pool.worker.settings.WorkerSettings
```

The API is available at `http://localhost:8000`. Interactive docs are at `http://localhost:8000/docs`.

### 6. Add a proxy source

```bash
curl -X POST http://localhost:8000/sources \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://raw.githubusercontent.com/TheSpeedX/PROXY-List/master/http.txt",
    "parser_name": "plaintext"
  }'
```

Trigger an immediate scrape:

```bash
curl -X POST http://localhost:8000/sources/{source_id}/scrape
```

### 7. Query the pool

```bash
curl "http://localhost:8000/proxies?status=active&min_score=0.5&limit=10"
```

## Docker deployment

Run the full stack in Docker:

```bash
make docker-up
```

This builds the images, runs migrations, and starts the API and worker. Services are available at `http://localhost:8000`.

To stop:

```bash
docker compose down
```

## API overview

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check (Postgres + Redis status) |
| `/sources` | GET, POST | List and create proxy sources |
| `/sources/{id}` | GET, PATCH, DELETE | Manage individual sources |
| `/sources/{id}/scrape` | POST | Trigger immediate scrape |
| `/proxies` | GET | Query proxies with filtering and sorting |
| `/proxies/{id}` | GET | Get proxy details |
| `/proxies/acquire` | POST | Acquire a proxy with an exclusive lease (requires auth) |
| `/proxies/acquire/{id}/release` | POST | Release a lease early |
| `/auth/register` | POST | Create account and get API key |
| `/auth/keys` | GET, POST | List and create API keys |
| `/auth/keys/{id}` | DELETE | Revoke an API key |
| `/account` | GET | Account info |
| `/account/credits` | GET | Credit balance and history |

See `docs/04-api-reference.md` for full request/response documentation.

## Plugin system

Proxy Pool uses a plugin architecture with three extension points:

- **Source parsers** — Extract proxies from different list formats
- **Proxy checkers** — Validate proxies in a staged pipeline
- **Notifiers** — React to system events (pool health, credit alerts)

Plugins implement Python Protocol classes — no inheritance required. Drop a `.py` file in `plugins/contrib/` with a `create_plugin(settings)` function and it's auto-discovered at startup.
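Because the extension points are structural Protocols, a contrib plugin is just a module of the right shape. A rough sketch of what a `plugins/contrib/` source parser could look like, assuming a `parse`-style protocol; the exact protocol signature here is an assumption, not the project's real interface (see `docs/02-plugin-system.md` for that):

```python
from typing import Protocol


class SourceParser(Protocol):
    """Assumed parser shape: raw list text in, (host, port) pairs out."""

    name: str

    def parse(self, body: str) -> list[tuple[str, int]]: ...


class PlaintextParser:
    """Parses one `ip:port` per line; satisfies SourceParser structurally,
    with no inheritance required."""

    name = "plaintext"

    def parse(self, body: str) -> list[tuple[str, int]]:
        proxies: list[tuple[str, int]] = []
        for line in body.splitlines():
            line = line.strip()
            if not line or ":" not in line:
                continue  # skip blank and malformed lines
            host, _, port = line.rpartition(":")
            if port.isdigit():
                proxies.append((host, int(port)))
        return proxies


def create_plugin(settings: object) -> PlaintextParser:
    # Entry point the registry would call at startup, per the README.
    return PlaintextParser()
```

The type checker (and the registry, at runtime) only cares that the object has a `name` and a compatible `parse` method, which is what makes drop-in discovery possible.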

### Built-in plugins

| Plugin | Type | Description |
|--------|------|-------------|
| `plaintext` | Parser | One `ip:port` per line |
| `protocol_prefix` | Parser | `protocol://ip:port` format (HTTP, SOCKS4/5) |
| `tcp_connect` | Checker (stage 1) | TCP liveness check |
| `http_anonymity` | Checker (stage 2) | Exit IP detection, anonymity classification |
| `smtp` | Notifier | Email alerts for pool health and credit events |

See `docs/02-plugin-system.md` for the full plugin development guide.

## Contributing

### Development setup

1. Fork and clone the repository.

2. Install dependencies:

   ```bash
   uv sync --all-extras
   ```

3. Install pre-commit hooks:

   ```bash
   uv tool install pre-commit
   pre-commit install
   pre-commit install --hook-type commit-msg
   ```

   This sets up:
   - **ruff** for linting and formatting on every commit
   - **Conventional Commits** validation on commit messages

4. Start infrastructure:

   ```bash
   docker compose up -d postgres redis
   ```

5. Run migrations:

   ```bash
   uv run alembic upgrade head
   ```

### Commit conventions

This project uses [Conventional Commits](https://www.conventionalcommits.org/). Every commit message must follow the format:

```
type: description

Optional body explaining the change.
```

Valid types: `feat`, `fix`, `docs`, `refactor`, `test`, `chore`, `build`, `ci`.

Examples:

```
feat: add SOCKS5 proxy support to HTTP checker
fix: catch CancelledError in validation pipeline
docs: update API reference for acquire endpoint
test: add integration tests for credit ledger
refactor: extract proxy scoring into service layer
chore: bump httpx dependency to 0.28
```

### Branch workflow

All work happens on feature branches.
Branch names mirror commit types:

```bash
git checkout -b feat/geoip-checker
git checkout -b fix/validation-timeout
git checkout -b docs/deployment-guide
```

When done, rebase onto master and fast-forward merge:

```bash
git checkout feat/your-feature
git rebase master
git checkout master
git merge --ff-only feat/your-feature
git branch -d feat/your-feature
```

### Running tests

```bash
# Unit tests only (fast, no infrastructure needed)
make test-unit

# Full test suite (requires Postgres + Redis running)
make test

# Full suite in Docker (completely isolated, clean database)
make test-docker
```

Integration tests require Postgres and Redis. Either run them locally with Docker infrastructure or use the Docker test stack, which provides an ephemeral database.

To reset your local database between test runs:

```bash
make reset-db
```

### Adding a database migration

```bash
# Auto-generate from model changes
uv run alembic revision --autogenerate -m "description of change"

# Always review the generated migration before applying
uv run alembic upgrade head
```

### Code quality

```bash
# Lint and format check
make lint

# Auto-fix lint issues
make lint-fix

# Type checking
make typecheck
```

Ruff is configured for Python 3.12 with rules for import sorting, naming conventions, bugbear checks, and code simplification. Mypy runs in strict mode with the Pydantic plugin.
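The lint and type-check setup described above would correspond to a `pyproject.toml` fragment roughly like the following. This is an illustrative sketch of such a configuration, not a copy of the repo's actual config; the exact rule selection lives in the repository:

```toml
[tool.ruff]
target-version = "py312"

[tool.ruff.lint]
# E/F: pycodestyle + pyflakes, I: import sorting, N: naming,
# B: bugbear checks, SIM: code simplification (per the description above)
select = ["E", "F", "I", "N", "B", "SIM"]

[tool.mypy]
strict = true
plugins = ["pydantic.mypy"]
```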

### Project structure

```
proxy-pool/
├── src/proxy_pool/          # Application source code
│   ├── app.py               # FastAPI app factory
│   ├── config.py            # Settings (env-driven, grouped)
│   ├── common/              # Shared dependencies, schemas
│   ├── db/                  # SQLAlchemy base, session factory
│   ├── proxy/               # Proxy domain (models, routes, service)
│   ├── accounts/            # Accounts domain (auth, credits, keys)
│   ├── plugins/             # Plugin system + built-in plugins
│   └── worker/              # ARQ task definitions
├── tests/                   # Test suite (unit, integration, plugins)
├── alembic/                 # Database migrations
├── docs/                    # Architecture and reference documentation
├── Dockerfile               # Production image
├── Dockerfile.test          # Test image with dev dependencies
├── docker-compose.yml       # Full stack (API, worker, Postgres, Redis)
└── docker-compose.test.yml  # Test override (ephemeral database)
```

### Makefile reference

| Command | Description |
|---------|-------------|
| `make dev` | Start API with hot reload |
| `make worker` | Start ARQ worker |
| `make test` | Run full test suite locally |
| `make test-unit` | Run unit tests only |
| `make test-docker` | Run tests in Docker (clean DB) |
| `make lint` | Check linting and formatting |
| `make lint-fix` | Auto-fix lint issues |
| `make typecheck` | Run mypy |
| `make migrate` | Apply database migrations |
| `make reset-db` | Drop and recreate local database |
| `make docker-up` | Build and start full Docker stack |
| `make docker-down` | Stop Docker stack |
| `make docker-logs` | Tail API and worker logs |

## Documentation

Detailed documentation lives in the `docs/` directory:

| Document | Contents |
|----------|----------|
| `01-architecture.md` | System overview, components, data flow, deployment |
| `02-plugin-system.md` | Plugin protocols, registry, discovery, how to write plugins |
| `03-database-schema.md` | Every table, column, index with design rationale |
| `04-api-reference.md` | All endpoints with request/response examples |
| `05-worker-tasks.md` | Background tasks, schedules, retry behavior |
| `06-development-guide.md` | Setup, workflow, common tasks |
| `07-operations-guide.md` | Configuration, monitoring, troubleshooting |

## License

TBD