# Implementation Plan: Login Brute-Force Protection **Branch**: `009-login-rate-limiting` | **Date**: 2026-05-06 | **Spec**: [spec.md](spec.md) **Input**: Feature specification from `specs/009-login-rate-limiting/spec.md` ## Summary Add failure-counting brute-force protection to the login endpoint (`POST /api/v1/auth/token`). After a configurable number of consecutive failed attempts from the same resolved client IP, the endpoint returns HTTP 429 with a `Retry-After` header for a configurable cooldown period. A successful login resets the counter. All thresholds are configurable via environment variables. When deployed behind a reverse proxy (nginx, Kubernetes ingress), a `LOGIN_TRUSTED_PROXY_IPS` setting enables extraction of the real client IP from `X-Forwarded-For`. No new infrastructure (no Redis, no new DB table) — counters live in process memory. --- ## Technical Context **Language/Version**: Python 3.12+ **Primary Dependencies**: FastAPI, pydantic-settings (already in use); no new dependencies added **Storage**: In-memory `dict` (no persistence across restarts — intentional) **Testing**: pytest + pytest-asyncio (existing test infrastructure) **Target Platform**: Linux server (Docker) **Project Type**: Web service (API only — this feature has no UI surface) **Performance Goals**: Rate limiter adds negligible overhead (dict lookup + lock acquisition; sub-millisecond) **Constraints**: Must not add new runtime service dependencies; must not change any auth behaviour for non-blocked sources **Scale/Scope**: Single process, single user; in-memory store is sufficient --- ## Constitution Check | Principle | Status | Notes | |-----------|--------|-------| | §2.4 Auth abstraction (AuthProvider interface) | ✅ Pass | Rate limiter is a guard *before* `JWTAuthProvider.verify_credentials()`, not a bypass of the interface | | §2.5 DB abstraction (repository layer) | ✅ Pass | No database access; in-memory only | | §2.6 No speculative abstraction | ✅ Pass | Concrete `LoginRateLimiter` class, no interface; only one implementation planned | | §3.3 Error envelope (`detail` + `code`) | ✅ Pass | 429 response uses `{"detail": "...", "code": "login_rate_limited"}` | | §5.1 TDD | ✅ Required | Tasks follow red → green order | | §5.2 Integration tests against PostgreSQL | ✅ Pass | Integration test for the login endpoint will run against the Docker PostgreSQL stack | | §7.2 Environment configuration | ✅ Pass | `LOGIN_MAX_FAILURES`, `LOGIN_WINDOW_SECONDS`, `LOGIN_COOLDOWN_SECONDS`, `LOGIN_TRUSTED_PROXY_IPS` from env vars | | §7.3 Linting (ruff) | ✅ Required | All new files must pass `ruff check` | **Gate result**: No violations. Cleared to proceed. --- ## Project Structure ### Documentation (this feature) ```text specs/009-login-rate-limiting/ ├── plan.md ← this file ├── research.md ← decisions on approach ├── data-model.md ← rate-limit record entity ├── quickstart.md ← curl runbook ├── contracts/ │ └── auth.md ← updated POST /api/v1/auth/token with 429 └── tasks.md ← generated by /speckit-tasks ``` ### Source Code Changes ```text api/ ├── app/ │ ├── auth/ │ │ ├── rate_limiter.py ← NEW: LoginRateLimiter class │ │ ├── jwt_provider.py (unchanged) │ │ ├── noop.py (unchanged) │ │ └── provider.py (unchanged) │ ├── config.py ← add login_max_failures, login_window_seconds, login_cooldown_seconds, login_trusted_proxy_ips │ ├── main.py ← init LoginRateLimiter in lifespan, attach to app.state │ └── routers/ │ └── auth.py ← check rate limit before auth, record outcome └── tests/ ├── unit/ │ └── test_rate_limiter.py ← NEW: unit tests for LoginRateLimiter logic └── integration/ └── test_login_rate_limit.py ← NEW: integration tests for 429 behaviour via HTTP ``` --- ## Implementation Detail ### `api/app/auth/rate_limiter.py` ```python import ipaddress import logging import time from dataclasses import dataclass, field from ipaddress import IPv4Network, IPv6Network from threading import Lock from starlette.requests import Request logger = logging.getLogger(__name__) def get_client_ip( request: Request, trusted_networks: list[IPv4Network | IPv6Network], ) -> str: """Return the resolved client IP, honouring X-Forwarded-For when the TCP peer is a trusted upstream proxy. Falls back to the TCP peer address when no trusted networks are configured or the peer is not in the list.""" peer = request.client.host if request.client else "unknown" if trusted_networks and peer != "unknown": try: peer_addr = ipaddress.ip_address(peer) if any(peer_addr in net for net in trusted_networks): xff = request.headers.get("X-Forwarded-For", "").split(",")[0].strip() if xff: return xff real_ip = request.headers.get("X-Real-IP", "").strip() if real_ip: return real_ip except ValueError: pass return peer @dataclass class _Record: failures: int = 0 window_start: float = field(default_factory=time.time) blocked_until: float = 0.0 class LoginRateLimiter: def __init__( self, max_failures: int = 5, window_seconds: int = 300, cooldown_seconds: int = 900, ) -> None: self._max = max_failures self._window = window_seconds self._cooldown = cooldown_seconds self._store: dict[str, _Record] = {} self._lock = Lock() @property def cooldown_seconds(self) -> int: return self._cooldown def is_blocked(self, ip: str) -> bool: now = time.time() with self._lock: rec = self._store.get(ip) if rec is None: return False if rec.blocked_until > now: return True if rec.blocked_until > 0: del self._store[ip] return False def record_failure(self, ip: str) -> None: now = time.time() with self._lock: rec = self._store.get(ip) if rec is None: rec = _Record(window_start=now) self._store[ip] = rec if now - rec.window_start > self._window: rec.failures = 0 rec.window_start = now rec.failures += 1 if rec.failures >= self._max: rec.blocked_until = now + self._cooldown logger.warning( "Login blocked for %s after %d failures", ip, rec.failures ) def record_success(self, ip: str) -> None: with self._lock: self._store.pop(ip, None) ``` ### `api/app/config.py` additions ```python login_max_failures: int = 5 login_window_seconds: int = 300 login_cooldown_seconds: int = 900 login_trusted_proxy_ips: str = "" # comma-separated IPs/CIDRs; empty = disabled ``` ### `api/app/main.py` lifespan update ```python import ipaddress from app.auth.rate_limiter import LoginRateLimiter @asynccontextmanager async def lifespan(application: FastAPI): settings = get_settings() application.state.login_rate_limiter = LoginRateLimiter( max_failures=settings.login_max_failures, window_seconds=settings.login_window_seconds, cooldown_seconds=settings.login_cooldown_seconds, ) trusted_networks = [] for part in settings.login_trusted_proxy_ips.split(","): part = part.strip() if part: try: trusted_networks.append(ipaddress.ip_network(part, strict=False)) except ValueError: pass # invalid entry — skip silently application.state.login_trusted_networks = trusted_networks # ... existing DB setup unchanged engine = get_engine() async with engine.begin() as conn: await conn.run_sync(Base.metadata.create_all) yield await engine.dispose() ``` ### `api/app/routers/auth.py` update ```python from fastapi import APIRouter, Depends, HTTPException, Request from fastapi.responses import JSONResponse from pydantic import BaseModel from app.auth.jwt_provider import JWTAuthProvider from app.auth.rate_limiter import LoginRateLimiter, get_client_ip from app.dependencies import get_jwt_auth router = APIRouter(tags=["auth"]) class LoginRequest(BaseModel): username: str password: str class TokenResponse(BaseModel): access_token: str token_type: str = "bearer" expires_in: int @router.post("/auth/token", response_model=TokenResponse) async def login( request: Request, body: LoginRequest, auth: JWTAuthProvider = Depends(get_jwt_auth), ): limiter: LoginRateLimiter = request.app.state.login_rate_limiter ip: str = get_client_ip(request, request.app.state.login_trusted_networks) if limiter.is_blocked(ip): return JSONResponse( status_code=429, content={ "detail": "Too many failed login attempts. Please try again later.", "code": "login_rate_limited", }, headers={"Retry-After": str(limiter.cooldown_seconds)}, ) if not auth.verify_credentials(body.username, body.password): limiter.record_failure(ip) raise HTTPException( status_code=401, detail={"detail": "Invalid credentials", "code": "invalid_credentials"}, ) limiter.record_success(ip) token = auth.create_token() return TokenResponse( access_token=token, token_type="bearer", expires_in=auth._expiry_seconds, ) ``` ### `api/tests/unit/test_rate_limiter.py` (representative cases) ```python import time import pytest from app.auth.rate_limiter import LoginRateLimiter def test_not_blocked_initially(): limiter = LoginRateLimiter(max_failures=3, window_seconds=60, cooldown_seconds=300) assert limiter.is_blocked("1.2.3.4") is False def test_blocked_after_threshold(): limiter = LoginRateLimiter(max_failures=3, window_seconds=60, cooldown_seconds=300) for _ in range(3): limiter.record_failure("1.2.3.4") assert limiter.is_blocked("1.2.3.4") is True def test_success_clears_failures(): limiter = LoginRateLimiter(max_failures=3, window_seconds=60, cooldown_seconds=300) limiter.record_failure("1.2.3.4") limiter.record_failure("1.2.3.4") limiter.record_success("1.2.3.4") assert limiter.is_blocked("1.2.3.4") is False def test_ips_are_isolated(): limiter = LoginRateLimiter(max_failures=2, window_seconds=60, cooldown_seconds=300) limiter.record_failure("1.1.1.1") limiter.record_failure("1.1.1.1") assert limiter.is_blocked("2.2.2.2") is False ``` ### `api/tests/integration/test_login_rate_limit.py` (representative cases) ```python import pytest from httpx import AsyncClient # Uses the 'client' fixture (NoOpAuthProvider) from conftest — sufficient for this # endpoint since we're testing the rate-limit layer, not auth correctness. # The login endpoint instantiates its own limiter via app.state, so we need # the full ASGI app. BAD_CREDS = {"username": "attacker", "password": "wrong"} @pytest.mark.asyncio async def test_repeated_failures_trigger_429(client: AsyncClient): # Use a custom limiter with low threshold to avoid slow tests # (the app.state.login_rate_limiter is set in lifespan; override for test) from app.auth.rate_limiter import LoginRateLimiter from app.main import app original = app.state.login_rate_limiter app.state.login_rate_limiter = LoginRateLimiter( max_failures=3, window_seconds=60, cooldown_seconds=30 ) try: for _ in range(3): await client.post("/api/v1/auth/token", json=BAD_CREDS) resp = await client.post("/api/v1/auth/token", json=BAD_CREDS) assert resp.status_code == 429 assert resp.json()["code"] == "login_rate_limited" assert "Retry-After" in resp.headers finally: app.state.login_rate_limiter = original ``` --- ## Implementation Phases ### Phase 1 (MVP — P1): Blocking after repeated failures 1. Add `login_max_failures`, `login_window_seconds`, `login_cooldown_seconds`, `login_trusted_proxy_ips` to `api/app/config.py` 2. Create `api/app/auth/rate_limiter.py` with `LoginRateLimiter` and `get_client_ip()` 3. Initialize rate limiter and parse trusted networks in `api/app/main.py` lifespan; attach both to `app.state` 4. Update `api/app/routers/auth.py` to resolve client IP via `get_client_ip()`, then check + record outcomes 5. Unit tests: `api/tests/unit/test_rate_limiter.py` 6. Integration tests: `api/tests/integration/test_login_rate_limit.py` ### Phase 2 (US2 — observability): Logging and response hints Delivered as part of Phase 1 (the `logger.warning(...)` call and `Retry-After` header are embedded in the same implementation). No separate phase needed. --- ## Environment Variables to Add to `.env.example` ```dotenv # Login brute-force protection LOGIN_MAX_FAILURES=5 LOGIN_WINDOW_SECONDS=300 LOGIN_COOLDOWN_SECONDS=900 # Comma-separated IPs/CIDRs of trusted upstream proxies (e.g. nginx ingress pod CIDR). # Leave empty when not behind a reverse proxy. LOGIN_TRUSTED_PROXY_IPS= ``` These are optional (have defaults) so existing `.env` files without them continue working.