Tasks: Login Brute-Force Protection

Input: Design documents from specs/009-login-rate-limiting/. Prerequisites: plan.md, spec.md, research.md, data-model.md, contracts/auth.md, quickstart.md.

Tests: TDD is non-negotiable (§5.1). Every test task appears before the implementation task it covers. For each red step, run the test and confirm it fails before proceeding to the implementation.

Organization: Phase 1 adds env vars; Phase 2 adds config fields (shared by both stories); Phase 3 implements the core blocking behaviour (US1 MVP); Phase 4 adds observability-specific test coverage (US2); Phase 5 is polish.

Format: [ID] [P?] [Story] Description

  • [P]: Can run in parallel with other [P] tasks in the same phase
  • [Story]: Which user story this task belongs to
  • Exact file paths included in every task description

Phase 1: Setup

  • T001 Add a # Login brute-force protection comment block with LOGIN_MAX_FAILURES=5, LOGIN_WINDOW_SECONDS=300, LOGIN_COOLDOWN_SECONDS=900, and LOGIN_TRUSTED_PROXY_IPS= (empty by default, with an inline comment explaining it accepts comma-separated IPs/CIDRs) to both .env.example and .env.test.example at the repo root
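
For reference, the resulting block in both files might read as follows (comment wording is illustrative; the values are the defaults named above):

```
# Login brute-force protection
LOGIN_MAX_FAILURES=5
LOGIN_WINDOW_SECONDS=300
LOGIN_COOLDOWN_SECONDS=900
# Comma-separated IPs/CIDRs of trusted upstream proxies (empty disables proxy-header handling)
LOGIN_TRUSTED_PROXY_IPS=
```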

Phase 2: Foundational

Purpose: Add the four new settings fields — required before any story implementation.

  • T002 Add login_max_failures: int = 5, login_window_seconds: int = 300, login_cooldown_seconds: int = 900, login_trusted_proxy_ips: str = "" to the Settings class in api/app/config.py (append after owner_password)
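
A sketch of the resulting fields, assuming Settings is a pydantic-settings BaseSettings class (implied by the env-var-driven defaults; match whatever base class and imports config.py actually uses):

```python
from pydantic_settings import BaseSettings  # assumed import; keep what config.py already has


class Settings(BaseSettings):
    # ... existing fields, ending with owner_password ...
    login_max_failures: int = 5
    login_window_seconds: int = 300
    login_cooldown_seconds: int = 900
    login_trusted_proxy_ips: str = ""  # comma-separated IPs/CIDRs; "" disables proxy handling
```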

Checkpoint: api/app/config.py accepts all four new env vars with defaults.


Phase 3: User Story 1 — Repeated failed logins are blocked (Priority: P1) 🎯 MVP

Goal: After LOGIN_MAX_FAILURES consecutive failed login attempts from the same source IP within LOGIN_WINDOW_SECONDS, POST /api/v1/auth/token returns HTTP 429 for LOGIN_COOLDOWN_SECONDS. A successful login resets the counter.
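
Seen from a client, the target behaviour looks roughly like this (illustrative transcript; the host and form-encoded credentials are assumptions, while the 429 body, code, and Retry-After header come from T007):

```
$ # repeat LOGIN_MAX_FAILURES times with bad credentials
$ curl -i -X POST https://api.example.test/api/v1/auth/token -d 'username=owner&password=wrong'
HTTP/1.1 401 Unauthorized

$ # the next attempt from the same IP within the window is blocked
$ curl -i -X POST https://api.example.test/api/v1/auth/token -d 'username=owner&password=wrong'
HTTP/1.1 429 Too Many Requests
Retry-After: 900

{"detail": "Too many failed login attempts. Please try again later.", "code": "login_rate_limited"}
```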

Independent Test: cd api && python -m pytest tests/unit/test_rate_limiter.py tests/integration/test_login_rate_limit.py::test_repeated_failures_trigger_429 tests/integration/test_login_rate_limit.py::test_success_resets_counter tests/integration/test_login_rate_limit.py::test_429_has_retry_after_header tests/integration/test_login_rate_limit.py::test_xff_header_ignored_when_no_trusted_networks -v — all pass.

Tests for User Story 1 (TDD red — write first, confirm failure before T005)

  • T003 [P] [US1] Create api/tests/unit/test_rate_limiter.py with ten failing unit tests — import LoginRateLimiter and get_client_ip from app.auth.rate_limiter; for LoginRateLimiter (instantiate with max_failures=3, window_seconds=60, cooldown_seconds=300): test_not_blocked_initially, test_blocked_after_threshold, test_success_clears_failures, test_ips_are_isolated, test_window_resets_after_expiry, test_log_warning_on_lockout (caplog at WARNING level: call record_failure() until threshold, assert "Login blocked" in caplog.text and IP in log output); for get_client_ip (construct a mock using from unittest.mock import MagicMock and from starlette.requests import Request: req = MagicMock(spec=Request); req.client.host = "10.0.0.1"; req.headers = {"X-Forwarded-For": "203.0.113.5"}): test_get_client_ip_no_trusted_networks_returns_peer (empty trusted_networks=[] → returns req.client.host), test_get_client_ip_trusted_peer_uses_xff (peer "10.0.0.1" in trusted CIDR "10.0.0.0/8" → returns "203.0.113.5"), test_get_client_ip_untrusted_peer_ignores_xff (peer "8.8.8.8" not in trusted CIDR "10.0.0.0/8" → returns "8.8.8.8" despite XFF), test_get_client_ip_trusted_peer_falls_back_to_real_ip (peer trusted, no XFF header, X-Real-IP: "203.0.113.9" → returns "203.0.113.9"); run python -m pytest tests/unit/test_rate_limiter.py -v and confirm ImportError or ModuleNotFoundError (red)
  • T004 [P] [US1] Create api/tests/integration/test_login_rate_limit.py with four failing integration tests; each must override both app.state.login_rate_limiter (fresh LoginRateLimiter(max_failures=3, window_seconds=60, cooldown_seconds=30)) and app.state.login_trusted_networks (set to [] for all four tests — the ASGITransport peer is "testclient", not a valid IP, so trusted-network matching can't be exercised here; proxy extraction is fully covered by T003 unit tests) via try/finally: (1) test_repeated_failures_trigger_429 — POST three bad-credential requests then assert fourth returns 429 with resp.json()["code"] == "login_rate_limited"; (2) test_success_resets_counter — two failures → one valid login using {"username": os.environ["OWNER_USERNAME"], "password": os.environ["OWNER_PASSWORD"]} (matching conftest.py defaults: testowner/testpassword) → three more failures → assert all three return 401, not 429; (3) test_429_has_retry_after_header — trigger lockout (three failures), then assert "Retry-After" in resp.headers and int(resp.headers["Retry-After"]) > 0; (4) test_xff_header_ignored_when_no_trusted_networks — send three bad-cred requests with headers={"X-Forwarded-For": "1.2.3.4"} then a fourth with headers={"X-Forwarded-For": "9.9.9.9"} — assert the fourth returns 429 (not 401), proving the limiter tracked the real peer "testclient" for all requests and XFF was ignored; run python -m pytest tests/integration/test_login_rate_limit.py -v and confirm failure (red)
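
As a concrete example, the trusted-peer XFF case from T003 might look like this (a sketch; it assumes the module shape T005 defines):

```python
from ipaddress import ip_network
from unittest.mock import MagicMock

from starlette.requests import Request

from app.auth.rate_limiter import get_client_ip


def test_get_client_ip_trusted_peer_uses_xff():
    req = MagicMock(spec=Request)
    req.client.host = "10.0.0.1"  # the peer is the reverse proxy itself
    req.headers = {"X-Forwarded-For": "203.0.113.5"}
    # peer falls inside the trusted CIDR, so the XFF value wins
    assert get_client_ip(req, [ip_network("10.0.0.0/8")]) == "203.0.113.5"
```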

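And the state-override pattern shared by all four T004 tests, sketched below (the client/app fixture names and the credential payload encoding are assumptions; adapt them to the existing suite):

```python
from app.auth.rate_limiter import LoginRateLimiter


def test_repeated_failures_trigger_429(client, app):  # hypothetical fixture names
    saved_limiter = app.state.login_rate_limiter
    saved_networks = app.state.login_trusted_networks
    app.state.login_rate_limiter = LoginRateLimiter(
        max_failures=3, window_seconds=60, cooldown_seconds=30
    )
    app.state.login_trusted_networks = []
    try:
        bad = {"username": "nobody", "password": "wrong"}
        for _ in range(3):
            client.post("/api/v1/auth/token", data=bad)  # form vs JSON: match the suite
        resp = client.post("/api/v1/auth/token", data=bad)
        assert resp.status_code == 429
        assert resp.json()["code"] == "login_rate_limited"
    finally:
        app.state.login_rate_limiter = saved_limiter
        app.state.login_trusted_networks = saved_networks
```
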
Implementation for User Story 1

  • T005 [US1] Create api/app/auth/rate_limiter.py with two exports: (1) get_client_ip(request: Request, trusted_networks: list[IPv4Network | IPv6Network]) -> str — imports ipaddress, from ipaddress import IPv4Network, IPv6Network, from starlette.requests import Request; extracts peer = request.client.host if request.client else "unknown"; if trusted_networks is non-empty and peer is parseable as an IP address and falls within any trusted network, returns first X-Forwarded-For entry (strip whitespace) or X-Real-IP value, otherwise returns peer; wraps ipaddress.ip_address(peer) in try/except ValueError and falls back to peer on error; (2) LoginRateLimiter class: __init__(self, max_failures: int = 5, window_seconds: int = 300, cooldown_seconds: int = 900) storing params as _max, _window, _cooldown; _store: dict[str, _Record] and _lock: threading.Lock; @dataclass _Record with failures: int = 0, window_start: float = field(default_factory=time.time), blocked_until: float = 0.0; is_blocked(ip: str) -> bool, record_failure(ip: str) -> None (logs WARNING on lockout), record_success(ip: str) -> None, cooldown_seconds property; stdlib imports: import ipaddress, logging, time, from dataclasses import dataclass, field, from threading import Lock
  • T006 [US1] Update api/app/main.py lifespan: add import ipaddress at top; import LoginRateLimiter from app.auth.rate_limiter; inside lifespan before engine = get_engine(), consolidate to settings = get_settings() (remove the existing bare get_settings() call), then set application.state.login_rate_limiter = LoginRateLimiter(max_failures=settings.login_max_failures, window_seconds=settings.login_window_seconds, cooldown_seconds=settings.login_cooldown_seconds); then parse settings.login_trusted_proxy_ips — split on ",", strip each part, skip empty strings, call ipaddress.ip_network(part, strict=False) inside a try/except ValueError (skip invalid entries silently), collect results into trusted_networks: list; set application.state.login_trusted_networks = trusted_networks
  • T007 [US1] Update api/app/routers/auth.py login endpoint: add Request to FastAPI imports and add from fastapi.responses import JSONResponse; add from app.auth.rate_limiter import LoginRateLimiter, get_client_ip; add request: Request as first parameter to login(); extract limiter: LoginRateLimiter = request.app.state.login_rate_limiter and ip: str = get_client_ip(request, request.app.state.login_trusted_networks); add guard block — if limiter.is_blocked(ip): return JSONResponse(status_code=429, content={"detail": "Too many failed login attempts. Please try again later.", "code": "login_rate_limited"}, headers={"Retry-After": str(limiter.cooldown_seconds)}); after verify_credentials returns False: call limiter.record_failure(ip) before the existing HTTPException; after auth.create_token(): call limiter.record_success(ip) before returning TokenResponse
  • T008 [US1] Verify TDD green: run cd api && python -m pytest tests/unit/test_rate_limiter.py -v — all 10 pass; run make test-integration — all tests pass including test_repeated_failures_trigger_429, test_success_resets_counter, test_429_has_retry_after_header, and test_xff_header_ignored_when_no_trusted_networks
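
A minimal sketch of the T005 module under the signatures above (illustrative only; the real implementation is whatever turns the T003 tests green):

```python
"""Sketch of the T005 design for api/app/auth/rate_limiter.py."""
import ipaddress
import logging
import time
from dataclasses import dataclass, field
from ipaddress import IPv4Network, IPv6Network
from threading import Lock

from starlette.requests import Request

logger = logging.getLogger(__name__)


def get_client_ip(request: Request, trusted_networks: list[IPv4Network | IPv6Network]) -> str:
    """Return the client IP, honouring proxy headers only for trusted peers."""
    peer = request.client.host if request.client else "unknown"
    if not trusted_networks:
        return peer
    try:
        addr = ipaddress.ip_address(peer)
    except ValueError:
        return peer  # peer isn't a parseable IP (e.g. "testclient")
    if any(addr in net for net in trusted_networks):
        xff = request.headers.get("X-Forwarded-For")
        if xff:
            return xff.split(",")[0].strip()
        real_ip = request.headers.get("X-Real-IP")
        if real_ip:
            return real_ip
    return peer


@dataclass
class _Record:
    failures: int = 0
    window_start: float = field(default_factory=time.time)
    blocked_until: float = 0.0


class LoginRateLimiter:
    def __init__(self, max_failures: int = 5, window_seconds: int = 300,
                 cooldown_seconds: int = 900) -> None:
        self._max = max_failures
        self._window = window_seconds
        self._cooldown = cooldown_seconds
        self._store: dict[str, _Record] = {}
        self._lock = Lock()

    @property
    def cooldown_seconds(self) -> int:
        return self._cooldown

    def is_blocked(self, ip: str) -> bool:
        with self._lock:
            record = self._store.get(ip)
            return bool(record and record.blocked_until > time.time())

    def record_failure(self, ip: str) -> None:
        now = time.time()
        with self._lock:
            record = self._store.setdefault(ip, _Record())
            if now - record.window_start > self._window:
                record.failures = 0  # stale window: start counting afresh
                record.window_start = now
            record.failures += 1
            if record.failures >= self._max:
                record.blocked_until = now + self._cooldown
                logger.warning("Login blocked for %s after %d failures", ip, record.failures)

    def record_success(self, ip: str) -> None:
        with self._lock:
            self._store.pop(ip, None)  # successful login resets the counter
```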

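The wiring from T006 and T007 then reduces to two short fragments (not complete files; names follow the task descriptions above):

```python
# T006 sketch: inside the lifespan in api/app/main.py, after settings = get_settings()
application.state.login_rate_limiter = LoginRateLimiter(
    max_failures=settings.login_max_failures,
    window_seconds=settings.login_window_seconds,
    cooldown_seconds=settings.login_cooldown_seconds,
)
trusted_networks: list = []
for part in settings.login_trusted_proxy_ips.split(","):
    part = part.strip()
    if not part:
        continue
    try:
        # strict=False lets a bare host IP parse as its /32 (or /128) network
        trusted_networks.append(ipaddress.ip_network(part, strict=False))
    except ValueError:
        continue  # skip invalid entries silently, per T006
application.state.login_trusted_networks = trusted_networks
```

```python
# T007 sketch: guard at the top of the login endpoint in api/app/routers/auth.py
limiter: LoginRateLimiter = request.app.state.login_rate_limiter
ip: str = get_client_ip(request, request.app.state.login_trusted_networks)
if limiter.is_blocked(ip):
    return JSONResponse(
        status_code=429,
        content={
            "detail": "Too many failed login attempts. Please try again later.",
            "code": "login_rate_limited",
        },
        headers={"Retry-After": str(limiter.cooldown_seconds)},
    )
```
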
Checkpoint: Brute-force blocking is live. Automated repeated failures are stopped after threshold; the owner can still log in after cooldown; unit and integration tests pass.


Phase 4: User Story 2 — Operators can observe blocking activity (Priority: P2)

Goal: The 429 response includes a Retry-After header with a positive integer; the response body code is "login_rate_limited" and contains no threshold numeric values; server logs a WARNING when blocking triggers.

Independent Test: Trigger the rate limiter (already works from Phase 3) and assert Retry-After header is present in the response and code field is "login_rate_limited".

Tests for User Story 2 (TDD red — extend existing file)

  • T009 [US2] Add one test to api/tests/integration/test_login_rate_limit.py targeting observability properties not yet covered: test_429_body_shape — override app.state.login_rate_limiter with a fresh LoginRateLimiter(max_failures=3, window_seconds=60, cooldown_seconds=30) via try/finally (same isolation pattern as T004), trigger lockout (three failures), then assert resp.json() == {"detail": "Too many failed login attempts. Please try again later.", "code": "login_rate_limited"} (exact match — confirms no threshold values leak and shape is correct); confirm this test is green immediately against the US1 implementation (T007 already returns this exact body)
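
The heart of T009 is a single exact-match assertion (sketch; resp is the response to the fourth request after lockout):

```python
# Exact match: proves the body shape and that no threshold values leak
assert resp.json() == {
    "detail": "Too many failed login attempts. Please try again later.",
    "code": "login_rate_limited",
}
```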

Checkpoint: US2 observability properties are explicitly exercised by integration tests; a future regression in the Retry-After header or code field will be caught.


Phase 5: Polish & Cross-Cutting Concerns

  • T010 Run cd api && ruff check app/auth/rate_limiter.py app/routers/auth.py app/config.py app/main.py tests/unit/test_rate_limiter.py tests/integration/test_login_rate_limit.py — fix any violations

Dependencies & Execution Order

Phase Dependencies

  • Phase 1 (Setup): No external dependencies — can start immediately
  • Phase 2 (Foundational): No external dependencies — can start immediately (parallel with Phase 1)
  • Phase 3 (US1): Depends on Phase 2 (T002 must exist before T006 can use settings.login_max_failures)
  • Phase 4 (US2): Depends on Phase 3 (tests verify behaviour implemented in T007)
  • Phase 5 (Polish): Depends on all prior phases

Within Phase 3

  • T003 ∥ T004 (different files, no dependency — write tests in parallel)
  • T005 after T003, T004 (implement after tests confirm they fail)
  • T006 ∥ T007 after T005 (both import from rate_limiter.py; write to different files — main.py and auth.py; T006 sets app.state.login_trusted_networks which T007's router reads)
  • T008 after T005, T006, T007 (verify all pass)

Execution Order Summary

Step 1: T001 ∥ T002 (setup + foundational — parallel, different files)
Step 2: T003 ∥ T004 (write failing tests — parallel)
Step 3: T005 (implement LoginRateLimiter — after red tests confirmed)
Step 4: T006 ∥ T007 (wire limiter into app — parallel, different files)
Step 5: T008 (verify green)
Step 6: T009 (US2 observability tests — verify green immediately)
Step 7: T010 (ruff clean)

Implementation Strategy

MVP (US1 — the blocker)

  1. Complete T001-T002 (config setup)
  2. Complete T003-T008 (core blocking)
  3. Validate: Run make test-integration — all 88 existing tests still pass; the 4 new rate-limit tests from T004 pass
  4. US2 adds verification coverage for already-implemented observability features

Incremental Delivery

  • After Phase 3: Brute-force attacks on the login endpoint are blocked — core security net is in place
  • After Phase 4: Observability properties are explicitly tested — regressions in headers/logs will be caught
  • After Phase 5: Lint clean, ready for merge