After LOGIN_MAX_FAILURES consecutive failed attempts from the same source IP within LOGIN_WINDOW_SECONDS, POST /api/v1/auth/token returns HTTP 429 with a Retry-After header for LOGIN_COOLDOWN_SECONDS. A successful login resets the counter. Trusted upstream proxy IPs/CIDRs can be declared via LOGIN_TRUSTED_PROXY_IPS so X-Forwarded-For is honoured correctly behind nginx ingress or similar reverse proxies.
Tasks: Login Brute-Force Protection
Input: Design documents from specs/009-login-rate-limiting/
Prerequisites: plan.md ✅, spec.md ✅, research.md ✅, data-model.md ✅, contracts/auth.md ✅, quickstart.md ✅
Tests: TDD is non-negotiable (§5.1). Every test task appears before the implementation task it covers. For each red step, run the test and confirm it fails before proceeding to the implementation.
Organization: Phase 1 adds env vars; Phase 2 adds config fields (shared by both stories); Phase 3 implements the core blocking behaviour (US1 MVP); Phase 4 adds observability-specific test coverage (US2); Phase 5 is polish.
Format: [ID] [P?] [Story] Description
- [P]: Can run in parallel with other [P] tasks in the same phase
- [Story]: Which user story this task belongs to
- Exact file paths included in every task description
Phase 1: Setup
- T001 Add a `# Login brute-force protection` comment block with `LOGIN_MAX_FAILURES=5`, `LOGIN_WINDOW_SECONDS=300`, `LOGIN_COOLDOWN_SECONDS=900`, and `LOGIN_TRUSTED_PROXY_IPS=` (empty by default, with an inline comment explaining it accepts comma-separated IPs/CIDRs) to both `.env.example` and `.env.test.example` at the repo root
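The T001 block could look like this in `.env.example` (variable names and defaults from the task description; comment wording is illustrative):

```
# Login brute-force protection
LOGIN_MAX_FAILURES=5
LOGIN_WINDOW_SECONDS=300
LOGIN_COOLDOWN_SECONDS=900
# Comma-separated trusted reverse-proxy IPs/CIDRs (e.g. "10.0.0.0/8"); empty = trust no proxy headers
LOGIN_TRUSTED_PROXY_IPS=
```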
Phase 2: Foundational
Purpose: Add the four new settings fields — required before any story implementation.
- T002 Add `login_max_failures: int = 5`, `login_window_seconds: int = 300`, `login_cooldown_seconds: int = 900`, and `login_trusted_proxy_ips: str = ""` to the `Settings` class in `api/app/config.py` (append after `owner_password`)
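A dependency-free sketch of the same defaults (field names and values from T002; the real `Settings` class is assumed to be pydantic-based, so this stand-in only illustrates the env-var-to-field mapping):

```python
import os
from dataclasses import dataclass, field


@dataclass
class LoginRateLimitSettings:
    """Stand-in for the four fields T002 appends to Settings (hypothetical class name)."""

    login_max_failures: int = field(
        default_factory=lambda: int(os.getenv("LOGIN_MAX_FAILURES", "5"))
    )
    login_window_seconds: int = field(
        default_factory=lambda: int(os.getenv("LOGIN_WINDOW_SECONDS", "300"))
    )
    login_cooldown_seconds: int = field(
        default_factory=lambda: int(os.getenv("LOGIN_COOLDOWN_SECONDS", "900"))
    )
    login_trusted_proxy_ips: str = field(
        default_factory=lambda: os.getenv("LOGIN_TRUSTED_PROXY_IPS", "")
    )
```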
Checkpoint: `api/app/config.py` accepts all four new env vars with defaults.
Phase 3: User Story 1 — Repeated failed logins are blocked (Priority: P1) 🎯 MVP
Goal: After LOGIN_MAX_FAILURES consecutive failed login attempts from the same source IP within LOGIN_WINDOW_SECONDS, POST /api/v1/auth/token returns HTTP 429 for LOGIN_COOLDOWN_SECONDS. A successful login resets the counter.
Independent Test: cd api && python -m pytest tests/unit/test_rate_limiter.py tests/integration/test_login_rate_limit.py::test_repeated_failures_trigger_429 tests/integration/test_login_rate_limit.py::test_success_resets_counter tests/integration/test_login_rate_limit.py::test_429_has_retry_after_header tests/integration/test_login_rate_limit.py::test_xff_header_ignored_when_no_trusted_networks -v — all pass.
Tests for User Story 1 (TDD red — write first, confirm failure before T005)
- T003 [P] [US1] Create `api/tests/unit/test_rate_limiter.py` with ten failing unit tests — import `LoginRateLimiter` and `get_client_ip` from `app.auth.rate_limiter`; for `LoginRateLimiter` (instantiate with `max_failures=3, window_seconds=60, cooldown_seconds=300`): `test_not_blocked_initially`, `test_blocked_after_threshold`, `test_success_clears_failures`, `test_ips_are_isolated`, `test_window_resets_after_expiry`, `test_log_warning_on_lockout` (caplog at WARNING level: call `record_failure()` until threshold, assert `"Login blocked" in caplog.text` and IP in log output); for `get_client_ip` (construct a mock using `from unittest.mock import MagicMock` and `from starlette.requests import Request`: `req = MagicMock(spec=Request); req.client.host = "10.0.0.1"; req.headers = {"X-Forwarded-For": "203.0.113.5"}`): `test_get_client_ip_no_trusted_networks_returns_peer` (empty `trusted_networks=[]` → returns `req.client.host`), `test_get_client_ip_trusted_peer_uses_xff` (peer `"10.0.0.1"` in trusted CIDR `"10.0.0.0/8"` → returns `"203.0.113.5"`), `test_get_client_ip_untrusted_peer_ignores_xff` (peer `"8.8.8.8"` not in trusted CIDR `"10.0.0.0/8"` → returns `"8.8.8.8"` despite XFF), `test_get_client_ip_trusted_peer_falls_back_to_real_ip` (peer trusted, no XFF header, `X-Real-IP: "203.0.113.9"` → returns `"203.0.113.9"`); run `python -m pytest tests/unit/test_rate_limiter.py -v` and confirm `ImportError` or `ModuleNotFoundError` (red)
- T004 [P] [US1] Create `api/tests/integration/test_login_rate_limit.py` with four failing integration tests; each must override both `app.state.login_rate_limiter` (fresh `LoginRateLimiter(max_failures=3, window_seconds=60, cooldown_seconds=30)`) and `app.state.login_trusted_networks` (set to `[]` for all four tests — the `ASGITransport` peer is `"testclient"`, not a valid IP, so trusted-network matching can't be exercised here; proxy extraction is fully covered by T003 unit tests) via try/finally: (1) `test_repeated_failures_trigger_429` — POST three bad-credential requests then assert the fourth returns 429 with `resp.json()["code"] == "login_rate_limited"`; (2) `test_success_resets_counter` — two failures → one valid login using `{"username": os.environ["OWNER_USERNAME"], "password": os.environ["OWNER_PASSWORD"]}` (matching conftest.py defaults: `testowner`/`testpassword`) → three more failures → assert all three return 401, not 429; (3) `test_429_has_retry_after_header` — trigger lockout (three failures), then assert `"Retry-After" in resp.headers` and `int(resp.headers["Retry-After"]) > 0`; (4) `test_xff_header_ignored_when_no_trusted_networks` — send three bad-cred requests with `headers={"X-Forwarded-For": "1.2.3.4"}` then a fourth with `headers={"X-Forwarded-For": "9.9.9.9"}` — assert the fourth returns 429 (not 401), proving the limiter tracked the real peer `"testclient"` for all requests and XFF was ignored; run `python -m pytest tests/integration/test_login_rate_limit.py -v` and confirm failure (red)
Implementation for User Story 1
- T005 [US1] Create `api/app/auth/rate_limiter.py` with two exports: (1) `get_client_ip(request: Request, trusted_networks: list[IPv4Network | IPv6Network]) -> str` — imports `ipaddress`, `from ipaddress import IPv4Network, IPv6Network`, `from starlette.requests import Request`; extracts `peer = request.client.host if request.client else "unknown"`; if `trusted_networks` is non-empty and peer is parseable as an IP address and falls within any trusted network, returns the first `X-Forwarded-For` entry (strip whitespace) or the `X-Real-IP` value, otherwise returns `peer`; wraps `ipaddress.ip_address(peer)` in `try/except ValueError` and falls back to `peer` on error; (2) `LoginRateLimiter` class: `__init__(self, max_failures: int = 5, window_seconds: int = 300, cooldown_seconds: int = 900)` storing params as `_max`, `_window`, `_cooldown`; `_store: dict[str, _Record]` and `_lock: threading.Lock`; `@dataclass _Record` with `failures: int = 0`, `window_start: float = field(default_factory=time.time)`, `blocked_until: float = 0.0`; `is_blocked(ip: str) -> bool`, `record_failure(ip: str) -> None` (logs WARNING on lockout), `record_success(ip: str) -> None`, `cooldown_seconds` property; stdlib imports: `import ipaddress, logging, time`, `from dataclasses import dataclass, field`, `from threading import Lock`
- T006 [US1] Update `api/app/main.py` lifespan: add `import ipaddress` at top; import `LoginRateLimiter` from `app.auth.rate_limiter`; inside `lifespan` before `engine = get_engine()`, consolidate to `settings = get_settings()` (remove the existing bare `get_settings()` call), then set `application.state.login_rate_limiter = LoginRateLimiter(max_failures=settings.login_max_failures, window_seconds=settings.login_window_seconds, cooldown_seconds=settings.login_cooldown_seconds)`; then parse `settings.login_trusted_proxy_ips` — split on `","`, strip each part, skip empty strings, call `ipaddress.ip_network(part, strict=False)` inside a `try/except ValueError` (skip invalid entries silently), collect results into `trusted_networks: list`; set `application.state.login_trusted_networks = trusted_networks`
- T007 [US1] Update `api/app/routers/auth.py` login endpoint: add `Request` to FastAPI imports and add `from fastapi.responses import JSONResponse`; add `from app.auth.rate_limiter import LoginRateLimiter, get_client_ip`; add `request: Request` as first parameter to `login()`; extract `limiter: LoginRateLimiter = request.app.state.login_rate_limiter` and `ip: str = get_client_ip(request, request.app.state.login_trusted_networks)`; add guard block — if `limiter.is_blocked(ip)`: return `JSONResponse(status_code=429, content={"detail": "Too many failed login attempts. Please try again later.", "code": "login_rate_limited"}, headers={"Retry-After": str(limiter.cooldown_seconds)})`; after `verify_credentials` returns False: call `limiter.record_failure(ip)` before the existing `HTTPException`; after `auth.create_token()`: call `limiter.record_success(ip)` before returning `TokenResponse`
- T008 [US1] Verify TDD green: run `cd api && python -m pytest tests/unit/test_rate_limiter.py -v` — all 10 pass; run `make test-integration` — all tests pass including `test_repeated_failures_trigger_429`, `test_success_resets_counter`, `test_429_has_retry_after_header`, and `test_xff_header_ignored_when_no_trusted_networks`
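Under the interface T005 specifies, a minimal in-process sketch of the limiter (state shape, locking, and log wording taken from the task text; the actual implementation may differ in detail):

```python
import logging
import threading
import time
from dataclasses import dataclass, field

logger = logging.getLogger(__name__)


@dataclass
class _Record:
    failures: int = 0
    window_start: float = field(default_factory=time.time)
    blocked_until: float = 0.0


class LoginRateLimiter:
    """Per-IP failure counter with a fixed window and a cooldown lockout."""

    def __init__(self, max_failures: int = 5, window_seconds: int = 300,
                 cooldown_seconds: int = 900):
        self._max = max_failures
        self._window = window_seconds
        self._cooldown = cooldown_seconds
        self._store: dict[str, _Record] = {}
        self._lock = threading.Lock()

    @property
    def cooldown_seconds(self) -> int:
        return self._cooldown

    def is_blocked(self, ip: str) -> bool:
        with self._lock:
            rec = self._store.get(ip)
            return bool(rec and rec.blocked_until > time.time())

    def record_failure(self, ip: str) -> None:
        now = time.time()
        with self._lock:
            rec = self._store.setdefault(ip, _Record())
            if now - rec.window_start > self._window:
                rec.failures, rec.window_start = 0, now  # window expired: start fresh
            rec.failures += 1
            if rec.failures >= self._max:
                rec.blocked_until = now + self._cooldown
                logger.warning("Login blocked for %s after %d failures", ip, rec.failures)

    def record_success(self, ip: str) -> None:
        with self._lock:
            self._store.pop(ip, None)  # successful login resets the counter
```

The per-IP dict plus a single lock keeps the limiter correct under FastAPI's threaded test client; state is in-memory only, so it resets on process restart.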
Checkpoint: Brute-force blocking is live. Automated repeated failures are stopped after threshold; the owner can still log in after cooldown; unit and integration tests pass.
Phase 4: User Story 2 — Operators can observe blocking activity (Priority: P2)
Goal: The 429 response includes a Retry-After header with a positive integer; the response body code is "login_rate_limited" and contains no threshold numeric values; server logs a WARNING when blocking triggers.
Independent Test: Trigger the rate limiter (already works from Phase 3) and assert Retry-After header is present in the response and code field is "login_rate_limited".
Tests for User Story 2 (TDD red — extend existing file)
- T009 [US2] Add one test to `api/tests/integration/test_login_rate_limit.py` targeting observability properties not yet covered: `test_429_body_shape` — override `app.state.login_rate_limiter` with a fresh `LoginRateLimiter(max_failures=3, window_seconds=60, cooldown_seconds=30)` via try/finally (same isolation pattern as T004), trigger lockout (three failures), then assert `resp.json() == {"detail": "Too many failed login attempts. Please try again later.", "code": "login_rate_limited"}` (exact match — confirms no threshold values leak and the shape is correct); confirm this test is green immediately against the US1 implementation (T007 already returns this exact body)
Checkpoint: US2 observability properties are explicitly exercised by integration tests; a future regression in the Retry-After header or code field will be caught.
Phase 5: Polish & Cross-Cutting Concerns
- T010 Run `cd api && ruff check app/auth/rate_limiter.py app/routers/auth.py app/config.py app/main.py tests/unit/test_rate_limiter.py tests/integration/test_login_rate_limit.py` — fix any violations
Dependencies & Execution Order
Phase Dependencies
- Phase 1 (Setup): No external dependencies — can start immediately
- Phase 2 (Foundational): No external dependencies — can start immediately (parallel with Phase 1)
- Phase 3 (US1): Depends on Phase 2 (T002 must exist before T006 can use `settings.login_max_failures`)
- Phase 4 (US2): Depends on Phase 3 (tests verify behaviour implemented in T007)
- Phase 5 (Polish): Depends on all prior phases
Within Phase 3
- T003 ∥ T004 (different files, no dependency — write tests in parallel)
- T005 after T003, T004 (implement after tests confirm they fail)
- T006 ∥ T007 after T005 (both import from `rate_limiter.py`; write to different files — `main.py` and `auth.py`; T006 sets `app.state.login_trusted_networks`, which T007's router reads)
- T008 after T005, T006, T007 (verify all pass)
Execution Order Summary
Step 1: T001 ∥ T002 (setup + foundational — parallel, different files)
Step 2: T003 ∥ T004 (write failing tests — parallel)
Step 3: T005 (implement LoginRateLimiter — after red tests confirmed)
Step 4: T006 ∥ T007 (wire limiter into app — parallel, different files)
Step 5: T008 (verify green)
Step 6: T009 (US2 observability tests — verify green immediately)
Step 7: T010 (ruff clean)
Implementation Strategy
MVP (US1 — the blocker)
- Complete T001–T002 (config setup)
- Complete T003–T008 (core blocking)
- Validate: Run `make test-integration` — all 88 existing tests still pass; the new rate-limit tests pass
- US2 adds verification coverage for already-implemented observability features
Incremental Delivery
- After Phase 3: Brute-force attacks on the login endpoint are blocked — core security net is in place
- After Phase 4: Observability properties are explicitly tested — regressions in headers/logs will be caught
- After Phase 5: Lint clean, ready for merge