Feat: Rate-limit login endpoint to block brute-force attacks
After LOGIN_MAX_FAILURES consecutive failed attempts from the same source IP within LOGIN_WINDOW_SECONDS, POST /api/v1/auth/token returns HTTP 429 with a Retry-After header for LOGIN_COOLDOWN_SECONDS. A successful login resets the counter. Trusted upstream proxy IPs/CIDRs can be declared via LOGIN_TRUSTED_PROXY_IPS so X-Forwarded-For is honoured correctly behind nginx ingress or similar reverse proxies. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
388
specs/009-login-rate-limiting/plan.md
Normal file
388
specs/009-login-rate-limiting/plan.md
Normal file
@@ -0,0 +1,388 @@
|
||||
# Implementation Plan: Login Brute-Force Protection
|
||||
|
||||
**Branch**: `009-login-rate-limiting` | **Date**: 2026-05-06 | **Spec**: [spec.md](spec.md)
|
||||
**Input**: Feature specification from `specs/009-login-rate-limiting/spec.md`
|
||||
|
||||
## Summary
|
||||
|
||||
Add failure-counting brute-force protection to the login endpoint (`POST /api/v1/auth/token`).
|
||||
After a configurable number of consecutive failed attempts from the same resolved client IP,
|
||||
the endpoint returns HTTP 429 with a `Retry-After` header for a configurable cooldown period.
|
||||
A successful login resets the counter. All thresholds are configurable via environment variables.
|
||||
When deployed behind a reverse proxy (nginx, Kubernetes ingress), a `LOGIN_TRUSTED_PROXY_IPS`
|
||||
setting enables extraction of the real client IP from `X-Forwarded-For`. No new infrastructure
|
||||
(no Redis, no new DB table) — counters live in process memory.
|
||||
|
||||
---
|
||||
|
||||
## Technical Context
|
||||
|
||||
**Language/Version**: Python 3.12+
|
||||
**Primary Dependencies**: FastAPI, pydantic-settings (already in use); no new dependencies added
|
||||
**Storage**: In-memory `dict` (no persistence across restarts — intentional)
|
||||
**Testing**: pytest + pytest-asyncio (existing test infrastructure)
|
||||
**Target Platform**: Linux server (Docker)
|
||||
**Project Type**: Web service (API only — this feature has no UI surface)
|
||||
**Performance Goals**: Rate limiter adds negligible overhead (dict lookup + lock acquisition; sub-millisecond)
|
||||
**Constraints**: Must not add new runtime service dependencies; must not change any auth behaviour for non-blocked sources
|
||||
**Scale/Scope**: Single process, single user; in-memory store is sufficient
|
||||
|
||||
---
|
||||
|
||||
## Constitution Check
|
||||
|
||||
| Principle | Status | Notes |
|
||||
|-----------|--------|-------|
|
||||
| §2.4 Auth abstraction (AuthProvider interface) | ✅ Pass | Rate limiter is a guard *before* `JWTAuthProvider.verify_credentials()`, not a bypass of the interface |
|
||||
| §2.5 DB abstraction (repository layer) | ✅ Pass | No database access; in-memory only |
|
||||
| §2.6 No speculative abstraction | ✅ Pass | Concrete `LoginRateLimiter` class, no interface; only one implementation planned |
|
||||
| §3.3 Error envelope (`detail` + `code`) | ✅ Pass | 429 response uses `{"detail": "...", "code": "login_rate_limited"}` |
|
||||
| §5.1 TDD | ✅ Required | Tasks follow red → green order |
|
||||
| §5.2 Integration tests against PostgreSQL | ✅ Pass | Integration test for the login endpoint will run against the Docker PostgreSQL stack |
|
||||
| §7.2 Environment configuration | ✅ Pass | `LOGIN_MAX_FAILURES`, `LOGIN_WINDOW_SECONDS`, `LOGIN_COOLDOWN_SECONDS`, `LOGIN_TRUSTED_PROXY_IPS` from env vars |
|
||||
| §7.3 Linting (ruff) | ✅ Required | All new files must pass `ruff check` |
|
||||
|
||||
**Gate result**: No violations. Cleared to proceed.
|
||||
|
||||
---
|
||||
|
||||
## Project Structure
|
||||
|
||||
### Documentation (this feature)
|
||||
|
||||
```text
|
||||
specs/009-login-rate-limiting/
|
||||
├── plan.md ← this file
|
||||
├── research.md ← decisions on approach
|
||||
├── data-model.md ← rate-limit record entity
|
||||
├── quickstart.md ← curl runbook
|
||||
├── contracts/
|
||||
│ └── auth.md ← updated POST /api/v1/auth/token with 429
|
||||
└── tasks.md ← generated by /speckit-tasks
|
||||
```
|
||||
|
||||
### Source Code Changes
|
||||
|
||||
```text
|
||||
api/
|
||||
├── app/
|
||||
│ ├── auth/
|
||||
│ │ ├── rate_limiter.py ← NEW: LoginRateLimiter class
|
||||
│ │ ├── jwt_provider.py (unchanged)
|
||||
│ │ ├── noop.py (unchanged)
|
||||
│ │ └── provider.py (unchanged)
|
||||
│ ├── config.py ← add login_max_failures, login_window_seconds, login_cooldown_seconds, login_trusted_proxy_ips
|
||||
│ ├── main.py ← init LoginRateLimiter in lifespan, attach to app.state
|
||||
│ └── routers/
|
||||
│ └── auth.py ← check rate limit before auth, record outcome
|
||||
└── tests/
|
||||
├── unit/
|
||||
│ └── test_rate_limiter.py ← NEW: unit tests for LoginRateLimiter logic
|
||||
└── integration/
|
||||
└── test_login_rate_limit.py ← NEW: integration tests for 429 behaviour via HTTP
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Implementation Detail
|
||||
|
||||
### `api/app/auth/rate_limiter.py`
|
||||
|
||||
```python
|
||||
import ipaddress
|
||||
import logging
|
||||
import time
|
||||
from dataclasses import dataclass, field
|
||||
from ipaddress import IPv4Network, IPv6Network
|
||||
from threading import Lock
|
||||
|
||||
from starlette.requests import Request
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
def get_client_ip(
|
||||
request: Request,
|
||||
trusted_networks: list[IPv4Network | IPv6Network],
|
||||
) -> str:
|
||||
"""Return the resolved client IP, honouring X-Forwarded-For when the
|
||||
TCP peer is a trusted upstream proxy. Falls back to the TCP peer address
|
||||
when no trusted networks are configured or the peer is not in the list."""
|
||||
peer = request.client.host if request.client else "unknown"
|
||||
if trusted_networks and peer != "unknown":
|
||||
try:
|
||||
peer_addr = ipaddress.ip_address(peer)
|
||||
if any(peer_addr in net for net in trusted_networks):
|
||||
xff = request.headers.get("X-Forwarded-For", "").split(",")[0].strip()
|
||||
if xff:
|
||||
return xff
|
||||
real_ip = request.headers.get("X-Real-IP", "").strip()
|
||||
if real_ip:
|
||||
return real_ip
|
||||
except ValueError:
|
||||
pass
|
||||
return peer
|
||||
|
||||
|
||||
@dataclass
|
||||
class _Record:
|
||||
failures: int = 0
|
||||
window_start: float = field(default_factory=time.time)
|
||||
blocked_until: float = 0.0
|
||||
|
||||
|
||||
class LoginRateLimiter:
|
||||
def __init__(
|
||||
self,
|
||||
max_failures: int = 5,
|
||||
window_seconds: int = 300,
|
||||
cooldown_seconds: int = 900,
|
||||
) -> None:
|
||||
self._max = max_failures
|
||||
self._window = window_seconds
|
||||
self._cooldown = cooldown_seconds
|
||||
self._store: dict[str, _Record] = {}
|
||||
self._lock = Lock()
|
||||
|
||||
@property
|
||||
def cooldown_seconds(self) -> int:
|
||||
return self._cooldown
|
||||
|
||||
def is_blocked(self, ip: str) -> bool:
|
||||
now = time.time()
|
||||
with self._lock:
|
||||
rec = self._store.get(ip)
|
||||
if rec is None:
|
||||
return False
|
||||
if rec.blocked_until > now:
|
||||
return True
|
||||
if rec.blocked_until > 0:
|
||||
del self._store[ip]
|
||||
return False
|
||||
|
||||
def record_failure(self, ip: str) -> None:
|
||||
now = time.time()
|
||||
with self._lock:
|
||||
rec = self._store.get(ip)
|
||||
if rec is None:
|
||||
rec = _Record(window_start=now)
|
||||
self._store[ip] = rec
|
||||
if now - rec.window_start > self._window:
|
||||
rec.failures = 0
|
||||
rec.window_start = now
|
||||
rec.failures += 1
|
||||
if rec.failures >= self._max:
|
||||
rec.blocked_until = now + self._cooldown
|
||||
logger.warning(
|
||||
"Login blocked for %s after %d failures", ip, rec.failures
|
||||
)
|
||||
|
||||
def record_success(self, ip: str) -> None:
|
||||
with self._lock:
|
||||
self._store.pop(ip, None)
|
||||
```
|
||||
|
||||
### `api/app/config.py` additions
|
||||
|
||||
```python
|
||||
login_max_failures: int = 5
|
||||
login_window_seconds: int = 300
|
||||
login_cooldown_seconds: int = 900
|
||||
login_trusted_proxy_ips: str = "" # comma-separated IPs/CIDRs; empty = disabled
|
||||
```
|
||||
|
||||
### `api/app/main.py` lifespan update
|
||||
|
||||
```python
|
||||
import ipaddress
|
||||
|
||||
from app.auth.rate_limiter import LoginRateLimiter
|
||||
|
||||
@asynccontextmanager
|
||||
async def lifespan(application: FastAPI):
|
||||
settings = get_settings()
|
||||
application.state.login_rate_limiter = LoginRateLimiter(
|
||||
max_failures=settings.login_max_failures,
|
||||
window_seconds=settings.login_window_seconds,
|
||||
cooldown_seconds=settings.login_cooldown_seconds,
|
||||
)
|
||||
trusted_networks = []
|
||||
for part in settings.login_trusted_proxy_ips.split(","):
|
||||
part = part.strip()
|
||||
if part:
|
||||
try:
|
||||
trusted_networks.append(ipaddress.ip_network(part, strict=False))
|
||||
except ValueError:
|
||||
pass # invalid entry — skip silently
|
||||
application.state.login_trusted_networks = trusted_networks
|
||||
# ... existing DB setup unchanged
|
||||
engine = get_engine()
|
||||
async with engine.begin() as conn:
|
||||
await conn.run_sync(Base.metadata.create_all)
|
||||
yield
|
||||
await engine.dispose()
|
||||
```
|
||||
|
||||
### `api/app/routers/auth.py` update
|
||||
|
||||
```python
|
||||
from fastapi import APIRouter, Depends, HTTPException, Request
|
||||
from fastapi.responses import JSONResponse
|
||||
from pydantic import BaseModel
|
||||
|
||||
from app.auth.jwt_provider import JWTAuthProvider
|
||||
from app.auth.rate_limiter import LoginRateLimiter, get_client_ip
|
||||
from app.dependencies import get_jwt_auth
|
||||
|
||||
router = APIRouter(tags=["auth"])
|
||||
|
||||
|
||||
class LoginRequest(BaseModel):
|
||||
username: str
|
||||
password: str
|
||||
|
||||
|
||||
class TokenResponse(BaseModel):
|
||||
access_token: str
|
||||
token_type: str = "bearer"
|
||||
expires_in: int
|
||||
|
||||
|
||||
@router.post("/auth/token", response_model=TokenResponse)
|
||||
async def login(
|
||||
request: Request,
|
||||
body: LoginRequest,
|
||||
auth: JWTAuthProvider = Depends(get_jwt_auth),
|
||||
):
|
||||
limiter: LoginRateLimiter = request.app.state.login_rate_limiter
|
||||
ip: str = get_client_ip(request, request.app.state.login_trusted_networks)
|
||||
|
||||
if limiter.is_blocked(ip):
|
||||
return JSONResponse(
|
||||
status_code=429,
|
||||
content={
|
||||
"detail": "Too many failed login attempts. Please try again later.",
|
||||
"code": "login_rate_limited",
|
||||
},
|
||||
headers={"Retry-After": str(limiter.cooldown_seconds)},
|
||||
)
|
||||
|
||||
if not auth.verify_credentials(body.username, body.password):
|
||||
limiter.record_failure(ip)
|
||||
raise HTTPException(
|
||||
status_code=401,
|
||||
detail={"detail": "Invalid credentials", "code": "invalid_credentials"},
|
||||
)
|
||||
|
||||
limiter.record_success(ip)
|
||||
token = auth.create_token()
|
||||
return TokenResponse(
|
||||
access_token=token,
|
||||
token_type="bearer",
|
||||
expires_in=auth._expiry_seconds,
|
||||
)
|
||||
```
|
||||
|
||||
### `api/tests/unit/test_rate_limiter.py` (representative cases)
|
||||
|
||||
```python
|
||||
import time
|
||||
import pytest
|
||||
from app.auth.rate_limiter import LoginRateLimiter
|
||||
|
||||
|
||||
def test_not_blocked_initially():
|
||||
limiter = LoginRateLimiter(max_failures=3, window_seconds=60, cooldown_seconds=300)
|
||||
assert limiter.is_blocked("1.2.3.4") is False
|
||||
|
||||
|
||||
def test_blocked_after_threshold():
|
||||
limiter = LoginRateLimiter(max_failures=3, window_seconds=60, cooldown_seconds=300)
|
||||
for _ in range(3):
|
||||
limiter.record_failure("1.2.3.4")
|
||||
assert limiter.is_blocked("1.2.3.4") is True
|
||||
|
||||
|
||||
def test_success_clears_failures():
|
||||
limiter = LoginRateLimiter(max_failures=3, window_seconds=60, cooldown_seconds=300)
|
||||
limiter.record_failure("1.2.3.4")
|
||||
limiter.record_failure("1.2.3.4")
|
||||
limiter.record_success("1.2.3.4")
|
||||
assert limiter.is_blocked("1.2.3.4") is False
|
||||
|
||||
|
||||
def test_ips_are_isolated():
|
||||
limiter = LoginRateLimiter(max_failures=2, window_seconds=60, cooldown_seconds=300)
|
||||
limiter.record_failure("1.1.1.1")
|
||||
limiter.record_failure("1.1.1.1")
|
||||
assert limiter.is_blocked("2.2.2.2") is False
|
||||
```
|
||||
|
||||
### `api/tests/integration/test_login_rate_limit.py` (representative cases)
|
||||
|
||||
```python
|
||||
import pytest
|
||||
from httpx import AsyncClient
|
||||
|
||||
# Uses the 'client' fixture (NoOpAuthProvider) from conftest — sufficient for this
|
||||
# endpoint since we're testing the rate-limit layer, not auth correctness.
|
||||
# The login endpoint instantiates its own limiter via app.state, so we need
|
||||
# the full ASGI app.
|
||||
|
||||
BAD_CREDS = {"username": "attacker", "password": "wrong"}
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_repeated_failures_trigger_429(client: AsyncClient):
|
||||
# Use a custom limiter with low threshold to avoid slow tests
|
||||
# (the app.state.login_rate_limiter is set in lifespan; override for test)
|
||||
from app.auth.rate_limiter import LoginRateLimiter
|
||||
from app.main import app
|
||||
original = app.state.login_rate_limiter
|
||||
app.state.login_rate_limiter = LoginRateLimiter(
|
||||
max_failures=3, window_seconds=60, cooldown_seconds=30
|
||||
)
|
||||
try:
|
||||
for _ in range(3):
|
||||
await client.post("/api/v1/auth/token", json=BAD_CREDS)
|
||||
resp = await client.post("/api/v1/auth/token", json=BAD_CREDS)
|
||||
assert resp.status_code == 429
|
||||
assert resp.json()["code"] == "login_rate_limited"
|
||||
assert "Retry-After" in resp.headers
|
||||
finally:
|
||||
app.state.login_rate_limiter = original
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Implementation Phases
|
||||
|
||||
### Phase 1 (MVP — P1): Blocking after repeated failures
|
||||
|
||||
1. Add `login_max_failures`, `login_window_seconds`, `login_cooldown_seconds`, `login_trusted_proxy_ips` to `api/app/config.py`
|
||||
2. Create `api/app/auth/rate_limiter.py` with `LoginRateLimiter` and `get_client_ip()`
|
||||
3. Initialize rate limiter and parse trusted networks in `api/app/main.py` lifespan; attach both to `app.state`
|
||||
4. Update `api/app/routers/auth.py` to resolve client IP via `get_client_ip()`, then check + record outcomes
|
||||
5. Unit tests: `api/tests/unit/test_rate_limiter.py`
|
||||
6. Integration tests: `api/tests/integration/test_login_rate_limit.py`
|
||||
|
||||
### Phase 2 (US2 — observability): Logging and response hints
|
||||
|
||||
Delivered as part of Phase 1 (the `logger.warning(...)` call and `Retry-After` header
|
||||
are embedded in the same implementation). No separate phase needed.
|
||||
|
||||
---
|
||||
|
||||
## Environment Variables to Add to `.env.example`
|
||||
|
||||
```dotenv
|
||||
# Login brute-force protection
|
||||
LOGIN_MAX_FAILURES=5
|
||||
LOGIN_WINDOW_SECONDS=300
|
||||
LOGIN_COOLDOWN_SECONDS=900
|
||||
# Comma-separated IPs/CIDRs of trusted upstream proxies (e.g. nginx ingress pod CIDR).
|
||||
# Leave empty when not behind a reverse proxy.
|
||||
LOGIN_TRUSTED_PROXY_IPS=
|
||||
```
|
||||
|
||||
These are optional (have defaults) so existing `.env` files without them continue working.
|
||||
Reference in New Issue
Block a user