Feat: Add production-grade multi-stage container image for API
Two-stage build (uv builder + python:3.12-slim runtime) with non-root user (UID 1001), no dev deps, layer-cache-optimised dep install, and graceful SIGTERM shutdown. Verified by api/tests/build/verify_production_image.sh covering build, health endpoint, non-root, stdout logging, secret-free layers, missing-env-var exit, and dep-layer cache hit. All 102 integration tests still pass; shellcheck clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
242
specs/010-api-prod-dockerfile/plan.md
Normal file
@@ -0,0 +1,242 @@
# Implementation Plan: Production-Grade API Container Image

**Branch**: `010-api-prod-dockerfile` | **Date**: 2026-05-07 | **Spec**: [spec.md](spec.md)
**Input**: Feature specification from `specs/010-api-prod-dockerfile/spec.md`

## Summary

Produce a production-ready `api/Dockerfile.prod` using a two-stage build: a uv builder stage that installs lockfile-pinned, production-only dependencies into a virtual environment, and a lean `python:3.12-slim` runtime stage that contains only the venv, application source, and `curl` for health checks. The runtime process runs as a non-root user (UID 1001), handles SIGTERM gracefully via uvicorn's built-in drain, and logs exclusively to stdout/stderr. Behavioral verification is automated via a shell script (`api/tests/build/verify_production_image.sh`) written before the Dockerfile (§5.1 TDD).

---

## Technical Context

**Language/Version**: Python 3.12 (existing API), Docker multi-stage build
**Build tool**: uv (lockfile: `api/uv.lock`, already committed)
**Base images**: `ghcr.io/astral-sh/uv:python3.12-bookworm-slim` (builder), `python:3.12-slim` (runtime)
**Testing**: Shell verification script (`verify_production_image.sh`) + `make verify-prod` target
**Target Platform**: linux/amd64 container (Kubernetes or Docker host)
**Performance Goals**: Container starts and passes health check within 30s; rebuild from warm cache in under 60s
**Constraints**: No root process, no hardcoded secrets, no dev deps in final image, compatible with `--read-only` filesystem
**Scale/Scope**: Single-file addition (`Dockerfile.prod`) + shell test + two Makefile targets; zero changes to existing source code
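
The runtime constraints listed above can be exercised when the container is started, not only when it is built. The following is a minimal sketch, not part of the plan itself, of a helper that assembles a hardened `docker run` command line. `--read-only`, `--tmpfs`, `--user`, `--cap-drop`, and `--security-opt` are standard Docker flags, but the function name `hardened_run_cmd`, the `/tmp` tmpfs mount, and the capability and privilege settings are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a `docker run` command that applies the
# runtime constraints from this plan. Nothing is executed here.
hardened_run_cmd() {
  local image="$1"
  local args=(
    docker run --rm
    --read-only                        # root filesystem mounted read-only
    --tmpfs /tmp                       # assumption: scratch space, if the app needs any
    --user 1001:1001                   # matches the UID/GID planned for the image
    --cap-drop ALL                     # assumption: the API needs no extra capabilities
    --security-opt no-new-privileges   # assumption: block privilege escalation
    -p 8000:8000
    "$image"
  )
  printf '%s ' "${args[@]}"
  printf '\n'
}

hardened_run_cmd reactbin-api-prod:latest
```

The printed command omits the `-e` variables the API needs at start-up; those would be supplied the same way the verification script supplies them.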

---

## Constitution Check

*GATE: Must pass before Phase 0 research. Re-checked post-design below.*

| Principle | Status | Notes |
|-----------|--------|-------|
| §5.1 TDD non-negotiable | **COMPLIANT** | `verify_production_image.sh` written before `Dockerfile.prod`; script fails (red) because the build file is absent, then passes (green) after |
| §5.2 Test pyramid | **COMPLIANT** | Shell verification script is the integration-level test for this build artefact; no unit tests applicable (no Python business logic added) |
| §5.4 CI must pass | **COMPLIANT** | `make verify-prod` target is runnable in host CI (requires Docker on the runner, which the existing `make test-integration` already requires) |
| §6 Tech Stack (Docker) | **COMPLIANT** | Docker + Docker Compose are mandated; this adds a production Dockerfile within that constraint |
| §7.1 One-command local start | **COMPLIANT** | `api/Dockerfile` (dev stack) is unchanged; `docker compose up` is unaffected |
| §7.2 Environment configuration | **COMPLIANT** | `Dockerfile.prod` contains zero hardcoded env values; all config is injected at runtime |
| §7.3 Ruff/lint | **COMPLIANT** | No new Python files; shell script linted with `shellcheck` |
| §2.6 No speculative abstraction | **COMPLIANT** | Single Dockerfile, no plugin system or generics |
| §8 Scope boundaries | **COMPLIANT** | Purely infrastructure; no new API routes, data model, or UI changes |

**Post-design re-check**: All gates remain green. No violations.
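
The §7.2 gate (no hardcoded env values) lends itself to a mechanical check. The sketch below is one possible lint, offered as an illustration and not as the project's actual check: the function name `find_hardcoded_secrets` and the name pattern (`SECRET`, `PASSWORD`, `TOKEN`, `ACCESS_KEY`) are assumptions. It only inspects single-line `ENV NAME=value` and `ARG NAME=value` instructions, so continuation lines and the space-separated `ENV NAME value` form are not covered.

```shell
#!/usr/bin/env bash
# Hypothetical lint: print ENV/ARG instructions that assign a literal value
# to a secret-looking name. Returns 0 if any such line is found, 1 otherwise.
find_hardcoded_secrets() {
  local dockerfile="$1"
  grep -nEi \
    '^[[:space:]]*(ENV|ARG)[[:space:]]+[A-Z0-9_]*(SECRET|PASSWORD|TOKEN|ACCESS_KEY)[A-Z0-9_]*=.+' \
    "$dockerfile"
}

# Possible use inside a verification script:
#   if find_hardcoded_secrets api/Dockerfile.prod; then
#     echo "FAIL: secret-looking value baked into the image"; exit 1
#   fi
```

Because `grep` exits with status 1 when nothing matches, the function should be called from an `if` condition (as in the comment) when the caller runs under `set -e`.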

---

## Project Structure

### Documentation (this feature)

```text
specs/010-api-prod-dockerfile/
├── plan.md              # This file
├── research.md          # Phase 0 decisions
├── contracts/
│   └── container.md     # Container interface contract (port, env vars, signals, user)
├── quickstart.md        # Build and verification scenarios
└── tasks.md             # Generated by /speckit-tasks
```

### Source Code Changes

```text
api/
├── Dockerfile           # Existing dev/test image (UNCHANGED)
├── Dockerfile.prod      # NEW: production multi-stage image
├── .dockerignore        # Existing; verify test files are excluded from build context
└── tests/
    └── build/
        └── verify_production_image.sh   # NEW: TDD verification script (written first)

Makefile                 # Root Makefile; add build-prod and verify-prod targets
```

---

## Dockerfile.prod — Annotated Reference

```dockerfile
# syntax=docker/dockerfile:1

# ════════════════════════════════════════════════
# Build stage: install production deps via uv
# ════════════════════════════════════════════════
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim AS builder

WORKDIR /app

# Pre-compile bytecode; copy (rather than hard-link) packages out of the
# cache mount, which lives on a separate filesystem; never download a Python
ENV UV_COMPILE_BYTECODE=1 \
    UV_LINK_MODE=copy \
    UV_PYTHON_DOWNLOADS=never

# ── Layer cache split: deps only (changes rarely) ──
COPY pyproject.toml uv.lock ./
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --frozen --no-dev --no-install-project

# ── Layer cache split: source (changes often) ──
COPY app/ ./app/

# ════════════════════════════════════════════════
# Runtime stage: lean image with venv + source
# ════════════════════════════════════════════════
FROM python:3.12-slim

WORKDIR /app

# curl for HEALTHCHECK: the only tool added beyond base Python
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Non-root system user (UID/GID 1001)
RUN groupadd --system --gid 1001 appgroup \
    && useradd --system --uid 1001 --gid 1001 --no-create-home appuser

# Copy venv from builder; copy source directly from build context
COPY --from=builder --chown=appuser:appgroup /app/.venv /app/.venv
COPY --chown=appuser:appgroup app/ ./app/

USER appuser

# Activate the venv by prepending its bin to PATH
ENV PATH="/app/.venv/bin:$PATH"

EXPOSE 8000

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:8000/api/v1/health || exit 1

# uvicorn handles SIGTERM; --timeout-graceful-shutdown gives 30s to drain requests
CMD ["uvicorn", "app.main:app", \
     "--host", "0.0.0.0", \
     "--port", "8000", \
     "--timeout-graceful-shutdown", "30"]
```

> **Note on COPY paths**: Build context is `api/` (as set by the Makefile target). `COPY app/ ./app/` in both stages refers to `api/app/`. The runtime stage copies source directly from the build context, not from the builder stage; this is simpler and avoids an extra intermediate layer.
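
The layer cache split in the builder stage is only useful if a source-only change leaves the dependency layer cached. One way to check that is to inspect the build log. The sketch below assumes BuildKit's `--progress=plain` output, in which each step line starts with a vertex id such as `#9` and a cache hit is reported as a separate `#9 CACHED` line; that log shape and the function name `dep_layer_cached` are assumptions, not something the plan specifies.

```shell
#!/usr/bin/env bash
# Hypothetical check: read a plain-progress build log on stdin and succeed
# only if the step that runs `uv sync` was served from the layer cache.
dep_layer_cached() {
  local log step
  log="$(cat)"
  # The first field of the step line is the vertex id, e.g. "#9".
  step="$(awk '/uv sync/ && !seen { print $1; seen = 1 }' <<<"$log")"
  [[ -n "$step" ]] || return 1
  grep -qxF -- "$step CACHED" <<<"$log"
}

# Possible use after touching a file under api/app/ and rebuilding:
#   docker build --progress=plain -f api/Dockerfile.prod api/ 2>&1 | dep_layer_cached \
#     || { echo "FAIL: dependency layer was rebuilt"; exit 1; }
```

Build progress is written to stderr, which is why the usage comment redirects it with `2>&1` before piping.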

---

## verify_production_image.sh — Structure

```bash
#!/usr/bin/env bash
# TDD verification script for api/Dockerfile.prod
# Fails (red) if Dockerfile.prod does not exist or any check fails.
set -euo pipefail

IMAGE="reactbin-api-prod:verify-$$"
CONTAINER=""  # set in Step 2; initialised so cleanup is safe under `set -u`

cleanup() {
  if [[ -n "$CONTAINER" ]]; then docker rm -f "$CONTAINER" >/dev/null 2>&1 || true; fi
  docker rmi "$IMAGE" >/dev/null 2>&1 || true
}
trap cleanup EXIT

# Step 1: Build (fails red if Dockerfile.prod is absent)
docker build -f api/Dockerfile.prod api/ -t "$IMAGE"

# Step 2: Start container with minimal env vars
CONTAINER=$(docker run -d -p 18000:8000 \
  -e JWT_SECRET_KEY=verify-test-key \
  -e OWNER_USERNAME=testowner \
  -e OWNER_PASSWORD=testpassword \
  -e DATABASE_URL=postgresql+asyncpg://noop:noop@noop/noop \
  -e S3_ENDPOINT_URL=http://noop:9000 \
  -e S3_BUCKET_NAME=noop \
  -e S3_ACCESS_KEY_ID=noop \
  -e S3_SECRET_ACCESS_KEY=noop \
  -e S3_REGION=us-east-1 \
  "$IMAGE")

# Step 3: Poll health endpoint (app will fail to connect to DB, but /health is pre-DB)
for i in $(seq 1 30); do
  if curl -sf http://localhost:18000/api/v1/health > /dev/null; then break; fi
  if [[ $i -eq 30 ]]; then echo "FAIL: health check timed out"; exit 1; fi
  sleep 1
done

# Step 4: Assert non-root user
UID_IN_CONTAINER=$(docker exec "$CONTAINER" id -u)
[[ "$UID_IN_CONTAINER" -ne 0 ]] || { echo "FAIL: process running as root"; exit 1; }

# Step 5: Graceful shutdown
# docker stop sends SIGTERM, then SIGKILL after its default 10s grace period
docker stop "$CONTAINER" > /dev/null
EXIT_CODE=$(docker wait "$CONTAINER")
[[ "$EXIT_CODE" -eq 0 ]] || { echo "FAIL: non-zero exit code $EXIT_CODE"; exit 1; }

# Step 6: Dev deps absent
if docker run --rm "$IMAGE" /app/.venv/bin/python -c "import pytest" 2>/dev/null; then
  echo "FAIL: pytest importable in production image (dev deps present)"; exit 1
fi

echo "All production image checks passed."
```

> **Note on health check feasibility**: `/api/v1/health` is a simple JSON response that does not require a database connection (confirmed in `api/app/main.py`). The verification script can therefore pass even without a real PostgreSQL instance.
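
The polling loop in Step 3 is a generic retry pattern, and factoring it out makes the timeout behaviour testable without Docker. The following is a minimal sketch under that assumption; `wait_until` is a hypothetical helper name, not something the plan defines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to TRIES times, DELAY seconds apart.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
wait_until() {
  local tries="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= tries; i++)); do
    if "$@" >/dev/null 2>&1; then return 0; fi
    # Skip the pause after the final attempt.
    if ((i < tries)); then sleep "$delay"; fi
  done
  return 1
}

# Possible replacement for the Step 3 loop:
#   wait_until 30 1 curl -sf http://localhost:18000/api/v1/health \
#     || { echo "FAIL: health check timed out"; exit 1; }
```

Passing the probe as arguments keeps the helper independent of `curl`, so the same function could wait on any readiness command.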

---

## Makefile Targets

Add to root `Makefile`:

```makefile
.PHONY: build-prod verify-prod

build-prod:
	docker build -f api/Dockerfile.prod api/ -t reactbin-api-prod:latest

verify-prod:
	bash api/tests/build/verify_production_image.sh
```

---

## `.dockerignore` Review

The existing `api/.dockerignore` already excludes `.venv/`, `__pycache__/`, `.env`, etc. Two additions improve the production build context: the test suite (`tests/`) and the Alembic migration files (`alembic/`, `alembic.ini`). The relevant entries, shown alongside the existing `*.egg-info/` rule, are:

```
tests/
*.egg-info/
alembic/
alembic.ini
```

`tests/`, `alembic/`, and `alembic.ini` are not needed in the production image (we `COPY app/ ./app/` explicitly). Excluding them from the build context reduces the data sent to the Docker daemon.

> `*.egg-info/` is already present in the existing `.dockerignore`.
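
Whether the ignore file actually carries these entries can be asserted in the same style as the other checks. The sketch below is illustrative only: `check_ignore_entries` is a hypothetical name, and the match is an exact whole-line comparison, so an entry written with trailing whitespace or a different spelling (for example `tests` without the slash) would be reported as missing.

```shell
#!/usr/bin/env bash
# Hypothetical check: report every expected entry that is not present as an
# exact line in the given ignore file. Returns 1 if anything is missing.
check_ignore_entries() {
  local file="$1"
  shift
  local entry missing=0
  for entry in "$@"; do
    if ! grep -qxF -- "$entry" "$file"; then
      echo "missing: $entry"
      missing=1
    fi
  done
  return "$missing"
}

# Possible use:
#   check_ignore_entries api/.dockerignore 'tests/' 'alembic/' 'alembic.ini' \
#     || { echo "FAIL: build context is larger than intended"; exit 1; }
```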

---

## Implementation Order

Tasks are generated by `/speckit-tasks`, but the logical dependency order is:

1. **Write `verify_production_image.sh`** (TDD red: the build step fails because `Dockerfile.prod` is absent)
2. **Add `Makefile` targets** (`build-prod`, `verify-prod`); `verify-prod` invokes the script
3. **Write `api/Dockerfile.prod`** (implement until the verification script passes)
4. **Update `api/.dockerignore`** (exclude `tests/`, `alembic/` from build context)
5. **Run `make verify-prod`** (TDD green: all 6 checks pass)
6. **Run `shellcheck`** on `verify_production_image.sh`

No existing tests are modified. `make test-integration` continues to use `api/Dockerfile` unchanged.