Two-stage build (uv builder + python:3.12-slim runtime) with non-root user (UID 1001), no dev deps, layer-cache-optimised dep install, and graceful SIGTERM shutdown. Verified by api/tests/build/verify_production_image.sh covering build, health endpoint, non-root, stdout logging, secret-free layers, missing-env-var exit, and dep-layer cache hit. All 102 integration tests still pass; shellcheck clean. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implementation Plan: Production-Grade API Container Image
Branch: 010-api-prod-dockerfile | Date: 2026-05-07 | Spec: spec.md
Input: Feature specification from specs/010-api-prod-dockerfile/spec.md
Summary
Produce a production-ready api/Dockerfile.prod using a two-stage build: a uv builder stage that installs lockfile-pinned, production-only dependencies into a virtual environment, and a lean python:3.12-slim runtime stage that contains only the venv, application source, and curl for health checks. The runtime process runs as a non-root user (UID 1001), handles SIGTERM gracefully via uvicorn's built-in drain, and logs exclusively to stdout/stderr. Behavioral verification is automated via a shell script (api/tests/build/verify_production_image.sh) written before the Dockerfile (§5.1 TDD).
Technical Context
Language/Version: Python 3.12 (existing API), Docker multi-stage build
Build tool: uv (lockfile: api/uv.lock, already committed)
Base images: ghcr.io/astral-sh/uv:python3.12-bookworm-slim (builder), python:3.12-slim (runtime)
Testing: Shell verification script (verify_production_image.sh) + make verify-prod target
Target Platform: linux/amd64 container (Kubernetes or Docker host)
Performance Goals: Container starts and passes health check within 30s; rebuild from warm cache in under 60s
Constraints: No root process, no hardcoded secrets, no dev deps in final image, compatible with --read-only filesystem
Scale/Scope: Single-file addition (Dockerfile.prod) + shell test + two Makefile targets; zero changes to existing source code
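The warm-cache rebuild goal above (under 60s) can be checked mechanically rather than by eye. The following is a minimal sketch, assuming a hypothetical helper named within_budget that is not part of the plan; it relies on bash's built-in SECONDS counter, so its resolution is whole seconds.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the plan): run a command and fail if it
# takes longer than a time budget given in whole seconds.
# Usage: within_budget <max_seconds> <command> [args...]
within_budget() {
  local budget=$1
  shift
  local start=$SECONDS
  # Send the command's own output to stderr so stdout carries only the summary.
  "$@" >&2 || return 1
  local elapsed=$(( SECONDS - start ))
  echo "elapsed=${elapsed}s budget=${budget}s"
  (( elapsed <= budget ))
}
```

For example, `within_budget 60 docker build -f api/Dockerfile.prod api/ -t reactbin-api-prod:latest` would return non-zero if a rebuild exceeded the stated goal or if the build itself failed.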
Constitution Check
GATE: Must pass before Phase 0 research. Re-checked post-design below.
| Principle | Status | Notes |
|---|---|---|
| §5.1 TDD non-negotiable | COMPLIANT | verify_production_image.sh written before Dockerfile.prod; script fails (red) because the build file is absent, then passes (green) after |
| §5.2 Test pyramid | COMPLIANT | Shell verification script is the integration-level test for this build artefact; no unit tests applicable (no Python business logic added) |
| §5.4 CI must pass | COMPLIANT | make verify-prod target is runnable in host CI (requires Docker on the runner, which the existing make test-integration already requires) |
| §6 Tech Stack — Docker | COMPLIANT | Docker + Docker Compose are mandated; this adds a production Docker file within that constraint |
| §7.1 One-command local start | COMPLIANT | api/Dockerfile (dev stack) is unchanged; docker compose up is unaffected |
| §7.2 Environment configuration | COMPLIANT | Dockerfile.prod contains zero hardcoded env values; all config is injected at runtime |
| §7.3 Ruff/lint | COMPLIANT | No new Python files; shell script linted with shellcheck |
| §2.6 No speculative abstraction | COMPLIANT | Single Dockerfile, no plugin system or generics |
| §8 Scope boundaries | COMPLIANT | Purely infrastructure; no new API routes, data model, or UI changes |
Post-design re-check: All gates remain green. No violations.
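The §7.2 row (no hardcoded env values) lends itself to a static check on the build file. The sketch below is illustrative only: the function name find_hardcoded_secrets and the keyword list (SECRET, PASSWORD, TOKEN) are assumptions, and it inspects only single-line ENV/ARG instructions, so values on continuation lines would be missed.

```shell
#!/usr/bin/env bash
# Hypothetical static check (not in the plan): print ENV/ARG lines that
# assign a value to a secret-looking variable name.
# Returns 0 when at least one such line is found, 1 when none are found.
find_hardcoded_secrets() {
  local dockerfile=$1
  grep -Ein \
    '^[[:space:]]*(ENV|ARG)[[:space:]]+[A-Za-z_]*(SECRET|PASSWORD|TOKEN)[A-Za-z_]*=.+' \
    "$dockerfile"
}
```

A caller could treat a match as a failure, for instance `if find_hardcoded_secrets api/Dockerfile.prod; then echo "FAIL: hardcoded secret"; exit 1; fi`. This complements, rather than replaces, inspecting the built image's layers.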
Project Structure
Documentation (this feature)
specs/010-api-prod-dockerfile/
├── plan.md              # This file
├── research.md          # Phase 0 decisions
├── contracts/
│   └── container.md     # Container interface contract (port, env vars, signals, user)
├── quickstart.md        # Build and verification scenarios
└── tasks.md             # Generated by /speckit-tasks
Source Code Changes
api/
├── Dockerfile           # Existing dev/test image (UNCHANGED)
├── Dockerfile.prod      # NEW: production multi-stage image
├── .dockerignore        # Existing; verify test files are excluded from build context
└── tests/
    └── build/
        └── verify_production_image.sh   # NEW: TDD verification script (written first)
Makefile                 # Root Makefile; add build-prod and verify-prod targets
Dockerfile.prod — Annotated Reference
# syntax=docker/dockerfile:1
# ════════════════════════════════════════════════
# Build stage: install production deps via uv
# ════════════════════════════════════════════════
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim AS builder
WORKDIR /app
# Pre-compile bytecode; use copy mode for cross-layer compatibility
ENV UV_COMPILE_BYTECODE=1 \
    UV_LINK_MODE=copy \
    UV_PYTHON_DOWNLOADS=never
# ── Layer cache split: deps only (changes rarely) ──
COPY pyproject.toml uv.lock ./
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --frozen --no-dev --no-install-project
# ── Layer cache split: source (changes often) ──
COPY app/ ./app/
# ════════════════════════════════════════════════
# Runtime stage: lean image with venv + source
# ════════════════════════════════════════════════
FROM python:3.12-slim
WORKDIR /app
# curl for HEALTHCHECK — only tool added beyond base Python
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
# Non-root system user (UID/GID 1001)
RUN groupadd --system --gid 1001 appgroup \
    && useradd --system --uid 1001 --gid 1001 --no-create-home appuser
# Copy venv from builder; copy source directly from build context
COPY --from=builder --chown=appuser:appgroup /app/.venv /app/.venv
COPY --chown=appuser:appgroup app/ ./app/
USER appuser
# Activate the venv by prepending its bin to PATH
ENV PATH="/app/.venv/bin:$PATH"
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:8000/api/v1/health || exit 1
# uvicorn handles SIGTERM; --timeout-graceful-shutdown gives 30s to drain requests
CMD ["uvicorn", "app.main:app", \
     "--host", "0.0.0.0", \
     "--port", "8000", \
     "--timeout-graceful-shutdown", "30"]
Note on COPY paths: The build context is api/ (as set by the Makefile target). COPY app/ ./app/ in both stages refers to api/app/. The runtime stage copies source directly from the build context, not from the builder stage; this is simpler and avoids an extra intermediate layer.
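Because the Dockerfile sets USER appuser, the configured user can also be read from the image metadata without starting a container. A minimal sketch, assuming a hypothetical helper named assert_image_user; it uses docker inspect with a Go-template format string to read Config.User.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the plan): check the user recorded in an
# image's metadata. An empty Config.User means the image defaults to root.
# Usage: assert_image_user <image> <expected_user>
assert_image_user() {
  local image=$1 expected=$2 actual
  actual=$(docker inspect --format '{{.Config.User}}' "$image") || return 1
  if [[ "$actual" != "$expected" ]]; then
    echo "FAIL: image user is '${actual:-root}', expected '$expected'" >&2
    return 1
  fi
  echo "OK: image runs as $actual"
}
```

For the image described here, the call would be `assert_image_user reactbin-api-prod:latest appuser`. This is a metadata check only; the runtime `id -u` assertion in the verification script remains the behavioural test.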
verify_production_image.sh — Structure
#!/usr/bin/env bash
# TDD verification script for api/Dockerfile.prod
# Fails (red) if Dockerfile.prod does not exist or any check fails.
set -euo pipefail
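IMAGE="reactbin-api-prod:verify-$$"
# Initialise CONTAINER so cleanup is safe under `set -u` when the build or
# `docker run` fails before a container ID is assigned.
CONTAINER=""
cleanup() {
  if [[ -n "$CONTAINER" ]]; then docker rm -f "$CONTAINER" >/dev/null 2>&1 || true; fi
  docker rmi "$IMAGE" >/dev/null 2>&1 || true
}
trap cleanup EXIT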
# Step 1: Build — fails red if Dockerfile.prod is absent
docker build -f api/Dockerfile.prod api/ -t "$IMAGE"
# Step 2: Start container with minimal env vars
CONTAINER=$(docker run -d -p 18000:8000 \
  -e JWT_SECRET_KEY=verify-test-key \
  -e OWNER_USERNAME=testowner \
  -e OWNER_PASSWORD=testpassword \
  -e DATABASE_URL=postgresql+asyncpg://noop:noop@noop/noop \
  -e S3_ENDPOINT_URL=http://noop:9000 \
  -e S3_BUCKET_NAME=noop \
  -e S3_ACCESS_KEY_ID=noop \
  -e S3_SECRET_ACCESS_KEY=noop \
  -e S3_REGION=us-east-1 \
  "$IMAGE")
# Step 3: Poll health endpoint (app will fail to connect to DB, but /health is pre-DB)
for i in $(seq 1 30); do
  if curl -sf http://localhost:18000/api/v1/health > /dev/null; then break; fi
  sleep 1
  [[ $i -eq 30 ]] && { echo "FAIL: health check timed out"; exit 1; }
done
# Step 4: Assert non-root user
UID_IN_CONTAINER=$(docker exec "$CONTAINER" id -u)
[[ "$UID_IN_CONTAINER" -ne 0 ]] || { echo "FAIL: process running as root"; exit 1; }
# Step 5: Graceful shutdown
docker stop "$CONTAINER" # sends SIGTERM
EXIT_CODE=$(docker wait "$CONTAINER")
[[ "$EXIT_CODE" -eq 0 ]] || { echo "FAIL: non-zero exit code $EXIT_CODE"; exit 1; }
# Step 6: Dev deps absent
if docker run --rm "$IMAGE" /app/.venv/bin/python -c "import pytest" 2>/dev/null; then
  echo "FAIL: pytest importable in production image (dev deps present)"; exit 1
fi
echo "All production image checks passed."
Note on health check feasibility: /api/v1/health is a simple JSON response that does not require a database connection (confirmed in api/app/main.py). The verification script can therefore pass even without a real PostgreSQL instance.
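The polling loop in Step 3 can be factored into a reusable helper. This is a sketch under stated assumptions: the function name retry and its argument order are illustrative and do not appear in the plan.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the plan): run a command until it succeeds,
# up to a fixed number of attempts, sleeping between attempts.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
  local attempts=$1 delay=$2
  shift 2
  local i
  for (( i = 1; i <= attempts; i++ )); do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  return 1
}
```

With this helper, the health poll could read `retry 30 1 curl -sf -o /dev/null http://localhost:18000/api/v1/health || { echo "FAIL: health check timed out"; exit 1; }`, which keeps the 30-attempt, 1-second cadence of the original loop.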
Makefile Targets
Add to root Makefile:
.PHONY: build-prod verify-prod
build-prod:
docker build -f api/Dockerfile.prod api/ -t reactbin-api-prod:latest
verify-prod:
bash api/tests/build/verify_production_image.sh
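One way to confirm that a second build-prod run reuses the dependency layer is to scan the build log for a CACHED marker on the uv sync step. The sketch below assumes BuildKit's plain progress output, in which each step line begins with an identifier such as #8 and a cache hit is reported on a line of the form "#8 CACHED"; both that log shape and the helper name step_was_cached are assumptions to verify against the Docker version in use.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the plan): read a plain-progress build log on
# stdin and report whether the first step whose line contains <pattern>
# was served from cache.
# Exit status: 0 = cached, 1 = not cached, 2 = step not found in the log.
step_was_cached() {
  local pattern=$1 log id
  log=$(cat)
  # Extract the step identifier (e.g. "#8") from the first matching line.
  id=$(grep -F -- "$pattern" <<<"$log" | grep -Eo '^#[0-9]+' | head -n1) || true
  [[ -n "$id" ]] || return 2
  grep -Eq "^${id} CACHED\$" <<<"$log"
}
```

A possible invocation, after touching a file under api/app/, is `docker build --progress=plain -f api/Dockerfile.prod api/ -t reactbin-api-prod:latest 2>&1 | step_was_cached "uv sync"`.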
.dockerignore Review
The existing api/.dockerignore already excludes .venv/, __pycache__/, .env, etc. For the production build context, the following entries should be present:
tests/
*.egg-info/
alembic/
alembic.ini
tests/ and alembic/ are not needed in the production image (we COPY app/ ./app/ explicitly). Excluding them from the build context reduces the data sent to the Docker daemon.
*.egg-info/ is already present in the existing .dockerignore.
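The .dockerignore review can be automated with an exact-line check. A minimal sketch, assuming a hypothetical helper named missing_ignore_entries; it matches whole lines literally, so equivalent patterns written differently (for example tests versus tests/) would be reported as missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the plan): print each required entry that is
# absent from an ignore file. Returns 0 if all are present, 1 otherwise.
# Usage: missing_ignore_entries <file> <entry> [entry...]
missing_ignore_entries() {
  local file=$1
  shift
  local entry status=0
  for entry in "$@"; do
    # -F: literal string, -x: whole-line match, -q: quiet
    if ! grep -Fxq -- "$entry" "$file"; then
      echo "$entry"
      status=1
    fi
  done
  return "$status"
}
```

For this plan, the call would be `missing_ignore_entries api/.dockerignore 'tests/' '*.egg-info/' 'alembic/' 'alembic.ini'`.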
Implementation Order
Tasks are generated by /speckit-tasks, but the logical dependency order is:
- Write verify_production_image.sh (TDD red: the build fails because Dockerfile.prod is absent)
- Add Makefile targets (build-prod, verify-prod), which reference the script
- Write api/Dockerfile.prod (implement to make the TDD check pass)
- Update api/.dockerignore (exclude tests/, alembic/ from build context)
- Run make verify-prod (TDD green: all 6 checks pass)
- Run shellcheck on verify_production_image.sh
No existing tests are modified. make test-integration continues to use api/Dockerfile unchanged.