Two-stage build (node:22-slim builder + nginxinc/nginx-unprivileged:alpine runtime) with SPA fallback routing, long-lived cache headers for fingerprinted assets, non-root user (UID 101), and no Node.js toolchain in runtime image (82 MB vs 329 MB+ single-stage). Verified by ui/tests/build/verify_production_image.sh covering build, health, SPA routing, non-root, stdout logging, cache-control headers, SIGTERM exit 0, Node.js absent, secret-free layers, and dep-layer cache hit. 102 integration tests still pass; shellcheck clean. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Feature Specification: Production-Grade UI Container Image
Feature Branch: 011-ui-prod-dockerfile
Created: 2026-05-07
Status: Draft
Input: User description: "Production-grade UI container image build"
User Scenarios & Testing (mandatory)
User Story 1 - UI Serves Reliably in Production (Priority: P1)
A production deployment starts the UI container and it serves the compiled application correctly — returning the app shell for all routes, responding quickly, and shutting down cleanly when the orchestrator stops it.
Why this priority: A container that can't serve traffic is not deployable. All other properties (security, build speed) are meaningless without a running service.
Independent Test: Build the image, start the container, and verify the root path returns a 200 response. Stopping the container produces a clean exit. This alone constitutes a deployable MVP.
Acceptance Scenarios:
- Given a built production image, When the container starts, Then it serves the application on port 8080 within 30 seconds.
- Given the container is running, When a request is made to any client-side route (e.g., /library, /tags), Then the server returns the app shell (200 OK) so client-side routing can take over.
- Given the container is running, When a static asset is requested, Then it is returned with appropriate caching headers.
- Given a running container, When the orchestrator sends a stop signal, Then the container exits with code 0 within a reasonable timeout.
- Given the production image, When a health probe is issued to a designated endpoint, Then the container reports healthy.
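The routing scenarios above can be exercised with a small probe. This is a minimal sketch, not the contents of the project's verification script: the function names http_status and check_routes are hypothetical, and it assumes curl is available on the host and the container is already published on localhost:8080.

```shell
# Returns the HTTP status code for a URL.
# Kept as its own function so it can be stubbed when no container is running.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# check_routes BASE_URL ROUTE...
# Prints one line per route and returns non-zero if any route is not 200.
check_routes() {
  base_url=$1
  shift
  failed=0
  for route in "$@"; do
    status=$(http_status "${base_url}${route}")
    if [ "$status" = "200" ]; then
      echo "ok   ${route}"
    else
      echo "FAIL ${route} (status ${status})"
      failed=1
    fi
  done
  return "$failed"
}
```

A possible invocation is `check_routes http://localhost:8080 / /library /tags`. A 200 for /library and /tags only shows that the fallback responded; confirming the body is the app shell would need an additional content check.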
User Story 2 - Minimal, Secure Container (Priority: P2)
The production image contains only what is needed to serve static files — no build tools, no source code, no node_modules. It runs as a non-privileged user.
Why this priority: Shipping build tools and source code in production images increases attack surface and image size. Running as root violates least-privilege principles.
Independent Test: Inspect the running container and confirm the process user is non-root; then attempt to run a Node.js binary inside the container and confirm it is absent.
Acceptance Scenarios:
- Given the production image, When the running process user is inspected, Then it is not root (UID ≠ 0).
- Given the production image, When the image contents are inspected, Then node_modules/, source TypeScript files, and the Node.js runtime are absent.
- Given the production image, When image layer history is inspected, Then no secrets, API keys, or credentials appear in any layer command.
- Given the production image, When the image size is measured, Then it is substantially smaller than a single-stage image that includes the Node.js toolchain.
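The non-root and Node.js-absent scenarios could be checked along these lines. This is a sketch under assumptions: the helper names are hypothetical, the container is already running, and the image ships `id` and `sh` (as Alpine-based images do via BusyBox).

```shell
# is_non_root UID
# Succeeds when UID is numeric and not 0; returns 2 for non-numeric input.
is_non_root() {
  case "$1" in
    '' | *[!0-9]*) return 2 ;;
  esac
  [ "$1" -ne 0 ]
}

# container_uid CONTAINER
# Prints the UID that `docker exec` runs as, i.e. the container's configured user.
container_uid() {
  docker exec "$1" id -u
}

# node_absent CONTAINER
# Succeeds when no `node` binary is found on PATH inside the container.
node_absent() {
  ! docker exec "$1" sh -c 'command -v node' >/dev/null 2>&1
}
```

For example, `is_non_root "$(container_uid ui-under-test)" && node_absent ui-under-test`, where ui-under-test is a placeholder container name.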
User Story 3 - Fast, Reproducible Builds (Priority: P3)
Rebuilding the image after a source-only change (no dependency changes) reuses the dependency installation layer from cache, completing in seconds rather than minutes.
Why this priority: Slow builds impede the development feedback loop and CI pipeline throughput. Dependency installs are the dominant time cost.
Independent Test: Build once, then change a source file and build again — the build output confirms the dependency layer was served from cache.
Acceptance Scenarios:
- Given the image has been built once, When only a source file is changed and the image is rebuilt, Then the dependency installation step is skipped (cache hit).
- Given a dependency file is changed, When the image is rebuilt, Then the dependency installation step runs fresh (cache miss is correct behaviour).
- Given two successive builds with identical inputs, Then both produce functionally identical output.
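One way to confirm the cache-hit scenario is to inspect BuildKit's plain progress output, where a reused step is reported as `#N CACHED`. The sketch below assumes the build is run with `--progress=plain` and that the dependency step contains the text `npm ci`; both the helper name dep_layer_cached and that step text are assumptions, not details taken from the project's Dockerfile.

```shell
# dep_layer_cached STEP_TEXT
# Reads a BuildKit plain-progress log on stdin. Succeeds when the step whose
# header line contains STEP_TEXT is later reported as CACHED.
dep_layer_cached() {
  awk -v pat="$1" '
    # Remember the step id (e.g. "#9") of the line that names the step.
    index($0, pat) && $1 ~ /^#[0-9]+$/ { id = $1 }
    # A later "#9 CACHED" line means the layer was reused.
    id != "" && $1 == id && $2 == "CACHED" { found = 1 }
    END { exit !found }
  '
}
```

A possible invocation, with a placeholder tag: `docker build --progress=plain -t ui:prod ui/ 2>&1 | dep_layer_cached "npm ci"`. The redirect matters because BuildKit writes progress to stderr.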
Edge Cases
- What happens when the container starts but the built assets are missing or corrupted?
- How does the server handle requests for non-existent routes that should fall back to the app shell (SPA routing)?
- What happens when the container receives a stop signal while actively serving requests?
- What happens if the port is already in use at startup?
Requirements (mandatory)
Functional Requirements
- FR-001: The production image MUST be built via a multi-stage process — a build stage compiles the application into static assets, and a separate runtime stage serves only those assets.
- FR-002: The runtime stage MUST NOT contain the Node.js runtime, npm, source TypeScript, or node_modules/.
- FR-003: The container MUST serve the application on port 8080. External orchestrators (docker-compose, Kubernetes ingress) map this to port 80 as needed.
- FR-004: The container MUST handle SPA (single-page application) routing by returning the app shell for any unmatched path, so client-side routing works correctly.
- FR-005: The container MUST run as a non-root user.
- FR-006: The container MUST expose a health-check endpoint that returns success when the service is ready to accept traffic.
- FR-007: The container MUST exit with code 0 when sent a graceful stop signal.
- FR-008: Static assets MUST be served with cache-control headers that enable client-side caching for fingerprinted assets.
- FR-009: The Dockerfile MUST structure layers so that dependency installation is cached independently from source code changes.
- FR-010: The build MUST be reproducible — given the same source and lockfile, successive builds produce equivalent images.
- FR-011: Credentials, secrets, and API keys MUST NOT appear in any image layer.
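FR-011 lends itself to a simple scan of the layer commands. The following is a rough sketch: the function name and the keyword pattern are assumptions chosen for illustration, and a pattern match of this kind can miss secrets or flag harmless lines, so it complements review rather than replacing it.

```shell
# scan_history_for_secrets
# Reads layer commands on stdin (one per line). Prints any line that looks
# like a credential assignment and returns 1; returns 0 when none are found.
scan_history_for_secrets() {
  if grep -E -i -n '(api[_-]?key|secret|passw(or)?d|token|private[_-]?key)[A-Za-z0-9_]*=' ; then
    return 1
  fi
  return 0
}
```

It could be fed from `docker history --no-trunc --format '{{.CreatedBy}}' IMAGE | scan_history_for_secrets`, where IMAGE is the tag under test.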
Key Entities
- Build Stage: The intermediate container that installs dependencies and compiles source into static assets; discarded after build.
- Static Assets: The compiled output (HTML, JS bundles, CSS, fonts, images) that the runtime stage serves.
- Runtime Stage: The minimal final image containing only a web server and the compiled static assets.
- Production Image: The tagged, distributable image produced by the build; used directly in deployment.
Success Criteria (mandatory)
Measurable Outcomes
- SC-001: The container serves a 200 response on port 8080 within 30 seconds of starting.
- SC-002: The production image is substantially smaller than a single-stage image that retains the Node.js toolchain. A manual size comparison after the initial build confirms the multi-stage approach delivers a meaningful reduction (expected: >60% reduction).
- SC-003: A source-only rebuild completes in under 30 seconds (dependency layer served from cache).
- SC-004: All 11 functional requirements pass automated verification on every build.
- SC-005: The running container process has UID ≠ 0, confirmed by automated check.
- SC-006: No existing integration tests regress after the Dockerfile and supporting files are introduced.
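The SC-002 threshold is a short calculation: reduction = (baseline - runtime) / baseline. The sketch below uses hypothetical helper names and integer arithmetic (rounding down); the sizes may be in any unit as long as both use the same one.

```shell
# size_reduction_pct BASELINE RUNTIME
# Prints the integer percentage by which RUNTIME is smaller than BASELINE.
size_reduction_pct() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

# meets_sc002 BASELINE RUNTIME
# Succeeds when the reduction exceeds the 60% expected in SC-002.
meets_sc002() {
  [ "$(size_reduction_pct "$1" "$2")" -gt 60 ]
}

# image_size_bytes IMAGE
# Prints the image size in bytes as reported by the Docker CLI.
image_size_bytes() {
  docker image inspect --format '{{.Size}}' "$1"
}
```

With the sizes reported for this change (82 MB multi-stage against 329 MB single-stage), `size_reduction_pct 329 82` prints 75, which clears the 60% expectation.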
Assumptions
- The Angular application is built for production using the standard build toolchain (ng build --configuration production or equivalent), producing a dist/ output directory.
- The production web server is responsible for SPA fallback routing (returning the app shell for unmatched paths).
- Gzip or Brotli compression at the web server layer is desirable but not mandatory for the initial implementation.
- The UI container does not need to proxy API requests — it communicates with the API directly from the browser (the Angular proxy config is only used in local development).
- The container listens on port 8080 (non-privileged, enabling non-root operation). External load balancers or ingress controllers map this to port 80. TLS termination occurs upstream.
- The build context is the ui/ directory; files excluded from the build context (source maps in CI, node_modules/ already present locally) are managed via .dockerignore.
- The same verification approach used for the API image (a shell script as the TDD artefact) applies here.