Optimizing Docker Images: A Data-Driven Guide to Reducing Image Size with Dive
These articles are AI-generated summaries. Please check the original sources for full details.
Docker Image Diet: Find the Problem With dive Before Trying to Fix It
Engineer Recca Tsai demonstrates that guessing at image optimizations is less effective than profiling the image with diagnostic tooling. By identifying specific layer inefficiencies, Tsai reduced a standard Node.js image from 1.25GB to 139MB.
Why This Matters
Engineers often apply generic optimization checklists without understanding why an image is large, which yields minimal gains. In reality, build-time dependencies and duplicated files, such as a 561MB apt-get layer or 107MB of wasted devDependencies, often persist in the final image, increasing storage costs and slowing deployment pipelines. Effective optimization requires visibility into the layer stack so that changes target the actual source of weight rather than the surface.
Key Insights
- The ‘docker image history’ command shows the size of each layer; in this case it exposed a 561MB layer of build tools such as gcc and python3.
- The ‘dive’ tool identifies file-level duplication, showing that files like typescript.js can appear in multiple layers when a .dockerignore file is missing.
- Wasted space often stems from devDependencies; analysis showed 107MB of waste from packages like @babel/parser that serve no purpose in production.
- Multi-stage builds effectively separate build environments from production runtimes, reducing a Node.js image to its 139MB Alpine-based floor.
- Switching to Google’s Distroless images for Node.js can further reduce the final production footprint to approximately 100MB.
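The inspection steps described in the list above can be sketched as a few shell helpers. This is a minimal sketch rather than the article's exact commands: the function names (layer_sizes, browse_layers, ci_check) and the image tag in the usage note are hypothetical, and it assumes both Docker and dive are installed.

```shell
#!/bin/sh
# Hypothetical helper names; the underlying tools are `docker image history` and `dive`.

# Print each layer's size next to the instruction that created it.
layer_sizes() {
    docker image history --no-trunc --format 'table {{.Size}}\t{{.CreatedBy}}' "$1"
}

# Open dive's interactive view to browse file changes layer by layer.
browse_layers() {
    dive "$1"
}

# Run dive non-interactively via its CI mode to get a pass/fail efficiency report.
ci_check() {
    CI=true dive "$1"
}
```

With the functions loaded, something like `layer_sizes myapp:latest` (an assumed tag) would be the first step, followed by `browse_layers myapp:latest` once a suspiciously large layer stands out.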
Working Examples
A typical unoptimized Node.js Dockerfile that results in a 1.25GB image.
FROM node:latest
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
CMD ["node", "index.js"]
A multi-stage build using Alpine and production-only dependencies to reduce size to 139MB.
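# Stage 1: install every dependency (devDependencies included) for build-time work
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .

# Stage 2: runtime image with production dependencies only
FROM node:20-alpine AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
# Copy only the application entry point out of the builder stage
COPY --from=builder /app/index.js ./
CMD ["node", "index.js"]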
Using the ‘scratch’ base image for Go applications to create minimal images containing only the binary.
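# Stage 1: compile the binary with the full Go toolchain
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY . .
# CGO_ENABLED=0 avoids linking against C libraries, which scratch does not provide
RUN CGO_ENABLED=0 go build -o server .

# Stage 2: empty base image that holds only the compiled binary
FROM scratch
COPY --from=builder /app/server /server
ENTRYPOINT ["/server"]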
Practical Applications
- Use Case: Deploying Node.js microservices whose production stage uses ‘--omit=dev’ to exclude large testing frameworks like Jest. Pitfall: Without a .dockerignore file, ‘COPY . .’ copies the local node_modules over the fresh install, doubling the image size.
- Use Case: Identifying redundant system packages in legacy images by using ‘dive’ to find unused build-essential tools. Pitfall: Relying on ‘node:latest’, which is Debian-based and includes hundreds of megabytes of unnecessary system utilities.
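The .dockerignore pitfall above can be avoided with a small starter file. The sketch below writes one into the current directory; only the node_modules entry comes from the text, and the other entries are assumptions about what a typical Node.js project would want to keep out of the build context.

```shell
#!/bin/sh
# Write a starter .dockerignore into the current directory.
# Note: this overwrites any existing .dockerignore.
cat > .dockerignore <<'EOF'
node_modules
npm-debug.log
.git
EOF
```

After adding the file, rebuilding and re-inspecting the image with ‘dive’ is a reasonable way to confirm that the duplicated files no longer appear in the ‘COPY . .’ layer.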