
How to speed up Docker builds for cloud deployments

Docker builds are often painfully slow. Isolating where layer caching delays occur can help you optimize build times.
Mavrick Laakso | October 22, 2023

Docker’s ubiquity is well earned: nearly every deployment process I’ve seen in the past five years of my career has used it to build images for deployment.

Amazon’s ECS, Google’s Cloud Run, and Kubernetes all have images and containers at their core. Cloud native is the de facto standard. Accordingly, my current project (a serverless backend built on AWS Lambda) uses Docker to package the functions that are invoked.

This generally works great: what we author in our local environments corresponds to what we see in our cloud development environments, in staging, and in production. Our typical developer lifecycle is to author a change locally, test it locally, push the image to an image repository, and then use that image to deploy a container to a dev cloud environment, where we validate the change in a serverless setting. However, one part of this process was a recurring thorn in our side.

Docker builds can be slow.

Particularly on this project, making a change could take upwards of a few minutes when rebuilding the Docker image to then deploy a container to a dev cloud environment. So, make a change, break your flow until the build finishes, and then forget everything you were doing. Or worse, get distracted.

With some downtime between tasks, I wanted to fix this to improve the developer experience of the project.

Getting the lay of the land

Iterative improvement is the name of the game, so rather than accepting a slow Dockerfile, I wanted to understand what the problem was and then investigate how to resolve it. This project’s codebase contained a large number of Lambda functions and their source code with their dependencies managed via npm workspaces. Further, a common module provided shared functionality across the codebase: writing to a database, making API calls, processing errors, and so forth.

$ tree .
.
├── Dockerfile
├── package-lock.json
├── package.json
├── src
│   ├── common
│   │   ├── module1.js
│   │   ├── module2.js
│   │   ├── package.json
│   │   └── src
│   │       ├── module1
│   │       │   └── index.js
│   │       └── module2
│   │           └── index.js
│   ├── lambda1
│   │   ├── app.js
│   │   ├── package.json
│   │   └── src
│   │       ├── handler.js
│   │       ├── handler.spec.js
│   │       └── index.js
│   ├── lambda2
│   │   ├── app.js
│   │   ├── package.json
│   │   └── src
│   │       ├── handler.js
│   │       ├── handler.spec.js
│   │       └── index.js

Our Dockerfile copied in the Lambda source directory named by the LAMBDA_DIRECTORY_NAME build argument, along with the common directory, installed the dependencies, ran a build with esbuild, and copied the resulting build artifact into the deployment image:

ARG LAMBDA_DIRECTORY_NAME

# Builder image
FROM public.ecr.aws/lambda/nodejs:18 AS builder
ARG LAMBDA_DIRECTORY_NAME
WORKDIR ${LAMBDA_TASK_ROOT}

RUN npm install -g npm@9

COPY package.json package.json
COPY package-lock.json package-lock.json
COPY src/common src/common
COPY src/${LAMBDA_DIRECTORY_NAME} src/${LAMBDA_DIRECTORY_NAME}

RUN npm ci
RUN LAMBDA_DIRECTORY_NAME=${LAMBDA_DIRECTORY_NAME} npm run build

# Deployment image
FROM public.ecr.aws/lambda/nodejs:18
ARG LAMBDA_DIRECTORY_NAME
WORKDIR ${LAMBDA_TASK_ROOT}

COPY --from=builder ${LAMBDA_TASK_ROOT}/src/${LAMBDA_DIRECTORY_NAME}/dist ${LAMBDA_TASK_ROOT}/dist

ENTRYPOINT /lambda-entrypoint.sh dist/app.lambdaHandler

Docker: Like an onion

A Docker image is a stack of layers, with each instruction in the Dockerfile generally producing a new layer. Docker caches these layers between builds and does its best to rebuild only what has changed, which makes repeated builds faster.

This is like an ordered list, a dependency chain, or an onion (if you prefer). Changing a file that is referenced in the first instruction (at the start of the list) of your Dockerfile means everything after must be re-executed. Changing a file that is referenced in the last instruction (at the end of the list) of your Dockerfile means only that instruction must be re-executed.
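
The rule above can be sketched as a toy model in shell. This is only an illustration of the ordering behavior, not how Docker is implemented; the function name plan_build and the step labels are made up for the example.

```shell
#!/bin/sh
# Toy model of the layer cache: every step before the first changed one is
# reused, and every step from the first changed one onward is rebuilt.
# Usage: plan_build FIRST_CHANGED_STEP STEP...
plan_build() {
  first_changed=$1
  shift
  i=1
  for step in "$@"; do
    if [ "$i" -lt "$first_changed" ]; then
      echo "CACHED  $step"
    else
      echo "REBUILD $step"
    fi
    i=$((i + 1))
  done
}

# A change to the files copied by step 2 forces steps 2, 3, and 4 to rerun.
plan_build 2 \
  "COPY package.json package-lock.json" \
  "COPY src/common" \
  "RUN npm ci" \
  "RUN npm run build"
```

In this model only the first step is reported as CACHED, which is why the position of a slow instruction relative to frequently changing files matters so much.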

So, I wanted to know where the time was being spent in building the image. Using the docker buildx build command, I was able to see the amount of time spent running each instruction:

$ docker buildx build --build-arg LAMBDA_DIRECTORY_NAME=lambda1 .
[+] Building 34.2s (14/15)
 => [internal] load build definition from Dockerfile                                                                                           0.0s
 => => transferring dockerfile: 721B                                                                                                           0.0s
 => [internal] load .dockerignore                                                                                                              0.0s
 => => transferring context: 2B                                                                                                                0.0s
 => [internal] load metadata for public.ecr.aws/lambda/nodejs:18                                                                              30.2s
 => [internal] load build context                                                                                                              0.1s
 => => transferring context: 1.43MB                                                                                                            0.1s
 => [builder 1/9] FROM public.ecr.aws/lambda/nodejs:18@sha256:50f22b7077c7fbb7be2720fb228462e332850a4cd48b4132ffc3c171603ab191                 0.0s
 => CACHED [builder 2/9] WORKDIR /var/task                                                                                                     0.0s
 => [builder 3/9] RUN npm install -g npm@9                                                                                                     5.1s
 => [builder 4/9] COPY package.json package.json                                                                                               0.0s
 => [builder 5/9] COPY package-lock.json package-lock.json                                                                                     0.0s
 => [builder 6/9] COPY src/common src/common                                                                                                   0.0s
 => [builder 7/9] COPY src/lambda1 src/lambda1                                                                                                 0.0s
 => [builder 8/9] RUN npm ci                                                                                                                  29.1s
 => [builder 9/9] RUN LAMBDA_DIRECTORY_NAME=lambda1 npm run build                                                                              0.6s
 => [stage-1 3/3] COPY --from=builder /var/task/src/lambda1/dist /var/task/dist                                                                0.0s
 => exporting to image                                                                                                                         0.0s
 => => exporting layers                                                                                                                        0.0s
 => => writing image sha256:a8095a1267ddf2a08d53525231565087e1d575a38b41eb9c6eddb331d977c591                                                   0.0s

The problem

Apart from fetching the base image metadata, which depends on the network rather than on our Dockerfile, the bulk of our build time was spent running npm ci.
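
With a longer Dockerfile it can help to rank the steps by duration instead of scanning the output by eye. The following is a minimal sketch, assuming the build summary has been saved as text in the same format as the output above; the function name slowest_steps is hypothetical, and the parsing would need adjusting for other progress formats.

```shell
#!/bin/sh
# Print build steps from a saved build summary, slowest first.
# Reads lines such as " => [builder 8/9] RUN npm ci      29.1s" on stdin.
# Usage: slowest_steps [COUNT]   (COUNT defaults to 3)
slowest_steps() {
  awk '/^ *=> [^=]/ && $NF ~ /^[0-9.]+s$/ {
         secs = $NF
         sub(/s$/, "", secs)            # "29.1s" -> "29.1"
         line = $0
         sub(/^ *=> /, "", line)        # drop the leading "=>" marker
         sub(/ +[0-9.]+s$/, "", line)   # drop the trailing duration
         printf "%s\t%s\n", secs, line
       }' | sort -rn | head -n "${1:-3}"
}

# Example with three of the builder steps from the output above.
printf '%s\n' \
  ' => [builder 3/9] RUN npm install -g npm@9            5.1s' \
  ' => [builder 8/9] RUN npm ci                         29.1s' \
  ' => [builder 9/9] RUN npm run build                   0.6s' |
  slowest_steps 1
```

For the three sample lines, the single line printed is the npm ci step with its 29.1 second duration.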

The majority of our logic and functionality lived in our common directory, so that was the code that most frequently changed.

Whenever we made a change to common, our npm ci build instruction was re-executed. Further, since common functions as a shared library across our code, and this Dockerfile was common to all our Lambda functions, any dependent Lambdas would also have to be rebuilt in order to deploy.

So, every time we made a code change in common, for every Lambda, we had to re-invoke npm ci, leading to our slow builds, and our frequent coffee breaks.
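
To make the multiplication concrete, here is a rough sketch of a loop that rebuilds every Lambda after a change to common. It assumes each directory under src other than common is a Lambda, as in the tree above; the function name build_all_lambdas and the DRY_RUN switch are inventions for this example, and by default the commands are only printed rather than run.

```shell
#!/bin/sh
# Build (or, by default, just print the build command for) every Lambda
# directory under ROOT/src, skipping the shared "common" workspace.
# Usage: build_all_lambdas [ROOT]   (set DRY_RUN=0 to really invoke docker)
build_all_lambdas() {
  root=${1:-.}
  for dir in "$root"/src/*/; do
    [ -d "$dir" ] || continue            # the glob matched nothing
    name=$(basename "$dir")
    [ "$name" = "common" ] && continue   # shared code, not a Lambda
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "docker buildx build --build-arg LAMBDA_DIRECTORY_NAME=$name $root"
    else
      docker buildx build --build-arg "LAMBDA_DIRECTORY_NAME=$name" "$root"
    fi
  done
}
```

With the original Dockerfile, each of those builds would rerun npm ci, so one edit to common cost roughly thirty seconds per Lambda.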

The solution

Remember how Docker is like an onion?

We only needed to re-execute npm ci when a dependency was added, removed, or updated. So the fix was to reorder the Dockerfile: copy only the package.json and package-lock.json files, run npm ci, and only then copy in the source code. That way the slow step stays cached in our most common case (modifying common).

The reordered Dockerfile looks like this:

ARG LAMBDA_DIRECTORY_NAME

# Builder image
FROM public.ecr.aws/lambda/nodejs:18 AS builder

ARG LAMBDA_DIRECTORY_NAME

WORKDIR ${LAMBDA_TASK_ROOT}

RUN npm install -g npm@9

COPY package.json package-lock.json ./
COPY src/common/package.json src/common/package.json
COPY src/${LAMBDA_DIRECTORY_NAME}/package.json src/${LAMBDA_DIRECTORY_NAME}/package.json

RUN npm ci

COPY src/common/src src/common/src
COPY src/common/module1.js \
     src/common/module2.js \
     ./src/common/

COPY src/${LAMBDA_DIRECTORY_NAME}/src src/${LAMBDA_DIRECTORY_NAME}/src/
COPY src/${LAMBDA_DIRECTORY_NAME}/app.js src/${LAMBDA_DIRECTORY_NAME}/

RUN LAMBDA_DIRECTORY_NAME=${LAMBDA_DIRECTORY_NAME} npm run build

# Deployment image
FROM public.ecr.aws/lambda/nodejs:18

ARG LAMBDA_DIRECTORY_NAME

WORKDIR ${LAMBDA_TASK_ROOT}

COPY --from=builder ${LAMBDA_TASK_ROOT}/src/${LAMBDA_DIRECTORY_NAME}/dist ${LAMBDA_TASK_ROOT}/dist

ENTRYPOINT /lambda-entrypoint.sh dist/app.lambdaHandler
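
One way to reason about when the npm ci layer can be reused with this ordering: it now depends only on the four manifest files copied before it. The sketch below fingerprints just those files, so an unchanged value between two builds suggests the layer should come from cache. It is a rough analogy for Docker's content-based cache check, not its actual algorithm; the function name manifest_fingerprint is hypothetical, and cksum is used only because it is widely available.

```shell
#!/bin/sh
# Fingerprint only the dependency manifests that are copied before `npm ci`.
# Source files under src/common/src or src/<lambda>/src are deliberately
# ignored, mirroring what the reordered Dockerfile copies ahead of `npm ci`.
# Usage: manifest_fingerprint ROOT LAMBDA_DIRECTORY_NAME
manifest_fingerprint() {
  root=$1
  lambda=$2
  cat "$root/package.json" \
      "$root/package-lock.json" \
      "$root/src/common/package.json" \
      "$root/src/$lambda/package.json" | cksum | cut -d ' ' -f 1
}
```

Running manifest_fingerprint . lambda1 before and after editing a file under src/common/src should print the same value, while adding a dependency to any of the package.json files should change it.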

The results

Installing a new dependency still required rerunning npm ci, so those builds still took thirty seconds or so. However, modifying code in common no longer triggered npm ci to re-execute. As a result, we could author and deploy code changes to our dev cloud environment much more quickly, without breaking our flow state:

$ docker buildx build --build-arg LAMBDA_DIRECTORY_NAME=lambda1 .
[+] Building 0.5s (18/19)
 => [internal] load build definition from Dockerfile                                                                                           0.0s
 => => transferring dockerfile: 1.29kB                                                                                                         0.0s
 => [internal] load .dockerignore                                                                                                              0.0s
 => => transferring context: 2B                                                                                                                0.0s
 => [internal] load metadata for public.ecr.aws/lambda/nodejs:18                                                                               0.4s
 => [builder  1/13] FROM public.ecr.aws/lambda/nodejs:18@sha256:50f22b7077c7fbb7be2720fb228462e332850a4cd48b4132ffc3c171603ab191               0.0s
 => [internal] load build context                                                                                                              0.1s
 => => transferring context: 475.68kB                                                                                                          0.0s
 => CACHED [builder  2/13] WORKDIR /var/task                                                                                                   0.0s
 => CACHED [builder  3/13] RUN npm install -g npm@9                                                                                            0.0s
 => CACHED [builder  4/13] COPY package.json package-lock.json ./                                                                              0.0s
 => CACHED [builder  5/13] COPY src/common/package.json src/common/package.json                                                                0.0s
 => CACHED [builder  6/13] COPY src/lambda1/package.json src/lambda1/package.json                                                              0.0s
 => CACHED [builder  7/13] RUN npm ci                                                                                                          0.0s
 => CACHED [builder  8/13] COPY src/common/config src/common/config                                                                            0.0s
 => CACHED [builder  9/13] COPY src/common/src src/common/src                                                                                  0.0s
 => CACHED [builder 10/13] COPY src/common/module1.js      src/common/module2.js       src/common                                              0.0s
 => [builder 11/13] COPY src/lambda1/src src/lambda1/src/                                                                                      0.0s
 => [builder 12/13] COPY src/lambda1/app.js src/lambda1/                                                                                       0.0s
 => [builder 13/13] RUN LAMBDA_DIRECTORY_NAME=lambda1 npm run build                                                                            0.9s
 => [stage-1 3/3] COPY --from=builder /var/task/src/lambda1/dist /var/task/dist                                                                0.0s
 => exporting to image                                                                                                                         0.0s
 => => exporting layers                                                                                                                        0.0s
 => => writing image sha256:716193841c31688bfd4a4b08f81735accb2d5f047c9d33fd1d31461b935ecfe4                                                   0.0s

Devs are happiest when they’re working and not waiting, so I considered this a win for our team’s health and for our productivity. If you’re suffering from slow builds, I invite you to examine your Dockerfiles and think about how to order the instructions to optimize for caching slow steps.

