Dockerfile: The RUN Gotcha Explained

Docker is a popular tool for packaging applications into containers, but it can be tricky to get right. One gotcha is that shell state does not carry over from one RUN instruction to the next, which causes trouble when installing tools like nvm that have to be loaded into the shell before use. In this blog post, we'll explore why splitting such an installation across separate RUN instructions fails and share best practices to help you build efficient Docker images.

Understanding Docker Layers

Before diving into how to use RUN in separate layers, let's first understand what Docker layers are. When you build a Docker image, each instruction in your Dockerfile that modifies the file system (RUN, COPY, ADD) creates a new layer. Each layer represents a change to the file system, such as installing a package or copying a file. The layers are cached, so when you change your Dockerfile and rebuild your image, Docker reuses the cached layers up to the first changed instruction and rebuilds only that instruction and the ones after it. This caching speeds up your builds; it does not by itself reduce the size of your images.
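To make the caching behavior concrete, here is a minimal sketch. The base image, the package, and the app.sh file are hypothetical placeholders chosen for illustration, not part of the example discussed in this post.

```dockerfile
# Hypothetical example: image, package, and file names are placeholders
FROM debian:bookworm-slim

# Layer 1: changes rarely, so rebuilds usually reuse it from the cache
RUN apt-get update && \
    apt-get install -y curl

# Layer 2: editing app.sh invalidates the cache from this instruction onward
COPY app.sh /usr/local/bin/app.sh

# Layer 3: rebuilt whenever the COPY layer above is rebuilt
RUN chmod +x /usr/local/bin/app.sh
```

With this ordering, a change to app.sh would leave the apt-get layer cached and rebuild only the last two layers.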

The Problem with Using RUN in Separate Layers

A common mistake when building Docker images is to install a tool like nvm in one RUN instruction and then call it from a later one. For example, consider the following Dockerfile:

FROM openjdk:11-jdk

RUN apt-get update && \
    apt-get install -y curl unzip libglu1-mesa libjaxb-api-java && \
    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash

ENV NODE_VERSION=v14.16.0

RUN nvm install $NODE_VERSION

RUN nvm use $NODE_VERSION

The RUN nvm install $NODE_VERSION command fails with the error message:

[3/4] RUN nvm install v14.16.0: 
0.437 /bin/sh: 1: nvm: not found

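The error occurs because nvm is not an executable on the PATH; it is a shell function that exists only after nvm.sh has been sourced. The install script downloads nvm into ~/.nvm and appends the sourcing lines to a shell profile file such as ~/.bashrc, but each RUN instruction starts a fresh, non-interactive shell (/bin/sh -c by default) that does not read that file. Each RUN instruction does persist its file system changes as a new layer, so ~/.nvm is still there in the next instruction. What does not persist is shell state: functions, aliases, and variables defined in one RUN command are gone by the next. To fix this, either source nvm.sh in every RUN instruction that uses nvm, or install and use nvm within a single RUN instruction.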
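The first option, sourcing nvm in each RUN instruction that needs it, might look like the sketch below. It assumes the build runs as root and that nvm was installed to its default location, so the script lives at ~/.nvm/nvm.sh; adjust the path if your setup differs.

```dockerfile
FROM openjdk:11-jdk

ENV NODE_VERSION=v14.16.0

RUN apt-get update && \
    apt-get install -y curl unzip libglu1-mesa libjaxb-api-java && \
    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash

# Each RUN starts a new shell, so load the nvm function again before calling it.
# Assumption: nvm was installed to its default location, ~/.nvm.
RUN . ~/.nvm/nvm.sh && \
    nvm install $NODE_VERSION
```

This keeps the instructions separate, at the cost of repeating the sourcing step wherever nvm is called.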

Best Practices for Using RUN in Separate Layers

To avoid issues with RUN in separate layers, install nvm, source it, and run the nvm commands in a single RUN instruction, so that they all execute in the same shell. For example, you could rewrite the Dockerfile as follows:

FROM openjdk:11-jdk

ENV NODE_VERSION=v14.16.0

RUN apt-get update && \
    apt-get install -y curl unzip libglu1-mesa libjaxb-api-java && \
    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash && \
    . ~/.nvm/nvm.sh && \
    nvm install $NODE_VERSION && \
    nvm use $NODE_VERSION
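One caveat: nvm use only affects the shell in which it runs, so later instructions and the running container will not find node on the PATH automatically. A possible way to handle this is to add the installed version's bin directory to PATH with ENV, as sketched below. The path /root/.nvm/versions/node/... is an assumption based on running the build as root with nvm's default install location; verify it in your own image.

```dockerfile
FROM openjdk:11-jdk

ENV NODE_VERSION=v14.16.0

RUN apt-get update && \
    apt-get install -y curl unzip libglu1-mesa libjaxb-api-java && \
    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash && \
    . ~/.nvm/nvm.sh && \
    nvm install $NODE_VERSION

# Assumption: the build runs as root, so nvm placed Node.js under /root/.nvm.
# ENV persists across layers and into the container, unlike `nvm use`.
ENV PATH=/root/.nvm/versions/node/$NODE_VERSION/bin:$PATH

# Sanity check: this works without sourcing nvm.sh because node is now on PATH
RUN node --version
```

Unlike shell functions and variables, values set with ENV are stored in the image metadata, which is why they survive from one instruction to the next.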

Conclusion

When building Docker images, it's important to be mindful of what each RUN instruction does and does not carry forward: file system changes persist in layers, but shell state does not. By installing and using tools like nvm within a single RUN instruction, or by sourcing them again wherever they are needed, you can avoid issues like the one we saw here. This helps keep your Docker images reliable and consistent.

Saurabh Yadav