I was building a docker image and saw the term “layers” – Explained

2022.01.13

For the past few days, I have been learning about containers and AWS’s containerisation services. While doing this, I tried to build a docker image from a dockerfile. The command is simple and accepts several kinds of sources to build an image from. It goes something like this:

 

docker build [OPTIONS] PATH | URL | -

Since I don’t know the syntax of a dockerfile very well yet, I decided to take a sample dockerfile from the internet, run the docker build command on it and watch the process unfold. For your reference, this is the dockerfile I used:

 

FROM node:12-alpine

RUN apk add --no-cache python2 g++ make
WORKDIR /app
COPY . .
RUN yarn install --production
CMD ["node","src/index.js"]

While running the command I saw ‘exporting layers’ pop up at the end of the build, which got me wondering, what are these ‘layers’?

Docker layers

In the second-to-last step of building the image, we can see the output ‘exporting layers’

What are layers?

A layer, or image layer, is a change made to an image or an intermediate image. Starting from the base image (the one defined in the first line of the dockerfile), every change creates a new intermediate image, formed by adding a ‘layer’ on top of the previous intermediate image, until we get the final image.

Basically, each instruction in the dockerfile adds a layer on top of the previous image (the FROM line brings in all the layers of the base image).

Whenever there is a change in the dockerfile, docker only rebuilds the layer for the instruction that changed and the layers after it; the layers before it are reused. This makes building docker images faster. This is known as layer caching.
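The caching rule above can be sketched in a few lines of Python. This is only a toy model, not how docker is actually implemented: it assumes each step's cache key is a hash of the previous key plus the instruction text, and it ignores that for COPY and ADD docker also looks at the copied files themselves. The function names cache_keys and steps_to_rebuild are made up for this illustration.

```python
import hashlib


def cache_keys(instructions):
    """Return one key per instruction; each key depends on all earlier ones."""
    keys = []
    parent = ""
    for line in instructions:
        # Assumption: key = hash(previous key + instruction text).
        parent = hashlib.sha256((parent + "\n" + line).encode()).hexdigest()
        keys.append(parent)
    return keys


def steps_to_rebuild(old, new):
    """Return the indexes of instructions in `new` that miss the cache."""
    cached = set(cache_keys(old))
    return [i for i, key in enumerate(cache_keys(new)) if key not in cached]


old = [
    "FROM node:12-alpine",
    "RUN apk add --no-cache python2 g++ make",
    "WORKDIR /app",
    "COPY . .",
    "RUN yarn install --production",
    'CMD ["node","src/index.js"]',
]
new = list(old)
new[4] = "RUN yarn install"  # change only the fifth instruction

print(steps_to_rebuild(old, new))  # [4, 5]
```

Only the changed instruction and the one after it miss the cache; the first four steps are reused as they are.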

It is also important to know that most lines create a new layer, but it’s only the ADD, COPY and RUN commands that create a layer which increases the size of the resulting container image.

You can run the docker history command to see the layers created during the build of your docker image. For the dockerfile given above, the output of this command is shown in the image below.

docker history output

Layers of the image; here f702e4497f7d is the image ID

Why layers?

Layers are immutable. Once created, a layer, identified by a sha256 hash, will never change. That immutability allows images to safely build on and fork off of each other. If two dockerfiles have the same initial set of lines and are built on the same server, they will share the same set of initial layers, saving disk space. That also means that if you rebuild an image with changes only in the last few lines of the dockerfile, only those layers need to be rebuilt and the rest can be reused from the layer cache. This can make a rebuild of docker images very fast.
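To see why identifying layers by a hash saves disk space, here is a minimal sketch of a content-addressed store in Python. It is an illustration under my own assumptions, not docker's real storage code: the LayerStore class is hypothetical, and the byte strings are stand-ins for real layer contents.

```python
import hashlib


class LayerStore:
    """Toy content-addressed store: each distinct layer is kept once."""

    def __init__(self):
        self._blobs = {}

    def add(self, content: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(content).hexdigest()
        # Same content always gives the same digest, so an existing
        # entry is reused rather than rewritten.
        self._blobs.setdefault(digest, content)
        return digest

    def __len__(self):
        return len(self._blobs)


store = LayerStore()
base = [b"alpine + node files", b"python2, g++, make"]  # stand-in layer contents

# Two images that start from the same layers but end differently.
image_a = [store.add(blob) for blob in base + [b"app A source"]]
image_b = [store.add(blob) for blob in base + [b"app B source"]]

print(image_a[:2] == image_b[:2])  # True
print(len(store))                  # 4
```

The two images reference six layers between them, but the store holds only four, because the two shared base layers are stored once.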

 

The second advantage is that inside a container, you see the image filesystem, but that filesystem is not copied. On top of those image layers, the container mounts its own read-write filesystem layer. Every read of a file goes down through the layers until it hits a layer that has marked the file for deletion, hits a layer that has a copy of the file, or runs out of layers to search through. Every write makes a modification in the container-specific read-write layer.
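The read and write rules described above can be modelled roughly like this in Python. It is a simplified sketch and not a real union filesystem: each layer is just a dict of path to contents, and ContainerFS and WHITEOUT are names I made up for the illustration.

```python
WHITEOUT = object()  # marker meaning "this file was deleted in this layer"


class ContainerFS:
    def __init__(self, image_layers):
        # image_layers are ordered bottom to top and are never modified here.
        self.image_layers = image_layers
        self.rw_layer = {}  # the container's own read-write layer

    def read(self, path):
        # Search from the read-write layer down to the base layer.
        for layer in [self.rw_layer] + self.image_layers[::-1]:
            if path in layer:
                if layer[path] is WHITEOUT:
                    raise FileNotFoundError(path)
                return layer[path]
        # Ran out of layers to search through.
        raise FileNotFoundError(path)

    def write(self, path, data):
        # Writes only ever touch the container-specific layer.
        self.rw_layer[path] = data

    def delete(self, path):
        self.rw_layer[path] = WHITEOUT


image = [
    {"/etc/os-release": "alpine"},
    {"/app/src/index.js": "v1"},
]
fs = ContainerFS(image)
fs.write("/app/src/index.js", "patched")
fs.delete("/etc/os-release")

print(fs.read("/app/src/index.js"))   # patched
print(image[1]["/app/src/index.js"])  # v1
```

The container sees the patched file, while the image layer underneath still holds the original, which is why many containers can share one image safely.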

 

I hope this has helped you understand the concept of layers. I will continue posting more blogs about what I learn on this topic in the future.