Monday, January 6, 2025

Docker Image Layer Analysis

Let's examine the layers of an image built from the Dockerfile below.

FROM python:3.12-slim

# Set the working directory

WORKDIR /app

# Copy the requirements and install dependencies

COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code

COPY ./app ./app

# Expose the application port

EXPOSE 8000

# Start the FastAPI app

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

1. Base Image Layer

What Happens During Image Build?

Read the Dockerfile:

Docker sequentially reads each instruction in the Dockerfile.

Base Image:

The FROM instruction names the base image. Docker pulls it (e.g., python:3.12-slim) from the registry (e.g., Docker Hub) if it's not already available locally.

Layer Creation:

For each instruction that changes the filesystem (e.g., RUN, COPY, ADD), Docker creates a layer.

Each layer represents changes made to the filesystem (e.g., installed packages, copied files).

Layer Caching:


Docker caches each layer. If you rebuild the image and an instruction, the files it copies, and all the steps before it haven't changed, Docker reuses the cached layer instead of executing the instruction again.

This caching improves build performance.

Final Image:


After processing all instructions, Docker stacks the layers, together with the image metadata, into a single image.

The resulting image contains the result of every instruction applied in sequence, ready for distribution or execution.
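
The caching behaviour described above can be illustrated with a toy model in Python. This is only a sketch: Docker computes its real cache keys differently (COPY and ADD steps also take file contents into account), and the function names cache_keys and reusable_steps are made up for this illustration. The point it shows is that each step's key depends on its own instruction and on the key of the step before it, so a change invalidates that step and everything after it.

```python
import hashlib


def cache_keys(instructions):
    """Return one key per step; each key chains the previous key with the instruction text."""
    keys = []
    parent = ""
    for instruction in instructions:
        key = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        keys.append(key)
        parent = key
    return keys


def reusable_steps(old, new):
    """Count leading steps whose keys match, i.e. steps a rebuild could take from cache."""
    count = 0
    for old_key, new_key in zip(cache_keys(old), cache_keys(new)):
        if old_key != new_key:
            break
        count += 1
    return count


before = [
    "FROM python:3.12-slim",
    "WORKDIR /app",
    "COPY requirements.txt .",
    "RUN pip install --no-cache-dir -r requirements.txt",
    "COPY ./app ./app",
]
# Same build with one extra RUN inserted before the final COPY.
after = before[:4] + ["RUN apt-get update && apt-get install -y curl vim"] + before[4:]

print(reusable_steps(before, before))  # every step is reused
print(reusable_steps(before, after))   # only the steps before the new RUN are reused
```

Note that the final COPY is identical in both builds, yet it cannot be reused in the second one because the step before it changed.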



How Layers Improve Docker Build Efficiency

Caching: Docker reuses unchanged layers from the cache when rebuilding an image, speeding up builds.

Shared Layers: Containers started from the same image share its read-only layers, and images built on the same base share the base layers, reducing storage usage.
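
To get a feel for the storage saving, here is a small Python sketch. The digests and sizes are hypothetical values chosen for illustration, and disk_usage_mb is a made-up helper, not part of any Docker tooling. It assumes two images built on the same base and counts each distinct layer only once.

```python
def disk_usage_mb(images):
    """Sum the size of each distinct layer once, however many images reference it."""
    unique = {}
    for layers in images.values():
        for digest, size_mb in layers:
            unique[digest] = size_mb
    return sum(unique.values())


# Hypothetical digests and sizes, chosen only for illustration.
base = [("sha256:aaa", 75), ("sha256:bbb", 40)]
images = {
    "service-a": base + [("sha256:ccc", 12)],
    "service-b": base + [("sha256:ddd", 8)],
}

naive = sum(size for layers in images.values() for _, size in layers)
print(naive)                  # size if every image stored its own copy of each layer
print(disk_usage_mb(images))  # size when the shared base layers are stored once
```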



Best Practices for Layer Efficiency

Minimize Layer Count: Combine commands where possible (e.g., RUN apt-get update && apt-get install).

Order Matters: Place the most frequently changing instructions last in the Dockerfile.

Leverage .dockerignore: Exclude unnecessary files to reduce context size and prevent cache invalidation.

Group related instructions to minimize layers.

Install dependencies in one step to avoid redundant layers.

Use a lightweight base image (e.g., slim).



Build the image using the command below:


docker build -t mrrathish/docker-layer-analysis:latest .




Inspecting the built image with Dive (for example, dive mrrathish/docker-layer-analysis:latest) displays output like the below:



Instead of showing the base image name python:3.12-slim specified in the Dockerfile, Dive displays Blob. This is because Dive analyzes and displays all layers in the resulting image, including those inherited from the base image. Docker stores image layers as content-addressable blobs, and each blob corresponds to a change in the filesystem or metadata.

Dive doesn't show the image tag (python:3.12-slim) because the tag is only metadata used for convenience. Instead, it shows the underlying layers that make up the image. This is why FROM python:3.12-slim appears as the 4 separate layers that make up the base image.
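
"Content-addressable" means a blob is identified by a hash of its own content rather than by a name. The Python sketch below illustrates the idea with sha256 and short stand-in byte strings. It is not how Docker packages layers (a real layer blob is a tar archive of filesystem changes), and the digest helper is made up for this example.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return a sha256 digest string in the 'sha256:<hex>' form used for image layers."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


# Stand-in byte strings; a real layer blob is a tar archive of filesystem changes.
layer_a = b"files added by one build step"
layer_b = b"files added by one build step"
layer_c = b"files added by another build step"

print(digest(layer_a) == digest(layer_b))  # identical content, identical address
print(digest(layer_a) == digest(layer_c))  # different content, different address
```

Because the address is derived from the content, a tag such as python:3.12-slim is just a convenient label that points at a set of these digests.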



Notice that EXPOSE does not appear as a separate layer. Layers in Docker images are typically created only by instructions that alter the filesystem, such as RUN, ADD, or COPY. EXPOSE is recorded in the image metadata instead: the port it specifies appears as ExposedPorts in the output of docker inspect <image-name>.
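
As a rough illustration of where that metadata lives, the Python sketch below reads ExposedPorts from a hand-written, heavily trimmed sample shaped like docker inspect output. The sample is not captured from a real run, and exposed_ports is a hypothetical helper written for this post.

```python
import json

# Trimmed, hand-written sample shaped like `docker inspect` output (a JSON array
# with one object per image); real output contains many more fields.
sample = """
[
  {
    "RepoTags": ["mrrathish/docker-layer-analysis:latest"],
    "Config": {
      "ExposedPorts": {"8000/tcp": {}},
      "Cmd": ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"],
      "WorkingDir": "/app"
    }
  }
]
"""


def exposed_ports(inspect_json: str):
    """Return the sorted list of ports recorded under Config.ExposedPorts."""
    image = json.loads(inspect_json)[0]
    return sorted((image["Config"].get("ExposedPorts") or {}).keys())


print(exposed_ports(sample))
```

CMD and WORKDIR show up in the same Config section, which is why they can be read from the image without looking inside any layer.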


Some of the layers are 0B in size. These layers exist as placeholders in the image's metadata and do not contribute to the overall size.


Now let's examine how changing instructions in the Dockerfile affects the layers and the image size. Before the steps below, the image is 172MB.

Change the Dockerfile to look like this:


FROM python:3.12-slim


# Set the working directory

WORKDIR /app


# Copy the requirements and install dependencies

COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt


# Install extra tools, one RUN instruction per command

RUN apt-get update

RUN apt-get install -y curl

RUN apt-get install -y vim


# Copy the application code

COPY ./app ./app


# Expose the application port

EXPOSE 8000


# Start the FastAPI app

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]



Now the layers look like the below:


Three additional layers were added.


The total image size has grown to 237MB.


Now let's combine the RUN statements into a single instruction using &&, like this:

RUN apt-get update && \
    apt-get install -y curl && \
    apt-get install -y vim


Now the layers look like the below:




The image size is reduced slightly to 236MB, and the number of layers is reduced as well.
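
Putting the three sizes reported in this post side by side makes the effect easier to judge. The short Python calculation below uses only those figures. The variable names are just labels for this example.

```python
# Image sizes reported in this post, in MB.
baseline = 172    # original Dockerfile
with_tools = 237  # after adding the apt-get steps for curl and vim
combined = 236    # after combining the RUN statements with &&

added = with_tools - baseline
saved = with_tools - combined
print(f"curl and vim added {added} MB ({added / baseline:.0%} of the baseline)")
print(f"combining the RUN statements saved {saved} MB")
```

These numbers suggest that most of the growth comes from the installed packages themselves, so combining RUN statements mainly reduces the layer count rather than the image size.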


Also, frequently changing instructions should be placed towards the bottom of the Dockerfile.


If frequently changing instructions are placed at the top, every subsequent layer (even rarely changing ones) must be rebuilt whenever they change.

By placing rarely changing instructions at the top, Docker can reuse cached layers, rebuilding only the final layers that are affected by changes.
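
The effect of ordering can be sketched with a few lines of Python. This is a simplified model under one assumption: once a step changes, that step and every later step must run again. The helper steps_rebuilt and the "code first" ordering are hypothetical and exist only for comparison.

```python
def steps_rebuilt(steps, changed):
    """Return the steps that must run again: the first changed step and all after it."""
    for index, step in enumerate(steps):
        if step in changed:
            return steps[index:]
    return []


# Only the application code changed between two builds.
changed = {"COPY ./app ./app"}

code_first = [
    "COPY ./app ./app",
    "COPY requirements.txt .",
    "RUN pip install --no-cache-dir -r requirements.txt",
]
code_last = [
    "COPY requirements.txt .",
    "RUN pip install --no-cache-dir -r requirements.txt",
    "COPY ./app ./app",
]

print(len(steps_rebuilt(code_first, changed)))  # dependencies are reinstalled too
print(len(steps_rebuilt(code_last, changed)))   # only the final COPY runs again
```

With the application code copied last, as in the Dockerfile used in this post, a code change leaves the pip install step cached.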



