Every container starts from an image. If you’ve ever pulled something from Docker Hub, built a Dockerfile, or deployed a containerized app, you’ve already worked with one. But what is a Docker image, exactly, and why does it matter for how you build and ship software?
A Docker image is the foundation of container-based development. It packages your code, dependencies, and runtime into a single portable unit that runs the same way everywhere.
This article breaks down how Docker images work, from layers and base images to registries, tagging strategies, security scanning, and size optimization. Whether you’re just getting started with containerization or looking to tighten your CI/CD workflow, you’ll walk away with a clear, practical understanding of what sits behind every docker run command.
What Is a Docker Image

A Docker image is a read-only template that contains everything needed to run an application inside a container. That includes the application code, runtime, system tools, libraries, and configuration files, all bundled into a single portable unit.
You build a Docker image from a set of instructions written in a file called a Dockerfile. Once built, that image can be stored in a container registry like Docker Hub and pulled onto any machine running the Docker Engine.
The image itself doesn’t run. It’s the blueprint. When you launch an image, Docker creates a container from it, which is the actual running instance of that application. Think of the image as the recipe and the container as the dish you just cooked.
Stack Overflow’s 2025 Developer Survey found Docker reached 71.1% overall adoption, the largest single-year jump of any technology measured. That’s not a niche tool anymore. It’s standard infrastructure.
And the numbers behind image distribution are wild. Docker Hub has recorded over 318 billion all-time image pulls, a 145% increase year-over-year according to the Docker Index.
Docker Image vs. Docker Container
This trips up a lot of people, and I still see experienced developers mix up the terms.
A Docker image is static. It doesn’t change, doesn’t execute, and doesn’t hold state. A Docker container is what happens when you run that image. The container gets a writable layer on top of the image’s read-only layers, and that’s where runtime data lives.
You can spin up dozens of containers from the same image. Each one runs independently with its own writable layer, but they all share the same underlying image layers. That’s what makes containerization so resource-efficient compared to running full virtual machines.
| Attribute | Docker Image | Docker Container |
|---|---|---|
| State | Read-only, immutable | Writable, ephemeral |
| Purpose | Blueprint / template | Running instance |
| Storage | Stored in registries | Exists on host at runtime |
| Lifecycle | Built once, reused many times | Created, started, stopped, removed |
How a Docker Image Relates to a Dockerfile
A Dockerfile is just a plain text file with build instructions. Each instruction (FROM, RUN, COPY, CMD) tells Docker what to include and how to configure the image.
You run docker build against that Dockerfile. Docker reads each instruction line by line, executes it, and stacks the result as a new layer in the image. The final output is your Docker image, tagged and ready to push or run.
Without a Dockerfile, there’s no image. Without the image, there’s no container. The whole chain starts with that text file in your codebase.
How Docker Image Layers Work

Every Docker image is made up of stacked, read-only layers. Each layer corresponds to a Dockerfile instruction that changes the filesystem (RUN, COPY, ADD); instructions like CMD or ENV only add metadata. Understanding this structure is what separates people who build efficient images from those shipping 1.5GB monsters to production.
When you write a RUN apt-get install curl instruction, Docker executes that command and captures the filesystem changes as a new layer. The next instruction creates another layer on top. These layers are stacked using a union file system (OverlayFS on most Linux setups) that merges everything into a single coherent filesystem.
Layer Caching and Build Performance
Caching is where layers really pay off. Docker checks whether an instruction has changed since the last build. If it hasn’t, Docker reuses the cached layer instead of rebuilding it.
This is why Dockerfile instruction order matters so much. Put instructions that change frequently (like COPY . . for your application code) near the bottom. Put instructions that rarely change (like installing system dependencies) near the top.
I’ve seen teams cut their build times from 8 minutes to under 90 seconds just by reordering their Dockerfile instructions. Not by changing anything about the actual application. Just by putting the volatile stuff last so Docker can cache everything above it.
One common mistake: using separate RUN instructions for every single command. Each one creates a new layer. Combining related commands into a single RUN instruction keeps your layer count (and image size) under control.
The Writable Container Layer
When a container starts, Docker adds a thin writable layer on top of the image’s read-only layers. All runtime changes (new files, modified configs, log output) go into this writable layer.
Delete the container, and that writable layer disappears. The underlying image stays untouched. That’s what makes images reusable and containers disposable.
This separation is also why Docker volumes exist. If your container writes data you actually want to keep (like database files), you mount a volume that persists independently of the container lifecycle.
How to Build a Docker Image

Building a Docker image comes down to three things: write a Dockerfile, run docker build, and tag the result. The complexity lives in how well you write that Dockerfile.
Contrary Research reports that over 17 million developers use the Docker platform globally as of 2024. Most of them are building images daily as part of their software development process.
Common Dockerfile Instructions and What They Do
FROM sets the base image. Every Dockerfile starts here. FROM node:20-alpine tells Docker to use the Node.js 20 runtime on Alpine Linux as the foundation.
RUN executes commands during the build process. Installing packages, compiling code, creating directories. Each RUN creates a new layer.
COPY and ADD bring files from your local machine (the build context) into the image. COPY is the straightforward one. ADD has extra features like auto-extracting archives, but COPY is preferred for most cases.
CMD defines the default command when a container starts from the image. ENTRYPOINT does something similar but is harder to override at runtime. Most Dockerfiles use one or both.
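To make the CMD and ENTRYPOINT difference concrete, here is a minimal sketch. It assumes a hypothetical main.py that accepts command-line flags; the file name and the --help and --verbose flags are placeholders, not something from a real project.

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY main.py .

# ENTRYPOINT fixes the executable. It only changes if you pass
# --entrypoint to docker run.
ENTRYPOINT ["python", "main.py"]

# CMD supplies default arguments. Anything you type after the image
# name in docker run replaces these.
CMD ["--help"]
```

With this image tagged as myapp, docker run myapp would execute python main.py --help, while docker run myapp --verbose would execute python main.py --verbose. The entrypoint stays put and only the arguments change.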
Here’s a basic example that ties it together:
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "main.py"]
```
Notice the order. Requirements get copied and installed first. Application code comes after. That way, Docker caches the dependency installation layer until requirements.txt actually changes.
Single-Stage vs. Multi-Stage Builds
A single-stage build uses one FROM instruction. Everything goes into one image, including build tools you don’t need at runtime.
Multi-stage builds changed that. You use multiple FROM statements in the same Dockerfile. The first stage compiles or builds your application. The second stage copies only the finished output into a clean, minimal image.
The difference is dramatic. Took me forever to figure out why a basic Go API image was 800MB until I realized the entire Go compiler was sitting in the final image. Multi-stage build brought it down to 12MB. Your mileage may vary, but reductions of 90% or more are common with this approach.
Multi-stage builds are especially useful in continuous integration pipelines where image transfer time directly affects deploy speed.
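A rough sketch of what a two-stage build can look like for a Go service. It assumes a Go module with go.mod, go.sum, and the main package at the repository root; the binary name api and the alpine:3.20 runtime tag are illustrative choices.

```dockerfile
# Stage 1: full Go toolchain, used only to compile.
FROM golang:1.23-alpine AS build
WORKDIR /src
# Copy module files first so dependency downloads stay cached
# until go.mod or go.sum changes.
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 produces a static binary with no libc dependency.
RUN CGO_ENABLED=0 go build -o /out/api .

# Stage 2: small runtime image that receives only the binary.
FROM alpine:3.20
COPY --from=build /out/api /usr/local/bin/api
ENTRYPOINT ["/usr/local/bin/api"]
```

Only the last stage ends up in the tagged image. The compiler, module cache, and source tree from the build stage are discarded.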
Base Images and Parent Images
Every Docker image starts from something. That something is either a base image or a parent image, and the distinction matters more than most tutorials let on.
What Counts as a Base Image
A true base image has no parent. In Docker, the only real base image is scratch, which is literally an empty filesystem. You'd use scratch for statically compiled binaries (Go programs, for instance) where no OS libraries are needed at all.
Everything else is technically a parent image. When your Dockerfile says FROM ubuntu:22.04, Ubuntu is the parent image your layers build on top of.
But the industry uses “base image” loosely to mean whatever sits in your FROM line. Fair enough. Just know that when someone says “choose a good base image,” they mean pick the right parent.
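For the scratch case, a Dockerfile can be as short as the sketch below. It assumes a statically linked Linux binary named hello already sits in the build context; the name is a placeholder.

```dockerfile
# scratch is an empty filesystem: no shell, no libc, no package manager.
FROM scratch
COPY hello /hello
# Exec form is required here because there is no /bin/sh to run a shell-form command.
CMD ["/hello"]
```

If the binary is dynamically linked, the container will fail at startup because the libraries it expects are not there.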
Choosing Between Alpine, Debian, and Distroless
This decision affects image size, security surface, and compatibility. There’s no single right answer, but there are clear tradeoffs.
| Base Image Type | Approximate Size | Best For | Tradeoff |
|---|---|---|---|
| Alpine Linux | ~5 MB | Lightweight services, microservices | Uses musl libc, some compatibility quirks |
| Debian Slim | ~80 MB | General-purpose apps | Larger than Alpine, but fewer surprises |
| Ubuntu | ~188 MB | Development, local testing | Full-featured but heavy for production |
| Distroless (Google) | ~20 MB | Production deployments | No shell, hard to debug |
Alpine’s tiny footprint makes it tempting for everything. But if your application relies on glibc (which many Node.js and Python packages do), you’ll hit weird runtime errors that eat your afternoon.
Gartner estimates that 95% of new digital workloads will be on cloud-native platforms by 2025, and most of those run containerized. Picking the right parent image for production is no longer an afterthought.
Official images on Docker Hub are maintained by Docker in partnership with upstream projects. Community images can be fine, but always check their update frequency and vulnerability status before pulling them into your build pipeline.
Docker Image Registries and Repositories
Once you build an image, it needs to go somewhere. That somewhere is a container registry, which is basically a storage and distribution system for Docker images.
A registry is the server that hosts images. A repository is a collection of related images inside that registry, usually grouped by name and differentiated by tags. So nginx is the repository, and nginx:1.25 is a specific tagged image within it.
Docker Hub and Public Registries

Docker Hub is the default. When you run docker pull nginx, Docker pulls from Hub unless you specify otherwise.
As of early 2024, Docker Hub hosts over 15 million repositories with 26 million monthly active IPs accessing them, according to Contrary Research. That makes it the largest public registry by a wide margin.
The 160+ Official Images on Docker Hub account for over 20% of all pulls. These are curated, regularly patched, and signed by Docker. For production use, official images should always be your first choice when available.
Private and Cloud-Provider Registries
Amazon ECR: tight integration with AWS services and IAM-based access control. Good if your infrastructure already lives on AWS.
Google Artifact Registry: replaced Google Container Registry. Works well with GKE and Cloud Build. Supports multi-format artifacts beyond just container images.
Azure Container Registry: integrates with Azure DevOps and AKS. Offers geo-replication for teams with global deployment needs.
GitHub Container Registry: tied directly to GitHub repositories and GitHub Actions. If your code lives on GitHub, this is the least-friction option for image storage.
Pushing and pulling images from private registries uses the same docker push and docker pull commands, just with the full registry URL prepended to the image name.
For teams managing many images across multiple environments, a private registry is non-negotiable. It gives you access control, vulnerability scanning integration, and audit trails that public registries can’t match at the organizational level.
Docker Image Tags and Versioning

Tags are how you tell Docker which version of an image you want. How you assign them matters more than it sounds, because getting this wrong leads to production incidents that are really annoying to debug.
Why the “latest” Tag Causes Problems
The latest tag doesn't mean "most recent." It's just the default tag Docker applies when you don't specify one. If a maintainer pushes version 2.3 as latest and then separately pushes version 2.4 without the latest tag, latest still points to 2.3.
In production environments, relying on latest means you have no control over what actually gets deployed. Two servers pulling "the same image" at different times might end up running completely different versions.
Pin your tags. Always.
Tagging Strategies That Actually Work
Semantic versioning is the most common approach. Tag your images as myapp:1.4.2 so anyone looking at the tag knows exactly what's running.
Some teams tag by Git commit hash. Others use build numbers from their CI system. The method matters less than the consistency. Pick a strategy, write it into your software development plan, and enforce it.
For maximum reproducibility, use SHA256 image digests instead of tags. A digest like myapp@sha256:a1b2c3… points to one exact image that can never change. Tags are mutable (anyone with push access can retag), but digests are immutable.
The Docker 2025 State of App Dev report found that 92% of IT industry respondents use containers. At that scale, loose tagging practices don’t just cause confusion. They cause outages.
If your team uses continuous deployment, automating image tagging inside the deployment pipeline removes human error from the equation. Tag, push, deploy. No manual steps.
How to Reduce Docker Image Size

Image size directly affects how fast your containers deploy, how much storage you burn, and how wide your attack surface gets. A bloated image is slow to pull, slow to start, and carries libraries nobody asked for.
Production images should aim for under 500 MB. Microservices images should stay under 200 MB. IoT and edge deployments need images under 100 MB.
The good news: most of the bloat is fixable. A few Dockerfile changes can cut image sizes by 90% or more without touching your application code.
Choosing Minimal Base Images
Alpine Linux weighs roughly 5 MB. Compare that to the default Ubuntu image at 188 MB. That’s a 97% reduction before you’ve added a single line of application code.
Google’s distroless images sit around 20 MB and strip out everything, including the shell. No package manager, no debugging tools, nothing an attacker can use to poke around. Perfect for production. Terrible for troubleshooting.
If your app relies on glibc, Debian Slim (~80 MB) avoids the musl compatibility headaches that Alpine sometimes introduces with Python and Node.js packages.
Using Multi-Stage Builds Effectively
One practical case: a Go API image built with a standard Dockerfile came out to 277 MB. After switching to a multi-stage build (compiling in golang:1.23-alpine, then copying the binary to a clean Alpine runtime), the final image dropped to 9 MB. That's a 96.75% reduction.
The pattern is straightforward:
- Stage 1 installs build tools, compiles code, runs tests
- Stage 2 starts from a minimal image and copies only the finished artifact
Build tools, compilers, dev dependencies, test frameworks. None of that belongs in your final image. Multi-stage builds make sure it stays out.
Other Size Reduction Techniques
Combine RUN commands: each RUN creates a layer. Three separate apt-get commands mean three layers of overhead. Chain them with && into one.
Use .dockerignore: without it, your entire build context (including node_modules, .git, test fixtures) gets sent to the Docker daemon. That slows builds and can accidentally leak sensitive files.
Clean up in the same layer: if you install packages with apt-get, run apt-get clean in the same RUN instruction. Cleaning in a separate layer doesn't actually remove the data from the previous layer.
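Here is a small sketch of the combine-and-clean pattern on a Debian-based image. The packages (curl, ca-certificates) are just examples.

```dockerfile
FROM debian:bookworm-slim

# Update, install, and clean up in one RUN so the package index
# and cache never get committed to a layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl ca-certificates \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
```

Splitting this into separate RUN instructions would leave the downloaded package lists in an earlier layer, and the later cleanup would only hide them rather than shrink the image.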
Tools like Dive let you inspect each layer of your image to see exactly where the size is coming from. Took me about ten minutes with Dive to find 400 MB of cached pip downloads hiding in one of our Python images.
Docker Image Security

A NetRise 2024 study found that commonly used Docker Hub containers contained an average of 604 known vulnerabilities, with over 45% of them more than two years old. That’s not a theoretical risk. That’s what you’re pulling into your infrastructure every time you run docker pull without checking.
Red Hat’s 2024 State of Kubernetes Security report paints a similar picture: two-thirds of organizations delayed app deployment because of container security concerns, and 46% experienced revenue or customer loss from security incidents.
Vulnerability Scanning Tools
| Tool | Type | Best For |
|---|---|---|
| Trivy | Open-source (Aqua Security) | Fast CI/CD scanning, broad coverage |
| Snyk | Commercial platform | Developer workflow integration, auto-fix PRs |
| Docker Scout | Built into Docker | Quick checks during local development |
| Grype | Open-source (Anchore) | Lightweight scanning, SBOM generation |
Trivy averaged 12 seconds per image scan in benchmark tests, roughly 3x faster than Clair, with a 94.5% detection rate across OS packages and application dependencies.
Snyk was named a Gartner Magic Quadrant Leader for Application Security Testing in 2025. Its reachability analysis filters out noise by checking whether vulnerable code paths are actually called by your application, cutting alert volume by 30-70%.
Best Practices for Image Security
Never run containers as root. Define a non-root user in your Dockerfile with USER. If an attacker breaks into the container, they inherit whatever privileges it runs with.
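A minimal sketch of the non-root setup on a Debian-based Python image. The user and group name app and the main.py entry point are hypothetical.

```dockerfile
FROM python:3.12-slim

# Create an unprivileged system user and group for the application.
RUN groupadd --system app \
    && useradd --system --gid app --no-create-home app

WORKDIR /app
# Hand ownership of the application files to that user.
COPY --chown=app:app . .

# Every instruction after this point, and the container itself, runs as app.
USER app
CMD ["python", "main.py"]
```

Anything that needs root (installing packages, creating system directories) has to happen before the USER line.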
Avoid unverified third-party images. Official images on Docker Hub are maintained and patched by Docker in partnership with upstream maintainers. Random community images? You don’t know who built them or when they were last updated.
Sign your images with Docker Content Trust or cosign to verify they haven’t been tampered with between the registry and your runtime environment. Medplum, a healthcare platform, adopted Docker Hardened Images (non-root by default) to cut CVE noise and strengthen HIPAA/SOC 2 compliance.
Docker Images in CI/CD Pipelines

Docker images are the unit of deployment in modern CI/CD. You build them, test them, scan them, and push them to a registry. Then your orchestration system pulls and runs them. Every step is automated.
GitHub Actions runs over 5 million workflows daily, and Docker/Kubernetes usage within those workflows has grown by 40% year-over-year, according to GitHub statistics. CI/CD has standardized on containers as the packaging format.
Building and Pushing Images in CI
The typical flow in a build server looks like this:
- Check out code from the Git repository
- Run docker build with the commit SHA as the image tag
- Push the tagged image to a private registry
GitHub Actions, GitLab CI, and Jenkins all have native support for this workflow. GitHub Actions even has a dedicated docker/build-push-action that handles multi-platform builds and registry authentication in one step.
Teams that use automated CI/CD pipelines report an average 48% faster release cycle from commit to production. Containerized test runners cut feedback times by 35-60% because tests run in parallel across isolated environments.
Caching Image Layers in CI
Layer caching is the difference between a 2-minute build and a 15-minute build.
Most CI platforms support Docker layer caching natively or through BuildKit’s cache export/import. The idea is the same as local builds: if a layer hasn’t changed, don’t rebuild it.
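GitHub Actions supports caching through the actions/cache action or BuildKit's inline cache. On GitLab CI, jobs that use the Docker-in-Docker service start with an empty layer cache, so the usual approach is to pull the previously pushed image and pass it to docker build with --cache-from.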
Security Scanning as a Pipeline Gate
The smartest teams don’t just scan images. They block deployments when scans fail.
A typical gate looks like: run Trivy or Snyk after the build step, fail the pipeline if any critical or high-severity CVE is found, and generate a report as a build artifact for audit trails.
This fits into the broader DevOps practice of shifting security left, catching problems during development instead of after deployment. The collaboration between dev and ops teams is what makes this work in practice. Without shared ownership, security scanning either gets skipped or becomes a bottleneck.
Common Docker Image Commands

You don’t need to memorize every flag. But these are the commands you’ll run every day when working with Docker images. Bookmark this or keep it in a terminal snippet.
Building and Tagging
docker build -t myapp:1.0 . builds an image from the Dockerfile in the current directory and tags it as myapp:1.0.
docker tag myapp:1.0 registry.example.com/myapp:1.0 adds a registry prefix so you can push the image to a specific container registry.
BuildKit (enabled by default since Docker 23.0) gives you parallel layer execution and better caching. If you’re on an older version, set DOCKER_BUILDKIT=1 before running builds.
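One BuildKit caching feature worth knowing is the cache mount. The sketch below assumes a Python project with a requirements.txt and a main.py; the target path is pip's default cache location for the root user.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .

# The cache mount keeps pip's download cache between builds
# without storing it in an image layer.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

COPY . .
CMD ["python", "main.py"]
```

When requirements.txt changes, the install step reruns, but packages that were already downloaded come from the mounted cache instead of the network. The cache lives on the build host, so a fresh CI runner starts empty.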
Pulling and Pushing
| Command | What It Does |
|---|---|
| docker pull nginx:1.25 | Downloads an image from the registry |
| docker push myapp:1.0 | Uploads an image to the registry |
| docker images | Lists all locally stored images |
| docker image inspect myapp:1.0 | Shows detailed metadata (layers, config, size) |
When you run docker pull, Docker only downloads layers you don't already have locally. Shared base layers between images get reused. That's why pulling a second image that uses the same base is much faster than the first.
Cleanup and Maintenance
Images pile up fast if you don’t clean them. Especially on CI runners and local development machines.
docker rmi myapp:1.0 removes a specific image. docker image prune removes all dangling images (untagged images that no tagged image references). Add -a to remove all unused images, not just dangling ones.
docker save -o myapp.tar myapp:1.0 exports an image to a tar archive for offline transfer. docker load -i myapp.tar imports it on another machine. Useful for air-gapped environments where you can’t pull from a registry.
If you’re managing images as part of a larger software configuration management strategy, automated cleanup policies on your registry matter just as much as local pruning. Most private registries support retention rules that delete images older than a set number of days or keep only the last N tags per repository.
FAQ on What Is a Docker Image
What is a Docker image in simple terms?
A Docker image is a read-only template that contains your application code, runtime, libraries, and configuration files. It’s the blueprint Docker uses to create containers. The image itself doesn’t run. It just defines what runs.
What is the difference between a Docker image and a container?
An image is static and immutable. A Docker container is the running instance created from that image. You can launch multiple containers from one image, each with its own writable layer on top.
Where are Docker images stored?
Locally, Docker stores images on your machine’s filesystem managed by the Docker Engine. Remotely, images live in a container registry like Docker Hub, Amazon ECR, or Google Artifact Registry. You push and pull between the two.
What is a Dockerfile?
A Dockerfile is a plain text file with build instructions. Each line (FROM, RUN, COPY, CMD) tells Docker what to include in the image. You run docker build against it to produce the final image.
What are Docker image layers?
Each filesystem-changing instruction in a Dockerfile (RUN, COPY, ADD) creates a read-only layer. Layers stack on top of each other using a union file system like OverlayFS. Docker caches unchanged layers to speed up rebuilds.
What is a base image?
A base image is the starting point defined in your Dockerfile’s FROM instruction. Common choices include Alpine Linux, Debian Slim, and Ubuntu. The base image directly affects your final image size and security surface.
How do I reduce Docker image size?
Use multi-stage builds to separate build tools from the final artifact. Choose minimal base images like Alpine. Combine RUN commands and add a .dockerignore file to exclude unnecessary files from the build context.
How do I pull a Docker image?
Run docker pull followed by the image name and tag. For example, docker pull nginx:1.25 downloads that specific version from Docker Hub. Without a tag, Docker defaults to latest.
Are Docker images secure?
Not automatically. Images can contain known vulnerabilities in their dependencies. Scan images with tools like Trivy or Snyk before deployment. Use official images, avoid running as root, and keep base images updated.
Can I build a Docker image without Docker Desktop?
Yes. Docker Engine on Linux works without Docker Desktop. You can also use alternatives like Podman or BuildKit standalone. Cloud CI services like GitHub Actions and GitLab CI build images without any local Docker installation.
Conclusion
Understanding what a Docker image is gives you control over how applications get packaged, distributed, and deployed. It’s the core building block behind every container workflow, from local development to Kubernetes orchestration at scale.
The practical side matters most. Writing efficient Dockerfiles, picking the right base image, using multi-stage builds, and scanning for vulnerabilities before pushing to a registry. These aren’t optional steps. They’re what separates a reliable deployment from a slow, bloated, insecure one.
Tagging discipline, layer caching, and registry management all tie directly into how your team handles software release cycles and source control management.
Docker images aren’t going anywhere. With container adoption still climbing across every industry, knowing how to build, optimize, and secure them is a baseline skill for any developer or operations engineer working with modern infrastructure.



