How to Build and Push Production-Ready Container Images to Amazon ECR
Introduction: “It builds” is not the same as “it’s ready”
I remember the first time I pushed a container image to Amazon Elastic Container Registry (ECR). I was so excited; it felt like one of my greatest feats. It was straightforward: I wrote a Dockerfile, ran docker build, tagged the image, and pushed it. I confirmed the image existed, went into my EC2 instance, pulled it, and ran it.
Well, years later, I realised that things were only that straightforward because I was doing it wrong. The problems didn’t happen in one day or in one project. They appeared later, in several incidents:
- I once found that we had an image in our production environment with a critical vulnerability. I discovered this after reading a newsletter from a DevOps blog.
- A team at a cryptocurrency startup I joined had their secrets baked into the image layers.
- A CI/CD failure taught me that the image I build locally on my laptop must be consistent with the one built for the production environment.
- We had a service deployed to Kubernetes that refused to update because we had set imagePullPolicy to IfNotPresent while using the latest tag.
There is a very big difference between an image that works and an image that is production-ready, and Amazon ECR is more than just a place where we store images.
In reality, Amazon ECR sits at the center of the software delivery chain. How one builds and pushes images into ECR determines the security, reliability, and traceability of everything that runs after that.
In this article, I walk you through how I approach ECR as part of a disciplined production workflow instead of just a container storage service.
What Amazon ECR Is - and What It’s Not
Before we dive into how to use ECR correctly, let’s look at the role it plays.
Amazon Elastic Container Registry is:
- A private container image registry provided by AWS
- Integrated with AWS IAM for authentication and authorisation
- Capable of scanning images for known vulnerabilities
- Regionally scoped and tightly coupled to AWS infrastructure
What ECR does not do
- It does not build images for you
- It does not enforce good Dockerfile practices
- It does not prevent you from pushing unsafe or poorly designed images
- It doesn’t replace good CI/CD practices
ECR will faithfully store whatever you give it. That means it is your responsibility, not ECR’s, to make sure an image is safe and ready for production.
What “Production-Ready” Actually Means for Container Images
The phrase “production-ready” is often used loosely to mean functional or feature-complete software that works in basic demos. In our context, it means an image satisfies a few non-negotiable standards.
A production-ready image should meet the following standards:
- It should be reproducible: the same inputs should always produce the same image. This requires more than good intentions: pinned base images, deterministic dependency installation, and an awareness that package repositories, timestamps, and network access can quietly undermine reproducibility.
- Minimal: It should contain only what is required to run the application; everything else should be stripped out. For example, a React app image should not carry node_modules, package.json, or lock files - when possible, ship only the static files from a build.
- Secure by default: It does not run as root, does not embed secrets, and minimizes attack surface.
- Immutable: Once built and tagged, it is never modified in place.
- Traceable: You can tell when, how, and from which code it was built.
Everything else - performance, scalability, cost - builds on these foundations.
Designing an Image Strategy Before Writing a Dockerfile
A common mistake is starting with the Dockerfile and figuring out strategy later. That almost always leads to cleanup work.
An image strategy is the set of decisions about how images will be built, secured, and optimized, made before any Dockerfile is written: how to minimize image size, speed up builds, and reduce attack surface. Making those decisions up front is what keeps the resulting images lightweight and production-ready.
Before building anything, decide a few things clearly.
- One Image, One Responsibility: A container image should do one thing well. If an image needs to be configured differently per environment, that configuration should happen at runtime, not build time.
This allows:
- The same image to run in dev, staging, and production
- Clear rollback paths
- Predictable behavior
When the images differ per environment, comparing them becomes meaningless. It also becomes harder to trace failures: with a single shared image, if something fails in dev, we fix it before it ever reaches prod.
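To make this concrete, here is a sketch of runtime configuration, assuming a hypothetical my-service image and an APP_ENV variable the application reads at startup:

```shell
# Same immutable image everywhere; only runtime configuration differs.
# Image name, tag, and APP_ENV are illustrative.
docker run -e APP_ENV=staging    my-service:1.4.2   # staging
docker run -e APP_ENV=production my-service:1.4.2   # production
```

The artifact deployed to production is byte-for-byte the one tested in staging; only the environment around it changes.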
- Tagging is a Promise: Tags aren’t just labels - they’re commitments to what users get.
Use a clear strategy:
- Fixed version tags (like 1.4.2 or git-sha) that never change.
- Optional rolling tags (main, stable) for ease.
Avoid:
- Pushing “latest” to production.
- Retagging rebuilt images.
Changing a deployed tag breaks trust, complicates debugging, and makes post-incident analysis unreliable.
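ECR can enforce the “never retag” rule for you. As a sketch (the repository name is illustrative), enabling tag immutability makes the registry reject any push that would overwrite an existing tag:

```shell
# Reject any push that would overwrite an existing tag
aws ecr put-image-tag-mutability \
  --repository-name my-service \
  --image-tag-mutability IMMUTABLE
```

With this set, a rebuilt image simply cannot reuse an old tag, so every tag keeps pointing at exactly what it pointed at on day one.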
Writing a Production-Grade Dockerfile
The Dockerfile is not just a build script. It is a security and operational boundary.
Base Image Selection
Base images are inherited risk - they carry whatever vulnerabilities, bloat, or outdated packages the original creator baked in. Your choice of a base image can affect the overall outcome of your image build process.
Some practical guidelines:
- Prefer official images or well-maintained minimal images
- Understand the trade-offs:
- Alpine is small, but can cause compatibility issues
- Distroless images reduce attack surface, but require maturity
- Pin base image versions explicitly. Instead of FROM node:alpine (unpinned), use FROM node:20-alpine3.19 (pinned). An unpinned base image introduces invisible change, even when your own code has not changed.
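Note that even a pinned tag can be re-pointed by the upstream publisher; pinning by digest removes that possibility entirely. A sketch (the digest is a placeholder, not a real value):

```dockerfile
# Tag pinning: predictable, but the tag could still be re-pointed upstream
FROM node:20-alpine3.19

# Digest pinning: fully immutable reference (placeholder digest)
# FROM node:20-alpine3.19@sha256:<digest-of-the-image-you-verified>
```

Digest pinning trades convenience for certainty: updates become deliberate edits instead of silent drift.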
Multi-Stage Builds
A container image is built in stages. Each stage produces files, binaries, or artifacts that can be passed forward, but the next stage does not need to inherit everything that came before it.
This distinction is important.
By default, a single-stage Dockerfile accumulates everything: compilers, package managers, temporary files, and runtime dependencies all end up in the final image - even if they were only needed briefly during the build.
A multi-stage Dockerfile separates concerns explicitly. A typical structure looks like this:
- A build stage that includes: compilers, build tools, and dependency managers
- A runtime stage that receives: only the compiled binaries or required artifacts, and the libraries needed to execute the application
Everything else is left behind.
The runtime image does not need:
- The compiler that produced the binary
- The package manager used to install dependencies
- Temporary files created during the build
As a result, multi-stage builds:
- Reduce image size: Smaller images are faster to pull, faster to deploy, and simpler to reason about during incidents.
- Remove build tools from production: If a shell, compiler, or package manager is not present, it cannot be abused. Entire attack paths disappear simply because the tools are not there.
- Limit the blast radius of vulnerabilities: Vulnerabilities in build-time dependencies cannot be exploited at runtime if those dependencies never make it into the final image.
This separation mirrors a broader principle: production environments should contain only what they need to run, not what was needed to create them.
Multi-stage builds make that principle enforceable, not aspirational.
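As a sketch, here is what that structure looks like for the React-style app mentioned earlier: build tooling in the first stage, only static files and a web server in the second. The image versions and the /app/dist output path are assumptions:

```dockerfile
# --- Build stage: compilers, package managers, dev dependencies ---
FROM node:20-alpine3.19 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build            # emits static files into /app/dist

# --- Runtime stage: only the artifacts and what serves them ---
FROM nginx:1.25-alpine
COPY --from=build /app/dist /usr/share/nginx/html
```

The final image contains nginx and the static files - no Node.js, no npm, no source code.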
Running as Non-Root
Containers often run as root by default - not because they need to, but because it is the easiest option.
When an application runs as root inside a container, it has full control inside that container. If something goes wrong, such as a bug in the application, a vulnerable library, or a misconfiguration, that power can be abused. In the worst case, an attacker can use that foothold to access the host machine itself or interfere with other workloads running nearby.
Containers are meant to isolate applications, but that isolation is not absolute. Running as root increases the impact of mistakes when the boundary is crossed.
A production image should:
- Define a non-root user
- Switch to that user explicitly
- Ensure file permissions are correct for runtime access
This does not make containers “secure,” but it removes an entire class of avoidable risk.
When something goes wrong, running as non-root limits how far the failure can spread.
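A minimal sketch of what this looks like in a Dockerfile, assuming a Node.js service (the official Node images already ship a non-root node user):

```dockerfile
FROM node:20-alpine3.19
WORKDIR /app

# Files owned by the runtime user, so the app can read what it needs
COPY --chown=node:node . .

# Switch away from root before the application ever starts
USER node

CMD ["node", "server.js"]
```

For base images without a prebuilt user, create one explicitly (for example with adduser on Alpine) before the USER instruction.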
Layer Hygiene
Every instruction in a Dockerfile creates a new image layer. Each layer is kept, cached, and shipped, even if it only existed to support a temporary step during the build.
Good layer hygiene means being intentional about what actually ends up in the final image:
- Combine related commands so temporary steps do not leave permanent traces
- Remove temporary files and caches once they are no longer needed
- Use .dockerignore to prevent unnecessary files from ever entering the build
Large images are not just slower to deploy. They carry more unused files, tools, and libraries - each one another thing that can fail, leak information, or be exploited when something goes wrong.
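As an illustration of the first point, on a Debian-based image the difference looks like this (the package names are arbitrary):

```dockerfile
# Bad: three layers; files deleted in the last step still exist
# in the earlier layers and still ship with the image
# RUN apt-get update
# RUN apt-get install -y curl
# RUN rm -rf /var/lib/apt/lists/*

# Good: one layer; the cache is removed before the layer is committed
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
```

Because each RUN commits a layer, cleanup only shrinks the image when it happens in the same instruction that created the files.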
Building Images Correctly: Local vs CI
Local builds are useful for development.
They allow engineers to iterate quickly and test ideas. They should not be the source of truth for production images.
A healthy production workflow looks like this:
- Developers build locally for iteration
- CI systems build images that will be deployed
- CI builds are:
- Ephemeral
- Repeatable
- Logged and auditable
CI, however, is not automatically trustworthy. If build machines change over time, tools are updated without notice, or old dependency caches are reused, the same build can start producing different images without anyone realizing it.
When production images are built on laptops, or on loosely controlled CI runners, you can no longer reliably answer where an image came from or why it behaves the way it does. At that point, traceability is already gone.
Authenticating to Amazon ECR the Right Way
ECR authentication is often misunderstood because it looks like a Docker concern, but it is really an IAM concern.
At its core:
- Docker clients authenticate using temporary credentials
- Those credentials are issued based on IAM permissions
This means access to ECR is meant to be short-lived and role-based, not hard-coded. In practice:
- EC2, ECS, and EKS workloads should use IAM roles
- CI/CD systems should use OIDC-based role assumption
- Long-lived access keys should be avoided entirely
If AWS access keys are stored in CI environment variables, access has already been decoupled from identity and intent - and failures or leaks become much harder to contain.
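In practice, the short-lived handshake looks like this (the account ID and region are placeholders). The token is valid for 12 hours and is derived entirely from the caller’s IAM identity:

```shell
# Exchange the caller's IAM identity for a temporary registry token,
# then feed it straight to docker login (never stored on disk)
aws ecr get-login-password --region eu-west-1 \
  | docker login --username AWS --password-stdin \
    123456789012.dkr.ecr.eu-west-1.amazonaws.com
```

On EC2, ECS, EKS, or an OIDC-enabled CI runner, the aws CLI picks up role credentials automatically, so no access keys ever appear in the pipeline.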
Tagging and Pushing Images to ECR
Pushing an image to ECR is straightforward. What matters is what that image represents once it is there.
In a production workflow, an image is not just something you push - it is something you will later need to identify, audit, and possibly roll back.
A healthy flow looks like this:
- The image is built in CI, not on a developer’s machine
- The image is tagged with:
- A unique identifier tied to the source code (for example, a Git commit SHA)
- Optionally, a human-friendly version tag
- CI authenticates to ECR using a role
- The image is pushed
- Metadata about the build is recorded
This discipline exists for one reason:
You must be able to answer, at any time, exactly which image is running in production and where it came from.
If an image tag cannot be traced back to a specific build and a specific commit, it is no longer a reliable production artifact.
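Put together, a CI push step might look like this sketch - the repository name and version tag are illustrative, and the registry address is a placeholder:

```shell
# Identifiers come from the build context, not from a human
REGISTRY=123456789012.dkr.ecr.eu-west-1.amazonaws.com
SHA=$(git rev-parse --short HEAD)

# One build, two names: an immutable SHA tag and a human-friendly version
docker build -t "$REGISTRY/my-service:$SHA" .
docker tag "$REGISTRY/my-service:$SHA" "$REGISTRY/my-service:1.4.2"

docker push "$REGISTRY/my-service:$SHA"
docker push "$REGISTRY/my-service:1.4.2"
```

Whatever ends up in production, the SHA tag leads straight back to the commit that produced it.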
Image Scanning and Its Limits
Amazon ECR can scan images for known vulnerabilities. This is useful, but it is important to understand what scanning does - and what it does not do.
Image scanning can:
- Identify known vulnerabilities in operating system packages and libraries
It cannot:
- Understand how your application behaves
- Detect logical or configuration flaws
- Decide whether a vulnerability is exploitable in your environment
- Enforce fixes on its own
For this reason, scanning should be treated as one signal among many, not as proof of safety.
It works best as:
- A warning mechanism for serious issues
- A gate for clearly unacceptable risk
- A supplement to good base image selection and build practices
Relying on scanners alone shifts responsibility away from design decisions - and that is where most real failures begin.
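With basic scanning enabled, findings can be pulled into the pipeline as one of those signals. A sketch, with an illustrative repository and tag:

```shell
# Trigger a scan for a specific image...
aws ecr start-image-scan \
  --repository-name my-service \
  --image-id imageTag=1.4.2

# ...then fetch the findings, e.g. to fail the pipeline on CRITICAL issues
aws ecr describe-image-scan-findings \
  --repository-name my-service \
  --image-id imageTag=1.4.2
```

The describe call returns severity counts, which makes it straightforward to gate on “no CRITICAL findings” while leaving judgment calls to humans.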
Operating ECR in Real Systems
Once images are built and pushed regularly, operational concerns start to matter.
Lifecycle Policies
Without lifecycle policies, container registries tend to grow quietly and indefinitely. Over time, old images accumulate, storage costs rise without visibility, and engineers struggle to identify which images are still relevant during incidents. Lifecycle policies exist to impose order.
They should retain recent images, preserve images that are currently deployed or referenced, and remove artifacts that are no longer used. The goal is not aggressive cleanup. It is being able to reason clearly when something breaks.
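As a sketch, a single rule like this (repository name and count are illustrative) already prevents unbounded growth:

```shell
aws ecr put-lifecycle-policy \
  --repository-name my-service \
  --lifecycle-policy-text '{
    "rules": [{
      "rulePriority": 1,
      "description": "Expire all but the 30 most recent images",
      "selection": {
        "tagStatus": "any",
        "countType": "imageCountMoreThan",
        "countNumber": 30
      },
      "action": { "type": "expire" }
    }]
  }'
```

One caveat: count-based rules cannot tell which images are currently deployed, so pair them with generous counts or tag-prefix rules that protect release tags.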
Cross-Account Patterns
In more mature environments, images are built in one account, and deployed from other accounts. This separation forces clarity around ownership and trust.
To make this work safely, repository access must be explicit, deployment accounts should not be able to mutate images, and trust should be intentional and limited.
A common rule is: build once, consume read-only. When deployment environments can only pull immutable images - not rebuild or retag them - the integrity of the supply chain is preserved across accounts.
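One way to express “consume read-only” is a repository policy in the build account that grants a deployment account pull permissions and nothing else. A sketch with placeholder account IDs:

```shell
aws ecr set-repository-policy \
  --repository-name my-service \
  --policy-text '{
    "Version": "2012-10-17",
    "Statement": [{
      "Sid": "AllowPullFromDeployAccount",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::210987654321:root" },
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ]
    }]
  }'
```

Because no push or delete actions are granted, the deployment account can consume images but can never alter what the build account produced.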
Common Mistakes to Avoid
Over the years, there are some mistakes I have seen appear repeatedly in production systems:
- Deploying the latest tag to production
- Baking secrets into images
- Rebuilding images separately for each environment
- Giving CI unrestricted access to ECR
- Treating image size as an aesthetic concern
These issues often remain invisible during normal operation. They surface during incidents, when clarity matters most.
Let’s Recap What We Learnt
A production-ready ECR workflow is not complicated, but it is intentional.
It means:
- Writing Dockerfiles that include only what is needed
- Building images in controlled, repeatable CI environments
- Using short-lived, role-based authentication
- Tagging images immutably
- Managing images with the expectation that they will be inspected later
- Deploying the same artifact across environments
This is less about tools and more about discipline.
Conclusion
A container image is not just a file stored in a registry.
It is a promise:
- That what was tested is what is running
- That changes are intentional
- That failures can be understood after the fact
Amazon ECR gives you a place to store images. Whether those images deserve trust depends entirely on how they were built, tagged, and pushed.
Production readiness is rarely about adding more steps. It is about removing ambiguity - and keeping it removed.



