25 Docker & K8s Questions That Fail Candidates (2026)

Why Every Senior Backend Interview Now Includes Infrastructure Questions

The era of the backend engineer who "just writes APIs" is over. In 2026, senior backend engineering roles at companies scaling to 10M+ requests per day require candidates who can confidently discuss container orchestration, cloud networking, resource optimization, and production incident resolution. The proliferation of Kubernetes as the default deployment substrate means that even engineers who've never touched a Dockerfile on their laptop will be asked to debug a CrashLoopBackOff or explain the difference between a Deployment and a StatefulSet in an interview.

This guide covers the 25 most commonly asked Docker, Kubernetes, and AWS ECS questions across senior backend engineering loops at companies ranging from FAANG to fast-growing cloud-native startups. Each answer goes beyond surface-level definitions to provide the production context that signals real operational experience to interviewers.

Section 1: Docker Deep Dive (Questions 1–8)

1. What is the practical value of Docker Multi-Stage Builds, and what does a production-grade example look like?

In a naive Dockerfile, the final image contains your build toolchain (npm packages, compilers, dev dependencies) alongside your production application code. For a typical Next.js application, this can produce images exceeding 1.2GB. Multi-stage builds use FROM ... AS <name> to declare multiple build stages, then copy only the compiled output into a minimal final image:

# Stage 1: Install dependencies and build
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
# npm ci already installs exactly what package-lock.json specifies
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: Production runtime (no devDependencies, no source code)
FROM node:22-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
COPY package*.json ./
# --omit=dev skips devDependencies in the runtime image
RUN npm ci --omit=dev
COPY --from=builder /app/.next ./.next
COPY --from=builder /app/public ./public
EXPOSE 3000
CMD ["npm", "start"]

This reduces the final image from ~1.2GB to ~180MB, an 85% reduction. In production, this means faster pull times during scaling events and far fewer installed packages exposed to supply-chain vulnerabilities.

2. What is the difference between COPY and ADD in a Dockerfile, and when should you use each?

COPY copies files from the build context to the image filesystem. It's predictable and explicit. ADD does everything COPY does plus:

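Automatically extracts local tar archives (including gzip, bzip2, and xz compressed ones) into the destination directory.
Supports fetching remote URLs (remote archives are not extracted, and pulling unverified content at build time hurts reproducibility and is a security risk).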

Best practice: Always use COPY unless you specifically need automatic tar extraction. Using ADD for simple file copying is considered a Dockerfile anti-pattern because it makes the build behavior less transparent.

3. Explain Docker layer caching and how to optimize your Dockerfile for maximum cache reuse.

Most Dockerfile instructions produce an immutable layer (RUN, COPY, and ADD add filesystem layers; the rest record metadata). Docker caches each layer keyed on the instruction itself and, for COPY and ADD, a checksum of the files being copied. When any layer is invalidated (because its instruction or content changed), all subsequent layers must be rebuilt. The most common mistake is copying the entire application source code before running npm install:

# ❌ BAD: Any source file change invalidates npm install cache
COPY . .
RUN npm install

# ✅ GOOD: package.json changes rarely — npm install cache is preserved
COPY package*.json ./
RUN npm install
# Source code changes don't affect the install cache layer
# (Dockerfile comments must sit on their own line)
COPY . .

This optimization alone can reduce CI/CD build times substantially (for example, from around 4 minutes to under 30 seconds) whenever only source files change between builds.

4. What is the difference between a Docker container and a Docker image?

A Docker image is an immutable, read-only template — a snapshot of a filesystem and configuration instructions. It is not running; it's a blueprint. A Docker container is a running instance of an image — an isolated process with its own writable layer (the container layer) on top of the read-only image layers. Multiple containers can share the same underlying image without duplicating the image layers; each container only adds a thin writable layer for its runtime modifications.

5. What are Docker health checks and why are they essential in production?

A Docker health check is an instruction that tells the Docker daemon to periodically probe the container to verify it's functioning correctly — not just running, but actually healthy:

HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1

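Without a health check, Docker only knows whether the container process is running. A container can be running while the application inside is deadlocked, out of memory, or unable to connect to its database. Orchestrators apply the same idea through their own configuration: Docker Swarm acts on the image's HEALTHCHECK, Kubernetes ignores it in favor of liveness and readiness probes, and ECS uses the healthCheck declared in the task definition. In each case the health signal lets the platform detect and replace unhealthy containers before they impact end users.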
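As a sketch of how the same check can be declared outside the image, a Docker Compose file can define it and gate a dependent service on the result. The service names, image tags, and /health endpoint below are assumptions for illustration. wget is used here because Alpine-based images ship BusyBox wget rather than curl.

```yaml
# Hypothetical Compose file; service and image names are illustrative only.
services:
  api:
    image: example/api:1.0.0          # assumed image name
    healthcheck:
      # Exits non-zero on connection failure or an HTTP error status
      test: ["CMD-SHELL", "wget -q --spider http://localhost:3000/health || exit 1"]
      interval: 30s
      timeout: 5s
      start_period: 15s
      retries: 3
  worker:
    image: example/worker:1.0.0       # assumed image name
    depends_on:
      api:
        condition: service_healthy    # start only after api reports healthy
```

Running docker compose ps afterwards shows the health state next to each service, which is a quick way to confirm the probe command actually exists in the image.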

6. Explain Docker networking modes: bridge, host, and overlay.

Bridge (default): Each container gets a private IP on a virtual network internal to the Docker host. On user-defined bridge networks, containers can reach each other by container name through Docker's embedded DNS; on the default bridge, only IP addresses work. Traffic leaving the host is NAT'd. Ideal for single-host development environments.
Host: The container shares the host machine's network namespace directly. There is no NAT and no virtual interface, so the container binds directly to host ports. This gives maximum network performance but no network isolation. Used in high-throughput scenarios like network monitoring tools.
Overlay: Creates a distributed virtual network spanning multiple Docker hosts (Swarm mode). Containers on different hosts can communicate as if on the same network. Kubernetes does not use Docker's overlay driver, but CNI plugins like Calico or Flannel solve the same cross-host problem for pod-to-pod communication, often with similar encapsulation such as VXLAN.
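The bridge and host modes above can be contrasted in a small Compose sketch. All service and image names are hypothetical, and the Postgres password is a placeholder.

```yaml
# Hypothetical Compose file contrasting bridge and host networking.
services:
  api:
    image: example/api:1.0.0        # assumed image name
    networks: [backend]             # user-defined bridge: resolvable as "api" via Docker DNS
    ports:
      - "8080:3000"                 # host port 8080 -> container port 3000 (NAT)
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example    # placeholder only
    networks: [backend]             # no published ports: reachable only inside "backend"
  monitor:
    image: example/netmon:1.0.0     # assumed image name
    network_mode: host              # shares the host network namespace; port publishing does not apply
networks:
  backend:
    driver: bridge
```

In this layout the api service would connect to the database at db:5432, while nothing outside the backend network can reach Postgres directly.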

7. What is a Docker init process and when do you need one?

By default, the first process in a container (PID 1) is responsible for signal handling and zombie process reaping. The kernel does not apply default signal handlers to PID 1, so if your application does not explicitly handle SIGTERM, Docker's graceful shutdown will fail and the container will be force-killed with SIGKILL after the timeout (10 seconds by default). Running with the --init flag, or using tini as your container's PID 1, ensures proper signal forwarding and zombie reaping:

FROM node:22-alpine
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "server.js"]
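If you would rather not bake tini into the image, the equivalent of the --init flag can be set in Compose. This is a minimal sketch; the image name is an assumption and the 30-second grace period is an arbitrary illustrative value.

```yaml
# Hypothetical Compose service using Docker's bundled init process.
services:
  api:
    image: example/api:1.0.0   # assumed image name
    init: true                 # run an init as PID 1 that forwards signals and reaps zombies
    stop_grace_period: 30s     # time allowed between SIGTERM and SIGKILL on stop
```

The application still needs to exit when it receives SIGTERM; the init process only guarantees that the signal reaches it.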

8. What are Docker volumes and why are they preferable to bind mounts in production?

Docker volumes are storage areas managed by the Docker daemon, stored under /var/lib/docker/volumes/ on Linux hosts. Bind mounts map a specific host filesystem path into the container. In production, prefer volumes over bind mounts: volumes are platform-agnostic (no host path dependency), perform better on macOS and Windows (no filesystem translation overhead), and can be backed up, migrated, or attached to new containers without knowing the host's directory structure.
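The difference is easiest to see side by side. The sketch below uses a named volume and leaves the bind-mount form as a comment; the service name, volume name, and password are placeholders.

```yaml
# Hypothetical Compose file: named volume versus bind mount.
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example              # placeholder only
    volumes:
      - pgdata:/var/lib/postgresql/data       # named volume, managed by Docker
      # - ./pgdata:/var/lib/postgresql/data   # bind mount: depends on this host path existing
volumes:
  pgdata: {}                                  # declared once, reusable by other services
```

Removing the container leaves pgdata in place, so a replacement container picks up the same data without any knowledge of the host layout.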

Section 2: Kubernetes Architecture & Operations (Questions 9–17)

9. Explain the full Kubernetes Pod Lifecycle with all possible states and their implications.

Understanding the Pod lifecycle is fundamental: interviewers use it to gauge whether candidates can actually debug production issues. Strictly speaking, a pod has five phases (Pending, Running, Succeeded, Failed, Unknown); the other entries below are container-level statuses that kubectl get pods shows in the STATUS column:

Pending: The pod has been accepted by the API server and persisted to etcd, but one or more containers are not yet running. This covers both time waiting to be scheduled and time spent pulling images. Common causes: insufficient CPU/memory on all nodes, PVC not bound, node affinity rules unmet.
Init: Init containers are running (shown as Init:N/M). Init containers must complete successfully before app containers start. Used for database migration checks, secret fetching, or config setup.
Running: The pod is bound to a node and at least one container is running, starting, or restarting. Note: Running does not mean healthy, because a container can be Running while failing its health check probes.
Succeeded: All containers exited with code 0. Typical for Job and CronJob resources.
Failed: All containers have terminated, and at least one exited with non-zero status or was killed by the system and will not be restarted (based on the restartPolicy).
Unknown: The pod's state could not be obtained, typically because the control plane lost contact with the node.
CrashLoopBackOff: The container is repeatedly crashing. Kubernetes applies exponential backoff: 10s, 20s, 40s, 80s, 160s, then caps at 5 minutes between restart attempts; the timer resets once a container has run for 10 minutes without problems. This is not a final state: it will continue restarting forever until manually deleted or the underlying issue is fixed.
OOMKilled: The container exceeded its memory limit and was killed by the kernel's OOM killer. You'll see this in kubectl describe pod under "Last State." The fix is typically increasing memory limits or fixing a memory leak.
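Several of these states can be tied to specific fields in a manifest. The following is a minimal sketch, assuming a Service named postgres exists in the same namespace; the pod name and application image are hypothetical.

```yaml
# Hypothetical Pod showing which fields drive the states described above.
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo            # hypothetical name
spec:
  restartPolicy: Always           # crashes are retried (CrashLoopBackOff) rather than ending in Failed
  initContainers:
    - name: wait-for-db
      image: busybox:1.36
      # The pod shows Init:0/1 until this command exits 0
      command: ["sh", "-c", "until nslookup postgres; do sleep 2; done"]
  containers:
    - name: app
      image: example/api:1.0.0    # assumed image name
      resources:
        requests:
          memory: 128Mi           # unschedulable requests keep the pod Pending
        limits:
          memory: 256Mi           # exceeding this gets the container OOMKilled
```

kubectl describe pod lifecycle-demo would then show the init container status, restart count, and last termination reason in one place.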

10. What is the difference between liveness probes, readiness probes, and startup probes?

Confusing these three probes is a common mistake, even among senior engineers:

Startup Probe: Only runs during initial startup. Liveness and readiness probes are held off until it succeeds, which gives the application time to initialize (e.g., warm up a cache, run DB migrations). If the probe still has not succeeded after failureThreshold consecutive failures, the container is killed and restarted. Set failureThreshold * periodSeconds to cover your worst-case startup time.
Liveness Probe: Runs continuously during the container's life. If it fails, Kubernetes kills and restarts the container. Use it to detect deadlocks or states the application cannot recover from on its own (e.g., an app stuck in an infinite loop).
Readiness Probe: Determines whether the pod should receive traffic from the Service's load balancer. If the readiness probe fails, the pod is removed from the Service's Endpoints — traffic stops being routed to it, but the container is NOT killed. Use it for temporary capacity constraints (e.g., the app is processing a large batch job and temporarily over capacity).

livenessProbe:
  httpGet:
    path: /healthz
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5
startupProbe:
  httpGet:
    path: /healthz
    port: 3000
  failureThreshold: 30   # Allow up to 30 * 10s = 5 minutes for startup
  periodSeconds: 10

11. Explain the difference between a Deployment, StatefulSet, and DaemonSet.

Deployment: Manages a set of identical, stateless pods. Pods are interchangeable — any pod can handle any request. Provides rolling updates, rollbacks, and scaling. Use for: web servers, API services, microservices.
StatefulSet: Manages pods with stable, persistent identities (<name>-0, <name>-1, <name>-2). Each pod can have a dedicated PersistentVolumeClaim (via volumeClaimTemplates) that follows it across rescheduling. By default, pods are created in order and deleted in reverse order. Use for: databases (Postgres, MongoDB), message brokers (Kafka), distributed caches (Redis Cluster), and other applications that need stable network identifiers.
DaemonSet: Ensures exactly one pod runs on every node (or a subset of nodes matching a node selector). Used for node-level concerns: log collectors (Fluentd), monitoring agents (Datadog, Prometheus node exporter), network plugins (Calico), security agents.
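What sets a StatefulSet apart is visible in two fields, serviceName and volumeClaimTemplates. The sketch below is illustrative only: names, storage size, and the password are assumptions, and it starts three independent Postgres instances without configuring replication.

```yaml
# Hypothetical headless Service plus StatefulSet.
apiVersion: v1
kind: Service
metadata:
  name: pg
spec:
  clusterIP: None                 # headless: gives each pod a stable DNS name
  selector:
    app: pg
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: pg
spec:
  serviceName: pg                 # pods resolve as pg-0.pg, pg-1.pg, pg-2.pg
  replicas: 3
  selector:
    matchLabels:
      app: pg
  template:
    metadata:
      labels:
        app: pg
    spec:
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              value: example      # placeholder; use a Secret in practice
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata   # subdirectory of the mounted volume
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:           # one PVC per pod: data-pg-0, data-pg-1, data-pg-2
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi         # illustrative size
```

A Deployment has no equivalent of volumeClaimTemplates: its pods would all share one claim or have none, which is why it suits stateless workloads.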

12. What is a Kubernetes HorizontalPodAutoscaler (HPA) and what metrics can it scale on?

HPA automatically scales the number of pod replicas in a Deployment or StatefulSet based on observed metrics. The classic trigger is CPU utilization:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 3
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # Scale up when avg CPU > 70%
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "1000"     # Custom metric: 1000 req/s per pod

Modern HPA configurations scale on custom metrics (via Prometheus Adapter) like request queue depth, active WebSocket connections, or GPU utilization — giving much more precise scaling signals than raw CPU percentage.

13. What are Kubernetes Resource Requests vs Limits, and what happens when they're misconfigured?

Requests: The minimum guaranteed resources the container will receive. Used by the scheduler to determine which node has sufficient capacity for the pod. The container is guaranteed these resources even under node pressure.
Limits: The maximum resources the container can use. If CPU usage exceeds the limit, the container is CPU-throttled (not killed). If memory exceeds the limit, the container is OOMKilled immediately.

Common misconfiguration patterns and their consequences:

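Setting limits much higher than requests overcommits the node and causes "noisy neighbor" problems: when several pods burst toward their limits at once, they contend for CPU and memory that the scheduler never reserved for them.
Not setting memory limits at all means a memory leak in one pod can exhaust node memory, triggering evictions or OOM kills of unrelated pods on that node.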
Setting CPU limits too aggressively causes CPU throttling, leading to high latency even when the node has spare capacity (a very common and hard-to-diagnose production issue).
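One conservative starting point, shown here as a sketch rather than a recommendation for every workload, is to set the memory limit equal to the memory request and leave generous CPU headroom. The pod name, image, and all numbers are illustrative assumptions.

```yaml
# Hypothetical Pod with explicit requests and limits.
apiVersion: v1
kind: Pod
metadata:
  name: resources-demo            # hypothetical name
spec:
  containers:
    - name: app
      image: example/api:1.0.0    # assumed image name
      resources:
        requests:
          cpu: 250m               # what the scheduler reserves on the node
          memory: 256Mi
        limits:
          cpu: "1"                # burst ceiling; usage above this is throttled, not killed
          memory: 256Mi           # equal to the request, so memory is never overcommitted
```

Because the memory limit matches the request, this container cannot be the one that pushes the node into memory pressure; the CPU limit is four times the request to reduce the chance of throttling under short bursts.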

14. Explain Kubernetes Network Policies and why they're essential for production security.

By default, all pods in a Kubernetes cluster can communicate with all other pods: a flat, open network. Network Policies act as pod-level firewall rules, controlling which pods can communicate with which other pods on which ports. They only take effect if the cluster's network plugin enforces them (Calico and Cilium do, for example):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-only-api-to-database
spec:
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api-server     # Only the api-server pod can talk to postgres
    ports:
    - protocol: TCP
      port: 5432

Without Network Policies, a compromised frontend pod could directly query your database. With Network Policies, lateral movement attacks are dramatically limited even if one pod is compromised.
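Allow rules like the one above are usually paired with a namespace-wide default deny, so that anything not explicitly permitted is blocked. A minimal sketch follows; the policy name is arbitrary.

```yaml
# Default-deny for inbound traffic to every pod in the namespace it is applied to.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}        # empty selector matches all pods in the namespace
  policyTypes:
    - Ingress            # no ingress rules are listed, so all inbound traffic is denied
```

Policies are additive, so the postgres allow rule continues to work alongside this one: a connection is permitted if any policy selecting the destination pod allows it.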

15–17: Helm, Istio, and GitOps — Advanced Interview Questions

15. What is Helm and why do teams use it over raw kubectl YAML? Helm is the de facto package manager for Kubernetes, templating YAML manifests with configurable values. Instead of maintaining environment-specific YAML files (different resource limits for dev vs prod, different replica counts), Helm charts parameterize all variable values and support version-controlled releases with rollback capability.
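As an illustration, a production override file might look like the following. Key names are whatever the chart defines; the ones below follow the scaffold generated by helm create, and the values and image repository are assumptions.

```yaml
# values-prod.yaml (hypothetical): overrides applied on top of the chart's default values.yaml
replicaCount: 6
image:
  repository: example/api       # assumed image repository
  tag: "1.4.2"                  # illustrative version
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    memory: 512Mi
```

It would be applied with something like helm upgrade --install api ./chart -f values-prod.yaml, while a values-dev.yaml with smaller numbers reuses the same templates unchanged.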

16. Explain Istio's role as a service mesh. In its sidecar mode, Istio injects an Envoy proxy into every meshed pod, transparently intercepting all inbound and outbound network traffic. This enables mTLS between all services, advanced traffic routing (canary deployments, circuit breaking, retry policies), distributed tracing, and request-level metrics, all without modifying application code. The trade-off: each sidecar consumes extra memory (on the order of 50MB per pod) and adds latency (roughly 1-2ms per request), with the exact cost depending on configuration and load.
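Two of those capabilities, mesh-wide mTLS and a weighted canary, can be sketched in a few resources. The namespace, service name, version labels, and the 90/10 split are assumptions for illustration.

```yaml
# Hypothetical Istio resources: strict mTLS plus a 90/10 canary split.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: prod               # hypothetical namespace
spec:
  mtls:
    mode: STRICT                # reject plaintext traffic between workloads in this namespace
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: api-server
  namespace: prod
spec:
  host: api-server
  subsets:
    - name: stable
      labels:
        version: v1             # assumed pod label
    - name: canary
      labels:
        version: v2             # assumed pod label
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-server
  namespace: prod
spec:
  hosts:
    - api-server
  http:
    - route:
        - destination:
            host: api-server
            subset: stable
          weight: 90
        - destination:
            host: api-server
            subset: canary
          weight: 10
      retries:
        attempts: 3
        perTryTimeout: 2s
```

Shifting more traffic to the canary is then a one-line change to the weights, with no redeploy of either version.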

17. What is GitOps and how does Argo CD implement it? GitOps treats your Git repository as the single source of truth for cluster state. Argo CD continuously monitors a Git repository, detects any drift between the desired state declared in Git and the actual cluster state, and (when automated sync is enabled) reconciles the cluster back to what Git declares. This creates a fully auditable, pull-based deployment model where every change to production is traceable to a specific Git commit.
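In Argo CD this is expressed as an Application resource. The sketch below assumes Argo CD is installed in the argocd namespace; the repository URL, path, and target namespace are placeholders.

```yaml
# Hypothetical Argo CD Application with automated sync.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: api-server
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/org/deploy-repo.git   # placeholder URL
    targetRevision: main
    path: k8s/overlays/prod                            # placeholder path within the repo
  destination:
    server: https://kubernetes.default.svc             # the cluster Argo CD itself runs in
    namespace: prod
  syncPolicy:
    automated:
      prune: true       # delete cluster resources that were removed from Git
      selfHeal: true    # revert manual changes made directly in the cluster
```

Without the syncPolicy.automated block, Argo CD still reports drift but waits for a manual sync, which some teams prefer for production.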

Section 3: AWS ECS vs. EKS Architecture Trade-offs (Questions 18–25)

18. When should you choose AWS ECS over Kubernetes (EKS) and vice versa?

Factor	Prefer ECS	Prefer EKS
Team K8s expertise	Low: want AWS-managed simplicity	High: comfortable with K8s ecosystem
Workload complexity	Simple microservices, batch jobs	Complex multi-team platform requirements
AWS ecosystem lock-in	Comfortable with AWS-native tooling	Need cloud portability / multi-cloud
Operational overhead	Lower: no Kubernetes versions or add-ons to manage	Higher: must plan K8s version upgrades, add-ons, and node groups
Advanced scheduling	Limited: basic placement constraints	Rich: affinity, taints, topology spread
19–25: ECS Deep-Dive Questions

19. What is the difference between ECS with EC2 launch type vs. AWS Fargate? EC2 launch type: you manage the underlying EC2 instances (patching, scaling, capacity planning). Fargate: serverless containers where AWS manages the underlying infrastructure entirely. You pay per vCPU/memory second consumed by running tasks. Fargate eliminates cluster over-provisioning but at a higher per-unit compute cost.

20. How do ECS Task IAM Roles provide per-task authorization? ECS Task IAM Roles let each task assume its own IAM role without storing long-lived credentials; every container in a task shares that task's role. The ECS agent serves temporary credentials through a local credentials endpoint that the AWS SDKs query automatically. This means your API task can have S3 read permissions while your worker task has SQS write permissions, with zero credential management in your application code.
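A CloudFormation sketch of that arrangement is shown below. Resource names, the image, and the bucket are hypothetical; no execution role is declared, on the assumption that the image is public and no log driver is configured.

```yaml
# Hypothetical CloudFormation template: a Fargate task with its own task role.
Resources:
  ApiTaskRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: ecs-tasks.amazonaws.com    # lets ECS tasks assume this role
            Action: sts:AssumeRole
      Policies:
        - PolicyName: read-uploads
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action: s3:GetObject
                Resource: arn:aws:s3:::example-uploads/*   # placeholder bucket
  ApiTaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: api
      RequiresCompatibilities: [FARGATE]
      NetworkMode: awsvpc
      Cpu: "256"
      Memory: "512"
      TaskRoleArn: !GetAtt ApiTaskRole.Arn        # permissions the application code receives
      ContainerDefinitions:
        - Name: api
          Image: example/api:1.0.0                # assumed public image
          PortMappings:
            - ContainerPort: 3000
```

A worker task definition would reference a different role with only its SQS permissions, which keeps each task's blast radius limited to what it actually needs.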

21. Explain ECS Service Auto Scaling with Application Auto Scaling. ECS Service Auto Scaling uses AWS Application Auto Scaling to adjust the desired task count based on CloudWatch metrics. Common triggers: ALB RequestCountPerTarget (scale when each task handles more than N requests/minute), CPU utilization averaged across all tasks, and SQS queue depth for worker services.
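The CPU-based trigger can be expressed as a target-tracking policy. In this sketch the cluster name, service name, capacity bounds, target value, and cooldowns are all illustrative assumptions.

```yaml
# Hypothetical CloudFormation template: target tracking on average service CPU.
Resources:
  ApiScalableTarget:
    Type: AWS::ApplicationAutoScaling::ScalableTarget
    Properties:
      ServiceNamespace: ecs
      ScalableDimension: ecs:service:DesiredCount
      ResourceId: service/prod-cluster/api        # placeholder cluster and service names
      MinCapacity: 3
      MaxCapacity: 50
  ApiCpuScalingPolicy:
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    Properties:
      PolicyName: api-cpu-target-tracking
      PolicyType: TargetTrackingScaling
      ScalingTargetId: !Ref ApiScalableTarget
      TargetTrackingScalingPolicyConfiguration:
        TargetValue: 70                           # keep average CPU near 70%
        PredefinedMetricSpecification:
          PredefinedMetricType: ECSServiceAverageCPUUtilization
        ScaleOutCooldown: 60                      # seconds before another scale-out
        ScaleInCooldown: 300                      # scale in more cautiously
```

Swapping the predefined metric for ALBRequestCountPerTarget gives the request-based variant; that metric additionally needs a ResourceLabel identifying the load balancer and target group.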

22-25: Advanced topics. Questions on ECS Service Connect vs AWS App Mesh for service-to-service communication, ECS Exec for secure container debugging without exposing SSH, Capacity Provider strategies for mixing Spot and On-Demand instances, and Blue/Green deployment with AWS CodeDeploy are all common at Staff+ backend engineering loops. These all demonstrate that a candidate has genuine production operational experience with container orchestration at scale.


