Do I need CKA/CKAD certifications?

Helpful for job applications, not required to pass interviews. They force you to actually run kubectl, which pays off.

What kubectl commands should I know?

get, describe, logs, exec, apply, rollout, port-forward, and cp. You should also read YAML fluently.

Docker vs containerd vs Podman?

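Most Kubernetes clusters run containers through containerd (or CRI-O); built-in Docker Engine support (dockershim) was removed in Kubernetes 1.24. Docker Desktop is a dev tool. Podman is a daemonless alternative that can run rootless. All of them build and run the same OCI images, so the fundamentals are the same.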

Helm or Kustomize?

Helm has the ecosystem; Kustomize has a simpler mental model. Most orgs use one, often both, so know both conceptually.

How do you secure a container?

Non-root user, read-only filesystem, minimal base image, scanned for CVEs, no privileged capabilities, SecurityContext in k8s.

Docker & Kubernetes Interview Questions (2026) — 100 Q&A

Docker is an open-source platform that packages an application and all its dependencies into a self-contained unit called a container. Containers share the host OS kernel but are isolated via Linux namespaces and cgroups, making them far lighter than virtual machines. The key benefit is environment consistency: a container that runs on a developer laptop runs identically in CI, staging, and production. Docker also provides a layered image format that caches unchanged layers so rebuilds and pulls are fast.

Virtual machines include a full guest OS kernel running on virtualised hardware under a hypervisor, so each VM typically uses gigabytes of RAM and takes tens of seconds to minutes to boot. Containers share the host OS kernel and use namespaces/cgroups for isolation, so they usually start in well under a second and add only megabytes of overhead. The trade-off is isolation: a VM is separated at the hardware-virtualisation level, so a guest kernel bug normally stays inside that VM, while containers share the host kernel, so a kernel vulnerability can expose every container on the host. For most workloads the performance and density advantages of containers outweigh the isolation trade-off.

A Docker image is a read-only, layered filesystem snapshot that contains the application code, runtime, libraries, and config. It is the blueprint. A container is a running instance of an image: Docker takes the image layers, adds a thin read-write layer on top, and starts a process inside an isolated namespace. You can run many containers from the same image simultaneously; each gets its own writable layer but shares the underlying read-only layers, saving disk space.

A Dockerfile is a plain-text script of instructions that Docker executes sequentially to build an image. Instructions that change the filesystem (`RUN`, `COPY`, `ADD`) each create a new immutable layer; the others (`ENV`, `CMD`, `LABEL`, etc.) only record image metadata. The file is checked into source control alongside the application code, making builds reproducible and auditable.

dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
CMD ["node", "server.js"]

The `FROM` instruction sets the base image for the build. Every subsequent instruction operates on top of that base. Docker pulls the specified image from the registry if it is not cached locally. You can use `FROM scratch` to start with an empty filesystem, which is useful for statically compiled binaries. In multi-stage builds you write multiple `FROM` statements, each starting a new stage with a clean base layer.

`RUN` executes a command at build time and commits the result as a new image layer — used for installing packages. `CMD` provides the default command that runs when the container starts, but it can be overridden by arguments passed to `docker run`. `ENTRYPOINT` sets the fixed executable; any `CMD` or `docker run` arguments are appended to it as parameters. Best practice: set `ENTRYPOINT` to the binary and `CMD` to default flags so the container is easy to use as a CLI tool.

dockerfile
ENTRYPOINT ["nginx"]
CMD ["-g", "daemon off;"]
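
To make the interaction concrete, here is a small shell sketch that models how the final command line is assembled. The function name `effective_argv` is made up for illustration, and it treats the instructions as plain strings rather than JSON arrays, so it simplifies what Docker actually does.

```shell
# effective_argv ENTRYPOINT CMD [RUN_ARGS]
# Models the rule: arguments given to `docker run` replace CMD,
# and whatever remains is appended to ENTRYPOINT.
effective_argv() {
  entrypoint=$1 cmd=$2 run_args=${3-}
  if [ -n "$run_args" ]; then
    echo "$entrypoint $run_args"
  else
    echo "$entrypoint $cmd"
  fi
}

effective_argv "nginx" "-g 'daemon off;'"        # default CMD is used
effective_argv "nginx" "-g 'daemon off;'" "-v"   # run args replace CMD
```

With the Dockerfile above, the second case corresponds to running the image with `-v` as its only argument: the container executes `nginx -v` instead of the default flags.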

Use `docker build` with the build context (directory containing the Dockerfile) and an optional tag.

bash
# Build and tag from current directory
docker build -t myapp:1.0 .

# Use a different Dockerfile location
docker build -f docker/Dockerfile.prod -t myapp:prod .

# Pass a build argument
docker build --build-arg NODE_ENV=production -t myapp:prod .

Docker sends the build context to the Docker daemon, which executes each instruction in order. Use `.dockerignore` to exclude files from the context to keep it small.

Use `docker run` with the image name. Common flags: `-d` to run detached, `-p` to publish ports, `-e` to set environment variables, `-v` to mount volumes, `--name` to give the container a name.

bash
# Run detached, publish port, set env var
docker run -d -p 8080:3000 -e NODE_ENV=production --name api myapp:1.0

# Run interactively and remove on exit
docker run -it --rm node:20-alpine sh

`docker ps` lists running containers, showing the container ID, image, command, created time, status, ports, and name. `docker ps -a` includes stopped containers. `docker logs <container>` streams the stdout/stderr of a container; `docker logs -f` follows the log output in real time, and `--tail 100` limits output to the last 100 lines.

bash
docker ps
docker ps -a --format "table {{.Names}}\t{{.Status}}"
docker logs -f --tail 50 api

A container's writable layer is destroyed when the container is removed. Docker volumes provide persistent storage that lives outside the container lifecycle and is managed by the Docker daemon on the host at `/var/lib/docker/volumes/`. Volumes can be shared between containers and are the recommended way to persist database files, user uploads, and logs. Bind mounts map a specific host path into the container and are useful for development (live code reloading).

bash
# Named volume
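# (the postgres image refuses to start without a superuser password)
docker run -d -e POSTGRES_PASSWORD=secret -v pgdata:/var/lib/postgresql/data postgres:16

# Bind mount for dev (quote the path so directories with spaces work)
docker run -v "$(pwd)":/app node:20-alpine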

Docker networks control how containers communicate with each other and the host. **Bridge** networks attach containers to a virtual switch on the host and give each one a private IP. On the default `bridge` network containers can reach each other only by IP address; user-defined bridge networks add automatic DNS resolution, so containers can reach each other by name. **Host** mode removes network isolation: the container shares the host's network stack directly, which is useful for high-throughput scenarios but risky. **None** gives the container a loopback interface only, completely isolating it from all networks.

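bash
docker network create mynet
# -d keeps each container in the background; postgres needs a password to start
docker run -d --network mynet --name db -e POSTGRES_PASSWORD=secret postgres:16
docker run -d --network mynet --name api myapp:1.0
# api container can reach db via hostname "db"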

Docker Compose is a tool for defining and running multi-container applications with a single YAML file (`docker-compose.yml`). It manages service definitions, networks, volumes, environment variables, and port mappings declaratively. `docker compose up -d` starts all services; `docker compose down` stops and removes them. It is ideal for local development environments that need a database, cache, and app server to run together reproducibly.

yaml
services:
  api:
    build: .
    ports: ["3000:3000"]
    depends_on: [db]
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: secret
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:

A Deployment is a higher-level Kubernetes object that manages a desired number of identical Pod replicas. It owns a ReplicaSet and uses a rolling update strategy by default: when you update the pod template, the Deployment gradually creates new Pods and terminates old ones, which avoids downtime as long as the Pods define readiness probes and enough replicas are running. You can pause, resume, and roll back Deployments. Deployments are the standard way to run stateless workloads.

bash
kubectl create deployment nginx --image=nginx:1.25 --replicas=3
kubectl set image deployment/nginx nginx=nginx:1.26
kubectl rollout status deployment/nginx

`kubectl` is the command-line tool for interacting with a Kubernetes cluster via the Kubernetes API server. It reads cluster connection details from `~/.kube/config` (the kubeconfig file) and communicates over HTTPS. Common operations include applying manifests, inspecting resources, viewing logs, exec-ing into containers, and managing rollouts. `kubectl` supports multiple contexts so you can switch between clusters quickly.

bash
kubectl config get-contexts
kubectl config use-context prod-cluster

`kubectl get` lists resources (pods, services, deployments, etc.) in table form. `kubectl describe` shows detailed information about a specific resource including events, which is invaluable for debugging. `kubectl apply -f manifest.yaml` creates or updates resources to match the manifest (declarative). `kubectl delete` removes resources by name or label selector.

bash
kubectl get pods -n production -o wide
kubectl describe pod api-6f9d8b-xyz -n production
kubectl apply -f k8s/
kubectl delete deployment old-service -n staging

A ConfigMap stores non-sensitive configuration data as key-value pairs that can be injected into Pods as environment variables, command-line arguments, or mounted as files in a volume. This decouples configuration from container images, so you can change settings without rebuilding. Changes to a mounted ConfigMap volume propagate to running Pods (with some latency), while env var injection requires a Pod restart.

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
  DATABASE_URL: "postgres://db:5432/mydb"

A container registry stores and distributes container images by name and tag. Kubernetes nodes pull images via the container runtime (containerd/CRI-O) using the image reference in the Pod spec. For private registries, you create a Kubernetes Secret of type `kubernetes.io/dockerconfigjson` containing registry credentials and reference it in the Pod spec's `imagePullSecrets` field. Cloud providers offer integrated credential helpers (e.g., AWS ECR, GCP Artifact Registry) that auto-refresh short-lived tokens so you don't need to rotate the Secret manually.

bash
kubectl create secret docker-registry regcred \
  --docker-server=myregistry.io \
  --docker-username=user --docker-password=pass

Multi-stage builds use multiple `FROM` statements in one Dockerfile. Each stage is isolated: you can use a full build toolchain (e.g., `golang:1.22`) in stage 1 to compile the binary, then copy only the compiled artifact into a minimal runtime image (e.g., `gcr.io/distroless/static`) in stage 2. The final image contains none of the build tools, source code, or intermediate files, dramatically reducing the attack surface and image size. You reference a previous stage with `COPY --from=builder`.

dockerfile
FROM golang:1.22 AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /app ./cmd/server

FROM gcr.io/distroless/static:nonroot
COPY --from=builder /app /app
ENTRYPOINT ["/app"]

This typically reduces a Go image from ~900 MB to ~10 MB.

The `.dockerignore` file lists paths that Docker excludes from the build context sent to the daemon. Without it, `docker build .` sends everything — including `node_modules/`, `.git/`, test fixtures, and local `.env` files — which inflates build time and risks leaking secrets into the image. Common entries: `.git`, `node_modules`, `*.log`, `.env*`, `coverage/`, `dist/` (if not needed). Excluding `node_modules` is especially important because `npm install` inside the container must run fresh anyway and the local folder can be gigabytes.

.git
.env*
node_modules
dist
coverage
*.log

`docker history <image>` shows each layer, its creation command, and its size. Layers that consume the most space are candidates for optimisation. Because Docker's build cache invalidates all layers below the first changed instruction, layer ordering directly affects cache hit rates. Put infrequently changing instructions (OS package installs, dependency downloads) near the top and frequently changing instructions (copying app source code) near the bottom.

bash
docker history myapp:latest
# Good order:
# COPY package.json .     ← changes rarely
# RUN npm ci              ← cached if package.json unchanged
# COPY . .               ← changes often, invalidates only layers below

A poorly ordered Dockerfile that copies source code before installing dependencies re-runs `npm ci` on every code change.

By default Docker containers run as UID 0 (root) inside the container. If the container runtime is misconfigured or a container escape vulnerability is exploited, root in the container can become root on the host. The `USER` instruction in a Dockerfile switches to a non-root user for all subsequent `RUN`, `CMD`, and `ENTRYPOINT` instructions. Create the user explicitly in the Dockerfile so you control the UID.

dockerfile
FROM node:20-alpine
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app
COPY --chown=appuser:appgroup . .
USER appuser
CMD ["node", "server.js"]

Kubernetes enforces this per Pod or per container via `securityContext.runAsNonRoot: true` and `runAsUser`; Pod Security Admission can require these settings for a whole Namespace.

BuildKit is Docker's next-generation build engine. It is the default builder since Docker Engine 23.0; on older versions enable it with `DOCKER_BUILDKIT=1` or use `docker buildx build`. `RUN --mount=type=cache` mounts a persistent cache directory that is not included in the final image layer, which suits `npm`, `pip`, or `go` module caches that speed up repeated builds without bloating the image. `RUN --mount=type=secret` makes a secret (e.g. a private npm token) available at build time without baking it into any layer. `RUN --mount=type=ssh`, combined with the `--ssh` build flag, forwards SSH agent credentials into the build so you can clone private repositories without exposing keys.

dockerfile
# Cache npm modules across builds
RUN --mount=type=cache,target=/root/.npm \
    npm ci --prefer-offline

# Use a secret at build time (never in layer)
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm install

Three complementary hardening strategies: **Read-only root filesystem** (`--read-only`) prevents the container process from writing anywhere except explicitly mounted writable volumes or tmpfs mounts, blocking most persistence-based attacks. **Capability dropping** (`--cap-drop ALL --cap-add NET_BIND_SERVICE`) removes Linux capabilities the container does not need; most apps need zero capabilities. **Seccomp profiles** filter which syscalls the container can make — the Docker default profile blocks ~44 dangerous syscalls; a custom profile can be even stricter. In Kubernetes these map to `securityContext.readOnlyRootFilesystem`, `securityContext.capabilities`, and `securityContext.seccompProfile`.

bash
docker run --read-only --cap-drop ALL \
  --security-opt seccomp=/path/to/profile.json myapp
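
As a rough illustration of the Kubernetes side of this mapping, the sketch below prints a Pod manifest carrying the equivalent `securityContext` settings. The function name, the Pod name `hardened-api`, the image `myapp:1.0`, and UID 10001 are placeholders chosen for the example, and the `/tmp` `emptyDir` assumes the app needs somewhere writable.

```shell
# Print a Pod manifest with the hardening settings applied.
# All names and the UID are illustrative placeholders.
hardened_pod_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hardened-api
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: api
    image: myapp:1.0
    securityContext:
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}
EOF
}

hardened_pod_manifest
```

Piping the output to `kubectl apply --dry-run=client -f -` is one way to validate the manifest without creating anything.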

The default **RollingUpdate** strategy gradually replaces old Pods with new ones. `maxSurge` controls how many extra Pods above the desired count can run during the update; `maxUnavailable` controls how many Pods can be unavailable. This achieves zero downtime but means both versions run simultaneously, so your app must be backward-compatible. The **Recreate** strategy terminates all old Pods before starting new ones, causing downtime but ensuring only one version runs at a time — useful for databases or apps with breaking schema changes.

yaml
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0
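
The effect of these two settings can be worked out by hand. The sketch below computes the bounds a rollout stays within; the function names are invented for the example. Both settings accept either an absolute number or a percentage, and Kubernetes rounds `maxSurge` percentages up and `maxUnavailable` percentages down.

```shell
# resolve_count VALUE REPLICAS up|down
# Turns "25%" into a Pod count, rounding in the given direction;
# absolute values are passed through unchanged.
resolve_count() {
  value=$1 replicas=$2 mode=$3
  case $value in
    *%)
      pct=${value%\%}
      if [ "$mode" = up ]; then
        echo $(( (replicas * pct + 99) / 100 ))
      else
        echo $(( replicas * pct / 100 ))
      fi
      ;;
    *) echo "$value" ;;
  esac
}

# rolling_bounds REPLICAS MAX_SURGE MAX_UNAVAILABLE
rolling_bounds() {
  total=$1
  surge=$(resolve_count "$2" "$total" up)
  unavailable=$(resolve_count "$3" "$total" down)
  echo "max_total=$(( total + surge )) min_available=$(( total - unavailable ))"
}

rolling_bounds 3 1 0          # the strategy shown above, with 3 replicas
rolling_bounds 10 "25%" "25%" # the Deployment defaults, with 10 replicas
```

With `maxSurge: 1` and `maxUnavailable: 0` on 3 replicas, at most 4 Pods exist at once and 3 stay available throughout, which is why this pairing is the usual zero-downtime choice.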

A **request** is the amount of a resource reserved for a container; the scheduler sums requests to decide which Node has enough capacity to place the Pod. A **limit** is the maximum the container may use; the kernel enforces CPU limits via cgroup throttling and memory limits via the OOM killer. Setting requests without limits lets a noisy neighbour consume all spare capacity on the node and crowd out other Pods. Setting limits below actual usage causes CPU throttling (bad for latency) or OOMKill (bad for reliability). A common starting point: set requests near P95 usage and CPU limits at 2–3× requests, and set memory limits tightly since memory is not compressible.

yaml
resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    cpu: "1000m"
    memory: "512Mi"
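
Requests and limits also determine a Pod's QoS class, which influences eviction order under node pressure. The sketch below applies the classification rules to a single container; `qos_class` is a hypothetical helper, and it compares the values as literal strings, so `1000m` and `1` would be treated as different.

```shell
# qos_class CPU_REQ CPU_LIM MEM_REQ MEM_LIM   (pass "" for an unset field)
qos_class() {
  cpu_req=$1 cpu_lim=$2 mem_req=$3 mem_lim=$4
  # Nothing set at all: BestEffort.
  if [ -z "$cpu_req$cpu_lim$mem_req$mem_lim" ]; then
    echo BestEffort
    return
  fi
  # An unset request defaults to the corresponding limit.
  : "${cpu_req:=$cpu_lim}" "${mem_req:=$mem_lim}"
  # Guaranteed needs both limits set and equal to the requests.
  if [ -n "$cpu_lim" ] && [ -n "$mem_lim" ] \
     && [ "$cpu_req" = "$cpu_lim" ] && [ "$mem_req" = "$mem_lim" ]; then
    echo Guaranteed
  else
    echo Burstable
  fi
}

qos_class 250m 1000m 256Mi 512Mi   # the example above
qos_class 500m 500m 256Mi 256Mi
qos_class "" "" "" ""
```

The manifest above is Burstable because its limits are higher than its requests; setting them equal would make it Guaranteed.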

The HPA controller polls the Metrics Server (or a custom metrics adapter) every 15 seconds by default and compares current utilisation to a target. For CPU-based scaling: `desiredReplicas = ceil(currentReplicas * currentCPU / targetCPU)`. The HPA respects `minReplicas` and `maxReplicas` bounds and uses a stabilisation window to prevent flapping (default: 5 minutes for scale-down and none for scale-up; both are configurable under `behavior`). You can also scale on custom metrics (e.g., RPS from Prometheus via Prometheus Adapter or KEDA) or external metrics (e.g., SQS queue depth).

bash
kubectl autoscale deployment api \
  --cpu-percent=60 --min=2 --max=20
kubectl get hpa
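
A quick worked version of the scaling formula may help. The sketch below is a simplification: it applies the ceiling and the min/max clamp but ignores the tolerance band and stabilisation window that the real controller also uses. The function name `hpa_desired` is made up for the example.

```shell
# hpa_desired CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Utilisation values are whole percentages.
hpa_desired() {
  current=$1 util=$2 target=$3 min=$4 max=$5
  # Integer ceiling of current * util / target.
  desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

hpa_desired 4 90 60 2 20    # 4 Pods at 90% against a 60% target
hpa_desired 3 70 60 2 20    # 3.5 rounds up
hpa_desired 10 200 60 2 20  # clamped to the maximum
```

With the `kubectl autoscale` settings above (60% target, 2 to 20 replicas), 4 Pods running at 90% CPU scale to 6.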

A **PersistentVolume (PV)** is a cluster-level storage resource provisioned by an admin or dynamically by a StorageClass. A **PersistentVolumeClaim (PVC)** is a user's request for storage — it specifies size, access mode (ReadWriteOnce, ReadWriteMany), and optionally a StorageClass. Kubernetes binds a PVC to a suitable PV. A **StorageClass** defines the "class" of storage (e.g., `gp3`, `io2`, `nfs`) and the provisioner that creates volumes on demand. Dynamic provisioning via StorageClass is the modern approach: a PVC is created, the provisioner creates a cloud disk, and a PV is automatically bound.

yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes: [ReadWriteOnce]
  storageClassName: gp3
  resources:
    requests:
      storage: 50Gi

A **Job** runs one or more Pods to completion — it tracks successful completions and retries on failure up to `backoffLimit`. Jobs are for batch tasks: data migrations, report generation, one-off scripts. `parallelism` and `completions` control concurrent execution. A **CronJob** wraps a Job with a cron schedule and creates a new Job at each trigger. `concurrencyPolicy` controls what happens if a previous Job is still running: `Allow`, `Forbid`, or `Replace`.

yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: db-cleanup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: cleanup
            image: myapp:latest
            command: ["node", "scripts/cleanup.js"]
          restartPolicy: OnFailure

Init containers are special containers that run and complete before the main application containers start. They run sequentially, one at a time, and must exit with code 0 for the Pod to proceed. Use cases: waiting for a dependency to become available (database readiness check), running database migrations before the app starts, downloading configuration or secrets from a vault, setting up file permissions in a shared volume. Because they run before the app, they can use a different (more privileged) image without exposing those tools in the production container.

yaml
initContainers:
- name: wait-for-db
  image: busybox
  command: ['sh', '-c', 'until nc -z db 5432; do sleep 2; done']

**Node selectors** (`nodeSelector`) are the simplest form: schedule a Pod only on Nodes with a specific label (e.g., `disktype: ssd`). **Node affinity** is more expressive — it supports `In`, `NotIn`, `Exists` operators and has `requiredDuringSchedulingIgnoredDuringExecution` (hard) vs `preferredDuringSchedulingIgnoredDuringExecution` (soft) rules. **Pod affinity** schedules a Pod near (same zone/node) other Pods matching a label selector — useful for co-locating a cache with an app for low latency. **Pod anti-affinity** is the opposite: spread Pods across failure domains (nodes, zones) to improve availability.

yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - topologyKey: kubernetes.io/hostname
      labelSelector:
        matchLabels:
          app: api

Taints are applied to Nodes to repel Pods that do not explicitly tolerate them, allowing you to reserve Nodes for specific workloads (GPU nodes, spot nodes, control-plane nodes). A Pod must declare a matching **toleration** to be scheduled on a tainted Node. Taint effects: `NoSchedule` (don't schedule without toleration), `PreferNoSchedule` (try to avoid), `NoExecute` (evict existing Pods that don't tolerate). Tolerations do not require the Pod to be scheduled on a tainted Node — combine with node affinity to both attract and repel.

bash
# Taint a GPU node
kubectl taint nodes gpu-node-1 nvidia.com/gpu=true:NoSchedule

yaml
# Toleration in Pod spec
tolerations:
- key: "nvidia.com/gpu"
  operator: "Equal"
  value: "true"
  effect: "NoSchedule"

RBAC (Role-Based Access Control) controls which subjects (users, groups, ServiceAccounts) can perform which verbs (get, list, create, delete, patch) on which API resources. A **Role** grants permissions within a single Namespace; a **ClusterRole** grants permissions cluster-wide or can be bound per-Namespace. A **RoleBinding** attaches a Role or ClusterRole to a subject within a Namespace; a **ClusterRoleBinding** attaches a ClusterRole cluster-wide. The principle of least privilege: ServiceAccounts for applications should only have the specific permissions needed.

yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: production
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
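
A Role grants nothing until it is bound to a subject. As a sketch, the function below prints a RoleBinding that attaches `pod-reader` to a ServiceAccount; the binding name `read-pods` and the ServiceAccount name `api` are assumptions made for the example.

```shell
# Print a RoleBinding for the pod-reader Role.
# "read-pods" and the "api" ServiceAccount are illustrative names.
rolebinding_manifest() {
  cat <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: production
subjects:
- kind: ServiceAccount
  name: api
  namespace: production
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
EOF
}

rolebinding_manifest
```

After applying it (for example with `rolebinding_manifest | kubectl apply -f -`), `kubectl auth can-i list pods --as=system:serviceaccount:production:api -n production` is a convenient way to check the result.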

By default, all Pods in a Kubernetes cluster can communicate with all other Pods. A **NetworkPolicy** is a namespace-scoped resource that uses label selectors to define allowed ingress/egress traffic for Pods. NetworkPolicies are enforced by the CNI plugin — not all CNIs support them (Flannel does not; Calico and Cilium do). The default-deny pattern works by applying an empty policy that selects all Pods but specifies no `ingress` or `egress` rules, then adding specific allow policies.

yaml
# Default deny all ingress in namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}
  policyTypes: [Ingress]
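
Once default-deny is in place, each allowed path needs its own policy. The sketch below prints one such policy, letting Pods labelled `app: api` reach Pods labelled `app: db` on TCP 5432; the policy name and the labels are assumptions for illustration.

```shell
# Print a NetworkPolicy that allows api -> db traffic on port 5432.
# The policy name and labels are illustrative.
allow_api_to_db_manifest() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-to-db
spec:
  podSelector:
    matchLabels:
      app: db
  policyTypes: [Ingress]
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api
    ports:
    - protocol: TCP
      port: 5432
EOF
}

allow_api_to_db_manifest
```

NetworkPolicies are additive: traffic is allowed if any policy selecting the destination Pod permits it, so this policy and the default-deny policy can coexist in the same Namespace.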

Helm is the package manager for Kubernetes. A **chart** is a directory of templates (Go templates that produce YAML manifests), a `values.yaml` file with default configuration, and a `Chart.yaml` with metadata. When you run `helm install`, Helm renders templates by merging `values.yaml` with any overrides (`--set` or `-f`), sends the resulting manifests to the API server, and records the **release** (name + revision + rendered manifests) as a Secret in the target Namespace. `helm upgrade` creates a new revision; `helm rollback` returns to a previous one.

bash
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-postgres bitnami/postgresql \
  --set auth.postgresPassword=secret \
  --namespace db --create-namespace
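# The release lives in the db namespace; --reuse-values keeps the password set at install
helm upgrade my-postgres bitnami/postgresql \
  --namespace db --reuse-values --set image.tag=16.2.0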

Kustomize is a built-in Kubernetes tool (`kubectl apply -k`) for customising YAML without templating. It uses a **base** (the original manifests) and **overlays** (patches for each environment) composed via a `kustomization.yaml` file. Choose Kustomize over Helm when: you own the manifests and don't need packaging/versioning, you want patches rather than parameterisation, or you need to patch third-party YAML without forking it. Choose Helm when: you need versioned, distributable packages (e.g. distributing your software to customers), need conditional logic in templates, or need complex dependency management. In practice, many teams use both: Kustomize for their own app overlays and Helm for third-party dependencies.

yaml
# kustomization.yaml (production overlay)
# `resources` and `patches` replace the deprecated `bases` and `patchesStrategicMerge` fields
resources:
  - ../../base
patches:
  - path: increase-replicas.yaml
images:
  - name: myapp
    newTag: "1.5.0"

`kubectl rollout status` watches a Deployment (or StatefulSet/DaemonSet) and streams progress, returning exit code 0 when the rollout completes successfully or non-zero on failure — making it composable with CI scripts. `kubectl rollout undo` rolls back to the previous revision by reverting the pod template; you can target a specific revision with `--to-revision=N`. The Deployment's `revisionHistoryLimit` (default 10) controls how many old ReplicaSets are kept for rollback.

bash
kubectl rollout status deployment/api -n production
# Watch live progress, exit 0 on success

kubectl rollout undo deployment/api -n production
# Roll back to previous revision

kubectl rollout history deployment/api
# Show revision list

`kubectl port-forward` tunnels a local port to a port on a Pod (or Service), allowing you to access services inside the cluster without exposing them externally. `kubectl exec` runs a command inside a running container, or opens an interactive shell.

bash
# Forward local 5432 to postgres Pod port 5432
kubectl port-forward pod/postgres-0 5432:5432 -n db

# Open interactive shell in a Pod
kubectl exec -it deployment/api -n production -- sh

# Run a one-off command
kubectl exec deployment/api -- node -e "console.log(process.env)"

# Debug with an ephemeral container (1.23+)
kubectl debug -it pod/api-xyz --image=busybox --target=api

Ephemeral debug containers (beta and enabled by default since 1.23, stable since 1.25) let you attach a debug image to a running Pod without restarting it.

A **ResourceQuota** limits the total aggregate resources consumed within a Namespace — total CPU, memory, and number of objects (Pods, Services, PVCs). It prevents one team from consuming the entire cluster. A **LimitRange** sets default and maximum resource requests/limits for individual containers and Pods in a Namespace. Without LimitRange, a Pod can be created with no `resources` set, giving it BestEffort QoS and potentially unlimited consumption. Together, they enforce a "guardrails" policy: teams work within their allocated quota, and all Pods get sensible defaults.

yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
spec:
  limits:
  - type: Container
    default:
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"

A PodDisruptionBudget sets a policy on how many replicas of a labelled set of Pods must remain available (or can be unavailable) during voluntary disruptions such as `kubectl drain` during a Node upgrade or a cluster-autoscaler scale-down. Without a PDB, draining a Node could take all replicas of a service offline simultaneously. `minAvailable: 2` guarantees at least 2 replicas are up; `maxUnavailable: 1` allows only one to be down at a time. PDBs only apply to evictions made through the Eviction API: Deployment rollouts are governed by the rollout strategy instead, and involuntary disruptions (hardware failure) can still reduce replicas below the budget.

yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api
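
The number of evictions a PDB currently permits follows directly from the healthy replica count; Kubernetes reports it in the PDB status as `disruptionsAllowed`. A minimal sketch of that arithmetic for a `minAvailable` budget, using a made-up function name:

```shell
# pdb_allowed_disruptions HEALTHY MIN_AVAILABLE
# Prints how many Pods may be evicted right now (never negative).
pdb_allowed_disruptions() {
  healthy=$1 min_available=$2
  allowed=$(( healthy - min_available ))
  if [ "$allowed" -lt 0 ]; then allowed=0; fi
  echo "$allowed"
}

pdb_allowed_disruptions 3 2   # one Pod can be evicted
pdb_allowed_disruptions 2 2   # further evictions are refused
```

This is why `minAvailable: 2` on a two-replica Deployment blocks node drains entirely: the budget never allows a single eviction.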

Metrics Server is a cluster add-on that collects real-time CPU and memory usage from every kubelet and exposes them through the Kubernetes Metrics API (`metrics.k8s.io`). `kubectl top pods` and `kubectl top nodes` query this API to display current resource consumption. It is designed for autoscaling (HPA reads from it) and operational visibility — it does not store historical data (use Prometheus/Grafana for that). Metrics Server must be installed separately (it's not included in all distributions); on EKS you deploy it from the official manifest, on GKE it's pre-installed.

bash
kubectl top pods -n production --sort-by=memory
kubectl top nodes

Gatekeeper is a Kubernetes-native policy engine that runs as a ValidatingAdmissionWebhook backed by Open Policy Agent (OPA). Policy logic is written in **Rego** (OPA's query language) and packaged as a **ConstraintTemplate** CRD, which defines a new CRD kind (e.g., `K8sRequiredLabels`). A **Constraint** is an instance of that template applied to specific resource types and scopes, carrying the policy parameters. When a resource is created, Gatekeeper evaluates all matching Constraints and denies the request if any Rego policy returns violations. Gatekeeper also runs an audit controller that periodically checks existing resources against all policies and reports violations. This enables a GitOps-friendly approach: policies are YAML in a repo, continuously reconciled by the Gatekeeper controller.

yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-team-label
spec:
  match:
    kinds: [{apiGroups: [""], kinds: ["Pod"]}]
  parameters:
    labels: ["team"]

A Go binary compiled with `CGO_ENABLED=0` is fully statically linked and needs no shared libraries, making it a perfect fit for a distroless or scratch base image. The build uses multi-stage to keep the final image minimal.

dockerfile
FROM golang:1.22 AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
    go build -ldflags="-s -w" -trimpath -o /app ./cmd/server

FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder /app /app
USER nonroot:nonroot
ENTRYPOINT ["/app"]

`-s -w` strips debug symbols and DWARF info. `-trimpath` removes local file paths from the binary. `nonroot` ensures the container starts as UID 65532. The resulting image is typically 8–15 MB with zero shell, no package manager, and a minimal CVE footprint. Use `gcr.io/distroless/static-debian12:debug` locally if you need a shell.

In CI environments, each build often runs on a fresh agent, so package managers re-download all dependencies from the internet every time; `npm ci`, `pip install`, and `go mod download` can take minutes. BuildKit's `--mount=type=cache` mounts a directory at build time (not included in any image layer) that persists on the builder between builds. Cache mounts are not part of the exported layer cache, so on ephemeral runners they help only if the builder's state is kept or the cache directory is saved and restored separately. Layer caching is handled separately: in GitHub Actions with `docker/build-push-action`, you set `cache-from: type=gha` and `cache-to: type=gha,mode=max` to persist BuildKit's layer cache in GitHub's cache service.

dockerfile
# Go modules cache — survives rebuilds
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    go build -o /app ./cmd/server

# npm cache
RUN --mount=type=cache,target=/root/.npm \
    npm ci

This can reduce a 6-minute dependency install to under 30 seconds on a warm cache.

An SBOM is a machine-readable list of all software components, libraries, and their versions inside a container image — enabling vulnerability tracking and licence compliance. **Syft** (by Anchore) generates SBOMs in SPDX, CycloneDX, or Syft JSON format from images, directories, or OCI archives. **Trivy** can both generate SBOMs and scan them for known CVEs against the NVD and OS vulnerability databases. A production CI workflow:

bash
# Build image
docker build -t myapp:$SHA .

# Generate SBOM
syft myapp:$SHA -o spdx-json > sbom.spdx.json

# Scan for CVEs with exit code on HIGH/CRITICAL
trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:$SHA

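# Attach SBOM to the pushed image as a signed attestation
# (assumes myapp:$SHA is in a registry; `cosign attach sbom` is deprecated)
cosign attest --yes --type spdxjson --predicate sbom.spdx.json myapp:$SHA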

Failing builds on critical CVEs creates a shift-left security gate that prevents known-vulnerable images from ever reaching production.

`CrashLoopBackOff` means the container keeps starting and then exiting (usually with a non-zero code), and the kubelet applies an exponential back-off (10s, 20s, 40s... up to 5min) before each restart. Systematic approach:

bash
# 1. Get the exit code and last state
kubectl describe pod <pod> -n <ns>
# Look at: Last State, Exit Code, Reason (OOMKilled vs Error)

# 2. Read logs from the crashed container (previous instance)
kubectl logs <pod> --previous -n <ns>

# 3. If logs are empty (crash before writing anything) — run the image manually
docker run --rm myapp:tag

# 4. Check events for image pull errors, config mount issues
kubectl get events -n <ns> --sort-by=.lastTimestamp

# 5. Use ephemeral debug container if shell needed
kubectl debug -it pod/<pod> --image=busybox --target=mycontainer

Common root causes: wrong `ENTRYPOINT`/`CMD`, missing env vars or secrets, config file not found, permission error on mounted volume, OOMKilled (check `Exit Code: 137`), or application crash on startup (check `--previous` logs).
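
The back-off schedule explains why a crashing Pod can appear idle for minutes between attempts. A small sketch of the sequence, assuming the 10-second start and 5-minute cap described above (the function name is invented):

```shell
# backoff_schedule RESTARTS
# Prints the delay before each restart: doubles from 10s, capped at 300s.
backoff_schedule() {
  restarts=$1 delay=10 i=0 out=""
  while [ "$i" -lt "$restarts" ]; do
    out="$out${out:+ }${delay}s"
    delay=$(( delay * 2 ))
    if [ "$delay" -gt 300 ]; then delay=300; fi
    i=$(( i + 1 ))
  done
  echo "$out"
}

backoff_schedule 7
```

After the fifth restart the delay sits at the cap, so a fix you deploy may take up to five minutes to be retried unless you delete the Pod.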

`OOMKilled` (Exit Code 137) means the container exceeded its memory limit and the kernel's OOM killer terminated it. Diagnostic approach:

bash
# 1. Confirm OOMKill in describe
kubectl describe pod <pod> | grep -A5 "Last State"
# Reason: OOMKilled

# 2. Check actual memory usage trend
kubectl top pod <pod> --containers
# Use Grafana/Prometheus: container_memory_working_set_bytes

# 3. Look for memory growth over time (leak vs spike)
# container_memory_working_set_bytes{pod=~"api-.*"}

# 4. Heap snapshot (Node.js example; assumes the app runs as PID 1 and was
#    started with --heapsnapshot-signal=SIGUSR2)
kubectl exec <pod> -- kill -USR2 1
# Writes a .heapsnapshot file to the working directory; copy out with kubectl cp

For Node.js: enable `--expose-gc` and use `clinic.js` or `v8-profiler`. For Go: add a `pprof` endpoint and hit `/debug/pprof/heap`. Set memory limits to P99 usage + 20% headroom. For legitimate spikes (batch jobs), run the memory-intensive work as a separate Job with its own limits, or increase the limit temporarily during the batch window.
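
Exit code 137 is not specific to OOM kills: any code above 128 means the process died from signal `code - 128`, and 137 is SIGKILL. The sketch below decodes a code that way; `describe_exit` is a hypothetical helper, and only the `Reason: OOMKilled` field confirms that the kernel's OOM killer sent the signal.

```shell
# describe_exit CODE
# Explains a container exit code using the 128 + signal convention.
describe_exit() {
  code=$1
  if [ "$code" -gt 128 ]; then
    sig=$(( code - 128 ))
    echo "killed by signal $sig (SIG$(kill -l "$sig"))"
  elif [ "$code" -eq 0 ]; then
    echo "exited cleanly"
  else
    echo "application error (exit $code)"
  fi
}

describe_exit 137
describe_exit 143
describe_exit 1
```

Seeing 143 (SIGTERM) instead of 137 usually points at a normal termination, such as a rollout or eviction, rather than a memory problem.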

A production CI/CD pipeline should be automated, reproducible, and safe. Each stage gates the next:

bash
# 1. Build (BuildKit, layer cache from registry)
docker buildx build --cache-from type=registry,ref=$CACHE_REF \
  --cache-to type=registry,ref=$CACHE_REF,mode=max \
  -t $IMAGE_REF --push .

# 2. Scan — fail on HIGH/CRITICAL CVEs
trivy image --exit-code 1 --severity HIGH,CRITICAL $IMAGE_REF

# 3. Sign image
cosign sign --yes $IMAGE_REF

# 4. Deploy to staging (Kustomize overlay)
kubectl apply -k k8s/overlays/staging
kubectl rollout status deployment/api -n staging --timeout=5m

# 5. Run smoke tests against staging, then promote to production
kubectl apply -k k8s/overlays/production

# 6. Wait for the rollout; roll back automatically on failure
if ! kubectl rollout status deployment/api -n production --timeout=10m; then
  kubectl rollout undo deployment/api -n production
  exit 1
fi

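GitOps variant: merge to main → Argo CD detects the change (a new image tag committed to the config repo, or picked up by Argo CD Image Updater) → deploys → health checks gate promotion. Set `maxUnavailable: 0` on the Deployment to keep capacity during the rollout, and add a `PodDisruptionBudget` to protect production from node drains that happen at the same time.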

Zero-downtime deployments require coordination at three layers. **Rolling update configuration**: set `maxUnavailable: 0` and `maxSurge: 1` so new Pods are fully ready before old ones are terminated. **PodDisruptionBudget**: set `minAvailable` to at least 50% of replicas so node drains that coincide with a rollout cannot evict too many Pods at once. **Graceful shutdown with preStop hook**: when Kubernetes terminates a Pod, it sends SIGTERM and, in parallel, removes the Pod from Service endpoints. This is a race: new requests can still be routed to the Pod for a short time after SIGTERM arrives. Fix it with a `preStop` sleep that delays SIGTERM until the endpoint removal has propagated; then the app handles SIGTERM:

yaml
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 5"]
terminationGracePeriodSeconds: 60

Set `terminationGracePeriodSeconds` longer than the `preStop` sleep plus the slowest request, because the grace period covers both. For Node.js/Express, listen for SIGTERM, stop accepting new connections, wait for active requests to finish (`server.close()`), then exit. Readiness probes ensure the new Pod only receives traffic when it is truly ready. Together these eliminate the 502/503 errors typical of naive rolling deployments.
