Key Takeaways
- Never create Pods directly: Always use a Deployment. A Pod without a controller is not recreated if it is deleted or its node fails.
- Rolling update with maxUnavailable: 0: Zero-downtime deploys. New pods start before old pods stop.
- Probes save you: Liveness restarts unhealthy containers; readiness prevents traffic to containers that aren’t ready.
- Always set resources: requests enables scheduler placement; limits prevents OOM chaos.
Introduction
Direct Answer: How do I create a Kubernetes Deployment with rolling updates and health checks in 2026?
A production Kubernetes Deployment requires: spec.replicas: 3 for redundancy; spec.strategy.type: RollingUpdate with maxUnavailable: 0 and maxSurge: 1 for zero-downtime deploys; spec.template.spec.containers[].resources.requests and .limits for CPU and memory; a livenessProbe (HTTP GET to /health) to restart deadlocked containers; and a readinessProbe (HTTP GET to /ready) to remove unready containers from load balancing. Apply with kubectl apply -f deployment.yaml. Monitor the rollout with kubectl rollout status deployment/myapp -n production. Roll back a failed deploy with kubectl rollout undo deployment/myapp -n production.
Deployment Lifecycle Diagram
OLD VERSION (v1.0.0)
3 pods all running v1.0.0
↓
USER UPDATES IMAGE IN DEPLOYMENT YAML
↓
KUBECTL APPLY -F DEPLOYMENT.YAML
↓
ROLLING UPDATE STARTS (maxUnavailable: 0, maxSurge: 1)
Initial: [v1] [v1] [v1] (3 old pods)
↓
Step 1: [v1] [v1] [v1] [v2] (1 new pod starts, now 4 total = surge)
↓
Step 2: [v1] [v1] [--] [v2] (1 old pod terminates gracefully)
↓
Step 3: [v1] [v1] [v2] [v2] (2 new pods running)
↓
Step 4: [v1] [--] [v2] [v2] (1 old pod terminates)
↓
Final: [v2] [v2] [v2] (3 new pods, all running v1.1.0)
↓
DEPLOYMENT STABLE
Traffic flows to all 3 v2 pods
Old v1 pods deleted
Zero downtime achieved ✓
Timeline: Usually 30-60 seconds (depends on pod startup time + readiness probe checks, because old pods are removed only after new pods report Ready)
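The sequence in the diagram can be reproduced with a small simulation. This is a simplified sketch, not the Deployment controller's actual code: it assumes every new pod becomes Ready before the next step, and the function name rolling_update_steps is made up for illustration.

```python
def rolling_update_steps(replicas, max_surge, max_unavailable):
    """Return the (old, new) pod counts after each scale step.

    Simplified model: every new pod becomes Ready before the next step.
    """
    if max_surge == 0 and max_unavailable == 0:
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    old, new = replicas, 0
    steps = [(old, new)]
    while old > 0 or new < replicas:
        # Scale up: total pods may not exceed replicas + maxSurge.
        add = min(replicas - new, replicas + max_surge - (old + new))
        if add > 0:
            new += add
            steps.append((old, new))
        # Scale down: ready pods may not drop below replicas - maxUnavailable.
        remove = min(old, (old + new) - (replicas - max_unavailable))
        if remove > 0:
            old -= remove
            steps.append((old, new))
    return steps


if __name__ == "__main__":
    for old, new in rolling_update_steps(3, max_surge=1, max_unavailable=0):
        print("[v1] " * old + "[v2] " * new)
```

With replicas=3, maxSurge=1 and maxUnavailable=0, the total never exceeds 4 pods and never drops below 3, which is why the rollout needs spare capacity for one extra pod.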
Part 1: A Complete Production Deployment
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: production
  labels:
    app: myapp
    version: "1.0.0"
spec:
  replicas: 3
  # ── Rolling update strategy (zero downtime) ──────────────────────────────
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0 # Never reduce below 3 replicas
      maxSurge: 1 # Allow up to 4 replicas during rollout
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        version: "1.0.0"
    spec:
      # ── Security ────────────────────────────────────────────────────────
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
      automountServiceAccountToken: false
      containers:
        - name: myapp
          image: myregistry.com/myapp:1.0.0
          ports:
            - containerPort: 3000
          # ── Resource Management ─────────────────────────────────────────
          resources:
            requests:
              memory: "128Mi" # Guaranteed minimum; scheduler uses this
              cpu: "100m" # 100 millicores = 0.1 CPU core
            limits:
              memory: "256Mi" # Hard ceiling; OOM kill if exceeded
              cpu: "500m" # Throttled if exceeded
          # ── Liveness Probe: restarts container if unhealthy ─────────────
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30 # Wait before first check
            periodSeconds: 10 # Check every 10 seconds
            timeoutSeconds: 5
            failureThreshold: 3 # Restart after 3 consecutive failures
          # ── Readiness Probe: removes from load balancing if not ready ───
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          # ── Graceful shutdown ────────────────────────────────────────────
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 5"]
          env:
            - name: NODE_ENV
              value: production
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: myapp-secrets
                  key: db-password
      # ── Scheduling ──────────────────────────────────────────────────────
      terminationGracePeriodSeconds: 30
      # Spread pods across nodes
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: myapp
kubectl create namespace production
kubectl apply -f deployment.yaml
kubectl rollout status deployment/myapp -n production
Expected output:
Waiting for deployment "myapp" rollout to finish: 0 of 3 updated replicas are available...
Waiting for deployment "myapp" rollout to finish: 1 of 3 updated replicas are available...
Waiting for deployment "myapp" rollout to finish: 2 of 3 updated replicas are available...
deployment "myapp" successfully rolled out
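The probe settings in the manifest translate into rough reaction times. The sketch below is an approximation under stated assumptions: it ignores kubelet jitter and the per-probe timeoutSeconds, and the Probe class and function names are hypothetical helpers, not part of any Kubernetes client.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    initial_delay_seconds: int
    period_seconds: int
    failure_threshold: int


def earliest_action_after_start(p: Probe) -> int:
    """Seconds after container start until the threshold can first be hit,
    assuming the very first probe and every one after it fails."""
    return p.initial_delay_seconds + (p.failure_threshold - 1) * p.period_seconds


def detection_window(p: Probe) -> int:
    """Approximate worst case for a container that breaks right after a
    successful probe: failure_threshold further periods must elapse."""
    return p.failure_threshold * p.period_seconds


# Values copied from deployment.yaml above.
liveness = Probe(initial_delay_seconds=30, period_seconds=10, failure_threshold=3)
readiness = Probe(initial_delay_seconds=10, period_seconds=5, failure_threshold=3)

if __name__ == "__main__":
    print("liveness restart, broken from start:", earliest_action_after_start(liveness), "s")
    print("liveness restart, breaks at runtime:", detection_window(liveness), "s")
    print("readiness removal, breaks at runtime:", detection_window(readiness), "s")
```

Readiness reacts faster than liveness here by design: pulling a pod out of load balancing is cheap, while a restart is disruptive.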
Part 2: Rolling Update
# Update the container image (triggers rolling update)
kubectl set image deployment/myapp myapp=myregistry.com/myapp:1.1.0 -n production
# Watch the rollout in real time
kubectl rollout status deployment/myapp -n production -w
Expected output:
Waiting for deployment "myapp" rollout to finish: 1 out of 3 new replicas have been updated...
Waiting for deployment "myapp" rollout to finish: 2 out of 3 new replicas have been updated...
Waiting for deployment "myapp" rollout to finish: 1 old replicas are pending termination...
deployment "myapp" successfully rolled out
# Rollback on failure
kubectl rollout undo deployment/myapp -n production
# View rollout history
kubectl rollout history deployment/myapp -n production
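Expected output (CHANGE-CAUSE is populated only if the kubernetes.io/change-cause annotation was set for that revision; otherwise it shows <none>):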
REVISION CHANGE-CAUSE
1 <none>
2 kubectl set image deployment/myapp myapp=1.1.0
Part 3: Service + Ingress
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: production
spec:
  selector:
    app: myapp # Routes to Pods with this label
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP # Internal only; exposed via Ingress
---
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  namespace: production
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls
kubectl apply -f service.yaml -f ingress.yaml -n production
kubectl get ingress -n production
Part 4: Pod Disruption Budget
# pdb.yaml: ensure at least 2 pods always available during cluster maintenance
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
  namespace: production
spec:
  minAvailable: 2 # At least 2 pods must remain during voluntary disruptions
  selector:
    matchLabels:
      app: myapp
kubectl apply -f pdb.yaml -n production
kubectl get pdb -n production
Expected output:
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
myapp-pdb 2 N/A 1 5s
Troubleshooting
Pod stuck in Pending
kubectl describe pod POD_NAME -n production | grep -A5 "Events:"
# Common causes: insufficient CPU/memory, node selector mismatch, PVC not bound
Pod CrashLoopBackOff
kubectl logs POD_NAME -n production --previous # Logs from crashed container
kubectl describe pod POD_NAME -n production | grep "Exit Code"
Rolling update stuck
# Check if readinessProbe is failing on new pods
kubectl describe deployment myapp -n production | grep -A10 "Conditions:"
kubectl get events -n production --sort-by='.lastTimestamp' | tail -10
When to Use Kubernetes Deployments vs Docker Compose
| Factor | Docker Compose | Kubernetes Deployment | Choose |
|---|---|---|---|
| Server count | 1 server | 3+ servers (cluster) | K8s if multi-node, Compose for single-server |
| Downtime tolerance | ~30 seconds/deploy | Zero downtime (rolling updates) | K8s for critical services |
| Scaling complexity | Manual (docker compose up --scale SERVICE=3) | kubectl scale deployment myapp --replicas=10 | K8s if auto-scaling needed |
| Network isolation | Bridge networks on one host | NetworkPolicies, service mesh | K8s for multi-tenant isolation |
| Learning curve | 1-2 hours | 40+ hours | Compose to learn, K8s for production |
| Operations cost | Manual updates, health checks do not drive traffic routing | Automated probes, upgrades, rollbacks | K8s saves operational overhead |
| Typical use case | Dev, test, small prod | Enterprise, high-availability, microservices | Use Compose first, graduate to K8s |
TL;DR: Start with Docker Compose on a single server. Once you have 3+ servers or need 99.9% uptime, migrate to Kubernetes.
Advanced Deployment Patterns: When to Use Blue-Green vs Rolling Updates
Developer question: “When should I use blue-green deployment instead of rolling updates?”
Rolling updates are your default (covered in Part 1). Use blue-green only if:
- Database migrations are risky and need instant rollback
- Your infra budget allows 2x pods temporarily
- You’re deploying across multiple regions with strict cutover timing
Decision tree:
Is this a simple code deploy (no DB schema changes)?
  → Yes: Use rolling updates (less resource overhead)
  → No: Database migration + app version bump?
    → Yes: Use blue-green (safer rollback, zero downtime)
    → No: Use rolling updates + careful testing
Does your infra have spare capacity (2x resource headroom)?
  → Yes: Blue-green available
  → No: Must use rolling updates (limited resource overhead)
Are you deploying within a single datacenter?
  → Yes: Rolling updates fine (low latency impact)
  → No: Multi-region deployment with strict SLAs?
    → Yes: Blue-green (instant cutover across regions)
    → No: Rolling updates per region
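The three questions in the decision tree can be folded into one function. This is only a sketch of one reasonable reading of the tree: it treats missing spare capacity as a hard constraint that overrides the other two answers, and the function and parameter names are invented for illustration.

```python
def choose_strategy(schema_migration: bool,
                    spare_capacity_2x: bool,
                    multi_region_strict_sla: bool) -> str:
    """Return "blue-green" or "rolling" for a planned deploy."""
    # Blue-green runs both versions at once, so it needs 2x headroom.
    if not spare_capacity_2x:
        return "rolling"
    # Risky migrations and strict multi-region cutovers benefit from
    # an instant switch and an instant rollback.
    if schema_migration or multi_region_strict_sla:
        return "blue-green"
    # Plain code deploys stay on the cheaper default.
    return "rolling"


if __name__ == "__main__":
    print(choose_strategy(schema_migration=True,
                          spare_capacity_2x=True,
                          multi_region_strict_sla=False))
```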
Real-world example: Deploying a shopping cart service:
- Rolling update approach: New pods start running v1.1 → old pods drain requests → old pods stop. Result: 10-30 seconds of mixed versions, slight bump in latency.
- Blue-green approach: Deploy all v1.1 pods, test them, then switch the Service selector or load balancer to them. Old v1.0 pods still exist but get no traffic. If v1.1 crashes, switch back within seconds (a DNS-based flip takes longer because of caching). Old pods are deleted after confirming stability.
Now let’s build blue-green:
Multi-Region Deployments & Blue-Green Strategy
For zero-downtime deployments across multiple regions, blue-green deployments switch traffic between two complete, identical environments:
# Service currently routing to the blue environment (current production)
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  selector:
    app: myapp
    version: "1.0.0" # Points to blue (v1.0.0)
  ports:
    - port: 80
      targetPort: 3000
---
# Green environment (staging the new version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: "2.0.0" # Green deployment (v2.0.0)
  template:
    metadata:
      labels:
        app: myapp
        version: "2.0.0"
    spec:
      containers:
        - name: myapp
          image: myapp:2.0.0 # New version pre-warmed
# Once green is healthy, switch traffic with:
# kubectl patch service myapp-service -p '{"spec":{"selector":{"version":"2.0.0"}}}'
Advantages:
- Instant rollback: flip selector back to v1.0.0 if green has issues
- Full testing environment (green) matches production exactly
- Zero downtime during cutover (only the Service selector changes; clients keep using the same Service name and IP)
Disadvantages:
- Requires 2x resource capacity (both blue and green running)
- Database migrations must be backward-compatible or run pre-cutover
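The cutover works because a Service sends traffic to every pod whose labels contain all of the selector's key/value pairs. The sketch below models that matching rule in plain Python; the pod names are hypothetical and no Kubernetes API is called.

```python
def matches(selector: dict, labels: dict) -> bool:
    """A pod matches when every selector pair is present in its labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints(selector: dict, pods: list) -> list:
    """Names of the pods a Service with this selector would route to."""
    return [pod["name"] for pod in pods if matches(selector, pod["labels"])]


# Hypothetical pods: two blue (v1.0.0) and two green (v2.0.0).
pods = [
    {"name": "blue-a", "labels": {"app": "myapp", "version": "1.0.0"}},
    {"name": "blue-b", "labels": {"app": "myapp", "version": "1.0.0"}},
    {"name": "green-a", "labels": {"app": "myapp", "version": "2.0.0"}},
    {"name": "green-b", "labels": {"app": "myapp", "version": "2.0.0"}},
]

before = endpoints({"app": "myapp", "version": "1.0.0"}, pods)
after = endpoints({"app": "myapp", "version": "2.0.0"}, pods)

if __name__ == "__main__":
    print("before patch:", before)
    print("after patch: ", after)
```

A selector with only app: myapp would match all four pods, which is why the version label must be part of the Service selector for blue-green to work.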
Pod Disruption Budgets (PDB) for Cluster Maintenance
PDBs guarantee minimum availability during voluntary disruptions (node drains, updates):
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
spec:
  minAvailable: 2 # Always keep ≥2 pods running
  selector:
    matchLabels:
      app: myapp
---
# Alternative: maxUnavailable (use one form or the other, not both)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
spec:
  maxUnavailable: 1 # Allow 1 pod to be disrupted (same effect as minAvailable: 2 for 3 replicas)
  selector:
    matchLabels:
      app: myapp
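Production scenario: During a cluster upgrade, kubectl drain evicts pods through the Eviction API, which refuses any eviction that would violate the PDB. Nodes therefore drain only as fast as the budget allows, and your service stays available.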
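The ALLOWED DISRUPTIONS value reported by kubectl get pdb follows from simple arithmetic. Here is a sketch for integer budgets only (percentage values are not handled); the function name allowed_disruptions is an illustrative helper, not a Kubernetes API.

```python
def allowed_disruptions(healthy, expected, min_available=None, max_unavailable=None):
    """How many more pods may be evicted right now.

    healthy:  pods currently Ready
    expected: replica count the controller wants
    """
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        desired_healthy = min_available
    else:
        desired_healthy = expected - max_unavailable
    return max(0, healthy - desired_healthy)


if __name__ == "__main__":
    # 3 replicas, all Ready: both forms allow one eviction.
    print(allowed_disruptions(3, 3, min_available=2))
    print(allowed_disruptions(3, 3, max_unavailable=1))
    # One pod already down: the drain has to wait.
    print(allowed_disruptions(2, 3, min_available=2))
```

The two forms diverge once the replica count changes: with 10 replicas, minAvailable: 2 allows eight evictions while maxUnavailable: 1 still allows one.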
Part 5: Observability & Debugging Deployments
Comprehensive Logging Strategy
# View logs from multiple pods simultaneously
kubectl logs -l app=myapp -n production --tail=100 -f
# View logs from previous pod if current one crashed
kubectl logs POD_NAME -n production --previous
# Metrics: port-forward to Prometheus (assumes a Service named prometheus exists in this namespace)
kubectl port-forward -n production svc/prometheus 9090:9090
# Then visit http://localhost:9090 and query:
# rate(http_requests_total{job="myapp"}[5m]) # Requests per second
Debugging Deployment Issues
Symptom: Pods never reach “Running” state
kubectl describe deployment myapp -n production | grep -A5 "Conditions:"
# Check: ImagePullBackOff? Network issue? Insufficient resources?
# Solution: Check events
kubectl get events -n production --sort-by='.lastTimestamp' | tail -20
Symptom: Pods crash immediately (CrashLoopBackOff)
# View crash logs
kubectl logs POD_NAME -n production --previous
# Check exit code
kubectl describe pod POD_NAME -n production | grep "Exit Code"
# Exit code 1 = app error
# Exit code 137 = SIGKILL (128 + 9), usually the OOM killer; confirm with "Reason: OOMKilled" in kubectl describe pod
# Exit code 143 = SIGTERM (128 + 15), a requested shutdown
# If OOM: increase memory limit in Deployment spec
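Exit codes above 128 follow the shell convention of 128 plus the signal number, which is where 137 and 143 come from. A small decoder, as a sketch; describe_exit_code is a hypothetical helper and the wording of its messages is arbitrary.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Turn a container exit code into a short human-readable hint."""
    if code == 0:
        return "exited normally"
    if code > 128:
        number = code - 128
        try:
            name = signal.Signals(number).name
        except ValueError:
            # Signal number not defined on this platform.
            name = f"signal {number}"
        return f"terminated by {name}"
    return "application error"


if __name__ == "__main__":
    for code in (0, 1, 137, 143):
        print(code, "->", describe_exit_code(code))
```

A SIGKILL result alone does not prove an out-of-memory kill; the Reason field in kubectl describe pod is what confirms it.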
Symptom: Readiness probe failing, traffic not routing
# Test the readiness endpoint manually (requires curl in the image)
kubectl exec POD_NAME -n production -- curl -v http://127.0.0.1:3000/ready
# If it fails: debug the app directly (use /bin/sh if the image has no bash)
kubectl exec -it POD_NAME -n production -- /bin/bash
# (inside container) curl http://localhost:3000/ready
# Check app logs: tail -f /var/log/app.log
Part 6: Horizontal Pod Autoscaling (HPA)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # Scale up when avg CPU > 70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80 # Scale up when avg memory > 80%
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300 # Wait 5 min before scaling down
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60 # Scale down by max 50% per minute
    scaleUp:
      stabilizationWindowSeconds: 0 # Scale up immediately
      policies:
        - type: Percent
          value: 100 # Double replicas on spike
          periodSeconds: 60
Key insights:
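- CPU and memory utilization targets need the resource metrics API, typically provided by metrics-server; Prometheus metrics require a metrics adapter
- Utilization is measured against each container's resource requests, so the HPA only works if requests are set
- Set conservative scale-down windows to avoid flapping (rapid up/down)
- Aggressive scale-up (100% per min) to handle traffic spikes
- A PDB does not limit HPA scale-down; it only guards against evictions such as node drains, so keep minReplicas above the PDB's minAvailable or drains will block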
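The replica count the HPA aims for comes from the formula in the Kubernetes documentation: desired = ceil(current × currentMetric / targetMetric), taking the largest result across metrics and clamping to minReplicas and maxReplicas. The sketch below applies it to the manifest above; it assumes the default 10% tolerance and ignores the behavior policies and stabilization windows.

```python
import math


def desired_replicas(current, metrics, min_replicas, max_replicas, tolerance=0.1):
    """metrics: list of (observed_utilization, target_utilization) pairs."""
    proposals = []
    for observed, target in metrics:
        ratio = observed / target
        if abs(ratio - 1.0) <= tolerance:
            # Close enough to target: this metric asks for no change.
            proposals.append(current)
        else:
            proposals.append(math.ceil(current * ratio))
    # The metric asking for the most replicas wins, within the bounds.
    return max(min_replicas, min(max_replicas, max(proposals)))


if __name__ == "__main__":
    # 3 replicas, CPU at 140% of requests (target 70), memory at 60% (target 80).
    print(desired_replicas(3, [(140, 70), (60, 80)], min_replicas=3, max_replicas=10))
```

In this example CPU asks for 6 replicas and memory for 3, so the HPA targets 6, which the 100% per minute scale-up policy permits in a single step.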
Part 7: Network Policies for Multi-Tenant Isolation
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: myapp-network-policy
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: production # Label set automatically on every namespace
      ports:
        - protocol: TCP
          port: 3000
  egress:
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 5432 # PostgreSQL
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53 # DNS (most lookups use UDP)
        - protocol: TCP
          port: 53 # DNS over TCP for large responses
This policy ensures:
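- Inbound: only pods in namespaces matched by the namespaceSelector (here, production) reach port 3000. Traffic from an ingress controller running in another namespace is blocked unless you add a matching from rule
- Outbound: only to in-cluster pods on PostgreSQL (5432) and DNS (53)
- No egress to the internet, which limits data exfiltration paths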
Part 8: Deployment Troubleshooting Decision Tree
Pod Stuck? Detailed Debugging Flowchart
┌─ kubectl get pods -n production
│ (check STATUS of all pods)
│
├─ STATUS: "ImagePullBackOff"?
│ ├─ Yes → kubectl describe pod POD_NAME | grep -i image
│ │ ├─ Check: is registry URL correct? Is image tag correct?
│ │ ├─ Check: do you have registry credentials configured?
│ │ │ (secret in imagePullSecrets, docker login credentials)
│ │ └─ Fix: correct image field in deployment.yaml, apply again
│ └─ No → Continue to next check
│
├─ STATUS: "CrashLoopBackOff"?
│ ├─ Yes → kubectl logs POD_NAME --previous
│ │ ├─ Read the error (database connection? permission denied?)
│ │ ├─ Common causes:
│ │ │ - Database password wrong (check secret)
│ │ │ - File permissions (runAsUser, fsGroup wrong)
│ │ │ - Port already in use (sidecar conflict?)
│ │ └─ Fix: update deployment.yaml, kubectl apply
│ └─ No → Continue
│
├─ STATUS: "Pending"?
│ ├─ Yes → kubectl describe pod POD_NAME
│ │ ├─ Look for: "insufficient memory" or "insufficient cpu"
│ │ ├─ Check: kubectl describe node NODE_NAME (Allocated resources shows committed requests; kubectl top nodes shows live usage)
│ │ ├─ Fix options:
│ │ │ - Add more nodes to cluster
│ │ │ - Reduce resource requests (if safe)
│ │ │ - Delete pods from other deployments
│ │ └─ Verify: kubectl top pod (is actual usage matching requests?)
│ └─ No → Continue
│
├─ STATUS: "Running" but liveness failing?
│ ├─ Yes → kubectl logs POD_NAME (tail last 50 lines)
│ │ ├─ Is application healthy? (check /health endpoint)
│ │ ├─ Is port 3000 actually listening?
│ │ └─ Fix: adjust livenessProbe.initialDelaySeconds (wait longer)
│ └─ No → Continue
│
└─ STATUS: "Running" and healthy!
└─ ✓ Pod is working correctly
Step-by-Step Pod Debugging
Step 1: Verify pod exists and status
kubectl get pods -n production
# Output: myapp-xyz123 Running (or ImagePullBackOff, CrashLoopBackOff, etc.)
kubectl describe pod myapp-xyz123 -n production
# Look for: Events section at the bottom (most recent actions and failures)
Step 2: Check application logs
kubectl logs myapp-xyz123 -n production
# See what the application is doing
kubectl logs myapp-xyz123 --previous -n production
# If pod crashed, view logs from previous attempt
Step 3: Check resource constraints
kubectl top pod myapp-xyz123 -n production
# Is pod using requested resources? (CPU/Memory actual vs request)
kubectl top nodes
# Is cluster actually full?
Step 4: Verify networking
kubectl exec myapp-xyz123 -n production -- curl http://localhost:3000/health
# Can the pod reach its own health endpoint?
kubectl port-forward myapp-xyz123 3000:3000 -n production
# On your machine: curl http://localhost:3000/health
Deployment Rolling Update Stuck
Problem: "kubectl rollout status deployment/myapp" shows "waiting..."
Solution flowchart:
1. Check current rollout status
$ kubectl rollout status deployment/myapp -n production
(Output: Waiting for deployment rollout to finish...)
2. Check replica details
$ kubectl get rs -n production
(Look for: new ReplicaSet with READY below DESIRED while the old ReplicaSet still has pods)
3. Diagnose why new pods are not becoming Ready
$ kubectl describe pod NEW_POD_NAME -n production
(Check: readiness probe failures, ImagePullBackOff, or Pending. With maxUnavailable: 0, old pods are removed only after new pods are Ready. A PodDisruptionBudget does not block a rollout; it applies to evictions such as node drains.)
4. If the new pods cannot become Ready
Option A: Fix the manifest (image tag, probe path, resources) and re-apply
$ kubectl apply -f deployment.yaml
Option B: Rollback entire deployment
$ kubectl rollout undo deployment/myapp -n production
5. Verify rollout completes
$ kubectl rollout status deployment/myapp -n production
(Output: deployment "myapp" successfully rolled out)
Full Troubleshooting Reference Table
| Issue | Diagnosis | Fix |
|---|---|---|
| ImagePullBackOff | Registry/auth error | Check registry URL, credentials, image tag |
| CrashLoopBackOff | App crashes at startup | kubectl logs --previous, fix config/secret |
| Pending | No available node resources | kubectl top nodes, add nodes or reduce requests |
| Running but not ready | Readiness probe failing | Check the /ready endpoint and port; raise initialDelaySeconds if startup is slow |
| Rollout stuck | New pods not becoming Ready | Check probe failures and events on new pods, then fix or roll back |
| Service has no external IP | ClusterIP is internal only | Expose through the Ingress, or use LoadBalancer/NodePort |
Quick Troubleshooting Trees
Pod not running?
├─ Check node resources: kubectl top nodes
│ └─ If full: scale cluster (add nodes) or reduce resource requests
├─ Check image pull: kubectl describe pod POD_NAME
│ └─ If ImagePullBackOff: wrong registry, wrong image tag, authentication issue
├─ Check probes: kubectl describe pod POD_NAME (probe failures appear under Events)
│ └─ If readiness failing: endpoint not responding, port wrong
Deployment rolling update stuck?
├─ Check rollout status: kubectl rollout status deployment/myapp
├─ Check replica status: kubectl get replicasets
│ └─ If old RS still exists: new pods failing, old pods not scaling down
├─ Check new pods: kubectl describe pod NEW_POD_NAME (a PDB does not block rollouts; look for probe or scheduling failures)
└─ Roll back: kubectl rollout undo deployment/myapp (if the new version is failing)
Service not accessible from outside cluster?
├─ Check Service type: kubectl get service myapp
│ └─ If ClusterIP: use port-forward or change to LoadBalancer/NodePort
├─ Check Ingress: kubectl get ingress
│ └─ If configured: verify DNS points to ingress IP, TLS cert is valid
└─ Check NetworkPolicy: does policy allow traffic from outside namespace?
Conclusion
Kubernetes Deployments provide the machinery for production-grade workload management: declarative desired state, rolling updates with zero downtime, health probe–based traffic management, and resource governance. The Pod Disruption Budget ensures availability during cluster maintenance.
Build on this with k3s Kubernetes Install on Ubuntu and Docker Networking for Container Isolation.
People Also Ask
What is the difference between requests and limits in Kubernetes?
requests is the amount of CPU/memory reserved for the container: the scheduler uses requests to decide which node to place the pod on. limits is the maximum the container can use: if it exceeds the memory limit, it gets OOM-killed; if it exceeds the CPU limit, it gets throttled. Set requests conservatively (what the container needs under normal load) and limits generously (the maximum safe usage before it becomes a noisy neighbour). A container with neither requests nor limits is treated as requesting nothing, so the scheduler may place it on an already-overloaded node. If only limits are set, requests default to the limits.
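The placement rule can be made concrete: a pod fits on a node only if its requests fit into what is left after subtracting the requests of pods already there, regardless of actual usage. The sketch below is a simplified model of that one check, not the real scheduler; the node size and helper names are hypothetical, and only the m, Ki, Mi and Gi suffixes are parsed.

```python
def cpu_millis(quantity: str) -> int:
    """Parse a CPU quantity such as "100m" or "2" into millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def mem_bytes(quantity: str) -> int:
    """Parse a memory quantity such as "128Mi" into bytes."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def fits(allocatable: dict, committed: list, pod: dict) -> bool:
    """True if the pod's requests fit into the node's uncommitted capacity."""
    cpu_free = cpu_millis(allocatable["cpu"]) - sum(cpu_millis(r["cpu"]) for r in committed)
    mem_free = mem_bytes(allocatable["memory"]) - sum(mem_bytes(r["memory"]) for r in committed)
    return cpu_millis(pod["cpu"]) <= cpu_free and mem_bytes(pod["memory"]) <= mem_free


if __name__ == "__main__":
    node = {"cpu": "2", "memory": "4Gi"}        # hypothetical node
    myapp = {"cpu": "100m", "memory": "128Mi"}  # requests from deployment.yaml
    print(fits(node, [{"cpu": "1900m", "memory": "1Gi"}], myapp))
    print(fits(node, [{"cpu": "1950m", "memory": "1Gi"}], myapp))
```

The second call fails on CPU even if the node is idle in practice, which is why a pod can sit in Pending on a cluster that kubectl top nodes shows as lightly loaded.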
Further Reading
- k3s Kubernetes Install on Ubuntu 24.04 — prerequisite: install k3s before Deployments
- Kubernetes RBAC and Pod Security — detailed in Part 7
- Docker Compose Tutorial 2026 — simpler alternative for single-server deployments
Tested on: Ubuntu 24.04 LTS (3× Hetzner CX22). k3s v1.32.2. Last verified: April 30, 2026.