🔹 3. Copying a Pod for Post-Mortem Debugging
Sometimes a Pod crashes immediately (e.g., due to bad config, missing secrets, startup errors). In such cases, you don’t have enough time to attach a shell or inject an ephemeral container before it dies.
👉 Solution: use kubectl debug --copy-to to clone the Pod into a stable version that won’t crash, so you can perform a post-mortem analysis.
⸻
🔹 Step 1: Clone the Pod into a Debug Version
kubectl debug pod/my-crashing-pod \
--copy-to=postmortem-pod \
--set-image=*=busybox \
--container=app \
-- sh -c "sleep 1d"
• --copy-to=postmortem-pod → Creates a new Pod called postmortem-pod with the same spec (volumes, env vars, secrets) as the original.
• --set-image=*=busybox → Replaces every container image in the copy with busybox, a small, stable image that ships a shell, instead of the broken one.
• --container=app → The container whose command is overridden (replace app with your container's name).
• sh -c "sleep 1d" → Keeps the container running for 1 day (adjust as needed), preventing an immediate crash. Note that busybox ships sh, not bash.
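Before moving on, it helps to confirm the copy actually settled into a Running state (the original may well still be crash-looping):
# The debug copy should be Running; the original can stay broken
kubectl get pod postmortem-pod my-crashing-pod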
🔹 Step 2: Exec into the Stable Copy
kubectl exec -it postmortem-pod -- sh
Now you have an interactive shell inside the cloned Pod.
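If the copied Pod has more than one container, point kubectl exec at the right one with -c (the container name app below is just an example):
# Exec into a specific container of the copy
kubectl exec -it postmortem-pod -c app -- sh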
⸻
🔹 What Can You Inspect?
Once inside, you can check:
• Configuration files
cat /etc/config/app.conf
• Mounted secrets
ls /var/run/secrets/kubernetes.io/serviceaccount
• Persistent volumes (PVCs) → same mounts as the original Pod.
• Environment variables
env | grep DB_
• Application logs left behind in mounted volumes.
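Pulling those checks together, a minimal inspection pass inside the copy might look like this (the paths are illustrative, adjust them to your app):
# Quick inspection pass (paths are examples)
env | sort                                         # env vars actually injected
ls -R /etc/config 2>/dev/null                      # mounted ConfigMap files
ls /var/run/secrets/kubernetes.io/serviceaccount   # service account mount
mount | grep -iE 'secret|configmap|kubernetes'     # what is actually mounted where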
This helps you verify if the issue was caused by:
• Wrong config or env vars
• Missing secrets/config maps
• Corrupted volume mounts
• Crash-looping due to command misconfiguration
Why This Is Powerful
✅ Gives you time to debug a Pod that would otherwise crash instantly.
✅ Preserves volumes, secrets, and environment variables for accurate debugging.
✅ Lets you swap the container image for a debug-friendly one (busybox, ubuntu, netshoot, etc.).
✅ Non-destructive — original Pod stays intact (though it may still be crash-looping).
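Since the copy is a standalone Pod (not managed by the original Deployment or ReplicaSet), remember to clean it up once the investigation is finished:
# Remove the debug copy when you are done
kubectl delete pod postmortem-pod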
Real-World Example: Debugging a Crashing App
Suppose my-crashing-pod is failing because of a missing DB connection string.
• You clone it with --copy-to.
• Exec in, run env | grep DB_, and discover the variable is not set.
• You check the ConfigMap/Secret mount, realize it’s missing.
• Root cause: the Deployment forgot to mount the db-secret.
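A sketch of how the fix could look from the CLI, assuming the workload is a Deployment named my-app (an illustrative name; db-secret is the Secret from the example above):
# Confirm the Secret exists, then inject its keys as env vars into the Deployment
kubectl get secret db-secret
kubectl set env deployment/my-app --from=secret/db-secret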
Visual Sequence (Mermaid)
sequenceDiagram
participant User
participant kubectl
participant API as Kubernetes API Server
participant PodCrasher as my-crashing-pod (CrashLoopBackOff)
participant PodClone as postmortem-pod (stable copy)
User->>kubectl: kubectl debug pod/my-crashing-pod --copy-to=postmortem-pod
kubectl->>API: Request Pod copy with swapped image (busybox) and sleep command
API->>PodClone: Create postmortem-pod with same config/volumes/env
User->>PodClone: kubectl exec -it postmortem-pod -- sh
PodClone->>User: Stable shell for inspection
Summary:
When Pods crash too quickly to debug, cloning them with kubectl debug --copy-to gives you a stable replica for investigation. This allows full inspection of config, volumes, secrets, and logs without modifying the original Pod.
🔹 4. Reading Logs Across Container Restarts
When Pods crash or restart, simply running kubectl logs shows logs from the currently running container. This means you miss the previous attempt (which may contain the real cause of the crash).
👉 Kubernetes stores logs for both the current and the last terminated instance of a container.
Viewing the Previous Container’s Logs
# Logs from the last run before restart
kubectl logs my-app-pod -c app --previous
• -c app → If your Pod has multiple containers, specify which one.
• --previous → Fetch logs from the last terminated instance (before the container restarted).
This is essential for debugging CrashLoopBackOff situations, where the Pod dies quickly and restarts.
⸻
Handling Multiple Restarts
If a Pod restarts many times, you'll often need logs from each failed run. One approach is to sample the logs repeatedly while the crashes are happening, so each new "previous" instance is captured before the next restart overwrites it:
for i in {1..5}; do
  echo "--- Capture #$i ($(date)) ---"
  kubectl logs my-app-pod -c app --previous --since=1h
  sleep 60   # give the Pod time to crash and restart before the next capture
done
• --since=1h → Restricts each capture to the last hour (avoids giant logs).
• Each iteration re-reads the logs of the most recently terminated instance; the sleep only pays off if the Pod actually restarts between captures.
🔑 Note: Kubernetes only retains logs for the last terminated container instance, not the full restart history. For deeper history, you need a log aggregator (e.g., EFK/ELK, Loki, Datadog, Splunk).
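If you need to keep more than that single retained instance but don't have an aggregator in place yet, a simple stopgap is to snapshot each crash's logs to a file before the next restart overwrites them:
# Save the previous instance's logs with a timestamp in the filename
kubectl logs my-app-pod -c app --previous > "crash-$(date +%Y%m%dT%H%M%S).log"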
Trimming Large Logs
When containers generate huge logs, --since and --tail are lifesavers:
# Last 100 lines from the previous instance
kubectl logs my-app-pod -c app --previous --tail=100
# Logs from the last 30 minutes
kubectl logs my-app-pod -c app --previous --since=30m
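These flags combine, and adding --timestamps makes it easier to line the log output up with cluster events:
# Last 200 lines of the previous instance, with timestamps prepended
kubectl logs my-app-pod -c app --previous --tail=200 --timestamps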
Debugging Workflow
1. Check Pod status & restarts
kubectl get pod my-app-pod
kubectl describe pod my-app-pod
Look at Restart Count and termination reasons (a jsonpath shortcut for this is sketched right after this list).
2. Fetch logs from the last crash
kubectl logs my-app-pod -c app --previous
3. Filter for timeframe or lines if logs are too large.
4. Escalate to log aggregation if you need full history beyond one restart.
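As mentioned in step 1, the restart count and last termination state can be pulled directly from the Pod's status fields instead of reading the full describe output:
# Restart count of the first container
kubectl get pod my-app-pod -o jsonpath='{.status.containerStatuses[0].restartCount}{"\n"}'
# Reason and exit code of the last terminated run
kubectl get pod my-app-pod -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}'
kubectl get pod my-app-pod -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}{"\n"}'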
Visual Flow (Mermaid)
sequenceDiagram
participant User
participant kubectl
participant KubeAPI
participant Pod
participant Container
User->>kubectl: kubectl logs my-app-pod -c app
kubectl->>KubeAPI: Request logs (current container)
KubeAPI->>Container: Fetch running logs
Container->>User: Returns logs (current only)
User->>kubectl: kubectl logs my-app-pod -c app --previous
kubectl->>KubeAPI: Request logs (terminated container)
KubeAPI->>Pod: Get last restart logs
Pod->>User: Returns logs from previous instance
Summary:
• kubectl logs → current instance logs.
• kubectl logs --previous → last terminated instance logs (great for crash debugging).
• Use --since / --tail to keep logs manageable.
• For full restart history, integrate a centralized logging solution (ELK, Loki, etc.).
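For quick ad-hoc filtering without a logging stack, ordinary shell pipes are enough (the keyword pattern below is just an example):
# Scan the previous and current instances for common failure keywords
kubectl logs my-app-pod -c app --previous | grep -iE 'error|fatal|panic'
kubectl logs my-app-pod -c app | grep -iE 'error|fatal|panic'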
⸻