Sunday, September 7, 2025

Kubernetes additional tips - part 2

🔹 3. Copying a Pod for Post-Mortem Debugging


Sometimes a Pod crashes immediately (e.g., due to bad config, missing secrets, startup errors). In such cases, you don’t have enough time to attach a shell or inject an ephemeral container before it dies.


👉 Solution: use kubectl debug --copy-to to clone the Pod into a stable version that won’t crash, so you can perform a post-mortem analysis.



🔹 Step 1: Clone the Pod into a Debug Version



kubectl debug pod/my-crashing-pod \
  --copy-to=postmortem-pod \
  --container=app \
  --image=busybox \
  -- sh -c "sleep 1d"



--copy-to=postmortem-pod → Creates a new Pod called postmortem-pod with the same spec (volumes, env vars, service account) as the original.

--container=app → Targets the existing container to modify. Substitute the container name from your own Pod spec.

--image=busybox → Replaces the broken image with a small, stable one so the copy can actually start (use ubuntu or netshoot if you need richer tooling).

sh -c "sleep 1d" → Overrides the command so the container simply sleeps for 1 day (adjust as needed) instead of crashing on startup. Note that busybox ships sh, not bash.
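Before moving on, it can help to confirm the copy is actually Running and picked up the debug image (a quick, optional sanity check):


kubectl get pod postmortem-pod
kubectl get pod postmortem-pod -o jsonpath='{.spec.containers[*].image}{"\n"}'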



🔹 Step 2: Exec into the Stable Copy


kubectl exec -it postmortem-pod -- sh


Now you have an interactive shell inside the cloned Pod.



🔹 What Can You Inspect?


Once inside, you can check:

Configuration files


cat /etc/config/app.conf


Mounted secrets


ls /var/run/secrets/kubernetes.io/serviceaccount


Persistent volumes (PVCs) → same mounts as original Pod.
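For example, to see what is actually mounted (the /data path below is only illustrative; check the Pod spec's volumeMounts for the real paths):


# List mounted filesystems, then peek at a data volume
mount | grep -v tmpfs
ls -lt /data | head    # adjust /data to your volume's mountPath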

Environment variables


env | grep DB_


Application logs left behind in mounted volumes.
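For instance, if the app writes its logs to a mounted volume (the path and filename below are illustrative):


ls -lt /var/log/myapp
tail -n 100 /var/log/myapp/error.log    # adjust to your app's log location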


This helps you verify if the issue was caused by:

Wrong config or env vars

Missing secrets/config maps

Corrupted volume mounts

Crash-looping due to command misconfiguration



Why This Is Powerful


✅ Gives you time to debug a Pod that would otherwise crash instantly.

✅ Preserves volumes, secrets, and environment variables for accurate debugging.

✅ Lets you swap the container image for a debug-friendly one (busybox, ubuntu, netshoot, etc.).

✅ Non-destructive — original Pod stays intact (though it may still be crash-looping).



Real-World Example: Debugging a Crashing App


Suppose my-crashing-pod is failing because of a missing DB connection string.

1. You clone it with --copy-to.

2. Exec in, run env | grep DB_, and discover the variable is not set.

3. You check the ConfigMap/Secret mount and realize it's missing.

4. Root cause: the Deployment forgot to mount the db-secret.
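You can confirm the same thing from outside the Pod. A quick sketch, assuming the workload is a Deployment named my-app and the Secret should be db-secret:


# Does the Secret exist, and does the Pod template reference it?
kubectl get secret db-secret
kubectl get deployment my-app -o jsonpath='{.spec.template.spec.volumes}{"\n"}'
kubectl get deployment my-app -o jsonpath='{.spec.template.spec.containers[*].envFrom}{"\n"}'

# One possible fix: inject the Secret's keys as environment variables
kubectl set env deployment/my-app --from=secret/db-secret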




 Visual Sequence (Mermaid)


sequenceDiagram
    participant User
    participant kubectl
    participant API as Kubernetes API Server
    participant PodCrasher as my-crashing-pod (CrashLoopBackOff)
    participant PodClone as postmortem-pod (stable copy)

    User->>kubectl: kubectl debug pod/my-crashing-pod --copy-to=postmortem-pod
    kubectl->>API: Request clone with new image (busybox)
    API->>PodClone: Create postmortem-pod with same config/volumes/env
    User->>PodClone: kubectl exec -it postmortem-pod -- sh
    PodClone->>User: Stable shell for inspection


Summary:

When Pods crash too quickly to debug, cloning them with kubectl debug --copy-to gives you a stable replica for investigation. This allows full inspection of config, volumes, secrets, and logs without modifying the original Pod.



🔹 4. Reading Logs Across Container Restarts


When Pods crash or restart, simply running kubectl logs shows logs from the current running container. This means you miss the previous attempt (which may contain the real cause of the crash).


👉 Kubernetes stores logs for both the current and the last terminated instance of a container.



Viewing the Previous Container’s Logs


# Logs from the last run before restart

kubectl logs my-app-pod -c app --previous


-c app → If your Pod has multiple containers, specify which one.

--previous → Fetch logs from the last terminated instance (before the container restarted).


This is essential for debugging CrashLoopBackOff situations, where the Pod dies quickly and restarts.
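It can also help to add --timestamps, so you can line the previous run's output up with the Pod events shown by kubectl describe:


kubectl logs my-app-pod -c app --previous --timestamps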




 Handling Multiple Restarts


If a Pod restarts many times, you'll want the logs from each failed run. Kubernetes only keeps the most recent terminated instance (see the note below), so the practical approach is to snapshot the previous logs each time the container restarts, before the next crash overwrites them:


# Snapshot the previous logs every 30s, one file per restart count
while true; do
  restarts=$(kubectl get pod my-app-pod \
    -o jsonpath='{.status.containerStatuses[?(@.name=="app")].restartCount}')
  kubectl logs my-app-pod -c app --previous --since=1h > "crash-${restarts}.log"
  sleep 30
done


--since=1h → Restricts each snapshot to the last hour (avoids giant logs).

Each crash-N.log ends up holding the run that terminated just before restart number N.


🔑 Note: Kubernetes only retains logs for the last terminated container instance, not the full restart history. For deeper history, you need a log aggregator (e.g., EFK/ELK, Loki, Datadog, Splunk).


 Trimming Large Logs


When containers generate huge logs, --since and --tail are lifesavers:


# Last 100 lines from the previous instance

kubectl logs my-app-pod -c app --previous --tail=100


# Logs from the last 30 minutes

kubectl logs my-app-pod -c app --previous --since=30m


 Debugging Workflow

1. Check Pod status & restarts


kubectl get pod my-app-pod

kubectl describe pod my-app-pod


Look at Restart Count and termination reasons.


2. Fetch logs from the last crash


kubectl logs my-app-pod -c app --previous


3. Filter for timeframe or lines if logs are too large.

4. Escalate to log aggregation if you need full history beyond one restart.
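As a shortcut for step 1, you can pull the last termination reason and exit code straight from the Pod status (assuming the container is named app, as above):


kubectl get pod my-app-pod -o jsonpath='{.status.containerStatuses[?(@.name=="app")].lastState.terminated.reason}{"\n"}'
kubectl get pod my-app-pod -o jsonpath='{.status.containerStatuses[?(@.name=="app")].lastState.terminated.exitCode}{"\n"}'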



Visual Flow (Mermaid)

sequenceDiagram

    participant User

    participant kubectl

    participant KubeAPI

    participant Pod

    participant Container


    User->>kubectl: kubectl logs my-app-pod -c app

    kubectl->>KubeAPI: Request logs (current container)

    KubeAPI->>Container: Fetch running logs

    Container->>User: Returns logs (current only)


    User->>kubectl: kubectl logs my-app-pod -c app --previous

    kubectl->>KubeAPI: Request logs (terminated container)

    KubeAPI->>Pod: Get last restart logs

    Pod->>User: Returns logs from previous instance


 Summary:

kubectl logs → current instance logs.

kubectl logs --previous → last terminated instance logs (great for crash debugging).

Use --since / --tail to keep logs manageable.

For full restart history, integrate a centralized logging solution (ELK, Loki, etc.).





