Pending Pods

1. Summary

A pod that stays in Pending usually means the scheduler cannot place it: no node matches its constraints, capacity is insufficient, a quota limit has been reached, or a volume cannot be bound or attached.

flowchart TD
    A[Symptom] --> B[Hypotheses]
    B --> C[Evidence]
    C --> D[Disprove weak paths]
    D --> E[Mitigation]

2. Common Misreadings

  • The first visible symptom is the root cause.
  • Restarting the pod proves the issue is fixed.
  • If only one namespace is affected, the rest of the cluster is healthy.

3. Competing Hypotheses

  • H1: No node satisfies the pod's node selector or affinity rules, or the pod does not tolerate the node taints.
  • H2: CPU or memory requests exceed available capacity.
  • H3: Cluster autoscaler cannot add nodes.
  • H4: Persistent volume binding is blocking scheduling.

4. What to Check First

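# Scheduler events for the pending pod (see the Events section of the output)
kubectl describe pod <pod-name> -n <namespace>
# Nodes with their AKS node pool label shown as an extra column
kubectl get nodes -L kubernetes.azure.com/agentpool
# PVC binding status across all namespaces
kubectl get pvc -A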

5. Evidence to Collect

  • Scheduler events from kubectl describe pod.
  • Node labels, taints, and allocatable resources.
  • Autoscaler status and node pool min/max bounds.
  • PVC binding status.
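
The evidence items above can be gathered with a few read-only commands. The following is a minimal sketch, not a definitive procedure: the function names are illustrative, and the autoscaler check assumes the cluster autoscaler publishes its usual `cluster-autoscaler-status` ConfigMap in `kube-system`.

```shell
# Sketch only: function names are hypothetical helpers, not part of kubectl.

# Pods currently in Pending, across all namespaces.
list_pending_pods() {
  kubectl get pods -A --field-selector=status.phase=Pending
}

# FailedScheduling events in one namespace ($1), sorted by last timestamp.
scheduler_events() {
  kubectl get events -n "$1" --field-selector reason=FailedScheduling --sort-by=.lastTimestamp
}

# Taint keys and allocatable CPU/memory for every node.
node_capacity() {
  kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
}

# Autoscaler status (assumes the status ConfigMap exists in kube-system).
autoscaler_status() {
  kubectl get configmap cluster-autoscaler-status -n kube-system -o yaml
}

# PVC binding status in one namespace ($1).
pvc_status() {
  kubectl get pvc -n "$1"
}
```

Each helper only reads cluster state, so it should be safe to run while the incident is ongoing.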

6. Validation and Disproof by Hypothesis

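  • A FailedScheduling event reading "0/NN nodes are available" means the pod was never assigned to a node, so no image pull has been attempted; this usually rules out image-related causes.
  • If a pod requests more CPU or memory than any single node can offer, fix the requests before expanding the cluster.
  • If the autoscaler is enabled but no nodes appear, inspect compute quota and subnet address capacity.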
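
One way to route an event message to hypotheses H1 to H4 is simple pattern matching on the message text. The sketch below is illustrative only: `classify_scheduling_message` is a hypothetical helper, the patterns cover common scheduler and autoscaler message fragments rather than every variant, and a message that lists several reasons is reported under the first pattern that matches.

```shell
# classify_scheduling_message MESSAGE
# Prints H1..H4 for the first matching pattern, or "unknown".
# Hypothetical helper; the pattern list is an assumption and is not exhaustive.
classify_scheduling_message() {
  case "$1" in
    *"unbound immediate PersistentVolumeClaims"*|*"volume node affinity conflict"*)
      echo "H4" ;;  # volume binding blocks scheduling
    *"didn't trigger scale-up"*|*"max node group size reached"*)
      echo "H3" ;;  # autoscaler could not add nodes
    *"Insufficient cpu"*|*"Insufficient memory"*)
      echo "H2" ;;  # requests exceed available capacity
    *"affinity/selector"*|*"didn't match node selector"*|*"untolerated taint"*|*"didn't tolerate"*)
      echo "H1" ;;  # selectors, taints, or affinity
    *)
      echo "unknown" ;;
  esac
}

# Example:
#   classify_scheduling_message "0/3 nodes are available: 3 Insufficient cpu."
```

An "unknown" result is a cue to read the full event text rather than evidence that none of the hypotheses applies.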

7. Likely Root Cause Patterns

  • Over-sized resource requests.
  • Overly restrictive affinity rules or missing tolerations.
  • Autoscaler max node count reached.
  • Storage class or PVC mismatch.
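
For the over-sized request pattern, a quick check is whether a pod's CPU request could fit on an empty node at all. The sketch below compares a request with a node's allocatable CPU; both helper names are hypothetical, and the conversion handles only whole cores ("2") and millicores ("500m"), not fractional values such as "0.5".

```shell
# cpu_to_millicores QUANTITY
# Converts "500m" to 500 and "2" to 2000. Sketch: other forms are not handled.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

# request_fits REQUEST ALLOCATABLE
# Succeeds when the request is no larger than the node's allocatable CPU.
# It ignores pods already running on the node, so a pass is not a guarantee.
request_fits() {
  [ "$(cpu_to_millicores "$1")" -le "$(cpu_to_millicores "$2")" ]
}

# Example:
#   request_fits 4 3860m || echo "request exceeds an empty node"
```

A failure here points to fixing the request or choosing a larger node size; adding more nodes of the same size would not help.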

8. Immediate Mitigations

  • Relax the scheduling constraints or add a node pool that satisfies them.
  • Expand autoscaler bounds if quota allows.
  • Resolve PVC binding issues.
  • Re-run scheduling validation after each change.
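
Expanding autoscaler bounds might look like the sketch below, assuming an AKS cluster managed with the Azure CLI. The wrapper name and its argument order are hypothetical; the resource group, cluster, and node pool names are supplied by the caller.

```shell
# expand_autoscaler_bounds RESOURCE_GROUP CLUSTER NODEPOOL MIN MAX
# Hypothetical wrapper: rejects inverted bounds, then updates the node pool's
# cluster autoscaler range through the Azure CLI (assumes az is logged in).
expand_autoscaler_bounds() {
  if [ "$4" -gt "$5" ]; then
    echo "min count $4 exceeds max count $5" >&2
    return 1
  fi
  az aks nodepool update \
    --resource-group "$1" \
    --cluster-name "$2" \
    --name "$3" \
    --update-cluster-autoscaler \
    --min-count "$4" \
    --max-count "$5"
}

# Example:
#   expand_autoscaler_bounds <resource-group> <cluster> <nodepool> 1 5
```

Raising the maximum only helps if compute quota and subnet capacity allow the extra nodes, so check those first.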

9. Prevention

  • Review scheduler events during every release.
  • Keep namespace quotas aligned with real capacity.
  • Test new affinity or taint strategies outside production first.
