
Diagnostic Commands

These commands are grouped by investigation goal so you can collect evidence quickly during AKS incidents.

Topic/Command Groups

flowchart TD
    A[Diagnostics] --> B[Cluster State]
    A --> C[Pods]
    A --> D[Networking]
    A --> E[Nodes]
    A --> F[Azure Resource State]

Cluster and workload state

kubectl get nodes -o wide
kubectl get pods -A -o wide
kubectl get events -A --sort-by=.lastTimestamp
kubectl top nodes
kubectl top pods -A
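On a busy cluster the full listings above can be long. As a minimal sketch, field selectors can narrow them to likely problem objects. The function names `unhealthy_pods` and `warning_events` are hypothetical wrappers introduced here for illustration; they are not part of kubectl.

```shell
# Hypothetical wrappers around kubectl; the names are illustrative only.

# List pods whose phase is neither Running nor Succeeded, in all namespaces.
unhealthy_pods() {
  kubectl get pods -A -o wide \
    --field-selector='status.phase!=Running,status.phase!=Succeeded'
}

# List only Warning events in all namespaces, sorted by last timestamp.
warning_events() {
  kubectl get events -A --field-selector type=Warning --sort-by=.lastTimestamp
}
```

A pod stuck in CrashLoopBackOff usually still reports the Running phase, so the phase filter will not show it. Check the RESTARTS column of the unfiltered listing for those cases.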

Pod-specific investigation

kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous
kubectl get deployment <deployment-name> -n <namespace> -o yaml
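When a healthy pod from the same workload is available, diffing its manifest against the failing pod can surface differences faster than reading either one alone. The sketch below assumes both pods are in the same namespace; `diff_pods` is a hypothetical helper name.

```shell
# Hypothetical helper: diff the manifests of a failing pod and a healthy pod.
# Usage: diff_pods <namespace> <failing-pod> <healthy-pod>
diff_pods() {
  _ns="$1"; _bad="$2"; _good="$3"
  _tmp="$(mktemp -d)" || return 2
  kubectl get pod "$_bad" -n "$_ns" -o yaml >"$_tmp/failing.yaml"
  kubectl get pod "$_good" -n "$_ns" -o yaml >"$_tmp/healthy.yaml"
  # diff exits with status 1 when the files differ; that is expected here.
  diff -u "$_tmp/failing.yaml" "$_tmp/healthy.yaml"
  _rc=$?
  rm -rf "$_tmp"
  return "$_rc"
}
```

Names, UIDs, timestamps, and status fields always differ between two pods. The lines worth reading are usually the image, environment, resource, and node assignment fields.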

Networking and ingress

kubectl get svc -A
kubectl get endpoints -A
kubectl get ingress -A
kubectl describe ingress <ingress-name> -n <namespace>
kubectl exec -it <pod-name> -n <namespace> -- nslookup <service-name>
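The `kubectl exec` check above only works if the application image ships `nslookup`. One possible fallback is to resolve the name from a short-lived pod instead. This sketch assumes the cluster can pull `busybox:1.36` from a public registry; the helper name `dns_check` and the pod name `dns-check` are hypothetical.

```shell
# Hypothetical helper: resolve a service name from a short-lived busybox pod.
# Assumes the cluster is allowed to pull the busybox:1.36 image.
# Usage: dns_check <namespace> <service-name>
dns_check() {
  kubectl run dns-check --rm -i --restart=Never \
    --namespace "$1" --image=busybox:1.36 -- nslookup "$2"
}
```

Unlike the other commands in this group, this one creates a pod (removed again by `--rm`), so it is not strictly read-only.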

Azure-side checks

az aks show --resource-group "$RG" --name "$CLUSTER_NAME" --output yaml
az aks nodepool list --resource-group "$RG" --cluster-name "$CLUSTER_NAME" --output table
az vm list-usage --location "$LOCATION" --output table
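The Azure commands above expect `RG`, `CLUSTER_NAME`, and `LOCATION` to be set in the shell. A minimal sketch of one way to populate them follows. The resource group and cluster names are placeholder values, and the function names are hypothetical; the `--query` paths read the `location` and `nodeResourceGroup` fields from the `az aks show` output.

```shell
# Placeholder values (assumptions): substitute your own names.
RG="my-resource-group"
CLUSTER_NAME="my-aks-cluster"

# Print the Azure region of the cluster, for use as $LOCATION.
cluster_location() {
  az aks show --resource-group "$RG" --name "$CLUSTER_NAME" \
    --query location --output tsv
}

# Print the managed resource group that holds the cluster's node resources.
node_resource_group() {
  az aks show --resource-group "$RG" --name "$CLUSTER_NAME" \
    --query nodeResourceGroup --output tsv
}
```

With these defined, `LOCATION="$(cluster_location)"` sets the region before running the quota check.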

Usage Notes

  • Capture command output in incident notes before changing configuration.
  • Prefer comparing one failing object with one healthy object in the same cluster.
  • During severe incidents, start with read-only commands and broaden the investigation only as the evidence narrows the likely cause.
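The first note, capturing output before changing anything, could be sketched as a small wrapper that runs a command and saves its output together with the command line and a UTC timestamp. The `capture` function and the `EVIDENCE_DIR` variable are hypothetical names used only for this example.

```shell
# Hypothetical helper; the directory layout and names are illustrative.
EVIDENCE_DIR="${EVIDENCE_DIR:-./incident-evidence}"

# Usage: capture <label> <command> [args...]
# Runs the command and saves its stdout and stderr, with a header, to a file.
capture() {
  _label="$1"; shift
  mkdir -p "$EVIDENCE_DIR" || return 2
  _out="$EVIDENCE_DIR/${_label}.txt"
  {
    printf '# command: %s\n' "$*"
    printf '# captured (UTC): %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    "$@"
  } >"$_out" 2>&1
  _rc=$?
  printf '%s saved to %s (exit %s)\n' "$_label" "$_out" "$_rc"
  return "$_rc"
}
```

For example, `capture nodes kubectl get nodes -o wide` would write the node listing to `nodes.txt` under the evidence directory. Reusing a label overwrites the earlier file, so pick distinct labels for repeated captures.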
