Diagnostic Commands¶

These commands are grouped by investigation goal so you can collect evidence quickly during AKS incidents.

Topic/Command Groups¶

flowchart TD
    A[Diagnostics] --> B[Cluster State]
    A --> C[Pods]
    A --> D[Networking]
    A --> E[Nodes]
    A --> F[Azure Resource State]

Cluster and workload state¶

kubectl get nodes -o wide
kubectl get pods -A -o wide
kubectl get events -A --sort-by=.lastTimestamp
kubectl top nodes
kubectl top pods -A
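
On a busy cluster the full pod listing is long, so it can help to narrow it to pods that are not healthy. The sketch below is one possible way to do that, assuming the default column order of `kubectl get pods -A` (NAMESPACE, NAME, READY, STATUS, ...). The function name `filter_unhealthy_pods` is hypothetical and not part of kubectl.

```shell
# Hypothetical helper: keep only rows whose STATUS column (field 4) is
# neither Running nor Completed.
# Expects `kubectl get pods -A --no-headers` output on stdin.
filter_unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed"'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods -A --no-headers | filter_unhealthy_pods
#   kubectl get events -A --field-selector type=Warning --sort-by=.lastTimestamp
```

A pod can report `Running` while its containers are not ready, so the READY column in the full listing is still worth checking.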

Pod-specific investigation¶

kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous
kubectl get deployment <deployment-name> -n <namespace> -o yaml
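
The pod commands above can be wrapped so that their output lands in files instead of scrolling past. This is a minimal sketch, not a standard tool: the function name `collect_pod_evidence`, the `KUBECTL` override variable, and the `./evidence` default directory are all assumptions made for illustration.

```shell
# Hypothetical helper: save describe output and logs for one pod into a directory.
# KUBECTL may be overridden (for example for a dry run); it defaults to kubectl.
collect_pod_evidence() {
  pod="$1"
  ns="$2"
  outdir="${3:-./evidence}"
  kubectl_cmd="${KUBECTL:-kubectl}"

  mkdir -p "$outdir" || return 1
  "$kubectl_cmd" describe pod "$pod" -n "$ns" > "$outdir/$pod.describe.txt"
  "$kubectl_cmd" logs "$pod" -n "$ns" > "$outdir/$pod.logs.txt"
  # --previous fails when the container has not restarted; keep going in that case.
  "$kubectl_cmd" logs "$pod" -n "$ns" --previous > "$outdir/$pod.previous.txt" 2>/dev/null || true
}

# Usage (requires kubectl access):
#   collect_pod_evidence <pod-name> <namespace> ./incident-notes
```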

Networking and ingress¶

kubectl get svc -A
kubectl get endpoints -A
kubectl get ingress -A
kubectl describe ingress <ingress-name> -n <namespace>
# Requires nslookup to be present in the container image; minimal images may not include it.
kubectl exec -it <pod-name> -n <namespace> -- nslookup <service-name>
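
A Service with no backing endpoints is a common cause of connection failures, and it is easy to miss in a long listing. The sketch below filters the endpoints listing for that case, assuming the default column order (NAMESPACE, NAME, ENDPOINTS, AGE) and that an empty endpoint list is printed as `<none>`. The function name `filter_empty_endpoints` is hypothetical.

```shell
# Hypothetical helper: print namespace/name for every Endpoints object
# whose ENDPOINTS column (field 3) is <none>.
# Expects `kubectl get endpoints -A --no-headers` output on stdin.
filter_empty_endpoints() {
  awk '$3 == "<none>" { print $1 "/" $2 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get endpoints -A --no-headers | filter_empty_endpoints
```

For any Service this reports, comparing the Service selector with the pod labels is usually the next step.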

Azure-side checks¶

az aks show --resource-group "$RG" --name "$CLUSTER_NAME" --output yaml
az aks nodepool list --resource-group "$RG" --cluster-name "$CLUSTER_NAME" --output table
az vm list-usage --location "$LOCATION" --output table
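
The Azure commands above depend on `RG`, `CLUSTER_NAME`, and `LOCATION` being set in the shell. A small guard, sketched below, avoids confusing CLI errors when one of them is empty. The function names `require_vars` and `show_cluster_summary` are hypothetical, and the `--query` projection is just one possible selection of fields from the `az aks show` output.

```shell
# Hypothetical guard: fail early if any named variable is unset or empty.
require_vars() {
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
}

# Hypothetical wrapper: print a short state summary instead of the full YAML.
# Requires the Azure CLI and a logged-in session.
show_cluster_summary() {
  require_vars RG CLUSTER_NAME || return 1
  az aks show --resource-group "$RG" --name "$CLUSTER_NAME" \
    --query "{name:name, provisioningState:provisioningState, powerState:powerState.code, kubernetesVersion:kubernetesVersion}" \
    --output table
}
```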

Usage Notes¶

Capture command output in incident notes before changing configuration.
Prefer comparing one failing object with one healthy object in the same cluster.
During severe incidents, start with read-only commands and widen the investigation only as the evidence points to a specific component.
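
The failing-versus-healthy comparison can be sketched as a diff of the two objects' YAML. This is an illustration under stated assumptions: the function name `compare_pods` and the `KUBECTL` override variable are hypothetical, and both pods are assumed to live in the same namespace.

```shell
# Hypothetical helper: diff the YAML of a failing pod against a healthy one.
# KUBECTL may be overridden (for example for a dry run); it defaults to kubectl.
compare_pods() {
  failing="$1"
  healthy="$2"
  ns="$3"
  kubectl_cmd="${KUBECTL:-kubectl}"

  tmpdir=$(mktemp -d) || return 1
  "$kubectl_cmd" get pod "$failing" -n "$ns" -o yaml > "$tmpdir/failing.yaml"
  "$kubectl_cmd" get pod "$healthy" -n "$ns" -o yaml > "$tmpdir/healthy.yaml"

  # diff exits 1 when the files differ, which is the expected case here.
  status=0
  diff -u "$tmpdir/failing.yaml" "$tmpdir/healthy.yaml" || status=$?
  rm -rf "$tmpdir"
  return "$status"
}

# Usage (requires kubectl access):
#   compare_pods <failing-pod> <healthy-pod> <namespace>
```

Fields such as names, UIDs, and timestamps always differ between two pods, so the useful signal is in the remaining lines: image tags, node assignment, resource requests, and container status.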
