Diagnostic Commands¶
These commands are grouped by investigation goal so you can collect evidence quickly during AKS incidents.
Topic/Command Groups¶
flowchart TD
A[Diagnostics] --> B[Cluster State]
A --> C[Pods]
A --> D[Networking]
A --> E[Nodes]
A --> F[Azure Resource State]
Cluster and workload state¶
kubectl get nodes -o wide
kubectl get pods -A -o wide
kubectl get events -A --sort-by=.lastTimestamp
kubectl top nodes
kubectl top pods -A
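On a busy cluster the listings above can run to hundreds of lines. The following is a minimal sketch of one way to narrow them to likely problem signals. The function name `triage_cluster` is hypothetical; the field selectors are standard kubectl options. Note that `status.phase!=Running` also matches completed (Succeeded) pods, and a pod in CrashLoopBackOff usually still reports the Running phase, so treat this as a first filter rather than a complete check.

```shell
# Sketch: narrow cluster-wide output to likely problem signals.
# triage_cluster is a hypothetical helper name, not part of kubectl.
triage_cluster() {
  # Pods whose phase is not Running (Pending, Failed, Succeeded, Unknown).
  kubectl get pods -A --field-selector=status.phase!=Running -o wide
  # Warning events only, sorted so the most recent appear last.
  kubectl get events -A --field-selector type=Warning --sort-by=.lastTimestamp
}
```

Call `triage_cluster` after the broad listings when you need a shorter view to paste into incident notes.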
Pod-specific investigation¶
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous
kubectl get deployment <deployment-name> -n <namespace> -o yaml
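The pod commands above are often run together, so it can help to save their output in one step. The sketch below assumes a hypothetical helper named `pod_snapshot` and an output file layout chosen only for illustration. It tolerates a failing `--previous` call, which happens when no container in the pod has restarted.

```shell
# Sketch: save describe output plus current and previous logs for one pod.
# pod_snapshot and the file naming scheme are hypothetical.
pod_snapshot() {
  local ns="$1" pod="$2" out="${3:-.}"
  mkdir -p "$out"
  kubectl describe pod "$pod" -n "$ns" > "$out/$pod.describe.txt"
  kubectl logs "$pod" -n "$ns" > "$out/$pod.logs.txt"
  # --previous fails when there is no earlier container instance;
  # drop the empty file and carry on in that case.
  kubectl logs "$pod" -n "$ns" --previous \
    > "$out/$pod.previous.txt" 2>/dev/null || rm -f "$out/$pod.previous.txt"
}
```

For example, `pod_snapshot <namespace> <pod-name> ./incident` writes the files into `./incident`.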
Networking and ingress¶
kubectl get svc -A
kubectl get endpoints -A
kubectl get ingress -A
kubectl describe ingress <ingress-name> -n <namespace>
kubectl exec -it <pod-name> -n <namespace> -- nslookup <service-name>
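A Service that resolves in DNS but has no ready backends is a common cause of connection failures. As a rough sketch, the hypothetical helper `service_has_endpoints` below reads the ready addresses from the Endpoints object with a JSONPath query. Empty output typically suggests a selector mismatch or failing readiness probes, though other causes are possible.

```shell
# Sketch: report whether a Service currently has ready endpoint addresses.
# service_has_endpoints is a hypothetical helper name.
service_has_endpoints() {
  local ns="$1" svc="$2" ips
  ips="$(kubectl get endpoints "$svc" -n "$ns" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')"
  if [ -n "$ips" ]; then
    echo "ready endpoints: $ips"
  else
    echo "no ready endpoints for $ns/$svc"
    return 1
  fi
}
```

The non-zero return code makes the check usable in a loop over several services.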
Azure-side checks¶
az aks show --resource-group "$RG" --name "$CLUSTER_NAME" --output yaml
az aks nodepool list --resource-group "$RG" --cluster-name "$CLUSTER_NAME" --output table
az vm list-usage --location "$LOCATION" --output table
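The full `az aks show` output is long. One possible way to get a quick summary is a JMESPath `--query`, sketched below. The function name `aks_summary` and the column aliases (`state`, `power`, `version`, `size`) are illustrative choices. The sketch assumes `RG` and `CLUSTER_NAME` are already set, as in the commands above.

```shell
# Sketch: condensed view of cluster and node pool state.
# aks_summary and the column aliases are hypothetical.
aks_summary() {
  # Cluster provisioning state, power state, and Kubernetes version.
  az aks show --resource-group "$RG" --name "$CLUSTER_NAME" \
    --query "{state: provisioningState, power: powerState.code, version: kubernetesVersion}" \
    --output table
  # One row per node pool with its state, node count, and VM size.
  az aks nodepool list --resource-group "$RG" --cluster-name "$CLUSTER_NAME" \
    --query "[].{name: name, state: provisioningState, count: count, size: vmSize}" \
    --output table
}
```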
Usage Notes¶
- Capture command output in incident notes before changing configuration.
- Prefer comparing one failing object with one healthy object in the same cluster.
- During severe incidents, start with read-only commands and widen the investigation only as the evidence points to a specific component.
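The comparison suggested above can be sketched as a unified diff of two pod manifests. The helper name `compare_pods` is hypothetical. Expect some noise from fields that always differ between pods, such as names, UIDs, and timestamps; the lines of interest are usually image tags, resource settings, node assignment, and status conditions.

```shell
# Sketch: diff a failing pod against a healthy one in the same namespace.
# compare_pods is a hypothetical helper name.
compare_pods() {
  local ns="$1" failing="$2" healthy="$3" tmp rc=0
  tmp="$(mktemp -d)"
  kubectl get pod "$failing" -n "$ns" -o yaml > "$tmp/failing.yaml"
  kubectl get pod "$healthy" -n "$ns" -o yaml > "$tmp/healthy.yaml"
  # diff exits 1 when the files differ, which is the expected case here.
  diff -u "$tmp/healthy.yaml" "$tmp/failing.yaml" || rc=$?
  rm -rf "$tmp"
  return "$rc"
}
```

Both `kubectl get` calls are read-only, so the comparison is safe to run before any configuration change.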