Troubleshooting
Hypothesis-driven troubleshooting for Azure Networking incidents: start with symptom classification, collect the minimum evidence, then jump to the right playbook.
How this section works
```mermaid
flowchart TD
    A[Observe symptom] --> B{What fails first?}
    B -->|Name resolution| C[DNS track]
    B -->|Reachability or loss| D[Connectivity track]
    B -->|Wrong path or transit| E[Routing track]
    C --> F[First 10 Minutes: DNS]
    D --> G[First 10 Minutes: Connectivity]
    E --> H[First 10 Minutes: Routing]
    F --> I[DNS playbooks]
    G --> J[Connectivity playbooks]
    H --> K[Routing playbooks]
```
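The "What fails first?" decision in the flowchart can be approximated from a client machine. The following is a minimal sketch, not part of the original playbooks: it resolves the name, then attempts a TCP connection, and reports the first step that fails. The function and constant names (`first_failing_track`, `resolve_ipv4`, `tcp_connect`) are assumptions introduced for illustration. A client-side probe like this can only separate the DNS and Connectivity branches; wrong-path symptoms on the Routing track still require inspecting routes in Azure.

```python
import socket
from typing import Callable, List, Optional

# Labels mirror two branches of the flowchart; the names are illustrative.
DNS = "DNS track"
CONNECTIVITY = "Connectivity track"


def resolve_ipv4(host: str) -> List[str]:
    """Return the IPv4 addresses the local resolver gives for host."""
    infos = socket.getaddrinfo(
        host, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def tcp_connect(ip: str, port: int, timeout: float = 3.0) -> None:
    """Open and close a TCP connection; raises OSError on failure."""
    with socket.create_connection((ip, port), timeout=timeout):
        pass


def first_failing_track(
    host: str,
    port: int,
    resolve: Callable[[str], List[str]] = resolve_ipv4,
    connect: Callable[[str, int], None] = tcp_connect,
) -> Optional[str]:
    """Return the track for the first failing step, or None if both pass."""
    try:
        addresses = resolve(host)
    except OSError:  # socket.gaierror and timeouts are OSError subclasses
        return DNS
    if not addresses:
        return DNS
    for ip in addresses:
        try:
            connect(ip, port)
            return None  # at least one resolved address is reachable
        except OSError:
            continue
    return CONNECTIVITY
```

The resolver and connector are injectable so the logic can be exercised without network access. A `None` result means neither check failed from this vantage point, which does not rule out a routing or policy problem elsewhere on the path.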
Start Here
| Need | Go to |
|---|---|
| Understand where Azure networking fails | Architecture Overview |
| Route a symptom to the right playbook | Decision Tree |
| Know what evidence to collect first | Evidence Map |
| Build a troubleshooting mindset | Mental Model |
| Use 60-second triage cards | Quick Diagnosis Cards |
| Run first-response checks | First 10 Minutes |
| Open a canonical troubleshooting guide | Playbooks |
Quick symptom routing
| Symptom pattern | First response | Likely playbooks |
|---|---|---|
| FQDN resolves to wrong IP, NXDOMAIN, or timeout | DNS Checklist | DNS Resolution Failures, Cannot Reach Private Endpoint |
| Service is unreachable from internet, VNet, or external target | Connectivity Checklist | Inbound Connectivity Issues, Outbound Connectivity Issues |
| Packets take the wrong path, peering fails, or hybrid routes disappear | Routing Checklist | Peering and Routing Issues, Hybrid Connectivity Issues |
| Failures are intermittent or latency-only | Connectivity Checklist | Intermittent Network Failures, Latency and Packet Loss |
| You suspect policy ordering confusion | Routing Checklist | NSG vs UDR vs Firewall |
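The routing table above can also be expressed as a small lookup, which may be useful in a triage script or chat-ops helper. This is a sketch under stated assumptions: the dictionary keys (`"dns"`, `"unreachable"`, `"wrong-path"`, `"intermittent"`, `"policy-order"`) and the names `Route`, `SYMPTOM_ROUTES`, and `route_symptom` are hypothetical labels invented here. Only the checklist and playbook titles come from the table.

```python
from typing import Dict, NamedTuple, Tuple


class Route(NamedTuple):
    first_response: str
    playbooks: Tuple[str, ...]


# Keys are hypothetical short labels; values are copied from the table above.
SYMPTOM_ROUTES: Dict[str, Route] = {
    "dns": Route(
        "DNS Checklist",
        ("DNS Resolution Failures", "Cannot Reach Private Endpoint"),
    ),
    "unreachable": Route(
        "Connectivity Checklist",
        ("Inbound Connectivity Issues", "Outbound Connectivity Issues"),
    ),
    "wrong-path": Route(
        "Routing Checklist",
        ("Peering and Routing Issues", "Hybrid Connectivity Issues"),
    ),
    "intermittent": Route(
        "Connectivity Checklist",
        ("Intermittent Network Failures", "Latency and Packet Loss"),
    ),
    "policy-order": Route(
        "Routing Checklist",
        ("NSG vs UDR vs Firewall",),
    ),
}


def route_symptom(label: str) -> Route:
    """Look up the first response and likely playbooks for a symptom label."""
    try:
        return SYMPTOM_ROUTES[label]
    except KeyError:
        known = ", ".join(sorted(SYMPTOM_ROUTES))
        raise ValueError(f"unknown symptom label {label!r}; expected one of: {known}") from None
```

For example, `route_symptom("intermittent").first_response` yields the Connectivity Checklist entry, matching the fourth row of the table.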
Topic map
Meta documents
First 10 Minutes
Playbooks
Tip
Before going deep, separate the incident into three layers: name resolution, path selection, and policy enforcement. Most Azure networking incidents become much easier to diagnose once those layers are isolated.