VM Troubleshooting Mental Model¶

The core VM troubleshooting habit is simple: classify the failure domain first, then collect disproving evidence before committing to a root cause.

Classification model¶

flowchart TD
    A[Observed symptom] --> B{Failure domain}
    B -->|Admin path broken| C[Connectivity]
    B -->|Runtime degraded| D[Performance]
    B -->|Startup or recovery broken| E[Boot and disk]
    C --> C1[Check network, auth, agent]
    D --> D1[Check CPU, memory, disk, network]
    E --> E1[Check boot artifacts, serial console, snapshot path]
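
The flowchart above can be read as a lookup from failure domain to first checks. The following is a minimal Python sketch of that idea, not a real triage tool. The names `FailureDomain`, `FIRST_CHECKS`, and `classify` are hypothetical, and the keyword hints are illustrative assumptions.

```python
from enum import Enum
from typing import Optional


class FailureDomain(Enum):
    CONNECTIVITY = "Connectivity"
    PERFORMANCE = "Performance"
    BOOT_AND_DISK = "Boot and disk"


# First checks per domain, copied from the flowchart.
FIRST_CHECKS = {
    FailureDomain.CONNECTIVITY: ["network", "auth", "agent"],
    FailureDomain.PERFORMANCE: ["CPU", "memory", "disk", "network"],
    FailureDomain.BOOT_AND_DISK: ["boot artifacts", "serial console", "snapshot path"],
}

# Hypothetical keyword hints (an assumption for illustration only);
# real classification relies on the responder's judgment.
_HINTS = {
    FailureDomain.CONNECTIVITY: ("ssh", "rdp", "cannot connect", "timeout"),
    FailureDomain.PERFORMANCE: ("slow", "latency", "throttl", "high cpu"),
    FailureDomain.BOOT_AND_DISK: ("boot", "won't start", "snapshot", "backup"),
}


def classify(symptom: str) -> Optional[FailureDomain]:
    """Return the first domain whose hint appears in the symptom, else None."""
    text = symptom.lower()
    for domain, hints in _HINTS.items():
        if any(hint in text for hint in hints):
            return domain
    return None  # symptom too vague to route; narrow it first


domain = classify("cannot SSH")
if domain is not None:
    print(domain.value, "->", FIRST_CHECKS[domain])
```

In this sketch a vague report such as "VM is broken" returns `None`, which is the cue to narrow the symptom before choosing a domain.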

Four rules¶

Start with the narrowest true symptom: “cannot SSH” is better than “VM is broken.”
Use competing hypotheses: hold at least two plausible explanations before taking action.
Prefer disproof over confirmation: look for evidence that would invalidate your favorite theory.
Separate platform from guest: Azure state and guest state are not the same thing. A VM can show as running in Azure while the guest OS is hung or unreachable.
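
One way to keep these rules visible during an incident is to record them as data. The sketch below is an assumption-laden illustration: `Hypothesis`, `Investigation`, and `ready_to_act` are hypothetical names. The "exactly one surviving hypothesis" gate is one possible reading of the rules, not something the rules prescribe.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hypothesis:
    claim: str
    layer: str            # "platform" or "guest", kept separate on purpose
    would_disprove: str   # evidence that would invalidate this claim
    disproved: bool = False


@dataclass
class Investigation:
    symptom: str  # the narrowest true symptom, e.g. "cannot SSH"
    hypotheses: List[Hypothesis] = field(default_factory=list)

    def add(self, hypothesis: Hypothesis) -> None:
        self.hypotheses.append(hypothesis)

    def surviving(self) -> List[Hypothesis]:
        return [h for h in self.hypotheses if not h.disproved]

    def ready_to_act(self) -> bool:
        # Assumed gate: at least two hypotheses were considered,
        # and disproof has left exactly one standing.
        return len(self.hypotheses) >= 2 and len(self.surviving()) == 1


inv = Investigation(symptom="cannot SSH")
inv.add(Hypothesis("NSG blocks inbound port 22", "platform",
                   "effective NSG rules allow inbound 22"))
inv.add(Hypothesis("guest firewall drops SSH", "guest",
                   "guest firewall rules allow port 22"))
print(inv.ready_to_act())  # False: both hypotheses are still open

inv.hypotheses[0].disproved = True  # NSG rules were found to allow port 22
print(inv.ready_to_act())  # True: one hypothesis survives
```

Writing down `would_disprove` before collecting evidence keeps the search pointed at disproof rather than confirmation.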

Typical category mistakes¶

Mistake	What it causes	Better move
treating every access failure as an NSG issue	misses guest firewall, VM agent, or credential problems	check network path and guest readiness together
using only CPU for performance diagnosis	misses memory pressure and disk throttling	inspect CPU, memory, disk, and queue/latency together
trying RDP/SSH fixes during boot corruption	wastes time on a path that cannot work yet	switch immediately to Boot Diagnostics and Serial Console
retrying backup without checking agent state	repeats the same failed snapshot workflow	validate VM agent and extension health first
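
The performance row can be illustrated with a check that refuses to reach a verdict from CPU alone. This is a sketch under stated assumptions: the signal names and thresholds in `THRESHOLDS` are hypothetical placeholders, not Azure metric names or recommended limits.

```python
from typing import Dict, List, Tuple

# Hypothetical signal names and limits, chosen only for illustration.
THRESHOLDS: Dict[str, float] = {
    "cpu_percent": 85.0,
    "memory_percent": 90.0,
    "disk_latency_ms": 50.0,
    "disk_queue_depth": 8.0,
}


def review(sample: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Return (signals at or over their limit, signals not collected yet)."""
    missing = sorted(name for name in THRESHOLDS if name not in sample)
    pressured = sorted(
        name for name, limit in THRESHOLDS.items()
        if name in sample and sample[name] >= limit
    )
    return pressured, missing


# CPU alone looks healthy, but three signals were never inspected.
print(review({"cpu_percent": 20.0}))

# With all signals present, memory and disk latency stand out.
print(review({
    "cpu_percent": 20.0,
    "memory_percent": 95.0,
    "disk_latency_ms": 120.0,
    "disk_queue_depth": 2.0,
}))
```

Reporting missing signals separately makes an incomplete diagnosis visible instead of letting a low CPU reading pass as a clean result.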

Investigation rhythm¶

graph TD
    A[Symptom] --> B[Hypotheses]
    B --> C[Collect evidence]
    C --> D[Disprove weak hypotheses]
    D --> E[Mitigate]
    E --> F[Prevention update]
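
The rhythm above is linear, which makes it easy to sketch as an ordered set of phases that cannot be skipped. The names `Phase` and `advance` are hypothetical. A real investigation may also loop back from disproof to new hypotheses, which this minimal version does not model.

```python
from enum import IntEnum


class Phase(IntEnum):
    SYMPTOM = 1
    HYPOTHESES = 2
    COLLECT_EVIDENCE = 3
    DISPROVE = 4
    MITIGATE = 5
    PREVENTION_UPDATE = 6


def advance(current: Phase, target: Phase) -> Phase:
    """Move to the next phase only; reject skipping ahead or going back."""
    if target != current + 1:
        raise ValueError(
            f"cannot move from {current.name} to {target.name}; "
            "finish the phases in order"
        )
    return target


phase = Phase.SYMPTOM
phase = advance(phase, Phase.HYPOTHESES)

try:
    advance(phase, Phase.MITIGATE)  # skips evidence collection and disproof
except ValueError as err:
    print(err)
```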

How to apply this in practice¶

Use Quick Diagnosis Cards when speed matters.
Use the matching First 10 Minutes checklist to settle which failure domain the investigation belongs in.
Open one canonical playbook and finish the evidence loop before jumping categories.
