Troubleshooting Architecture Overview

This page answers the first question in a storage incident: where in the storage path can this fail? Use it to classify the problem before opening a detailed playbook.

Storage failure path

```mermaid
flowchart LR
    A[Client or workload] --> B[Identity or token]
    B --> C[DNS resolution]
    C --> D[Network path]
    D --> E[Storage account front door]
    E --> F[Service endpoint<br/>Blob Files Queue Table]
    F --> G[Data object share queue table entity]

    B -. FP-SEC-01 .-> B1[RBAC SAS key auth failure]
    C -. FP-ACC-01 .-> C1[Wrong public/private DNS resolution]
    D -. FP-ACC-02 .-> D1[Firewall NSG route port issue]
    E -. FP-PERF-01 .-> E1[Account throttle or service latency]
    F -. FP-PERF-02 .-> F1[Partitioning protocol or transfer inefficiency]
    G -. FP-REC-01 .-> G1[Delete overwrite retention mismatch]
```

Failure domains and first checks

| Failure Point | Typical Symptom | First Evidence | Primary Playbook |
| --- | --- | --- | --- |
| FP-SEC-01 Identity and authorization | 403, auth mismatch, token rejected | error code, token/SAS fields, RBAC scope | Authorization Failures |
| FP-ACC-01 DNS and name resolution | public IP instead of private IP, name lookup failure | nslookup, zone link state, endpoint FQDN | Private Endpoint and DNS Issues |
| FP-ACC-02 Connectivity path | timeout, cannot mount, cannot reach account | storage firewall, port test, private endpoint approval | Cannot Access Storage Account |
| FP-PERF-01 Account/service saturation | 429, 503, latency spike | transaction metrics, success rate, server latency | Throttling and Performance Issues |
| FP-PERF-02 Transfer design inefficiency | slow upload/download, many small-file delays | concurrency settings, RTT, object size mix | Slow Upload / Download |
| FP-REC-01 Protection and recovery gap | deleted or overwritten data cannot be restored | retention state, versioning, soft delete, backup | Data Protection and Recovery Issues |
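
The table can also be read as a lookup from an observed HTTP status to a first-guess failure point. The sketch below is one possible encoding, not part of any playbook tooling: the playbook names are copied from the table, while `STATUS_TO_FAILURE_POINT` and `first_guess` are hypothetical names introduced for illustration. Treat the result as a starting hypothesis only, since a request on the wrong endpoint path can produce misleading auth errors.

```python
from typing import Optional, Tuple

# Failure point -> primary playbook, copied from the table above.
PLAYBOOKS = {
    "FP-SEC-01": "Authorization Failures",
    "FP-ACC-01": "Private Endpoint and DNS Issues",
    "FP-ACC-02": "Cannot Access Storage Account",
    "FP-PERF-01": "Throttling and Performance Issues",
    "FP-PERF-02": "Slow Upload / Download",
    "FP-REC-01": "Data Protection and Recovery Issues",
}

# Hypothetical mapping: only the status codes the table names explicitly.
STATUS_TO_FAILURE_POINT = {
    403: "FP-SEC-01",
    429: "FP-PERF-01",
    503: "FP-PERF-01",
}


def first_guess(status: int) -> Optional[Tuple[str, str]]:
    """Return (failure point, playbook) for a status code, or None if unmapped."""
    failure_point = STATUS_TO_FAILURE_POINT.get(status)
    if failure_point is None:
        return None
    return failure_point, PLAYBOOKS[failure_point]
```

Symptoms without a status code (timeouts, mount failures, slow transfers) are deliberately left out of this sketch; they need the path and performance evidence described later on this page.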

Public and private access model

```mermaid
flowchart TD
    A[Client request for <account>.blob.core.windows.net] --> B{DNS answer}
    B -->|Public IP| C[Public endpoint path]
    B -->|Private IP| D[Private endpoint path]
    C --> E{Firewall allows source?}
    D --> F{Private endpoint approved and routable?}
    E -->|No| G[Access blocked]
    F -->|No| G
    E -->|Yes| H[Storage service]
    F -->|Yes| H
```

The most common misclassification is treating a DNS or routing problem as an authorization problem. If the request is hitting the wrong endpoint path, the auth evidence is often misleading.
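
A quick way to see which branch of the diagram a client is on is to resolve the endpoint from that client and check whether the answer is a private or public address. The following is a minimal sketch using only the Python standard library; `classify_address` and `resolve_paths` are hypothetical helper names, and the FQDN you pass in is your own account endpoint.

```python
import ipaddress
import socket


def classify_address(ip: str) -> str:
    """Label an IP address as 'private' or 'public'.

    Note: is_private is True for RFC 1918 ranges and also for loopback,
    link-local, and other non-global ranges.
    """
    return "private" if ipaddress.ip_address(ip).is_private else "public"


def resolve_paths(fqdn: str) -> dict:
    """Resolve fqdn over IPv4 and label each returned address."""
    infos = socket.getaddrinfo(
        fqdn, 443, family=socket.AF_INET, proto=socket.IPPROTO_TCP
    )
    # info[4] is the socket address tuple; its first element is the IP string.
    return {info[4][0]: classify_address(info[4][0]) for info in infos}
```

Run `resolve_paths` from the affected client, not from an administrator workstation, because the DNS answer depends on where the query originates. A "public" answer where a private endpoint is expected points to FP-ACC-01 rather than an authorization problem.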

Evidence layers to collect in order

  1. Symptom evidence: exact error code, timestamp, protocol, target endpoint.
  2. Path evidence: DNS answer, firewall state, private endpoint state, required port reachability.
  3. Identity evidence: RBAC role, SAS fields, account key policy, token audience and expiry.
  4. Performance evidence: server latency, end-to-end latency, transaction spikes, concurrency level.
  5. Recovery evidence: retention settings that were enabled before the incident.
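
The collection order above can be sketched as an ordered checklist that reports the first layer still missing evidence. This is an illustrative structure only: the field names are hypothetical shorthand for the items in the list, and `next_missing` is not part of any existing tool.

```python
from typing import Dict, List, Optional, Tuple

# Layers in collection order; field names are shorthand for the list above.
EVIDENCE_LAYERS: List[Tuple[str, List[str]]] = [
    ("symptom", ["error_code", "timestamp", "protocol", "target_endpoint"]),
    ("path", ["dns_answer", "firewall_state", "private_endpoint_state",
              "port_reachability"]),
    ("identity", ["rbac_role", "sas_fields", "account_key_policy",
                  "token_audience_and_expiry"]),
    ("performance", ["server_latency", "end_to_end_latency",
                     "transaction_spikes", "concurrency_level"]),
    ("recovery", ["retention_settings_before_incident"]),
]


def next_missing(collected: Dict[str, dict]) -> Optional[Tuple[str, List[str]]]:
    """Return the first layer with missing fields, or None if all are complete."""
    for layer, fields in EVIDENCE_LAYERS:
        have = collected.get(layer, {})
        missing = [name for name in fields if name not in have]
        if missing:
            return layer, missing
    return None
```

For example, if only the symptom layer has been filled in, `next_missing` returns the path layer with all four of its fields, which keeps path evidence ahead of identity evidence as the list prescribes.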

Quick routing examples
