Throttling and Performance Issues

1. Summary

When Azure Storage returns HTTP 429 or 503, focus the investigation on request shape, burst concurrency, and whether the pressure is account-wide or concentrated on a hot path.

mermaid
flowchart TD
    A[429 or 503] --> B{Burst or sustained?}
    B -->|Burst| C[Check concurrency and partition skew]
    B -->|Sustained| D[Check account design and workload spread]
    C --> E[Retry with backoff]
    D --> F[Distribute load or scale design]

2. Common Misreadings

Treating retries as a fix instead of as evidence of pressure.
Looking only at average end-to-end latency instead of comparing it with server latency and checking for transaction bursts.
Ignoring partition or object hot spots.

3. Competing Hypotheses

H1: Burst concurrency is exceeding service tolerance.
H2: Workload is concentrated on hot partitions or objects.
H3: Retry behavior is amplifying pressure.
H4: The issue is really client inefficiency rather than account throttling.

4. What to Check First

Presence of 429 or 503 responses in the same time window as the reported slowdown.
Transaction count and availability trend.
SuccessServerLatency versus SuccessE2ELatency: a wide gap points to time spent in the client or network rather than in the service.
Retry implementation and burst shape.
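
The latency comparison above can be sketched in a few lines. This is a minimal illustration, assuming the two metrics have already been exported as paired per-interval averages in milliseconds; the function name `latency_gap` and the input shape are hypothetical, not part of any Azure SDK.

```python
from statistics import mean


def latency_gap(samples):
    """Summarize paired (server_ms, e2e_ms) samples, one pair per interval.

    Assumption: the values come from SuccessServerLatency and
    SuccessE2ELatency for the same intervals, exported by the reader.
    """
    server = mean(s for s, _ in samples)
    e2e = mean(e for _, e in samples)
    gap = e2e - server
    return {
        "server_ms": server,
        "e2e_ms": e2e,
        "gap_ms": gap,
        # Share of end-to-end time spent outside the storage service.
        "outside_share": gap / e2e if e2e else 0.0,
    }


if __name__ == "__main__":
    print(latency_gap([(10, 50), (20, 100)]))
```

A high `outside_share` with flat server latency argues for looking at the client or network before the storage account.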

5. Evidence to Collect

Metrics around the incident window.
Request volume pattern and concurrency level.
Object or partition access concentration.
Retry policy behavior from client logs or code path.

6. Validation and Disproof by Hypothesis

H1: Burst concurrency pressure

Support: sharp traffic spikes align with 429/503.
Weaken: moderate steady load with no burst pattern.
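
One way to test H1 against request logs is to bucket requests per second and check whether throttled responses fall inside the spike seconds. The sketch below assumes records of the form `(epoch_second, status_code)`; the function name `burst_profile` and the `spike_factor` threshold are illustrative choices, not service limits.

```python
from collections import Counter

THROTTLE_CODES = {429, 503}


def burst_profile(records, spike_factor=3.0):
    """records: iterable of (epoch_second, status_code) tuples.

    A second counts as a spike when its request count is at least
    spike_factor times the mean rate over the observed span
    (an arbitrary threshold chosen for illustration).
    """
    total = Counter()
    throttled = Counter()
    for second, status in records:
        total[second] += 1
        if status in THROTTLE_CODES:
            throttled[second] += 1
    if not total:
        return {"peak_to_mean": 0.0, "throttled_in_spikes": 0.0}

    # Average over the whole span so idle seconds count as zero traffic.
    span = max(total) - min(total) + 1
    mean_rate = sum(total.values()) / span
    spikes = {s for s, n in total.items() if n >= spike_factor * mean_rate}

    throttled_total = sum(throttled.values())
    in_spikes = sum(n for s, n in throttled.items() if s in spikes)
    return {
        "peak_to_mean": max(total.values()) / mean_rate,
        "throttled_in_spikes": in_spikes / throttled_total if throttled_total else 0.0,
    }
```

A high peak-to-mean ratio with most throttled responses inside spike seconds supports H1; a ratio near 1 weakens it.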

H2: Hot partition or object

Support: throttling clusters around one prefix, object, or narrow key range.
Weaken: pressure is evenly distributed.
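
To check whether throttling clusters around one prefix, throttled requests can be grouped by their leading path segments. This is a rough sketch assuming log records of the form `(path, status_code)`; `throttle_concentration` and the `depth` parameter are hypothetical names introduced here.

```python
from collections import Counter


def throttle_concentration(records, depth=1):
    """Return (prefix, share) for the prefix with the most throttled requests.

    records: iterable of (path, status_code) tuples.
    depth: number of leading path segments that define a prefix.
    """
    by_prefix = Counter()
    for path, status in records:
        if status in (429, 503):
            prefix = "/".join(path.strip("/").split("/")[:depth])
            by_prefix[prefix] += 1
    total = sum(by_prefix.values())
    if total == 0:
        return None, 0.0
    prefix, count = by_prefix.most_common(1)[0]
    return prefix, count / total
```

A top prefix carrying most of the throttled requests supports H2, provided that prefix does not simply carry the same share of all traffic.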

H3: Retry amplification

Support: retries multiply immediately after first failures and deepen the spike.
Weaken: well-spaced exponential backoff already exists.
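
Retry amplification can be estimated from client logs by counting attempts per logical operation and measuring how quickly the first retry follows the first failure. The sketch assumes the logs can be grouped into a mapping from an operation ID to its attempt timestamps in seconds; that structure and the name `retry_amplification` are assumptions for illustration.

```python
from statistics import median


def retry_amplification(attempts):
    """attempts: dict mapping operation id -> list of attempt timestamps (s)."""
    total_attempts = sum(len(times) for times in attempts.values())
    first_retry_delays = []
    for times in attempts.values():
        if len(times) > 1:
            ordered = sorted(times)
            first_retry_delays.append(ordered[1] - ordered[0])
    return {
        # Average number of requests sent per logical operation.
        "amplification": total_attempts / len(attempts) if attempts else 0.0,
        # None when no operation was retried.
        "median_first_retry_delay": (
            median(first_retry_delays) if first_retry_delays else None
        ),
    }
```

An amplification well above 1 combined with first-retry delays of a few milliseconds supports H3; delays of a second or more that grow between attempts weaken it.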

H4: Not really throttling

Support: 429/503 absent and server latency remains low.
Weaken: explicit throttle codes and server latency growth are present.
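
The support and weaken criteria for H4 can be written as a small decision helper. This is only a sketch: the function name, the `growth` factor, and the idea of comparing against a pre-incident baseline are assumptions, and the thresholds should be tuned to the workload.

```python
def throttling_verdict(status_codes, server_latency_ms, baseline_ms, growth=2.0):
    """Classify an incident window using the H4 criteria.

    status_codes: response codes seen in the window.
    server_latency_ms: server latency during the window.
    baseline_ms: server latency before the incident (assumed to be known).
    growth: hypothetical factor above baseline that counts as latency growth.
    """
    throttled = any(code in (429, 503) for code in status_codes)
    latency_grew = server_latency_ms >= growth * baseline_ms
    if throttled and latency_grew:
        return "throttling likely"
    if not throttled and not latency_grew:
        return "look at the client"
    return "inconclusive"
```

For example, `throttling_verdict([200, 200], 12.0, 10.0)` falls in the "look at the client" branch, because no throttle codes appear and server latency stays near its baseline.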

7. Likely Root Cause Patterns

Sudden parallel request bursts.
Uneven workload distribution.
Aggressive retries without jitter.
Latency-sensitive and batch workloads sharing the same path.

8. Immediate Mitigations

Add exponential backoff with jitter.
Reduce burst concurrency.
Spread load across partitions or accounts where appropriate.
Isolate batch traffic from latency-sensitive traffic.
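
The first two mitigations can be combined in a short sketch: exponential backoff with full jitter around each call, and a fixed-size worker pool to cap burst concurrency. The `Throttled` exception, the function names, and the default values are hypothetical; SDK clients usually ship their own configurable retry policies, so a wrapper like this is mainly relevant for code paths that issue requests directly.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


class Throttled(Exception):
    """Hypothetical error the caller raises when it sees a 429 or 503."""


def backoff_delays(count, base=0.5, cap=30.0):
    """Yield `count` full-jitter delays: uniform in [0, min(cap, base * 2**n)]."""
    for n in range(count):
        yield random.uniform(0, min(cap, base * 2 ** n))


def call_with_backoff(operation, attempts=5, sleep=time.sleep):
    """Run operation(), retrying on Throttled for up to `attempts` tries in total."""
    for delay in backoff_delays(attempts - 1):
        try:
            return operation()
        except Throttled:
            sleep(delay)
    # Final attempt: a Throttled raised here propagates to the caller.
    return operation()


def run_bounded(operations, max_workers=8):
    """Run callables with at most max_workers in flight; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call_with_backoff, operations))
```

Full jitter spreads retries from many clients across the whole delay window, so they do not return in synchronized waves. The pool size is the knob for burst concurrency and is best lowered until throttle codes stop appearing.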

9. Prevention

Load test with realistic burst patterns.
Design object naming and partition use for even distribution.
Monitor throttle-related metrics continuously, such as 429/503 counts and server latency.
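
As an illustration of naming for even distribution, a short hash prefix can be prepended to names that would otherwise share a sequential or date-based prefix. This is a sketch of one possible scheme, not a required convention; `spread_name` and the prefix width are assumptions.

```python
import hashlib


def spread_name(name, width=3):
    """Prepend a short, stable hash so sequential names do not share a prefix."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return f"{digest[:width]}-{name}"


if __name__ == "__main__":
    for i in range(3):
        print(spread_name(f"2024-01-01/log-{i:04d}"))
```

The trade-off is that hashed prefixes make listing by date or sequence harder, so this fits workloads where names are looked up by key rather than scanned in order.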