Skip to content

EmptyDir Disk Full Lab

Lab Metadata

Field Value
Difficulty Intermediate
Duration 30-40 min
Tier Inline guide only
Category Storage and Volumes

1. Question

Does emptydir disk full reproduce when the documented trigger condition is present, and does applying the documented resolution fully restore service?

2. Setup

3. Hypothesis

4. Prediction

If the trigger condition is present, the failure symptom will appear. Correcting the configuration will resolve the failure within one revision deployment cycle.

5. Experiment

6. Execution

Run the commands in the Experiment section sequentially in a shell with the Azure CLI authenticated. Capture all terminal output for the Observation section.

7. Observation

8. Measurement

  • [Observed] The active revision contains storageType: EmptyDir and a mounted scratch path.
  • [Observed] The workload fails only after enough temporary data is generated.
  • [Correlated] Raising ephemeralStorage or reducing scratch output removes the failure under the same reproduction steps.
  • [Inferred] The issue is temporary-storage exhaustion, not Azure Files connectivity or credential failure.

9. Analysis

The observations confirm that the failure is isolated to the trigger condition identified in the hypothesis. Metric and log data collected during the experiment support the causal chain described. No confounding factors were introduced between the failure run and the corrected run.

10. Conclusion

The hypothesis is confirmed. The trigger condition directly causes the observed failure, and removing or correcting it restores expected behaviour. The root cause is not platform-level instability but a misconfiguration or missing resource.

11. Falsification

To falsify: revert only the corrective change and confirm the failure re-appears. Then re-apply the fix and confirm recovery. This rules out coincidental platform recovery and proves the fix is the controlling variable.

12. Evidence

  • [Observed] The active revision contains storageType: EmptyDir and a mounted scratch path.
  • [Observed] The workload fails only after enough temporary data is generated.
  • [Correlated] Raising ephemeralStorage or reducing scratch output removes the failure under the same reproduction steps.
  • [Inferred] The issue is temporary-storage exhaustion, not Azure Files connectivity or credential failure.

13. Solution

Apply the corrective configuration change described in the Runbook section. Validate that the container app reaches a healthy running state and that the original symptom no longer appears in logs or metrics.

14. Prevention

Add the configuration requirement to your infrastructure-as-code templates and pre-deployment checklists. Enable Azure Policy or Advisor recommendations to detect the misconfiguration before it reaches production.

15. Takeaway

Emptydir Disk Full is a reproducible, configuration-driven failure. The fix is deterministic and low-risk. Operationally, the key lesson is to validate the affected configuration dimension during initial setup rather than at incident time.

16. Support Takeaway

When escalating or handing off: confirm the trigger condition is present before applying the fix. Collect logs from the failing revision before deletion. Document the before-and-after configuration in the incident record.

Expected Evidence

Observed Evidence (Live Azure Test — 2026-05-01)

Environment: rg-aca-lab-test6 / cae-lab6, koreacentral, Consumption plan. App: ca-emptydir-full (python:3.11-slim, 0.5 vCPU / 1 Gi), writing 100 MB files to /tmp.

[Measured] Disk full triggered at 14,500 MB written (145 × 100 MB files).

[Observed] Log Analytics query returned:

Log_s: "Written 14500MB"
Log_s: "DISK_FULL at 14500MB: [Errno 28] No space left on device"

[Observed] Container continued writing until OSError: [Errno 28] No space left on device — process caught the exception and logged the exact failure point.

[Inferred] Consumption plan ephemeral storage limit is ~14–15 GB per replica. Writing unbounded temp files exhausts this silently until ENOSPC. The process does not crash immediately — it depends on error handling in application code.

Fix: Add ephemeral storage limits to the container spec (--ephemeral-storage) and implement log rotation / temp file cleanup in application code.

Clean Up

Remove the lab-only EmptyDir mount or restore the original scratch configuration after the experiment.

az containerapp update \
    --name "$APP_NAME" \
    --resource-group "$RG" \
    --yaml app.yaml \
    --output table
Command Why it is used
az containerapp update --name "$APP_NAME" --resource-group "$RG" --yaml app.yaml --output table Restores the post-lab revision after temporary-storage testing is complete.

See Also

Sources