Image Size Startup Delay
Use this playbook when new revisions eventually succeed but take far too long to become ready, especially after scale-from-zero or after an image update.
Symptom
- New revisions remain in provisioning longer than expected.
- First requests after scale-out or scale-from-zero show large latency spikes.
- Startup probes fail only on fresh revisions or freshly pulled images.
- No single application exception explains the slow start window.
Possible Causes
- The image is large, so pull and extraction time dominate startup.
- The image contains unnecessary build tools, package caches, or debug artifacts.
- Startup initialization does too much work after the image is pulled.
- Scale-to-zero exposes image pull latency more visibly than always-ready replicas do.
Diagnosis Steps
flowchart TD
A[Slow revision start or cold start] --> B[Check whether old warm replicas avoid the issue]
B --> C{Only new replicas are slow?}
C -->|No| D[Investigate app startup code and dependencies]
C -->|Yes| E[Inspect image source and deployment pattern]
E --> F{Large image or repeated pull behavior?}
F -->|Yes| G[Image size likely contributes materially]
F -->|No| H[Focus on probe timing and app initialization]
G --> I[Trim image and retest]
H --> J[Tune probes or startup path]

- Capture the configured image and revision timeline.
- Review system logs for slow provisioning or repeated startup attempts.
- Correlate revision creation and startup signals.

    let AppName = "ca-myapp";
    ContainerAppSystemLogs_CL
    | where ContainerAppName_s == AppName
    | where TimeGenerated > ago(2h)
    | where Reason_s has_any ("RevisionProvisioning", "RevisionProvisioned", "ReplicaStarted", "ProbeFailed")
        or Log_s has_any ("pull", "start", "ready")
    | project TimeGenerated, RevisionName_s, ReplicaName_s, Reason_s, Log_s
    | order by TimeGenerated asc

- Compare the behavior with scale-to-zero settings to see whether image pull cost is only visible on fresh replicas.
| Command or Query | Why it is used |
|---|---|
| `az containerapp show --query image` | Confirms the exact image reference used by the slow revision. |
| `az containerapp revision list` | Shows whether startup delay is revision-specific or persistent. |
| `az containerapp logs show --type system` | Captures provisioning and startup events around the delay window. |
| KQL timeline query | Correlates the gap between revision provisioning and replica readiness. |
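
Once the timeline query results are exported, the gap between provisioning and readiness can be computed per revision. The following is a minimal Python sketch, not a definitive tool. It assumes each row is a dictionary using the column names projected by the query, and that `RevisionProvisioning` and `ReplicaStarted` (taken from the query filter) mark the start and end of the window. The actual `Reason_s` values in a given workspace may differ, and the sample rows are hypothetical.

```python
from datetime import datetime


def startup_gaps(rows, start_reason="RevisionProvisioning", ready_reason="ReplicaStarted"):
    """Return seconds from the first start event to the first ready event, per revision."""
    first_seen = {}  # (revision, reason) -> earliest timestamp seen
    for row in rows:
        key = (row["RevisionName_s"], row["Reason_s"])
        ts = datetime.fromisoformat(row["TimeGenerated"])
        if key not in first_seen or ts < first_seen[key]:
            first_seen[key] = ts

    gaps = {}
    for (revision, reason), started in first_seen.items():
        if reason != start_reason:
            continue
        ready = first_seen.get((revision, ready_reason))
        if ready is not None:  # skip revisions that never reported ready
            gaps[revision] = (ready - started).total_seconds()
    return gaps


# Hypothetical rows, shaped like the query output.
sample_rows = [
    {"TimeGenerated": "2024-01-01T10:00:00", "RevisionName_s": "ca-myapp--rev1", "Reason_s": "RevisionProvisioning"},
    {"TimeGenerated": "2024-01-01T10:01:30", "RevisionName_s": "ca-myapp--rev1", "Reason_s": "ReplicaStarted"},
    {"TimeGenerated": "2024-01-01T11:00:00", "RevisionName_s": "ca-myapp--rev2", "Reason_s": "RevisionProvisioning"},
]

print(startup_gaps(sample_rows))  # {'ca-myapp--rev1': 90.0}
```

A revision with no ready event (here `ca-myapp--rev2`) is left out of the result rather than reported with a guessed duration.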
Resolution
- Reduce image size with multi-stage builds and by removing build-time tooling from the runtime image.
- Keep language package caches, test assets, and debug binaries out of the production image.
- Move expensive initialization out of the startup path where possible.
- Keep one ready replica for user-facing apps if cold-start latency is unacceptable.
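
As one illustration of moving expensive initialization out of the startup path, the sketch below defers a costly resource until the first request that needs it, so the probe handler stays cheap. It is a minimal Python example under stated assumptions: `get_model`, `health`, and `handle_request` are hypothetical names, and the `time.sleep` call stands in for whatever slow work the real application does.

```python
import functools
import time


@functools.lru_cache(maxsize=None)
def get_model():
    """Hypothetical expensive resource, built on first use instead of at import time."""
    time.sleep(0.05)  # stand-in for loading a model, warming a cache, and so on
    return {"loaded_at": time.monotonic()}


def health():
    # Probe handler: returns immediately and never touches the expensive resource.
    return "ok"


def handle_request(payload):
    # The first call pays the initialization cost; later calls reuse the cached result.
    model = get_model()
    return {"echo": payload, "model_ready": model is not None}


print(health())
print(handle_request("hello"))
```

This shifts the cost to the first request rather than removing it, so it helps most when the startup probe is the bottleneck. Under concurrent first requests, `lru_cache` may run the function more than once, so a lock or a background warm-up may be a better fit for some applications.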
Prevention
- Treat image size as a production SLO input, not just a build concern.
- Review image contents after dependency or base-image changes.
- Measure revision-ready time before and after image changes.
- Keep scale-to-zero decisions aligned with acceptable startup latency.
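
Measuring revision-ready time before and after an image change can be turned into a simple regression check. The sketch below is a Python example under stated assumptions: the samples are hypothetical ready times in seconds, and the 20% tolerance is an arbitrary illustrative threshold, not a recommended value.

```python
from statistics import median


def ready_time_regressed(before_s, after_s, tolerance=0.20):
    """Return True if the median ready time grew by more than `tolerance` (a fraction)."""
    baseline = median(before_s)
    candidate = median(after_s)
    return candidate > baseline * (1 + tolerance)


# Hypothetical ready times in seconds, before and after an image change.
before = [42.0, 40.5, 44.1]
after = [61.3, 58.9, 63.0]

print(ready_time_regressed(before, after))  # True
```

Comparing medians rather than single measurements reduces the effect of one unusually slow or fast pull.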