Skip to content

Platform Limits¶

Use this page as a quick checkpoint before scaling, rollout, or incident response.

flowchart TD
    REQ[Incoming Request] --> TIMEOUT{"Ingress Timeout (~240s)"}
    TIMEOUT -->|Under limit| SCALE{"Scale Rules"}
    TIMEOUT -->|Over limit| ASYNC["Move to Async Pattern"]
    SCALE --> REPLICA["Replica Sizing (CPU/Memory)"]
    REPLICA --> MINMAX["Min/Max Replicas"]
    MINMAX --> QUOTA["Subscription Quota"]

Common Limits and Timeouts¶

Area	Typical value / behavior	Notes
Ingress request timeout	~240 seconds	Long-running HTTP should move to async/background patterns
Scale to zero	Supported (`minReplicas=0`)	First request may incur cold start latency
Revision model	Immutable revisions	Any template/config change creates a new revision
Log ingestion delay	Usually 1-2 minutes	Use `az containerapp logs show --follow` for near real-time
Traffic split granularity	Percentage per active revision	Multiple revision mode required

Resource Sizing Reference¶

Setting	Where configured	Practical guidance
CPU / Memory per replica	Container template resources	Right-size for startup + peak load
`minReplicas`	Scale section	`0` for cost, `>=1` for lower latency
`maxReplicas`	Scale section	Set high enough for burst traffic
HTTP concurrency threshold	HTTP scale rule	Lower value = faster scale-out

Quota / Capacity Checks¶

# Region resource availability and subscription constraints
az containerapp env list --resource-group "$RG"

# Current app scaling bounds
az containerapp show \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --query "properties.template.scale"

Observed baseline before limit testing:

$ az containerapp show --name "$APP_NAME" --resource-group "$RG" --query provisioningState --output tsv
Succeeded

$ az containerapp revision list --name "$APP_NAME" --resource-group "$RG" --output table
Name               Active    TrafficWeight    Replicas    HealthState    RunningState
-----------------  --------  ---------------  ----------  -------------  ------------
ca-myapp--0000001  True      100              1           Healthy        Running

Symptoms That Often Indicate Limits¶

Symptom	Likely limit area	First action
Frequent 504s on heavy endpoints	Ingress timeout	Move to async job pattern; shorten sync request path
High latency after idle period	Scale-to-zero cold start	Increase `minReplicas`
Throttled/slow during burst	`maxReplicas` too low or high concurrency target	Raise `maxReplicas`, lower concurrency threshold
New revision unhealthy under load	CPU/memory under-sized	Increase container resources

Design Guardrails¶

Workload type	Recommendation
API calls < 30s	Standard HTTP ingress pattern
API calls 30-240s	Careful timeout handling, aggressive observability
API calls > 240s	Queue + worker revision (non-HTTP trigger)
High burst traffic	Pre-warm with `minReplicas >= 1` + tuned scale rules

For architecture-level implications, see How Container Apps Works.

Sources¶