Common Anti-Patterns¶

This quick-reference guide lists frequent Azure Container Apps anti-patterns seen in production, why they fail, and what to do instead. Use it as a pre-deployment review checklist and incident prevention reference.

Prerequisites¶

Azure Container Apps environment and at least one deployed app
Azure CLI with Container Apps extension
Access to ACR, Log Analytics, and app configuration

export RG="rg-aca-prod"
export APP_NAME="ca-orders-api"
export ENVIRONMENT_NAME="cae-prod-shared"
export ACR_NAME="acrsharedprod"
export LOCATION="koreacentral"

az extension add --name "containerapp" --upgrade
az account show --output table

Main Content¶

Anti-pattern quick reference table¶

Problem	Why It Fails	Better Approach
Using `:latest` image tag in production	Revisions are not reproducible; rollbacks can pull unexpected image content	Use immutable version tags (`v1.2.3`) or digest pinning
Fat container images larger than 1 GB	Slow pulls increase cold start and deployment risk; more registry storage cost	Use minimal runtime images, multi-stage builds, and dependency trimming
No health probes configured	Platform cannot distinguish slow start from broken runtime; unhealthy replicas linger	Configure readiness and liveness probes matching startup/runtime behavior
Storing secrets directly in environment variables	Secret sprawl, accidental leakage in logs, manual rotation pain	Use Container Apps secrets and Key Vault references
Using ACR admin credentials instead of managed identity	Shared static credentials create blast radius and rotation risk	Use managed identity for image pulls and disable admin auth where possible
Setting `minReplicas=0` for latency-sensitive APIs	Cold starts increase p95 and p99 latency	Keep `minReplicas` at 1 or higher for strict latency services
Single revision mode for production apps	No safe canary or traffic split; rollback requires disruptive full switch	Use multiple revision mode with staged rollout and validation gates
Hardcoded connection strings	Secrets baked into image/config drift across environments	Resolve from secret store at runtime with identity-based access
Ignoring SIGTERM and ungraceful shutdown	In-flight requests/jobs are interrupted; data corruption and retry storms	Handle SIGTERM, drain requests, and flush telemetry before exit
Not configuring ingress when external access is required	App is healthy but unreachable from clients	Explicitly configure external ingress, target port, and transport
Using Container Apps for persistent stateful workloads	Ephemeral replicas are poor fit for durable local-state dependencies	Use managed state services or AKS for strict persistent workload patterns
Over-provisioning CPU and memory "just in case"	Continuous waste and lower bin-packing efficiency	Right-size from observed p95 usage and revise periodically
Not separating dev, staging, prod environments	Shared blast radius and accidental cross-environment impact	Separate environments and enforce scoped RBAC and network boundaries
Running database migrations in app startup	Startup path becomes risky and slow; scale-out can trigger concurrent migrations	Run migrations as controlled job or pipeline step before traffic shift
Ignoring scale rule interaction and precedence	Competing scalers create unstable replica behavior and cost spikes	Document scaler intent, test interaction, and set clear thresholds

1) Using `:latest` image tag in production¶

latest breaks revision traceability because the same tag can point to different image contents over time.

Bad pattern:

az containerapp update \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --image "$ACR_NAME.azurecr.io/orders-api:latest"

Better pattern:

az containerapp update \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --image "$ACR_NAME.azurecr.io/orders-api:v1.9.4"

2) Fat container images larger than 1 GB¶

Large images increase pull duration and make scale-from-zero slower.

Operational impact:

Longer revision provisioning windows
Higher failure probability during pull/network pressure
Increased registry storage and transfer overhead

Better approach:

Use slim base image
Remove build-time tooling from runtime image
Keep only required runtime assets

3) No health probes configured¶

Without probes, platform-level health decisions are less precise.

Bad outcome examples:

Broken dependency initialization treated as healthy
Stuck replicas remain in rotation too long

Better approach with explicit probe settings:

az containerapp update \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --set "properties.template.containers[0].probes[0].type=Readiness" \
        "properties.template.containers[0].probes[0].httpGet.path=/health" \
        "properties.template.containers[0].probes[0].httpGet.port=8000"

Probe design

Readiness should represent dependency readiness for serving traffic. Liveness should represent process health, not external dependency state.

4) Secrets in environment variables instead of Container Apps secrets¶

Directly embedding secrets in plain environment values increases exposure risk.

Bad pattern:

az containerapp update \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --set-env-vars "DATABASE_URL=Server=tcp:db.example;User Id=admin;Password=<plaintext>"

Better pattern using platform secret abstraction:

az containerapp secret set \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --secrets "db-connection=<redacted-secret-value>"

az containerapp update \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --set-env-vars "DATABASE_URL=secretref:db-connection"

5) Using ACR admin credentials instead of managed identity¶

ACR admin credentials are long-lived shared secrets and weaken least privilege.

Better approach:

Assign managed identity to app.
Grant AcrPull role to that identity.
Configure registry identity-based auth.

Assign identity:

az containerapp identity assign \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --system-assigned

6) `minReplicas=0` on latency-sensitive APIs¶

For strict SLO APIs, scale-to-zero introduces predictable cold-start delay.

Bad pattern:

az containerapp update \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --min-replicas 0

Better pattern:

az containerapp update \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --min-replicas 1 \
  --max-replicas 20

7) Single revision mode in production¶

Single revision mode removes safe progressive rollout options.

Risks:

All traffic instantly exposed to new build
Rollback depends on full revision replacement
Reduced observability for side-by-side comparison

Better approach:

Enable multiple revisions
Send small traffic percentage to candidate revision
Promote after SLO and error checks

8) Hardcoded connection strings¶

Hardcoded values in code or container args become operational debt.

Failure modes:

Secrets leak through repository, logs, or diagnostics
Environment promotion requires image rebuilds
Rotation causes downtime or drift

Better approach:

Use secret references and managed identity
Use one code artifact across environments
Resolve environment-specific values at deploy/runtime

9) Ignoring SIGTERM and ungraceful shutdown¶

Container Apps sends termination signal during scale-in and revision changes.

If ignored:

In-flight requests can fail
Background work may be interrupted mid-transaction
Telemetry may never flush

Better approach:

Handle SIGTERM in runtime
Stop accepting new work immediately
Drain active work and close cleanly within timeout

10) Missing ingress configuration for externally consumed apps¶

A common issue is healthy app replicas with no reachable endpoint.

Set ingress explicitly during create:

az containerapp create \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --environment "$ENVIRONMENT_NAME" \
  --image "$ACR_NAME.azurecr.io/orders-api:v1.9.4" \
  --ingress external \
  --target-port 8000

Validate ingress state:

az containerapp show \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --query "properties.configuration.ingress" \
  --output json

11) Using Container Apps for persistent local-state workloads¶

Container Apps is optimized for stateless or externally stateful services.

Anti-pattern examples:

Local filesystem as primary database
Stateful leader election dependent on replica identity
Long-running node-local storage assumptions

Better approach:

Use managed database/storage services for durable state
For strict container-orchestrated persistent patterns, evaluate AKS

12) Over-provisioning CPU and memory¶

Provisioning "just in case" directly inflates cost and can mask performance problems.

Review configured resources:

az containerapp show \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --query "properties.template.containers[0].resources" \
  --output json

Better approach:

Start with measured baseline
Increase only when metrics show sustained contention
Reassess after each major release

13) Mixing dev/staging/prod in one environment¶

Single shared environment for all stages causes blast-radius coupling.

Typical failures:

Non-production experiments impact production networking or quotas
Shared secrets and RBAC reduce control boundaries
Troubleshooting signal is noisy and ambiguous

Better approach:

Separate managed environments per stage
Separate resource groups and budgets
Enforce stage-specific RBAC policies

14) Running database migrations in app startup¶

Startup migrations turn deployment into an implicit schema operation.

Risk pattern:

Horizontal scale events run migration logic concurrently
App startup blocks on long migration tasks
Failed migration leaves mixed schema/runtime state

Better approach:

Run migrations as explicit pre-deployment job
Verify migration success before traffic shift
Keep startup path limited to health-ready initialization

Example migration as job:

az containerapp job create \
  --name "job-db-migrate" \
  --resource-group "$RG" \
  --environment "$ENVIRONMENT_NAME" \
  --trigger-type "Manual" \
  --replica-timeout 1800 \
  --replica-retry-limit 0 \
  --image "$ACR_NAME.azurecr.io/orders-migration:v1.9.4"

15) Ignoring scale rule interaction and precedence¶

Multiple scalers can interact in unexpected ways when thresholds are inconsistent.

Symptoms:

Replica oscillation (rapid up/down)
Excessive scale-out under one hot metric
Cost spikes without throughput gains

Better approach:

Document each scale rule intent and owner
Test rule combinations under synthetic load
Align cooldown and threshold settings with downstream capacity

Inspect current scale settings:

az containerapp show \
  --name "$APP_NAME" \
  --resource-group "$RG" \
  --query "properties.template.scale" \
  --output json

Anti-pattern severity and impact matrix¶

Anti-Pattern	Severity	Impact Area	Detection Difficulty	Fix Effort
`:latest` image tag	🔴 High	Release safety	Easy — check image field	Low
No health probes	🔴 High	Runtime reliability	Easy — check probe config	Low
Secrets in plain env vars	🔴 High	Security	Medium — audit env config	Medium
ACR admin credentials	🟠 Medium	Security	Easy — check registry auth	Medium
Ignoring SIGTERM	🔴 High	Runtime reliability	Hard — requires load testing	Medium
Fat images > 1 GB	🟠 Medium	Cost, cold start	Easy — check image size	Medium
`minReplicas=0` on SLO APIs	🟠 Medium	Latency	Medium — correlate cold starts	Low
Single revision mode	🟠 Medium	Release safety	Easy — check revision mode	Low
Hardcoded connection strings	🔴 High	Security, operations	Medium — code/config audit	High
Missing ingress config	🟡 Low	Connectivity	Easy — check ingress field	Low
Persistent local state	🔴 High	Architecture fit	Hard — requires design review	High
Over-provisioned resources	🟠 Medium	Cost	Medium — compare usage vs config	Low
Mixed dev/staging/prod env	🟠 Medium	Blast radius	Easy — list environment apps	High
DB migrations in startup	🔴 High	Release safety	Medium — review entrypoint	Medium
Scale rule conflicts	🟠 Medium	Cost, reliability	Hard — requires load testing	Medium

Anti-pattern relationship map¶

graph TD
    A[Anti-Patterns] --> B[Release Safety]
    A --> C[Runtime Reliability]
    A --> D[Security and Identity]
    A --> E[Cost Control]
    A --> F[Architecture Fit]

    B --> B1[:latest tag]
    B --> B2[Single revision mode]
    B --> B3[Startup migrations]

    C --> C1[No probes]
    C --> C2[Ignoring SIGTERM]
    C --> C3[Scale rule conflicts]

    D --> D1[Secrets in plain env]
    D --> D2[ACR admin credentials]
    D --> D3[Hardcoded connection strings]

    E --> E1[Over-provisioning resources]
    E --> E2[minReplicas misconfiguration]
    E --> E3[Fat images]

    F --> F1[Persistent workload mismatch]
    F --> F2[No environment separation]
    F --> F3[Ingress misconfiguration]

Pre-production anti-pattern review checklist¶

Review category	Key check	Pass criteria
Image management	No mutable production tags	Version or digest pinned
Runtime health	Readiness/liveness probes present	Probe paths tested
Secret handling	No plaintext credentials in env vars	Secret references used
Identity	Managed identity configured for dependencies	Least privilege applied
Scaling	Min/max replicas and scale rules justified	No contradictory scaler behavior
Release safety	Multiple revisions with progressive rollout	Rollback path documented
Environment boundaries	Stage isolation enforced	Separate environments for prod and non-prod

Bookmark this page

Most high-severity Container Apps incidents map to one or more anti-patterns in this list. Running this checklist before every major release prevents avoidable outages.

Advanced Topics¶

Automate anti-pattern detection in CI using policy checks for image tags, resource limits, and revision mode.
Build organization-level deployment templates that enforce managed identity and secret reference defaults.
Combine platform diagnostics and KQL dashboards to detect anti-pattern drift (for example sudden increase in image pull failures or cold-start latency).
Establish a quarterly architecture review to identify workloads that should move to jobs, managed services, or AKS based on state and performance requirements.

Common Anti-Patterns¶

Prerequisites¶

Main Content¶

Anti-pattern quick reference table¶

1) Using :latest image tag in production¶

2) Fat container images larger than 1 GB¶

3) No health probes configured¶

4) Secrets in environment variables instead of Container Apps secrets¶

5) Using ACR admin credentials instead of managed identity¶

6) minReplicas=0 on latency-sensitive APIs¶

7) Single revision mode in production¶

8) Hardcoded connection strings¶

9) Ignoring SIGTERM and ungraceful shutdown¶

10) Missing ingress configuration for externally consumed apps¶

11) Using Container Apps for persistent local-state workloads¶

12) Over-provisioning CPU and memory¶

13) Mixing dev/staging/prod in one environment¶

14) Running database migrations in app startup¶

15) Ignoring scale rule interaction and precedence¶

Anti-pattern severity and impact matrix¶

Anti-pattern relationship map¶

Pre-production anti-pattern review checklist¶

Advanced Topics¶

See Also¶

1) Using `:latest` image tag in production¶

6) `minReplicas=0` on latency-sensitive APIs¶