Container Apps Jobs¶
Azure Container Apps Jobs run finite background work instead of continuously serving traffic. Use jobs when execution has a clear start and finish, such as scheduled cleanup, data import, queue processing, and file transformation.
What are Jobs?¶
A Container App is optimized for long-running API or worker services. A Container Apps Job is optimized for a single unit of work that runs to completion and, on failure, is retried according to policy.
Use jobs when:
- Work is bounded and can be retried safely.
- You need manual, scheduled, or event-driven triggers.
- Scale behavior is tied to execution count, not HTTP traffic.
Use apps when:
- You need always-on request handling.
- You expose an ingress endpoint to clients.
- You maintain long-lived process state in memory.
Jobs must be idempotent
Retries and parallel executions can reprocess the same work item. Design input handling so duplicate execution does not corrupt state.
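One common way to get this property is to claim each work item before processing it, so that a retry or a parallel replica skips items that were already handled. The following is a minimal sketch, not a prescribed implementation: `process_once`, `process_item`, and `MARKER_DIR` are hypothetical names, and the local marker directory only stands in for a shared durable store (for example a database unique key), which a real multi-replica job would need.

```shell
# Hypothetical marker location; a real job would use shared durable storage.
MARKER_DIR="${MARKER_DIR:-/tmp/job-markers}"

process_item() {
  # Placeholder for the real work on one item.
  echo "processed $1"
}

process_once() {
  local item_id="$1"
  mkdir -p "$MARKER_DIR"
  # mkdir without -p fails if the directory already exists,
  # so only the first caller wins the claim for this item.
  if mkdir "$MARKER_DIR/$item_id" 2>/dev/null; then
    if ! process_item "$item_id"; then
      # Release the claim so a later retry can pick the item up again.
      rmdir "$MARKER_DIR/$item_id"
      return 1
    fi
  else
    echo "skipped $item_id (already claimed)"
  fi
}
```

Calling `process_once item-1` twice processes the item once and reports a skip on the second call.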
Execution Models¶
Manual trigger¶
Manual jobs are ideal for one-off tasks: backfills, schema migrations, reprocessing failed records, or operator-driven maintenance.
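A manual job could be created and started along the following lines. This is a sketch under assumptions: it reuses the `$JOB_NAME`, `$RG`, `$ENVIRONMENT_NAME`, and `$ACR_NAME` variables from the other examples, the timeout and retry values are illustrative, and the function names `create_manual_job` and `start_manual_job` are hypothetical wrappers so the commands can be sourced into an operator script.

```shell
create_manual_job() {
  # Timeout (seconds) and retry limit are illustrative values, not recommendations.
  az containerapp job create \
    --name "$JOB_NAME" \
    --resource-group "$RG" \
    --environment "$ENVIRONMENT_NAME" \
    --trigger-type "Manual" \
    --replica-timeout 1800 \
    --replica-retry-limit 1 \
    --image "$ACR_NAME.azurecr.io/python-job:v1"
}

start_manual_job() {
  # Starts one execution of the existing job on demand.
  az containerapp job start \
    --name "$JOB_NAME" \
    --resource-group "$RG"
}
```

An operator would typically run `create_manual_job` once and then `start_manual_job` for each backfill or maintenance run.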
Scheduled¶
Scheduled jobs run by cron expression and are useful for recurring operations such as nightly compaction, report generation, and stale artifact cleanup.
```shell
# Run the job every six hours, on the hour (cron is evaluated in UTC).
az containerapp job create \
  --name "$JOB_NAME" \
  --resource-group "$RG" \
  --environment "$ENVIRONMENT_NAME" \
  --trigger-type "Schedule" \
  --cron-expression "0 */6 * * *" \
  --image "$ACR_NAME.azurecr.io/python-job:v1"
```
Event-driven¶
Event-driven jobs start executions in response to external signals such as queue depth or blob events. Scale rules decide how many executions to start, which suits asynchronous pipelines where throughput should follow the backlog and idle periods should cost little.
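```shell
# Start executions based on the depth of the Service Bus queue named "jobs".
# The namespace value is the namespace name, without the .servicebus.windows.net suffix.
# Scale rule authentication (for example --scale-rule-auth) is omitted here.
az containerapp job create \
  --name "$JOB_NAME" \
  --resource-group "$RG" \
  --environment "$ENVIRONMENT_NAME" \
  --trigger-type "Event" \
  --scale-rule-name "queue-processor" \
  --scale-rule-type "azure-servicebus" \
  --scale-rule-metadata "queueName=jobs" "messageCount=10" "namespace=<servicebus-namespace>" \
  --image "$ACR_NAME.azurecr.io/python-job:v1"
```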
App vs Job Decision Matrix¶
| Decision area | Container App | Container Apps Job |
|---|---|---|
| Primary workload | API/service traffic | Batch or async task execution |
| Lifetime model | Long-running process | Finite run with completion |
| Trigger model | HTTP/event-driven scaling | Manual, cron, event trigger |
| Ingress requirement | Common | Usually none |
| Retry behavior | App-level logic | Job execution retry policy |
| Cost profile | Baseline runtime + scale | Pay during execution windows |
| Best examples | Public API, internal service | ETL, cleanup, periodic reporting |
Configuration Patterns¶
Timeout and retry settings¶
- Set `--replica-timeout` based on worst-case execution plus headroom.
- Use `--replica-retry-limit` for transient failures only.
- Ensure your code is idempotent before increasing retry counts.
Parallelism and replica completion¶
- `--parallelism` controls concurrent replicas for one execution.
- `--replica-completion-count` controls how many successful replicas mark completion.
- Start with conservative values, then scale after verifying external dependency limits.
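The timeout, retry, and parallelism settings above can be applied to an existing job with `az containerapp job update`. The sketch below assumes the `$JOB_NAME` and `$RG` variables from the earlier examples; `apply_job_limits` and the `REPLICA_TIMEOUT`, `REPLICA_RETRY_LIMIT`, `PARALLELISM`, and `REPLICA_COMPLETION_COUNT` variables are hypothetical names, and the defaults are conservative placeholders rather than recommendations.

```shell
apply_job_limits() {
  # Each value falls back to a conservative default when its variable is unset.
  az containerapp job update \
    --name "$JOB_NAME" \
    --resource-group "$RG" \
    --replica-timeout "${REPLICA_TIMEOUT:-1800}" \
    --replica-retry-limit "${REPLICA_RETRY_LIMIT:-1}" \
    --parallelism "${PARALLELISM:-1}" \
    --replica-completion-count "${REPLICA_COMPLETION_COUNT:-1}"
}
```

For example, `PARALLELISM=4 REPLICA_COMPLETION_COUNT=4 apply_job_limits` raises concurrency once downstream dependencies are known to tolerate it.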
Trigger configuration¶
- Manual: prefer for operator-controlled execution.
- Scheduled: cron expressions are evaluated in UTC, so document any business time zone assumptions.
- Event-driven: verify the scale rule metadata and the identity's permissions on the event source.
Identity and Secrets¶
Jobs use the same identity patterns as apps:
- Prefer managed identity (user-assigned or system-assigned) over stored credentials.
- Assign least-privilege RBAC to data stores and messaging services.
- Store configuration in environment variables and sensitive values in secrets.
- Use Key Vault references for centralized secret lifecycle management.
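As an illustration of the first two points, the sketch below enables a system-assigned identity on a job and grants it a single narrowly scoped role. It rests on assumptions: `grant_job_queue_access` is a hypothetical helper, `$SERVICE_BUS_ID` is assumed to hold the resource ID of the Service Bus namespace, and the role shown is only one example of a least-privilege choice for a queue consumer.

```shell
grant_job_queue_access() {
  local principal_id
  # Enable the system-assigned identity and capture its principal ID.
  principal_id="$(az containerapp job identity assign \
    --name "$JOB_NAME" \
    --resource-group "$RG" \
    --system-assigned \
    --query principalId \
    --output tsv)" || return 1

  # Grant receive-only access, scoped to the one namespace (assumed scope).
  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Azure Service Bus Data Receiver" \
    --scope "$SERVICE_BUS_ID"
}
```

Scoping the assignment to a single resource keeps the job from reading other namespaces in the same resource group.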
Monitoring Job Executions¶
Track executions and troubleshoot failures with the Azure CLI, for example `az containerapp job execution list` and `az containerapp job execution show`.
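A possible shape for these checks is sketched here, assuming the `$JOB_NAME` and `$RG` variables from the earlier examples. The wrapper names `list_job_executions` and `show_job_execution` are hypothetical; the execution name passed to the second function comes from the output of the first.

```shell
list_job_executions() {
  # Lists the job's executions with their status in a compact table.
  az containerapp job execution list \
    --name "$JOB_NAME" \
    --resource-group "$RG" \
    --output table
}

show_job_execution() {
  local execution_name="$1"
  # Shows the details of one execution, identified by its name.
  az containerapp job execution show \
    --name "$JOB_NAME" \
    --resource-group "$RG" \
    --job-execution-name "$execution_name"
}
```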
Monitor these signals:
- Execution success/failure ratio
- Retry count trend
- Duration distribution (p50/p95)
- Dependency-specific errors (authentication, throttling, timeout)
Set timeouts from measured runtime
Start from the observed p95 execution duration and add headroom. Avoid effectively unlimited timeouts, which hide stuck executions.
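The arithmetic can be sketched as a small helper that turns measured durations into a candidate `--replica-timeout`. This is an assumption-laden illustration: `suggest_timeout` is a hypothetical name, the input is one duration in seconds per line, the percentile uses the nearest-rank method, and the 50% default headroom is a placeholder rather than guidance from the platform.

```shell
# Reads durations (seconds, one per line) on stdin and prints
# p95 plus headroom, rounded up to a whole second.
suggest_timeout() {
  local headroom_pct="${1:-50}"
  sort -n | awk -v h="$headroom_pct" '
    { d[NR] = $1 }
    END {
      if (NR == 0) exit 1
      # Nearest-rank p95: ceil(0.95 * NR) using integer arithmetic.
      rank = int((NR * 95 + 99) / 100)
      timeout = d[rank] * (100 + h) / 100
      printf "%d\n", (timeout == int(timeout)) ? timeout : int(timeout) + 1
    }'
}
```

For twenty samples from 10 to 200 seconds in steps of 10, `seq 10 10 200 | suggest_timeout 50` prints `285`: the p95 sample is 190 seconds, plus 50% headroom.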
Common Patterns¶
Data processing pipeline¶
Ingest file or queue item, transform it, then write canonical output. Keep each step idempotent so retries are safe.
Scheduled cleanup¶
Run daily cleanup to remove expired blobs, temporary records, or stale artifacts. Add dry-run mode for safe validation.
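A dry-run switch might look like the sketch below. It works on local files as a stand-in for blobs or records, and `cleanup_stale` and `DRY_RUN` are hypothetical names. Dry-run is the default here, so deletion has to be requested explicitly with `DRY_RUN=0`.

```shell
# Removes files older than max_age_days under target_dir.
# With DRY_RUN=1 (the default) it only reports what would be deleted.
cleanup_stale() {
  local target_dir="$1" max_age_days="${2:-7}"
  find "$target_dir" -type f -mtime "+$max_age_days" -print |
    while IFS= read -r path; do
      if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would delete $path"
      else
        rm -f -- "$path" && echo "deleted $path"
      fi
    done
}
```

Running the job once with the default and reviewing its output before switching to `DRY_RUN=0` gives a cheap validation step for new retention rules.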
Event-driven media processing¶
Trigger on new uploads, transcode or enrich content, and emit completion metadata for downstream consumers.
Job Lifecycle¶
```mermaid
flowchart LR
    A[Trigger] --> B[Execution Created]
    B --> C[Replica Starts]
    C --> D{Result}
    D -->|Success| E[Completed]
    D -->|Failure| F{Retry Limit Reached?}
    F -->|No| C
    F -->|Yes| G[Failed]
```