Jobs Troubleshooting¶
Use this page when a Container Apps Job fails, hangs, misses a schedule, or does not fan out the way you expected.
Symptom¶
Common symptoms:
- job fails immediately after starting
- job runs but never completes
- event-driven job never triggers
- scheduled job appears to miss a window
- replica fan-out does not match expectations
```mermaid
flowchart TD
    A[Job issue observed] --> B{Execution created?}
    B -->|No| C[Inspect trigger type and metadata]
    B -->|Yes| D{Container starts successfully?}
    D -->|No| E[Check image pull, config, startup]
    D -->|Yes| F{Execution finishes?}
    F -->|No| G[Check loops, deadlocks, timeout]
    F -->|Yes but wrong fan-out| H[Check parallelism and completion count]
    C --> I[Re-run and verify]
    E --> I
    G --> I
    H --> I
```
Possible Causes¶
| Symptom | Possible causes |
|---|---|
| Fails immediately | bad image, registry auth, missing env var, startup crash, secret or identity issue |
| Runs but never completes | infinite loop, blocked I/O, no exit path, timeout mismatch |
| Event-driven job not triggering | scaler metadata mismatch, auth failure, scaler not supported for event-driven Jobs, empty source |
| Scheduled run missed | misread cron expression, UTC/local-time confusion, prior execution still running, assumptions carried over from an external scheduler |
| Replica fan-out not working | parallelism misunderstood, replicaCompletionCount set too low or too high, workload itself serializes all work |
Diagnosis Steps¶
Job fails immediately¶
- List executions and inspect the failed run.
- Check image and registry configuration.
- Review console and system logs for startup/auth errors.
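```bash
# List recent executions and note the name of the failed run.
az containerapp job execution list \
  --name "$JOB_NAME" \
  --resource-group "$RG" \
  --output table

# Show the full detail of that run; set EXECUTION_NAME from the list above.
az containerapp job execution show \
  --name "$JOB_NAME" \
  --resource-group "$RG" \
  --job-execution-name "$EXECUTION_NAME" \
  --output json
```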
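To narrow the execution list to failed runs, a JMESPath filter can be passed through `--query`. This is a minimal sketch, assuming each execution reports its state at `properties.status` with the value `Failed` and its timestamps at `properties.startTime` and `properties.endTime`. `list_failed_executions` is a hypothetical helper name, so confirm the field names against the JSON output of `execution show` before relying on it.

```bash
# Sketch: list only failed executions of a job.
# Assumption: executions expose properties.status == 'Failed'; verify against
# the JSON returned by `az containerapp job execution show`.
list_failed_executions() {
  local job_name=$1 resource_group=$2
  az containerapp job execution list \
    --name "$job_name" \
    --resource-group "$resource_group" \
    --query "[?properties.status=='Failed'].{name:name, start:properties.startTime, end:properties.endTime}" \
    --output table
}
```

Call it as `list_failed_executions "$JOB_NAME" "$RG"`, then pass one of the returned names as `EXECUTION_NAME`.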
Job runs but never completes¶
- Compare observed runtime with `replicaTimeout`.
- Look for application logs that show loops or blocked dependencies.
- Verify the process exits after successful work.
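The runtime comparison in the first step can be scripted. The sketch below assumes GNU `date` (for `-d`), that the execution start time is an ISO 8601 timestamp, and that `replicaTimeout` is expressed in seconds. `elapsed_seconds` and `check_timeout` are hypothetical helper names, and the property paths in the usage comment are assumptions to verify against your own `az` output.

```bash
# Sketch: compare how long an execution has run with replicaTimeout.
# Assumes GNU date; helper names are illustrative, not part of the az CLI.

# Seconds between two ISO 8601 timestamps (second argument defaults to now).
elapsed_seconds() {
  local start_epoch end_epoch
  start_epoch=$(date -u -d "$1" +%s) || return 1
  end_epoch=$(date -u -d "${2:-now}" +%s) || return 1
  echo $(( end_epoch - start_epoch ))
}

# Report whether the elapsed time has reached the configured timeout (seconds).
check_timeout() {
  local elapsed=$1 timeout=$2
  if (( elapsed >= timeout )); then
    echo "at or past replicaTimeout: ${elapsed}s elapsed, limit ${timeout}s"
  else
    echo "within replicaTimeout: ${elapsed}s elapsed, limit ${timeout}s"
  fi
}

# Usage (requires az; property paths are assumptions to verify):
#   timeout=$(az containerapp job show --name "$JOB_NAME" --resource-group "$RG" \
#     --query "properties.configuration.replicaTimeout" --output tsv)
#   start=$(az containerapp job execution show --name "$JOB_NAME" --resource-group "$RG" \
#     --job-execution-name "$EXECUTION_NAME" --query "properties.startTime" --output tsv)
#   check_timeout "$(elapsed_seconds "$start")" "$timeout"
```

A run that sits close to the limit on every execution points at a timeout mismatch, while a run that reaches the limit with no progress in the logs points at a loop or blocked dependency.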
Event-driven job not triggering¶
- Re-check scaler metadata names and values.
- Validate identity or secret access to the event source.
- Confirm there is actually backlog or lag to trigger on.
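Before comparing scaler metadata by eye, it helps to print what the platform actually stored. This is a minimal sketch, assuming the event trigger settings live under `properties.configuration.eventTriggerConfig.scale` in the job resource. `show_event_scale_config` is a hypothetical helper name.

```bash
# Sketch: print the stored scale settings (polling interval, execution limits,
# rules with their metadata and auth references) for an event-driven job.
# Assumption: the settings are under properties.configuration.eventTriggerConfig.scale.
show_event_scale_config() {
  local job_name=$1 resource_group=$2
  az containerapp job show \
    --name "$job_name" \
    --resource-group "$resource_group" \
    --query "properties.configuration.eventTriggerConfig.scale" \
    --output json
}
```

Call it as `show_event_scale_config "$JOB_NAME" "$RG"` and compare each metadata key and value against the scaler's own documentation. An empty result suggests the job is not configured with an event trigger at all.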
Do not assume Jobs support every app scaler
If you copied scaler configuration from a continuously running Container App, verify that the same scaler is currently supported for event-driven Jobs before you keep debugging metadata.
Scheduled job missed an execution¶
- Translate the cron expression into UTC and local business time.
- Compare the expected window with recent execution timestamps.
- Check whether the previous run was still active or failed unexpectedly.
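The UTC-to-local translation in the first step is easy to get wrong around daylight saving changes, so it can help to let `date` do it. The sketch below assumes GNU `date` and an installed IANA time zone database. `utc_to_local` is a hypothetical helper name, and the cron property path in the usage comment is an assumption to verify.

```bash
# Sketch: render a UTC wall-clock time in another time zone (GNU date assumed).
# utc_to_local is an illustrative helper, not part of the az CLI.
utc_to_local() {
  local utc_time=$1 zone=$2
  TZ="$zone" date -d "${utc_time} UTC" '+%Y-%m-%d %H:%M %Z'
}

# Usage: read the stored cron expression, then translate its firing time.
#   az containerapp job show --name "$JOB_NAME" --resource-group "$RG" \
#     --query "properties.configuration.scheduleTriggerConfig.cronExpression" --output tsv
#   utc_to_local "2024-01-15 02:00" "America/New_York"
#   utc_to_local "2024-07-15 02:00" "America/New_York"
```

For a cron expression such as `0 2 * * *`, the January call prints `2024-01-14 21:00 EST` and the July call prints `2024-07-14 22:00 EDT`: the same UTC schedule lands an hour later in local time during daylight saving, and on the previous calendar day in both cases.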
Overlap behavior still needs direct verification
This troubleshooting path treats overlap as a possibility you must design around. Confirm current product behavior before you assume the platform will serialize or skip overlapping schedule windows for you.
Replica fan-out not working¶
- Confirm configured `parallelism` and `replicaCompletionCount`.
- Inspect the workload: it may still process input sequentially inside each replica.
- Validate that external dependencies are not enforcing serialization.
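The first check can be sketched as two small helpers. This assumes the trigger settings sit under `properties.configuration` in one of `manualTriggerConfig`, `scheduleTriggerConfig`, or `eventTriggerConfig`, and that `replicaCompletionCount` must not exceed `parallelism`. Both are assumptions to confirm against current product documentation, and both function names are hypothetical.

```bash
# Sketch: inspect and sanity-check fan-out settings for a job.
# Assumptions: trigger config property names as shown, and the rule that
# replicaCompletionCount must be less than or equal to parallelism.

# Print whichever trigger configuration block the job carries.
show_fanout_config() {
  local job_name=$1 resource_group=$2
  az containerapp job show \
    --name "$job_name" \
    --resource-group "$resource_group" \
    --query "properties.configuration.{triggerType:triggerType, manual:manualTriggerConfig, schedule:scheduleTriggerConfig, event:eventTriggerConfig}" \
    --output json
}

# Check a parallelism / replicaCompletionCount pair for an obvious mismatch.
check_fanout() {
  local parallelism=$1 completions=$2
  if (( completions > parallelism )); then
    echo "invalid: replicaCompletionCount ${completions} exceeds parallelism ${parallelism}"
    return 1
  fi
  echo "ok: ${parallelism} replica(s) per execution, ${completions} must succeed"
}
```

Run `show_fanout_config "$JOB_NAME" "$RG"`, then feed the two values into `check_fanout`. A passing check only rules out a configuration error; it says nothing about whether the workload partitions its input across replicas.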
Resolution¶
| Symptom | Resolution |
|---|---|
| Fails immediately | fix image pull, env vars, secret refs, identity, or startup command |
| Never completes | add explicit exit path, reduce work per execution, raise timeout only after measuring |
| Event-driven not triggering | correct scaler metadata, fix auth, switch to a verified event source if needed |
| Missed schedule | rewrite cron in UTC, widen interval, add external lock or alerting |
| Fan-out mismatch | align parallelism, replicaCompletionCount, and workload partition logic |
Prevention¶
- emit structured logs with execution correlation fields
- document cron expressions in UTC and business-local time
- validate scaler config in a lower environment before production
- keep retries low until idempotency is proven
- separate input repair, replay, and dead-letter procedures in the runbook
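As an illustration of the first prevention item, a job written as a shell script could emit one JSON object per log line. This is a minimal sketch, assuming the platform sets `CONTAINER_APP_JOB_EXECUTION_NAME` and `CONTAINER_APP_REPLICA_NAME` inside the replica (verify this in your environment). `log_json` is a hypothetical helper, and it escapes only backslashes and double quotes, not control characters.

```bash
# Sketch: structured log lines carrying execution correlation fields.
# Assumption: CONTAINER_APP_JOB_EXECUTION_NAME and CONTAINER_APP_REPLICA_NAME
# are set by the platform; "unknown" is used when they are absent.
log_json() {
  local level=$1 message=$2
  # Escape backslashes first, then double quotes, so the message stays valid JSON.
  message=${message//\\/\\\\}
  message=${message//\"/\\\"}
  printf '{"ts":"%s","level":"%s","execution":"%s","replica":"%s","message":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    "$level" \
    "${CONTAINER_APP_JOB_EXECUTION_NAME:-unknown}" \
    "${CONTAINER_APP_REPLICA_NAME:-unknown}" \
    "$message"
}
```

A call such as `log_json info "batch finished"` then produces a line that can be filtered by execution name when several runs overlap in the same log stream.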