Poison Message Handling¶
Trigger: Queue Storage | State: stateless | Guarantee: at-least-once | Difficulty: beginner
Overview¶
The examples/reliability/poison_message_handling/ sample shows how Azure Functions queue triggers
handle repeatedly failing messages. When processing keeps throwing an exception and the dequeue limit
is exhausted, Queue Storage automatically moves the message to the poison queue for separate
inspection and follow-up handling.
This keeps the main queue flowing while still preserving failed payloads for monitoring, triage, and manual replay.
When to Use¶
- You process queue messages that can fail permanently because of bad payloads or unsupported state.
- You want automatic isolation of repeatedly failing messages instead of blocking the main queue.
- You need a simple operational pattern for logging or alerting on poison messages.
When NOT to Use¶
- You need custom dead-letter routing logic beyond the built-in Queue Storage poison queue behavior.
- You must mutate or repair failed messages inline before the retry limit is reached.
- The workload requires exactly-once delivery guarantees instead of at-least-once processing.
Architecture¶
flowchart LR
producer[Producer] --> main[main queue]
main --> worker[queue processor]
worker -->|success| done[work completed]
worker -->|retries exhausted| poison[poison queue]
poison --> handler[poison message handler]
handler --> ops[logging / alerting]
Behavior¶
sequenceDiagram
participant Producer
participant Queue as Main queue
participant Worker as queue_processor
participant Poison as poison queue
participant Handler as poison_handler
Producer->>Queue: enqueue message
Queue->>Worker: deliver message
Worker->>Worker: validate + process
alt processing succeeds
Worker-->>Queue: complete
else processing fails repeatedly
Worker-->>Queue: raise exception
Queue->>Worker: retry until maxDequeueCount
Queue->>Poison: move message after retry exhaust
Poison->>Handler: trigger poison handler
Handler->>Handler: log payload and alert context
end
Prerequisites¶
- Python 3.10+
- Azure Functions Core Tools v4
- Azure Storage account or Azurite for queue triggers
- Queue
ordersavailable in the configured storage account
Project Structure¶
examples/reliability/poison_message_handling/
|-- function_app.py
|-- host.json
|-- local.settings.json.example
|-- pyproject.toml
`-- README.md
Implementation¶
The main queue trigger processes messages from orders. For demonstration, messages with
"should_fail": true raise an exception every time so you can watch the runtime retry and then move
the message to orders-poison.
The poison queue trigger listens to orders-poison and emits structured logs that can feed dashboard
queries, alerts, or incident workflows. This recipe keeps integration scope intentionally small:
logging only.
@app.queue_trigger(arg_name="msg", queue_name="orders", connection="AzureWebJobsStorage")
def queue_processor(msg: func.QueueMessage) -> None:
payload = json.loads(msg.get_body().decode("utf-8"))
if payload.get("should_fail"):
raise RuntimeError("Simulated processing failure")
@app.queue_trigger(arg_name="msg", queue_name="orders-poison", connection="AzureWebJobsStorage")
def poison_handler(msg: func.QueueMessage) -> None:
logging.error("Poison message detected id=%s payload=%s", msg.id, msg.get_body().decode("utf-8"))
Run Locally¶
cd examples/reliability/poison_message_handling
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
cp local.settings.json.example local.settings.json
func start
Enqueue a successful message:
Enqueue a poison-message demo payload:
Expected Output¶
[Information] Processing order_id=A100 dequeue_count=1
[Warning] Simulated failure for order_id=A200 dequeue_count=1
[Warning] Simulated failure for order_id=A200 dequeue_count=2
[Warning] Simulated failure for order_id=A200 dequeue_count=3
[Error] Poison message detected id=<message-id> payload={"order_id": "A200", "should_fail": true}
Production Considerations¶
- Tune
maxDequeueCountto balance transient retry tolerance against operational noise. - Include message IDs, dequeue counts, and business keys in logs for easier replay decisions.
- Add alerting on poison-queue activity so failures do not stay silent.
- Separate permanent validation failures from transient dependency failures where possible.
- Build replay tooling carefully so poison messages can be fixed and re-submitted safely.