
Cold Start Mitigation

Trigger: HTTP / Timer | State: stateless | Guarantee: request-response | Difficulty: beginner

Overview

Cold starts happen when Azure Functions needs to allocate a new worker and initialize your Python app before serving traffic. This recipe shows how to reduce that first-hit latency with four practical techniques:

  • keep critical workloads on the Premium plan so pre-warmed instances are available
  • use a warmup trigger to preload dependencies during scale-out
  • defer expensive modules with lazy imports so startup work stays small
  • reuse outbound clients with module-level connection caching instead of rebuilding them on every request

The matching example in examples/runtime-and-ops/cold_start_mitigation/ combines an HTTP trigger, a warmup trigger, structured logging via azure-functions-logging-python, and a simple outbound session cache. It also points to azure-functions-doctor-python for deployment diagnostics.

When to Use

  • You have bursty traffic and the first request after idle is noticeably slower.
  • Your app imports heavy SDKs, ML packages, or large dependency graphs.
  • You need predictable latency for user-facing APIs.
  • You want a low-complexity mitigation before moving to custom containers.

When NOT to Use

  • Your workload is entirely background-driven and a few seconds of startup latency is acceptable.
  • You only need throughput tuning rather than startup optimization.
  • Your largest delays come from downstream APIs or databases, not function initialization.

Architecture

flowchart TD
    A[New request arrives] --> B{Worker already warm?}
    B -- Yes --> C[Reuse loaded modules]
    C --> D[Reuse cached outbound session]
    D --> E[Return response quickly]

    B -- No --> F[Allocate new worker]
    F --> G[Load Python worker + app]
    G --> H[Import lightweight startup path only]
    H --> I[Warmup trigger preloads selected dependencies]
    I --> J[Create module-level connection cache]
    J --> K[Serve first request]
    K --> E

Prerequisites

  • Python 3.10+
  • Azure Functions Core Tools v4
  • An Azure Functions app using the Python v2 programming model
  • Premium plan if you need pre-warmed instances in production

Project Structure

examples/runtime-and-ops/cold_start_mitigation/
|-- function_app.py
|-- host.json
|-- local.settings.json.example
|-- pyproject.toml
`-- README.md

Implementation

The example keeps top-level imports minimal, then defers heavier work until the request path or warmup path needs it.

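# Module-level cache: lives as long as the worker process does.
_cached_session = None

def _lazy_import_requests():
    # Import inside the function so `requests` is not loaded at app startup.
    import importlib
    return importlib.import_module("requests")

def get_session():
    # Create the outbound session once, then reuse it on later invocations.
    global _cached_session
    if _cached_session is None:
        requests = _lazy_import_requests()
        _cached_session = requests.Session()
    return _cached_session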
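
To check that deferring an import actually moves the cost off the startup path, it can help to time the first call against a cached one. The following is a minimal sketch rather than part of the example app: `lazy_module` and `timed_ms` are hypothetical helper names, and the stdlib `decimal` module only stands in for a heavier third-party dependency.

```python
import importlib
import time
from functools import lru_cache
from types import ModuleType


@lru_cache(maxsize=None)
def lazy_module(name: str) -> ModuleType:
    """Import `name` on first use; later calls return the cached module object."""
    return importlib.import_module(name)


def timed_ms(func, *args):
    """Return (result, elapsed milliseconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, (time.perf_counter() - start) * 1000.0


# `decimal` is only a stand-in for a heavy dependency such as an SDK.
first, first_ms = timed_ms(lazy_module, "decimal")
second, second_ms = timed_ms(lazy_module, "decimal")
print(f"first={first_ms:.3f} ms cached={second_ms:.3f} ms same_object={first is second}")
```

The absolute numbers depend on the machine and on whether the module was already imported elsewhere, so treat them as a relative signal only.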

The warmup trigger preloads the same dependency chain used by the HTTP trigger so scale-out instances are ready before receiving user traffic.

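# arg_name must match the name of the function parameter.
@app.warm_up_trigger(arg_name="warmup_context")
def warmup(warmup_context) -> None:
    _ = warmup_context  # the context object is not used here
    _lazy_import_requests()
    get_session()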
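
One possible way to keep the warmup body small and testable is to move the preloading into a plain helper that the trigger function calls. This sketch assumes a hypothetical `WARMUP_MODULES` tuple and `preload` helper, neither of which appears in the example app, and it uses stdlib module names as placeholders for real dependencies.

```python
import importlib
import time

# Hypothetical list; a real app would name the packages its HTTP path needs.
# The last entry is deliberately missing to show the failure path.
WARMUP_MODULES = ("json", "decimal", "not_a_real_module_xyz")


def preload(modules=WARMUP_MODULES) -> dict:
    """Import each module; map its name to load time in ms, or None on failure."""
    report = {}
    for name in modules:
        start = time.perf_counter()
        try:
            importlib.import_module(name)
        except ImportError:
            # A missing optional dependency should not fail the warmup invocation.
            report[name] = None
            continue
        report[name] = round((time.perf_counter() - start) * 1000.0, 3)
    return report


report = preload()
print(report)
```

Because `preload` has no dependency on the Functions host, it can be exercised in a unit test, and the returned report could be logged from the warmup function to see which imports dominate.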

Key tactics:

  • Lazy imports: keep global startup focused on the runtime, logging, and route registration.
  • Connection pooling: cache a single outbound requests.Session() per worker process.
  • Warmup trigger: preload dependencies during scale-out on supported hosting plans.
  • Premium plan: keep pre-warmed instances available for latency-sensitive workloads.

Behavior

sequenceDiagram
    participant Scale as Scale controller
    participant Instance as New function instance
    participant Warmup as Warmup trigger
    participant Cache as Dependency/session cache
    participant Client as HTTP client

    Scale->>Instance: Add instance for anticipated load
    Instance->>Warmup: Invoke warmup function
    Warmup->>Cache: Import heavy modules lazily
    Warmup->>Cache: Initialize cached outbound session
    Warmup-->>Instance: Mark instance ready
    Client->>Instance: Send first HTTP request
    Instance->>Cache: Reuse preloaded modules and session
    Instance-->>Client: Return faster first response

Run Locally

cd examples/runtime-and-ops/cold_start_mitigation
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
cp local.settings.json.example local.settings.json
func start

To simulate a ready instance locally, call the warmup endpoint before the HTTP route:

curl -i http://localhost:7071/admin/warmup
curl -i "http://localhost:7071/api/cold-start-demo?ping=0"
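
To compare the first request with later ones, a small timing helper may be more convenient than reading `curl` output. This is a sketch under assumptions: `fetch` and `measure` are hypothetical names, and a local `func start` session will not reproduce the instance allocation cost seen in Azure, so the numbers only show import and session setup effects.

```python
import time
import urllib.request


def fetch(url: str) -> int:
    """Issue one GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        resp.read()
        return resp.status


def measure(call, attempts: int = 5) -> dict:
    """Time `call` several times; report the first latency and the mean of the rest."""
    latencies = []
    for _ in range(attempts):
        start = time.perf_counter()
        call()
        latencies.append((time.perf_counter() - start) * 1000.0)
    rest = latencies[1:]
    return {
        "first_ms": latencies[0],
        "rest_avg_ms": sum(rest) / max(len(rest), 1),
    }


# Against the local host (requires `func start` to be running):
#   measure(lambda: fetch("http://localhost:7071/api/cold-start-demo?ping=0"))

# Offline demonstration with a stand-in workload instead of a network call.
demo = measure(lambda: time.sleep(0.001), attempts=3)
print(demo)
```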

Expected Output

[INFO] Warmup trigger invoked; preloading requests session.
[INFO] HTTP cold-start demo invoked. session_reused=True dependency_loaded=True
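
The source does not show how `session_reused` and `dependency_loaded` are computed. One plausible approach, sketched here as an assumption, is to inspect the module-level cache and `sys.modules` before touching the cache. `describe_invocation` and `DEPENDENCY` are hypothetical names, and `decimal` with a bare `object()` stand in for `requests` and `requests.Session()` so the snippet runs without third-party packages.

```python
import importlib
import sys

DEPENDENCY = "decimal"  # stand-in for "requests"
_cached_session = None


def get_session():
    """Create the stand-in session once per process and reuse it afterwards."""
    global _cached_session
    if _cached_session is None:
        importlib.import_module(DEPENDENCY)
        _cached_session = object()  # stand-in for requests.Session()
    return _cached_session


def describe_invocation() -> dict:
    """Record cache state as seen on entry, then make sure the session exists."""
    state = {
        "session_reused": _cached_session is not None,
        "dependency_loaded": DEPENDENCY in sys.modules,
    }
    get_session()
    return state


cold = describe_invocation()  # first call in this process
warm = describe_invocation()  # any later call
print(cold, warm)
```

With this approach, a request that arrives after the warmup function has run reports both flags as true, which matches the log lines above.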

Production Considerations

  • Pre-warmed instances are available on the Premium plan, which is the most direct mitigation for user-facing APIs.
  • Warmup triggers run only when a new instance is added during scale-out; restarts and other non-scale startups do not invoke them, so they do not cover every cold start.
  • Keep warmup work small and deterministic; preload only the dependencies that materially improve latency.
  • Module-level caches are per worker process and are lost whenever the host recycles, so request code must be able to rebuild them on demand.
  • Pair latency tuning with azure-functions-doctor-python checks so storage, app settings, and extension issues do not masquerade as cold-start problems.