
# Lab Guide: Cold Start on Azure Functions

This Level 3 lab reproduces and analyzes Azure Functions cold-start behavior across hosting plans, with emphasis on FC1 Flex Consumption evidence. You will build a falsifiable timeline that separates worker provisioning delay from host startup time and request execution time.

## Lab Metadata

| Field | Value |
|---|---|
| Difficulty | L3 (advanced troubleshooting and evidence correlation) |
| Duration | 60-90 minutes |
| Tier | Azure Functions Flex Consumption (FC1) primary; comparison with Consumption (Y1) and Premium (EP) |
| Runtime | Python 3.11 / Functions v4 (HTTP trigger scenario) |
| Trigger mechanism | Idle period -> first HTTP request, restart cycle, and scale event observation |
| Key endpoints | /api/health, /api/info |
| Diagnostic categories | requests, traces, dependencies, App Insights query exports |
| Artifact root | labs/cold-start/artifacts/ |

What this lab is designed to prove

This lab is intentionally designed to falsify an oversimplified claim: "cold start means host startup is always slow." The FC1 evidence shows a different reality:

- A full-idle cold hit can take around 30.485s end-to-end on the client side.
- Host startup can still be fast (Host started (363ms), Host started (453ms)).
- Worker provisioning and assignment dominate full-idle latency.
- Scale-out events create intermediate latency bands (1719ms-1842ms server-side) without a full idle cold path.
- The fully warm baseline remains low (3.63ms-5.86ms server-side, 67ms-99ms client-side).


## 1) Background

Cold start in Azure Functions is not a single runtime metric. It is a chain of platform and application phases that may overlap, and each phase leaves different telemetry signatures.

### 1.1 Cold-start phase model

flowchart TD
    A[Trigger: first request after idle, restart, scale-out, host move] --> B[Platform worker allocation]
    B --> C["Site assignment and container/runtime bring-up"]
    C --> D[Functions host initialization]
    D --> E[Language worker initialization]
    E --> F[Trigger listener and route readiness]
    F --> G[First invocation executes]
    G --> H[Steady warm execution]
    B --> B1[Potential dominant delay on FC1 full idle]
    D --> D1[Often sub-second in healthy cases]
    G --> G1[Client sees combined elapsed path]

### 1.2 Platform cold start vs app cold start

| Scope | What changed | Dominant latency source | Typical evidence | Operational meaning |
|---|---|---|---|---|
| Platform cold start | No active worker instance for app; new worker assignment required | Worker provisioning and scheduling | High client first-hit time + normal/fast host-start trace | Capacity path cost, not necessarily app code regression |
| App/host cold start | Host/runtime starts on an already allocated worker | Host init + extension load + worker init | Starting Host, Job host started, Host started (Xms) | Runtime startup overhead inside already allocated compute |
| Scale event warm-up | Additional instance joins under load | Partial warm-up + listener readiness | Worker started traces + moderate duration bumps | Elastic behavior; can add 1-2s class spikes |
| Warm steady state | Existing initialized host serves requests | Function logic and dependencies | Low stable request duration and low jitter | Normal healthy baseline |

### 1.3 App under test
The lab workload is a minimal HTTP-triggered app used to expose startup and request timing differences without heavy downstream dependency noise.
Design intent for this lab:
1. Preserve a very low warm execution baseline.
2. Trigger a full idle-to-first-hit path in FC1.
3. Compare full idle cold, post-restart cold, and fully warm behavior.
4. Cross-check timeline with host startup traces from traces table.
The intentionally measured outcomes from this repository's recorded run:
- Full-idle cold request: 30.485s client-side.
- Post-restart first request: 3.156s client-side.
- Warm requests: 0.067s-0.099s client-side.
- Warm server-side execution (health): 3.63ms-5.86ms.
- Scale-event server-side burst: 1719ms-1842ms.
### 1.4 Request-path and startup-path timing differences
A single request timing value does not directly expose phase attribution.

| Timing perspective | Measured by | Includes | Excludes | Common misread |
|---|---|---|---|---|
| Client end-to-end (curl time_total) | Client | DNS, TLS, frontend queueing, worker assignment wait, host readiness, function execution | Internal platform phase labels | "Function code took 30 seconds" |
| Request duration (requests.duration) | Application Insights requests | Server-side request handling window | Pre-routing worker allocation delay in some cases | "No cold start because request duration is small" |
| Host startup traces | traces | Host lifecycle checkpoints | Network/TLS/client overhead | "Host started fast therefore no cold start happened" |
| Scale-event trace + request correlation | traces + requests | Mid-flight elasticity behavior | Full idle-to-allocate phase certainty | "All spikes are dependency latency" |

Interpretation rule for this lab:
- Use at least two channels for attribution: traces lifecycle + requests duration bands.
- Treat client-side first hit as symptom, not root cause.
- Validate whether host startup is the bottleneck or merely one sub-phase.
### 1.5 Timeline diagram
sequenceDiagram
    participant Client as External Client
    participant FE as Azure Front End
    participant Scale as Scale Controller
    participant Worker as FC1 Worker
    participant Host as Functions Host
    participant Func as HTTP Function
    Client->>FE: First request after idle
    FE->>Scale: Need available worker
    Scale->>Worker: Allocate/provision worker
    Worker->>Host: Start host process
    Host-->>Worker: Host started ("363ms / 453ms observed")
    Worker->>Func: Route first invocation
    Func-->>Client: HTTP 200 (30.485s observed in full idle case)
    Note over Client,Worker: Full idle cold path dominated by worker provisioning
    Client->>FE: Warm follow-up request
    FE->>Worker: Route to warm instance
    Worker->>Func: Execute function
    Func-->>Client: HTTP 200 (67-99ms observed)

### 1.6 Warm-up and mitigation controls

| Control | Plan availability | What it does | Cold-start impact | Notes for this lab |
|---|---|---|---|---|
| Always-ready instances | Flex Consumption, Premium | Keeps baseline instances active | Reduces idle-to-first-hit delay | Critical comparison concept for FC1 |
| Pre-warmed instances | Premium | Maintains pre-initialized workers for burst | Minimizes scale-out delay and startup variance | Premium requires minimum always-ready footprint |
| Minimum always-ready instance count | Premium (at least 1 practical baseline) | Keeps app hot | Eliminates scale-to-zero cold start | Cost tradeoff accepted for low-latency SLO |
| Consumption default scale-to-zero | Y1 | No always-ready reserve | Typical cold starts in 5-15s band | Cost-optimized, latency-variable profile |
| Startup code optimization | All plans | Reduce module import/init time | Improves host/app startup sub-phase | Does not remove worker allocation delay |
| Scheduled synthetic warm traffic | All plans | Keeps app active | Can reduce idle cold frequency | Operational workaround, not platform guarantee |

### 1.7 Why this matters for troubleshooting quality
Misclassification of cold-start signals creates expensive operational mistakes:
1. False rollback decisions when code is healthy.
2. Escalation to app team when capacity-path behavior is the true driver.
3. Incorrect mitigation (dependency tuning) for a provisioning bottleneck.
4. Under-investment in plan choice (Y1 vs FC1 vs EP) for latency-sensitive workloads.
A high-quality triage process must identify:
- Whether latency cluster is full idle provisioning, host startup, scale-out warm-up, or dependency delay.
- Whether mitigation is architectural (plan and always-ready) or code-level (startup workload reduction).
### 1.8 MS Learn grounding
This lab aligns to Microsoft Learn guidance on:
- Azure Functions hosting plans and scale behavior.
- Monitoring and diagnosing Azure Functions with Application Insights.
- Cold-start mitigation strategies through plan capabilities and startup optimization.
Authoritative links are listed in Sources.
---
## 2) Hypothesis
### 2.1 Formal hypothesis statement
> In Azure Functions Flex Consumption (FC1), severe first-hit latency after idle is primarily driven by worker provisioning and assignment delay, while host startup itself can remain fast (sub-second); therefore, host startup traces and end-to-end latency can diverge significantly.
### 2.2 Causal chain
flowchart LR
    A[Idle period with no active instance] --> B[First request arrives]
    B --> C[Platform allocates worker]
    C --> D[Host starts quickly once worker available]
    D --> E[Function invocation executes]
    E --> F[Client receives response]
    C --> C1[Potential long delay window]
    D --> D1["Observed 363ms / 453ms host start"]
    F --> F1[Observed ~30.5s full-idle client latency]

### 2.3 Proof criteria

All criteria below should be satisfied to support the hypothesis:

1. traces include healthy/fast host startup messages (for example Host started (363ms)).
2. Full-idle first-hit client latency is dramatically larger than host startup duration.
3. Warm baseline remains low and stable (67ms-99ms client-side; 3.63ms-5.86ms server-side).
4. Scale-event latency band appears between warm and full-idle cold (1719ms-1842ms server-side observed).
5. Post-restart first hit is significantly lower than full-idle first hit (3.156s vs 30.485s).
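The five proof criteria can be checked mechanically once the measurements are collected. The following is a minimal Python sketch of such a check, not part of the lab tooling: the `RunEvidence` container, the `evaluate_proof_criteria` helper, and every numeric cut-off (1000 ms for a "fast" host start, a 10x divergence factor, a 100 ms warm ceiling, a 2x restart gap) are illustrative assumptions rather than thresholds defined by this lab.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunEvidence:
    """Hypothetical container for the measurements of one lab run."""
    host_start_ms: tuple[int, ...]
    full_idle_client_s: float
    restart_client_s: float
    warm_client_s: tuple[float, ...]
    warm_server_ms: tuple[float, ...]
    scale_server_ms: tuple[float, ...]


def evaluate_proof_criteria(ev: RunEvidence) -> dict[str, bool]:
    """Return one pass/fail flag per proof criterion (cut-offs are assumptions)."""
    full_idle_ms = ev.full_idle_client_s * 1000
    slowest_host_start = max(ev.host_start_ms)
    return {
        # 1. Host startup traces are healthy/fast (assumed: sub-second).
        "1_fast_host_start": slowest_host_start < 1000,
        # 2. Full-idle latency dwarfs host startup (assumed: at least 10x).
        "2_idle_far_exceeds_host_start": full_idle_ms > 10 * slowest_host_start,
        # 3. Warm baseline stays low (assumed: every sample under 100 ms).
        "3_warm_baseline_low": max(ev.warm_client_s) < 0.100,
        # 4. Scale band sits between warm server time and full-idle cold.
        "4_scale_band_in_between": (
            max(ev.warm_server_ms) < min(ev.scale_server_ms)
            and max(ev.scale_server_ms) < full_idle_ms
        ),
        # 5. Restart first hit is well below full-idle (assumed: under half).
        "5_restart_below_full_idle": ev.restart_client_s * 2 < ev.full_idle_client_s,
    }


# Values recorded in this lab's FC1 run.
recorded = RunEvidence(
    host_start_ms=(363, 453),
    full_idle_client_s=30.485,
    restart_client_s=3.156,
    warm_client_s=(0.091, 0.067, 0.074, 0.099, 0.084, 0.074),
    warm_server_ms=(3.63, 5.86),
    scale_server_ms=(1719.0, 1842.0),
)

for name, passed in evaluate_proof_criteria(recorded).items():
    print(f"{name}: {'pass' if passed else 'fail'}")
```

With the recorded values every criterion passes. If your own run fails criterion 1 or 2, that points toward the disproof conditions rather than a provisioning-dominant cold path.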

### 2.4 Disproof criteria

Any condition below weakens or falsifies the hypothesis:

- Host startup trace durations are consistently high and close to first-hit total latency.
- Warm baseline remains elevated after startup window (persistent regression).
- Dependency failures or timeouts explain most of the latency increase.
- No evidence of idle-to-first-hit or scale transitions in telemetry windows.

### 2.5 Expected outcomes

Expected by plan profile in this lab context:

| Plan | Expected cold behavior | Typical first-hit range | Lab interpretation focus |
|---|---|---|---|
| FC1 (Flex Consumption) | Full idle can show high tail due to provisioning; host startup often fast | Can reach 30s+ in worst full-idle paths | Separate provisioning latency from host startup duration |
| Y1 (Consumption) | Scale-to-zero common; cold path more visible | Often 5-15s, workload dependent | Cost-first with variable first-hit latency |
| EP (Premium) | Always-ready and pre-warmed reduce cold impact | Usually low variance first hit | Validate configuration of always-ready/pre-warmed counts |

### 2.6 Counter-hypothesis tested implicitly

Counter-hypothesis:

> "If first hit is slow, host startup must also be slow."

This lab is expected to disprove that simplification in FC1 by showing a long client cold path with fast host-start traces.


## 3) Runbook

Use this runbook exactly as written to collect reproducible evidence.

### 3.1 Prerequisites

| Requirement | Validation command | Expected output pattern |
|---|---|---|
| Azure CLI authenticated | az account show --output table | Active subscription row with masked ID |
| Functions Core Tools | func --version | v4.x |
| Python runtime | python3 --version | Python 3.11.x |
| curl | curl --version | curl version line |
| App Insights query access | az monitor app-insights component show --app "$AI_NAME" --resource-group "$RG" --output table | Component details |
| Lab source present | ls "labs/cold-start" | Contains lab assets (at minimum README.md) |

### 3.2 Variables
Use the repository variable conventions in commands.
RG="rg-coldstart-lab-krc"
BASE_NAME="coldstartlab"
APP_NAME="${BASE_NAME}-func"
AI_NAME="${BASE_NAME}-insights"
LOCATION="koreacentral"
For consistency in documentation and triage notes, reference these as $RG, $BASE_NAME, $APP_NAME, $AI_NAME, and $LOCATION. The Bicep template derives resource names from $BASE_NAME, so $APP_NAME and $AI_NAME are computed to match.
Optional helper variables:
APP_URL="https://$APP_NAME.azurewebsites.net"
START_TS_UTC=$(date --utc +"%Y%m%dT%H%M%SZ")
ARTIFACT_ROOT="labs/cold-start/artifacts/$START_TS_UTC"
mkdir -p "$ARTIFACT_ROOT"
### 3.3 Deploy infrastructure
Deploy the lab infrastructure from infra/flex-consumption/.
az group create \
  --name "$RG" \
  --location "$LOCATION"
az deployment group create \
  --resource-group "$RG" \
  --template-file "infra/flex-consumption/main.bicep" \
  --parameters baseName="$BASE_NAME"
Deploy application package from the apps/python/ directory:
func azure functionapp publish "$APP_NAME" --python
Run this command from apps/python/ where function_app.py and host.json reside.
### 3.4 Verify baseline configuration
Run baseline checks before triggering cold path measurements.
az functionapp show \
  --resource-group "$RG" \
  --name "$APP_NAME" \
  --output table
az functionapp config show \
  --resource-group "$RG" \
  --name "$APP_NAME"
curl --silent --show-error "$APP_URL/api/health"
curl --silent --show-error "$APP_URL/api/info"
Expected baseline snippets (sanitized):
State              Running
DefaultHostName    <masked-function-app>.azurewebsites.net
Kind               functionapp,linux
{"status":"healthy","timestamp":"2026-04-04T11:11:32Z","version":"1.0.0"}
### 3.5 Trigger measurement workflow
Execute the measurement workflow in this order:
1. Capture warm baseline with 3 quick requests.
2. Keep app idle for ~13 minutes.
3. Send first request and capture full idle cold latency.
4. Send immediate follow-up requests to capture warm band.
5. Restart app and capture post-restart first request.
6. Query telemetry with fixed time range and export artifacts.
Warm baseline and idle-to-cold sample commands:
curl --silent --show-error --output /dev/null --write-out "warm_1 %{http_code} %{time_total}\n" "$APP_URL/api/health"
curl --silent --show-error --output /dev/null --write-out "warm_2 %{http_code} %{time_total}\n" "$APP_URL/api/health"
curl --silent --show-error --output /dev/null --write-out "warm_3 %{http_code} %{time_total}\n" "$APP_URL/api/health"
sleep 780
curl --silent --show-error --output /dev/null --write-out "cold_idle %{http_code} %{time_total}\n" "$APP_URL/api/health"
curl --silent --show-error --output /dev/null --write-out "warm_after_cold %{http_code} %{time_total}\n" "$APP_URL/api/health"
Restart and post-restart sample:
az functionapp restart \
  --resource-group "$RG" \
  --name "$APP_NAME"
curl --silent --show-error --output /dev/null --write-out "cold_after_restart %{http_code} %{time_total}\n" "$APP_URL/api/health"
curl --silent --show-error --output /dev/null --write-out "warm_after_restart %{http_code} %{time_total}\n" "$APP_URL/api/health"
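If you prefer to script the measurement order instead of running curl by hand, the sketch below shows one possible shape in Python. It is an illustration only: `timed_request` and `http_status` are hypothetical helper names, and the elapsed time covers the call until response headers are received, so it is comparable to but not identical with curl's time_total.

```python
import time
import urllib.request
from typing import Callable


def http_status(url: str, timeout_s: float = 120.0) -> int:
    """Fetch a URL and return its HTTP status code.

    Note: urllib raises HTTPError for 4xx/5xx responses instead of returning.
    The generous timeout is an assumption sized for a ~30s full-idle cold hit.
    """
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.status


def timed_request(label: str, fetch: Callable[[], int]) -> str:
    """Time one call and format it like the curl --write-out lines above."""
    start = time.perf_counter()
    status = fetch()
    elapsed = time.perf_counter() - start
    return f"{label} {status} {elapsed:.3f}"


# Offline demonstration with a stand-in fetch; replace the lambda with
# lambda: http_status(app_url + "/api/health") against a deployed app.
print(timed_request("warm_1", lambda: 200))
```

Passing the fetch as a callable keeps the timing logic testable without network access and lets the same helper time the warm, idle-cold, and post-restart samples.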
### 3.6 Manual fallback
If the automation script is unavailable, use this deterministic fallback set.
#### 3.6.1 Generate time-tagged artifact folder
START_TS_UTC=$(date --utc +"%Y%m%dT%H%M%SZ")
ARTIFACT_ROOT="labs/cold-start/artifacts/$START_TS_UTC"
mkdir -p "$ARTIFACT_ROOT"
#### 3.6.2 Capture latency series into CSV
for i in 1 2 3; do
  curl --silent --show-error --output /dev/null --write-out "$i,%{http_code},%{time_total}\n" "$APP_URL/api/health"
done > "$ARTIFACT_ROOT/warm-baseline.csv"
sleep 780
curl --silent --show-error --output /dev/null --write-out "idle_cold,%{http_code},%{time_total}\n" "$APP_URL/api/health" > "$ARTIFACT_ROOT/idle-cold.csv"
az functionapp restart --resource-group "$RG" --name "$APP_NAME"
curl --silent --show-error --output /dev/null --write-out "restart_cold,%{http_code},%{time_total}\n" "$APP_URL/api/health" > "$ARTIFACT_ROOT/restart-cold.csv"
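The CSV rows written by the fallback commands have three fields: label, HTTP status, and time_total in seconds. A minimal Python sketch for loading them is shown here; `load_samples` is a hypothetical helper, and the inline text reuses the three warm samples recorded in this lab rather than reading a real artifact file.

```python
import csv
import io
from statistics import mean


def load_samples(csv_text: str) -> list[tuple[str, int, float]]:
    """Parse 'label,http_code,time_total' rows; blank lines are skipped."""
    samples = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue
        label, code, seconds = row
        samples.append((label, int(code), float(seconds)))
    return samples


# Same shape as warm-baseline.csv, using the recorded warm samples.
warm_csv = "1,200,0.091\n2,200,0.067\n3,200,0.074\n"
warm = load_samples(warm_csv)

all_ok = all(code == 200 for _, code, _ in warm)
print(f"rows={len(warm)} all_200={all_ok} mean_s={mean(s for _, _, s in warm):.3f}")
```

For a real run, read the file content from the artifact folder and pass it to the same function.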
#### 3.6.3 Capture health and config snapshots
az functionapp show \
  --resource-group "$RG" \
  --name "$APP_NAME" \
  --output json > "$ARTIFACT_ROOT/functionapp-show.json"
az functionapp config show \
  --resource-group "$RG" \
  --name "$APP_NAME" \
  --output json > "$ARTIFACT_ROOT/functionapp-config.json"
curl --silent --show-error "$APP_URL/api/health" > "$ARTIFACT_ROOT/health.json"
### 3.7 Collect KQL evidence
Use the KQL library queries below, replacing the placeholder app name (func-myapp-prod) with the value of $APP_NAME.
#### Query 1: Function execution summary
let appName = "func-myapp-prod";
requests
| where timestamp > ago(1h)
| where cloud_RoleName =~ appName
| where operation_Name startswith "Functions."
| summarize
    Invocations = count(),
    Failures = countif(success == false),
    FailureRatePercent = round(100.0 * countif(success == false) / count(), 2),
    P95Ms = round(percentile(duration, 95), 2)
  by FunctionName = operation_Name
| order by Failures desc, P95Ms desc
#### Query 3: Cold start analysis
let appName = "func-myapp-prod";
traces
| where timestamp > ago(6h)
| where cloud_RoleName =~ appName
| where message has_any ("Host started", "Initializing Host", "Host lock lease acquired")
| summarize StartupEvents=count() by bin(timestamp, 5m)
| join kind=leftouter (
    requests
    | where timestamp > ago(6h)
    | where cloud_RoleName =~ appName
    | where operation_Name startswith "Functions."
    | summarize FirstInvocation=min(timestamp), FirstDurationMs=arg_min(timestamp, round(duration, 2)) by bin(timestamp, 5m)
) on timestamp
| order by timestamp desc
#### Query 7: Scaling events timeline
let appName = "func-myapp-prod";
traces
| where timestamp > ago(6h)
| where cloud_RoleName =~ appName
| where message has_any ("scale", "instance", "worker", "concurrency", "drain")
| project timestamp, severityLevel, message
| order by timestamp desc
#### Query 8: Host startup/shutdown events
let appName = "func-myapp-prod";
traces
| where timestamp > ago(12h)
| where cloud_RoleName =~ appName
| where message has_any ("Starting Host", "Host started", "Job host started", "Host shutdown", "Host is shutting down", "Stopping JobHost")
| project timestamp, severityLevel, message
| order by timestamp desc
CLI execution examples for App Insights:
az monitor app-insights query \
  --apps "$AI_NAME" \
  --resource-group "$RG" \
  --analytics-query "traces | where timestamp > ago(12h) | where cloud_RoleName =~ '$APP_NAME' | where message has_any ('Starting Host','Host started','Job host started','Host shutdown','Host is shutting down','Stopping JobHost') | project timestamp, severityLevel, message | order by timestamp desc" \
  --output table
az monitor app-insights query \
  --apps "$AI_NAME" \
  --resource-group "$RG" \
  --analytics-query "traces | where timestamp > ago(6h) | where cloud_RoleName =~ '$APP_NAME' | where message has_any ('scale','instance','worker','concurrency','drain') | project timestamp, severityLevel, message | order by timestamp desc" \
  --output table
Export JSON artifacts for incident evidence:
az monitor app-insights query \
  --apps "$AI_NAME" \
  --resource-group "$RG" \
  --analytics-query "let appName = '$APP_NAME'; requests | where timestamp > ago(1h) | where cloud_RoleName =~ appName | where operation_Name startswith 'Functions.' | summarize Invocations = count(), Failures = countif(success == false), FailureRatePercent = round(100.0 * countif(success == false) / count(), 2), P95Ms = round(percentile(duration, 95), 2) by FunctionName = operation_Name | order by Failures desc, P95Ms desc" \
  --output json > "$ARTIFACT_ROOT/kql-query1-function-execution-summary.json"
az monitor app-insights query \
  --apps "$AI_NAME" \
  --resource-group "$RG" \
  --analytics-query "let appName = '$APP_NAME'; traces | where timestamp > ago(6h) | where cloud_RoleName =~ appName | where message has_any ('Host started', 'Initializing Host', 'Host lock lease acquired') | summarize StartupEvents=count() by bin(timestamp, 5m) | join kind=leftouter (requests | where timestamp > ago(6h) | where cloud_RoleName =~ appName | where operation_Name startswith 'Functions.' | summarize FirstInvocation=min(timestamp), FirstDurationMs=arg_min(timestamp, round(duration, 2)) by bin(timestamp, 5m)) on timestamp | order by timestamp desc" \
  --output json > "$ARTIFACT_ROOT/kql-query3-cold-start-analysis.json"
az monitor app-insights query \
  --apps "$AI_NAME" \
  --resource-group "$RG" \
  --analytics-query "let appName = '$APP_NAME'; traces | where timestamp > ago(6h) | where cloud_RoleName =~ appName | where message has_any ('scale', 'instance', 'worker', 'concurrency', 'drain') | project timestamp, severityLevel, message | order by timestamp desc" \
  --output json > "$ARTIFACT_ROOT/kql-query7-scaling-events.json"
az monitor app-insights query \
  --apps "$AI_NAME" \
  --resource-group "$RG" \
  --analytics-query "let appName = '$APP_NAME'; traces | where timestamp > ago(12h) | where cloud_RoleName =~ appName | where message has_any ('Starting Host', 'Host started', 'Job host started', 'Host shutdown', 'Host is shutting down', 'Stopping JobHost') | project timestamp, severityLevel, message | order by timestamp desc" \
  --output json > "$ARTIFACT_ROOT/kql-query8-host-startup-shutdown.json"
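The exported JSON files can be flattened into row dictionaries for later correlation. The sketch below assumes the export follows the tables/columns/rows layout used by Application Insights query results; verify that against your own artifact before relying on it. `rows_as_dicts` is a hypothetical helper, and the sample payload is hand-written for illustration (the severityLevel value in particular is made up).

```python
import json


def rows_as_dicts(export_text: str) -> list[dict]:
    """Convert the first result table into a list of {column: value} dicts.

    Assumes a payload shaped like:
    {"tables": [{"columns": [{"name": ...}, ...], "rows": [[...], ...]}]}
    """
    payload = json.loads(export_text)
    table = payload["tables"][0]
    names = [column["name"] for column in table["columns"]]
    return [dict(zip(names, row)) for row in table["rows"]]


# Hand-written sample in the assumed layout (not a real export).
sample_export = json.dumps({
    "tables": [{
        "name": "PrimaryResult",
        "columns": [
            {"name": "timestamp", "type": "datetime"},
            {"name": "severityLevel", "type": "int"},
            {"name": "message", "type": "string"},
        ],
        "rows": [
            ["2026-04-04T11:32:20Z", 1, "Worker process started and initialized."],
            ["2026-04-04T11:31:50Z", 1, "Worker process started and initialized."],
        ],
    }]
})

for record in rows_as_dicts(sample_export):
    print(record["timestamp"], record["message"])
```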
### 3.8 Real output snippets
The following snippets are from the repository's real FC1 lab evidence (sanitized).
#### Host startup traces
Host started (363ms)
Host started (453ms)
#### End-to-end latency captures
Warm baseline:
warm_1 200 0.091
warm_2 200 0.067
warm_3 200 0.074
Idle cold:
cold_idle 200 30.485
Post-restart cold:
cold_after_restart 200 3.156
Warm again:
warm_after_restart 200 0.099
warm_after_restart 200 0.084
warm_after_restart 200 0.074
#### Server-side duration band
health function:
- warm: 3.63ms to 5.86ms
- cold start or scale-out warm-up (attribution requires correlation with traces): 1719ms to 1842ms
#### Scaling timeline snippet
2026-04-04T11:32:20Z  Worker process started and initialized.
2026-04-04T11:31:50Z  Worker process started and initialized.
2026-04-04T11:31:20Z  Worker process started and initialized.
2026-04-04T11:30:50Z  Worker process started and initialized.
The message "Worker process started and initialized." may indicate scale-out, restart, or initial startup; it does not identify which one on its own.
### 3.9 Interpretation checklist
Use this checklist while triaging:

| Check | Evidence source | Pass condition |
|---|---|---|
| Warm baseline captured | warm-baseline.csv | At least 3 successful rows |
| Full-idle cold captured | idle-cold.csv | One clear high-latency outlier vs warm |
| Host startup traces found | Query 8 | Host started (Xms) present near window |
| Scale events visible | Query 7 | Worker/instance traces during spikes |
| Request duration distribution stable warm | Query 1 | Low P95 once warm |
| Cold-start correlation present | Query 3 | Startup event bins align with higher first duration |

### 3.10 Decision logic during triage

flowchart TD
    A[User reports first-hit latency spike] --> B{Warm requests also slow?}
    B -->|No| C{Host startup traces healthy and quick?}
    B -->|Yes| D[Investigate sustained regression path]
    C -->|Yes| E{Idle or scale event present in same window?}
    C -->|No| F["Investigate host/app startup regression"]
    E -->|Idle full cold| G[Classify as provisioning-dominant cold start]
    E -->|Scale-out only| H[Classify as elastic warm-up latency]
    E -->|Neither| I[Check dependency latency and retries]
    D --> J[Correlate dependencies, exceptions, concurrency]
    F --> K[Review startup code, extension load, config]

## 4) Experiment Log (Artifact-Based)

This section records a complete evidence chain using the FC1 run profile already captured in this repository context.

### 4.1 Artifact inventory

| Category | Artifact path | Purpose |
|---|---|---|
| Baseline configuration | labs/cold-start/artifacts/<timestamp>/functionapp-show.json | Capture app state and hostname |
| Baseline config detail | labs/cold-start/artifacts/<timestamp>/functionapp-config.json | Confirm runtime and startup settings |
| Warm baseline latency | labs/cold-start/artifacts/<timestamp>/warm-baseline.csv | Establish low-latency reference |
| Idle cold latency | labs/cold-start/artifacts/<timestamp>/idle-cold.csv | Capture full idle first-hit delay |
| Restart cold latency | labs/cold-start/artifacts/<timestamp>/restart-cold.csv | Compare against full idle cold |
| Health snapshot | labs/cold-start/artifacts/<timestamp>/health.json | Confirm endpoint health |
| KQL cold-start export | labs/cold-start/artifacts/<timestamp>/kql-query3-cold-start-analysis.json | Bin-level startup and first duration correlation |
| KQL scale export | labs/cold-start/artifacts/<timestamp>/kql-query7-scaling-events.json | Scale timeline evidence |
| KQL host lifecycle export | labs/cold-start/artifacts/<timestamp>/kql-query8-host-startup-shutdown.json | Host lifecycle sequence |
| KQL execution summary export | labs/cold-start/artifacts/<timestamp>/kql-query1-function-execution-summary.json | Warm/cold duration distribution and failures |

### 4.2 Baseline evidence snapshot
#### 4.2.1 Function app state
Name              <masked-function-app>
Location          koreacentral
State             Running
Kind              functionapp,linux
DefaultHostName   <masked-function-app>.azurewebsites.net
#### 4.2.2 Baseline health
{
  "status": "healthy",
  "timestamp": "2026-04-04T11:10:52Z",
  "version": "1.0.0"
}
#### 4.2.3 Baseline plan assumptions table

| Plan assumption | Baseline value | Why it matters |
|---|---|---|
| Hosting profile | FC1 Flex Consumption | Can scale to zero and scale out rapidly |
| Initial idle period before cold sample | ~13 minutes | Increases likelihood of triggering full idle cold path |
| Endpoint for measurement | /api/health | Lightweight endpoint isolates startup path |
| Runtime | Python 3.11 | Startup profile includes worker initialization |
| Region | koreacentral | Keeps measurements region-specific |

### 4.3 Latency dataset (raw)
#### 4.3.1 Warm baseline dataset

| Sample | HTTP code | Client time_total (s) |
|---|---:|---:|
| Warm 1 | 200 | 0.091 |
| Warm 2 | 200 | 0.067 |
| Warm 3 | 200 | 0.074 |

#### 4.3.2 Full idle cold dataset

| Sample | HTTP code | Client time_total (s) |
|---|---:|---:|
| Cold hit after ~13 min idle | 200 | 30.485 |

#### 4.3.3 Post-restart cold dataset

| Sample | HTTP code | Client time_total (s) |
|---|---:|---:|
| First hit after restart | 200 | 3.156 |

#### 4.3.4 Warm-after-restart dataset

| Sample | HTTP code | Client time_total (s) |
|---|---:|---:|
| Warm R1 | 200 | 0.099 |
| Warm R2 | 200 | 0.084 |
| Warm R3 | 200 | 0.074 |

### 4.4 Latency summary statistics

| Metric | Value |
|---|---:|
| Warm mean (3 samples) | 0.077 s |
| Warm min | 0.067 s |
| Warm max | 0.091 s |
| Full idle cold | 30.485 s |
| Restart cold | 3.156 s |
| Warm-after-restart mean (3 samples) | 0.086 s |

Derived ratios:

| Comparison | Ratio |
|---|---:|
| Full idle cold / warm mean | 395.9x |
| Restart cold / warm mean | 41.0x |
| Full idle cold / restart cold | 9.7x |

Interpretation:
- Full idle path has an additional high-latency phase absent in restart path.
- Restart still incurs startup cost but much lower than full idle allocation path.
- Warm requests remain tightly clustered under 100ms client-side.
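The summary statistics and derived ratios can be recomputed from the raw samples, which is useful when substituting your own measurements. The sketch below uses only the values recorded in section 4.3; the variable names are illustrative. Note that the ratios are computed against the warm mean rounded to three decimals (0.077 s), which is what reproduces the 395.9x figure; using the unrounded mean gives roughly 394.2x.

```python
from statistics import mean

# Raw client-side samples (seconds) from section 4.3.
warm = [0.091, 0.067, 0.074]
warm_after_restart = [0.099, 0.084, 0.074]
full_idle_cold = 30.485
restart_cold = 3.156

warm_mean = round(mean(warm), 3)
warm_after_restart_mean = round(mean(warm_after_restart), 3)

ratios = {
    "Full idle cold / warm mean": round(full_idle_cold / warm_mean, 1),
    "Restart cold / warm mean": round(restart_cold / warm_mean, 1),
    "Full idle cold / restart cold": round(full_idle_cold / restart_cold, 1),
}

print(f"Warm mean: {warm_mean} s (min {min(warm)} s, max {max(warm)} s)")
print(f"Warm-after-restart mean: {warm_after_restart_mean} s")
for comparison, ratio in ratios.items():
    print(f"{comparison}: {ratio}x")
```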
### 4.5 Startup telemetry and host lifecycle observations
#### 4.5.1 Host lifecycle snippets (real)
Host started (363ms)
Host started (453ms)
#### 4.5.2 Health function duration bands (real)
warm: 3.63ms - 5.86ms
scale/cold event: 1719ms - 1842ms
#### 4.5.3 Meaning of the combined signal

| Signal | Observation | Attribution |
|---|---|---|
| Host startup message duration | Sub-second (363-453ms) | Host startup itself not severe bottleneck |
| Full idle first request | 30.485s | Dominant delay before/around worker allocation |
| Scale-event duration band | 1719-1842ms | Intermediate elasticity cost |
| Warm server-side duration | 3.63-5.86ms | Function logic path is healthy |

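To put the combined signal in numbers, the host-start durations can be extracted from the trace messages and expressed as a share of the full-idle client latency. The sketch below is illustrative: `host_start_ms` and `share_of_total_percent` are hypothetical helpers, and the calculation assumes the observed host-start traces belong to the same startup as the 30.485s request, which the recorded snippets alone do not prove.

```python
import re

# Matches messages such as "Host started (363ms)".
HOST_STARTED = re.compile(r"Host started \((\d+)ms\)")


def host_start_ms(messages: list[str]) -> list[int]:
    """Extract host startup durations in milliseconds from trace messages."""
    durations = []
    for message in messages:
        match = HOST_STARTED.search(message)
        if match:
            durations.append(int(match.group(1)))
    return durations


def share_of_total_percent(host_ms: int, client_total_s: float) -> float:
    """Host startup as a percentage of end-to-end client latency."""
    return round(100.0 * host_ms / (client_total_s * 1000.0), 1)


traces = ["Host started (363ms)", "Host started (453ms)", "Job host started"]
for duration in host_start_ms(traces):
    print(f"{duration}ms is {share_of_total_percent(duration, 30.485)}% of 30.485s")
```

Under that assumption, even the slower host start accounts for about 1.5% of the full-idle first hit, which is why the bulk of the delay is attributed to the phases before the host starts.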
### 4.6 KQL observations (query-by-query)
#### 4.6.1 Query 1 execution summary highlights
Expected pattern for this lab:

| Indicator | Expected |
|---|---|
| Failure rate | Near 0% for health endpoint |
| P95 warm duration | Low after warm-up window |
| Outlier spikes | Limited to cold/scale bins |

Sample interpretation table:

| FunctionName | Invocations | Failures |
|---|---:|---:|
| Functions.health | 40+ | 0 |
| Functions.info | Variable | 0 |

#### 4.6.2 Query 3 cold-start analysis highlights
What to validate:
1. Startup event bins appear in same windows as elevated first invocation duration.
2. FC1 can show many startup events without indicating failure by itself.
3. Gap between startup events and request windows helps identify allocation delays.
Interpretation table:

| StartupEvents trend | FirstDurationMs trend | Conclusion |
|---|---|---|
| Stable low/expected | Low | Warm steady state |
| Spike with elevated first duration | Elevated | Cold/scale transition present |
| Spike but low first duration | Low | Benign scale behavior possible |
| No startup events and high durations | High | Check dependency or app regression |

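The interpretation table maps two observations per time bin to a conclusion, so it can be expressed as a lookup. The sketch below is a simplified illustration: `interpret_bin` is a hypothetical helper, and reducing each trend to a boolean (spike or not, elevated or not) is an assumption that you must calibrate against your own warm baseline.

```python
def interpret_bin(startup_spike: bool, first_duration_elevated: bool) -> str:
    """Look up the Query 3 conclusion for one 5-minute bin."""
    conclusions = {
        (False, False): "Warm steady state",
        (True, True): "Cold/scale transition present",
        (True, False): "Benign scale behavior possible",
        (False, True): "Check dependency or app regression",
    }
    return conclusions[(startup_spike, first_duration_elevated)]


# Example: startup events spiked and the first invocation was slow.
print(interpret_bin(startup_spike=True, first_duration_elevated=True))
```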
#### 4.6.3 Query 7 scaling events timeline highlights
Representative lines seen in this repository context:
2026-04-04T11:32:20Z Worker process started and initialized.
2026-04-04T11:31:50Z Worker process started and initialized.
2026-04-04T11:31:20Z Worker process started and initialized.
2026-04-04T11:30:50Z Worker process started and initialized.
Interpretation:
- Burst of worker-initialization messages may indicate scale-out, restart, or initial startup.
- Intermediate latency (~1.7s-1.8s) is expected during such events.
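When reading a scaling timeline, the spacing between worker-start events is often more informative than their count. The sketch below computes the gaps between the representative lines shown above; `worker_start_gaps_s` is a hypothetical helper, and the timestamp format is assumed to match those lines exactly.

```python
from datetime import datetime

WORKER_MARKER = "Worker process started and initialized"


def worker_start_gaps_s(lines: list[str]) -> list[float]:
    """Return gaps in seconds between consecutive worker-start events."""
    stamps = []
    for line in lines:
        parts = line.split(maxsplit=1)  # tolerate one or more spaces after the timestamp
        if len(parts) == 2 and WORKER_MARKER in parts[1]:
            stamps.append(datetime.strptime(parts[0], "%Y-%m-%dT%H:%M:%SZ"))
    stamps.sort()
    return [
        (later - earlier).total_seconds()
        for earlier, later in zip(stamps, stamps[1:])
    ]


timeline = [
    "2026-04-04T11:32:20Z Worker process started and initialized.",
    "2026-04-04T11:31:50Z Worker process started and initialized.",
    "2026-04-04T11:31:20Z Worker process started and initialized.",
    "2026-04-04T11:30:50Z Worker process started and initialized.",
]
print(worker_start_gaps_s(timeline))
```

For the four recorded events the gaps are all 30 seconds. Regular spacing like this is worth correlating with request load in the same window before attributing it to scale-out.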
#### 4.6.4 Query 8 host startup/shutdown highlights
Expected healthy sequence:

| Timestamp pattern | Message pattern | Health meaning |
|---|---|---|
| Start window | Starting Host | Host lifecycle begins |
| Near start window | Job host started | Runtime initialization complete |
| Near start window | Host started (Xms) | Host readiness achieved |
| Steady state | No frequent shutdown loop | Stable runtime |

Abnormal pattern to watch:
- Repeated Host is shutting down and Host started loops with concurrent request failures.
### 4.7 Comparative plan interpretation
#### 4.7.1 FC1 (Flex Consumption)

| Characteristic | Observed / expected in this lab |
|---|---|
| Full idle cold path | Can exceed 30s client-side |
| Host startup duration | Often sub-second once worker allocated |
| Scale behavior | Rapid worker events; intermediate latency possible |
| Mitigation path | Always-ready instances and startup optimization |

#### 4.7.2 Y1 (Consumption)

Comparison context

The following values are typical expectations from Microsoft documentation, not measured evidence from this lab run.

| Characteristic | Typical expectation |
|---|---|
| Full idle cold path | Commonly 5-15s (workload dependent) |
| Always-ready support | Not available |
| Operational profile | Cost-efficient, latency variance higher |

#### 4.7.3 EP (Premium)

Comparison context

The following values are typical expectations from Microsoft documentation, not measured evidence from this lab run.

| Characteristic | Typical expectation |
|---|---|
| Baseline availability | At least one always-ready instance practical baseline |
| Pre-warmed behavior | Additional pre-warmed instances reduce scale-out spikes |
| Cold-start profile | Lowest variance when configured correctly |

### 4.8 Triangulated evidence table

| Evidence type | Real value / pattern |
|---|---|
| Client full-idle first hit | 30.485s |
| Host startup traces | Host started (363ms), Host started (453ms) |
| Warm request client latency | 67ms-99ms |
| Warm request server-side | 3.63ms-5.86ms |
| Scale-event server-side | 1719ms-1842ms |
| Post-restart cold | 3.156s |

### 4.9 Core finding and explanation

Core finding (validated with real FC1 evidence)

Full-idle cold latency (30.485s) is real and severe, but host startup trace durations remain fast (363ms, 453ms). Therefore, this run supports the conclusion that FC1 worker provisioning/allocation dominates the full idle cold path, while host initialization is a smaller sub-phase once compute is assigned. Additional evidence (1719ms-1842ms scale-event server-side band) confirms an intermediate latency class distinct from both full idle cold and fully warm steady state.

### 4.10 Hypothesis verdict

| Criterion | Verdict | Evidence |
|---|---|---|
| Full-idle first hit significantly high | Supported | 30.485s client sample |
| Host startup can still be fast | Supported | Host started (363ms), Host started (453ms) |
| Warm path remains healthy | Supported | 67ms-99ms client, 3.63ms-5.86ms server |
| Intermediate scale band exists | Supported | 1719ms-1842ms server-side |
| Counter-hypothesis (host startup must be slow) | Rejected | Large divergence between host-start ms and end-to-end cold latency |

Final verdict: Hypothesis supported.
### 4.11 Practical troubleshooting implications
1. When first hit is slow, do not stop at host startup duration.
2. Always compare full idle cold, restart cold, and warm baselines separately.
3. For FC1 latency-sensitive APIs, evaluate always-ready instance configuration first.
4. For Y1 workloads with strict p95 targets, reassess hosting plan suitability.
5. For EP workloads, verify always-ready and pre-warmed settings before code-level escalation.
6. Use scale-event traces to explain 1-2s spikes that are not full cold starts.
### 4.12 Reproducibility notes
- All identifiers in examples are masked (<subscription-id>, <masked-function-app>).
- Commands use long flags only.
- Values in the core findings section map to real repository evidence points from the cold-start lab context.
- If your measured values differ, preserve trend analysis logic rather than absolute thresholds.
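
One way to preserve the trend logic when absolute numbers differ is to express each cold-path latency as a multiple of the warm baseline. This is a sketch only: the function name and output keys are hypothetical, and the inputs are the reference values from this lab (30.485s first hit, 3.156s post-restart hit, 0.077s warm mean), assumed here to all be client-side.

```python
def relative_profile(first_hit_s: float, restart_hit_s: float, warm_mean_s: float) -> dict[str, float]:
    """Express cold-path latencies as multiples of the warm baseline."""
    if warm_mean_s <= 0 or restart_hit_s <= 0:
        raise ValueError("latencies must be positive")
    return {
        "idle_vs_warm": first_hit_s / warm_mean_s,
        "restart_vs_warm": restart_hit_s / warm_mean_s,
        "idle_vs_restart": first_hit_s / restart_hit_s,
    }


# Reference FC1 values from this lab, in seconds.
reference = relative_profile(first_hit_s=30.485, restart_hit_s=3.156, warm_mean_s=0.077)
for name, ratio in reference.items():
    print(f"{name}: {ratio:.0f}x")
```

In the reference data the full-idle hit is roughly 400 times the warm mean and the restart hit roughly 40 times. A run with different absolute numbers but a similar ordering and spread tells the same story.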
---
## Expected Evidence
Use this section as a validation rubric while running the lab.
### Before Trigger (Baseline)
| Evidence Source | Expected State | What to Capture |
|---|---|---|
| Function app state | Running and reachable | az functionapp show output snapshot |
| Warm request timings | Stable low latency | 3 baseline samples in CSV |
| Runtime trace health | No restart loop | Query 8 snapshot in baseline window |
| Endpoint response | 200 and healthy payload | health.json |
| Plan context | FC1 profile confirmed | Config snapshot with region/runtime |

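
The warm baseline samples listed above could be captured with a small client-side timer. This is a hedged sketch, not the lab's own tooling: `timed_get`, `capture_baseline`, the CSV column names, and the one-second pause are assumptions, and the stub value of 0.077s is a stand-in so the example runs without network access.

```python
import csv
import io
import time
from urllib.request import urlopen


def timed_get(url: str, timeout: float = 60.0) -> tuple[int, float]:
    """Return (HTTP status, elapsed seconds) for one GET, body fully read."""
    start = time.perf_counter()
    with urlopen(url, timeout=timeout) as resp:
        resp.read()
        status = resp.status
    return status, time.perf_counter() - start


def capture_baseline(url, out, samples=3, pause_s=1.0, fetch=timed_get):
    """Write one CSV row per sample: index, status, client-side seconds."""
    writer = csv.writer(out)
    writer.writerow(["sample", "status", "client_seconds"])
    rows = []
    for i in range(1, samples + 1):
        status, elapsed = fetch(url)
        writer.writerow([i, status, f"{elapsed:.3f}"])
        rows.append((i, status, elapsed))
        if i < samples:
            time.sleep(pause_s)
    return rows


# Dry run with a stub fetch so the sketch runs without network access.
# For a real capture, drop the `fetch` argument and pass your app's
# /api/health URL and an open file under labs/cold-start/artifacts/.
buffer = io.StringIO()
capture_baseline("https://example.invalid/api/health", buffer,
                 pause_s=0.0, fetch=lambda url: (200, 0.077))
print(buffer.getvalue())
```

Keeping the fetch function injectable makes it possible to reuse the same writer for the idle-cold and post-restart captures, so all datasets share one CSV layout.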
### During Incident
| Evidence Source | Expected State | Key Indicator |
|---|---|---|
| Full idle first-hit measurement | High latency outlier | ~30.485s client-side sample |
| Query 3 cold-start bins | Startup events near elevated first duration bins | Correlation visible in same period |
| Query 7 scale timeline | Worker start events during burst windows | Multiple worker-init traces |
| Query 8 lifecycle traces | Host-start sequence present | Host started (363ms) class message |
| Request duration telemetry | Intermediate spikes during scale events | 1719ms-1842ms server-side band |

### After Recovery
| Evidence Source | Expected State | Key Indicator |
|---|---|---|
| Warm follow-up requests | Return to low latency | 67ms-99ms client-side band |
| Server-side durations | Very low steady state | 3.63ms-5.86ms band |
| Failure metrics | No sustained error burst | Query 1 low failure rate |
| Host lifecycle | No repeated crash-loop signals | No persistent start/shutdown oscillation |
| Incident classification | Provisioning-dominant cold path | Hypothesis remains supported |

### Evidence Timeline
graph LR
    A[Baseline Warm Capture] --> B[Idle Window ~13m]
    B --> C[First Hit: 30.485s]
    C --> D["Host Traces: 363ms / 453ms"]
    D --> E[Scale Event Band: 1719-1842ms]
    E --> F[Warm Recovery: 67-99ms client, 3.63-5.86ms server]
    F --> G[Verdict: Provisioning-dominant cold start]

## Evidence Chain: Why This Proves the Hypothesis

Falsification logic

The hypothesis is supported only when all three classes of evidence align:

1. A severe full-idle client symptom (30.485s) is present.
2. Host startup duration remains fast (363ms/453ms), ruling out host-start as the dominant bottleneck.
3. Warm steady-state remains healthy (67-99ms client, 3.63-5.86ms server), ruling out sustained regression.

If warm path does not recover, or host startup durations are consistently large and close to first-hit totals, this hypothesis should be rejected and alternative causes investigated (dependency latency, runtime crash loop, or application startup regression).
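
The three checks above can be sketched as a single-run evaluator. The thresholds (`severe_s`, `host_fast_ms`, `warm_ok_ms`) are illustrative assumptions chosen to separate the reference bands; the lab does not define them, so adjust them to your own baseline.

```python
def evaluate_hypothesis(first_hit_s, host_started_ms, warm_client_ms,
                        severe_s=10.0, host_fast_ms=1000.0, warm_ok_ms=200.0):
    """Apply the three falsification checks to one run.

    The thresholds are illustrative assumptions, not values defined by the lab.
    """
    checks = {
        "severe_full_idle_hit": first_hit_s >= severe_s,
        "host_startup_fast": max(host_started_ms) < host_fast_ms,
        "warm_path_healthy": max(warm_client_ms) <= warm_ok_ms,
    }
    verdict = "supported" if all(checks.values()) else "not supported"
    return verdict, checks


# Reference FC1 evidence from this lab.
verdict, checks = evaluate_hypothesis(30.485, [363, 453], [67, 99])
print(verdict)
for name, passed in checks.items():
    print(f"  {name}: {'pass' if passed else 'fail'}")
```

Returning the individual checks alongside the verdict shows which class of evidence failed, which in turn points at the alternative cause to investigate.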

Repeated-Run Worksheet

Use this worksheet to record at least three cycles and reduce one-off noise. Run 1 contains the reference FC1 data; fill in Run 2 and Run 3 with your own measurements.

| Cycle | Idle duration (min) | First hit (s) | Restart hit (s) | Warm mean (s) | Host started ms values | Scale-event ms band | Verdict |
|---|---:|---:|---:|---:|---|---|---|
| Run 1 | 13 | 30.485 | 3.156 | 0.077 | 363, 453 | 1719-1842 | Provisioning-dominant |
| Run 2 | 13 | | | | | | |
| Run 3 | 13 | | | | | | |

Decision guidance after repeated runs:

1. If full-idle first-hit remains high while host-start stays sub-second, keep provisioning-dominant classification.
2. If host-start grows into multi-second range, inspect runtime startup path and extension initialization.
3. If warm mean drifts upward across cycles, investigate sustained regression causes before final attribution.
4. If scale-event band is frequent and overlaps traffic spikes, review always-ready and plan sizing strategy.

Operational handoff checklist:

| Item | Completed |
|---|---|
| Baseline, idle-cold, restart-cold, warm-post datasets captured | [ ] |
| Query 1, 3, 7, 8 exports attached to incident record | [ ] |
| FC1 vs Y1 vs EP mitigation recommendation documented | [ ] |
| Related playbook cross-reference included in incident summary | [ ] |
| PII check completed on exported artifacts | [ ] |
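
Decision-guidance items 1 to 3 can be applied mechanically once the worksheet is filled in. The sketch below is one possible encoding, with loud assumptions: `Run`, `review_runs`, `high_first_hit_s`, and `drift_factor` are hypothetical names and thresholds, any host start of one second or more is treated as no longer sub-second, and the Run 2 and Run 3 rows are placeholders rather than measurements. Item 4 needs traffic data and is left out.

```python
from dataclasses import dataclass


@dataclass
class Run:
    first_hit_s: float
    warm_mean_s: float
    host_started_ms: list[float]


def review_runs(runs, high_first_hit_s=10.0, drift_factor=1.5):
    """Map repeated-run results onto decision-guidance items 1-3.

    high_first_hit_s and drift_factor are illustrative assumptions.
    """
    notes = []
    host_sub_second = all(max(r.host_started_ms) < 1000 for r in runs)
    first_hit_high = all(r.first_hit_s >= high_first_hit_s for r in runs)
    if first_hit_high and host_sub_second:
        notes.append("keep provisioning-dominant classification")
    if not host_sub_second:
        notes.append("inspect runtime startup path and extension initialization")
    if runs[-1].warm_mean_s > drift_factor * runs[0].warm_mean_s:
        notes.append("investigate sustained regression before final attribution")
    return notes


runs = [
    Run(30.485, 0.077, [363, 453]),  # Run 1: reference FC1 data
    Run(28.0, 0.080, [410]),         # Run 2: placeholder, not measured
    Run(31.0, 0.075, [390]),         # Run 3: placeholder, not measured
]
print(review_runs(runs))
```

Comparing only the first and last warm means is a deliberately crude drift test; with more than three cycles, a slope or rolling median would be less sensitive to a single noisy run.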

## Clean Up

az group delete \
  --name "$RG" \
  --yes \
  --no-wait
Optional local cleanup:
rm -rf "labs/cold-start/artifacts"

## See Also

## Sources