Hosting Plans¶

Choosing a hosting plan is the most important Azure Functions platform decision. It controls scale behavior, cold start profile, networking options, limits, and baseline cost.

Prerequisites¶

Before selecting a plan, align on these inputs with your application and platform teams:

A latency objective (for example p95 HTTP response target) and whether cold starts are acceptable.
A traffic profile (steady, bursty, batch, or unpredictable) with expected baseline and peak concurrency.
Networking and security requirements, including VNet integration, private endpoints, and egress controls.
Runtime and operating-system constraints (Linux/Windows support and language stack requirements).
Execution characteristics such as average duration, maximum duration, and trigger model.
Cost model preference (pure consumption, warm baseline, or fixed-capacity reservation).

Validation first

Hosting choice is not just a pricing decision. Validate scale, networking, and timeout behavior in a staging environment before production rollout.

Main Content¶

Hosting plans at a glance¶

Azure Functions supports four main hosting models:

Plan	Best for	Scale profile	Cold start profile
Consumption	Bursty, lower-cost serverless workloads	0 to platform limit	Present after idle
Flex Consumption	New serverless apps needing VNet + high scale	0 to high scale ceiling, per-function scaling	Reduced with always-ready
Premium	Low-latency, enterprise integration	Elastic with warm baseline	Eliminated for warm baseline
Dedicated (App Service Plan)	Predictable, fixed-capacity workloads	Manual/autoscale by plan rules	None (always on)

Decision rule

Start with Flex Consumption for most new serverless workloads, then choose Premium when you need permanently warm behavior and advanced App Service premium features.

flowchart LR
    A[Workload Requirement] --> B{Scale to zero required?}
    B -->|Yes| C{Private networking required?}
    B -->|No| D{Existing App Service estate?}
    C -->|No| E[Consumption or Flex]
    C -->|Yes| F[Flex Consumption]
    D -->|Yes| G[Dedicated]
    D -->|No| H{Low-latency warm baseline needed?}
    H -->|Yes| I[Premium]
    H -->|No| J[Dedicated or Premium]

Consumption plan (classic)¶

Consumption is the original serverless model.

Key characteristics¶

Scale to zero when idle.
Billed per execution and execution resources.
No VNet integration.
Typical default function timeout: 5 minutes, maximum 10 minutes.

Design trade-offs¶

Use Consumption when:

traffic is sporadic,
private networking is not required,
and occasional cold starts are acceptable.

Avoid Consumption when you require private-only backend access or strict latency SLOs.

Flex Consumption plan¶

Flex Consumption combines serverless economics with modern networking and scaling controls.

Key characteristics¶

Scale to zero when idle.
Per-function (or function-group) scaling model.
Supports VNet integration and private endpoints.
One Function App per plan.
Linux-based runtime model.

Critical platform facts¶

No Kudu/SCM site is available.
Identity-based storage is required for host storage configuration.
Blob trigger must use Event Grid source on Flex; standard polling blob trigger is not supported.
Default function timeout is 30 minutes; maximum is unbounded.

Flex memory profiles¶

At creation time, you choose an instance memory size profile (for example 512 MB, 2048 MB, 4096 MB). This choice affects throughput density and cost.

Always-ready behavior¶

Flex supports always-ready instances per function group to reduce startup latency while retaining scale-to-zero for non-baseline traffic.

Premium plan¶

Premium (Elastic Premium) is designed for consistently low latency and enterprise workloads.

Key characteristics¶

Instances are permanently warm at configured minimum count.
Elastic scale beyond baseline.
Supports VNet integration, private endpoints, and advanced App Service capabilities.
No practical execution timeout ceiling for most long-running patterns (host settings still apply).

Premium is a fit when¶

cold start must be eliminated,
private networking is mandatory,
long-running executions are expected,
and you need features like always-warm baseline instances.

Dedicated plan (App Service Plan)¶

Dedicated runs Functions on pre-provisioned App Service compute.

Key characteristics¶

Fixed VM allocation (you pay regardless of invocation volume).
Full App Service plan capabilities.
Manual/autoscale rules at App Service layer.
Suitable for shared hosting with existing web workloads.

Dedicated is a fit when¶

you already operate App Service capacity,
predictable fixed spend is preferred,
and always-on behavior is required.

Capability matrix¶

Capability	Consumption	Flex Consumption	Premium	Dedicated
Scale to zero	Yes	Yes	No	No
VNet integration	No	Yes	Yes	Yes
Inbound private endpoint	No	Yes	Yes	Yes
Deployment slots	Yes (2 including production)	No	Yes	Yes
Kudu/SCM	Yes	No	Yes	Yes
Apps per plan	Multiple	One	Multiple	Multiple

Timeout reference (design-time)¶

Plan	Default timeout	Maximum timeout
Consumption (classic)	5 minutes	10 minutes
Flex Consumption	30 minutes	Unbounded
Premium	30 minutes (common default)	Unbounded
Dedicated	30 minutes (common default)	Unbounded

Note

HTTP responses still have platform front-end constraints independent of background function timeout. Design long-running HTTP operations using async patterns.

Cost comparison (relative)¶

The table below is conceptual and should be validated with the Azure Pricing Calculator for your region and workload.

Plan	Idle cost profile	Burst cost profile	Steady-state cost profile	Cost predictability
Consumption	Very low	Can increase rapidly with high execution volume	Can become less efficient than warm plans at high sustained traffic	Low to medium
Flex Consumption	Low with optional always-ready baseline	Efficient burst handling with per-function scaling	Moderate; depends on memory profile and always-ready count	Medium
Premium	Baseline cost always present	Incremental during burst above warm baseline	Often efficient for high sustained workloads needing low latency	Medium to high
Dedicated	Fixed regardless of invocation count	Usually unchanged unless autoscale adds workers	Predictable if workload is stable	High

CLI examples (plan creation/selection)¶

Use long-form flags only.

# Consumption Function App
az functionapp create \
  --resource-group "$RG" \
  --name "$APP_NAME" \
  --storage-account "$STORAGE_NAME" \
  --consumption-plan-location "$LOCATION" \
  --runtime node \
  --functions-version 4

# Flex Consumption Function App
az functionapp create \
  --resource-group "$RG" \
  --name "$APP_NAME" \
  --storage-account "$STORAGE_NAME" \
  --flexconsumption-location "$LOCATION" \
  --runtime python \
  --runtime-version 3.12

# Premium plan + Function App
az functionapp plan create \
  --resource-group "$RG" \
  --name "$PLAN_NAME" \
  --location "$LOCATION" \
  --sku EP1 \
  --is-linux

az functionapp create \
  --resource-group "$RG" \
  --name "$APP_NAME" \
  --plan "$PLAN_NAME" \
  --storage-account "$STORAGE_NAME" \
  --runtime dotnet-isolated \
  --functions-version 4

CLI inspection examples with output¶

# Show plan details
az functionapp plan show \
  --resource-group "$RG" \
  --name "$PLAN_NAME" \
  --output json

Example output (PII masked):

{
  "id": "/subscriptions/<subscription-id>/resourceGroups/rg-functions-demo/providers/Microsoft.Web/serverfarms/plan-demo-functions",
  "name": "plan-demo-functions",
  "resourceGroup": "rg-functions-demo",
  "location": "koreacentral",
  "kind": "linux",
  "sku": {
    "name": "EP1",
    "tier": "ElasticPremium",
    "size": "EP1",
    "capacity": 1
  },
  "reserved": true,
  "numberOfWorkers": 1,
  "status": "Ready"
}

# List Function Apps in a resource group
az functionapp list \
  --resource-group "$RG" \
  --output table

Example output (PII masked):

Name                    Location      State    ResourceGroup
----------------------  ------------  -------  -------------------
func-demo-api           koreacentral  Running  rg-functions-demo
func-demo-jobs          koreacentral  Running  rg-functions-demo
func-demo-ingest        koreacentral  Stopped  rg-functions-demo

Hosting decision workflow¶

Decide if scale-to-zero is required.
Decide if private networking is required.
Decide if cold-start elimination is required.
Decide if one-app-per-plan (Flex) is acceptable.
Validate timeout requirements against plan limits.

If you need both serverless cost behavior and private networking, Flex is usually the first candidate.

Workflow flowchart¶

flowchart TD
    A[Start: Define workload requirements] --> B{Need scale-to-zero?}
    B -->|No| C{Need strict low latency?}
    B -->|Yes| D{"Need VNet/private endpoints?"}
    D -->|No| E[Choose Consumption]
    D -->|Yes| F{Accept one app per plan?}
    F -->|Yes| G[Choose Flex Consumption]
    F -->|No| H[Choose Premium]
    C -->|Yes| H
    C -->|No| I{Have existing App Service capacity?}
    I -->|Yes| J[Choose Dedicated]
    I -->|No| K[Choose Premium or Dedicated by cost model]

Troubleshooting matrix¶

Symptom	Likely Cause	Validation Path
Cold start exceeding SLA	Consumption plan idle deallocation	Review invocation gap and first-request latency metrics; consider Flex always-ready or Premium warm instances
VNet-dependent calls fail on startup	Plan does not support required private networking path	Verify plan capability matrix, check networking configuration, then move to Flex/Premium/Dedicated as needed
Blob-trigger function not firing on Flex	Blob trigger source not configured for Event Grid	Validate trigger configuration and storage event subscription wiring
Deployment diagnostics missing on Flex	Expectation of Kudu/SCM availability	Confirm Flex platform constraints; use deployment logs and Application Insights telemetry
Unbounded execution assumed for HTTP request	Confusion between function timeout and HTTP front-end limits	Validate HTTP behavior with async pattern and durable orchestration approach
Premium costs unexpectedly high	Warm instance baseline set above actual need	Inspect minimum instance settings versus real traffic baseline and reduce warm capacity where safe

Anti-patterns¶

Choosing Consumption, then requiring private-only data paths later.
Choosing Dedicated for highly sporadic workloads.
Ignoring Flex one-app-per-plan constraint in multi-app tenancy designs.
Assuming Kudu is available on Flex.

Language Guide

For Python-specific deployment nuances on each plan, see Python Runtime.

Advanced Topics¶

Plan migration strategies¶

Use migration as a controlled rollout, not a same-day in-place switch:

Baseline current latency, error rate, and concurrency under representative load.
Provision target plan in parallel with matching app settings and identities.
Deploy the same artifact, then run synthetic and replay tests.
Shift traffic gradually (for example by route or percentage) and observe key indicators.
Keep rollback path ready until stability is validated over at least one peak cycle.

Flex memory profile selection guide¶

Use these selection heuristics:

Start with 2048 MB for mixed trigger production workloads unless profiling shows a lower or higher requirement.
Use 512 MB for lightweight handlers with small dependency footprints.
Use 4096 MB for CPU-intensive transforms, larger runtime dependencies, or memory-heavy bindings.
Re-evaluate profile choice after observing p95 duration, worker restarts, and memory pressure trends.

Multi-app vs single-app design in Premium¶

Premium supports multiple Function Apps per plan, which enables density but introduces tenancy choices:

Multi-app-per-plan is efficient when apps have aligned scaling behavior and shared ownership.
Single-app-per-plan improves isolation for compliance, release independence, and noisy-neighbor control.
Separate plans are recommended when one workload is highly bursty and another is latency critical.

Cost estimation formulas (conceptual)¶

Use conceptual formulas to compare plan behavior before detailed calculator modeling:

Consumption monthly cost ~= execution_count * execution_unit_price + execution_gb_seconds * gb_second_price
Flex monthly cost ~= variable_execution_cost + always_ready_instance_hours * instance_hour_price
Premium monthly cost ~= warm_instance_count * hours_per_month * instance_hour_price + burst_scale_cost
Dedicated monthly cost ~= worker_count * hours_per_month * worker_hour_price

Hosting Plans¶

Prerequisites¶

Main Content¶

Hosting plans at a glance¶

Consumption plan (classic)¶

Key characteristics¶

Design trade-offs¶

Flex Consumption plan¶

Key characteristics¶

Critical platform facts¶

Flex memory profiles¶

Always-ready behavior¶

Premium plan¶

Key characteristics¶

Premium is a fit when¶

Dedicated plan (App Service Plan)¶

Key characteristics¶

Dedicated is a fit when¶

Capability matrix¶

Timeout reference (design-time)¶

Cost comparison (relative)¶

CLI examples (plan creation/selection)¶

CLI inspection examples with output¶

Hosting decision workflow¶

Workflow flowchart¶

Troubleshooting matrix¶

Anti-patterns¶

Advanced Topics¶

Plan migration strategies¶

Flex memory profile selection guide¶

Multi-app vs single-app design in Premium¶

Cost estimation formulas (conceptual)¶

Language-Specific Details¶

See Also¶

Sources¶