Hosting Plans¶
Choosing a hosting plan is the most important Azure Functions platform decision. It controls scale behavior, cold start profile, networking options, limits, and baseline cost.
Prerequisites¶
Before selecting a plan, align on these inputs with your application and platform teams:
- A latency objective (for example p95 HTTP response target) and whether cold starts are acceptable.
- A traffic profile (steady, bursty, batch, or unpredictable) with expected baseline and peak concurrency.
- Networking and security requirements, including VNet integration, private endpoints, and egress controls.
- Runtime and operating-system constraints (Linux/Windows support and language stack requirements).
- Execution characteristics such as average duration, maximum duration, and trigger model.
- Cost model preference (pure consumption, warm baseline, or fixed-capacity reservation).
Validation first
Hosting choice is not just a pricing decision. Validate scale, networking, and timeout behavior in a staging environment before production rollout.
Main Content¶
Hosting plans at a glance¶
Azure Functions supports four main hosting models:
| Plan | Best for | Scale profile | Cold start profile |
|---|---|---|---|
| Consumption | Bursty, lower-cost serverless workloads | 0 to platform limit | Present after idle |
| Flex Consumption | New serverless apps needing VNet + high scale | 0 to high scale ceiling, per-function scaling | Reduced with always-ready |
| Premium | Low-latency, enterprise integration | Elastic with warm baseline | Eliminated for warm baseline |
| Dedicated (App Service Plan) | Predictable, fixed-capacity workloads | Manual/autoscale by plan rules | None (always on) |
Decision rule
Start with Flex Consumption for most new serverless workloads, then choose Premium when you need permanently warm behavior and advanced App Service premium features.
flowchart LR
A[Workload Requirement] --> B{Scale to zero required?}
B -->|Yes| C{Private networking required?}
B -->|No| D{Existing App Service estate?}
C -->|No| E[Consumption or Flex]
C -->|Yes| F[Flex Consumption]
D -->|Yes| G[Dedicated]
D -->|No| H{Low-latency warm baseline needed?}
H -->|Yes| I[Premium]
H -->|No| J[Dedicated or Premium] Consumption plan (classic)¶
Consumption is the original serverless model.
Key characteristics¶
- Scale to zero when idle.
- Billed per execution and execution resources.
- No VNet integration.
- Typical default function timeout: 5 minutes, maximum 10 minutes.
Design trade-offs¶
Use Consumption when:
- traffic is sporadic,
- private networking is not required,
- and occasional cold starts are acceptable.
Avoid Consumption when you require private-only backend access or strict latency SLOs.
Flex Consumption plan¶
Flex Consumption combines serverless economics with modern networking and scaling controls.
Key characteristics¶
- Scale to zero when idle.
- Per-function (or function-group) scaling model.
- Supports VNet integration and private endpoints.
- One Function App per plan.
- Linux-based runtime model.
Critical platform facts¶
- No Kudu/SCM site is available.
- Identity-based storage is required for host storage configuration.
- Blob trigger must use Event Grid source on Flex; standard polling blob trigger is not supported.
- Default function timeout is 30 minutes; maximum is unbounded.
Flex memory profiles¶
At creation time, you choose an instance memory size profile (for example 512 MB, 2048 MB, 4096 MB). This choice affects throughput density and cost.
Always-ready behavior¶
Flex supports always-ready instances per function group to reduce startup latency while retaining scale-to-zero for non-baseline traffic.
Premium plan¶
Premium (Elastic Premium) is designed for consistently low latency and enterprise workloads.
Key characteristics¶
- Instances are permanently warm at configured minimum count.
- Elastic scale beyond baseline.
- Supports VNet integration, private endpoints, and advanced App Service capabilities.
- No practical execution timeout ceiling for most long-running patterns (host settings still apply).
Premium is a fit when¶
- cold start must be eliminated,
- private networking is mandatory,
- long-running executions are expected,
- and you need features like always-warm baseline instances.
Dedicated plan (App Service Plan)¶
Dedicated runs Functions on pre-provisioned App Service compute.
Key characteristics¶
- Fixed VM allocation (you pay regardless of invocation volume).
- Full App Service plan capabilities.
- Manual/autoscale rules at App Service layer.
- Suitable for shared hosting with existing web workloads.
Dedicated is a fit when¶
- you already operate App Service capacity,
- predictable fixed spend is preferred,
- and always-on behavior is required.
Capability matrix¶
| Capability | Consumption | Flex Consumption | Premium | Dedicated |
|---|---|---|---|---|
| Scale to zero | Yes | Yes | No | No |
| VNet integration | No | Yes | Yes | Yes |
| Inbound private endpoint | No | Yes | Yes | Yes |
| Deployment slots | Yes (2 including production) | No | Yes | Yes |
| Kudu/SCM | Yes | No | Yes | Yes |
| Apps per plan | Multiple | One | Multiple | Multiple |
Timeout reference (design-time)¶
| Plan | Default timeout | Maximum timeout |
|---|---|---|
| Consumption (classic) | 5 minutes | 10 minutes |
| Flex Consumption | 30 minutes | Unbounded |
| Premium | 30 minutes (common default) | Unbounded |
| Dedicated | 30 minutes (common default) | Unbounded |
Note
HTTP responses still have platform front-end constraints independent of background function timeout. Design long-running HTTP operations using async patterns.
Cost comparison (relative)¶
The table below is conceptual and should be validated with the Azure Pricing Calculator for your region and workload.
| Plan | Idle cost profile | Burst cost profile | Steady-state cost profile | Cost predictability |
|---|---|---|---|---|
| Consumption | Very low | Can increase rapidly with high execution volume | Can become less efficient than warm plans at high sustained traffic | Low to medium |
| Flex Consumption | Low with optional always-ready baseline | Efficient burst handling with per-function scaling | Moderate; depends on memory profile and always-ready count | Medium |
| Premium | Baseline cost always present | Incremental during burst above warm baseline | Often efficient for high sustained workloads needing low latency | Medium to high |
| Dedicated | Fixed regardless of invocation count | Usually unchanged unless autoscale adds workers | Predictable if workload is stable | High |
CLI examples (plan creation/selection)¶
Use long-form flags only.
# Consumption Function App
az functionapp create \
--resource-group "$RG" \
--name "$APP_NAME" \
--storage-account "$STORAGE_NAME" \
--consumption-plan-location "$LOCATION" \
--runtime node \
--functions-version 4
# Flex Consumption Function App
az functionapp create \
--resource-group "$RG" \
--name "$APP_NAME" \
--storage-account "$STORAGE_NAME" \
--flexconsumption-location "$LOCATION" \
--runtime python \
--runtime-version 3.12
# Premium plan + Function App
az functionapp plan create \
--resource-group "$RG" \
--name "$PLAN_NAME" \
--location "$LOCATION" \
--sku EP1 \
--is-linux
az functionapp create \
--resource-group "$RG" \
--name "$APP_NAME" \
--plan "$PLAN_NAME" \
--storage-account "$STORAGE_NAME" \
--runtime dotnet-isolated \
--functions-version 4
CLI inspection examples with output¶
# Show plan details
az functionapp plan show \
--resource-group "$RG" \
--name "$PLAN_NAME" \
--output json
Example output (PII masked):
{
"id": "/subscriptions/<subscription-id>/resourceGroups/rg-functions-demo/providers/Microsoft.Web/serverfarms/plan-demo-functions",
"name": "plan-demo-functions",
"resourceGroup": "rg-functions-demo",
"location": "koreacentral",
"kind": "linux",
"sku": {
"name": "EP1",
"tier": "ElasticPremium",
"size": "EP1",
"capacity": 1
},
"reserved": true,
"numberOfWorkers": 1,
"status": "Ready"
}
# List Function Apps in a resource group
az functionapp list \
--resource-group "$RG" \
--output table
Example output (PII masked):
Name Location State ResourceGroup
---------------------- ------------ ------- -------------------
func-demo-api koreacentral Running rg-functions-demo
func-demo-jobs koreacentral Running rg-functions-demo
func-demo-ingest koreacentral Stopped rg-functions-demo
Hosting decision workflow¶
- Decide if scale-to-zero is required.
- Decide if private networking is required.
- Decide if cold-start elimination is required.
- Decide if one-app-per-plan (Flex) is acceptable.
- Validate timeout requirements against plan limits.
If you need both serverless cost behavior and private networking, Flex is usually the first candidate.
Workflow flowchart¶
flowchart TD
A[Start: Define workload requirements] --> B{Need scale-to-zero?}
B -->|No| C{Need strict low latency?}
B -->|Yes| D{"Need VNet/private endpoints?"}
D -->|No| E[Choose Consumption]
D -->|Yes| F{Accept one app per plan?}
F -->|Yes| G[Choose Flex Consumption]
F -->|No| H[Choose Premium]
C -->|Yes| H
C -->|No| I{Have existing App Service capacity?}
I -->|Yes| J[Choose Dedicated]
I -->|No| K[Choose Premium or Dedicated by cost model] Troubleshooting matrix¶
| Symptom | Likely Cause | Validation Path |
|---|---|---|
| Cold start exceeding SLA | Consumption plan idle deallocation | Review invocation gap and first-request latency metrics; consider Flex always-ready or Premium warm instances |
| VNet-dependent calls fail on startup | Plan does not support required private networking path | Verify plan capability matrix, check networking configuration, then move to Flex/Premium/Dedicated as needed |
| Blob-trigger function not firing on Flex | Blob trigger source not configured for Event Grid | Validate trigger configuration and storage event subscription wiring |
| Deployment diagnostics missing on Flex | Expectation of Kudu/SCM availability | Confirm Flex platform constraints; use deployment logs and Application Insights telemetry |
| Unbounded execution assumed for HTTP request | Confusion between function timeout and HTTP front-end limits | Validate HTTP behavior with async pattern and durable orchestration approach |
| Premium costs unexpectedly high | Warm instance baseline set above actual need | Inspect minimum instance settings versus real traffic baseline and reduce warm capacity where safe |
Anti-patterns¶
- Choosing Consumption, then requiring private-only data paths later.
- Choosing Dedicated for highly sporadic workloads.
- Ignoring Flex one-app-per-plan constraint in multi-app tenancy designs.
- Assuming Kudu is available on Flex.
Language Guide
For Python-specific deployment nuances on each plan, see Python Runtime.
Advanced Topics¶
Plan migration strategies¶
Use migration as a controlled rollout, not a same-day in-place switch:
- Baseline current latency, error rate, and concurrency under representative load.
- Provision target plan in parallel with matching app settings and identities.
- Deploy the same artifact, then run synthetic and replay tests.
- Shift traffic gradually (for example by route or percentage) and observe key indicators.
- Keep rollback path ready until stability is validated over at least one peak cycle.
Flex memory profile selection guide¶
Use these selection heuristics:
- Start with 2048 MB for mixed trigger production workloads unless profiling shows a lower or higher requirement.
- Use 512 MB for lightweight handlers with small dependency footprints.
- Use 4096 MB for CPU-intensive transforms, larger runtime dependencies, or memory-heavy bindings.
- Re-evaluate profile choice after observing p95 duration, worker restarts, and memory pressure trends.
Multi-app vs single-app design in Premium¶
Premium supports multiple Function Apps per plan, which enables density but introduces tenancy choices:
- Multi-app-per-plan is efficient when apps have aligned scaling behavior and shared ownership.
- Single-app-per-plan improves isolation for compliance, release independence, and noisy-neighbor control.
- Separate plans are recommended when one workload is highly bursty and another is latency critical.
Cost estimation formulas (conceptual)¶
Use conceptual formulas to compare plan behavior before detailed calculator modeling:
- Consumption monthly cost ~= execution_count * execution_unit_price + execution_gb_seconds * gb_second_price
- Flex monthly cost ~= variable_execution_cost + always_ready_instance_hours * instance_hour_price
- Premium monthly cost ~= warm_instance_count * hours_per_month * instance_hour_price + burst_scale_cost
- Dedicated monthly cost ~= worker_count * hours_per_month * worker_hour_price
Language-Specific Details¶
- Python hosting and runtime guidance
- Python platform limits and sizing considerations
- Node.js language guide landing page
- Java language guide landing page
- .NET language guide landing page
See Also¶
- Architecture
- Scaling
- Networking
- Security
- Deployment Scenarios — Cross-plan comparison of VNet, PE, identity, and deployment patterns