Dedicated GPU Profiles¶

Dedicated GPU profiles are the Azure Container Apps option for GPU-backed workloads that need reserved hardware instead of serverless GPU allocation. Microsoft Learn does not label these dedicated GPU profiles as preview, but it does document important region and capacity limits.

Main Content¶

Capacity and region limits apply

Microsoft Learn states that GPU-enabled Dedicated profiles are available only in select regions and that capacity is allocated on a per-case basis. Plan capacity requests early and validate regional availability before you commit to a design.

Current GPU profile names¶

GPU model	Current Dedicated profile names	Allocation
A100	`NC24-A100`, `NC48-A100`, `NC96-A100`	Per node

Microsoft Learn also lists serverless Consumption GPU profile names separately:

Consumption-GPU-NC24-A100
Consumption-GPU-NC8as-T4

That means T4 currently appears in Consumption GPU profiles, not in the Dedicated GPU list reviewed for this guide.

Placement model¶

flowchart TD
    ENV[Workload profiles environment] --> GP[Dedicated GPU profile]
    GP --> APP1[Inference API]
    GP --> APP2[Batch scoring worker]
    APP1 --> N1[GPU-backed nodes]
    APP2 --> N1
    N1 --> B[Dedicated plan instance billing]

When Dedicated GPU is the better fit¶

Choose Dedicated GPU profiles when you need:

Stable GPU-backed inference capacity.
Multiple apps sharing the same dedicated GPU node pool.
A reserved environment design instead of per-replica serverless GPU billing.

Typical examples include:

Real-time model inference APIs.
Batch image or document processing.
GPU-heavy background workers with predictable utilization.

Cost considerations¶

Dedicated GPU placement changes the cost model:

Billing follows workload profile instances, not individual apps.
Extra instances add cost as profiles scale out.
Dedicated plan management charges apply when you use dedicated workload profiles.

Compare serverless GPU and dedicated GPU deliberately

If your workload is sporadic, Consumption GPU may fit better. If you need reserved GPU capacity or multiple apps sharing the same GPU pool, Dedicated GPU is usually the cleaner design.

Dedicated GPU Profiles¶

Main Content¶

Current GPU profile names¶

Placement model¶

When Dedicated GPU is the better fit¶

Cost considerations¶

See Also¶

Sources¶