Best Practices¶
This section is the design judgment layer of the guide. Read it after Platform and before implementation so architectural choices are intentional rather than reactive.
Main Content¶
Why this section exists¶
Use this section as the bridge between platform knowledge and implementation decisions:
- Platform explains how Container Apps behaves.
- Best Practices helps you decide what to do with that behavior.
- Language Guides show how to implement those decisions.
flowchart LR
A[Platform Understanding<br/>How Container Apps works] --> B[Best Practices<br/>Design judgment and trade-offs]
B --> C[Implementation<br/>Language Guides and recipes]
C --> D[Operations<br/>Run, monitor, and improve]
Document map¶
| Topic | Purpose | Primary Outcome |
|---|---|---|
| Container Design | Establish image, probe, startup, and logging standards | Consistent container builds across teams |
| Revision Strategy | Choose revision mode, traffic split, and rollback approach | Lower deployment risk and faster recovery |
| Scaling | Match scale rules to workload characteristics | Better performance stability and cost efficiency |
| Networking | Design secure inbound/outbound connectivity | Controlled traffic paths and reliable DNS behavior |
| Identity and Secrets | Apply managed identity and secret management patterns | Reduced attack surface and stronger secret hygiene |
| Reliability | Build resilience for transient failures and outages | Improved availability and clearer failure handling |
| Cost Optimization | Right-size profiles, registries, and log retention | Lower operational cost without sacrificing reliability |
| Jobs | Design job triggers, retry, and execution patterns | Reliable batch and event-driven processing |
| Anti-Patterns | Identify common Container Apps design mistakes early | Fewer avoidable incidents and refactoring cycles |
How to use this section¶
- Read Platform first to align on environment, revision, scale, and networking behavior.
- Establish a baseline from Container Design, Revision Strategy, and Scaling.
- Add boundary controls through Networking and Identity and Secrets.
- Harden for production with Reliability, Cost Optimization, and Jobs.
- Run Anti-Patterns as a final pre-production and post-incident review checklist.
Design judgment over checklist thinking
This section is more than a static checklist. Use it to reason about trade-offs in your own context: traffic shape, compliance, team maturity, dependency profile, and cost envelope.
Recommended reading paths¶
Path A: New production rollout¶
- Container Design
- Revision Strategy
- Scaling
- Networking
- Identity and Secrets
- Reliability
- Cost Optimization
- Jobs
- Anti-Patterns
Path B: Existing app hardening¶
- Anti-Patterns
- Identity and Secrets
- Networking
- Reliability
- Revision Strategy
- Scaling
- Container Design
- Cost Optimization
- Jobs
Path C: Performance and cost tuning¶
Decision areas covered¶
- Compute baseline selection across Consumption and workload profiles, including replica and resource boundaries.
- Container standards for image hardening, startup behavior, health probes, graceful shutdown, and structured logs.
- Identity and secret patterns using managed identity, Key Vault references, and rotation workflows.
- Connectivity controls for ingress scope, internal service calls, private endpoints, DNS, and egress constraints.
- Release safety through revision mode, traffic splitting, rollback criteria, and blast-radius control.
- Scale behavior design using KEDA trigger fit, cooldown windows, concurrency shaping, and scale-to-zero boundaries.
- Reliability mechanics for dependency failure handling, transient retries, observability thresholds, and recovery runbooks.
Who should read this¶
| Role | Start With | Then Read |
|---|---|---|
| Application architects | Container Design → Revision Strategy | Scaling, Reliability, Networking |
| Platform engineers | Networking → Identity and Secrets | Revision Strategy, Reliability, Cost Optimization |
| Tech leads | Anti-Patterns → Container Design | Scaling, Cost Optimization, Jobs |
| Operators | Reliability → Scaling | Revision Strategy, Anti-Patterns, Recovery guidance in Operations |
What this section is not¶
- Not a replacement for platform docs.
- Not language-specific implementation detail.
- Not a one-time task; revisit after major workload changes.
Avoid copy-paste architecture
A reference architecture is a starting point, not a universal answer. Validate each decision against your own latency, throughput, compliance, and operational constraints.
Quality gate before implementation¶
- [ ] Compute profile and image strategy are defined, including base image, startup command, and resource sizing.
- [ ] HTTPS ingress model is chosen (external or internal), with exposure scope and custom domain needs documented.
- [ ] Managed identity and secret lifecycle strategy are documented, including Key Vault integration and rotation ownership.
- [ ] Networking design is validated for inbound/outbound dependencies, DNS behavior, and private connectivity.
- [ ] Logging and telemetry baseline is defined, covering structured logs, metric targets, and trace correlation.
- [ ] Deployment and rollback strategy is validated using revision traffic controls and explicit rollback triggers.
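One way to make this gate enforceable is to copy the checklist into each design document and have a script report any item that is still unchecked. The sketch below is a minimal illustration, assuming the checklist uses the `- [ ]` / `- [x]` task-list syntax shown above. The `unchecked_items` function and the `design_doc` excerpt are hypothetical names introduced here, not part of any existing tooling.

```python
import re

# Matches Markdown task-list items such as "- [ ] text" or "- [x] text".
TASK_ITEM = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def unchecked_items(markdown_text: str) -> list[str]:
    """Return the text of every task-list item that is still unchecked."""
    pending = []
    for line in markdown_text.splitlines():
        match = TASK_ITEM.match(line)
        if match and match.group(1) == " ":
            pending.append(match.group(2).strip())
    return pending


# Hypothetical excerpt from a design document that copies the gate above.
design_doc = """\
- [x] Compute profile and image strategy are defined.
- [ ] HTTPS ingress model is chosen (external or internal).
- [x] Managed identity and secret lifecycle strategy are documented.
"""

pending = unchecked_items(design_doc)
for item in pending:
    print(f"Gate item still open: {item}")
```

A CI job could fail when `pending` is non-empty, so implementation work does not start while a gate item is still open.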
Decision record template¶
| Decision | Context | Alternatives considered | Trade-off | Validation | Review date |
|---|---|---|---|---|---|
| Use workload profile with min replicas 2 for API app | API has weekday peaks and strict p95 latency objective | Consumption with scale-to-zero; Consumption with min replicas 1 | Higher baseline cost in exchange for reduced cold-start risk and steadier latency | 14-day p95 latency and cost review from Log Analytics + Metrics | 2026-07-01 |
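If decision records are kept in a repository rather than only in a table, the same columns can be modeled as a small data structure so that overdue reviews are easy to find. The following is a sketch under that assumption. The `DecisionRecord` class and its `is_due` method are hypothetical, and the field values are copied from the example row above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DecisionRecord:
    """One row of the decision record template."""

    decision: str
    context: str
    alternatives: tuple[str, ...]
    trade_off: str
    validation: str
    review_date: date

    def is_due(self, today: date) -> bool:
        """A record is due for review on or after its review date."""
        return today >= self.review_date


record = DecisionRecord(
    decision="Use workload profile with min replicas 2 for API app",
    context="API has weekday peaks and strict p95 latency objective",
    alternatives=(
        "Consumption with scale-to-zero",
        "Consumption with min replicas 1",
    ),
    trade_off="Higher baseline cost for reduced cold-start risk and steadier latency",
    validation="14-day p95 latency and cost review from Log Analytics + Metrics",
    review_date=date(2026, 7, 1),
)

if record.is_due(date.today()):
    print(f"Review due: {record.decision}")
```

Keeping the review date as a real `date` value, rather than free text, is what makes a scheduled "which decisions are stale?" report possible.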
Common outcomes when this layer is skipped¶
- Apps are deployed with default scale behavior that does not match traffic shape, causing unstable latency or unnecessary spend.
- Secrets are injected directly as plain environment variable values, with no Key Vault integration and no owner for rotation.
- Revision mode and rollback procedures are undefined, making incident recovery slow and high risk.
- Ingress and networking choices are made ad hoc, resulting in avoidable connectivity failures and DNS confusion.
- Probe and startup baselines are inconsistent, increasing restart loops and hard-to-diagnose runtime failures.
Keeping guidance current¶
- Revisit decisions after major workload profile changes (traffic growth, new regions, new dependencies).
- Re-evaluate after platform feature updates that affect scaling, networking, identity, or jobs.
- Update decision records after incidents, postmortems, and significant architecture deviations.
- Review quarterly with platform, security, and operations stakeholders to remove stale assumptions.
Advanced Topics¶
- Define environment-specific baseline profiles (dev, pre-prod, prod) with explicit trade-off boundaries.
- Encode standards as policy-as-code and CI checks to prevent drift before deployment.
- Run architecture drift reviews against actual revision and scaling behavior in production telemetry.
- Tie design decisions to SLO indicators so trade-offs remain measurable over time.
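As an illustration of the policy-as-code idea, the sketch below checks an app definition against a few baseline rules before deployment. It assumes the definition is already loaded as a dictionary whose shape loosely follows the Container Apps resource template (`properties.configuration.ingress`, `properties.template.containers`, `properties.template.scale`); verify the field names against the template version you deploy. The `check_baseline` function, the three rules, the replica floor, and the `example.azurecr.io` image name are all hypothetical choices made for this example.

```python
from typing import Any


def check_baseline(app: dict[str, Any], min_replicas_floor: int) -> list[str]:
    """Return a list of human-readable baseline violations (empty if none)."""
    violations = []
    props = app.get("properties", {})
    config = props.get("configuration", {})
    template = props.get("template", {})

    # Ingress exposure should be an explicit decision, not an omitted field.
    ingress = config.get("ingress")
    if ingress is not None and "external" not in ingress:
        violations.append("ingress.external is not set explicitly")

    # Every container should declare at least one health probe.
    for container in template.get("containers", []):
        if not container.get("probes"):
            name = container.get("name", "?")
            violations.append(f"container '{name}' defines no health probes")

    # Minimum replicas should not fall below the agreed floor.
    min_replicas = template.get("scale", {}).get("minReplicas", 0)
    if min_replicas < min_replicas_floor:
        violations.append(
            f"scale.minReplicas is {min_replicas}, "
            f"below the baseline floor of {min_replicas_floor}"
        )
    return violations


# Hypothetical app definition that breaks all three rules.
app = {
    "properties": {
        "configuration": {"ingress": {"targetPort": 8080}},
        "template": {
            "containers": [{"name": "api", "image": "example.azurecr.io/api:1.0"}],
            "scale": {"minReplicas": 0, "maxReplicas": 10},
        },
    }
}

violations = check_baseline(app, min_replicas_floor=2)
for violation in violations:
    print(f"Baseline violation: {violation}")
```

Passing the floor as a parameter lets each environment (dev, pre-prod, prod) apply its own baseline with the same check.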