Maintenance Windows
Maintenance windows align AKS upgrades and disruptive platform changes with business-approved change periods. They reduce surprise, but they do not remove the need for validation.
Prerequisites
- Business change windows are known.
- Upgrade and node image rollout cadence is defined.
- Critical workload disruption tolerance is understood.
When to Use
- Defining production change governance.
- Preparing auto-upgrade and node image update policies.
- Reducing overlap with peak traffic periods.
Procedure
flowchart TD
A[Define business windows] --> B[Map to AKS maintenance policy]
B --> C[Test in non-production]
C --> D[Review after each cycle]

- Choose maintenance windows that avoid peak business traffic.
- Align maintenance policy with upgrade cadence and on-call coverage.
- Test how workloads behave during node drains and rolling updates.
- Review whether the maintenance window is still appropriate after growth or regional expansion.
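
The mapping step above can be sketched with the Azure CLI. This is a minimal illustration, not a recommended schedule: the weekly Saturday 02:00 UTC window and its 4-hour duration are hypothetical values, the `run` and `add_window` helpers are introduced here only for the example, and `RG` and `CLUSTER_NAME` fall back to placeholder names when unset. The command is printed rather than executed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch: map a business change window to an AKS maintenance configuration.
# Placeholder defaults; replace with real values or export RG / CLUSTER_NAME.
RG="${RG:-my-resource-group}"
CLUSTER_NAME="${CLUSTER_NAME:-my-aks-cluster}"

# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: weekly window for cluster auto-upgrades.
# Saturday 02:00 UTC for 4 hours is an assumed example window.
add_window() {
  run az aks maintenanceconfiguration add \
    --resource-group "$RG" \
    --cluster-name "$CLUSTER_NAME" \
    --name aksManagedAutoUpgradeSchedule \
    --schedule-type Weekly \
    --day-of-week Saturday \
    --interval-weeks 1 \
    --start-time 02:00 \
    --duration 4 \
    --utc-offset +00:00
}

add_window
```

Reviewing the printed command before running it with `DRY_RUN=0` fits the "test in non-production" step: apply the same window to a non-production cluster first and observe one full cycle.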
Verification
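# Auto-upgrade channel currently configured on the cluster
az aks show --resource-group "$RG" --name "$CLUSTER_NAME" --query autoUpgradeProfile --output yaml
# PodDisruptionBudgets in all namespaces, then node status and versions
kubectl get pdb -A
kubectl get nodes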
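
The auto-upgrade profile shows the upgrade channel but not the window itself, so it may help to also list the maintenance configurations and to flag PodDisruptionBudgets that currently allow zero disruptions, since those can block node drains. The sketch below assumes the same `RG` and `CLUSTER_NAME` variables (with placeholder defaults); `flag_blocking_pdbs` is a hypothetical helper introduced for this example. Each CLI call is skipped if the tool is not installed.

```shell
#!/usr/bin/env bash
# Sketch: check configured maintenance windows and spot PDBs that block drains.
# Placeholder defaults; replace with real values or export RG / CLUSTER_NAME.
RG="${RG:-my-resource-group}"
CLUSTER_NAME="${CLUSTER_NAME:-my-aks-cluster}"

# Hypothetical helper: read "NAMESPACE NAME ALLOWED" rows (header on line 1)
# and print NAMESPACE/NAME for every PDB whose allowed disruptions is 0.
flag_blocking_pdbs() {
  awk 'NR > 1 && $3 == 0 { print $1 "/" $2 }'
}

# List the maintenance configurations defined on the cluster.
if command -v az >/dev/null 2>&1; then
  az aks maintenanceconfiguration list \
    --resource-group "$RG" \
    --cluster-name "$CLUSTER_NAME" \
    --output table
fi

# Print PDBs that currently allow no voluntary disruptions.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pdb -A \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,ALLOWED:.status.disruptionsAllowed' \
    | flag_blocking_pdbs
fi
```

An empty result from the PDB check means no budget is currently at zero allowed disruptions; it does not by itself show that workloads tolerate a drain.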
Rollback / Troubleshooting
- If maintenance events still cause incidents, review PodDisruptionBudgets (PDBs), readiness probe behavior, and single-replica (singleton) workloads that cannot tolerate a node drain.
- If auto-upgrade timing is too risky, pause or narrow the automation scope and move to a controlled manual process.
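
One way to pause the automation is to set the cluster auto-upgrade channel to `none` and then check available versions before a manual upgrade. The sketch below is an assumption about how a team might do this, not a prescribed rollback: the `run` helper is hypothetical, `RG` and `CLUSTER_NAME` use placeholder defaults, and commands are only printed unless `DRY_RUN=0` is set. Node OS image upgrades are governed by a separate channel and are not changed here.

```shell
#!/usr/bin/env bash
# Sketch: pause cluster auto-upgrade and prepare for a controlled manual upgrade.
# Placeholder defaults; replace with real values or export RG / CLUSTER_NAME.
RG="${RG:-my-resource-group}"
CLUSTER_NAME="${CLUSTER_NAME:-my-aks-cluster}"

# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Disable automatic Kubernetes version upgrades for the cluster.
run az aks update \
  --resource-group "$RG" \
  --name "$CLUSTER_NAME" \
  --auto-upgrade-channel none

# List the versions the cluster can be upgraded to manually.
run az aks get-upgrades \
  --resource-group "$RG" \
  --name "$CLUSTER_NAME" \
  --output table
```

A manual upgrade can then be scheduled inside an approved change period with `az aks upgrade` and an explicitly chosen `--kubernetes-version`.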