Lab 03: Azure Monitor Alerts¶
This lab turns monitoring data into action. You will create an action group, build metric and scheduled query alerts, and add an alert processing rule so your sandbox can model both detection and noise suppression patterns.
Lab Metadata¶
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Estimated Duration | 45-60 minutes |
| Azure Monitor Tier | Detection and response |
| Primary Services | Metric alerts, scheduled query alerts, action groups, alert processing rules |
| Skills Practiced | Alert design, notification routing, rule scoping, suppression |
Prerequisites¶
- Complete Lab 01: Log Analytics Workspace Setup.
- Complete Lab 02: Custom KQL Queries if you want reusable KQL available.
- Azure CLI authenticated with
az login. - Permission to create action groups and alert rules.
- A VM or other resource emitting metrics and logs.
Set variables:
export RG="rg-monitoring-lab01"
export WORKSPACE_NAME="lawmonlab01"
export VM_NAME="vmmonlab01"
export ACTION_GROUP_NAME="ag-monitoring-lab03"
export METRIC_ALERT_NAME="alert-vm-high-cpu-lab03"
export LOG_ALERT_NAME="alert-heartbeat-missing-lab03"
export APR_NAME="apr-maintenance-window-lab03"
export WORKSPACE_ID=$(az monitor log-analytics workspace show \
--resource-group "$RG" \
--workspace-name "$WORKSPACE_NAME" \
--query "id" \
--output tsv)
export VM_ID=$(az vm show \
--resource-group "$RG" \
--name "$VM_NAME" \
--query "id" \
--output tsv)
Architecture Diagram¶
flowchart TD
Metric[VM metric] --> MA[Metric alert]
Log[Heartbeat query] --> LQA[Scheduled query alert]
MA --> AG[Action group]
LQA --> AG
AG --> Email[Email notification]
AG --> Webhook[Webhook or automation]
APR[Alert processing rule] --> AG
APR -. suppresses .-> MA
APR -. suppresses .-> LQA Lab Objectives¶
- Create an action group with a human notification target.
- Build a metric alert for CPU utilization.
- Build a log alert for missing heartbeat data.
- Test the underlying query logic before trusting the alert.
- Add an alert processing rule for maintenance or quiet hours.
Step-by-Step Instructions¶
Step 1: Create an action group¶
Replace user@example.com with a mailbox you control in the lab subscription.
az monitor action-group create \
--name "$ACTION_GROUP_NAME" \
--resource-group "$RG" \
--short-name "monlab" \
--action email "OpsEmail" user@example.com \
--output json
Capture the action group resource ID:
export ACTION_GROUP_ID=$(az monitor action-group show \
--name "$ACTION_GROUP_NAME" \
--resource-group "$RG" \
--query "id" \
--output tsv)
Step 2: Review available VM metrics¶
az monitor metrics list-definitions \
--resource "$VM_ID" \
--query "[?contains(name.value, 'CPU')].{metric:name.value,namespace:resourceId}" \
--output table
This confirms the exact metric name before you create the rule.
Step 3: Create a CPU metric alert¶
az monitor metrics alert create \
--name "$METRIC_ALERT_NAME" \
--resource-group "$RG" \
--scopes "$VM_ID" \
--condition "avg Percentage CPU > 80" \
--window-size "5m" \
--evaluation-frequency "1m" \
--severity 2 \
--description "Trigger when VM CPU average exceeds 80 percent for five minutes." \
--action "$ACTION_GROUP_ID" \
--output json
Read the rule back:
az monitor metrics alert show \
--name "$METRIC_ALERT_NAME" \
--resource-group "$RG" \
--query "{name:name,severity:severity,enabled:enabled,windowSize:windowSize,evaluationFrequency:evaluationFrequency}" \
--output json
Step 4: Validate the heartbeat query before creating a log alert¶
az monitor log-analytics query \
--workspace "$WORKSPACE_ID" \
--analytics-query "Heartbeat | where Computer == '$VM_NAME' | where TimeGenerated > ago(15m) | summarize LastSeen=max(TimeGenerated), Beats=count() by Computer" \
--output table
This baseline matters because a log alert is only as good as the query it evaluates.
Step 5: Create a scheduled query alert¶
az monitor scheduled-query create \
--name "$LOG_ALERT_NAME" \
--resource-group "$RG" \
--scopes "$WORKSPACE_ID" \
--condition "count 'HeartbeatGap' > 0" \
--condition-query "HeartbeatGap=Heartbeat | where Computer == '$VM_NAME' | summarize LastSeen=max(TimeGenerated) by Computer | where LastSeen < ago(10m)" \
--evaluation-frequency "5m" \
--window-size "10m" \
--severity 2 \
--skip-query-validation true \
--action-groups "$ACTION_GROUP_ID" \
--description "Trigger when the lab VM stops sending heartbeat records for more than ten minutes." \
--output json
Review the scheduled query rule:
az monitor scheduled-query show \
--name "$LOG_ALERT_NAME" \
--resource-group "$RG" \
--query "{name:name,severity:severity,enabled:enabled,evaluationFrequency:evaluationFrequency,windowSize:windowSize}" \
--output json
Step 6: Create an alert processing rule for planned maintenance¶
Use a recurrence that fits your own test window. The example below suppresses action groups during a short maintenance period.
az monitor alert-processing-rule create \
--name "$APR_NAME" \
--resource-group "$RG" \
--scopes "$RG" \
--rule-type "RemoveAllActionGroups" \
--schedule "{\"effectiveFrom\":\"2026-04-07T22:00:00\",\"effectiveUntil\":\"2026-04-07T23:00:00\",\"timeZone\":\"UTC\"}" \
--description "Suppress lab notifications during a planned maintenance window." \
--output json
Read the processing rule back:
az monitor alert-processing-rule show \
--name "$APR_NAME" \
--resource-group "$RG" \
--query "{name:name,enabled:enabled,scopes:scopes}" \
--output json
Step 7: Review all alert artifacts together¶
az monitor action-group list \
--resource-group "$RG" \
--query "[].{name:name,enabled:enabled}" \
--output table
az monitor metrics alert list \
--resource-group "$RG" \
--query "[].{name:name,severity:severity,enabled:enabled}" \
--output table
az monitor scheduled-query list \
--resource-group "$RG" \
--query "[].{name:name,severity:severity,enabled:enabled}" \
--output table
Validation Steps¶
Perform these checks before declaring success:
- Confirm the action group exists and includes a receiver.
az monitor action-group show \
--name "$ACTION_GROUP_NAME" \
--resource-group "$RG" \
--query "{name:name,shortName:groupShortName,emailReceivers:emailReceivers}" \
--output json
- Confirm the metric alert is enabled and scoped correctly.
az monitor metrics alert show \
--name "$METRIC_ALERT_NAME" \
--resource-group "$RG" \
--query "{name:name,enabled:enabled,scopes:scopes}" \
--output json
- Confirm the log alert is enabled and references the workspace.
az monitor scheduled-query show \
--name "$LOG_ALERT_NAME" \
--resource-group "$RG" \
--query "{name:name,enabled:enabled,scopes:scopes}" \
--output json
- Confirm the alert processing rule exists.
az monitor alert-processing-rule list \
--resource-group "$RG" \
--query "[].{name:name,enabled:enabled}" \
--output table
Validation succeeds when the action group, metric alert, log alert, and alert processing rule are all present and enabled in the expected scope.
Cleanup Instructions¶
If you are done with alerts, remove them in reverse dependency order:
Keep the workspace and VM if you are proceeding to later labs.