Overview¶

Azure Monitor provides a full stack monitoring solution for your applications and infrastructure. This guide provides a structured approach to implementing observability using native Azure tools.

Core Capabilities¶

Azure Monitor collects and analyzes telemetry from your cloud and on-premises environments. It helps you understand how your applications perform and proactively identifies issues affecting them and the resources they depend on.

graph TD
    A[Sources] --> B[Azure Monitor]
    B --> C[Insights]
    B --> D[Visualize]
    B --> E[Analyze]
    B --> F[Respond]
    B --> G[Integrate]

    subgraph "Data Platform"
        B1[(Metrics)]
        B2[(Logs)]
        B3[(Traces)]
        B4[(Changes)]
    end

Guide Scope¶

This guide covers the implementation and operational aspects of Azure Monitor. It focuses on:

Platform Configuration: Setting up Log Analytics workspaces, Data Collection Rules, and Managed Identities.
Service Monitoring: Specific strategies for AKS, App Service, Functions, and Virtual Machines.
Operational Excellence: Alerting strategies, dashboarding, and cost management.
Advanced Troubleshooting: Using Kusto Query Language (KQL) and specialized playbooks.

Target Audience¶

Platform Engineers: Responsible for the shared monitoring infrastructure and governance.
Developers: Implementing application-level instrumentation and tracing.
SRE/Operations: Managing alerts, responding to incidents, and ensuring system reliability.
Architects: Designing resilient systems with observability in mind.

Structure¶

The guide is organized into logical sections that follow the monitoring lifecycle:

Start Here: Orientation, role-based learning paths, and repository navigation.
Platform: Core Azure Monitor architecture, data platform, workspace design foundations, and security concepts.
Best Practices: Recommended patterns for retention, alerting, access control, tagging, and cost optimization.
Service Guides: Tailored monitoring implementations for App Service, Container Apps, Functions, AKS, and virtual machines.
Operations: Repeatable day-2 runbooks for workspaces, diagnostics, alert rules, dashboards, exports, and cost control.
Troubleshooting: Decision trees, evidence mapping, KQL query packs, and incident playbooks.
Reference: Quick lookup material for CLI commands, KQL syntax, diagnostic tables, and platform limits.

Overview¶

Core Capabilities¶

Guide Scope¶

Target Audience¶

Structure¶

See Also¶

Sources¶