Creating and analyzing metrics and alerts
To analyze alerts and metrics across Azure Monitor, we need to open the Monitor resource in the Azure portal. In the upcoming sections, we will set up metrics and alerts and show you how to analyze them.
Metrics
Metrics describe an aspect of a system at a particular point in time and are expressed as numerical values. They are collected at regular intervals and are identified with a timestamp, a name, a value, and one or more defining labels. They can support near real-time scenarios and are useful for alerting, because alerts can be fired quickly with relatively simple logic.
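To make the shape of a metric concrete, here is a minimal sketch of a single data point with the four parts described above. The class name, field names, and the sample values are illustrative assumptions; this is not how Azure Monitor stores metrics internally.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class MetricSample:
    """One metric data point (hypothetical model, for illustration only)."""
    timestamp: datetime                          # when the value was collected
    name: str                                    # what is being measured
    value: float                                 # the numerical measurement
    labels: dict = field(default_factory=dict)   # defining labels


# Hypothetical sample: CPU usage of a VM called "demo-vm".
sample = MetricSample(
    timestamp=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    name="Percentage CPU",
    value=42.5,
    labels={"vm": "demo-vm"},
)
print(sample.name, sample.value, sample.labels)
```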
Metrics in Azure Monitor are stored in a time-series database that is optimized for analyzing timestamped data. This makes metrics suited for the fast detection of issues. They can help to detect how your service or system is performing, but to get the overall picture, they typically need to be combined with logs to identify the root cause of issues.
You can use metrics for the following scenarios:
- Analyzing: Collected metrics can be analyzed using a chart in metrics explorer. Metrics from various resources can be compared as well.
- Visualizing: You can create an Azure Monitor workbook to combine multiple datasets into an interactive report. Azure Monitor workbooks can combine text, Azure metrics, analytics queries, and parameters into rich interactive reports.
- Alerting: Metric alert rules can be configured to send out notifications to the user. They can also take automatic action when the metric value crosses a threshold.
- Automating: To increase and decrease resources based on metric values that cross a threshold, autoscaling can be used.
- Exporting: Metrics can be streamed to an Event Hub to route them to external systems. Metrics can also be routed to logs in the Log Analytics workspace in order to be analyzed together with the Azure Monitor logs and to store the metric values for more than 93 days.
- Retrieving: Metric values can be retrieved from the command line using PowerShell cmdlets and the CLI, and from custom applications using the Azure Monitor REST API.
- Archiving: Metrics data can be archived in Azure Storage. It can store the performance or health history of your resource for compliance, auditing, or offline reporting purposes.
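As an illustration of the retrieving scenario, the sketch below only builds the request URL for the Azure Monitor REST API metrics endpoint; it does not authenticate or send anything. The subscription ID, resource group, and VM name are placeholders. The api-version value and the query parameter names reflect the public REST reference as best understood here, so verify them against the current documentation before relying on them.

```python
from urllib.parse import quote, urlencode


def build_metrics_url(resource_id, metric_names, timespan,
                      interval="PT1M", aggregation="Average",
                      api_version="2018-01-01"):
    """Build a metrics 'list' URL for one resource (no request is sent)."""
    query = urlencode(
        {
            "api-version": api_version,
            "metricnames": ",".join(metric_names),
            "timespan": timespan,        # ISO 8601 start/end
            "interval": interval,        # ISO 8601 duration (time grain)
            "aggregation": aggregation,
        },
        quote_via=quote,                 # encode spaces as %20 rather than +
    )
    return (
        "https://management.azure.com"
        f"{resource_id}/providers/Microsoft.Insights/metrics?{query}"
    )


# Placeholder resource ID; replace with the ID of your own VM.
resource_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/demo-rg/providers/Microsoft.Compute"
    "/virtualMachines/demo-vm"
)
url = build_metrics_url(
    resource_id,
    metric_names=["Percentage CPU"],
    timespan="2024-01-01T00:00:00Z/2024-01-01T01:00:00Z",
)
print(url)
```

A real call would also need an Authorization header carrying a bearer token for the Azure Resource Manager endpoint.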
There are four main sources of metrics that are collected by Azure Monitor. Once they are collected and stored in the Azure Monitor metrics database, they can be evaluated together regardless of their source:
- Platform metrics: These metrics give you visibility of the health and performance of your Azure resources. Without any configuration required, a distinct set of metrics is created for each type of Azure resource. They are collected at one-minute intervals unless the metric's definition specifies otherwise.
- Guest OS metrics: These metrics are collected from the guest OS of a virtual machine. To enable guest OS metrics for Windows machines, the Windows Diagnostic Extension (WAD) agent needs to be installed. For Linux machines, the InfluxData Telegraf agent needs to be installed.
- Application metrics: These metrics are created by Application Insights. They can help to detect performance issues for your custom applications and track trends in how the application is being used.
- Custom metrics: These are metrics that you define manually. You can define them in your custom applications that are monitored by Application Insights or you can define custom metrics for an Azure service using the custom metrics API.
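For the custom metrics source, the following sketch builds the kind of JSON body that the custom metrics API accepts, with values pre-aggregated into min, max, sum, and count. The field names follow the documented payload shape as best understood here and should be checked against the current API reference. The metric name, namespace, and dimension are hypothetical, and the sketch does not send the request.

```python
import json
from datetime import datetime, timezone


def build_custom_metric_body(metric, namespace, dimensions, values, when):
    """Pre-aggregate raw values into one series entry of a custom metric body."""
    return {
        "time": when.strftime("%Y-%m-%dT%H:%M:%S"),
        "data": {
            "baseData": {
                "metric": metric,
                "namespace": namespace,
                "dimNames": list(dimensions.keys()),
                "series": [
                    {
                        "dimValues": list(dimensions.values()),
                        "min": min(values),
                        "max": max(values),
                        "sum": sum(values),
                        "count": len(values),
                    }
                ],
            }
        },
    }


# Hypothetical metric: depth of a queue sampled three times in one minute.
body = build_custom_metric_body(
    metric="QueueDepth",
    namespace="QueueProcessing",
    dimensions={"QueueName": "ImportantQueue"},
    values=[3, 5, 20],
    when=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```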
Tip
For more information about the InfluxData Telegraf agent, go to the InfluxData website, https://www.influxdata.com/time-series-platform/telegraf/.
Multi-dimensional metrics
Metrics data often has limited information to provide context for collected values. Azure Monitor addresses this challenge with multi-dimensional metrics. The dimensions of a metric are name-value pairs that store additional data describing the metric value. For example, a metric called available disk space could have a dimension called Drive with the values C: and D:. This dimension would allow you to view the available disk space across all drives, or for each drive individually.
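The disk space example can be sketched in a few lines. This is a simplified, in-memory illustration of how a dimension lets the same metric be viewed as a whole or split per value. The sample numbers and the helper name are made up for the example.

```python
from collections import defaultdict

# Hypothetical samples of one metric taken at the same time, one per drive (in GB).
samples = [
    {"name": "Available Disk Space", "value": 120.0, "dimensions": {"Drive": "C:"}},
    {"name": "Available Disk Space", "value": 300.0, "dimensions": {"Drive": "D:"}},
]


def split_by(samples, dimension):
    """Sum metric values per distinct value of the given dimension."""
    totals = defaultdict(float)
    for sample in samples:
        totals[sample["dimensions"][dimension]] += sample["value"]
    return dict(totals)


# Ignoring the dimension gives the view across all drives.
total_available = sum(sample["value"] for sample in samples)

# Splitting on the dimension gives the view per drive.
per_drive = split_by(samples, "Drive")

print(total_available)
print(per_drive)
```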
In the next section, we are going to create a metric in the Azure portal.
Creating a metric
To display the metrics for a Windows VM in Azure Monitor (for example, the VM from the walk-through referenced in the Technical requirements section), follow these steps:
- Navigate to the Azure portal by opening https://portal.azure.com.
- In the left-hand menu, select Monitor to open the Azure Monitor overview blade.
- First, we’re going to look at metrics. In the left-hand menu, select Metrics, or click the Explore Metrics button on the overview blade.
- In the Metrics overview blade, the Select a scope blade may open automatically; if not, click on the + Select a scope button. In this blade, you can select the subscription, the resource group, and the resource type. Select the subscription that is used for the Windows VM, select the resource group, and then select the VM. You can filter by other resource types as well.
- Click on Apply.
- Next, select the metric type. For instance, select CPU Credits Consumed.
Tip
Take some time to look at the different metrics that you can choose from. They may come up in the exam questions.
- You can select a different type of aggregation as well, such as the count or the average, in the filter box. At the top right of the blade, you can select a different time range for your metric.
- You can also pin this metric to the overview dashboard in the Azure portal. To do so, click on the Pin to dashboard button, and then choose to pin it to the current dashboard or create a new dashboard for it. For now, select Pin to current dashboard.
- If you now select Dashboard from the left-hand menu, you’ll see that this metric has been added to it. This way, you can easily analyze this metric without needing to open Azure Monitor.
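The aggregation type and time range chosen in the steps above determine how raw samples are summarized in the chart. The following is a simplified sketch of that idea, assuming one sample per minute and a 5-minute time grain. The function name and the sample data are illustrative assumptions, not the portal's actual implementation.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean

AGGREGATIONS = {
    "Average": mean,
    "Minimum": min,
    "Maximum": max,
    "Sum": sum,
    "Count": len,
}


def aggregate(samples, grain, aggregation):
    """Bucket (timestamp, value) pairs by time grain and aggregate each bucket."""
    combine = AGGREGATIONS[aggregation]
    grain_seconds = grain.total_seconds()
    buckets = {}
    for timestamp, value in samples:
        epoch = timestamp.timestamp()
        # Round the timestamp down to the start of its bucket.
        bucket_start = datetime.fromtimestamp(
            epoch - epoch % grain_seconds, tz=timezone.utc
        )
        buckets.setdefault(bucket_start, []).append(value)
    return {start: combine(values) for start, values in sorted(buckets.items())}


# Ten hypothetical one-minute samples: 10.0, 20.0, ..., 100.0.
start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
samples = [(start + timedelta(minutes=i), 10.0 * (i + 1)) for i in range(10)]

five_minute_average = aggregate(samples, timedelta(minutes=5), "Average")
five_minute_count = aggregate(samples, timedelta(minutes=5), "Count")
for bucket_start, value in five_minute_average.items():
    print(bucket_start.isoformat(), value)
```

Switching the aggregation name changes only how each bucket is summarized; the bucketing by time grain stays the same.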
Important note
Metrics are also available directly from the Azure resource blades. So, for instance, if you have a VM, go to the VM resource by selecting it. Then, in the left-hand menu, under Monitoring, you can select Metrics.
In the next section, we’re going to look at how to set up and analyze alerts in Azure Monitor.
Alerts
With alerts, Azure can proactively notify you when critical conditions occur in your Azure or on-premises environment. Alerts can also attempt to take corrective actions automatically. Alert rules that are based on metrics provide near real-time alerting, whereas alert rules that are based on logs can combine data from different resources.
The alerts in Azure Monitor use action groups, which are unique sets of recipients and actions that can be shared across multiple rules. These action groups can use webhooks to start external actions, based on the requirements that are set up for this alert. These external actions can then be picked up by different Azure resources, such as Runbooks, Functions, or Logic Apps. Webhooks can also be used to add these alerts to external IT Service Management (ITSM) tools.
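The relationship between alert rules and action groups can be sketched as follows. This is a conceptual model only: the class names, the email address, and the webhook URL are hypothetical, and no email or HTTP request is actually sent.

```python
from dataclasses import dataclass, field


@dataclass
class ActionGroup:
    """A reusable set of recipients and actions (conceptual model)."""
    name: str
    short_name: str
    email_receivers: list = field(default_factory=list)
    webhook_urls: list = field(default_factory=list)

    def notify(self, alert_name):
        """Return descriptions of the notifications that would be sent."""
        sent = [f"email to {address}: {alert_name}" for address in self.email_receivers]
        sent += [f"webhook POST {url}: {alert_name}" for url in self.webhook_urls]
        return sent


@dataclass
class AlertRule:
    name: str
    action_groups: list

    def fire(self):
        """Fan the alert out to every attached action group."""
        return [msg for group in self.action_groups for msg in group.notify(self.name)]


# One action group shared by two different alert rules.
ops = ActionGroup(
    name="Send email",
    short_name="email",
    email_receivers=["ops@example.com"],
    webhook_urls=["https://example.com/hooks/itsm"],
)
cpu_rule = AlertRule("High CPU", [ops])
disk_rule = AlertRule("Low disk space", [ops])

print(cpu_rule.fire())
print(disk_rule.fire())
```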
You can also set alerts for all of the different Azure resources. In the following sections, we are going to create an alert.
Creating an alert and an action group
To create an alert, follow these steps:
- From the Azure Monitor overview blade, in the left-hand menu, select Alerts. Alternatively, click on Create alert to create an alert directly.
- In the Alerts blade, click on + New alert rule in the top menu.
- The Create rule blade is displayed. Here, you can create the rule and action groups. To create a new rule, you first need to select the resource. Click on the Select button in the RESOURCE section.
- In the next blade, you can filter by the subscription and resource type. Select Virtual machines.
- Select the VM from the list and click Done.
- Now that we have selected a resource, we’re going to set up the condition. Click on Add condition.
- The condition blade opens, where you can filter by signal. Select Percentage CPU and click Done.
- Next, you can set the alert logic for this alert. You can choose multiple operators, set the aggregation type, and set the threshold value for this alert. Set the following:
a) Threshold: Static (in the next section, we are going to cover the difference between static and dynamic thresholds)
b) Operator: Greater than
c) Aggregation type: Average
d) Threshold Value: 90%
- Leave Evaluated based on with its default settings.
- This alert will notify you when the average CPU usage of the VM is greater than 90% over a 5-minute period. Azure Monitor will check this every minute.
- Click on Done to create this condition.
- Now, we have to create an action group to send the alert to. This is then responsible for handling the alert and taking further action on it. The action group that you create here can be reused across other alerts as well. So, in our case, we will create an email action group that will send out an email to a certain email address. After it has been created, you can add this action group to other alerts. Under Action group, select the Create new button.
- In the Action Group blade, add the following settings:
a) Action group name: Type Send email.
b) Short name: Type email.
c) Subscription: Select the subscription where the VM is created.
d) Resource group: Select Default-ActivityLogAlerts (to be created).
- Then, we have to provide the actual action. Add the following values:
a) Action name: email
b) Action type: Email/SMS/Push/Voice
- Then, select Edit details and select the Email checkbox. Provide an email address and click on the OK button.
- Click on OK again.
- Finally, you have to specify an alert name, set the severity level of the alert, and click on Create alert rule.
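The condition configured in the steps above (static threshold, Greater than, Average, 90%, a 5-minute window checked every minute) can be sketched as a sliding-window evaluation. This is a simplified illustration with made-up CPU readings and a hypothetical function name, not Azure Monitor's actual evaluation engine.

```python
import operator
from statistics import mean

OPERATORS = {
    "GreaterThan": operator.gt,
    "GreaterThanOrEqual": operator.ge,
    "LessThan": operator.lt,
    "LessThanOrEqual": operator.le,
}


def evaluate_static_threshold(per_minute_values, threshold=90.0,
                              window_minutes=5, operator_name="GreaterThan"):
    """Evaluate the rule once per minute over the trailing window.

    Returns (minute_index, fired) for every minute with a full window.
    """
    compare = OPERATORS[operator_name]
    results = []
    for end in range(window_minutes, len(per_minute_values) + 1):
        window_average = mean(per_minute_values[end - window_minutes:end])
        results.append((end - 1, compare(window_average, threshold)))
    return results


# Hypothetical per-minute Percentage CPU readings.
cpu = [50.0, 95.0, 96.0, 97.0, 98.0, 99.0, 60.0]
results = evaluate_static_threshold(cpu)
for minute, fired in results:
    print(minute, "ALERT" if fired else "ok")
```

In this sample, the last window averages exactly 90, which does not fire because the operator is strictly Greater than.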
We have now created an alert and an action group that will alert a user via email when the CPU goes over 90%. In the next section, we’re going to create a baseline for resources.