Range Anomaly Detection


Current Document Location

This document is the second step in the detection rule configuration process. After completing the configuration, please return to the main document to continue with the third step: Event Notification.

Within the selected time range, the system performs anomaly detection on metric data. If the proportion of detected data points with sudden anomalies exceeds the preset threshold percentage, a range anomaly event is triggered.

This detection type is suitable for monitoring data or metrics with stable trends. For example, it can generate an anomaly event when the proportion of data points with sudden anomalies in host CPU usage over the last 1 day exceeds 10%.
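The proportion check described above can be sketched as follows. This is a minimal illustration, not the product's implementation: it assumes each data point already carries an anomaly flag (how the system decides that a point is a sudden anomaly is not described here), and the names `anomaly_ratio` and `is_range_anomaly` are hypothetical. The comparison uses `>=`, following the `Result >= [value] %` form shown under Trigger Conditions.

```python
def anomaly_ratio(flags):
    """Return the percentage of data points flagged as anomalous."""
    if not flags:
        return 0.0
    return 100.0 * sum(1 for flag in flags if flag) / len(flags)


def is_range_anomaly(flags, threshold_pct):
    """True when the anomalous proportion reaches the threshold percentage."""
    return anomaly_ratio(flags) >= threshold_pct


# Hypothetical input: 24 hourly CPU data points over 1 day, 3 flagged as anomalous.
flags = [False] * 21 + [True] * 3
ratio = anomaly_ratio(flags)             # 3 of 24 points
triggered = is_range_anomaly(flags, 10)  # compare against a 10% threshold
```

With 3 of 24 points flagged, the ratio is 12.5%, which meets a 10% threshold.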

Detection Configuration

Detection Frequency

How often the detection runs. The frequency is matched automatically to the selected detection range.

Detection Range (Dropdown Options) | Detection Frequency
15m | 5m
30m | 5m
1h | 15m
4h | 30m
12h | 1h
1d | 1h
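The range-to-frequency matching in the table can be expressed as a simple lookup. This is only a sketch: the dictionary mirrors the table values, while the function name `frequency_for` is hypothetical. The table does not say which frequency applies to a custom range, so the sketch returns `None` in that case rather than guessing.

```python
# Preset detection ranges and the detection frequency matched to each (from the table).
DETECTION_FREQUENCY = {
    "15m": "5m",
    "30m": "5m",
    "1h": "15m",
    "4h": "30m",
    "12h": "1h",
    "1d": "1h",
}


def frequency_for(detection_range):
    """Return the matched frequency for a preset range, or None if it is not a preset."""
    return DETECTION_FREQUENCY.get(detection_range)
```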

Detection Range

Sets the time range of data queried by each detection (❗️The detection range should be greater than or equal to the detection frequency, and must match the actual data reporting period to avoid missed detections or false alarms).

  • Preset options: Last 15 minutes, Last 30 minutes, Last 1 hour, Last 4 hours, Last 12 hours, Last 1 day

  • Custom format: Custom input for detection range, e.g., 20m (last 20 minutes), 2h (last 2 hours), 1d (last 1 day).
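As an illustration of the custom format, the sketch below parses values such as `20m`, `2h`, and `1d` into a duration and checks that a range is at least as long as a frequency. It assumes only the `m`, `h`, and `d` suffixes shown in the examples; whether the product accepts other units is not stated. `parse_range` and `range_covers_frequency` are hypothetical helper names.

```python
import re
from datetime import timedelta

# Unit suffixes seen in the examples above; other units are not assumed.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_range(text):
    """Convert a string like '20m', '2h', or '1d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported detection range: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def range_covers_frequency(range_text, frequency_text):
    """Check that the detection range is >= the detection frequency."""
    return parse_range(range_text) >= parse_range(frequency_text)
```

For example, `range_covers_frequency("20m", "5m")` holds, while a 5-minute range with a 15-minute frequency does not.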

Detection Metrics

Define the detection data source and aggregation method based on DQL (❗️Please avoid selecting high-cardinality fields as detection dimensions. If configured improperly, overly lenient trigger conditions may cause frequent alerts. The current query returns a maximum of 100,000 records).

Configuration Elements

Configuration Item | Description
Workspace | Default is the current workspace; can be switched to other authorized workspaces. After authorization, you can use detection metrics from other workspaces under the current account to create monitors. After the rule is successfully created, cross-workspace alert configuration can be achieved. Please note that when you select another workspace, the dropdown list for detection metrics will only display data types that have been authorized for use by the current workspace.
Data Source Type | Metrics, LOG, Infrastructure, Resource Catalog, Events, APM, RUM, Network, Profile, etc.
Query Method | Simple Query, Expression Query
Detection Dimension | Any string-type (keyword) field in the configuration data can be selected as a detection dimension. Currently, a maximum of three fields can be selected as detection dimensions. By combining multiple detection dimension fields, a specific detection object can be determined. The system will judge whether the statistical metric corresponding to a detection object meets the threshold of the trigger condition. If the condition is met, an event is generated. (For example, selecting detection dimensions host and host_ip, the detection object could be {host: host1, host_ip: 127.0.0.1}).
Filter Conditions | Filter the data of detection metrics based on metric tags to limit the data scope of detection; supports adding one or multiple tag filters; supports fuzzy match and fuzzy non-match filter conditions.
Aggregation Algorithm | Avg by (average), Min by (minimum), Max by (maximum), Sum by (sum), Last (last value), First by (first value), Count by (count of data points), Count_distinct by (count of distinct data points), p50 (median value), p75 (value at the 75th percentile), p90 (value at the 90th percentile), p99 (value at the 99th percentile), etc.
Alias | Custom detection metric name.

Click to view Detailed Explanation of Query Methods.
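To make the idea of a detection object concrete, the sketch below groups sample rows by the selected detection dimensions and then applies an average per object, in the spirit of "Avg by". It is an illustration only: the row layout, the `value` field, and the function name `group_by_dimensions` are assumptions, and the real grouping is done by the DQL query rather than by client code.

```python
from collections import defaultdict

MAX_DIMENSIONS = 3  # at most three detection dimensions, as stated above


def group_by_dimensions(rows, dimensions):
    """Group metric values by detection object (one object per dimension-value combination)."""
    if len(dimensions) > MAX_DIMENSIONS:
        raise ValueError("at most three detection dimensions can be selected")
    groups = defaultdict(list)
    for row in rows:
        detection_object = tuple((dim, row[dim]) for dim in dimensions)
        groups[detection_object].append(row["value"])
    return dict(groups)


# Hypothetical sample data.
rows = [
    {"host": "host1", "host_ip": "127.0.0.1", "value": 40.0},
    {"host": "host1", "host_ip": "127.0.0.1", "value": 60.0},
    {"host": "host2", "host_ip": "10.0.0.2", "value": 30.0},
]
groups = group_by_dimensions(rows, ["host", "host_ip"])

# Average per detection object.
averages = {obj: sum(vals) / len(vals) for obj, vals in groups.items()}
```

Each key in `averages` corresponds to one detection object, such as `{host: host1, host_ip: 127.0.0.1}`, and each object is evaluated against the trigger conditions separately.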

Trigger Conditions

Configure trigger conditions and severity levels. When the query result contains multiple values, an event is generated if any value meets the trigger condition.

Four threshold levels can be configured (Critical, Severe, Important, and Warning), along with a Normal recovery condition.

Level | Configuration | Description
Critical | Mutation direction is Up or Down, Up, or Down; Result >= [value] % | Compares the proportion of data points with mutation anomalies against the configured percentage. Triggers an event when the condition is met.
Severe | Mutation direction is Up or Down, Up, or Down; Result >= [value] % | Compares the proportion of data points with mutation anomalies against the configured percentage. Triggers an event when the condition is met.
Important | Mutation direction is Up or Down, Up, or Down; Result >= [value] % | Compares the proportion of data points with mutation anomalies against the configured percentage. Triggers an event when the condition is met.
Warning | Mutation direction is Up or Down, Up, or Down; Result >= [value] % | Compares the proportion of data points with mutation anomalies against the configured percentage. Triggers an event when the condition is met.
Normal | No events generated for [N] consecutive detections | After the detection rule takes effect, if the detection result returns from abnormal (Critical, Severe, Important, Warning) to normal and stays normal for the configured number of detections, a recovery alert event is triggered.

❗️ Recovery alert events are not subject to Alert Silence restrictions. If the number of detections for recovery is not set, the alert event will not recover and will remain in the Events > Unrecovered Events List.

For more details, refer to Event Level Description.
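The level thresholds and the recovery condition can be sketched as below. Several points are assumptions rather than documented behavior: that the most severe matching level wins when several thresholds are met, that unset levels are skipped, and the names `evaluate_level` and `recovery_points`. The threshold values are made up for the example.

```python
# Levels ordered from most to least severe (assumed evaluation order).
LEVELS = ["critical", "severe", "important", "warning"]


def evaluate_level(result_pct, thresholds):
    """Return the most severe level whose 'Result >= value %' condition holds, else None."""
    for level in LEVELS:
        limit = thresholds.get(level)
        if limit is not None and result_pct >= limit:
            return level
    return None


def recovery_points(levels, recover_after):
    """Return indexes of detections where a recovery event would be generated.

    `levels` is the per-detection result (a level name, or None for normal).
    With recover_after=None the event never recovers, matching the note above.
    """
    if recover_after is None:
        return []
    recovered_at = []
    abnormal = False
    normal_streak = 0
    for index, level in enumerate(levels):
        if level is not None:
            abnormal = True
            normal_streak = 0
        elif abnormal:
            normal_streak += 1
            if normal_streak >= recover_after:
                recovered_at.append(index)
                abnormal = False
                normal_streak = 0
    return recovered_at


thresholds = {"critical": 30, "severe": 20, "important": 15, "warning": 10}
history = ["warning", None, None, None, "critical", None, None]
```

Here `evaluate_level(12.5, thresholds)` yields the Warning level, and with `recover_after=2` the history above recovers at the third and seventh detections.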

Bulk Alert Protection

Enabled by default in the system.

When the number of alerts generated by a single detection exceeds a preset threshold, the system automatically switches to a status summary strategy: instead of processing each alert object individually, it generates a small number of summary alerts based on event status and pushes them.

This ensures the timeliness of notifications while significantly reducing alert noise and avoiding timeout risks caused by processing too many alerts.

While this switch is on, the Event Details of events that monitors generate after detecting anomalies will not display historical records or related events.
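The switch from per-object alerts to status summaries might look roughly like the sketch below. The actual threshold, summary format, and grouping rules are not documented here, so `max_individual`, the message strings, and the function name `build_notifications` are all hypothetical.

```python
from collections import Counter


def build_notifications(alerts, max_individual):
    """Return one message per alert, or one summary per status when there are too many."""
    if len(alerts) <= max_individual:
        return [f"[{alert['status']}] {alert['object']}" for alert in alerts]
    # Too many alerts in a single detection: summarize by event status instead.
    counts = Counter(alert["status"] for alert in alerts)
    return [f"[{status}] {count} objects" for status, count in sorted(counts.items())]


# Hypothetical alerts produced by a single detection.
alerts = [
    {"object": "host1", "status": "critical"},
    {"object": "host2", "status": "critical"},
    {"object": "host3", "status": "warning"},
]
individual = build_notifications(alerts, max_individual=5)
summary = build_notifications(alerts, max_individual=2)
```

With a limit of 2, the three alerts collapse into two summary messages, one per status.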

Data Gap

Processing strategy when the detection metric query result is empty within the detection range:

Option | Description
Do not trigger event (default) | Linked to the time range of the detection range. Whether an event is generated is determined by the query results of the detection metric over the last several minutes. Suitable for scenarios where data gaps are acceptable.
Treat query result as 0 | Linked to the time range of the detection range. The query result of the detection metric over the last several minutes is treated as 0 and compared again with the thresholds configured in the Trigger Conditions above to determine whether to trigger an anomaly event.
Custom fill and trigger event | Supports custom filling of detection range values and triggers the following event types respectively: Data Gap Event, Critical Event, Important Event, Warning Event, and Recovery Event.

❗️When choosing the custom fill strategy, it is recommended that the configured custom data gap time be ≥ the time interval of the detection range. If the configured time is ≤ the detection range time interval, both the data gap and the anomaly conditions may be met at the same time. In such cases, the data gap processing result is applied first.

When Trigger Conditions, Data Gap, and Information Generation are configured simultaneously, the triggering priority is judged as follows: Data Gap > Trigger Conditions > Information Event Generation.

That is: first judge whether there is a data gap, then judge whether the threshold is triggered, and finally judge whether to generate an information event.
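The priority order can be illustrated with a small decision function. This is a sketch under assumptions: the argument names and the returned labels are hypothetical, and `gap_event` stands for whatever event the chosen Data Gap strategy would produce (or `None` for "Do not trigger event").

```python
def decide_event(query_empty, gap_event, matched_level, information_enabled):
    """Apply the priority Data Gap > Trigger Conditions > Information Event Generation."""
    # 1. Data gap handling comes first.
    if query_empty and gap_event is not None:
        return gap_event
    # 2. Then the configured threshold levels.
    if matched_level is not None:
        return matched_level
    # 3. Finally, record an information event if that option is enabled.
    if information_enabled:
        return "information"
    return None
```

For instance, when a data gap event and a Critical threshold both apply, the data gap event is returned. When nothing matches and Information Generation is enabled, the result is an information event.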

Information Generation

After enabling this option, the system writes all detection results that do not match the above trigger conditions as "Information" events.

Suitable for scenarios where recording normal status changes or low-priority information is needed.

Subsequent Configuration

After completing the above detection configuration, please continue to configure:

  1. Event Notification: Define event title, content, notification members, data gap handling, and associated incidents;

  2. Alert Configuration: Select alert strategies, set notification targets, and mute periods;

  3. Association: Associate dashboards for quick jump to view data;

  4. Permissions: Set operation permissions to control who can edit/delete this monitor.
