Outlier Detection
Outlier detection algorithmically analyzes the metrics or statistical data of detection objects within a specific group to identify objects that deviate significantly from the rest. If the detected deviation exceeds the preset threshold, the system generates an outlier detection anomaly event for subsequent alert tracking and analysis. This helps detect and handle potential anomalies promptly, improving monitoring accuracy and response speed.
Use Cases
You can configure a distance parameter suited to the characteristics of the metric data so that an emergency event is triggered when data deviates significantly from the normal range. For example, you can set up a monitor that issues an alert when the memory usage of one host is significantly higher than that of the other hosts. Such a configuration helps identify and respond to potential performance issues or anomalies quickly.
Detection Configuration
Detection Frequency
The detection frequency is set automatically to match the selected detection interval. The default is 5 minutes.
Detection Interval
The time range for querying detection metrics.
| Detection Interval (Dropdown Options) | Default Detection Frequency |
|---|---|
| 15m | 5m |
| 30m | 5m |
| 1h | 15m |
| 4h | 30m |
| 12h | 1h |
| 1d | 1h |
Detection Metrics
The metrics being monitored.
| Field | Description |
|---|---|
| Data Type | The current data type being detected, including detection metrics, logs, infrastructure, Resource Catalog, events, APM, RUM, network, and Profile. |
| Measurement | The measurement to which the current detection metric belongs. |
| Metric | The metric targeted by the current detection. |
| Aggregation Algorithm | Includes Avg by (average), Min by (minimum), Max by (maximum), Sum by (sum), Last (last value), First by (first value), Count by (data points), Count_distinct by (unique data points), p50 (median), p75 (75th percentile), p90 (90th percentile), p99 (99th percentile). |
| Detection Dimension | Any string (keyword) field in the configured data can be selected as a detection dimension; up to three fields can currently be selected. The combination of detection dimension fields identifies a specific detection object. Guance checks whether the statistical metric of each detection object meets the trigger condition threshold and generates an event if it does. For example, with detection dimensions host and host_ip, a detection object can be {host: host1, host_ip: 127.0.0.1}. |
| Filter Conditions | Filter the data of detection metrics based on metric tags to limit the data range; supports adding one or more tag filters; supports fuzzy matching and fuzzy non-matching filter conditions. |
| Alias | Custom detection metric name. |
| Query Method | Supports simple query and expression query. |
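As a rough illustration of how detection dimensions identify detection objects, the sketch below groups data points by their dimension values and computes an average per object. The function name `group_by_dimensions`, the point layout, and the sample values are assumptions made for this example, not part of Guance.

```python
from collections import defaultdict

MAX_DIMENSIONS = 3  # the document states that up to three fields can be selected


def group_by_dimensions(points, dimensions):
    """Group values into detection objects keyed by their dimension values."""
    if len(dimensions) > MAX_DIMENSIONS:
        raise ValueError("at most three detection dimensions can be selected")
    groups = defaultdict(list)
    for point in points:
        # The combination of dimension values identifies one detection object.
        key = tuple((dim, point["tags"].get(dim)) for dim in dimensions)
        groups[key].append(point["value"])
    return dict(groups)


# Hypothetical sample data: memory usage readings tagged by host and host_ip.
points = [
    {"tags": {"host": "host1", "host_ip": "127.0.0.1"}, "value": 41.0},
    {"tags": {"host": "host1", "host_ip": "127.0.0.1"}, "value": 43.0},
    {"tags": {"host": "host2", "host_ip": "10.0.0.2"}, "value": 95.0},
]

objects = group_by_dimensions(points, ["host", "host_ip"])
# "Avg by" aggregation: one statistic per detection object.
averages = {key: sum(vals) / len(vals) for key, vals in objects.items()}
```

Each key in `averages` corresponds to one detection object such as {host: host1, host_ip: 127.0.0.1}, and its value is the statistic that would be checked against the trigger condition.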
Cross-Workspace Query Metrics
After authorization, detection metrics from other workspaces under the current account can be selected. Once the monitor rule has been created, alerts can be configured across workspaces.
Note
After selecting another workspace, the detection metric dropdown options will only display the data types that have been authorized for the current workspace.
Trigger Conditions
Set the trigger conditions for each alert level. You can configure any of the following: emergency, normal, data gap, or information.
Configure the trigger condition and severity. When the query returns multiple values, an event is generated if any value meets the trigger condition.
| Severity | Description |
|---|---|
| Emergency (Red) | Uses the DBSCAN algorithm. Configure a distance parameter (float, default 0.5) suited to the characteristics of the metric data to trigger emergency events. The distance parameter is the maximum distance between two samples for them to be treated as neighbors; it is not an upper bound on distances within a cluster. ❗️ Any floating-point value in the range 0 to 3.0 can be configured; if none is configured, the default of 0.5 is used. The larger the distance, the fewer anomalies are detected: a value that is too small may flag many points as outliers, and a value that is too large may flag none. Choose the distance parameter according to the characteristics of the data. |
| Normal (Green) | Configure a number of detections N. If the detection metric has triggered an "emergency" anomaly event and the next N consecutive detections are normal, a "normal" event is generated. This determines whether the anomaly has returned to normal; configuring it is recommended. |
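To make the effect of the distance parameter concrete, here is a minimal pure-Python sketch of DBSCAN on one-dimensional values. The function names, the `min_samples` value of 2, and the sample values are assumptions for illustration; the source does not say how Guance scales the data or which minimum cluster size it uses, so this is not a description of the product's implementation.

```python
def dbscan_1d(values, eps=0.5, min_samples=2):
    """Return a cluster id per value, or -1 for noise (an outlier)."""
    n = len(values)
    # Neighborhood of i: every point within eps of values[i], including i itself.
    neighbors = [
        [j for j in range(n) if abs(values[i] - values[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point, so keep expanding
    return labels


def outliers(values, eps):
    """Values that DBSCAN leaves unclustered for the given distance parameter."""
    labels = dbscan_1d(values, eps=eps)
    return [v for v, label in zip(values, labels) if label == -1]


# Hypothetical per-host readings; the last host deviates from the group.
readings = [1.0, 1.1, 1.2, 1.3, 4.0]

default_result = outliers(readings, eps=0.5)   # only the deviating host is flagged
large_result = outliers(readings, eps=3.0)     # distance too large: nothing is flagged
small_result = outliers(readings, eps=0.05)    # distance too small: every host is flagged
```

With these sample values, the default distance isolates the single deviating reading, while the two extremes reproduce the behavior described in the table: too large a distance finds no outliers, and too small a distance flags everything.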
Bulk Alert Protection
Enabled by default.
When the number of alerts generated by a single detection exceeds the preset threshold, the system automatically switches to a status summary strategy: instead of processing each alert object individually, a small number of summary alerts are generated and pushed based on the event status.
This ensures the timeliness of notifications while significantly reducing alert noise, avoiding the risk of timeout due to processing too many alerts.
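The switch from per-object alerts to status summaries could be sketched roughly as follows. The function name `build_notifications`, the alert layout, the message wording, and the threshold value are all hypothetical; the source does not state the actual threshold or summary format.

```python
from collections import Counter


def build_notifications(alerts, threshold):
    """Return one message per alert, or one summary per status when there are too many."""
    if len(alerts) <= threshold:
        # Normal path: each alert object is processed individually.
        return [f"{alert['status']}: {alert['object']}" for alert in alerts]
    # Protection path: collapse the alerts into a few summaries grouped by event status.
    counts = Counter(alert["status"] for alert in alerts)
    return [
        f"{status}: {count} detection objects"
        for status, count in sorted(counts.items())
    ]


# Hypothetical alerts from a single detection run.
alerts = [
    {"object": "host1", "status": "emergency"},
    {"object": "host2", "status": "emergency"},
    {"object": "host3", "status": "emergency"},
    {"object": "host4", "status": "data_gap"},
    {"object": "host5", "status": "data_gap"},
]

individual = build_notifications(alerts, threshold=10)
summarized = build_notifications(alerts, threshold=3)
```

In this sketch, five alerts against a threshold of 3 collapse into two summary messages, one per status.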
Note
When this switch is enabled, the subsequent Event Details generated by the monitor after detecting anomalies will not display historical records and associated events.
Data Gap
Seven strategies can be configured for the data gap status.
- Link to the detection interval time range, evaluate the query result of the detection metric for the most recent minutes, and do not trigger an event.
- Link to the detection interval time range, evaluate the query result of the detection metric for the most recent minutes, and treat the query result as 0. The result is then compared again with the threshold configured in Trigger Conditions above to decide whether to trigger an anomaly event.
- Custom-fill the detection interval value and trigger a data gap event, an emergency event, an important event, a warning event, or a recovery event. With this type of strategy, the custom data gap time should be >= the detection interval. If the configured time is <= the detection interval, the data gap and anomaly conditions may both be met; in that case, only the data gap handling result is applied.
Information Generation
Enable this option to generate "information" events for detection results that do not match the above trigger conditions.
Note
When trigger conditions, data gap, and information generation are configured simultaneously, the priority for triggering is as follows: data gap > trigger conditions > information event generation.
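The priority order in the note above can be expressed as a short decision function. This is only a sketch: the function name `resolve_event` and its parameters are hypothetical and chosen for illustration.

```python
def resolve_event(has_data_gap, matched_severity, information_enabled):
    """Pick the event type using: data gap > trigger conditions > information."""
    if has_data_gap:
        return "data_gap"
    if matched_severity is not None:
        return matched_severity  # for example "emergency"
    if information_enabled:
        return "information"
    return None  # nothing matched and information generation is disabled


# Both a data gap and an emergency condition are met: the data gap wins.
both_met = resolve_event(True, "emergency", True)
# No data gap, but a trigger condition matched.
trigger_only = resolve_event(False, "emergency", True)
# Nothing matched; information generation is enabled.
fallback = resolve_event(False, None, True)
```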
Other Configurations
For more details, refer to Rule Configuration.