
Mutation Detection


Mutation detection compares the absolute change or relative percentage change of the same metric across two time periods to determine whether an anomaly has occurred. It is often used to track peaks or fluctuations in metrics. When an anomaly is detected, it generates more accurate event records for subsequent analysis and handling.

Use Cases

Mutation detection is suited to monitoring relative changes, or rates of change, between short-term and long-term data. For example, suppose the percentage difference between the number of MySQL connections over the last 15 minutes and the average value over the past day is set to greater than 500%. If the average number of connections over the last 15 minutes then exceeds five times the average number of connections from the previous day, the system triggers a warning.
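The comparison in this example can be sketched in a few lines of Python. This is a minimal illustration, not Guance's implementation: the function names `absolute_change` and `percent_change` and the sample values are hypothetical, and the percentage is assumed to follow the conventional definition (recent - baseline) / baseline * 100. The product's exact formula may differ.

```python
def absolute_change(recent: float, baseline: float) -> float:
    """Difference between the recent value and the baseline value."""
    return recent - baseline


def percent_change(recent: float, baseline: float) -> float:
    """Relative change in percent, assuming (recent - baseline) / baseline * 100."""
    if baseline == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    return (recent - baseline) / baseline * 100.0


# Hypothetical values: average MySQL connections over the past day
# versus the average over the last 15 minutes.
baseline_avg = 40.0
recent_avg = 300.0

change = percent_change(recent_avg, baseline_avg)
should_warn = change > 500.0  # threshold taken from the example above
```

A zero baseline is rejected explicitly here, because a relative change cannot be computed against it. How the product handles that case is not stated in this document.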

It is recommended to use statistical functions such as average (AVG), maximum (MAX), or minimum (MIN) rather than the last value (LAST) function to calculate these metrics. This reduces the impact of abnormal data points and improves monitoring accuracy.
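As a rough illustration of why an aggregate is less sensitive than the last value, the sketch below compares the two on a window whose final point is an outlier. The sample values are hypothetical and chosen only for demonstration.

```python
from statistics import mean

# Hypothetical connection counts sampled over one window; the final point is an outlier.
samples = [50, 52, 48, 51, 49, 400]

last_value = samples[-1]   # LAST: reflects only the final data point
avg_value = mean(samples)  # AVG: the outlier is diluted by the other points

# Ratio of each statistic to the typical level (about 50) in this window.
last_ratio = last_value / 50
avg_ratio = avg_value / 50
```

With these values, the last point is eight times the typical level, while the average is only a little over twice that level.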

Detection Configuration

Detection Metrics

The metric data to monitor. Detection compares the difference or percentage difference of this metric across two time periods.

| Field | Description |
| --- | --- |
| Data Type | The data type for the current detection rule. |
| Measurement | The measurement set where the current detection metric resides. |
| Metrics | The specific metric targeted by the current detection. |
| Aggregation Algorithm | Includes Avg by (average), Min by (minimum), Max by (maximum), Sum by (sum), Last (last value), First by (first value), Count by (number of data points), Count_distinct by (number of distinct data points), p50 (median value), p75 (value at the 75th percentile), p90 (value at the 90th percentile), p99 (value at the 99th percentile). |
| Detection Dimensions | Any string-type (keyword) field in the configuration can be selected as a detection dimension. Up to three fields are currently supported. Combining multiple detection dimension fields identifies a specific detection object. Guance evaluates whether the statistical metric for a given detection object meets the threshold conditions for triggering an event, and generates an event if it does. For example, with detection dimensions HOST and host_ip, a detection object could be {HOST: host1, host_ip: 127.0.0.1}. |
| Filtering Conditions | Filters the detection metric data by its tags to limit the scope of detection. One or more tag filters can be added. Fuzzy match and fuzzy non-match conditions are supported. |
| Alias | Custom name for the detection metric. |
| Query Method | Supports simple queries and expression-based queries. |
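The way detection dimensions combine into detection objects can be sketched as a group-by over tag values. This is an illustration only: the dimension names HOST and host_ip come from the example above, while the data points and the helper `group_by_dimensions` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data points; each carries tag fields and a metric value.
points = [
    {"HOST": "host1", "host_ip": "127.0.0.1", "value": 10.0},
    {"HOST": "host1", "host_ip": "127.0.0.1", "value": 30.0},
    {"HOST": "host2", "host_ip": "10.0.0.2", "value": 5.0},
]


def group_by_dimensions(points, dimensions):
    """Group metric values by the tuple of dimension values (one key per detection object)."""
    if len(dimensions) > 3:
        raise ValueError("at most three detection dimensions are supported")
    groups = defaultdict(list)
    for point in points:
        key = tuple(point[d] for d in dimensions)
        groups[key].append(point["value"])
    return dict(groups)


groups = group_by_dimensions(points, ["HOST", "host_ip"])
# Aggregate per detection object, here with an average.
averages = {key: mean(values) for key, values in groups.items()}
```

Each key in `averages` stands for one detection object, so thresholds would be evaluated once per key rather than once for the whole metric.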

Available detection intervals are: last month, last week, yesterday, 1 hour ago, the previous period, last 15 minutes, last 30 minutes, last 1 hour, last 4 hours, last 12 hours, and last 1 day.

Note

For the detection intervals "yesterday" and "1 hour ago," the difference or percentage difference of the detection metric is computed between the same time ranges. For all other detection intervals, it is computed between the two selected time periods.

Detection Frequency

The execution frequency of the detection rule automatically matches the larger of the two detection intervals selected by the user. Possible values are 1 minute, 5 minutes, 15 minutes, 30 minutes, and 1 hour.

Trigger Conditions

Set the trigger conditions for each alert level. You can configure any of the following: urgent, important, warning, data interruption, and informational.

  1. Pre-trigger condition configuration: Enabled by default. When the detection value meets the threshold set in the pre-trigger condition (supported operators are >, >=, <, and <=, with > selected by default), evaluation continues with the mutation detection rules. When this configuration is disabled, only the mutation detection rules are evaluated.

  2. Mutation rule configuration: The mutation detection rules are evaluated using one of three types of data comparison: upward mutation (data increase), downward mutation (data decrease), or upward or downward mutation.

Configure the trigger conditions and severity levels. When the query results return multiple values, an event is generated if any value satisfies the trigger condition.
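The two-stage evaluation described above can be sketched as follows. This is a simplified model under stated assumptions, not the product's logic: the names `mutation_matches` and `should_trigger` are hypothetical, the change is assumed to be a signed number, and the mutation rule is assumed to compare with >= against a positive threshold.

```python
import operator

# The four pre-trigger operators named in the text.
OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def mutation_matches(change: float, threshold: float, direction: str) -> bool:
    """Check a signed change against a positive threshold for 'up', 'down', or 'either'."""
    if direction == "up":
        return change >= threshold
    if direction == "down":
        return change <= -threshold
    if direction == "either":
        return abs(change) >= threshold
    raise ValueError(f"unknown direction: {direction}")


def should_trigger(results, threshold, direction, pre_op=">", pre_threshold=None):
    """Return True if any (detection_value, change) pair satisfies the conditions.

    When pre_threshold is None, the pre-trigger check is treated as disabled
    and only the mutation rule is evaluated.
    """
    for value, change in results:
        if pre_threshold is not None and not OPERATORS[pre_op](value, pre_threshold):
            continue  # pre-trigger condition not met, so skip the mutation rule
        if mutation_matches(change, threshold, direction):
            return True  # one matching value is enough to generate an event
    return False
```

For example, `should_trigger([(10, 600.0), (200, 600.0)], 500.0, "up", ">", 100)` returns True because the second value passes both stages, even though the first fails the pre-trigger check.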

For more details, refer to Event Level Description.

Alert Levels
  1. Alert Levels Urgent (Red), Important (Orange), Warning (Yellow): Evaluated based on the configured condition operators.

  2. Alert Level Normal (Green): Based on the configured detection count, as follows:

    • Each execution of a detection task counts as 1 detection. For example, with Detection Frequency = 5 minutes, 1 detection = 5 minutes.
    • You can customize the detection count. For example, with Detection Frequency = 5 minutes, 3 detections = 15 minutes.

    | Level | Description |
    | --- | --- |
    | Normal | After the detection rule takes effect and an urgent, important, or warning anomaly has occurred, a recovery alert event is generated if the detection result returns to normal within the configured detection count. |

    ⚠ Recovery alert events are not restricted by alert muting. If no detection count is set for recovery alert events, the alert event will not recover and will remain in the Events > Unrecovered Events List.
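The relationship between detection frequency and detection count is simple multiplication, as the short sketch below shows. The helper name `recovery_window` is hypothetical and used only for illustration.

```python
from datetime import timedelta


def recovery_window(detection_frequency: timedelta, detection_count: int) -> timedelta:
    """Time span covered by a number of consecutive detections."""
    if detection_count < 1:
        raise ValueError("detection_count must be at least 1")
    return detection_frequency * detection_count


# Detection Frequency = 5 minutes, 3 detections.
window = recovery_window(timedelta(minutes=5), 3)
```

With the values from the example above, `window` equals `timedelta(minutes=15)`.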

Data Interruption

For the data interruption status, seven strategies can be configured:

  1. Linked to the detection interval time range: evaluate the query results of the detection metric for the most recent minutes and do not trigger an event.

  2. Linked to the detection interval time range: evaluate the query results of the detection metric for the most recent minutes and treat the query results as 0. The results are then compared again with the thresholds configured in the trigger conditions to determine whether to trigger an anomaly event.

  3. Custom fill for the detection interval value, with one of five actions: trigger a data interruption event, an urgent event, an important event, a warning event, or a recovery event. With this type of strategy, the custom data interruption time should be >= the detection interval time. If the configured time is <= the detection interval time, data interruption and an anomaly may both be satisfied at the same time, in which case only the data interruption result is applied.

Information Generation

If this option is enabled, detection results that do not match the above trigger conditions will generate "informational" events that are written into the logs.

Note

If trigger conditions, data interruptions, and information generation are configured simultaneously, the following priority applies: data interruption > trigger conditions > information event generation.
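The priority order can be pictured as a chain of checks where the first match wins. The function `resolve_outcome` and its return values are hypothetical names chosen for this sketch. The product's internal representation is not described in this document.

```python
def resolve_outcome(data_interrupted: bool, trigger_level, info_enabled: bool):
    """Pick one outcome in the order: data interruption > trigger conditions > informational.

    trigger_level is the matched level name (for example "urgent"),
    or None when no trigger condition matched.
    """
    if data_interrupted:
        return "data_interruption"
    if trigger_level is not None:
        return trigger_level
    if info_enabled:
        return "informational"
    return None  # nothing configured applies, so no event is produced
```

For instance, `resolve_outcome(True, "urgent", True)` returns `"data_interruption"`, because data interruption takes precedence over a matched trigger condition.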

Other Configurations

For more details, refer to Rule Configuration.
