APM Metrics Detection¶
Monitors key APM Metrics data within the workspace. The system counts the traces that meet the conditions within the specified time period and triggers an Incident when the count exceeds the custom threshold.
Detection Configuration¶
Detection Frequency¶
The execution frequency of the detection rule.
Detection Interval¶
The time range for querying Metrics each time the task is executed. The available detection intervals vary depending on the detection frequency.
| Detection Frequency | Detection Interval (Dropdown Options) |
|---|---|
| 30s | 1m/5m/15m/30m/1h/3h |
| 1m | 1m/5m/15m/30m/1h/3h |
| 5m | 5m/15m/30m/1h/3h |
| 15m | 15m/30m/1h/3h/6h |
| 30m | 30m/1h/3h/6h |
| 1h | 1h/3h/6h/12h/24h |
| 6h | 6h/12h/24h |
| 12h | 12h/24h |
| 24h | 24h |
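The frequency-to-interval mapping above can be expressed as a simple lookup. The sketch below is illustrative only (the function name and data structure are assumptions, not part of the product API):

```python
# The table above as a lookup: detection frequency -> allowed detection intervals.
VALID_INTERVALS = {
    "30s": ["1m", "5m", "15m", "30m", "1h", "3h"],
    "1m":  ["1m", "5m", "15m", "30m", "1h", "3h"],
    "5m":  ["5m", "15m", "30m", "1h", "3h"],
    "15m": ["15m", "30m", "1h", "3h", "6h"],
    "30m": ["30m", "1h", "3h", "6h"],
    "1h":  ["1h", "3h", "6h", "12h", "24h"],
    "6h":  ["6h", "12h", "24h"],
    "12h": ["12h", "24h"],
    "24h": ["24h"],
}

def is_valid_interval(frequency: str, interval: str) -> bool:
    """Return True if the interval is offered for the given detection frequency."""
    return interval in VALID_INTERVALS.get(frequency, [])
```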
Detection Metrics¶
Set the Metrics to detect. You can configure the Metrics data of services within the workspace for a specified time range.
| Field | Description |
|---|---|
| Service | Monitor the APM services within the current workspace. |
| Metrics | Specific detection Metrics, including request count, error request count, request error rate, average requests per second, average response time, P50 response time, P75 response time, P90 response time, P99 response time, etc. |
| Filter Conditions | Filter detection data based on the tags of Metrics to limit the detection scope. Supports adding one or more tag filters, and also supports fuzzy matching and fuzzy non-matching filter conditions. |
| Detection Dimensions | Any string (keyword) field in the configured data can be selected as a detection dimension; up to three fields are currently supported. Combining multiple detection dimension fields determines a specific detection object. The system checks whether the statistical Metrics of each detection object meet the threshold in the trigger conditions; if so, an Incident is generated. For example, with detection dimensions host and host_ip, a detection object could be {host: host1, host_ip: 127.0.0.1}. |
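Conceptually, the detection dimensions group the queried data into detection objects, and each object is evaluated against the trigger conditions independently. A minimal sketch (field names and row layout are hypothetical, not the product's internal representation):

```python
from collections import defaultdict

def group_by_dimensions(rows, dimensions):
    """Group metric rows into detection objects keyed by the dimension values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple((d, row.get(d)) for d in dimensions)
        groups[key].append(row)
    return groups

rows = [
    {"host": "host1", "host_ip": "127.0.0.1", "avg_response_time": 120},
    {"host": "host1", "host_ip": "127.0.0.1", "avg_response_time": 300},
    {"host": "host2", "host_ip": "10.0.0.2",  "avg_response_time": 80},
]
# With dimensions host and host_ip, one detection object is
# {host: host1, host_ip: 127.0.0.1}, holding two rows.
objects = group_by_dimensions(rows, ["host", "host_ip"])
```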
Counts the number of traces that meet the conditions within the specified time period and triggers an Incident when the count exceeds the custom threshold. This can be used to notify about abnormal errors in service traces.
| Field | Description |
|---|---|
| Source | The data source of the current detection Metrics. |
| Filter Conditions | Filter trace spans by tags to limit the scope of the detection data. Supports adding one or more tag filter conditions. |
| Aggregation Algorithm | Defaults to “*”, which corresponds to the aggregation function count. If another field is selected, the aggregation function automatically changes to count distinct (the number of distinct values of that field). |
| Detection Dimensions | Any string (keyword) field in the configured data can be selected as a detection dimension; up to three fields are currently supported. Combining multiple detection dimension fields determines a specific detection object. The system checks whether the statistical Metrics of each detection object meet the threshold in the trigger conditions; if so, an Incident is generated. For example, with detection dimensions host and host_ip, a detection object could be {host: host1, host_ip: 127.0.0.1}. |
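The difference between the two aggregation modes in the table can be sketched as follows (the span dictionaries are illustrative test data, not a real span schema):

```python
def aggregate(spans, field="*"):
    """Count matching spans, or count distinct values of a chosen field."""
    if field == "*":
        return len(spans)                          # count
    values = {s[field] for s in spans if field in s}
    return len(values)                             # count distinct

spans = [
    {"service": "cart", "status": "error"},
    {"service": "cart", "status": "error"},
    {"service": "pay",  "status": "error"},
]
# aggregate(spans) counts all three spans; aggregate(spans, "service")
# counts the two distinct services, cart and pay.
```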
Trigger Conditions¶
Set the trigger conditions for each alert level: you can configure any one of emergency, important, warning, or normal.
Configure the trigger conditions and severity. When the query result has multiple values, an Incident is generated if any value meets the trigger conditions.
For more details, refer to Incident Level Description.
Continuous Trigger Judgment¶
If continuous trigger judgment is enabled, another Incident is generated only after the trigger conditions are met the configured number of times in a row. The maximum is 10 consecutive detections.
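The consecutive-detection rule above can be sketched like this (a simplified model, assuming one boolean result per detection run):

```python
def should_trigger(history, required_consecutive):
    """history: per-detection results, most recent last; True = condition met.

    Returns True only when the condition held for the last
    `required_consecutive` detections in a row (limit: 10).
    """
    if required_consecutive > 10:
        raise ValueError("continuous trigger limit is 10")
    if len(history) < required_consecutive:
        return False
    return all(history[-required_consecutive:])
```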
Bulk Alert Protection¶
Enabled by default.
When the number of alerts generated in a single detection exceeds the preset threshold, the system automatically switches to a status-summary strategy: instead of processing each alert object individually, it generates and pushes a small number of summary alerts based on Incident status.
This ensures the timeliness of notifications while significantly reducing alert noise, avoiding the risk of timeout due to processing too many alerts.
Note
When this switch is enabled, the Incident Details generated by subsequent monitor detections will not display historical records and related Incidents.
Alert Level¶
- Alert levels Emergency (red), Important (orange), Warning (yellow): judged based on the configured condition operators.
- Alert level Normal (green): judged based on the configured number of detections, explained as follows:
    - Each execution of a detection task counts as 1 detection. For example, if Detection Frequency = 5 minutes, then 1 detection = 5 minutes.
    - The number of detections can be customized. For example, if Detection Frequency = 5 minutes, then 3 detections = 15 minutes.
| Level | Description |
|---|---|
| Normal | After the detection rule takes effect, if the detection results return to normal within the configured number of detections after an emergency, important, or warning Incident is generated, a recovery alert Incident is generated. ❗️ Recovery alert Incidents are not subject to Alert Silence restrictions. If the number of detections for recovery alerts is not set, the alert Incident will not recover and will remain in the Incidents > Unrecovered Incidents list. |
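The recovery rule above can be illustrated with a small sketch (function names and the boolean-history model are assumptions for illustration):

```python
def recovery_window_minutes(frequency_minutes, detections):
    """E.g. Detection Frequency = 5 minutes and 3 detections -> 15 minutes."""
    return frequency_minutes * detections

def should_recover(recent_results, detections):
    """recent_results: per-detection results, most recent last; True = normal.

    A recovery alert Incident is generated once the results have been normal
    for the configured number of consecutive detections.
    """
    if len(recent_results) < detections:
        return False
    return all(recent_results[-detections:])
```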
Data Gap¶
Seven strategies can be configured for the data gap status.
- Link the detection interval time range, judge the query result of the detection Metrics over the most recent minutes, and do not trigger an Incident.
- Link the detection interval time range, judge the query result of the detection Metrics over the most recent minutes, and treat the query result as 0; the query result is then re-compared with the threshold configured in the Trigger Conditions above to determine whether to trigger an Incident.
- Custom-fill the detection interval value and trigger a data gap Incident, emergency Incident, important Incident, warning Incident, or recovery Incident. For this type of strategy, it is recommended that the custom data gap time be >= the detection interval. If the configured time is <= the detection interval, the data gap and abnormal conditions may both be met at the same time, in which case only the data gap result is applied.
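The routing between these strategies can be sketched as follows. This is a simplified model (the strategy names are paraphrased from the list above, and `None` stands in for a data-gap query result):

```python
def handle_result(result, strategy, threshold):
    """Route a query result through a data-gap strategy, then the threshold."""
    if result is None:                          # data gap detected
        if strategy == "ignore":
            return "no incident"                # strategy 1: do not trigger
        if strategy == "treat_as_zero":
            result = 0                          # strategy 2: compare 0 to threshold
        else:
            return "data-gap incident"          # custom-fill trigger strategies
    return "incident" if result > threshold else "no incident"
```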
Information Generation¶
Enable this option to generate "Information" Incidents for detection results that do not match the above trigger conditions.
Note
If trigger conditions, data gap, and information generation are configured simultaneously, the triggering is judged in the following priority: data gap > trigger conditions > information Incident generation.
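That evaluation priority can be summarized in a short sketch (a simplified model; `None` again stands for a data-gap result, and a single `>` threshold stands for the full set of trigger conditions):

```python
def evaluate(result, threshold, info_enabled=True):
    """Apply the stated priority: data gap > trigger conditions > information."""
    if result is None:
        return "data gap"
    if result > threshold:
        return "trigger condition"
    return "information" if info_enabled else "none"
```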
Other Configurations¶
For more details, refer to Rule Configuration.