RUM Anomaly Detection¶
RUM anomaly detection monitors user access metric data within the workspace. You can set threshold ranges, and the system automatically alerts when metrics exceed these thresholds.
Use Cases¶
Supports monitoring metric data for various types of applications including Web, Android, iOS, and Miniapp. For example, it can monitor the JS error rate for Web applications based on the city dimension.
Configuration¶
Detection Frequency¶
Refers to the execution frequency of the detection rule.
Detection Interval¶
Refers to the time range queried for the detection metrics. The available detection interval options depend on the selected detection frequency.
| Detection Frequency | Detection Interval (Dropdown Options) |
|---|---|
| 30s | 1m/5m/15m/30m/1h/3h |
| 1m | 1m/5m/15m/30m/1h/3h |
| 5m | 5m/15m/30m/1h/3h |
| 15m | 15m/30m/1h/3h/6h |
| 30m | 30m/1h/3h/6h |
| 1h | 1h/3h/6h/12h/24h |
| 6h | 6h/12h/24h |
| 12h | 12h/24h |
| 24h | 24h |
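The table above can be read as a lookup from detection frequency to the intervals offered in the dropdown. The following is a minimal sketch of that lookup; the names `INTERVAL_OPTIONS` and `is_valid_interval` are hypothetical and are not part of the product.

```python
# Allowed detection intervals per detection frequency, copied from the table above.
INTERVAL_OPTIONS = {
    "30s": ["1m", "5m", "15m", "30m", "1h", "3h"],
    "1m": ["1m", "5m", "15m", "30m", "1h", "3h"],
    "5m": ["5m", "15m", "30m", "1h", "3h"],
    "15m": ["15m", "30m", "1h", "3h", "6h"],
    "30m": ["30m", "1h", "3h", "6h"],
    "1h": ["1h", "3h", "6h", "12h", "24h"],
    "6h": ["6h", "12h", "24h"],
    "12h": ["12h", "24h"],
    "24h": ["24h"],
}


def is_valid_interval(frequency: str, interval: str) -> bool:
    """Return True if `interval` is offered for `frequency` (hypothetical helper)."""
    return interval in INTERVAL_OPTIONS.get(frequency, [])
```

For example, `is_valid_interval("5m", "15m")` is true, while `is_valid_interval("5m", "1m")` is false because a 1m interval is not offered for a 5m frequency.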
Detection Metrics¶
Configure the detection metrics. You can set up monitoring for metric data of applications under a single application type within the current workspace, supporting a defined time range. For example: metric data for all applications under the Web type in the current workspace.
| Field | Description |
|---|---|
| Application Type | Application types supported by RUM, including Web, Android, iOS, Miniapp |
| Application Name | Fetches the list of applications corresponding to the selected application type. |
| Metric | List of metrics, which depends on the selected application type. |
| Filter Conditions | Filters the scope of detection data based on metric tags. Supports adding multiple tags, with options for fuzzy matching or fuzzy non-matching. |
| Detection Dimensions | You can select up to three string-type (keyword) fields from the configured data as detection dimensions. The combination of the selected fields determines the unique detection object, and the system judges per detection object whether the metric reaches the threshold and triggers an event. For example: selecting detection dimensions host and host_ip means the detection object could be {host: host1, host_ip: 127.0.0.1}. |
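To illustrate how a combination of dimension fields identifies a detection object, here is a minimal sketch that groups metric values by the selected dimensions. The record layout (a dict with a `value` key) and the function name `group_by_dimensions` are assumptions made for illustration only.

```python
from collections import defaultdict

MAX_DIMENSIONS = 3  # the text allows up to three detection dimensions


def group_by_dimensions(records, dimensions):
    """Group metric values by detection object (hypothetical record layout)."""
    if len(dimensions) > MAX_DIMENSIONS:
        raise ValueError("at most three detection dimensions are allowed")
    groups = defaultdict(list)
    for record in records:
        # The detection object is the combination of all selected field values.
        key = tuple((dim, record.get(dim)) for dim in dimensions)
        groups[key].append(record["value"])
    return dict(groups)
```

Each key in the result corresponds to one detection object, such as `(("host", "host1"), ("host_ip", "127.0.0.1"))`, and a threshold check would then run once per key.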
Web / Miniapp Metric Description¶
| Metric | Query Example |
|---|---|
| JS Error Count | R::error:(count(`__docid`) as `JS Error Count`) { `app_id` = ' |
| JS Error Rate | Web: eval(A/B, alias='Page JS Error Rate', A="R::view:(count(`view_url`)) {`view_error_count` > 0, `app_id` = ' |
| Resource Error Count | R::resource:(count(`resource_url`) as `Resource Error Count`) {`resource_status` >=400, `app_id` = ' |
| Resource Error Rate | eval(A/B, alias='Resource Error Rate', A="R::`resource`:(count(`resource_url`)) { `resource_status` >= '400',`app_id` = ' |
| Average First Paint Time | R::page:(avg(page_fpt)){`app_id` = '#{appid}'} |
| Average Page Loading Time | R::view:(avg(loading_time)){`app_id` = '#{appid}'} |
| Slow Page Load Count | R::resource:(count(resource_load)){`app_id` = '#{appid}',`resource_load`>8000000000,resource_type='document'} |
| Average Resource Loading Time | R::resource:(avg(`resource_load`) as `Loading Time` ) {`app_id` = '#{appid}',resource_type!='document'} |
| LCP (largest_contentful_paint) | Includes aggregation functions: avg, P75, P90, P99<br>R::view:(avg(largest_contentful_paint)){`app_id` = '#{appid}'}<br>R::view:(percentile(`largest_contentful_paint`,75)){`app_id` = '#{appid}'}<br>R::view:(percentile(`largest_contentful_paint`,90)){`app_id` = '#{appid}'}<br>R::view:(percentile(`largest_contentful_paint`,99)){`app_id` = '#{appid}'} |
| FID (first_input_delay) | Includes aggregation functions: avg, P75, P90, P99<br>R::view:(avg(first_input_delay)){`app_id` = '#{appid}'}<br>R::view:(percentile(`first_input_delay`,75)){`app_id` = '#{appid}'}<br>R::view:(percentile(`first_input_delay`,90)){`app_id` = '#{appid}'}<br>R::view:(percentile(`first_input_delay`,99)){`app_id` = '#{appid}'} |
| CLS (cumulative_layout_shift) | Includes aggregation functions: avg, P75, P90, P99<br>R::view:(avg(cumulative_layout_shift)){`app_id` = '#{appid}'}<br>R::view:(percentile(`cumulative_layout_shift`,75)){`app_id` = '#{appid}'}<br>R::view:(percentile(`cumulative_layout_shift`,90)){`app_id` = '#{appid}'}<br>R::view:(percentile(`cumulative_layout_shift`,99)){`app_id` = '#{appid}'} |
| FCP (first_contentful_paint) | Includes aggregation functions: avg, P75, P90, P99<br>R::view:(avg(first_contentful_paint)){`app_id` = '#{appid}'}<br>R::view:(percentile(`first_contentful_paint`,75)){`app_id` = '#{appid}'}<br>R::view:(percentile(`first_contentful_paint`,90)){`app_id` = '#{appid}'}<br>R::view:(percentile(`first_contentful_paint`,99)){`app_id` = '#{appid}'} |
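The P75, P90, and P99 aggregations in the table are percentiles of the metric's sample values. As a rough illustration of what such a value means, the sketch below uses the nearest-rank method. This is an assumption: the platform's `percentile` function may use a different method (for example, interpolation) and return slightly different values. The function name `percentile_nearest_rank` is hypothetical.

```python
import math


def percentile_nearest_rank(values, p):
    """Return the p-th percentile of `values` using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    # Nearest rank: the smallest rank covering at least p percent of the samples.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]
```

With 100 samples holding the values 1 through 100, this sketch yields 75, 90, and 99 for P75, P90, and P99 respectively.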
Android / iOS Metric Description¶
| Metric | Query Example |
|---|---|
| Launch Time | R::action:(avg(duration)) { `app_id` = ' |
| Total Crash Count | R::error:(count(error_type)) {app_id=' |
| Total Crash Rate | eval(A.a1/B.b1, alias='Total Crash Rate',A="R::error:(count(error_type) as a1) {app_id=' |
| Resource Error Count | R::resource:(count(`resource_url`) as `Resource Error Count`) {`resource_status` >=400, `app_id` = ' |
| Resource Error Rate | eval(A/B, alias='Resource Error Rate', A="R::`resource`:(count(`resource_url`)) { `resource_status` >= '400',`app_id` = ' |
| Average FPS | R::view:(avg(`fps_avg`)) { `app_id` = ' |
| Average Page Loading Time | R::view:(avg(`loading_time`)) { `app_id` = ' |
| Average Resource Loading Time | R::resource:(avg(`duration`)) { `app_id` = ' |
| Stutter Count | R::long_task:(count(`view_id`)) { `app_id` = ' |
| Page Error Rate | eval(A/B, alias='Page Error Rate',A="R::view:(count(`view_name`)) {`view_error_count` > 0, `app_id` = ' |
Trigger Conditions¶
Configure trigger conditions and their severity levels. You can set a trigger condition for any of the Critical, Important, Warning, or Normal levels.
When the query result contains multiple values, an event is triggered if any value meets the trigger condition.
For more details, refer to Event Level Description.
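The "any value meets the condition" rule can be sketched as follows. The condition format (an operator symbol plus a threshold per severity) and the order in which severities are checked are assumptions for illustration; the names `evaluate` and `SEVERITY_ORDER` are hypothetical.

```python
import operator

# Severities are checked from most to least severe (this ordering is an assumption).
SEVERITY_ORDER = ["critical", "important", "warning"]

OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def evaluate(values, conditions):
    """Return the first severity whose condition is met by any value, else None.

    `conditions` maps a severity to (operator symbol, threshold),
    for example {"critical": (">=", 0.5)}.
    """
    for severity in SEVERITY_ORDER:
        if severity not in conditions:
            continue
        symbol, threshold = conditions[severity]
        compare = OPERATORS[symbol]
        # A single matching value is enough to trigger the event.
        if any(compare(value, threshold) for value in values):
            return severity
    return None
```

For instance, with a Critical threshold of `>= 0.5` and a Warning threshold of `>= 0.1`, the values `[0.01, 0.2]` would produce a Warning in this sketch because one value crosses the Warning threshold.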
Consecutive Trigger Judgment¶
If Consecutive Trigger Judgment is enabled, the system generates an event only after the trigger condition has been met a configured number of consecutive times. The maximum is 10 times.
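A minimal sketch of consecutive trigger judgment is a counter that resets whenever a detection does not meet the condition. The class name `ConsecutiveTrigger` is hypothetical, and whether the real system keeps firing on every later detection once the count is reached is not stated in this document; the sketch assumes it does.

```python
MAX_CONSECUTIVE = 10  # upper limit stated in the text


class ConsecutiveTrigger:
    """Track how many detections in a row met the trigger condition."""

    def __init__(self, required: int):
        if not 1 <= required <= MAX_CONSECUTIVE:
            raise ValueError("required must be between 1 and 10")
        self.required = required
        self.streak = 0

    def observe(self, condition_met: bool) -> bool:
        """Record one detection; return True if an event should be generated."""
        # Any detection that misses the condition resets the streak.
        self.streak = self.streak + 1 if condition_met else 0
        return self.streak >= self.required
```

With `required=3`, two matching detections followed by a non-matching one reset the streak, so an event is generated only after three further matching detections in a row.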
Bulk Alert Protection¶
Enabled by default.
When the number of alerts generated in a single detection exceeds a preset threshold, the system automatically switches to a status summary strategy: instead of processing each alert object individually, it generates a small number of summary alerts based on event status and pushes them.
This ensures the timeliness of notifications while significantly reducing alert noise and avoiding timeout risks caused by processing too many alerts.
Note
When this switch is on, the subsequent event details generated by the monitor after detecting anomalies will not display historical records and associated events.
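The switch from per-object alerts to status summaries might look like the sketch below. The threshold value, the alert representation as `(detection_object, status)` pairs, and the message wording are all assumptions; the document does not specify them.

```python
from collections import Counter


def build_notifications(alerts, threshold):
    """Return one message per alert, or per-status summaries above `threshold`.

    `alerts` is a list of (detection_object, status) pairs (hypothetical layout).
    """
    if len(alerts) <= threshold:
        # Few alerts: process each alert object individually.
        return [f"{obj}: {status}" for obj, status in alerts]
    # Many alerts: collapse them into one summary per event status.
    counts = Counter(status for _, status in alerts)
    return [
        f"{count} objects in status {status}"
        for status, count in sorted(counts.items())
    ]
```

The design intent described above is that the number of pushed notifications stays small and bounded by the number of statuses, regardless of how many detection objects triggered.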
Alert Level¶
- Alert Level Critical (red), Important (orange), Warning (yellow);
- Alert Level Normal (green): Based on the configured number of detections, explained as follows:
    - Each execution of a detection task counts as 1 detection. For example, if Detection Frequency = 5 minutes, then 1 detection = 5 minutes;
    - You can customize the number of detections. For example, if Detection Frequency = 5 minutes, then 3 detections = 15 minutes.

| Level | Description |
|---|---|
| Normal | After the detection rule takes effect, if a Critical, Important, or Warning abnormal event occurs and the data detection result returns to normal within the configured number of detections, a recovery alert event is generated.<br>❗️ Recovery alert events are not subject to Alert Silence restrictions. If the number of detections for recovery alert events is not set, the alert event will not recover and will remain in the Events > Unrecovered Events List. |
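The relationship between the number of detections and elapsed time is a simple multiplication, sketched here. The function name `recovery_window` is hypothetical.

```python
from datetime import timedelta


def recovery_window(detection_frequency: timedelta, detections: int) -> timedelta:
    """Return the time span covered by `detections` runs of the detection task."""
    if detections < 1:
        raise ValueError("detections must be at least 1")
    # Each execution of the detection task counts as one detection.
    return detection_frequency * detections
```

With a detection frequency of 5 minutes, `recovery_window(timedelta(minutes=5), 3)` gives 15 minutes, matching the example in the text.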
Data Gap¶
Seven strategies can be configured for data gap status.
- Linked with the detection interval time range: judge the query result of the detection metric for the most recent minutes and do not trigger an event.
- Linked with the detection interval time range: judge the query result of the detection metric for the most recent minutes and treat the query result as 0. The result is then compared again with the thresholds configured in Trigger Conditions above to determine whether to trigger an abnormal event.
- Custom detection interval value: trigger a data gap event, critical event, important event, warning event, or recovery event. For these strategies, the custom data gap time should be >= the detection interval time span. If the configured time <= the detection interval time span, the data gap and abnormal conditions may both be met at once, in which case only the data gap processing result is applied.
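The seven strategies can be modeled as an enumeration with one handler, as in the sketch below. The enum member names, the returned labels, and the `exceeds_threshold` callback are assumptions for illustration and do not reflect the product's internal names.

```python
from enum import Enum


class GapStrategy(Enum):
    NO_EVENT = "no_event"
    TREAT_AS_ZERO = "treat_as_zero"
    DATA_GAP_EVENT = "data_gap"
    CRITICAL_EVENT = "critical"
    IMPORTANT_EVENT = "important"
    WARNING_EVENT = "warning"
    RECOVERY_EVENT = "recovery"


def handle_gap(strategy, exceeds_threshold):
    """Return the event label for a data gap, or None if no event is triggered.

    `exceeds_threshold` is a callable that applies the configured
    trigger condition to a value (hypothetical callback).
    """
    if strategy is GapStrategy.NO_EVENT:
        return None
    if strategy is GapStrategy.TREAT_AS_ZERO:
        # The missing result is replaced by 0 and re-compared with the threshold.
        return "abnormal" if exceeds_threshold(0) else None
    # The remaining five strategies trigger the named event directly.
    return strategy.value
```

Note that with the treat-as-zero strategy, a condition such as "value < 1" would fire on a gap, whereas "value > 100" would not.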
Information Generation¶
Enable this option to generate "Information" events for detection results that do not match any of the above trigger conditions.
Note
If trigger conditions, data gap, and information generation are configured simultaneously, the triggering is judged according to the following priority: Data Gap > Trigger Conditions > Information Event Generation.
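The priority order in the note above can be expressed as a short chain of checks. This is a sketch only; the function name `resolve_outcome`, its parameters, and the returned labels are hypothetical.

```python
def resolve_outcome(data_gap, triggered_severity, information_enabled):
    """Apply the priority: Data Gap > Trigger Conditions > Information Event.

    `triggered_severity` is the severity matched by the trigger conditions,
    or None if no trigger condition matched.
    """
    if data_gap:
        return "data_gap"
    if triggered_severity is not None:
        return triggered_severity
    if information_enabled:
        # No trigger condition matched, so an Information event is generated.
        return "information"
    return None
```

For example, if a data gap is detected and a Critical condition is also met, this sketch returns only the data gap outcome.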
Other Configuration¶
For more details, refer to Rule Configuration.