RUM Metrics Anomaly Detection¶

Current Document Position

This document is the second step in the detection rule configuration process. After configuration, please return to the main document to continue with the third step: Event Notification.

RUM metrics detection monitors user access metric data within a workspace. It supports threshold ranges for performance metrics across application types such as Web, Android, iOS, Miniapp, React Native, and HarmonyOS, and automatically triggers alerts when a metric exceeds its threshold.

It applies to scenarios that require front-end application performance monitoring, for example: JS error rates for Web applications broken down by city, page load times for Miniapps, or crash rates for mobile applications.

Detection Configuration¶

Detection Frequency¶

Set the time period for executing the detection.

Preset options: 1 minute, 5 minutes, 10 minutes, 15 minutes, 30 minutes, 1 hour
Default selection: 5 minutes
Crontab mode: Click "Switch to Crontab Mode" to configure a custom period, supporting scheduled task execution based on seconds, minutes, hours, days, months, weeks, etc.

Detection Interval¶

Set the data time range queried for each detection (❗️The detection interval should be greater than or equal to the detection frequency, and should match the actual data reporting cycle to avoid missed detections or false positives).

Detection Frequency	Detection Interval (Dropdown Options)
30s	1m/5m/15m/30m/1h/3h
1m	1m/5m/15m/30m/1h/3h
5m	5m/15m/30m/1h/3h
15m	15m/30m/1h/3h/6h
30m	30m/1h/3h/6h
1h	1h/3h/6h/12h/24h
6h	6h/12h/24h
12h	12h/24h
24h	24h

Custom format: Enter a custom detection interval, e.g., 20m (last 20 minutes), 2h (last 2 hours), 1d (last 1 day).
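The relationship between detection frequency and detection interval can be checked with a small helper. The following is a minimal sketch, not part of the product: the function names `parse_duration` and `interval_covers_frequency` are hypothetical, and the unit mapping is inferred from the examples above (30s, 20m, 2h, 1d).

```python
import re

# Unit suffixes seen in the examples above; the mapping itself is an assumption.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text: str) -> int:
    """Convert a string such as '20m' or '2h' into a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


def interval_covers_frequency(frequency: str, interval: str) -> bool:
    """True when the detection interval is at least as long as the detection frequency."""
    return parse_duration(interval) >= parse_duration(frequency)
```

For instance, `interval_covers_frequency("5m", "20m")` holds, while a 1h frequency paired with a 30m interval would leave part of each hour unqueried.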

Detection Metrics¶

Set the metric data to detect. Metrics can be configured for applications under a single application type within the current workspace (❗️Avoid selecting high-cardinality fields as detection dimensions. Improper configuration, such as overly lenient trigger conditions, may cause frequent alerts. A query currently returns at most 100,000 records).

Configuration Elements¶

Configuration Item	Description
Application Type	Application types supported by RUM, including: Web, Android, iOS, Miniapp, HarmonyOS
Application Name	Based on the selected application type, retrieves the corresponding application list. Supports selecting all or specified applications.
Metric	Displays corresponding performance metrics based on the application type. See metric descriptions below for details.
Filter Conditions	Filters the detection metric data based on metric tags to limit the data scope. Supports adding one or multiple tag filters. Supports fuzzy match and fuzzy not-match filter conditions.
Detection Dimension	Any string-type (`keyword`) field in the configuration data can be selected as a detection dimension. Currently, a maximum of three fields can be selected as detection dimensions. By combining multiple detection dimension fields, a specific detection object can be determined. The system will judge whether the statistical metric corresponding to a detection object meets the threshold of the trigger condition. If the condition is met, an event is generated. (For example, selecting detection dimensions `host` and `host_ip`, the detection object could be `{host: host1, host_ip: 127.0.0.1}`.)
Additional Information	Additional fields are only used for extra queries and will not be used for trigger condition judgment. They can be configured into event notifications. If multiple matching values are detected, one record will be returned randomly.
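To illustrate how detection dimensions determine a detection object, here is a minimal sketch that groups records by the selected dimension fields. The function `group_by_dimensions` and the sample records are hypothetical; only the three-field limit and the `host` / `host_ip` example come from the table above.

```python
from collections import defaultdict

MAX_DIMENSIONS = 3  # limit stated in the table above


def group_by_dimensions(records, dimensions):
    """Group records into detection objects keyed by the chosen dimension values."""
    if len(dimensions) > MAX_DIMENSIONS:
        raise ValueError("at most three detection dimensions are supported")
    groups = defaultdict(list)
    for record in records:
        # One detection object per distinct combination of dimension values.
        key = tuple((dim, record.get(dim)) for dim in dimensions)
        groups[key].append(record)
    return dict(groups)


# Hypothetical sample data for illustration only.
records = [
    {"host": "host1", "host_ip": "127.0.0.1", "value": 3},
    {"host": "host1", "host_ip": "127.0.0.1", "value": 5},
    {"host": "host2", "host_ip": "10.0.0.2", "value": 1},
]
objects = group_by_dimensions(records, ["host", "host_ip"])
```

Each key in `objects` corresponds to one detection object, and the threshold is then judged per object. This also shows why high-cardinality fields are risky: every distinct value combination becomes its own object.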

Web / Miniapp Metric Description¶

Metric	DQL Query Example
JS Error Count	`R::error:(count(`__docid`) as `JS Error Count`) { app_id = '<Application ID>' }`
JS Error Rate	Web: `eval(A/B, alias='Page JS Error Rate', A="R::view:(count(view_url)) {view_error_count > 0, app_id = '<Application ID>'}", B="R::view:(count(view_url)) { app_id = '<Application ID>'}")` Miniapp: `eval(A/B, alias='JS Error Rate', A="R::view:(count(view_name)) {view_error_count > 0, app_id = '<Application ID>' }", B="R::view:(count(view_name)) { app_id = '<Application ID>' }")`
Resource Error Count	`R::resource:(count(resource_url) as `Resource Error Count`) {resource_status >=400, app_id = '<Application ID>'}`
Resource Error Rate	`eval(A/B, alias='Resource Error Rate', A="R::resource:(count(resource_url)) { resource_status >= '400',app_id = '<Application ID>' }", B="R::resource:(count(resource_url)) { app_id = '<Application ID>' }")`
Average First Paint Time	`R::page:(avg(page_fpt)){app_id = '<Application ID>'}`
Average Page Load Time	`R::view:(avg(loading_time)){app_id = '<Application ID>'}`
Page Slow Load Count	`R::resource:(count(resource_load)){app_id = '<Application ID>',resource_load>8000000000,resource_type='document'}`
Average Resource Load Time	`R::resource:(avg(resource_load) as `Load Time`) {app_id = '<Application ID>',resource_type!='document'}`
LCP (largest_contentful_paint)	Supported aggregation functions: avg, percentile `R::view:(avg(largest_contentful_paint)){app_id = '<Application ID>'}` `R::view:(percentile(largest_contentful_paint,75)){app_id = '<Application ID>'}` `R::view:(percentile(largest_contentful_paint,90)){app_id = '<Application ID>'}` `R::view:(percentile(largest_contentful_paint,99)){app_id = '<Application ID>'}`
FID (first_input_delay)	Supported aggregation functions: avg, percentile `R::view:(avg(first_input_delay)){app_id = '<Application ID>'}` `R::view:(percentile(first_input_delay,75)){app_id = '<Application ID>'}` `R::view:(percentile(first_input_delay,90)){app_id = '<Application ID>'}` `R::view:(percentile(first_input_delay,99)){app_id = '<Application ID>'}`
CLS (cumulative_layout_shift)	Supported aggregation functions: avg, percentile `R::view:(avg(cumulative_layout_shift)){app_id = '<Application ID>'}` `R::view:(percentile(cumulative_layout_shift,75)){app_id = '<Application ID>'}` `R::view:(percentile(cumulative_layout_shift,90)){app_id = '<Application ID>'}` `R::view:(percentile(cumulative_layout_shift,99)){app_id = '<Application ID>'}`
FCP (first_contentful_paint)	Supported aggregation functions: avg, percentile `R::view:(avg(first_contentful_paint)){app_id = '<Application ID>'}` `R::view:(percentile(first_contentful_paint,75)){app_id = '<Application ID>'}` `R::view:(percentile(first_contentful_paint,90)){app_id = '<Application ID>'}` `R::view:(percentile(first_contentful_paint,99)){app_id = '<Application ID>'}`
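The rate metrics above all follow the same `eval(A/B)` pattern: a filtered count divided by a total count. The sketch below mirrors that arithmetic for the JS error rate on in-memory data. It is an illustration only: the function `error_rate` is hypothetical, it counts every view rather than only views with a non-empty `view_url`, and returning `None` for an empty denominator is an assumption, not documented DQL behavior.

```python
def error_rate(views):
    """Share of views with at least one error, or None when there are no views."""
    total = len(views)  # plays the role of B: count of all views
    if total == 0:
        return None  # assumption: no data means no rate
    # Plays the role of A: count of views where view_error_count > 0.
    with_errors = sum(1 for view in views if view.get("view_error_count", 0) > 0)
    return with_errors / total
```

With four views of which two have `view_error_count > 0`, the function returns 0.5, which is the value that would be compared against the thresholds in the trigger conditions.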

Android / iOS Metric Description¶

Metric	DQL Query Example
Launch Time	`R::action:(avg(duration)) { app_id = '<Application ID>' ,action_type='app_cold_launch'}`
Total Crash Count	`R::error:(count(error_type)) {app_id='<Application ID>',error_source = 'logger' and is_web_view !='true'}`
Total Crash Rate	`eval(A.a1/B.b1, alias='Total Crash Rate',A="R::error:(count(error_type) as a1) {app_id='<Application ID>',error_source = 'logger',is_web_view !='true'} ",B="R::action:(count(action_name) as b1) { app_id = '<Application ID>',action_type in [`launch_cold`,`launch_hot`,`launch_warm`]} ")`
Resource Error Count	`R::resource:(count(resource_url) as `Resource Error Count`) {resource_status >=400, app_id = '<Application ID>'}`
Resource Error Rate	`eval(A/B, alias='Resource Error Rate', A="R::resource:(count(resource_url)) { resource_status >= '400',app_id = '<Application ID>' }", B="R::resource:(count(resource_url)) { app_id = '<Application ID>' }")`
Average FPS	`R::view:(avg(fps_avg)) { app_id = '<Application ID>' }`
Average Page Load Time	`R::view:(avg(loading_time)) { app_id = '<Application ID>' }`
Average Resource Load Time	`R::resource:(avg(duration)) { app_id = '<Application ID>' }`
Stutter Count	`R::long_task:(count(view_id)) { app_id = '<Application ID>' }`
Page Error Rate	`eval(A/B, alias='Page Error Rate',A="R::view:(count(view_name)) {view_error_count > 0, app_id = '<Application ID>' }",B="R::view:(count(view_name)) { app_id = '<Application ID>' }")`

Trigger Conditions¶

Configure trigger conditions and severity levels. When the query result contains multiple values, an event is generated if any value satisfies the trigger condition.

Supports configuring four-level thresholds: Fatal, Severe, Important, Warning, as well as a Normal recovery condition.

Level	Configuration	Description
Fatal	When Result >= `[Value]`	Highest level alert, requires immediate handling.
Severe	When Result >= `[Value]`	High-level alert, requires priority handling.
Important	When Result >= `[Value]`	Medium-level alert, requires attention.
Warning	When Result >= `[Value]`	Low-level alert, requires monitoring.
Normal	No events generated for `[N]` consecutive detections	Once the detection rule is in effect, a recovery alert event is triggered when the detection result changes from abnormal (Fatal, Severe, Important, Warning) to normal within the configured number of detections. ❗️ Recovery alert events are not restricted by Alert Mute. If the detection count for recovery is not set, the alert event never recovers and remains in the Events > Unrecovered Events List.

For more details, refer to Event Level Description.
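The threshold table can be read as a top-down comparison. The sketch below is a hypothetical illustration, not the product's implementation: it assumes every level uses the `>=` operator shown in the table and that the highest satisfied level is the one reported. The names `LEVELS` and `classify` and the threshold values are invented for the example.

```python
# Levels ordered from highest to lowest severity, as listed in the table above.
LEVELS = ["fatal", "severe", "important", "warning"]


def classify(result, thresholds):
    """Return the highest level whose threshold is met (result >= value), else None."""
    for level in LEVELS:
        value = thresholds.get(level)
        if value is not None and result >= value:
            return level
    return None  # no threshold met: the result counts as normal


# Hypothetical thresholds for a JS error rate; "important" is left unconfigured.
thresholds = {"fatal": 0.2, "severe": 0.1, "warning": 0.01}
```

With these thresholds, `classify(0.12, thresholds)` yields `"severe"` and `classify(0.001, thresholds)` yields `None`. When a query returns several values, the same check would be applied to each value.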

Advanced Options¶

Consecutive Trigger Judgment¶

When enabled, events are only generated when the trigger condition is continuously met, avoiding false positives from transient fluctuations (❗️Maximum configuration limit is 10 times).
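The idea behind consecutive trigger judgment can be sketched as a streak counter. This is a minimal illustration under assumptions: the class `ConsecutiveTrigger` is hypothetical, and the behavior after the required count is reached (here, it keeps reporting true while the streak continues) is not specified by the text above.

```python
class ConsecutiveTrigger:
    """Report true only after the condition has held for `required` detections in a row."""

    def __init__(self, required: int):
        # The upper bound of 10 comes from the limit stated above.
        if not 1 <= required <= 10:
            raise ValueError("required must be between 1 and 10")
        self.required = required
        self.streak = 0

    def observe(self, condition_met: bool) -> bool:
        # Any detection that misses the condition resets the streak.
        self.streak = self.streak + 1 if condition_met else 0
        return self.streak >= self.required
```

With `required=3`, the sequence met, met, not met, met, met, met only reports true on the last detection, so a single transient spike does not produce an event.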

Bulk Alert Protection¶

Enabled by default in the system.

When the number of alerts generated in a single detection exceeds a preset threshold, the system automatically switches to a status summary strategy: instead of processing each alert object individually, it generates and pushes a small number of summary alerts grouped by event status.

This ensures notification timeliness while significantly reducing alert noise and avoiding timeout risks from processing too many alerts.
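The switch from per-object alerts to status summaries can be sketched as follows. This is a hypothetical illustration: the function `build_notifications`, the alert dictionary shape, and the `bulk_threshold` parameter are assumptions, since the actual preset threshold and summary format are not given here.

```python
from collections import Counter


def build_notifications(alerts, bulk_threshold):
    """Return one notification per alert, or one summary per status above the threshold."""
    if len(alerts) <= bulk_threshold:
        # Normal path: each alert object is processed individually.
        return [{"status": a["status"], "objects": [a["object"]]} for a in alerts]
    # Bulk path: collapse alerts into one summary per event status.
    counts = Counter(a["status"] for a in alerts)
    return [{"status": status, "count": count} for status, count in sorted(counts.items())]
```

Six alerts split evenly between two statuses produce six notifications with a threshold of 10, but only two summaries with a threshold of 5.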

When this switch is enabled, Event Details for events that the monitor generates after detecting anomalies will not display historical records or associated events.

Data Gap¶

Processing strategy when the detection metric query result is empty within the detection interval:

Option	Description
Do Not Trigger Event (Default)	Linked to the detection interval time range. Determines whether to generate an event based on the query results of the detection metric in the last several minutes. Suitable for scenarios where data gaps are allowed.
Treat Query Result as 0	Linked to the detection interval time range. Treats the query results of the detection metric in the last several minutes as 0 and compares them again with the thresholds configured in the Trigger Conditions above to determine whether to trigger an anomaly event.
Custom Fill and Trigger Event	Supports custom filling of the detection interval value, and triggers the following event types respectively: Data Gap Event, Severe Event, Important Event, Warning Event, and Recovery Event. ❗️When choosing this strategy, it is recommended to configure the custom data gap time ≥ the detection interval time. If the configured time ≤ the detection interval time, situations where both data gap and anomaly conditions are met may occur. In such cases, the data gap processing result will be applied first.

When Trigger Conditions, Data Gap, and Information Generation are configured simultaneously, the triggering is judged according to the following priority: Data Gap > Trigger Conditions > Information Event Generation.

That is: first judge whether there is a data gap, then judge whether the threshold is triggered, and finally judge whether to generate an information event.
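The priority order can be expressed as a short decision function. This is a sketch under assumptions: the function `decide` and its return labels are hypothetical, and it only shows which check wins, not what event (if any) the chosen data gap strategy produces.

```python
def decide(has_data, threshold_level, info_enabled):
    """Return which check determines the outcome, following the priority above."""
    if not has_data:
        return "data_gap"  # 1. data gap handling takes precedence
    if threshold_level is not None:
        return threshold_level  # 2. then the matched trigger condition level
    if info_enabled:
        return "info"  # 3. finally, information event generation
    return None  # nothing to generate
```

For example, `decide(False, "severe", True)` returns `"data_gap"`, because the data gap check is evaluated before the threshold result is considered.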

Information Generation¶

When this option is enabled, the system writes all detection results that do not match the above trigger conditions as "Information" events.

Suitable for scenarios requiring recording normal status changes or low-priority information.

Subsequent Configuration¶

After completing the above detection configuration, please continue to configure:

Event Notification: Define event title, content, notification members, data gap handling, and associated incidents;
Alert Configuration: Select alert strategies, set notification targets, and mute periods;
Association: Associate dashboards for quick jump to view data;
Permissions: Set operation permissions to control who can edit/delete this monitor.