User Access Metrics Monitoring¶
Monitors metric data related to user access within the workspace. You can set threshold ranges, and the system automatically triggers an alert when a metric exceeds them. You can also configure alert rules for individual metrics and customize the severity level of alerts.
Use Cases¶
Supports monitoring metric data for Web, Android, iOS, and Miniapp applications. For example, you can monitor the JS error rate of Web applications broken down by city.
Monitoring Configuration¶
Monitoring Frequency¶
The execution frequency of the monitoring rule. The default is 5 minutes.
Monitoring Interval¶
The time range queried for the monitored metrics. The selectable monitoring intervals depend on the monitoring frequency.
Monitoring Frequency | Monitoring Interval (Dropdown Options) |
---|---|
1m | 1m/5m/15m/30m/1h/3h |
5m | 5m/15m/30m/1h/3h |
15m | 15m/30m/1h/3h/6h |
30m | 30m/1h/3h/6h |
1h | 1h/3h/6h/12h/24h |
6h | 6h/12h/24h |
12h | 12h/24h |
24h | 24h |
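
The frequency-to-interval table above can be expressed as a simple lookup. The following is a minimal sketch, not Guance's implementation; the names `ALLOWED_INTERVALS` and `is_valid_interval` are hypothetical and exist only for illustration.

```python
# Selectable monitoring intervals per monitoring frequency,
# copied from the table above.
ALLOWED_INTERVALS = {
    "1m": ["1m", "5m", "15m", "30m", "1h", "3h"],
    "5m": ["5m", "15m", "30m", "1h", "3h"],
    "15m": ["15m", "30m", "1h", "3h", "6h"],
    "30m": ["30m", "1h", "3h", "6h"],
    "1h": ["1h", "3h", "6h", "12h", "24h"],
    "6h": ["6h", "12h", "24h"],
    "12h": ["12h", "24h"],
    "24h": ["24h"],
}


def is_valid_interval(frequency: str, interval: str) -> bool:
    """Return True if `interval` is selectable for `frequency`.

    Hypothetical helper: unknown frequencies are treated as having
    no selectable intervals.
    """
    return interval in ALLOWED_INTERVALS.get(frequency, [])
```

For example, `is_valid_interval("5m", "15m")` holds, while `is_valid_interval("5m", "1m")` does not, because the interval is never shorter than the frequency in the table.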
Monitored Metrics¶
Set the metrics to monitor. You can monitor the metric data of all applications, or of a single application, under one application type in the current workspace over a given time range. For example: the metric data of all Web applications in the current workspace.
Field | Description |
---|---|
Application Type | Supported application types for RUM, including Web, Android, iOS, Miniapp |
Application Name | Retrieves the application list for the selected application type; you can select all applications or a single one |
Metric | Retrieves the metric list based on the application type. Web/Miniapp: JS error count, JS error rate, resource error count, resource error rate, average first render time, average page load time, LCP (largest_contentful_paint), FID (first_input_delay), CLS (cumulative_layout_shift), FCP (first_contentful_paint), etc. Android/iOS: startup duration, total crash count, total crash rate, resource error count, resource error rate, FPS, average page load time, etc. |
Filter Conditions | Filters the monitored metric data by the tags associated with the metrics, limiting the scope of monitored data. One or more tag filters can be added, using fuzzy or non-fuzzy matching. |
Monitoring Dimensions | Any string-type (keyword) field in the configuration can be selected as a monitoring dimension. Currently, up to three fields can be selected. Combining multiple dimension fields identifies a specific monitoring object, and Guance checks whether the statistical metric for that object meets the threshold conditions to trigger an event. For example, with monitoring dimensions host and host_ip, a monitoring object could be {host: host1, host_ip: 127.0.0.1}. |
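
To illustrate how combined monitoring dimensions identify a monitoring object, here is a small sketch that groups metric records by the selected dimension fields. It is an illustration under assumptions, not Guance's internal logic: the record layout (a `value` key plus tag fields) and the function name `group_by_dimensions` are hypothetical.

```python
from collections import defaultdict

# The text above states that up to three fields can be selected.
MAX_DIMENSIONS = 3


def group_by_dimensions(records, dimensions):
    """Group metric values by the combination of dimension fields.

    Each distinct combination of dimension values corresponds to one
    monitoring object. `records` is a list of dicts that carry a
    "value" key plus tag fields (an assumed layout).
    """
    if len(dimensions) > MAX_DIMENSIONS:
        raise ValueError("at most three monitoring dimensions can be selected")
    groups = defaultdict(list)
    for record in records:
        # The key plays the role of {host: host1, host_ip: 127.0.0.1}.
        key = tuple((dim, record.get(dim)) for dim in dimensions)
        groups[key].append(record["value"])
    return dict(groups)
```

With dimensions `["host", "host_ip"]`, records from `host1` at `127.0.0.1` end up in one group, and the threshold check would then be applied to each group separately.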
Web / Miniapp Metric Description¶
Metric | Query Example |
---|---|
JS Error Count | R::error:(count(`__docid`) as `JS Error Count`) { `app_id` = ' |
JS Error Rate | Web: eval(A/B, alias='Page JS Error Rate', A="R::view:(count(`view_url`)) {`view_error_count` > 0, `app_id` = ' |
Resource Error Count | R::resource:(count(`resource_url`) as `Resource Error Count`) {`resource_status` >=400, `app_id` = ' |
Resource Error Rate | eval(A/B, alias='Resource Error Rate', A="R::`resource`:(count(`resource_url`)) { `resource_status` >= '400',`app_id` = ' |
Average First Render Time | R::page:(avg(page_fpt)){`app_id` = '#{appid}'} |
Average Page Load Time | R::view:(avg(loading_time)){`app_id` = '#{appid}'} |
Slow Page Load Count | R::resource:(count(resource_load)){`app_id` = '#{appid}',`resource_load`>8000000000,resource_type='document'} |
Average Resource Load Time | R::resource:(avg(`resource_load`) as `Load Time` ) {`app_id` = '#{appid}',resource_type!='document'} |
LCP (largest_contentful_paint) | Includes aggregate functions: avg, P75, P90, P99 R::view:(avg(largest_contentful_paint)){`app_id` = '#{appid}'} R::view:(percentile(`largest_contentful_paint`,75)){`app_id` = '#{appid}'} R::view:(percentile(`largest_contentful_paint`,90)){`app_id` = '#{appid}'} R::view:(percentile(`largest_contentful_paint`,99)){`app_id` = '#{appid}'} |
FID (first_input_delay) | Includes aggregate functions: avg, P75, P90, P99 R::view:(avg(first_input_delay)){`app_id` = '#{appid}'} R::view:(percentile(`first_input_delay`,75)){`app_id` = '#{appid}'} R::view:(percentile(`first_input_delay`,90)){`app_id` = '#{appid}'} R::view:(percentile(`first_input_delay`,99)){`app_id` = '#{appid}'} |
CLS (cumulative_layout_shift) | Includes aggregate functions: avg, P75, P90, P99 R::view:(avg(cumulative_layout_shift)){`app_id` = '#{appid}'} R::view:(percentile(`cumulative_layout_shift`,75)){`app_id` = '#{appid}'} R::view:(percentile(`cumulative_layout_shift`,90)){`app_id` = '#{appid}'} R::view:(percentile(`cumulative_layout_shift`,99)){`app_id` = '#{appid}'} |
FCP (first_contentful_paint) | Includes aggregate functions: avg, P75, P90, P99 R::view:(avg(first_contentful_paint)){`app_id` = '#{appid}'} R::view:(percentile(`first_contentful_paint`,75)){`app_id` = '#{appid}'} R::view:(percentile(`first_contentful_paint`,90)){`app_id` = '#{appid}'} R::view:(percentile(`first_contentful_paint`,99)){`app_id` = '#{appid}'} |
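
The avg, P75, P90, and P99 aggregations listed for LCP, FID, CLS, and FCP can be illustrated on a small sample. The sketch below uses the nearest-rank percentile definition, which is an assumption: this page does not state which percentile method the `percentile` function uses, so results on real data may differ. The function names and sample values are hypothetical.

```python
import math


def average(samples):
    """Arithmetic mean of the samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(samples) / len(samples)


def nearest_rank_percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of the samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical largest_contentful_paint samples (units are illustrative).
lcp_samples = [1200, 1800, 2500, 3100, 4000]
lcp_avg = average(lcp_samples)
lcp_p75 = nearest_rank_percentile(lcp_samples, 75)
```

Percentiles are less sensitive than the average to a few very slow page views, which is one reason to monitor P75 or P90 alongside avg.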
Android / iOS Metric Description
Metric | Query Example |
---|---|
Startup Duration | R::action:(avg(duration)) { `app_id` = ' |
Total Crash Count | R::error:(count(error_type)) {app_id=' |
Total Crash Rate | eval(A.a1/B.b1, alias='Total Crash Rate',A="R::error:(count(error_type) as a1) {app_id=' |
Resource Error Count | R::resource:(count(`resource_url`) as `Resource Error Count`) {`resource_status` >=400, `app_id` = ' |
Resource Error Rate | eval(A/B, alias='Resource Error Rate', A="R::`resource`:(count(`resource_url`)) { `resource_status` >= '400',`app_id` = ' |
Average FPS | R::view:(avg(`fps_avg`)) { `app_id` = ' |
Average Page Load Time | R::view:(avg(`loading_time`)) { `app_id` = ' |
Average Resource Load Time | R::resource:(avg(`duration`)) { `app_id` = ' |
Stutter Count | R::long_task:(count(`view_id`)) { `app_id` = ' |
Page Error Rate | eval(A/B, alias='Page Error Rate',A="R::view:(count(`view_name`)) {`view_error_count` > 0, `app_id` = ' |
Trigger Conditions¶
Set the trigger conditions for alert levels: you can configure a trigger condition for any of the emergency, important, warning, and normal levels.
Each trigger condition is paired with a severity level. When the query result contains multiple values, an event is generated if any value meets the trigger condition.
For more details, refer to Event Level Description.
If Continuous Trigger Judgment is enabled, you can require the trigger condition to be met in multiple consecutive judgments before it takes effect. The maximum is 10 consecutive judgments.
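
The two rules above (any value in the query result can trigger an event, and continuous judgment requires several consecutive matches) can be sketched as follows. This is an illustration under assumptions, not the product's implementation: the `>=` comparison stands in for whichever operator is configured, the most-severe-first ordering when several levels match is assumed, and all names are hypothetical.

```python
# Assumed evaluation order: most severe level first.
LEVELS = ["emergency", "important", "warning"]
MAX_CONSECUTIVE = 10  # maximum stated in the text above


def evaluate(values, thresholds):
    """Return the first level whose condition is met by any value, else None.

    `thresholds` maps a level name to a numeric threshold; `>=` is a
    placeholder for the configured operator.
    """
    for level in LEVELS:
        if level in thresholds and any(v >= thresholds[level] for v in values):
            return level
    return None


def should_fire(history, required):
    """Return True if the last `required` judgments all met the condition.

    `history` is a list of booleans, oldest first, one per judgment.
    """
    if not 1 <= required <= MAX_CONSECUTIVE:
        raise ValueError("required must be between 1 and 10")
    return len(history) >= required and all(history[-required:])
```

For example, with thresholds of 90, 70, and 50 for emergency, important, and warning, a query result of `[10, 55]` evaluates to `"warning"` because one of its values meets that condition.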
Alert Levels
- Alert Levels Emergency (Red), Important (Orange), Warning (Yellow): Based on the judgment operator configured in the conditions.
  For more details, refer to Operator Description.
- Alert Level Normal (Green): Based on the configured number of detections, as follows:
  - Each execution of a detection task counts as 1 detection. For example, if Detection Frequency = 5 minutes, then 1 detection = 5 minutes.
  - You can customize the number of detections. For example, if Detection Frequency = 5 minutes, then 3 detections = 15 minutes.
Level | Description |
---|---|
Normal | After the detection rule takes effect, if an emergency, important, or warning abnormal event occurs and the detection results return to normal within the configured number of detections, a recovery alert event is generated. |
Recovery alert events are not subject to Alert Mute restrictions. If the number of detections for recovery alert events is not set, the alert event will not recover and will always appear in the Events > Unrecovered Events List.
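
The relationship between detection count and elapsed time, and one possible reading of the recovery rule, can be sketched as below. The recovery logic is an assumption: it treats an alert as recovered once the most recent configured number of detections are all normal after an earlier abnormal one, which this page does not spell out precisely. Function names are hypothetical.

```python
def recovery_window_minutes(detection_frequency_minutes, detections):
    """Time covered by `detections` runs, e.g. 3 detections at 5 minutes = 15."""
    return detection_frequency_minutes * detections


def has_recovered(abnormal_flags, detections):
    """Assumed recovery rule: an abnormal detection happened earlier, and the
    last `detections` detections were all normal.

    `abnormal_flags` is a list of booleans, oldest first, where True means
    the detection produced an emergency, important, or warning result.
    """
    if detections < 1:
        raise ValueError("detections must be at least 1")
    if len(abnormal_flags) <= detections:
        return False
    recent = abnormal_flags[-detections:]
    earlier = abnormal_flags[:-detections]
    return any(earlier) and not any(recent)
```

With a detection frequency of 5 minutes and 3 detections, `recovery_window_minutes(5, 3)` gives 15, matching the example in the list above.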
Data Disruption¶
For data disruption status, seven strategies can be configured.
- Linked to the detection interval time range: judge the query results of the most recent minutes for the monitored metrics and do not trigger events.
- Linked to the detection interval time range: judge the query results of the most recent minutes for the monitored metrics and treat the query results as 0. The results are then compared again with the thresholds configured in the Trigger Conditions above to determine whether to trigger an abnormal event.
- Customize the fill-in value for the detection interval: trigger data disruption events, emergency events, important events, warning events, or recovery events. For this type of strategy, it is recommended that the custom data disruption time be >= the detection interval time. If the configured time is <= the detection interval time, data disruption and an abnormal condition may both be met at the same time, in which case only the data disruption result is applied.
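
The first two strategies can be sketched as a small function that decides what happens when a query returns no data. This is a simplified illustration, not the product's behavior: the strategy names and the `condition` callable (standing in for the configured trigger condition) are hypothetical, and the custom strategies are not modeled.

```python
def handle_empty_result(strategy, condition):
    """Decide the outcome when the monitored query returns no data.

    `strategy` is "no_event" or "treat_as_zero" (hypothetical names).
    `condition` is a callable that returns True when a value meets the
    configured trigger condition.
    """
    if strategy == "no_event":
        # Strategy 1: data disruption never triggers an event.
        return None
    if strategy == "treat_as_zero":
        # Strategy 2: fill the gap with 0 and re-check the trigger condition.
        return "abnormal" if condition(0) else None
    raise ValueError(f"unsupported strategy: {strategy}")
```

Treating missing data as 0 matters mostly for conditions that fire on low values: a condition such as `lambda v: v < 1` would trigger on a gap, while `lambda v: v >= 5` would not.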
Information Generation¶
When this option is enabled, detection results that do not match any of the above trigger conditions generate "information" events and are written into the logs.
Note
If trigger conditions, data disruption, and information generation are configured simultaneously, the following priority applies: Data Disruption > Trigger Conditions > Information Event Generation.
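
The priority order in the note can be read as a chain of checks. The sketch below is one way to express it and assumes simplified inputs; the function name, parameters, and returned labels are hypothetical.

```python
def resolve_outcome(has_data, matched_level,
                    data_disruption_configured, information_enabled):
    """Apply the priority: Data Disruption > Trigger Conditions > Information.

    `matched_level` is the level whose trigger condition was met
    (for example "warning"), or None if no condition matched.
    """
    if not has_data and data_disruption_configured:
        return "data_disruption"
    if matched_level is not None:
        return matched_level
    if information_enabled:
        return "information"
    return None
```

For example, when no data arrives and a data disruption strategy is configured, the outcome is `"data_disruption"` even if information generation is also enabled.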
Other Configurations¶
For more details, refer to Rule Configuration.