Detection Rules¶
Guance supports over a dozen monitoring detection rules, covering different data ranges.
Rule Types¶
Rule Name | Data Range | Basic Description |
---|---|---|
Threshold Detection | All | Abnormal detection of metrics data based on set thresholds. |
Mutation Detection | Metrics(M) | Detects sudden abnormal changes in metrics by comparing them against historical data; often suited to business data and short time windows. |
Interval Detection | Metrics(M) | Detection of abnormal data points in metrics based on dynamic threshold ranges, often applicable to stable trend timelines. |
Interval Detection V2 | Metrics(M) | Detection of abnormal data points in metrics based on dynamic threshold ranges, often applicable to stable trend timelines. |
Outlier Detection | Metrics(M) | Detects whether there are outlier deviations in the metrics/statistics of the detected object under specific groupings. |
Log Detection | Logs(L) | Abnormal detection of business applications based on log data. |
Process Anomaly Detection | Process Objects(O::host_processes) | Periodically detects process data to understand process anomalies. |
Infrastructure Survival Detection V2 | Objects(O) | Sets survival conditions based on infrastructure object data to monitor the stability of infrastructure. |
Application Performance Metric Detection | Traces(T) | Sets threshold rules based on application performance monitoring data to detect abnormal situations. |
User Access Metric Detection | User Access Data(R) | Sets threshold rules based on user access monitoring data to detect abnormal situations. |
Composite Detection | All | Combines multiple monitors' results into one monitor through an expression, triggering alarms based on the combined result. |
Security Check Anomaly Detection | Security Checks(S) | Abnormal detection based on data generated by security checks, effectively sensing the health status of hosts. |
Usability Data Detection | Usability Data(L::type) | Sets threshold rules based on usability monitoring data to detect abnormal situations. |
Network Data Detection | Network(N) | Sets threshold rules based on network data to detect the stability of network performance. |
External Event Detection | Others | By specifying a URL address, sends abnormal events or records generated by third-party systems to an HTTP server via POST requests, generating Guance event data. |
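As a rough illustration of the External Event Detection row, the sketch below builds the kind of HTTP POST request a third-party system might send. The endpoint URL and the payload fields (`title`, `message`, `status`) are hypothetical placeholders, not the documented Guance schema; use the URL and field names your external event monitor actually specifies.

```python
import json
import urllib.request

# Hypothetical endpoint; replace with the URL configured for your monitor.
EVENT_URL = "https://example.invalid/external-event"


def build_event_request(title: str, message: str, status: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one third-party event."""
    # Field names are illustrative assumptions, not a documented schema.
    payload = {"title": title, "message": message, "status": status}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        EVENT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_event_request("Disk almost full", "Usage at 95% on host-01", "critical")
# Passing req to urllib.request.urlopen() would actually send it.
```

Building the request separately from sending it makes the payload easy to inspect before it leaves the third-party system.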
Rule Configuration¶
Detection Configuration¶
Set the detection frequency, detection interval, detection metrics, and other parameters appropriate to each detection rule.
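To make the relationship between these settings concrete, here is a minimal sketch of a threshold check. It assumes that the detection interval is the look-back window evaluated on each run, and that the detection frequency is how often the check runs. The function name, the averaging aggregation, and the sample data are all illustrative assumptions, not Guance's implementation.

```python
from statistics import mean


def check_threshold(points, now, interval_s, threshold):
    """Return True when the average value inside the detection interval exceeds threshold.

    points: list of (unix_timestamp, value) pairs.
    now: time of this detection run, in seconds.
    interval_s: length of the look-back window, in seconds.
    """
    window = [value for ts, value in points if now - interval_s < ts <= now]
    if not window:
        # No data in the window; a real monitor might treat this as a data gap.
        return False
    return mean(window) > threshold


cpu = [(0, 10), (60, 50), (120, 95), (180, 97)]
short_window = check_threshold(cpu, now=180, interval_s=120, threshold=90)
long_window = check_threshold(cpu, now=180, interval_s=300, threshold=90)
```

With the same data, the 120-second window triggers while the 300-second window does not, which shows how the choice of interval changes the sensitivity of a rule.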
Event Notifications¶
Event Title¶
Defines the name of the event generated when the alert trigger condition is met; preset template variables can be used.
Note
In the latest version, the monitor name is generated automatically once the event title is entered. In older monitors, the monitor name and the event title may be inconsistent, so it is recommended to synchronize them.
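The following sketch shows how template variables in an event title could be expanded. It assumes a `{{ name }}` placeholder syntax, in line with the `{% set ... %}` template syntax used for queries in the advanced settings. The variable names `status` and `host` are hypothetical; consult the preset variable list for the names that actually exist.

```python
import re

# Matches placeholders such as {{ host }} or {{host}}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_title(template: str, variables: dict) -> str:
    """Replace each {{ name }} with its value; unknown names are left untouched."""

    def substitute(match):
        name = match.group(1)
        return str(variables[name]) if name in variables else match.group(0)

    return _PLACEHOLDER.sub(substitute, template)


title = render_title(
    "[{{ status }}] High CPU on {{ host }}",
    {"status": "critical", "host": "web-01"},
)
```

Leaving unknown placeholders untouched is a deliberate choice in this sketch: a misspelled variable then stays visible in the rendered title instead of silently disappearing.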
Event Content¶
Write the event notification content; when the trigger condition is met, the system will send this content externally. It generally includes the following information:
- Body text in Markdown format;
- Associated links and template variables;
- Associated logs or error messages, added through advanced settings;
- The target members to notify with the event content.
Note
The @ member configuration only takes effect, and the event content is only sent to the specified members, when Correlated Incident Tracking is enabled.
Associated Links¶
Monitors automatically generate jump links based on the detection metrics in the detection configuration. After inserting a link, you can adjust its filtering conditions and time range. A generated link normally starts with a fixed address prefix that contains the current domain name and workspace ID; you can also write a custom jump link.
To insert a link that jumps to a dashboard, you additionally need to supply the dashboard's ID and name, and adjust the view variables and time range as needed.
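The structure described above (a fixed prefix with the domain and workspace ID, plus the dashboard ID, name, and time range) can be sketched as follows. The path and every query parameter name here are hypothetical and chosen only for illustration; copy the real format from a link generated by the monitor.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_dashboard_link(domain, workspace_id, dashboard_id, dashboard_name, start_ms, end_ms):
    """Compose a dashboard jump link from a fixed prefix and per-dashboard parameters."""
    # Hypothetical path; the real prefix comes from the generated link.
    prefix = f"https://{domain}/dashboard"
    query = urlencode(
        {
            "workspace": workspace_id,       # hypothetical parameter name
            "dashboard_id": dashboard_id,    # hypothetical parameter name
            "name": dashboard_name,          # hypothetical parameter name
            "time": f"{start_ms},{end_ms}",  # hypothetical time-range encoding
        }
    )
    return f"{prefix}?{query}"


link = build_dashboard_link(
    "console.example.invalid", "wksp_123", "dsbd_456", "CPU Overview", 1700000000000, 1700003600000
)
params = parse_qs(urlsplit(link).query)
```

Using `urlencode` ensures that dashboard names containing spaces or other special characters are escaped correctly.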
Custom Advanced Settings¶
Through advanced settings, you can add associated logs or error stacks to the event content to view contextual data when an anomaly occurs.
- Adding Associated Logs:
Query:
For example, to get the `message` of a log whose index is `default`:
Associated Logs:
- Adding Associated Error Stacks
Query:
{% set dql_data = DQL("T::re(`.*`):(`error_message`,`error_stack`){ (`source` NOT IN ['service_map', 'tracing_stat', 'service_list_1m', 'service_list_1d', 'service_list_1h', 'profile']) AND (`error_stack` = exists()) } LIMIT 1") %}
Associated Error Stack:
Custom Notification Content¶
By default, the system uses event content as the alarm notification content. If you need to customize the actual external notification sent, you can enable the switch here and fill in the notification information.
Note
Different alarm notification targets support different Markdown syntax. For example, WeCom does not support unordered lists.
Data Gap Events¶
Customize the notification content for data gaps. You can configure the title and content of this type of event that will be sent externally.
If not configured here, the official default notification template will be used when sending externally.
Correlated Incident Tracking¶
After enabling correlation, if an anomaly event occurs under this monitor, an Issue will be created synchronously. You can choose to create Issues synchronously for different event levels.
- Select event level;
- Define the final level of the generated Issue;
- Select the responsible person for this type of Issue;
- Choose the delivery channel;
- Optionally, close the Issue synchronously after the event recovers.
Issues generated here can be viewed at Incident Tracking > Selected Channel.
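The options listed above can be modeled roughly as follows. This is only a sketch of the described behavior; the class, field names, and Issue dictionary layout are assumptions made for illustration and do not reflect Guance's internal data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IssueRule:
    event_levels: set                # event levels that should create an Issue
    issue_level: str                 # final level assigned to the generated Issue
    assignee: str                    # person responsible for this type of Issue
    channel: str                     # delivery channel
    close_on_recovery: bool = False  # close the Issue when the event recovers


def issue_for_event(rule: IssueRule, event_level: str) -> Optional[dict]:
    """Create an Issue only for event levels selected in the rule."""
    if event_level not in rule.event_levels:
        return None
    return {
        "level": rule.issue_level,
        "assignee": rule.assignee,
        "channel": rule.channel,
        "status": "open",
    }


def on_recovery(rule: IssueRule, issue: dict) -> dict:
    """Return the Issue state after the triggering event recovers."""
    if rule.close_on_recovery:
        return {**issue, "status": "closed"}
    return issue


rule = IssueRule({"critical"}, "P0", "alice", "ops-channel", close_on_recovery=True)
issue = issue_for_event(rule, "critical")
ignored = issue_for_event(rule, "warning")
closed = on_recovery(rule, issue)
```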
Alert Configuration¶
Once the monitor's trigger conditions are met, alert messages are sent immediately to the specified notification targets. An alert strategy defines the event levels to notify on, the notification targets, and the alert silence period.
You can select one or more alert strategies; clicking a strategy name expands its details page. To modify a strategy, click Edit Alert Strategy.
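A minimal sketch of how an alert strategy could combine these three settings is shown below. It assumes that silence is tracked per event level and that a repeat notification is suppressed until the silence period has elapsed; the actual silence semantics in Guance may differ, and all names are illustrative.

```python
class AlertStrategy:
    def __init__(self, levels, targets, silence_s):
        self.levels = set(levels)     # event levels that should be notified
        self.targets = list(targets)  # notification targets
        self.silence_s = silence_s    # silence period in seconds
        self._last_sent = {}          # event level -> time of last notification

    def targets_for(self, level, now):
        """Return the targets to notify for an event, or [] if filtered or silenced."""
        if level not in self.levels:
            return []
        last = self._last_sent.get(level)
        if last is not None and now - last < self.silence_s:
            return []  # still inside the silence period
        self._last_sent[level] = now
        return list(self.targets)


strategy = AlertStrategy({"critical"}, ["ops@example.com"], silence_s=300)
first = strategy.targets_for("critical", now=0)     # sent
second = strategy.targets_for("critical", now=100)  # silenced
third = strategy.targets_for("critical", now=300)   # silence period over, sent again
other = strategy.targets_for("warning", now=0)      # level not selected
```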
Correlation¶
Supports associating monitors with dashboards for quick jumps to related data.
Permissions¶
Setting operation permissions for a monitor ensures that users can only perform the operations allowed by their roles and permission levels.
- Not enabling this configuration: Follows the default permissions for "Monitor Configuration Management";
- Enabling this configuration and selecting custom permission objects: Only the creator and authorized objects can enable/disable, edit, and delete the rules set for this monitor;
- Enabling this configuration but not selecting custom permission objects: Only the creator has the enable/disable, edit, and delete permissions for this monitor.
Note
The Owner role of the current workspace is not affected by the operational permission configuration here.
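The three cases and the Owner exception above can be summarized as a small decision function. This is a sketch of the rules as described here, not Guance's implementation; the dictionary keys and the `has_default_permission` flag (standing in for the "Monitor Configuration Management" permission) are assumptions.

```python
def can_operate(user, role, monitor, has_default_permission):
    """Decide whether a user may enable/disable, edit, or delete a monitor.

    monitor: dict with keys "creator", "custom_permissions_enabled",
             and "authorized" (a set of users; may be empty).
    """
    if role == "Owner":
        # The workspace Owner is not affected by this configuration.
        return True
    if not monitor["custom_permissions_enabled"]:
        # Fall back to the default "Monitor Configuration Management" permission.
        return has_default_permission
    # Enabled: the creator plus any authorized objects (possibly none).
    return user == monitor["creator"] or user in monitor["authorized"]


open_monitor = {"creator": "alice", "custom_permissions_enabled": False, "authorized": set()}
shared_monitor = {"creator": "alice", "custom_permissions_enabled": True, "authorized": {"bob"}}
locked_monitor = {"creator": "alice", "custom_permissions_enabled": True, "authorized": set()}
```

For example, `can_operate("bob", "Standard", locked_monitor, True)` is `False` because only the creator holds permissions on that monitor, while the same call with role `"Owner"` returns `True`.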
Recover Monitors¶
Supports viewing the status, last update time, creation time, and creator of existing monitors. You can restore a monitor's historical configuration, which makes it easier to collaborate with other team members when updating monitors.
Operation Example:
In Monitor > Monitor, select an existing monitor to edit. On the monitor configuration page, click the button in the top right corner to view the monitor's status, last update time, creation time, and creator.
Click the view button to the right of the update time to open the previous version of the monitor configuration in a new browser window;
Click the Restore This Version button in the top right corner of the previous version. In the pop-up dialog box, confirm the restoration; the monitor returns to the previous version's configuration, which you can then edit and save.