Monitoring
Guance provides a comprehensive anomaly monitoring system. Built on unified platform data, it covers the full path from detection and alerting to incident management. By creating monitors, you can continuously evaluate the state of data such as metrics, logs, application performance, user access, and objects. When a monitor detects an anomaly, it automatically triggers an alert and generates an incident, and notifies the relevant personnel through preset notification strategies. Alert muting and SLO management are also supported, enabling fine-grained alert governance and stability measurement.
Getting Started
- Monitors are the core components that execute detection tasks. They support detection rules for data sources such as time series metrics, logs, Application Performance Management (APM), and Real User Monitoring (RUM). Depending on the monitoring scenario, you can choose among trigger rules such as threshold detection, mutation detection, and range detection, and configure the detection frequency, trigger conditions, and related settings so that system anomalies are identified accurately.
- Monitors can integrate Intelligent Detection algorithms. Using machine learning, Intelligent Detection automatically analyzes the historical characteristics and periodic patterns of monitored metrics and identifies abnormal fluctuations in the data. It is suited to complex metrics with periodicity and trends, where it compensates for the limitations of fixed threshold detection and improves the accuracy and timeliness of anomaly discovery.
- Alert strategies establish a complete alert mechanism from anomaly detection to notification handling. By creating an alert strategy, you define the conditions that trigger alerts and the notification actions to execute. An alert strategy determines the detection source of its alerts by binding monitors, uses trigger conditions to define the incident severity level, and uses notification rules to select notification targets and delivery channels.
- When creating an alert strategy, configure notification targets to define who receives alert messages. You can create multiple notification targets, including DingTalk, Lark, and WeCom bots. Once created, a notification target can be bound within an alert strategy, which maps alert incidents to message recipients. In this way, different alert incidents can be sent to specific teams or platforms.
- All triggered alerts are aggregated in the alert Incident Center. To avoid alert noise during planned maintenance or known issues, you can set mute rules that suppress alert notifications for specific monitors or monitored objects for a specified period.
- SLO (Service Level Objective) management allows you to define service stability targets based on data generated by monitors (such as request success rate and latency). You can create SLOs and configure target values. The system continuously tracks SLO achievement and the remaining error budget, providing a quantitative basis for service stability.
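
To make the SLO terms above concrete, here is a minimal sketch of how achievement and remaining error budget are commonly derived from a success-rate target over one evaluation window. It is illustrative arithmetic only, not Guance's implementation: the names `SLOStatus` and `slo_status`, and the event-count inputs, are assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class SLOStatus:
    achieved: float          # observed success ratio, between 0.0 and 1.0
    budget_total: float      # number of failed events the target allows
    budget_remaining: float  # failures still allowed (negative if overspent)


def slo_status(good_events: int, total_events: int, target: float) -> SLOStatus:
    """Compute SLO achievement and remaining error budget for one window.

    Hypothetical helper for illustration; inputs are plain event counts.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    if total_events <= 0 or not 0 <= good_events <= total_events:
        raise ValueError("invalid event counts")

    bad_events = total_events - good_events
    # The error budget is the share of events the target permits to fail.
    budget_total = (1.0 - target) * total_events
    return SLOStatus(
        achieved=good_events / total_events,
        budget_total=budget_total,
        budget_remaining=budget_total - bad_events,
    )


# Example: a 99.9% target over 100,000 requests with 50 failures.
status = slo_status(good_events=99_950, total_events=100_000, target=0.999)
print(status)
```

In this example the target allows roughly 100 failed requests, 50 of which have been used, so about half of the error budget remains. A negative `budget_remaining` would mean the SLO has been violated for the window.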