Error Tracking¶
Guance provides an explorer for analyzing error data in Application Performance Monitoring. Under Application Performance Monitoring > Error Tracking, you can view the historical trends and distribution of similar errors across traces, which helps you pinpoint performance issues quickly.
The Error Tracking Explorer includes two lists, All Errors and Pattern Analysis:
- All Errors: an overall view of all trace errors that occur in the project application;
- Pattern Analysis: a quick view of the most frequent trace errors that need to be resolved.
Guance Explorer provides powerful query and analysis functions. Please refer to Explorer Description.
All Errors¶
In the Guance workspace, under Application Performance Monitoring > Error Tracking, select the All Errors list to view and analyze error data from all traces.
Note: The All Errors statistics cover Spans whose status is error (status=error) and that contain the error_type field.
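The rule in the note above can be read as a simple predicate over span records. The following is a minimal sketch, assuming each span is available as a plain dictionary; the function name `is_tracked_error` and the sample records are hypothetical, and this is not how Guance evaluates the rule internally.

```python
def is_tracked_error(span: dict) -> bool:
    """Return True when a span would count toward the All Errors statistics."""
    # Both conditions from the note must hold: status=error AND error_type present.
    return span.get("status") == "error" and "error_type" in span


# Hypothetical sample spans for illustration only.
spans = [
    {"service": "cart", "status": "error", "error_type": "TimeoutError"},
    {"service": "cart", "status": "ok"},
    {"service": "pay", "status": "error"},  # no error_type, so it is excluded
]

tracked = [s for s in spans if is_tracked_error(s)]
```

Only the first sample span satisfies both conditions, so `tracked` holds a single record.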
AI Error Analysis¶
Guance provides a one-click capability to parse error data. It uses large models to automatically extract key information from the data and combines it with online search engines and operations knowledge bases to quickly analyze possible causes of failures and provide preliminary solutions.
- Click on a single data point to expand the details page;
- Click AI Error Analysis in the top-right corner;
- Start anomaly analysis.
Correlation Analysis¶
In the Error Tracking Explorer, you can click any error to view the corresponding error trace details, including services, error types, error content, error distribution charts, error details, trace details, extended attributes, as well as associated logs, hosts, networks, etc.
On the error details page, the Error Distribution Chart aggregates statistics for similar error traces, based on the error_message and error_type fields, within the time range selected in the error explorer. It automatically picks an appropriate time interval to display the error distribution trend, so you can see at a glance the time points or ranges when errors occur most often and locate trace issues quickly.
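To make the aggregation behind such a chart concrete, here is a minimal sketch that counts matching errors per time bucket. Everything in it is an assumption for illustration: the `time` field (epoch seconds), the candidate interval list, and the target of roughly 60 buckets are not taken from Guance, which does not document its interval selection here.

```python
from collections import Counter


def pick_interval(range_seconds: int, target_buckets: int = 60) -> int:
    """Pick the smallest candidate interval that yields at most target_buckets buckets."""
    candidates = [1, 5, 10, 30, 60, 300, 600, 1800, 3600, 21600, 86400]
    for step in candidates:
        if range_seconds / step <= target_buckets:
            return step
    return candidates[-1]


def error_distribution(spans, error_type, error_message, start, end):
    """Count spans with the same error_type and error_message per time bucket."""
    step = pick_interval(end - start)
    counts = Counter()
    for span in spans:
        if span["error_type"] != error_type or span["error_message"] != error_message:
            continue  # not a "similar" error
        if not (start <= span["time"] < end):
            continue  # outside the selected time range
        bucket = start + ((span["time"] - start) // step) * step
        counts[bucket] += 1
    return step, dict(sorted(counts.items()))


# Hypothetical sample data: three similar errors and one unrelated error.
spans = [
    {"time": 10, "error_type": "TimeoutError", "error_message": "upstream timed out"},
    {"time": 50, "error_type": "TimeoutError", "error_message": "upstream timed out"},
    {"time": 130, "error_type": "TimeoutError", "error_message": "upstream timed out"},
    {"time": 40, "error_type": "ValueError", "error_message": "bad input"},
]

step, dist = error_distribution(spans, "TimeoutError", "upstream timed out", 0, 3600)
# step == 60, dist == {0: 2, 120: 1}
```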
Pattern Analysis¶
If you need to view errors with higher frequency, you can choose the Pattern Analysis list in Guance workspace Application Performance Monitoring > Error Tracking.
Pattern Analysis performs similarity calculations on all trace data based on the clustering fields. It locks the time range selected in the upper-right corner and retrieves 10,000 data entries from that period for cluster analysis. Similar error traces are aggregated, and common patterns are extracted and counted to help you quickly identify abnormal traces and locate problems.
By default, aggregation is based on the error_message field; you can enter up to 3 custom clustering fields.
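The idea of collapsing similar errors into patterns can be sketched as follows. This is a deliberately simplified stand-in, not Guance's similarity algorithm: it masks number-like tokens with a wildcard and counts identical results. The function names are hypothetical; only the 10,000-entry sample size, the default error_message field, and the 3-field limit come from the text above.

```python
import re
from collections import Counter

SAMPLE_LIMIT = 10_000  # sample size stated in the text above


def to_pattern(message: str) -> str:
    """Replace variable-looking tokens with '*' so similar messages collapse together."""
    message = re.sub(r"0x[0-9a-fA-F]+|\d+", "*", message)
    return re.sub(r"\s+", " ", message).strip()


def pattern_counts(spans, fields=("error_message",)):
    """Group spans by their masked clustering fields, most frequent pattern first."""
    if len(fields) > 3:
        raise ValueError("at most 3 clustering fields are supported")
    counts = Counter()
    for span in spans[:SAMPLE_LIMIT]:
        key = tuple(to_pattern(str(span.get(f, ""))) for f in fields)
        counts[key] += 1
    return counts.most_common()


# Hypothetical sample data for illustration only.
spans = [
    {"error_message": "user 42 timed out after 3000ms"},
    {"error_message": "user 7 timed out after 3000ms"},
    {"error_message": "connection refused"},
]

patterns = pattern_counts(spans)
# [(('user * timed out after *ms',), 2), (('connection refused',), 1)]
```

The result is ordered by count in descending order, which mirrors the default sort of the pattern list.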
Pattern Analysis Details¶
- In the pattern analysis list, you can click any error to view all related error traces, and clicking the trace will take you to the error trace details page for analysis;
- On the pattern analysis page, click the sort icon to sort by document count in ascending or descending order (descending by default).
- If you need to export a particular data entry, open its details page and click the export icon in the top-right corner.
Automatic Issue Discovery¶
Automatic Issue Discovery works on the monitoring data from APM Error Tracking. Once you enable the configuration, the system aggregates the corresponding anomaly data by the chosen grouping dimensions, traces the stack for subsequent similar problems, and automatically condenses them into Issues. Issues generated this way give you the context and root cause of a problem at a glance, which significantly reduces the average time to resolve it.
Start Configuration¶
Note: You must configure rules before enabling this configuration. Otherwise, it cannot be enabled.
Data Source: the entry point from which the current configuration page was opened.
Composite Dimensions: Classification and statistics based on the configured field combinations, including service, version, resource, and error_type.
For the data source, you can add filtering conditions to filter the data, and Guance will further query the data that meets the conditions, narrowing down the scope of available data.
Detection Frequency: Guance queries data over a time range determined by the frequency you select: 5 minutes, 10 minutes, 15 minutes, 30 minutes, or 1 hour.
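One detection pass over these settings can be sketched as a filter followed by a group-by. This is a minimal illustration under stated assumptions: spans are plain dictionaries with a hypothetical `time` field in epoch seconds, filters are exact-match key/value pairs, and the query window is assumed to equal the detection frequency. The `detect` function is hypothetical and does not represent Guance's implementation.

```python
from collections import Counter

DIMENSIONS = ("service", "version", "resource", "error_type")
ALLOWED_FREQUENCIES_MIN = (5, 10, 15, 30, 60)  # options listed in the text


def detect(spans, now, frequency_min, dimensions=DIMENSIONS, filters=None):
    """Count error spans per dimension combination within one detection window."""
    if frequency_min not in ALLOWED_FREQUENCIES_MIN:
        raise ValueError(f"unsupported detection frequency: {frequency_min}")
    filters = filters or {}
    window_start = now - frequency_min * 60  # assumption: window equals frequency
    counts = Counter()
    for span in spans:
        if not (window_start <= span["time"] < now):
            continue  # outside the current detection window
        if any(span.get(key) != value for key, value in filters.items()):
            continue  # rejected by the data source filter conditions
        counts[tuple(span.get(d) for d in dimensions)] += 1
    return dict(counts)


# Hypothetical sample data for illustration only.
base = {"service": "cart", "version": "1.0", "resource": "/checkout", "error_type": "TimeoutError"}
spans = [
    {**base, "time": 800},
    {**base, "time": 900},
    {**base, "time": 100},  # too old for a 5-minute window ending at t=1000
    {**base, "service": "pay", "time": 950},  # removed by the service filter
]

groups = detect(spans, now=1000, frequency_min=5, filters={"service": "cart"})
# {('cart', '1.0', '/checkout', 'TimeoutError'): 2}
```

Each key in the result is one combination of the composite dimensions, and its value is the number of matching errors in the window.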
Issue Definition: After you enable this configuration, Issues are presented according to the definition given here. To avoid missing information, fill in the fields in order.
In both the Title and Description sections of the Issue, the following template variables are supported:
Variable | Meaning |
---|---|
count | Statistical count |
service | Service name |
version | Version |
resource | Resource name |
error_type | Error type |
error_message | Error content |
error_stack | Error stack |
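To illustrate how such variables might be filled into an Issue title, here is a minimal rendering sketch. The `{{name}}` placeholder syntax is an assumption made for this example and is not confirmed by the text above; the `render` function is hypothetical. Only the variable names come from the table.

```python
import re

# Variable names taken from the table above.
TEMPLATE_VARIABLES = {
    "count", "service", "version", "resource",
    "error_type", "error_message", "error_stack",
}


def render(template: str, values: dict) -> str:
    """Substitute supported variables; leave unknown placeholders untouched."""
    def substitute(match):
        name = match.group(1)
        if name not in TEMPLATE_VARIABLES:
            return match.group(0)
        return str(values.get(name, ""))

    # Assumed placeholder syntax: {{name}}, with optional inner whitespace.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


title = render(
    "[{{service}}] {{error_type}} x{{count}}",
    {"service": "cart", "error_type": "TimeoutError", "count": 12},
)
# title == "[cart] TimeoutError x12"
```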
View Issues¶
After saving the configuration and enabling it, the Issues automatically discovered and generated by the system will be displayed in Console > Incident Tracking.