Log Monitoring¶
Log Monitoring watches all log data generated by log collectors within the workspace. You can set alert rules based on log keywords to quickly identify patterns that deviate from expected behavior, such as abnormal tags appearing in log text, so that potential security threats or system issues are detected and handled promptly.
Use Cases¶
Log monitoring suits IT monitoring scenarios such as detecting code exceptions or task scheduling problems, for example alerting when the log error rate is high.
Configuration¶
Monitoring Frequency¶
The execution frequency of monitoring rules; defaults to 5 minutes.
Monitoring Interval¶
The time range for metric queries each time the task is executed. The available monitoring intervals vary depending on the monitoring frequency.
| Monitoring Frequency | Monitoring Interval (Dropdown Options) |
|---|---|
| 1m | 1m/5m/15m/30m/1h/3h |
| 5m | 5m/15m/30m/1h/3h |
| 15m | 15m/30m/1h/3h/6h |
| 30m | 30m/1h/3h/6h |
| 1h | 1h/3h/6h/12h/24h |
| 6h | 6h/12h/24h |
| 12h | 12h/24h |
| 24h | 24h |
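The table above can be read as a lookup from frequency to the allowed interval options. The following is a minimal sketch of that lookup; the names `ALLOWED_INTERVALS` and `is_valid_interval` are hypothetical and not part of the product.

```python
# Allowed monitoring intervals per monitoring frequency, copied from the table above.
# The dictionary and function names are illustrative only.
ALLOWED_INTERVALS = {
    "1m": ["1m", "5m", "15m", "30m", "1h", "3h"],
    "5m": ["5m", "15m", "30m", "1h", "3h"],
    "15m": ["15m", "30m", "1h", "3h", "6h"],
    "30m": ["30m", "1h", "3h", "6h"],
    "1h": ["1h", "3h", "6h", "12h", "24h"],
    "6h": ["6h", "12h", "24h"],
    "12h": ["12h", "24h"],
    "24h": ["24h"],
}


def is_valid_interval(frequency: str, interval: str) -> bool:
    """Return True if the interval is offered for the given frequency."""
    return interval in ALLOWED_INTERVALS.get(frequency, [])
```

In every row the shortest available interval equals the frequency itself, so a rule never queries a window shorter than the gap between two executions.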
Monitoring Metrics¶
Monitor the number of logs containing specified keywords in the log list of the designated monitoring object within a certain time range.
| Field | Description |
|---|---|
| Index | The index to which the current monitoring metric belongs; multiple selections are allowed. ❗️ Once indices are configured under Logs > Index, you can choose which indices to query when "Logs" is selected as the data source for chart queries; the default index is used if none is selected. |
| Source | The data source of the current monitoring metric, supports selecting all (*) or a specific single data source. |
| Keyword Search | Supports keyword search. |
| Filter Conditions | Filters the data of the monitoring metrics based on the tags of the metrics, limiting the data scope; supports adding one or more tag filters; supports fuzzy match and fuzzy not match conditions. |
| Aggregation Algorithm | Defaults to "*", corresponding to the count function. If another field is selected, the function automatically changes to count distinct (counts the number of data points where the keyword appears). |
| Monitoring Dimension | Any string type (keyword) field in the configuration data can be selected as a monitoring dimension; up to three fields are currently supported. Combining several monitoring dimension fields identifies a specific monitoring object. The system checks whether the statistical metric of each monitoring object meets the threshold of the trigger conditions and, if so, generates an event. (For example, with the monitoring dimensions host and host_ip, a monitoring object can be {host: host1, host_ip: 127.0.0.1}.) When the monitoring object is "Logs", the default monitoring dimensions are status, host, service, source, and filename. |
| Query Method | Supports simple query and expression query. If an expression query contains multiple queries, they all share the same log monitoring object. For example, if the monitoring object of expression query A is "Logs", then the monitoring object of expression query B is also "Logs". |
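To illustrate how monitoring dimensions turn a log list into monitoring objects, here is a minimal sketch that counts keyword matches per dimension combination. It assumes logs are plain dictionaries with a `message` field; that field name and the function `count_by_dimensions` are hypothetical, not the product's query engine.

```python
from collections import Counter


def count_by_dimensions(logs, dimensions, keyword):
    """Count logs containing `keyword`, grouped by the chosen dimension fields."""
    counts = Counter()
    for log in logs:
        # "message" is an assumed field name for the log text.
        if keyword in log.get("message", ""):
            # Each distinct combination of dimension values is one monitoring object.
            key = tuple((field, log.get(field)) for field in dimensions)
            counts[key] += 1
    return counts


sample_logs = [
    {"host": "host1", "host_ip": "127.0.0.1", "message": "ERROR timeout"},
    {"host": "host1", "host_ip": "127.0.0.1", "message": "ERROR disk full"},
    {"host": "host2", "host_ip": "10.0.0.2", "message": "request ok"},
    {"host": "host2", "host_ip": "10.0.0.2", "message": "ERROR retry"},
]
counts = count_by_dimensions(sample_logs, ["host", "host_ip"], "ERROR")
```

Each key in `counts` corresponds to one monitoring object such as {host: host1, host_ip: 127.0.0.1}, and its value is the count that would be compared with the trigger conditions.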
Trigger Conditions¶
Set trigger conditions for the alert levels: you can configure any of the emergency, important, warning, or normal conditions.
When the query result contains multiple values, an event is generated for any value that meets a trigger condition.
For more details, refer to Event Level Description.
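As a rough sketch of evaluating several values against the configured levels, the snippet below checks each monitoring object separately. The `>=` operator, the threshold numbers, and the rule that the most severe matching level is reported are all assumptions made for illustration.

```python
# Levels checked from most to least severe (assumed ordering).
SEVERITY_ORDER = ["emergency", "important", "warning"]


def evaluate(results, thresholds):
    """Return (object, level, value) for every value that meets a condition."""
    events = []
    for obj, value in results.items():
        for level in SEVERITY_ORDER:
            limit = thresholds.get(level)
            # Assumes a ">=" operator; real rules let you choose the operator.
            if limit is not None and value >= limit:
                events.append((obj, level, value))
                break  # report only the most severe matching level
    return events


results = {"host1": 120, "host2": 30, "host3": 5}
thresholds = {"emergency": 100, "warning": 20}  # hypothetical thresholds
events = evaluate(results, thresholds)
```

Here `host1` and `host2` each produce an event, while `host3` stays below every threshold and produces none.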
Continuous Trigger Judgment¶
If continuous trigger judgment is enabled, you can configure the system to generate an event after the trigger conditions are met multiple times consecutively. The maximum limit is 10 times.
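The consecutive-count logic can be sketched as a small counter that resets whenever a detection does not meet the condition. The class name `ConsecutiveTrigger` is hypothetical, and whether the system keeps firing on every further match after the count is reached is an assumption here.

```python
class ConsecutiveTrigger:
    """Fire only after the condition is met `required` times in a row."""

    MAX_REQUIRED = 10  # upper limit stated in the text above

    def __init__(self, required: int):
        if not 1 <= required <= self.MAX_REQUIRED:
            raise ValueError("required must be between 1 and 10")
        self.required = required
        self.streak = 0

    def observe(self, condition_met: bool) -> bool:
        # A detection that misses the condition resets the streak to zero.
        self.streak = self.streak + 1 if condition_met else 0
        return self.streak >= self.required


trigger = ConsecutiveTrigger(required=3)
fired = [trigger.observe(met) for met in [True, True, False, True, True, True]]
```

In this run only the last detection fires, because the miss in the third position resets the streak.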
Bulk Alert Protection¶
Enabled by default.
When the number of alerts generated in a single detection exceeds the preset threshold, the system automatically switches to a summary strategy by status: instead of processing each alert object individually, it generates a few summary alerts based on the event status and pushes them.
This ensures the timeliness of notifications while significantly reducing alert noise, avoiding the risk of timeout due to processing too many alerts.
Note
When this switch is enabled, subsequent Event Details generated by the monitor after detecting anomalies will not display historical records and related events.
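The switch from per-object alerts to summaries by status can be sketched as follows. The threshold value, the alert dictionary shape, and the message format are hypothetical; only the grouping-by-status idea comes from the description above.

```python
from collections import Counter


def build_notifications(alerts, threshold):
    """Return one message per alert, or one summary per status above the threshold."""
    if len(alerts) <= threshold:
        return [f"{alert['status']}: {alert['object']}" for alert in alerts]
    # Too many alerts in one detection: summarize by event status instead.
    by_status = Counter(alert["status"] for alert in alerts)
    return [f"{status}: {count} objects" for status, count in by_status.items()]


alerts = [
    {"status": "emergency", "object": "host1"},
    {"status": "emergency", "object": "host2"},
    {"status": "warning", "object": "host3"},
    {"status": "emergency", "object": "host4"},
    {"status": "warning", "object": "host5"},
]
summary = build_notifications(alerts, threshold=3)
individual = build_notifications(alerts[:2], threshold=3)
```

With five alerts and a threshold of three, two summary messages are pushed instead of five individual ones.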
Alert Levels¶
- Alert levels Emergency (red), Important (orange), and Warning (yellow): based on the configured condition judgment operators.
- Alert level Normal (green): based on the configured detection count, explained as follows:
    - Each execution of a detection task counts as 1 detection. For example, if Detection Frequency = 5 minutes, then 1 detection = 5 minutes.
    - You can customize the detection count. For example, if Detection Frequency = 5 minutes, then 3 detections = 15 minutes.

| Level | Description |
|---|---|
| Normal | After the detection rule takes effect, if emergency, important, or warning abnormal events are generated, and the data detection results return to normal within the configured custom detection count, a recovery alert event is generated. ❗️ Recovery alert events are not subject to Alert Silence restrictions. If the detection count for recovery alert events is not set, the alert events will not recover and will continue to appear in the Events > Unrecovered Events List. |
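The detection-count arithmetic above is simply the detection frequency multiplied by the number of detections. A minimal sketch, with the hypothetical helper `recovery_window`:

```python
from datetime import timedelta


def recovery_window(frequency: timedelta, detection_count: int) -> timedelta:
    """Time span covered by `detection_count` executions of the detection task."""
    return frequency * detection_count


window = recovery_window(timedelta(minutes=5), 3)  # 3 detections at 5 minutes each
```

With a 5-minute frequency and a detection count of 3, `window` is 15 minutes, matching the example in the list.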
Data Gap¶
For data gap status, seven strategies can be configured.
- Using the detection interval as the time range, check the query results of the monitoring metrics for the most recent minutes and do not trigger an event.
- Using the detection interval as the time range, check the query results of the monitoring metrics for the most recent minutes and treat the query result as 0. The result is then compared again with the thresholds configured in the Trigger Conditions above to determine whether to trigger an abnormal event.
- Enter a custom detection interval value and trigger a data gap event, an emergency event, an important event, a warning event, or a recovery event. For this type of strategy, the custom data gap time should be >= the detection interval. If the configured time is <= the detection interval, the data gap and abnormal conditions may both be met, in which case only the data gap handling result is applied.
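The seven strategies can be sketched as two special cases plus five "trigger an event of a given type" options. The strategy identifiers below are hypothetical labels, and `evaluate_value` stands in for whatever comparison the Trigger Conditions define.

```python
# Five strategies that directly emit an event of a fixed type (labels are illustrative).
GAP_EVENT_STRATEGIES = {
    "trigger_data_gap": "data_gap",
    "trigger_emergency": "emergency",
    "trigger_important": "important",
    "trigger_warning": "warning",
    "trigger_recovery": "recovery",
}


def resolve_data_gap(strategy, evaluate_value):
    """Return the event status for a detection that found no data, or None."""
    if strategy == "no_trigger":
        return None
    if strategy == "treat_as_zero":
        # Re-compare the substituted value 0 with the configured thresholds.
        return evaluate_value(0)
    return GAP_EVENT_STRATEGIES[strategy]


# Hypothetical rule: warn when fewer than 1 matching log is found.
def warn_if_below_one(value):
    return "warning" if value < 1 else None
```

For example, `resolve_data_gap("treat_as_zero", warn_if_below_one)` feeds 0 into the rule, so a rule that alerts on low counts still fires during a gap.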
Information Generation¶
When this option is enabled, detection results that do not match the above trigger conditions will generate "information" events.
Note
When trigger conditions, data gap, and information generation are configured simultaneously, the triggering is judged in the following priority: data gap > trigger conditions > information event generation.
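The priority order in the note can be sketched as a chain of checks evaluated from highest to lowest priority. The function name `classify` and its return labels are hypothetical.

```python
def classify(has_data_gap, condition_level, info_enabled):
    """Pick the event type using: data gap > trigger conditions > information."""
    if has_data_gap:
        return "data_gap"
    if condition_level is not None:
        return condition_level  # e.g. "emergency", "important", or "warning"
    if info_enabled:
        return "information"
    return None  # nothing matched and information generation is off
```

A detection that both hits a data gap and meets a trigger condition is therefore classified by the data gap configuration alone.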
Other Configurations¶
For more details, refer to Rule Configuration.