Infrastructure Liveness Detection V2
Infrastructure liveness detection monitors the stability of data reporting for critical infrastructure objects (such as hosts, containers, and Pods). By setting detection conditions and alert levels, you can detect and handle anomalies promptly and keep the infrastructure running stably.
Detection Configuration
Detection Frequency
How often the detection rule is executed.
The system provides the following preset frequencies:
- 5m (default display)
- 15m
- 30m
- 1h
- 6h
- 12h
- 24h
Custom frequencies are also supported, entered in a format such as 20m (20 minutes), 2h (2 hours), or 1d (1 day).
Note
Because object data is reported every 5 minutes, the detection frequency should be no shorter than 5 minutes and no longer than 1 day.
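The frequency format and range described above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the function names `parse_duration` and `is_valid_frequency` are hypothetical, and the bounds are treated as inclusive on the assumption that the 5m and 24h presets are themselves valid.

```python
import re

# Seconds per unit for the suffixes shown in the text (m, h, d).
UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400}

# Assumed inclusive bounds: 5 minutes (the reporting cycle) to 1 day.
MIN_FREQUENCY = 5 * 60
MAX_FREQUENCY = 24 * 3600


def parse_duration(text: str) -> int:
    """Convert a string such as '20m', '2h', or '1d' to seconds."""
    match = re.fullmatch(r"(\d+)([mhd])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return int(value) * UNIT_SECONDS[unit]


def is_valid_frequency(text: str) -> bool:
    """Check that a frequency falls inside the assumed allowed range."""
    return MIN_FREQUENCY <= parse_duration(text) <= MAX_FREQUENCY
```

For example, `is_valid_frequency("20m")` accepts a 20-minute frequency, while `is_valid_frequency("2m")` rejects one that is shorter than the reporting cycle.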
Detection Interval
The time range of metric data queried each time a detection task runs. The available intervals depend on the detection frequency.
You can choose one of the preset intervals provided by the system. The presets available for each detection frequency are as follows:
Detection Frequency | Detection Interval |
---|---|
5m | 5m 15m 30m 1h 6h 12h 24h |
15m | 15m 30m 1h 6h 12h 24h |
30m | 30m 1h 6h 12h 24h |
1h | 1h 6h 12h 24h |
6h | 6h 12h 24h |
12h | 12h 24h |
24h | 24h |
Note
A custom detection interval must be greater than or equal to the detection frequency.
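The table follows a simple pattern: the intervals offered for a frequency are the presets that are at least as long as that frequency. A minimal sketch of that relationship, with all values in minutes and with hypothetical function names, might look like this:

```python
from typing import List

# Preset values from the table, expressed in minutes.
PRESETS_MINUTES = [5, 15, 30, 60, 360, 720, 1440]


def preset_intervals(frequency_minutes: int) -> List[int]:
    """Return the preset intervals that are >= the detection frequency."""
    return [p for p in PRESETS_MINUTES if p >= frequency_minutes]


def is_valid_interval(frequency_minutes: int, interval_minutes: int) -> bool:
    """A custom interval must be at least as long as the frequency."""
    return interval_minutes >= frequency_minutes
```

With a 30-minute frequency, `preset_intervals(30)` yields the 30m, 1h, 6h, 12h, and 24h options, matching the corresponding table row.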
Detection Metrics
The metric data to monitor, covering several types of infrastructure:
- Infrastructure type: hosts, processes, containers, Pods, Services, Deployments, Nodes, ReplicaSets, Jobs, and CronJobs.
- Detection Objects: select "all" or "custom" objects.
    - All: detects every object in the workspace and checks whether the last update time of its data meets the threshold.
    - Custom: narrows the detected infrastructure objects with filter conditions that use wildcard (fuzzy) matching or exact matching, then checks whether the last update time of their data meets the threshold.
- Additional Information: the system runs additional queries for the selected fields, but the results are not used when evaluating trigger conditions.
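To illustrate the difference between "all" and "custom" object selection, the sketch below filters a list of object names with Python's standard `fnmatch` module. The product's actual wildcard syntax is not specified here, so shell-style `*` patterns are an assumption, and the object names and the `select_objects` helper are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional


def select_objects(names: Iterable[str], pattern: Optional[str] = None) -> List[str]:
    """Return the objects in scope for detection.

    pattern=None models the "all" option; otherwise the pattern is applied
    as a case-sensitive shell-style wildcard. A pattern without wildcard
    characters behaves as an exact match.
    """
    if pattern is None:
        return list(names)
    return [name for name in names if fnmatchcase(name, pattern)]


hosts = ["web-01", "web-02", "db-01"]   # hypothetical host names
print(select_objects(hosts))            # every object in the workspace
print(select_objects(hosts, "web-*"))   # wildcard match
print(select_objects(hosts, "db-01"))   # exact match
```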
Trigger Conditions
You can set trigger conditions for four alert levels: urgent, important, warning, and normal. When multiple trigger conditions and severity levels are configured, an event is generated as soon as any one of them is met.
Alert Levels
- Urgent (red), Important (orange), Warning (yellow): the configured conditions determine whether the last update time of the detection object's data triggers an alert.
- Normal (green): after the detection rule takes effect, if an abnormal event has occurred and the data returns to normal within the configured detection count, a recovery event is generated.
For more details, refer to Event Level Description.
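The level evaluation can be pictured as comparing the time since an object's last data update against a threshold per level. The sketch below is only an illustration under stated assumptions: thresholds are taken to be in minutes, the most severe matching level is the one reported, and the `evaluate` function and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

# Levels ordered from most to least severe (assumed evaluation order).
LEVELS = ("urgent", "important", "warning")


def evaluate(last_update: datetime, now: datetime,
             thresholds_minutes: Dict[str, int]) -> Optional[str]:
    """Return the most severe level whose threshold is met, or None."""
    silent_for = now - last_update
    for level in LEVELS:
        limit = thresholds_minutes.get(level)
        if limit is not None and silent_for >= timedelta(minutes=limit):
            return level
    return None


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
thresholds = {"urgent": 30, "warning": 10}  # hypothetical configuration
print(evaluate(now - timedelta(minutes=20), now, thresholds))  # warning
print(evaluate(now - timedelta(minutes=45), now, thresholds))  # urgent
```

Levels without a configured threshold are skipped, which mirrors the idea that any subset of levels may be configured.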
Detection Count
The detection count works as follows:
- Each execution of a detection task counts as 1 detection. For example, with a detection frequency of 5 minutes, 1 detection = 5 minutes.
- You can customize the detection count. For example, with a detection frequency of 5 minutes, 3 detections = 15 minutes.
- If no abnormal event occurs within the detection count, a normal event is generated.
Note
Trigger conditions for the urgent, important, and warning levels accept values from 5 to 999. A value below 5 must be adjusted, which avoids false alarms during detection.
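The detection-count arithmetic can be sketched as follows. The helper names are hypothetical, and representing each detection result as a boolean (True for abnormal) is an assumption made for illustration.

```python
from typing import Sequence


def recovery_window_minutes(frequency_minutes: int, detection_count: int) -> int:
    """Each detection spans one frequency period, so the window is the product."""
    return frequency_minutes * detection_count


def should_emit_normal(recent_results: Sequence[bool], detection_count: int) -> bool:
    """Decide whether a normal event is due.

    recent_results holds one entry per detection, oldest first, where True
    means the detection was abnormal. A normal event is due only when the
    last `detection_count` detections exist and none were abnormal.
    """
    if len(recent_results) < detection_count:
        return False
    return not any(recent_results[-detection_count:])


print(recovery_window_minutes(5, 3))                       # 15
print(should_emit_normal([True, False, False, False], 3))  # True
print(should_emit_normal([False, True, False], 3))         # False
```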
Other Configurations
For more details, refer to Rule Configuration.