Infrastructure Liveness Detection V2
Infrastructure liveness detection monitors whether key infrastructure objects (such as hosts, containers, and Pods) keep reporting data. By setting detection conditions and alert levels, you can identify and resolve anomalies promptly and keep the infrastructure running stably.
Detection Configuration
Detection Frequency
How often the detection rule runs.
The system provides the following default frequencies:
- 5m (selected by default)
- 15m
- 30m
- 1h
- 6h
- 12h
- 24h
You can also enter a custom detection frequency, such as 20m (20 minutes), 2h (2 hours), or 1d (1 day).
Note
Because object data is reported every 5 minutes, the detection frequency should be no shorter than 5 minutes and no longer than 1 day.
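As a rough illustration of the custom input format, the sketch below parses strings such as 20m, 2h, or 1d and checks them against the 5-minute and 1-day limits. The function names are hypothetical and not part of the product. The limits are treated as inclusive, which is an assumption based on 5m and 24h both appearing in the default list.

```python
import re

# Unit suffixes used in the examples above (m, h, d), expressed in minutes.
_UNIT_MINUTES = {"m": 1, "h": 60, "d": 1440}

# Assumption: both limits are inclusive, since 5m and 24h are default options.
MIN_FREQUENCY_MINUTES = 5      # object data is reported every 5 minutes
MAX_FREQUENCY_MINUTES = 1440   # 1 day


def parse_duration_minutes(text: str) -> int:
    """Convert a string such as '20m', '2h' or '1d' into minutes."""
    match = re.fullmatch(r"(\d+)([mhd])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_MINUTES[unit]


def is_valid_frequency(text: str) -> bool:
    """Return True if the frequency lies between 5 minutes and 1 day."""
    try:
        minutes = parse_duration_minutes(text)
    except ValueError:
        return False
    return MIN_FREQUENCY_MINUTES <= minutes <= MAX_FREQUENCY_MINUTES
```

For example, `is_valid_frequency("20m")` accepts the value, while `is_valid_frequency("3m")` rejects it because it is shorter than the reporting period.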
Detection Interval
The time range of metric data queried each time the detection task runs. The available intervals depend on the detection frequency.
You can choose one of the default intervals provided by the system. They correspond to the detection frequency as follows:
Detection Frequency | Available Detection Intervals |
---|---|
5m | 5m, 15m, 30m, 1h, 6h, 12h, 24h |
15m | 15m, 30m, 1h, 6h, 12h, 24h |
30m | 30m, 1h, 6h, 12h, 24h |
1h | 1h, 6h, 12h, 24h |
6h | 6h, 12h, 24h |
12h | 12h, 24h |
24h | 24h |
Note
A custom detection interval must be greater than or equal to the detection frequency.
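The table follows a simple pattern: each default frequency offers every default option that is at least as long as itself. A minimal sketch of that relationship, with hypothetical function names and all durations expressed in minutes:

```python
# Default options in minutes: 5m, 15m, 30m, 1h, 6h, 12h, 24h.
DEFAULT_OPTIONS_MINUTES = [5, 15, 30, 60, 360, 720, 1440]


def default_intervals(frequency_minutes: int) -> list[int]:
    """List the default intervals offered for one of the default frequencies."""
    if frequency_minutes not in DEFAULT_OPTIONS_MINUTES:
        raise ValueError("not a default frequency")
    return [m for m in DEFAULT_OPTIONS_MINUTES if m >= frequency_minutes]


def is_valid_custom_interval(interval_minutes: int, frequency_minutes: int) -> bool:
    """A custom interval must be at least as long as the detection frequency."""
    return interval_minutes >= frequency_minutes
```

With a 1h frequency, `default_intervals(60)` yields the 1h, 6h, 12h, and 24h options, matching the corresponding table row.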
Detection Metrics
The data to monitor, which covers several types of infrastructure:
- Infrastructure Types: Hosts, processes, containers, Pods, Services, Deployments, Nodes, ReplicaSets, Jobs, and CronJobs.
- Detection Objects: Select "all" or "custom" objects.
    - All: Detects every object in the workspace and checks whether the last update time of its data meets the threshold.
    - Custom: Narrows the detected infrastructure objects with filter conditions that use wildcard (fuzzy) or exact matching, then checks whether the last update time of their data meets the threshold.
- Additional Information: The system runs additional queries for the fields you select, but the results are not used to evaluate trigger conditions.
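To make the difference between wildcard and exact matching concrete, here is a small sketch. It uses Python's shell-style `fnmatch` patterns as a stand-in; the product's actual wildcard syntax is not specified here, and the object names and function names are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_filter(name: str, value: str, wildcard: bool) -> bool:
    """Exact comparison, or shell-style wildcard matching when wildcard=True."""
    if wildcard:
        # Stand-in for the product's fuzzy matching; '*' matches any run of characters.
        return fnmatchcase(name, value)
    return name == value


def select_objects(names, value, wildcard=False):
    """Keep only the object names that satisfy the filter condition."""
    return [n for n in names if matches_filter(n, value, wildcard)]
```

Given the hypothetical hosts `web-01`, `web-02`, and `db-01`, the wildcard filter `web-*` selects the first two, while the exact filter `db-01` selects only that host.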
Trigger Conditions
You can set trigger conditions for four alert levels: urgent, important, warning, and normal. When several trigger conditions and severity levels are configured, an event is generated as soon as any one of them is met.
Alert Levels
- Urgent (red), Important (orange), Warning (yellow): Based on the configured conditions, the system checks whether the last update time of the detection object's data triggers an alert.
- Normal (green): After the detection rule takes effect, if an abnormal event has occurred and the data returns to normal within the configured number of detections, a recovery alert event is generated.
For more details, refer to Event Level Description.
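One way to picture the level check is as a comparison between how long an object has gone without an update and a threshold per level. The sketch below is illustrative only: it assumes thresholds are expressed in minutes and that the most severe matching level is reported, neither of which is stated in this document, and all names are hypothetical.

```python
from datetime import datetime, timedelta

# Levels ordered from most to least severe.
SEVERITY_ORDER = ["urgent", "important", "warning"]


def evaluate_level(last_update, now, thresholds_minutes):
    """Return the most severe level whose threshold is met, or None."""
    silent_for = now - last_update
    for level in SEVERITY_ORDER:
        limit = thresholds_minutes.get(level)
        # Assumption: a level triggers once the object has been silent
        # for at least `limit` minutes.
        if limit is not None and silent_for >= timedelta(minutes=limit):
            return level
    return None
```

With hypothetical thresholds of 30, 15, and 10 minutes for urgent, important, and warning, an object last updated 20 minutes ago would be reported at the important level.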
Detection Counts
Detection counts work as follows:
- Each run of a detection task counts as 1 detection. For example, with a detection frequency of 5 minutes, 1 detection = 5 minutes.
- You can customize the number of detections. For example, with a detection frequency of 5 minutes, 3 detections = 15 minutes.
- If no abnormal event occurs within the configured number of detections, a normal event is generated.
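The arithmetic above, and the rule for generating a normal event, can be sketched as follows. The function names and the list of per-run flags are hypothetical; the sketch only illustrates the counting logic and is not the product's implementation.

```python
def detection_window_minutes(frequency_minutes: int, detections: int) -> int:
    """Time covered by a number of consecutive detections."""
    return frequency_minutes * detections


def should_emit_normal_event(recent_abnormal_flags, detections: int) -> bool:
    """True when the last `detections` runs exist and none of them was abnormal.

    `recent_abnormal_flags` holds one boolean per detection run, oldest first.
    `detections` is expected to be a positive integer.
    """
    window = recent_abnormal_flags[-detections:]
    return len(window) == detections and not any(window)
```

For a 5-minute frequency and 3 detections, `detection_window_minutes(5, 3)` gives 15, matching the example in the list.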
Note
Trigger conditions for the urgent, important, and warning levels accept values from 5 to 999. A value below 5 must be adjusted, because it can cause false alarms during detection.