Infrastructure Liveness Detection V2


Infrastructure liveness detection monitors the stability of data reporting for critical infrastructure objects (such as hosts, containers, and Pods). By setting detection conditions and alert levels, you can detect and handle anomalies promptly and keep the infrastructure running stably.

Detection Configuration

Detection Frequency

The execution frequency of the detection rule.

The system provides the following preset frequencies:

  • 5m (selected by default)
  • 15m
  • 30m
  • 1h
  • 6h
  • 12h
  • 24h

You can also enter a custom detection frequency, for example 20m (20 minutes), 2h (2 hours), or 1d (1 day).

Note

Object data is reported every 5 minutes, so the detection frequency should be no shorter than 5 minutes and no longer than 1 day.
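
As an illustration of the frequency format described above, the following minimal sketch converts strings such as 20m, 2h, or 1d to minutes and checks them against the 5-minute and 1-day limits. The function names are hypothetical, and treating both limits as inclusive is an assumption based on 5m and 24h appearing among the presets; this is not the product's own validation logic.

```python
import re

# Unit suffixes taken from the examples in the text (20m, 2h, 1d).
_UNIT_MINUTES = {"m": 1, "h": 60, "d": 1440}

MIN_FREQUENCY_MIN = 5     # object data is reported every 5 minutes
MAX_FREQUENCY_MIN = 1440  # 1 day


def parse_duration(text: str) -> int:
    """Convert a string such as '20m', '2h' or '1d' to minutes."""
    match = re.fullmatch(r"(\d+)([mhd])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_MINUTES[unit]


def is_valid_frequency(text: str) -> bool:
    """Hypothetical check; both bounds are treated as inclusive (assumption)."""
    minutes = parse_duration(text)
    return MIN_FREQUENCY_MIN <= minutes <= MAX_FREQUENCY_MIN
```

With this sketch, is_valid_frequency("20m") passes, while "3m" and "2d" are rejected.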

Detection Interval

The time range of metric data queried each time a detection task runs. The available intervals depend on the detection frequency.

You can choose one of the preset intervals provided by the system. The intervals available for each detection frequency are as follows:

Detection Frequency    Detection Interval
5m                     5m, 15m, 30m, 1h, 6h, 12h, 24h
15m                    15m, 30m, 1h, 6h, 12h, 24h
30m                    30m, 1h, 6h, 12h, 24h
1h                     1h, 6h, 12h, 24h
6h                     6h, 12h, 24h
12h                    12h, 24h
24h                    24h

Note

A custom detection interval must be greater than or equal to the detection frequency.
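
The frequency-to-interval relationship follows a simple pattern: each preset frequency offers every preset interval that is at least as long as itself, and a custom interval must satisfy the same rule. A minimal sketch, with all values in minutes and hypothetical function names:

```python
from typing import List

# Preset values from the text, in minutes: 5m, 15m, 30m, 1h, 6h, 12h, 24h.
PRESETS_MIN = [5, 15, 30, 60, 360, 720, 1440]


def preset_intervals(frequency_min: int) -> List[int]:
    """Preset intervals offered for a given detection frequency."""
    return [p for p in PRESETS_MIN if p >= frequency_min]


def is_valid_interval(frequency_min: int, interval_min: int) -> bool:
    """A detection interval must be >= the detection frequency."""
    return interval_min >= frequency_min
```

For a 1h frequency, preset_intervals(60) yields the 1h, 6h, 12h, and 24h intervals, matching the corresponding row of the table.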

Detection Metrics

The metric data to monitor, covering various types of infrastructure:

  1. Infrastructure type: hosts, processes, containers, Pods, Services, Deployments, Nodes, ReplicaSets, Jobs, and CronJobs.
  2. Detection objects: select "All" or "Custom".

    • All: detects every object in the workspace and checks whether the last update time of its data reaches the threshold.
    • Custom: narrows the detected objects with filter conditions that use wildcard (fuzzy) or exact matching, then checks whether the last update time of their data reaches the threshold.
  3. Additional information: the selected fields are queried as well, but they are not used to evaluate trigger conditions.
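
The "All" and "Custom" options can be pictured as a filter over object names followed by a check on each object's last update time. The sketch below is illustrative only: the data layout (a mapping from object name to last update time), the function name, and the rule that an object counts as stale once its data has gone at least threshold_min minutes without an update are all assumptions. Wildcard matching uses Python's standard fnmatch module rather than the product's own matcher.

```python
from datetime import datetime, timedelta
from fnmatch import fnmatchcase
from typing import Dict, List


def stale_objects(
    last_update: Dict[str, datetime],
    now: datetime,
    threshold_min: int,
    pattern: str = "*",
) -> List[str]:
    """Names matching `pattern` whose data is at least `threshold_min` old.

    pattern="*" corresponds to "All"; a wildcard such as "web-*" or an
    exact name corresponds to "Custom".
    """
    limit = timedelta(minutes=threshold_min)
    return sorted(
        name
        for name, updated in last_update.items()
        if fnmatchcase(name, pattern) and now - updated >= limit
    )
```

Calling stale_objects(data, now, 5, "web-*") would restrict the check to objects whose names start with "web-", while omitting the pattern checks every object.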

Trigger Conditions

You can set trigger conditions for four alert levels: urgent, important, warning, and normal. You can configure several trigger conditions with different severity levels; an event is generated as soon as any one of them is met.

Alert Levels

  • Urgent (red), Important (orange), Warning (yellow): the configured condition determines whether the last update time of the detected object's data triggers an alert.

  • Normal (green): after the detection rule takes effect, if an abnormal event has occurred and the data returns to normal within the configured detection count, a recovery event is generated.

For more details, refer to Event Level Description.

Detection Count

The detection count works as follows:

  • Each execution of the detection task counts as 1 detection. For example, with a detection frequency of 5 minutes, 1 detection = 5 minutes.
  • You can customize the detection count. For example, with a detection frequency of 5 minutes, 3 detections = 15 minutes.
  • If no abnormal event occurs within the detection count, a normal event is generated.

Note

Trigger conditions for the urgent, important, and warning levels accept values from 5 to 999. A value below 5 must be adjusted to avoid false alarms during detection.
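
The detection count arithmetic and the 5 to 999 value range can be expressed in a few lines. This is a sketch with hypothetical function names; it only restates the rules given above and does not reflect how the product evaluates them internally.

```python
def detection_window_minutes(frequency_min: int, detection_count: int) -> int:
    """Time span covered by `detection_count` runs at the given frequency."""
    if detection_count < 1:
        raise ValueError("detection count must be at least 1")
    return frequency_min * detection_count


def is_valid_trigger_value(value: int) -> bool:
    """Trigger values for urgent, important and warning must be 5 to 999."""
    return 5 <= value <= 999
```

With a 5-minute frequency, detection_window_minutes(5, 3) gives 15, the same figure as the example in the list above.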

Other Configurations

For more details, refer to Rule Configuration.
