Apollo¶

Collects Apollo-related metrics.

Installation and Configuration¶

Prerequisites¶

Install DataKit

Apollo Monitoring¶

In a distributed deployment, Apollo runs three types of processes: Portal, ConfigService, and AdminService. Each type may have several instances; for example, separate ConfigService and AdminService instances are deployed for the testing and production environments. See the Apollo Deployment Architecture for details. All three process types expose metrics in Prometheus format at the /prometheus endpoint:

Portal: 8070/prometheus
ConfigService: 8080/prometheus
AdminService: 8090/prometheus
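Before configuring the collector, it can help to confirm what the endpoints return. The following is a minimal sketch, not part of DataKit or Apollo: `metrics_url` and `parse_samples` are hypothetical helper names, and the host `127.0.0.1` is an assumption. It builds the endpoint URL for each process type from the ports listed above and parses simple Prometheus text lines into a dictionary.

```python
# Ports and the /prometheus path come from the list above.
# The default host is an assumption for a local deployment.
APOLLO_PORTS = {
    "portal": 8070,
    "configservice": 8080,
    "adminservice": 8090,
}


def metrics_url(component: str, host: str = "127.0.0.1") -> str:
    """Build the /prometheus URL for one Apollo process type."""
    return f"http://{host}:{APOLLO_PORTS[component]}/prometheus"


def parse_samples(text: str) -> dict:
    """Parse Prometheus text lines into {"name{labels}": value}.

    Blank lines and comment lines (# HELP, # TYPE) are skipped.
    Lines that carry a trailing timestamp are not handled in this sketch.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The sample value is the last space-separated field.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Illustrative payload only; real output contains many more series.
example = """# TYPE process_uptime_seconds gauge
process_uptime_seconds 12.5
"""
print(metrics_url("portal"))
print(parse_samples(example))
```

In practice the text passed to `parse_samples` would be the response body fetched from one of the URLs, for example with `urllib.request.urlopen`.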

DataKit Collector Configuration¶

Because Apollo exposes its metrics URLs directly, the metrics can be collected with the prom collector.

Go to the conf.d/prom directory under the DataKit installation directory and copy prom.conf.sample to apollo-portal-prod-1.conf:

cp prom.conf.sample apollo-portal-prod-1.conf

Adjust the content as follows:

  url = "http://127.0.0.1:8070/prometheus"
  ## Collector alias.
  source = "apollo_portal_prod_1"
  ## (Optional) Collect interval: (defaults to "30s").
  interval = "30s"
  ## If measurement_name is not empty, use this as the Measurement set name.
  measurement_name = "apollo"

Create configuration files for the ConfigService and AdminService collectors in the same way.
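As an illustration of repeating the configuration for each process type, the sketch below prints the settings to adjust for all three collectors. It only reproduces the fields shown in the Portal example; the function names, the `configservice` and `adminservice` aliases, and the file names other than apollo-portal-prod-1.conf are assumptions that follow the same naming pattern.

```python
# Ports come from the Apollo Monitoring section above.
COMPONENT_PORTS = {"portal": 8070, "configservice": 8080, "adminservice": 8090}


def conf_filename(component: str, env: str = "prod", index: int = 1) -> str:
    """File name following the apollo-portal-prod-1.conf pattern (assumed for other components)."""
    return f"apollo-{component}-{env}-{index}.conf"


def collector_settings(component: str, env: str = "prod", index: int = 1,
                       host: str = "127.0.0.1") -> str:
    """Return the lines to adjust inside a copy of prom.conf.sample."""
    port = COMPONENT_PORTS[component]
    source = f"apollo_{component}_{env}_{index}"  # alias pattern is an assumption
    return "\n".join([
        f'  url = "http://{host}:{port}/prometheus"',
        "  ## Collector alias.",
        f'  source = "{source}"',
        '  ## (Optional) Collect interval: (defaults to "30s").',
        '  interval = "30s"',
        "  ## If measurement_name is not empty, use this as the Measurement set name.",
        '  measurement_name = "apollo"',
    ])


for component in COMPONENT_PORTS:
    print(f"# {conf_filename(component)}")
    print(collector_settings(component))
    print()
```

Keeping `measurement_name` identical across the three files places all Apollo metrics in one Measurement set, while the distinct `source` values keep the collectors distinguishable.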

Adjust the other settings as needed. The main parameters are:

url: Prometheus metrics address; set it to the metrics URL exposed by the corresponding component
source: Collector alias; use a different alias for each collector so they can be told apart
interval: Collection interval

Restart DataKit¶

Restart DataKit for the new configuration to take effect.

Metrics¶

Apollo metrics are stored under the apollo Measurement set (the measurement_name configured above). This section describes the Apollo-related metrics.

Metric Name	Description	Unit
`http_server_requests_seconds`	HTTP server response time when processing requests; clients connect to the Apollo server using HTTP	Second
`process_uptime_seconds`	JVM uptime duration	Second
`hikaricp_connections_active`	Number of active connections	Count
`hikaricp_connections_idle`	Number of idle connections	Count
`hikaricp_connections_pending`	Number of threads waiting for a connection; normally 0. A persistently non-zero value should trigger an alert; one possible remedy is to increase the maximum number of connections	Count
`hikaricp_connections_usage_seconds`	Time that connections are held by business logic. Long durations should trigger an alert and may indicate slow database responses; watch the average and the P99 values	Second
`jvm_memory_max_bytes`	Maximum number of bytes managed by the JVM; the id tag identifies the memory type	Byte
`jvm_memory_usage_after_gc_percent`	Percentage of long-lived objects in heap memory after last GC	%
`jvm_memory_used_bytes`	Number of used bytes managed by the JVM; the id tag identifies the memory type	Byte
`jvm_memory_committed_bytes`	Number of committed bytes by the JVM	Byte
`jvm_gc_pause_seconds`	Duration of JVM GC pauses	Second
`system_load_average_1m`	Operating system average load over the past one minute	-
`system_cpu_count`	Number of CPUs available to the JVM	Count
`system_cpu_usage`	Operating system CPU usage	%
`process_cpu_usage`	Process CPU usage	%
`process_files_max_files`	Maximum number of file descriptors allowed to be opened by the process	Count
`process_files_open_files`	Number of file descriptors opened by the process	Count
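Two of the descriptions above imply simple checks: a persistently non-zero `hikaricp_connections_pending` value, and the ratio of open to maximum file descriptors. The following is a minimal sketch of both, assuming the metric values have already been read into plain Python numbers. The function names and the choice of three consecutive readings are illustrative assumptions, not DataKit or Apollo defaults.

```python
def pending_is_persistent(readings: list, min_consecutive: int = 3) -> bool:
    """Return True if the most recent `min_consecutive` readings of
    hikaricp_connections_pending are all non-zero.

    A single non-zero reading is treated as a transient spike.
    """
    if len(readings) < min_consecutive:
        return False
    return all(value > 0 for value in readings[-min_consecutive:])


def fd_usage_ratio(open_files: float, max_files: float) -> float:
    """Ratio of process_files_open_files to process_files_max_files."""
    if max_files <= 0:
        raise ValueError("max_files must be positive")
    return open_files / max_files


# Illustrative values only.
print(pending_is_persistent([0, 1, 2, 3]))  # last three readings are non-zero
print(fd_usage_ratio(256, 1024))
```

The readings would typically be consecutive samples taken at the collector's `interval`, so three readings at "30s" cover roughly the last 90 seconds.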