Process
The process collector monitors running processes on the system and collects and analyzes their runtime metrics, including memory utilization, CPU time, current process state, and the ports a process listens on. Based on these metrics, users can configure alerts in Guance to stay informed about process state and repair a failed process promptly.
Warning
On macOS, the process collector (for both objects and metrics) can be resource-intensive and may cause CPU usage to spike, so you can turn it off manually. At present, the process object collector is still enabled by default (it runs once every 5 minutes).
Configuration
Preconditions
- The process collector does not collect process metrics by default. To collect metric data, set `open_metric` to `true` in `host_processes.conf`.
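When this change has to be rolled out to many hosts, the edit can be scripted. The following is a minimal Python sketch, not part of DataKit: the helper name `enable_open_metric` is hypothetical, and it assumes the option appears uncommented on its own line as `open_metric = true` or `open_metric = false`, as in the sample configuration.

```python
import re

# Matches an uncommented `open_metric = true|false` line and keeps its indentation.
# Assumption: the option sits on its own line, as in the sample configuration.
_OPEN_METRIC = re.compile(
    r"^([ \t]*)open_metric[ \t]*=[ \t]*(?:true|false)[ \t]*$",
    re.MULTILINE,
)


def enable_open_metric(conf_text: str) -> str:
    """Return conf_text with open_metric set to true (hypothetical helper)."""
    new_text, count = _OPEN_METRIC.subn(r"\1open_metric = true", conf_text)
    if count == 0:
        raise ValueError("no open_metric option found in configuration")
    return new_text


sample = """[[inputs.host_processes]]
  min_run_time = "10m"
  ## Enable process metric collecting
  open_metric = false
"""
print(enable_open_metric(sample))
```

The function only rewrites text; reading and writing the actual `host_processes.conf` file, and restarting DataKit afterwards, is left to the caller.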
Collector Configuration
Go to the `conf.d/host` directory under the DataKit installation directory, copy `host_processes.conf.sample`, and name the copy `host_processes.conf`. An example follows:
[[inputs.host_processes]]
  # Only collect metrics for processes whose names match this whitelist.
  # The whitelist does not apply to process objects. Process names support regexp.
  # process_name = [".*nginx.*", ".*mysql.*"]

  # Minimal process run time (default 10m).
  # Processes that have been running for less than this are ignored (for both metrics and objects).
  min_run_time = "10m"

  ## Enable process metric collection
  open_metric = false

  ## Enable the listen ports tag, default is false
  enable_listen_ports = false

  ## Enable the open files field, default is false
  enable_open_files = false

  ## Only collect container-based processes (for both objects and metrics)
  only_container_processes = false

  # Extra tags
  [inputs.host_processes.tags]
    # some_tag = "some_value"
    # more_tag = "some_other_value"
    # ...
Once configured, restart DataKit.
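How the `process_name` whitelist and `min_run_time` interact can be sketched in Python. This is an illustration under stated assumptions, not DataKit's implementation: the `Proc` type and both helper functions are hypothetical, patterns are assumed to match anywhere in the process name, and an empty whitelist is assumed to mean that no name filtering takes place.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Proc:
    """Hypothetical process record used only for this sketch."""
    name: str
    run_seconds: float  # how long the process has been running


def keep_for_objects(p: Proc, min_run_seconds: float) -> bool:
    # Objects ignore the whitelist; only min_run_time applies.
    return p.run_seconds >= min_run_seconds


def keep_for_metrics(p: Proc, min_run_seconds: float, patterns: List[str]) -> bool:
    # Metrics apply min_run_time first, then the process_name whitelist.
    if p.run_seconds < min_run_seconds:
        return False
    if not patterns:  # assumption: an empty whitelist means no name filtering
        return True
    return any(re.search(pat, p.name) for pat in patterns)


procs = [Proc("nginx", 3600), Proc("mysqld", 120), Proc("sshd", 7200)]
whitelist = [".*nginx.*", ".*mysql.*"]
min_run = 10 * 60  # corresponds to min_run_time = "10m"

print([p.name for p in procs if keep_for_metrics(p, min_run, whitelist)])
print([p.name for p in procs if keep_for_objects(p, min_run)])
```

In this sketch `mysqld` matches the whitelist but is dropped everywhere because it has run for less than ten minutes, while `sshd` is kept as an object but produces no metrics because it is not whitelisted.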
In Kubernetes, the collector can be enabled by injecting its configuration through a ConfigMap or by setting `ENV_DATAKIT_INPUTS`.
Collector options can also be set through the following environment variables (the collector must be added as a default collector in `ENV_DEFAULT_ENABLED_INPUTS`):
- `ENV_INPUT_HOST_PROCESSES_OPEN_METRIC`: Enable process metric collection.
    - Type: Boolean
    - input.conf: `open_metric`
    - Default: false
- `ENV_INPUT_HOST_PROCESSES_PROCESS_NAME`: Whitelist of process names.
    - Type: List
    - input.conf: `process_name`
    - Example: `.*datakit.*,guance`
- `ENV_INPUT_HOST_PROCESSES_MIN_RUN_TIME`: Minimal process run time.
    - Type: Duration
    - input.conf: `min_run_time`
    - Default: 10m
- `ENV_INPUT_HOST_PROCESSES_ENABLE_LISTEN_PORTS`: Enable the listen ports tag.
    - Type: Boolean
    - input.conf: `enable_listen_ports`
    - Default: false
- `ENV_INPUT_HOST_PROCESSES_TAGS`: Custom tags. If the configuration file contains a tag with the same name, it is overwritten.
    - Type: Map
    - input.conf: `tags`
    - Example: `tag1=value1,tag2=value2`
- `ENV_INPUT_HOST_PROCESSES_ONLY_CONTAINER_PROCESSES`: Only collect container processes, for both metrics and objects.
    - Type: Boolean
    - input.conf: `only_container_processes`
    - Default: false
- `ENV_INPUT_HOST_PROCESSES_METRIC_INTERVAL`: Collection interval for metrics.
    - Type: Duration
    - input.conf: `metric_interval`
    - Default: 30s
- `ENV_INPUT_HOST_PROCESSES_object_interval`: Collection interval for objects.
    - Type: Duration
    - input.conf: `object_interval`
    - Default: 300s
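The List, Map, and Duration types listed above are passed as plain strings in the environment. The Python sketch below shows one way such values could be interpreted. It is an illustration only, not DataKit's parser: the helper names are hypothetical, and the duration helper assumes a single integer followed by one of the suffixes `s`, `m`, or `h`, which covers the defaults shown above.

```python
import re


def parse_list(value: str) -> list:
    """Split a comma-separated List value, e.g. '.*datakit.*,guance'."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_map(value: str) -> dict:
    """Split a Map value of the form 'tag1=value1,tag2=value2'."""
    tags = {}
    for pair in parse_list(value):
        key, sep, val = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        tags[key.strip()] = val.strip()
    return tags


_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_duration_seconds(value: str) -> int:
    """Convert a simple Duration such as '30s', '10m' or '300s' to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


print(parse_list(".*datakit.*,guance"))
print(parse_map("tag1=value1,tag2=value2"))
print(parse_duration_seconds("10m"), parse_duration_seconds("300s"))
```

One consequence of the comma-separated List format is that, under this simple splitting, a whitelist pattern cannot itself contain a comma.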
Metric
All data collected below carries a global tag named `host` by default (its value is the host name of the machine running DataKit). Additional tags can be specified in the configuration under `[inputs.host_processes.tags]`.
host_processes
Collect process metrics, including CPU/memory usage, etc.
- Tags
Tag | Description |
---|---|
container_id | Container ID of the process (Linux only) |
host | Host name |
pid | Process ID |
process_name | Process name |
username | Username |
- Metrics
Metric | Description | Type | Unit |
---|---|---|---|
cpu_usage | CPU usage: the percentage of CPU time used by the process since it started. This value is more stable than the instantaneous percentage shown by `top`. | float | percent,percent |
cpu_usage_top | CPU usage: the average CPU usage of the process within one collection cycle. | float | percent,percent |
mem_used_percent | Memory usage percentage. | float | percent,percent |
nonvoluntary_ctxt_switches | From `/proc/[PID]/status`. Number of involuntary context switches, where the process was forced off the CPU. Linux only. | int | count |
open_files | Number of open files. Linux only. | int | count |
page_children_major_faults | From `/proc/[PID]/stat`. Number of major page faults of the process's child processes. Linux only. | int | digital,B |
page_children_minor_faults | From `/proc/[PID]/stat`. Number of minor page faults of the process's child processes. Linux only. | int | digital,B |
page_major_faults | From `/proc/[PID]/stat`. Number of major page faults. Linux only. | int | digital,B |
page_minor_faults | From `/proc/[PID]/stat`. Number of minor page faults. Linux only. | int | digital,B |
proc_read_bytes | Linux: from `/proc/[PID]/io`; Windows: from `GetProcessIoCounters()`. Bytes read from disk. | int | digital,B |
proc_syscr | Linux: from `/proc/[PID]/io`; Windows: from `GetProcessIoCounters()`. Count of `read()`-like system calls. Linux and Windows only. | int | count |
proc_syscw | Linux: from `/proc/[PID]/io`; Windows: from `GetProcessIoCounters()`. Count of `write()`-like system calls. Linux and Windows only. | int | count |
proc_write_bytes | Linux: from `/proc/[PID]/io`; Windows: from `GetProcessIoCounters()`. Bytes written to disk. | int | digital,B |
rss | Resident set size. | int | digital,B |
threads | Total number of threads. | int | count |
vms | Virtual memory size. | int | digital,B |
voluntary_ctxt_switches | From `/proc/[PID]/status`. Number of voluntary context switches, where the process gave up the CPU itself, for example through `sleep()`/`read()`/`sched_yield()`. Linux only. | int | count |
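The difference between `cpu_usage` and `cpu_usage_top` can be illustrated with a small Python sketch. The formulas are an assumption based on the descriptions above (a lifetime average versus an average over one collection cycle), not DataKit's actual code, and the function names are hypothetical.

```python
def cpu_usage_since_start(cpu_seconds: float, uptime_seconds: float) -> float:
    """Lifetime average: total CPU time divided by time since process start."""
    if uptime_seconds <= 0:
        return 0.0
    return 100.0 * cpu_seconds / uptime_seconds


def cpu_usage_in_cycle(prev_cpu: float, cur_cpu: float, cycle_seconds: float) -> float:
    """Cycle average: CPU time consumed during one collection interval."""
    if cycle_seconds <= 0:
        return 0.0
    return 100.0 * (cur_cpu - prev_cpu) / cycle_seconds


# A process that was mostly idle for an hour, then busy for one 30 s cycle.
prev_cpu, cur_cpu = 36.0, 66.0  # cumulative CPU seconds at two samples
uptime = 3630.0                 # seconds since the process started

print(round(cpu_usage_since_start(cur_cpu, uptime), 2))
print(round(cpu_usage_in_cycle(prev_cpu, cur_cpu, 30.0), 2))
```

Under these assumptions a short burst barely moves the lifetime figure but dominates the per-cycle figure, which is why the former is the more stable of the two.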
Object
host_processes
Collect data on process objects, including process names, process commands, etc.
- Tags
Tag | Description |
---|---|
container_id | Container ID of the process, if it is running in a container (Linux only) |
host | Host name |
name | Process object name field, consisting of [host-name]_[pid] |
process_name | Process name |
state | Process status. Linux only |
username | Username |
- Metrics
Metric | Description | Type | Unit |
---|---|---|---|
cmdline | Command line parameters of the process. | string | - |
cpu_usage | CPU usage: the percentage of CPU time used by the process since it started. This value is more stable than the instantaneous percentage shown by `top`. | float | percent,percent |
cpu_usage_top | CPU usage: the average CPU usage of the process within one collection cycle. | float | percent,percent |
listen_ports | The ports the process is listening on. | string | - |
mem_used_percent | Memory usage percentage. | float | percent,percent |
message | Process details. | string | - |
nonvoluntary_ctxt_switches | From `/proc/[PID]/status`. Number of involuntary context switches, where the process was forced off the CPU. Linux only. | int | count |
open_files | Number of open files. Linux only; requires the `enable_open_files` option to be turned on. | int | count |
page_children_major_faults | From `/proc/[PID]/stat`. Number of major page faults of the process's child processes. Linux only. | int | digital,B |
page_children_minor_faults | From `/proc/[PID]/stat`. Number of minor page faults of the process's child processes. Linux only. | int | digital,B |
page_major_faults | From `/proc/[PID]/stat`. Number of major page faults. Linux only. | int | digital,B |
page_minor_faults | From `/proc/[PID]/stat`. Number of minor page faults. Linux only. | int | digital,B |
pid | Process ID. | int | - |
proc_read_bytes | Linux: from `/proc/[PID]/io`; Windows: from `GetProcessIoCounters()`. Bytes read from disk. | int | digital,B |
proc_syscr | Linux: from `/proc/[PID]/io`; Windows: from `GetProcessIoCounters()`. Count of `read()`-like system calls. Linux and Windows only. | int | count |
proc_syscw | Linux: from `/proc/[PID]/io`; Windows: from `GetProcessIoCounters()`. Count of `write()`-like system calls. Linux and Windows only. | int | count |
proc_write_bytes | Linux: from `/proc/[PID]/io`; Windows: from `GetProcessIoCounters()`. Bytes written to disk. | int | digital,B |
rss | Resident set size. | int | digital,B |
start_time | Process start time. | int | timeStamp,msec |
started_duration | Time elapsed since the process started. | int | timeStamp,sec |
state_zombie | Whether the process is a zombie process. | bool | - |
threads | Total number of threads. | int | count |
vms | Virtual memory size. | int | digital,B |
voluntary_ctxt_switches | From `/proc/[PID]/status`. Number of voluntary context switches, where the process gave up the CPU itself, for example through `sleep()`/`read()`/`sched_yield()`. Linux only. | int | count |
work_directory | Working directory. Linux only. | string | - |
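Several of the Linux-only fields above come from `/proc/[PID]/status`. As an illustration of what that source looks like, the Python sketch below extracts the two context-switch counters from the text of such a file. It is not DataKit's implementation: `parse_ctxt_switches` is a hypothetical helper, and the sample text is abbreviated.

```python
def parse_ctxt_switches(status_text: str) -> dict:
    """Extract the context-switch counters from /proc/[PID]/status content."""
    wanted = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")
    counters = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            counters[key.strip()] = int(value.strip())
    return counters


# Abbreviated sample of a status file; real files contain many more lines.
sample_status = """Name:\tnginx
State:\tS (sleeping)
Threads:\t1
voluntary_ctxt_switches:\t150
nonvoluntary_ctxt_switches:\t7
"""

print(parse_ctxt_switches(sample_status))
```

On a Linux host, the same function can be given the contents of `/proc/self/status` to inspect the calling process.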