Kubernetes


Collects metrics, objects, and logs from containers and Kubernetes and reports them to Guance.

Collector Configuration

Prerequisites

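  • The container collector currently supports the Docker, Containerd, and CRI-O container runtimes
    • Version requirements: Docker v17.04 or later, Containerd v1.5.1 or later, CRI-O 1.20.1 or later
  • Collecting Kubernetes data requires DataKit to be deployed as a DaemonSet
Info
  • Container collection supports both the Docker and Containerd runtimes (Version-1.5.7), and collection is enabled for both by default.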

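In a pure Docker or Containerd environment, DataKit can only be installed on the host.

Go to the conf.d/samples directory under the DataKit installation directory, copy container.conf.sample, and rename the copy to container.conf. For example: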

[inputs.container]
  endpoints = [
    "unix:///var/run/docker.sock",
    "unix:///var/run/containerd/containerd.sock",
    "unix:///var/run/crio/crio.sock",
  ]

  ## Collect metric interval, default "60s".
  # metric_collect_interval = "60s"
  ## Collect object interval, default "5m".
  # object_collect_interval = "5m"
  ## Search logging interval, default "60s".
  # logging_search_interval = "60s"

  enable_container_metric = true
  enable_k8s_metric       = true
  enable_pod_metric       = false
  enable_k8s_event        = true
  enable_k8s_node_local   = true
  enable_collect_k8s_job  = true

  ## Add resource Label as Tags (container use Pod Label), need to specify Label keys.
  ## e.g. ["app", "name"]
  # extract_k8s_label_as_tags_v2            = []
  # extract_k8s_label_as_tags_v2_for_metric = []

  ## Containers logs to include and exclude, default collect all containers. Globs accepted.
  container_include_log = []
  container_exclude_log = ["image:*logfwd*", "image:*datakit*"]

  ## Pods metric to include and exclude, default collect all pods. Globs accepted.
  pod_include_metric = []
  pod_exclude_metric = []

  logging_enable_multiline              = true
  logging_auto_multiline_detection      = true
  logging_auto_multiline_extra_patterns = []

  ## Only retain the fields specified in the whitelist.
  logging_field_white_list = []

  ## Removes ANSI escape codes from text strings.
  logging_remove_ansi_escape_codes = false

  ## Whether to collect logs from the beginning of the file.
  logging_file_from_beginning = false

  ## The maximum allowed number of open files, default is 500. If it is -1, it means no limit.
  # logging_max_open_files = 500

  ## Additional source matching for log collection: a source that matches the regular expression is renamed.
  [inputs.container.logging_extra_source_map]
    # source_regexp = "new_source"

  ## Log collection with multiline configuration as specified by the source.
  [inputs.container.logging_source_multiline_map]
    # source = '''^\d{4}'''

  [inputs.container.tags]
    # some_tag = "some_value"
    # more_tag = "some_other_value"
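
The container_include_log and container_exclude_log entries in the sample above take the form "key:glob". The following is a minimal Python sketch of how such rules could be evaluated. The helper names matches_any and should_collect_log are hypothetical, and the way include and exclude combine (an empty include list admits everything, an exclude match always rejects) is an assumption made for illustration, not DataKit's confirmed behavior.

```python
from fnmatch import fnmatchcase


def matches_any(rules, attrs):
    """Return True if any 'key:glob' rule matches the container attributes."""
    for rule in rules:
        key, _, pattern = rule.partition(":")
        value = attrs.get(key)
        if value is not None and fnmatchcase(value, pattern):
            return True
    return False


def should_collect_log(attrs, include, exclude):
    """Decide whether a container's logs are collected (assumed semantics)."""
    # Assumption: an empty include list admits every container.
    if include and not matches_any(include, attrs):
        return False
    # Assumption: an exclude match always rejects the container.
    return not matches_any(exclude, attrs)


# The default exclude list from the sample configuration.
exclude = ["image:*logfwd*", "image:*datakit*"]

print(should_collect_log({"image": "nginx:1.21.0"}, [], exclude))
print(should_collect_log({"image": "pubrepo.jiagouyun.com/datakit/logfwd:1.0"}, [], exclude))
```

With the default exclude list, the nginx container is kept and the logfwd container is skipped.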

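The collector can be enabled either by injecting its configuration through a ConfigMap or by setting ENV_DATAKIT_INPUTS.

Configuration parameters can also be changed through environment variables (the collector must be added to ENV_DEFAULT_ENABLED_INPUTS as a default collector):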

  • ENV_INPUT_CONTAINER_ENDPOINTS

    Appends endpoints for one or more container runtimes

    Type: List

    Collector config field: endpoints

    Example: "unix:///var/run/docker.sock,unix:///var/run/containerd/containerd.sock,unix:///var/run/crio/crio.sock"

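  • ENV_INPUT_CONTAINER_METRIC_COLLEC_INTERVAL

    Collection interval for container/k8s metric data

    Type: Duration

    Collector config field: metric_collec_interval

    Default: 60s

  • ENV_INPUT_CONTAINER_OBJECT_COLLEC_INTERVAL

    Collection interval for container/k8s object data

    Type: Duration

    Collector config field: object_collec_interval

    Default: 5m

  • ENV_INPUT_CONTAINER_LOGGING_SEARCH_INTERVAL

    Interval for log discovery, that is, how often the collector searches for logs. If the interval is too long, short-lived logs may be missed

    Type: Duration

    Collector config field: logging_search_interval

    Default: 60s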

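  • ENV_INPUT_CONTAINER_ENABLE_CONTAINER_METRIC

    Enables container metric collection

    Type: Boolean

    Collector config field: enable_container_metric

    Default: true

  • ENV_INPUT_CONTAINER_ENABLE_K8S_METRIC

    Enables k8s metric collection

    Type: Boolean

    Collector config field: enable_k8s_metric

    Default: true

  • ENV_INPUT_CONTAINER_ENABLE_POD_METRIC

    Whether to enable Pod metric collection (CPU and memory usage)

    Type: Boolean

    Collector config field: enable_pod_metric

    Default: false

  • ENV_INPUT_CONTAINER_ENABLE_K8S_EVENT

    Whether to enable k8s event collection

    Type: Boolean

    Collector config field: enable_k8s_event

    Default: true

  • ENV_INPUT_CONTAINER_ENABLE_K8S_NODE_LOCAL

    Whether to enable per-Node collection mode, in which the DataKit deployed on each Node independently collects the resources of that Node. (Version-1.19.0) This requires additional RBAC permissions, see here

    Type: Boolean

    Collector config field: enable_k8s_node_local

    Default: true

  • ENV_INPUT_CONTAINER_ENABLE_COLLECT_KUBE_JOB

    Enables collection of Kubernetes Job resources (both metric data and object data)

    Type: Boolean

    Collector config field: enable_collect_kube_job

    Default: true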

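  • ENV_INPUT_CONTAINER_EXTRACT_K8S_LABEL_AS_TAGS_V2

    Appends resource labels to the tags of the data (excluding metric data). Label keys must be specified. If there is only one key and it is an empty string (for example [""]), all labels are added as tags. Containers inherit Pod labels. If a label key contains a dot character, the dot is converted to a hyphen

    Type: JSON

    Collector config field: extract_k8s_label_as_tags_v2

    Example: '["app","name"]'

  • ENV_INPUT_CONTAINER_EXTRACT_K8S_LABEL_AS_TAGS_V2_FOR_METRIC

    Appends resource labels to the tags of metric data. Label keys must be specified. If there is only one key and it is an empty string (for example [""]), all labels are added as tags. Containers inherit Pod labels. If a label key contains a dot character, the dot is converted to a hyphen

    Type: JSON

    Collector config field: extract_k8s_label_as_tags_v2_for_metric

    Example: '["app","name"]'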

  • ENV_INPUT_CONTAINER_ENABLE_AUTO_DISCOVERY_OF_PROMETHEUS_POD_ANNOTATIONS

    Whether to automatically discover Prometheus Pod Annotations and collect metrics

    Type: Boolean

    Collector config field: enable_auto_discovery_of_prometheus_pod_annotations

    Default: false

  • ENV_INPUT_CONTAINER_ENABLE_AUTO_DISCOVERY_OF_PROMETHEUS_SERVICE_ANNOTATIONS

    Whether to automatically discover Prometheus Service Annotations and collect metrics

    Type: Boolean

    Collector config field: enable_auto_discovery_of_prometheus_service_annotations

    Default: false

  • ENV_INPUT_CONTAINER_ENABLE_AUTO_DISCOVERY_OF_PROMETHEUS_POD_MONITORS

    Whether to automatically discover Prometheus PodMonitor CRDs and collect metrics; see the Prometheus-Operator CRD documentation

    Type: Boolean

    Collector config field: enable_auto_discovery_of_prometheus_pod_monitors

    Default: false

  • ENV_INPUT_CONTAINER_ENABLE_AUTO_DISCOVERY_OF_PROMETHEUS_SERVICE_MONITORS

    Whether to automatically discover Prometheus ServiceMonitor CRDs and collect metrics; see the Prometheus-Operator CRD documentation

    Type: Boolean

    Collector config field: enable_auto_discovery_of_prometheus_service_monitors

    Default: false

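  • ENV_INPUT_CONTAINER_CONTAINER_MAX_CONCURRENT

    Maximum concurrency when collecting container data. Setting it is recommended only when collection latency is high

    Type: Int

    Collector config field: container_max_concurrent

    Default: cpu cores + 1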

  • ENV_INPUT_CONTAINER_CONTAINER_INCLUDE_LOG

    Container log whitelist, filtered by image/namespace

    Type: List

    Collector config field: container_include_log

    Example: "image:pubrepo.jiagouyun.com/datakit/logfwd*"

  • ENV_INPUT_CONTAINER_CONTAINER_EXCLUDE_LOG

    Container log blacklist, filtered by image/namespace

    Type: List

    Collector config field: container_exclude_log

    Example: "image:pubrepo.jiagouyun.com/datakit/logfwd*"

  • ENV_INPUT_CONTAINER_POD_INCLUDE_METRIC

    Pod metric whitelist, filtered by namespace

    Type: List

    Collector config field: pod_include_metric

    Example: "namespace:datakit*"

  • ENV_INPUT_CONTAINER_POD_EXCLUDE_METRIC

    Pod metric blacklist, filtered by namespace

    Type: List

    Collector config field: pod_exclude_metric

    Example: "namespace:kube-system"

  • ENV_INPUT_CONTAINER_LOGGING_EXTRA_SOURCE_MAP

    Additional source matching for log collection; a source that matches the regular expression is renamed

    Type: Map

    Collector config field: logging_extra_source_map

    Example: source_regex*=new_source,regex*=new_source2

  • ENV_INPUT_CONTAINER_LOGGING_SOURCE_MULTILINE_MAP_JSON

    Specifies the multiline configuration for log collection by source

    Type: JSON

    Collector config field: logging_source_multiline_map

    Example: '{"source_nginx":"^\\d{4}", "source_redis":"^[A-Za-z_]"}'

  • ENV_INPUT_CONTAINER_LOGGING_AUTO_MULTILINE_DETECTION

    Whether to enable automatic multiline mode for log collection. When enabled, the applicable multiline rule is matched from the patterns list

    Type: Boolean

    Collector config field: logging_auto_multiline_detection

    Default: false

  • ENV_INPUT_CONTAINER_LOGGING_AUTO_MULTILINE_EXTRA_PATTERNS_JSON

    The patterns list for automatic multiline mode; multiple multiline rules can be configured manually

    Type: JSON

    Collector config field: logging_auto_multiline_extra_patterns

    Example: '["^\\d{4}-\\d{2}", "^[A-Za-z_]"]'

    Default: for the default rules, see the documentation

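  • ENV_INPUT_CONTAINER_LOGGING_REMOVE_ANSI_ESCAPE_CODES

    Removes color (ANSI escape) characters from collected logs; see the notes on handling special characters in logs

    Type: Boolean

    Collector config field: logging_remove_ansi_escape_codes

    Default: false

  • ENV_INPUT_CONTAINER_LOGGING_FILE_FROM_BEGINNING_THRESHOLD_SIZE

    Decides whether to use from_beginning based on file size: if the file is smaller than this value when it is discovered, collection starts from the beginning of the file

    Type: Int

    Collector config field: logging_file_from_beginning_threshold_size

    Default: 20,000,000

  • ENV_INPUT_CONTAINER_LOGGING_FILE_FROM_BEGINNING

    Whether to collect logs from the beginning of the file

    Type: Boolean

    Collector config field: logging_file_from_beginning

    Default: false

  • ENV_INPUT_CONTAINER_LOGGING_MAX_OPEN_FILES

    Maximum number of open files for log collection; -1 means no limit

    Type: Int

    Collector config field: logging_max_open_files

    Default: 500

  • ENV_INPUT_CONTAINER_LOGGING_FIELD_WHITE_LIST

    Retains only the fields specified in the whitelist

    Type: List

    Collector config field: logging_field_white_list

    Example: '["service","container_id"]'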

  • ENV_INPUT_CONTAINER_TAGS

    Custom tags. If the configuration file contains a tag with the same name, that tag is overwritten

    Type: Map

    Collector config field: tags

    Example: tag1=value1,tag2=value2
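
The label rules described for ENV_INPUT_CONTAINER_EXTRACT_K8S_LABEL_AS_TAGS_V2 in the list above (select the listed keys, treat [""] as "all labels", replace dots in keys) can be illustrated with a short Python sketch. The function name labels_to_tags is hypothetical, and the replacement character is assumed to be a hyphen; this shows the described rule and is not DataKit's implementation.

```python
def labels_to_tags(labels, keys):
    """Pick labels to use as tags, following the rules described above."""
    if keys == [""]:
        # A single empty-string key selects every label.
        selected = dict(labels)
    else:
        # Otherwise only the listed keys that actually exist are kept.
        selected = {key: labels[key] for key in keys if key in labels}
    # Assumption: dots in label keys are replaced with hyphens.
    return {key.replace(".", "-"): value for key, value in selected.items()}


pod_labels = {
    "app": "web",
    "app.kubernetes.io/name": "nginx",
    "tier": "frontend",
}

print(labels_to_tags(pod_labels, ["app", "name"]))
print(labels_to_tags(pod_labels, [""]))
```

Here ["app", "name"] yields only the app tag because the Pod has no label called name, while [""] yields all three labels with the dotted key rewritten.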

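Additional notes on the environment variables:

  • ENV_INPUT_CONTAINER_TAGS: if the configuration file (container.conf) contains a tag with the same name, it is overridden by the value configured here.

  • ENV_INPUT_CONTAINER_LOGGING_EXTRA_SOURCE_MAP: specifies source replacement. The parameter format is "regular expression=new_source": when a source matches the regular expression, it is replaced by new_source. If the replacement succeeds, the source configured in annotations/labels is no longer used (Version-1.4.7). For an exact match, wrap the content in ^ and $. For example, the regular expression datakit matches not only datakit but also datakit123, whereas ^datakit$ matches only datakit.

  • ENV_INPUT_CONTAINER_LOGGING_SOURCE_MULTILINE_MAP_JSON: specifies the mapping from source to multiline configuration. If a log has no multiline_match configured, the collector looks up the multiline_match for its source here and uses it. Because multiline_match values are regular expressions and can be fairly complex, the value is a JSON string; json.cn can help with writing it and compressing it onto one line.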
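
The two notes above, partial versus anchored matching and writing the multiline map as one-line JSON, can be sketched in Python as follows. The function name rename_source is hypothetical, and Python's re module stands in for DataKit's own regular expression engine, which may differ in details. For these simple patterns the behavior is assumed to be the same.

```python
import json
import re


def rename_source(source, source_map):
    """Return the new source for the first matching pattern, else the original."""
    for pattern, new_source in source_map.items():
        # re.search matches anywhere in the string unless the pattern is anchored.
        if re.search(pattern, source):
            return new_source
    return source


# An unanchored pattern also matches longer names; an anchored one does not.
print(rename_source("datakit123", {"datakit": "dk"}))
print(rename_source("datakit123", {"^datakit$": "dk"}))

# Build the multiline map as compact one-line JSON; json.dumps escapes the
# backslashes in the regular expressions.
multiline_map = {"source_nginx": r"^\d{4}", "source_redis": r"^[A-Za-z_]"}
encoded = json.dumps(multiline_map, separators=(",", ":"))
print(encoded)
```

Generating the JSON string from code rather than by hand avoids escaping mistakes in patterns that contain backslashes.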

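Note
  • The object collection interval is 5 minutes and the metric collection interval is 60 seconds; these are not configurable
  • A single collected log line (including lines produced by multiline_match) has a default maximum length of about 800KB; any excess is split into multiple log entries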
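
The splitting of oversized log lines mentioned in the note can be pictured with the following Python sketch. The constant MAX_LINE_BYTES and the function split_log are hypothetical: the exact limit behind "about 800KB" and the positions at which DataKit cuts a line are not stated in this document, so 800 * 1024 bytes and fixed-size chunks are assumptions.

```python
# Assumed value for the "about 800KB" limit described in the note.
MAX_LINE_BYTES = 800 * 1024


def split_log(line, limit=MAX_LINE_BYTES):
    """Split an oversized log entry into consecutive chunks of at most `limit` bytes."""
    if len(line) <= limit:
        return [line]
    return [line[start:start + limit] for start in range(0, len(line), limit)]


oversized = b"x" * (MAX_LINE_BYTES * 2 + 10)
chunks = split_log(oversized)
print(len(chunks), [len(chunk) for chunk in chunks])
```

Under these assumptions a line slightly longer than twice the limit is reported as three separate log entries.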

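Docker and Containerd sock File Configuration

If the sock path of Docker or Containerd is not the default one, the sock file path must be specified. How to do this depends on how DataKit is deployed. Taking Containerd as an example:

For a host installation, modify the endpoints option in container.conf and set it to the corresponding sock path.

For a Kubernetes deployment, change the containerd-socket entry under volumes in datakit.yaml so that the new path is mounted into DataKit, and also set the environment variable ENV_INPUT_CONTAINER_ENDPOINTS: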

# Add env (the value is a string; separate multiple endpoints with commas)
- env:
  - name: ENV_INPUT_CONTAINER_ENDPOINTS
    value: "unix:///path/to/new/containerd/containerd.sock"

# Modify mountPath
  - mountPath: /path/to/new/containerd/containerd.sock
    name: containerd-socket
    readOnly: true

# Modify volumes
volumes:
- hostPath:
    path: /path/to/new/containerd/containerd.sock
  name: containerd-socket

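The environment variable ENV_INPUT_CONTAINER_ENDPOINTS is appended to the existing endpoints configuration, so the final endpoints configuration may contain many entries. The collector removes duplicates and then connects to and collects from each one in turn.

The default endpoints configuration is: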

  endpoints = [
    "unix:///var/run/docker.sock",
    "unix:///var/run/containerd/containerd.sock",
    "unix:///var/run/crio/crio.sock",
  ] 

With the environment variable ENV_INPUT_CONTAINER_ENDPOINTS set to "unix:///path/to/new//run/containerd.sock", the final endpoints configuration is:

  endpoints = [
    "unix:///var/run/docker.sock",
    "unix:///var/run/containerd/containerd.sock",
    "unix:///var/run/crio/crio.sock",
    "unix:///path/to/new//run/containerd.sock",
  ] 

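The collector connects to and collects from each of these container runtimes. If a sock file does not exist, an error is logged when the first connection fails; this does not affect subsequent collection.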
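
The append-then-deduplicate behavior described above can be sketched in Python. The function name merge_endpoints is hypothetical, and the comma-separated format of the environment variable follows the example given for ENV_INPUT_CONTAINER_ENDPOINTS earlier in this document. This is an illustration of the described behavior, not DataKit's code.

```python
DEFAULT_ENDPOINTS = [
    "unix:///var/run/docker.sock",
    "unix:///var/run/containerd/containerd.sock",
    "unix:///var/run/crio/crio.sock",
]


def merge_endpoints(defaults, env_value):
    """Append comma-separated endpoints to the defaults and drop duplicates."""
    extra = [item.strip() for item in env_value.split(",") if item.strip()]
    # dict.fromkeys keeps the first occurrence of each endpoint, in order.
    return list(dict.fromkeys(list(defaults) + extra))


merged = merge_endpoints(DEFAULT_ENDPOINTS, "unix:///path/to/new//run/containerd.sock")
for endpoint in merged:
    print(endpoint)
```

Passing an endpoint that is already in the defaults leaves the list unchanged, which matches the statement that duplicates are removed before connecting.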

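Prometheus Exporter Metrics Collection

If a Pod or container exposes Prometheus metrics, there are two ways to collect them; see here

Log Collection

For the configuration of log collection, see here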


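Metrics

All data collected below carries a global tag named host by default (its value is the hostname of the machine where DataKit runs). Other tags can be specified in the configuration through [inputs.container.tags]: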

 [inputs.container.tags]
  # some_tag = "some_value"
  # more_tag = "some_other_value"
  # ...

kube_statefulset

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
namespace
(tag)
Namespace defines the space within which each name must be unique.
statefulset
(tag)
Name must be unique within a namespace.
uid
(tag)
The UID of StatefulSet.
replicas The number of Pods created by the StatefulSet controller.
Type: int
Unit: count
replicas_available Total number of available pods (ready for at least minReadySeconds) targeted by this StatefulSet.
Type: int
Unit: count
replicas_current The number of Pods created by the StatefulSet controller from the StatefulSet version indicated by currentRevision.
Type: int
Unit: count
replicas_desired The desired number of replicas of the given Template.
Type: int
Unit: count
replicas_ready The number of pods created for this StatefulSet with a Ready Condition.
Type: int
Unit: count
replicas_updated The number of Pods created by the StatefulSet controller from the StatefulSet version indicated by updateRevision.
Type: int
Unit: count

kube_replicaset

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
namespace
(tag)
Namespace defines the space within which each name must be unique.
replicaset
(tag)
Name must be unique within a namespace.
uid
(tag)
The UID of ReplicaSet.
fully_labeled_replicas The number of fully labeled replicas per ReplicaSet.
Type: int
Unit: count
replicas The most recently observed number of replicas.
Type: int
Unit: count
replicas_available The number of available replicas (ready for at least minReadySeconds) for this replica set.
Type: int
Unit: count
replicas_desired The number of desired replicas.
Type: int
Unit: count
replicas_ready The number of ready replicas for this replica set.
Type: int
Unit: count

kube_service

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
namespace
(tag)
Namespace defines the space within which each name must be unique.
service
(tag)
Name must be unique within a namespace.
uid
(tag)
The UID of Service
ports Total number of ports that are exposed by this service.
Type: int
Unit: count

kube_pod

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
daemonset
(tag)
The name of the DaemonSet which the object belongs to.
deployment
(tag)
The name of the Deployment which the object belongs to.
namespace
(tag)
Namespace defines the space within which each name must be unique.
node_name
(tag)
NodeName is a request to schedule this pod onto a specific node.
pod
(tag)
Name must be unique within a namespace.
pod_name
(tag)
Renamed from 'pod'.
statefulset
(tag)
The name of the StatefulSet which the object belongs to.
uid
(tag)
The UID of pod.
cpu_limit_millicores The total CPU limit (in millicores) across all containers in this Pod. Note: This value is the sum of all container limit values, as Pods do not have a direct limit value.
Type: int
Unit: milli-cores
cpu_number The total number of CPUs on the node where the Pod is running.
Type: int
Unit: count
cpu_request_millicores The total CPU request (in millicores) across all containers in this Pod. Note: This value is the sum of all container request values, as Pods do not have a direct request value.
Type: int
Unit: milli-cores
cpu_usage The total CPU usage across all containers in this Pod.
Type: float
Unit: percent,percent
cpu_usage_base100 The normalized CPU usage, with a maximum of 100%.
Type: float
Unit: percent,percent
cpu_usage_base_limit The normalized CPU usage, with a maximum of 100%, based on the CPU limit.
Type: float
Unit: percent,percent
cpu_usage_base_request The normalized CPU usage, with a maximum of 100%, based on the CPU request.
Type: float
Unit: percent,percent
cpu_usage_millicores The total CPU usage (in millicores) averaged over the sample window for all containers.
Type: int
Unit: milli-cores
ephemeral_storage_available_bytes The storage space available (bytes) for the filesystem.
Type: int
Unit: digital,B
ephemeral_storage_capacity_bytes The total capacity (bytes) of the filesystems underlying storage.
Type: int
Unit: digital,B
ephemeral_storage_used_bytes The bytes used for a specific task on the filesystem.
Type: int
Unit: digital,B
mem_capacity The total memory capacity of the host machine.
Type: int
Unit: digital,B
mem_limit The total memory limit across all containers in this Pod. Note: This value is the sum of all container limit values, as Pods do not have a direct limit value.
Type: int
Unit: digital,B
mem_request The total memory request across all containers in this Pod. Note: This value is the sum of all container request values, as Pods do not have a direct request value.
Type: int
Unit: digital,B
mem_rss The total RSS memory usage of all containers in this Pod, which is not supported by metrics-server.
Type: int
Unit: digital,B
mem_usage The total memory usage of all containers in this Pod.
Type: int
Unit: digital,B
mem_used_percent The percentage of memory usage based on the host machine’s total memory capacity.
Type: float
Unit: percent,percent
mem_used_percent_base_limit The percentage of memory usage based on the memory limit.
Type: float
Unit: percent,percent
mem_used_percent_base_request The percentage of memory usage based on the memory request.
Type: float
Unit: percent,percent
memory_capacity The total memory in the host machine (Deprecated use mem_capacity).
Type: int
Unit: digital,B
memory_usage_bytes The sum of the memory usage of all containers in this Pod (Deprecated use mem_usage).
Type: int
Unit: digital,B
memory_used_percent The percentage usage of the memory (refer to mem_used_percent).
Type: float
Unit: percent,percent
network_bytes_rcvd Cumulative count of bytes received.
Type: int
Unit: digital,B
network_bytes_sent Cumulative count of bytes transmitted.
Type: int
Unit: digital,B
ready Describes whether the pod is ready to serve requests.
Type: int
Unit: count
restarts The number of times the container has been restarted.
Type: int
Unit: count
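
The relationship between the mem_* fields of kube_pod can be illustrated with a short Python calculation. The helper percent_of and the sample values are hypothetical, and returning 0.0 when no limit or request is set is an assumption; the value DataKit reports in that case is not stated in this document.

```python
MIB = 1024 * 1024


def percent_of(used, base):
    """Return used/base as a percentage; 0.0 when the base is unset (assumption)."""
    return used / base * 100.0 if base > 0 else 0.0


# Hypothetical sample values for one Pod, in bytes.
mem_usage = 256 * MIB
mem_limit = 512 * MIB
mem_request = 128 * MIB
mem_capacity = 8192 * MIB

mem_used_percent = percent_of(mem_usage, mem_capacity)
mem_used_percent_base_limit = percent_of(mem_usage, mem_limit)
mem_used_percent_base_request = percent_of(mem_usage, mem_request)

print(mem_used_percent, mem_used_percent_base_limit, mem_used_percent_base_request)
```

The same usage value gives three different percentages depending on the base. In this sample the request-based percentage exceeds 100 because usage is higher than the request.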

kubernetes

Resource counts in Kubernetes.

Tags & Fields Description
namespace
(tag)
namespace
node_name
(tag)
NodeName is a request to schedule this pod onto a specific node (only supported for Pod and Container).
container Container count
Type: int | (count)
Unit: -
cronjob CronJob count
Type: int | (count)
Unit: -
daemonset DaemonSet count
Type: int | (count)
Unit: -
deployment Deployment count
Type: int | (count)
Unit: -
endpoint Endpoint count
Type: int | (count)
Unit: -
job Job count
Type: int | (count)
Unit: -
node Node count
Type: int | (count)
Unit: -
pod Pod count
Type: int | (count)
Unit: -
replicaset ReplicaSet count
Type: int | (count)
Unit: -
service Service count
Type: int | (count)
Unit: -
statefulset StatefulSet count
Type: int | (count)
Unit: -

kube_cronjob

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
cronjob
(tag)
Name must be unique within a namespace.
namespace
(tag)
Namespace defines the space within which each name must be unique.
uid
(tag)
The UID of CronJob.
spec_suspend This flag tells the controller to suspend subsequent executions.
Type: bool
Unit: -

kube_daemonset

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
daemonset
(tag)
Name must be unique within a namespace.
namespace
(tag)
Namespace defines the space within which each name must be unique.
uid
(tag)
The UID of DaemonSet.
daemons_available The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and available (ready for at least spec.minReadySeconds).
Type: int
Unit: count
daemons_unavailable The number of nodes that should be running the daemon pod and have none of the daemon pod running and available (ready for at least spec.minReadySeconds).
Type: int
Unit: count
desired The total number of nodes that should be running the daemon pod (including nodes correctly running the daemon pod).
Type: int
Unit: count
misscheduled The number of nodes that are running the daemon pod, but are not supposed to run the daemon pod.
Type: int
Unit: count
ready The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and ready.
Type: int
Unit: count
scheduled The number of nodes that are running at least one daemon pod and are supposed to run the daemon pod.
Type: int
Unit: count
updated The total number of nodes that are running updated daemon pod.
Type: int
Unit: count

kube_deployment

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
deployment
(tag)
Name must be unique within a namespace.
namespace
(tag)
Namespace defines the space within which each name must be unique.
uid
(tag)
The UID of Deployment.
replicas Total number of non-terminated pods targeted by this deployment (their labels match the selector).
Type: int
Unit: count
replicas_available Total number of available pods (ready for at least minReadySeconds) targeted by this deployment.
Type: int
Unit: count
replicas_desired Number of desired pods for a Deployment.
Type: int
Unit: count
replicas_ready The number of pods targeted by this Deployment with a Ready Condition.
Type: int
Unit: count
replicas_unavailable Total number of unavailable pods targeted by this deployment.
Type: int
Unit: count
replicas_updated Total number of non-terminated pods targeted by this deployment that have the desired template spec.
Type: int
Unit: count
rollingupdate_max_surge The maximum number of pods that can be scheduled above the desired number of pods.
Type: int
Unit: count
rollingupdate_max_unavailable The maximum number of pods that can be unavailable during the update.
Type: int
Unit: count

kube_dfpv

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
name
(tag)
The dfpv name, consisting of the PVC name and the Pod name.
namespace
(tag)
The namespace of Pod and PVC.
node_name
(tag)
Reference to the Node.
pod_name
(tag)
Reference to the Pod.
pvc_name
(tag)
Reference to the PVC.
volume_mount_name
(tag)
The name given to the Volume.
available AvailableBytes represents the storage space available (bytes) for the filesystem.
Type: int
Unit: digital,B
capacity CapacityBytes represents the total capacity (bytes) of the filesystems underlying storage.
Type: int
Unit: digital,B
inodes Inodes represents the total inodes in the filesystem.
Type: int
Unit: count
inodes_free InodesFree represents the free inodes in the filesystem.
Type: int
Unit: count
inodes_used InodesUsed represents the inodes used by the filesystem.
Type: int
Unit: count
used UsedBytes represents the bytes used for a specific task on the filesystem.
Type: int
Unit: digital,B

kube_endpoint

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
endpoint
(tag)
Name must be unique within a namespace.
namespace
(tag)
Namespace defines the space within which each name must be unique.
uid
(tag)
The UID of Endpoint.
address_available Number of addresses available in endpoint.
Type: int
Unit: count
address_not_ready Number of addresses not ready in endpoint.
Type: int
Unit: count

kube_job

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
job
(tag)
Name must be unique within a namespace.
namespace
(tag)
Namespace defines the space within which each name must be unique.
uid
(tag)
The UID of Job.
active The number of actively running pods.
Type: int
Unit: count
completion_failed The job has failed its execution.
Type: int
Unit: count
completion_succeeded The job has completed its execution.
Type: int
Unit: count
failed The number of pods which reached phase Failed.
Type: int
Unit: count
succeeded The number of pods which reached phase Succeeded.
Type: int
Unit: count

kube_node

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
node
(tag)
Name must be unique within a namespace
uid
(tag)
The UID of Node.
cpu_allocatable The allocatable CPU of a node that is available for scheduling.
Type: int
Unit: -
cpu_capacity The CPU capacity of a node.
Type: int
Unit: -
ephemeral_storage_allocatable The allocatable ephemeral-storage of a node that is available for scheduling.
Type: int
Unit: -
ephemeral_storage_capacity The ephemeral-storage capacity of a node.
Type: int
Unit: -
memory_allocatable The allocatable memory of a node that is available for scheduling.
Type: int
Unit: -
memory_capacity The memory capacity of a node.
Type: int
Unit: -
pods_allocatable The allocatable pods of a node that is available for scheduling.
Type: int
Unit: -
pods_capacity The pods capacity of a node.
Type: int
Unit: -

docker_containers

Container metric fields (only running containers can be collected)

Tags & Fields Description
aws_ecs_cluster_name
(tag)
Cluster name of the AWS ECS.
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
container_id
(tag)
Container ID
container_name
(tag)
Container name from k8s (label io.kubernetes.container.name). If empty then use $container_runtime_name.
container_runtime
(tag)
Container runtime (this container from Docker/Containerd/cri-o).
container_runtime_name
(tag)
Container name from runtime (like 'docker ps'). If empty then use 'unknown'.
container_runtime_version
(tag)
Container runtime version.
container_type
(tag)
The type of the container (this container is created by Kubernetes/Docker/Containerd/cri-o).
daemonset
(tag)
The name of the DaemonSet which the object belongs to.
deployment
(tag)
The name of the Deployment which the object belongs to.
image
(tag)
The full name of the container image, example nginx.org/nginx:1.21.0.
image_name
(tag)
The name of the container image, example nginx.org/nginx.
image_short_name
(tag)
The short name of the container image, example nginx.
image_tag
(tag)
The tag of the container image, example 1.21.0.
namespace
(tag)
The namespace of the container (label io.kubernetes.pod.namespace).
pod_name
(tag)
The pod name of the container (label io.kubernetes.pod.name).
pod_uid
(tag)
The pod uid of the container (label io.kubernetes.pod.uid).
state
(tag)
Container status (only Running).
statefulset
(tag)
The name of the StatefulSet which the object belongs to.
task_arn
(tag)
The task ARN of AWS Fargate.
task_family
(tag)
The task family of AWS Fargate.
task_version
(tag)
The task version of AWS Fargate.
block_read_byte Total number of bytes read from the container file system (only supported for Docker).
Type: int
Unit: digital,B
block_write_byte Total number of bytes written to the container file system (only supported for Docker).
Type: int
Unit: digital,B
cpu_limit_millicores The CPU limit of the container, measured in milli-cores.
Type: int
Unit: milli-cores
cpu_numbers The number of CPU cores on the system host.
Type: int
Unit: count
cpu_request_millicores The CPU request of the container, measured in milli-cores (only supported in Kubernetes).
Type: int
Unit: milli-cores
cpu_usage The actual CPU usage on the system host (percentage).
Type: float
Unit: percent,percent
cpu_usage_base100 The normalized CPU usage, with a maximum value of 100%. It is calculated as the number of CPU cores multiplied by 100.
Type: float
Unit: percent,percent
cpu_usage_base_limit The CPU usage based on the CPU limit (percentage).
Type: float
Unit: percent,percent
cpu_usage_base_request The CPU usage based on the CPU request (percentage) (only supported in Kubernetes).
Type: float
Unit: percent,percent
cpu_usage_millicores The CPU usage of the container, measured in milli-cores.
Type: int
Unit: milli-cores
mem_capacity The total memory on the system host.
Type: int
Unit: digital,B
mem_limit The memory limit of the container.
Type: int
Unit: digital,B
mem_request The memory request of the container (only supported in Kubernetes).
Type: int
Unit: digital,B
mem_usage The actual memory usage of the container.
Type: int
Unit: digital,B
mem_used_percent The memory usage percentage based on the total memory of the system host.
Type: float
Unit: percent,percent
mem_used_percent_base_limit The memory usage percentage based on the memory limit.
Type: float
Unit: percent,percent
mem_used_percent_base_request The memory usage percentage based on the memory request (only supported in Kubernetes).
Type: float
Unit: percent,percent
network_bytes_rcvd Total number of bytes received from the network (only count the usage of the main process in the container, excluding loopback).
Type: int
Unit: digital,B
network_bytes_sent Total number of bytes sent to the network (only count the usage of the main process in the container, excluding loopback).
Type: int
Unit: digital,B
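
The image, image_name, image_short_name, and image_tag tags in the table above are related as in the nginx.org/nginx:1.21.0 example. A minimal Python sketch of that decomposition follows. The function split_image is hypothetical and written only to reproduce the documented example; it does not handle digest references such as name@sha256:..., and DataKit's own parsing may differ.

```python
def split_image(image):
    """Split a full image reference into the name, short name, and tag parts."""
    last_slash = image.rfind("/")
    last_colon = image.rfind(":")
    # A colon only introduces a tag when it comes after the last slash;
    # otherwise it belongs to a registry port such as localhost:5000.
    if last_colon > last_slash:
        image_name, image_tag = image[:last_colon], image[last_colon + 1:]
    else:
        image_name, image_tag = image, ""
    return {
        "image": image,
        "image_name": image_name,
        "image_short_name": image_name[last_slash + 1:],
        "image_tag": image_tag,
    }


print(split_image("nginx.org/nginx:1.21.0"))
```

For the documented example this yields image_name nginx.org/nginx, image_short_name nginx, and image_tag 1.21.0.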

Objects

kubernetes_statefulsets

Tags & Fields Description
<ALL-SELECTOR-MATCH-LABELS>
(tag)
Represents the selector.matchLabels for Kubernetes resources
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
name
(tag)
The UID of StatefulSet.
namespace
(tag)
Namespace defines the space within which each name must be unique.
statefulset_name
(tag)
Name must be unique within a namespace.
uid
(tag)
The UID of StatefulSet.
workload_name
(tag)
The name of the workload resource.
age Age (seconds)
Type: int
Unit: time,s
message Object details
Type: string
Unit: -
replicas The number of Pods created by the StatefulSet controller.
Type: int
Unit: count
replicas_available Total number of available pods (ready for at least minReadySeconds) targeted by this StatefulSet.
Type: int
Unit: count
replicas_current The number of Pods created by the StatefulSet controller from the StatefulSet version indicated by currentRevision.
Type: int
Unit: count
replicas_desired The desired number of replicas of the given Template.
Type: int
Unit: count
replicas_ready The number of pods created for this StatefulSet with a Ready Condition.
Type: int
Unit: count
replicas_updated The number of Pods created by the StatefulSet controller from the StatefulSet version indicated by updateRevision.
Type: int
Unit: count

kubernetes_replica_sets

Tags & Fields Description
<ALL-SELECTOR-MATCH-LABELS>
(tag)
Represents the selector.matchLabels for Kubernetes resources
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
deployment
(tag)
The name of the Deployment which the object belongs to.
name
(tag)
The UID of ReplicaSet.
namespace
(tag)
Namespace defines the space within which each name must be unique.
replicaset_name
(tag)
Name must be unique within a namespace.
statefulset
(tag)
The name of the StatefulSet which the object belongs to.
uid
(tag)
The UID of ReplicaSet.
workload_name
(tag)
The name of the workload resource.
age Age (seconds)
Type: int
Unit: time,s
available The number of available replicas (ready for at least minReadySeconds) for this replica set. (Deprecated)
Type: int
Unit: -
message Object details
Type: string
Unit: -
ready The number of ready replicas for this replica set. (Deprecated)
Type: int
Unit: -
replicas The most recently observed number of replicas.
Type: int
Unit: count
replicas_available The number of available replicas (ready for at least minReadySeconds) for this replica set.
Type: int
Unit: count
replicas_desired The number of desired replicas.
Type: int
Unit: count
replicas_ready The number of ready replicas for this replica set.
Type: int
Unit: count

kubernetes_services

Tags & Fields Description
<ALL-SELECTOR>
(tag)
Represents the selector for Kubernetes resources
cluster_name_k8s
(tag)
K8s cluster name (defaults to "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S.
name
(tag)
The UID of Service
namespace
(tag)
Namespace defines the space within which each name must be unique.
service_name
(tag)
Name must be unique within a namespace.
type
(tag)
Type determines how the Service is exposed. Defaults to ClusterIP. (ClusterIP/NodePort/LoadBalancer/ExternalName)
uid
(tag)
The UID of Service
workload_name
(tag)
The name of the workload resource.
age Age (seconds)
Type: int
Unit: time,s
cluster_ip ClusterIP is the IP address of the service and is usually assigned randomly by the master.
Type: string
Unit: -
external_ips ExternalIPs is a list of IP addresses for which nodes in the cluster will also accept traffic for this service.
Type: string
Unit: -
external_name ExternalName is the external reference that kubedns or equivalent will return as a CNAME record for this service.
Type: string
Unit: -
external_traffic_policy ExternalTrafficPolicy denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints.
Type: string
Unit: -
message Object details
Type: string
Unit: -
session_affinity Supports "ClientIP" and "None".
Type: string
Unit: -

kubelet_pod

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name(default is default). We can rename it in datakit.yaml on ENV_CLUSTER_NAME_K8S.
daemonset
(tag)
The name of the DaemonSet which the object belongs to.
deployment
(tag)
The name of the Deployment which the object belongs to.
host
(tag)
Pointing to the node where the pod is located.
name
(tag)
The UID of Pod.
namespace
(tag)
Namespace defines the space within which each name must be unique.
node_name
(tag)
NodeName is a request to schedule this pod onto a specific node.
phase
(tag)
The phase of a Pod is a simple, high-level summary of where the Pod is in its lifecycle.(Pending/Running/Succeeded/Failed/Unknown)
pod_name
(tag)
Name must be unique within a namespace.
qos_class
(tag)
The Quality of Service (QOS) classification assigned to the pod based on resource requirements
statefulset
(tag)
The name of the StatefulSet which the object belongs to.
status
(tag)
Reason the container is not yet running.
uid
(tag)
The UID of Pod.
workload_name
(tag)
The name of the workload resource.
age Age (seconds)
Type: int
Unit: time,s
available Number of containers
Type: int
Unit: count
cpu_limit_millicores The total CPU limit (in millicores) across all containers in this Pod. Note: This value is the sum of all container limit values, as Pods do not have a direct limit value.
Type: int
Unit: milli-cores
cpu_number The total number of CPUs on the node where the Pod is running.
Type: int
Unit: count
cpu_request_millicores The total CPU request (in millicores) across all containers in this Pod. Note: This value is the sum of all container request values, as Pods do not have a direct request value.
Type: int
Unit: milli-cores
cpu_usage The total CPU usage across all containers in this Pod.
Type: float
Unit: percent,percent
cpu_usage_base100 The normalized CPU usage, with a maximum of 100%.
Type: float
Unit: percent,percent
cpu_usage_base_limit The normalized CPU usage, with a maximum of 100%, based on the CPU limit.
Type: float
Unit: percent,percent
cpu_usage_base_request The normalized CPU usage, with a maximum of 100%, based on the CPU request.
Type: float
Unit: percent,percent
cpu_usage_millicores The total CPU usage (in millicores) averaged over the sample window for all containers.
Type: int
Unit: milli-cores
ephemeral_storage_available_bytes The storage space available (bytes) for the filesystem.
Type: int
Unit: digital,B
ephemeral_storage_capacity_bytes The total capacity (bytes) of the filesystems underlying storage.
Type: int
Unit: digital,B
ephemeral_storage_used_bytes The bytes used for a specific task on the filesystem.
Type: int
Unit: digital,B
mem_capacity The total memory capacity of the host machine.
Type: int
Unit: digital,B
mem_limit The total memory limit across all containers in this Pod. Note: This value is the sum of all container limit values, as Pods do not have a direct limit value.
Type: int
Unit: digital,B
mem_request The total memory request across all containers in this Pod. Note: This value is the sum of all container request values, as Pods do not have a direct request value.
Type: int
Unit: digital,B
mem_rss The total RSS memory usage of all containers in this Pod, which is not supported by metrics-server.
Type: int
Unit: digital,B
mem_usage The total memory usage of all containers in this Pod.
Type: int
Unit: digital,B
mem_used_percent The percentage of memory usage based on the host machine’s total memory capacity.
Type: float
Unit: percent,percent
mem_used_percent_base_100 The percentage usage of the memory (refer to mem_used_percent).
Type: float
Unit: percent,percent
mem_used_percent_base_limit The percentage of memory usage based on the memory limit.
Type: float
Unit: percent,percent
mem_used_percent_base_request The percentage of memory usage based on the memory request.
Type: float
Unit: percent,percent
memory_capacity The total memory in the host machine (Deprecated, use mem_capacity).
Type: int
Unit: digital,B
memory_usage_bytes The sum of the memory usage of all containers in this Pod (Deprecated, use mem_usage).
Type: int
Unit: digital,B
memory_used_percent The percentage usage of the memory (refer to mem_used_percent).
Type: float
Unit: percent,percent
message Object details
Type: string
Unit: -
network_bytes_rcvd Cumulative count of bytes received.
Type: int
Unit: digital,B
network_bytes_sent Cumulative count of bytes transmitted.
Type: int
Unit: digital,B
ready Describes whether the pod is ready to serve requests.
Type: int
Unit: count
restarts The number of times the container has been restarted.
Type: int
Unit: count
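
The three memory percentage fields above differ only in the base they divide by. The following Python sketch illustrates that relationship; it is not DataKit's implementation, the function names are hypothetical, and returning 0.0 when a limit or request is unset is an assumption made for the example.

```python
def percent(used: int, base: int) -> float:
    """Return used/base as a percentage, or 0.0 when the base is unset (assumed behavior)."""
    return used / base * 100.0 if base > 0 else 0.0


def pod_memory_percentages(mem_usage: int, mem_capacity: int,
                           mem_limit: int, mem_request: int) -> dict:
    """Compute the three percentages from the byte-valued fields of the same name."""
    return {
        "mem_used_percent": percent(mem_usage, mem_capacity),            # base: host capacity
        "mem_used_percent_base_limit": percent(mem_usage, mem_limit),    # base: summed limits
        "mem_used_percent_base_request": percent(mem_usage, mem_request),  # base: summed requests
    }


MiB = 1024 * 1024
print(pod_memory_percentages(256 * MiB, 1024 * MiB, 512 * MiB, 0))
```

A Pod using 256 MiB on a 1 GiB host with a 512 MiB limit reads as 25% against the host but 50% against its limit, which is why the limit-based field is usually the more useful alerting signal.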

kubernetes_persistentvolumeclaims

Tags & Fields Description
<ALL-SELECTOR-MATCH-LABELS>
(tag)
Represents the selector.matchLabels for Kubernetes resources
cluster_name_k8s
(tag)
K8s cluster name(default is default). We can rename it in datakit.yaml on ENV_CLUSTER_NAME_K8S.
name
(tag)
The UID of PersistentVolume.
namespace
(tag)
Namespace defines the space within which each name must be unique.
persistentvolumeclaim_name
(tag)
Name must be unique within a namespace.
uid
(tag)
The UID of PersistentVolume.
workload_name
(tag)
The name of the workload resource.
access_modes AccessModes contains the desired access modes the volume should have.
Type: string
Unit: -
age Age (seconds)
Type: int
Unit: time,s
message Object details
Type: string
Unit: -
phase The phase indicates if a volume is available, bound to a claim, or released by a claim.(Pending/Bound/Lost)
Type: string
Unit: -
requests_storage Specifies the maximum storage capacity of a PersistentVolume (PV), which Kubernetes uses for scheduling and resource allocation.
Type: string
Unit: -
storage_class_name StorageClassName is the name of the StorageClass required by the claim.
Type: string
Unit: -
volume_mode VolumeMode defines what type of volume is required by the claim.(Block/Filesystem)
Type: string
Unit: -
volume_name VolumeName is the binding reference to the PersistentVolume backing this claim.
Type: string
Unit: -

kubernetes_persistentvolumes

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name(default is default). We can rename it in datakit.yaml on ENV_CLUSTER_NAME_K8S.
name
(tag)
The UID of PersistentVolume.
persistentvolume_name
(tag)
The name of PersistentVolume
uid
(tag)
The UID of PersistentVolume.
workload_name
(tag)
The name of the workload resource.
access_modes AccessModes contains the desired access modes the volume should have.
Type: string
Unit: -
age Age (seconds)
Type: int
Unit: time,s
capacity_storage Specifies the maximum storage capacity of a PersistentVolume (PV), which Kubernetes uses for scheduling and resource allocation.
Type: string
Unit: -
claimRef_name Name of the bound PersistentVolumeClaim.
Type: string
Unit: -
claimRef_namespace Namespace of the PersistentVolumeClaim.
Type: string
Unit: -
message Object details
Type: string
Unit: -
phase The phase indicates if a volume is available, bound to a claim, or released by a claim.(Pending/Available/Bound/Released/Failed)
Type: string
Unit: -

kubernetes_cron_jobs

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name(default is default). We can rename it in datakit.yaml on ENV_CLUSTER_NAME_K8S.
cron_job_name
(tag)
Name must be unique within a namespace.
name
(tag)
The UID of CronJob.
namespace
(tag)
Namespace defines the space within which each name must be unique.
uid
(tag)
The UID of CronJob.
workload_name
(tag)
The name of the workload resource.
active_jobs The number of pointers to currently running jobs.
Type: int
Unit: count
age Age (seconds)
Type: int
Unit: time,s
message Object details
Type: string
Unit: -
schedule The schedule in Cron format, see doc
Type: string
Unit: -
suspend This flag tells the controller to suspend subsequent executions, it does not apply to already started executions.
Type: bool
Unit: -

kubernetes_daemonset

Tags & Fields Description
<ALL-SELECTOR-MATCH-LABELS>
(tag)
Represents the selector.matchLabels for Kubernetes resources
cluster_name_k8s
(tag)
K8s cluster name(default is default). We can rename it in datakit.yaml on ENV_CLUSTER_NAME_K8S.
daemonset_name
(tag)
Name must be unique within a namespace.
name
(tag)
The UID of DaemonSet.
namespace
(tag)
Namespace defines the space within which each name must be unique.
uid
(tag)
The UID of DaemonSet.
workload_name
(tag)
The name of the workload resource.
age Age (seconds)
Type: int
Unit: time,s
daemons_available The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and available (ready for at least spec.minReadySeconds).
Type: int
Unit: count
daemons_unavailable The number of nodes that should be running the daemon pod and have none of the daemon pod running and available (ready for at least spec.minReadySeconds).
Type: int
Unit: count
desired The total number of nodes that should be running the daemon pod (including nodes correctly running the daemon pod).
Type: int
Unit: count
message Object details
Type: string
Unit: -
misscheduled The number of nodes that are running the daemon pod, but are not supposed to run the daemon pod.
Type: int
Unit: count
ready The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and ready.
Type: int
Unit: count
scheduled The number of nodes that are running at least one daemon pod and are supposed to run the daemon pod.
Type: int
Unit: count
updated The total number of nodes that are running updated daemon pod.
Type: int
Unit: count

kubernetes_deployments

Tags & Fields Description
<ALL-SELECTOR-MATCH-LABELS>
(tag)
Represents the selector.matchLabels for Kubernetes resources
cluster_name_k8s
(tag)
K8s cluster name(default is default). We can rename it in datakit.yaml on ENV_CLUSTER_NAME_K8S.
deployment_name
(tag)
Name must be unique within a namespace.
name
(tag)
The UID of Deployment.
namespace
(tag)
Namespace defines the space within which each name must be unique.
uid
(tag)
The UID of Deployment.
workload_name
(tag)
The name of the workload resource.
age Age (seconds)
Type: int
Unit: time,s
available Total number of available pods (ready for at least minReadySeconds) targeted by this deployment. (Deprecated)
Type: int
Unit: count
max_surge The maximum number of pods that can be scheduled above the desired number of pods. (Deprecated)
Type: int
Unit: count
max_unavailable The maximum number of pods that can be unavailable during the update. (Deprecated)
Type: int
Unit: count
message Object details
Type: string
Unit: -
paused Indicates that the deployment is paused (true or false).
Type: bool
Unit: -
ready The number of pods targeted by this Deployment with a Ready Condition. (Deprecated)
Type: int
Unit: count
replicas Total number of non-terminated pods targeted by this deployment (their labels match the selector).
Type: int
Unit: count
replicas_available Total number of available pods (ready for at least minReadySeconds) targeted by this deployment.
Type: int
Unit: count
replicas_desired Number of desired pods for a Deployment.
Type: int
Unit: count
replicas_ready The number of pods targeted by this Deployment with a Ready Condition.
Type: int
Unit: count
replicas_unavailable Total number of unavailable pods targeted by this deployment.
Type: int
Unit: count
replicas_updated Total number of non-terminated pods targeted by this deployment that have the desired template spec.
Type: int
Unit: count
rollingupdate_max_surge The maximum number of pods that can be scheduled above the desired number of pods.
Type: int
Unit: count
rollingupdate_max_unavailable The maximum number of pods that can be unavailable during the update.
Type: int
Unit: count
strategy Type of deployment. Can be "Recreate" or "RollingUpdate". Default is RollingUpdate.
Type: string
Unit: -
unavailable Total number of unavailable pods targeted by this deployment. (Deprecated)
Type: int
Unit: count
up_dated Total number of non-terminated pods targeted by this deployment that have the desired template spec. (Deprecated)
Type: int
Unit: count

kubernetes_dfpv

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name(default is default). We can rename it in datakit.yaml on ENV_CLUSTER_NAME_K8S.
name
(tag)
The dfpv name, consists of pvc name and pod name
namespace
(tag)
The namespace of Pod and PVC.
node_name
(tag)
Reference to the Node.
pod_name
(tag)
Reference to the Pod.
pvc_name
(tag)
Reference to the PVC.
volume_mount_name
(tag)
The name given to the Volume.
available AvailableBytes represents the storage space available (bytes) for the filesystem.
Type: int
Unit: digital,B
capacity CapacityBytes represents the total capacity (bytes) of the filesystems underlying storage.
Type: int
Unit: digital,B
inodes Inodes represents the total inodes in the filesystem.
Type: int
Unit: count
inodes_free InodesFree represents the free inodes in the filesystem.
Type: int
Unit: count
inodes_used InodesUsed represents the inodes used by the filesystem.
Type: int
Unit: count
message Object details
Type: string
Unit: -
used UsedBytes represents the bytes used for a specific task on the filesystem.
Type: int
Unit: digital,B

kubernetes_jobs

Tags & Fields Description
<ALL-SELECTOR-MATCH-LABELS>
(tag)
Represents the selector.matchLabels for Kubernetes resources
cluster_name_k8s
(tag)
K8s cluster name(default is default). We can rename it in datakit.yaml on ENV_CLUSTER_NAME_K8S.
job_name
(tag)
Name must be unique within a namespace.
name
(tag)
The UID of Job.
namespace
(tag)
Namespace defines the space within which each name must be unique.
uid
(tag)
The UID of Job.
workload_name
(tag)
The name of the workload resource.
active The number of actively running pods.
Type: int
Unit: count
active_deadline Specifies the duration in seconds relative to the startTime that the job may be active before the system tries to terminate it
Type: int
Unit: time,s
age Age (seconds)
Type: int
Unit: time,s
backoff_limit Specifies the number of retries before marking this job failed.
Type: int
Unit: count
completions Specifies the desired number of successfully finished pods the job should be run with.
Type: int
Unit: count
failed The number of pods which reached phase Failed.
Type: int
Unit: count
message Object details
Type: string
Unit: -
parallelism Specifies the maximum desired number of pods the job should run at any given time.
Type: int
Unit: count
succeeded The number of pods which reached phase Succeeded.
Type: int
Unit: count

kubernetes_nodes

Tags & Fields Description
cluster_name_k8s
(tag)
K8s cluster name(default is default). We can rename it in datakit.yaml on ENV_CLUSTER_NAME_K8S.
internal_ip
(tag)
Node internal IP
name
(tag)
The UID of Node.
node_name
(tag)
Name must be unique within a namespace.
role
(tag)
Node role. (master/node)
status
(tag)
NodePhase is the recently observed lifecycle phase of the node. (Pending/Running/Terminated)
uid
(tag)
The UID of Node.
workload_name
(tag)
The name of the workload resource.
age Age (seconds).
Type: int
Unit: time,s
kubelet_version Kubelet Version reported by the node.
Type: string
Unit: -
message Object details.
Type: string
Unit: -
node_ready NodeReady means kubelet is healthy and ready to accept pods (true/false/unknown).
Type: string
Unit: -
taints Node's taints.
Type: string
Unit: -
unschedulable Unschedulable controls node schedulability of new pods (yes/no).
Type: string
Unit: -

docker_containers

Container object fields (only running containers can be collected)

Tags & Fields Description
aws_ecs_cluster_name
(tag)
Cluster name of the AWS ECS.
cluster_name_k8s
(tag)
K8s cluster name(default is default). We can rename it in datakit.yaml on ENV_CLUSTER_NAME_K8S.
container_id
(tag)
Container ID.
container_name
(tag)
Container name from k8s (label io.kubernetes.container.name). If empty then use $container_runtime_name.
container_runtime
(tag)
Container runtime (this container from Docker/Containerd/cri-o).
container_runtime_name
(tag)
Container name from runtime (like 'docker ps'). If empty then use 'unknown'.
container_runtime_version
(tag)
Container runtime version.
container_type
(tag)
The type of the container (this container is created by Kubernetes/Docker/Containerd/cri-o).
daemonset
(tag)
The name of the DaemonSet which the object belongs to.
deployment
(tag)
The name of the Deployment which the object belongs to.
image
(tag)
The full name of the container image, example nginx.org/nginx:1.21.0.
image_name
(tag)
The name of the container image, example nginx.org/nginx.
image_short_name
(tag)
The short name of the container image, example nginx.
image_tag
(tag)
The tag of the container image, example 1.21.0.
name
(tag)
The ID of the container.
namespace
(tag)
The namespace of the container (label io.kubernetes.pod.namespace).
pod_name
(tag)
The pod name of the container (label io.kubernetes.pod.name).
pod_uid
(tag)
The pod uid of the container (label io.kubernetes.pod.uid).
state
(tag)
The state of the Container (only Running).
statefulset
(tag)
The name of the StatefulSet which the object belongs to.
status
(tag)
The status of the container, for example Up 5 hours.
task_arn
(tag)
The task arn of the AWS Fargate.
task_family
(tag)
The task family of the AWS Fargate.
task_version
(tag)
The task version of the AWS Fargate.
workload_name
(tag)
The name of the workload resource.
age Age (seconds).
Type: int
Unit: time,s
block_read_byte Total number of bytes read from the container file system (only supported on Docker).
Type: int
Unit: digital,B
block_write_byte Total number of bytes written to the container file system (only supported on Docker).
Type: int
Unit: digital,B
cpu_limit_millicores The CPU limit of the container, measured in milli-cores.
Type: int
Unit: milli-cores
cpu_numbers The number of CPU cores on the system host.
Type: int
Unit: count
cpu_request_millicores The CPU request of the container, measured in milli-cores (only supported in Kubernetes).
Type: int
Unit: milli-cores
cpu_usage The actual CPU usage on the system host (percentage).
Type: float
Unit: percent,percent
cpu_usage_base100 The normalized CPU usage, with a maximum value of 100%. It is calculated as the number of CPU cores multiplied by 100.
Type: float
Unit: percent,percent
cpu_usage_base_limit The CPU usage based on the CPU limit (percentage).
Type: float
Unit: percent,percent
cpu_usage_base_request The CPU usage based on the CPU request (percentage) (only supported in Kubernetes).
Type: float
Unit: percent,percent
cpu_usage_millicores The CPU usage of the container, measured in milli-cores.
Type: int
Unit: milli-cores
mem_capacity The total memory on the system host.
Type: int
Unit: digital,B
mem_limit The memory limit of the container.
Type: int
Unit: digital,B
mem_request The memory request of the container (only supported in Kubernetes).
Type: int
Unit: digital,B
mem_usage The actual memory usage of the container.
Type: int
Unit: digital,B
mem_used_percent The memory usage percentage based on the total memory of the system host.
Type: float
Unit: percent,percent
mem_used_percent_base_limit The memory usage percentage based on the memory limit.
Type: float
Unit: percent,percent
mem_used_percent_base_request The memory usage percentage based on the memory request (only supported in Kubernetes).
Type: float
Unit: percent,percent
message Object details.
Type: string
Unit: -
network_bytes_rcvd Total number of bytes received from the network (only count the usage of the main process in the container, excluding loopback).
Type: int
Unit: digital,B
network_bytes_sent Total number of bytes sent to the network (only counts the usage of the main process in the container, excluding loopback).
Type: int
Unit: digital,B
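
The image, image_name, image_short_name, and image_tag tags above are four views of one image reference. The sketch below reproduces the documented example (nginx.org/nginx:1.21.0) in Python; it is an illustration rather than DataKit's parser, split_image is a hypothetical name, and digest references (@sha256:...) are deliberately not handled.

```python
def split_image(image: str) -> dict:
    """Split an image reference into the four tag values described above."""
    last_slash = image.rfind("/")
    last_colon = image.rfind(":")
    # A colon that appears before the last slash belongs to a registry port, not a tag.
    if last_colon > last_slash:
        name, tag = image[:last_colon], image[last_colon + 1:]
    else:
        name, tag = image, ""
    return {
        "image": image,
        "image_name": name,
        "image_short_name": name[last_slash + 1:],
        "image_tag": tag,
    }


print(split_image("nginx.org/nginx:1.21.0"))
```

For the documented example this yields image_name nginx.org/nginx, image_short_name nginx, and image_tag 1.21.0.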

Logs

kubernetes_events

Tags & Fields Description
reason
(tag)
This should be a short, machine-understandable string that gives the reason for the transition into the object's current status.
type
(tag)
Type of this event.
uid
(tag)
The UID of event.
involved_kind Kind of the referent for involved object.
Type: string
Unit: -
involved_name Name must be unique within a namespace for involved object.
Type: string
Unit: -
involved_namespace Namespace defines the space within which each name must be unique for involved object.
Type: string
Unit: -
involved_uid The UID of involved object.
Type: string
Unit: -
message Details of event log
Type: string
Unit: -
source_component Component from which the event is generated.
Type: string
Unit: -
source_host Node name on which the event is generated.
Type: string
Unit: -

<CONTAINER-NAME>

Container log collection

Tags & Fields Description
container_id
(tag)
Container ID.
container_name
(tag)
Container name from k8s (label io.kubernetes.container.name). If empty then use $container_runtime_name.
daemonset
(tag)
The name of the DaemonSet which the object belongs to.
deployment
(tag)
The name of the Deployment which the object belongs to.
filepath
(tag)
The filepath to the log file on the host system where the log is stored.
host
(tag)
Host name
image
(tag)
The full name of the container image, example nginx.org/nginx:1.21.0.
inside_filepath
(tag)
The path to the log file inside the container (only applicable for log collection from within containers).
namespace
(tag)
The namespace of the container (label io.kubernetes.pod.namespace).
pod_ip
(tag)
The pod ip of the container.
pod_name
(tag)
The pod name of the container (label io.kubernetes.pod.name).
service
(tag)
The name of the service, if service is empty then use source.
statefulset
(tag)
The name of the StatefulSet which the object belongs to.
log_file_inode The inode of the log file, which uniquely identifies it on the file system (requires enabling the global configuration enable_debug_fields).
Type: int
Unit: count
log_read_lines The number of lines read from the file.
Type: int
Unit: count
log_read_offset The current offset in the log file where reading has occurred, used to track progress during log collection (requires enabling the global configuration enable_debug_fields).
Type: int
Unit: count
message The text of the log.
Type: string
Unit: -
status The status of the log, default is info.
Type: string
Unit: -

Change events

event

Changes to the main Kubernetes resources (Pod/Deployment/Service, etc.) trigger change events of the following form. For the complete list of changes, see here

Tags & Fields Description
class
(tag)
The type of Kubernetes resource, e.g. kubernetes_deployments/kubernetes_nodes/..
deployment_name/node_name/..
(tag)
The name of Kubernetes resource, e.g. deployment-abc-123
df_event_id
(tag)
The event ID is generated by UUIDv4, e.g. event-<lowercase UUIDv4>.
df_source
(tag)
The event source is always change.
df_status
(tag)
The event status is always info.
df_sub_status
(tag)
Always info.
namespace
(tag)
The namespace of Kubernetes resource.
uid
(tag)
The UID of Kubernetes resource.
df_message This is a template field, concatenated from other values: [{{df_resource_type}}] {{df_resource}} configuration changed.
Type: string | (unknown)
Unit: -
df_title Diff text of resource changes.
Type: string | (unknown)
Unit: -
diff Diff text of resource changes.
Type: string | (unknown)
Unit: -
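
To make the df_event_id format and the df_message template concrete, here is a small Python sketch. It only mirrors the shapes described in the table; the function names are hypothetical, and passing the class value as df_resource_type is an assumption made for illustration.

```python
import uuid


def new_event_id() -> str:
    """Build an ID of the form event-<lowercase UUIDv4>."""
    # str(uuid.uuid4()) is already lowercase hexadecimal with hyphens.
    return f"event-{uuid.uuid4()}"


def render_message(resource_type: str, resource: str) -> str:
    """Fill the template: [{{df_resource_type}}] {{df_resource}} configuration changed."""
    return f"[{resource_type}] {resource} configuration changed."


print(new_event_id())
print(render_message("kubernetes_deployments", "deployment-abc-123"))
```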

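Dataway Sink: see the documentation for details

For every collected Kubernetes resource, the Labels that match a CustomerKey are added to it. For example, if the CustomerKey is name, resources such as DaemonSet, Deployment, and Pod look up name in their own Labels and add it to their tags.

Containers receive the Customer Labels of the Pod they belong to.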
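
The label matching described above amounts to copying a chosen subset of labels into tags. A minimal Python sketch, assuming labels and tags are plain string maps (customer_label_tags is a hypothetical name, not a DataKit function):

```python
def customer_label_tags(labels: dict, customer_keys: list) -> dict:
    """Copy only the labels whose key is listed in customer_keys into tags."""
    return {key: labels[key] for key in customer_keys if key in labels}


# A container has no labels of its own here, so it would reuse its Pod's labels.
pod_labels = {"name": "nginx", "tier": "web"}
print(customer_label_tags(pod_labels, ["name"]))
```

Keys that are not present in a resource's labels are simply skipped, so a resource without a name label gets no extra tag.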

FAQ

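Filtering metric collection by Pod namespace

After Kubernetes Pod metric collection is enabled (enable_pod_metric = true), DataKit collects metrics for every Pod in the cluster. Because this can produce a large volume of data, collection can be filtered on the Pod's namespace field so that only Pods in specific namespaces are collected.

The pod_include_metric and pod_exclude_metric options control which namespaces' Pods are included in or excluded from metric collection.

  ## Collect a Pod's metrics when its namespace matches `datakit`
  pod_include_metric = ["namespace:datakit"]

  ## Ignore every Pod whose namespace is `kodo`
  pod_exclude_metric = ["namespace:kodo"]
  • The include and exclude options must start with a field name, in a glob-like expression of the form "<field name>:<glob rule>"
  • Currently, namespace is the only supported filter field. For example: namespace:datakit-ns

If both include and exclude are set, a Pod must meet both of the following conditions:

  • It must match the include rules
  • And it must not match the exclude rules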

For example, the following configuration filters out every Pod:

  ## Collect only Pods in `namespace:datakit`, but exclude all namespaces
  pod_include_metric = ["namespace:datakit"]
  pod_exclude_metric = ["namespace:*"]
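
The include/exclude semantics can be sketched in a few lines of Python. This is an illustration only: DataKit's own matcher may treat glob syntax differently, and Python's fnmatch is used here as a stand-in. The function names are hypothetical, and an empty include list is assumed to admit every Pod, matching the "default collect all pods" note in the sample configuration.

```python
from fnmatch import fnmatchcase


def _matches(namespace: str, rules: list) -> bool:
    """Return True if any "namespace:<glob>" rule matches the namespace."""
    for rule in rules:
        field, _, pattern = rule.partition(":")
        # namespace is the only supported filter field; other fields are ignored.
        if field == "namespace" and fnmatchcase(namespace, pattern):
            return True
    return False


def should_collect(namespace: str, include: list, exclude: list) -> bool:
    """A Pod is collected if it passes include (when set) and misses exclude."""
    if include and not _matches(namespace, include):
        return False
    return not _matches(namespace, exclude)


print(should_collect("datakit", ["namespace:datakit"], ["namespace:kodo"]))
print(should_collect("datakit", ["namespace:datakit"], ["namespace:*"]))
```

The second call shows why the configuration above collects nothing: the Pod passes the include rule but is then rejected by the catch-all exclude rule.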

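In a Kubernetes environment, the same settings are available through the following environment variables:

  • ENV_INPUT_CONTAINER_POD_INCLUDE_METRIC
  • ENV_INPUT_CONTAINER_POD_EXCLUDE_METRIC

For example, to collect Pod metrics only where the namespace is kube-system, set the ENV_INPUT_CONTAINER_POD_INCLUDE_METRIC environment variable as follows:

  - env:
      - name: ENV_INPUT_CONTAINER_POD_INCLUDE_METRIC
        value: namespace:kube-system  # the namespace to collect

This narrows the range of Pod metrics that DataKit collects and avoids gathering unneeded data, which reduces load and resource usage.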

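NODE_LOCAL requires new permissions

The ENV_INPUT_CONTAINER_ENABLE_K8S_NODE_LOCAL mode is recommended only for DaemonSet deployments. It needs access to the kubelet, so the nodes/stats permission must be added to RBAC. For example: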

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: datakit
rules:
- apiGroups: [""]
  resources: ["nodes", "nodes/stats"]
  verbs: ["get", "list", "watch"]

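In addition, the DataKit Pod must enable the hostNetwork: true option.

Collecting PersistentVolumes and PersistentVolumeClaims requires new permissions

Starting with version 1.25.0, DataKit supports collecting object data for Kubernetes PersistentVolume and PersistentVolumeClaim. Collecting these two resources requires new RBAC permissions, as follows: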

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: datakit
rules:
- apiGroups: [""]
  resources: ["persistentvolumes", "persistentvolumeclaims"]
  verbs: ["get", "list", "watch"]
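
When upgrading an existing deployment, it can help to check whether the ClusterRole already grants the resources named in this FAQ. The Python sketch below works on rules that have already been loaded into dictionaries (for example from the ClusterRole YAML); missing_resources is a hypothetical helper, and wildcard entries such as "*" are not expanded.

```python
def missing_resources(rules, required, verbs=("get", "list", "watch")):
    """Return the required core-group resources that are not granted with all verbs."""
    granted = set()
    for rule in rules:
        if "" not in rule.get("apiGroups", []):
            continue  # only the core API group ("") is relevant here
        if all(v in rule.get("verbs", []) for v in verbs):
            granted.update(rule.get("resources", []))
    return sorted(set(required) - granted)


rules = [
    {"apiGroups": [""], "resources": ["nodes", "nodes/stats"],
     "verbs": ["get", "list", "watch"]},
]
print(missing_resources(rules, ["persistentvolumes", "persistentvolumeclaims"]))
```

An empty result means the role already covers the listed resources; any names returned still need to be added to the rules.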

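Masking sensitive fields in Kubernetes YAML

DataKit collects the yaml configuration of resources such as Kubernetes Pods and Services and stores it in the yaml field of the object data. If that yaml contains sensitive data (for example, a password), DataKit does not yet support manually configuring which fields to mask. The recommended approach is the official Kubernetes one: use a ConfigMap or a Secret to hide sensitive fields.

For example, to add a password to env, the usual form is:

    containers:
    - name: mycontainer
      image: redis
      env:
        - name: SECRET_PASSWORD
          value: password123

This stores the password in plain text in the orchestration yaml, which is insecure. A Kubernetes Secret can hide it, as follows:

Create a Secret: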

apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
# stringData takes plain-text values; the data field requires base64-encoded values
stringData:
  username: username123
  password: password123

Apply it:

kubectl apply -f mysecret.yaml
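
Note that Kubernetes expects values under a Secret's data field to be base64-encoded, while plain-text values belong under stringData. As a small illustration, the Python sketch below produces a value suitable for the data field; encode_secret_value is a hypothetical helper name.

```python
import base64


def encode_secret_value(plain: str) -> str:
    """Base64-encode a UTF-8 string for use under a Secret's data field."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


print(encode_secret_value("password123"))  # cGFzc3dvcmQxMjM=
```

Base64 is an encoding, not encryption: it keeps the value out of the Pod spec but does not protect it from anyone who can read the Secret.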

Use the Secret in env:

    containers:
    - name: mycontainer
      image: redis
      env:
        - name: SECRET_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysecret
              key: password
              optional: false

See the official documentation for details
