Kubernetes
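Collects metrics, objects, and logs from containers and Kubernetes, and reports them to Guance (观测云).
Collector Configuration
Prerequisites
- The container collector currently supports the Docker, Containerd, and CRI-O container runtimes.
- Version requirements: Docker v17.04 or later, Containerd v1.5.1 or later, CRI-O 1.20.1 or later.
- Collecting Kubernetes data requires DataKit to be deployed as a DaemonSet.
Info
- Container collection supports both the Docker and Containerd runtimes (Version-1.5.7), and collection is enabled for both by default.
In a pure Docker or Containerd environment, DataKit can only be installed on the host.
Go to the conf.d/samples directory under the DataKit installation directory, copy container.conf.sample, and rename the copy to container.conf. For example: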
[inputs.container]
endpoints = [
"unix:///var/run/docker.sock",
"unix:///var/run/containerd/containerd.sock",
"unix:///var/run/crio/crio.sock",
]
## Collect metric interval, default "60s".
# metric_collect_interval = "60s"
## Collect object interval, default "5m".
# object_collect_interval = "5m"
## Search logging interval, default "60s".
# logging_search_interval = "60s"
enable_container_metric = true
enable_k8s_metric = true
enable_pod_metric = false
enable_k8s_event = true
enable_k8s_node_local = true
enable_collect_k8s_job = true
## Add resource Label as Tags (container use Pod Label), need to specify Label keys.
## e.g. ["app", "name"]
# extract_k8s_label_as_tags_v2 = []
# extract_k8s_label_as_tags_v2_for_metric = []
## Containers logs to include and exclude, default collect all containers. Globs accepted.
container_include_log = []
container_exclude_log = ["image:*logfwd*", "image:*datakit*"]
## Pods metric to include and exclude, default collect all pods. Globs accepted.
pod_include_metric = []
pod_exclude_metric = []
logging_enable_multiline = true
logging_auto_multiline_detection = true
logging_auto_multiline_extra_patterns = []
## Only retain the fields specified in the whitelist.
logging_field_white_list = []
## Removes ANSI escape codes from text strings.
logging_remove_ansi_escape_codes = false
## Whether to collect logs from the beginning of the file.
logging_file_from_beginning = false
## The maximum allowed number of open files, default is 500. If it is -1, it means no limit.
# logging_max_open_files = 500
## Extra source matching for log collection: a source that matches the regexp is renamed.
[inputs.container.logging_extra_source_map]
# source_regexp = "new_source"
## Log collection with multiline configuration as specified by the source.
[inputs.container.logging_source_multiline_map]
# source = '''^\d{4}'''
[inputs.container.tags]
# some_tag = "some_value"
# more_tag = "some_other_value"
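The include/exclude options in the sample above take glob rules such as "image:*logfwd*". The following Python sketch illustrates how such rules could be evaluated. It is not DataKit's implementation: the function names are hypothetical, only the image: prefix is handled, and the precedence (an exclude match always wins, an empty include list means "collect everything") is an assumption made for illustration.

```python
from fnmatch import fnmatchcase


def matches(rules, image):
    """Return True if any 'image:<glob>' rule matches the image name."""
    for rule in rules:
        kind, _, pattern = rule.partition(":")
        if kind == "image" and fnmatchcase(image, pattern):
            return True
    return False


def should_collect_log(image, include, exclude):
    # Assumption: an exclude match always wins, and an empty
    # include list means "collect every container".
    if matches(exclude, image):
        return False
    return not include or matches(include, image)


# The default exclude rules from the sample configuration.
exclude = ["image:*logfwd*", "image:*datakit*"]
print(should_collect_log("nginx:1.21.0", [], exclude))
print(should_collect_log("pubrepo.jiagouyun.com/datakit/logfwd:1.0", [], exclude))
```

With the default rules, the nginx image is collected while any image whose name contains logfwd or datakit is skipped.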
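The collector can be enabled either by injecting its configuration through a ConfigMap or by setting ENV_DATAKIT_INPUTS.
Configuration parameters can also be changed through environment variables (the collector must be added to ENV_DEFAULT_ENABLED_INPUTS as a default collector):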
- ENV_INPUT_CONTAINER_ENDPOINTS
  Appends the endpoints of multiple container runtimes.
  Type: List
  Collector config field: endpoints
  Example: "unix:///var/run/docker.sock,unix:///var/run/containerd/containerd.sock,unix:///var/run/crio/crio.sock"
- ENV_INPUT_CONTAINER_METRIC_COLLEC_INTERVAL
  Collection interval for container/k8s metrics.
  Type: Duration
  Collector config field: metric_collect_interval
  Default: 60s
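- ENV_INPUT_CONTAINER_OBJECT_COLLEC_INTERVAL
  Collection interval for container/k8s objects.
  Type: Duration
  Collector config field: object_collect_interval
  Default: 5m
- ENV_INPUT_CONTAINER_LOGGING_SEARCH_INTERVAL
  Interval for log discovery, that is, how often new logs are searched for. If the interval is too long, some short-lived logs may be missed.
  Type: Duration
  Collector config field: logging_search_interval
  Default: 60s
- ENV_INPUT_CONTAINER_ENABLE_CONTAINER_METRIC
  Enables container metric collection.
  Type: Boolean
  Collector config field: enable_container_metric
  Default: true
- ENV_INPUT_CONTAINER_ENABLE_K8S_METRIC
  Enables k8s metric collection.
  Type: Boolean
  Collector config field: enable_k8s_metric
  Default: true
- ENV_INPUT_CONTAINER_ENABLE_POD_METRIC
  Whether to enable Pod metric collection (CPU and memory usage).
  Type: Boolean
  Collector config field: enable_pod_metric
  Default: false
- ENV_INPUT_CONTAINER_ENABLE_K8S_EVENT
  Whether to enable Kubernetes event collection.
  Type: Boolean
  Collector config field: enable_k8s_event
  Default: true
- ENV_INPUT_CONTAINER_ENABLE_K8S_NODE_LOCAL
  Whether to enable per-Node collection mode, in which the DataKit deployed on each Node independently collects the resources of its own Node (Version-1.19.0). This requires additional RBAC permissions (see here).
  Type: Boolean
  Collector config field: enable_k8s_node_local
  Default: true
- ENV_INPUT_CONTAINER_ENABLE_COLLECT_KUBE_JOB
  Enables collection of Kubernetes Job resources (both metrics and objects).
  Type: Boolean
  Collector config field: enable_collect_kube_job
  Default: true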
- ENV_INPUT_CONTAINER_EXTRACT_K8S_LABEL_AS_TAGS_V2
  Appends resource labels to the tags of non-metric data. Label keys must be specified; if there is only one key and it is an empty string (for example [""]), all labels are added as tags. Containers inherit Pod labels. If a label key contains a dot, the dot is replaced with a dash.
  Type: JSON
  Collector config field: extract_k8s_label_as_tags_v2
  Example: '["app","name"]'
- ENV_INPUT_CONTAINER_EXTRACT_K8S_LABEL_AS_TAGS_V2_FOR_METRIC
  Appends resource labels to the tags of metric data. Label keys must be specified; if there is only one key and it is an empty string (for example [""]), all labels are added as tags. Containers inherit Pod labels. If a label key contains a dot, the dot is replaced with a dash.
  Type: JSON
  Collector config field: extract_k8s_label_as_tags_v2_for_metric
  Example: '["app","name"]'
- ENV_INPUT_CONTAINER_ENABLE_AUTO_DISCOVERY_OF_PROMETHEUS_POD_ANNOTATIONS
  Whether to automatically discover Prometheus Pod Annotations and collect metrics.
  Type: Boolean
  Collector config field: enable_auto_discovery_of_prometheus_pod_annotations
  Default: false
- ENV_INPUT_CONTAINER_ENABLE_AUTO_DISCOVERY_OF_PROMETHEUS_SERVICE_ANNOTATIONS
  Whether to automatically discover Prometheus Service Annotations and collect metrics.
  Type: Boolean
  Collector config field: enable_auto_discovery_of_prometheus_service_annotations
  Default: false
- ENV_INPUT_CONTAINER_ENABLE_AUTO_DISCOVERY_OF_PROMETHEUS_POD_MONITORS
  Whether to automatically discover Prometheus PodMonitor CRDs and collect metrics; see the Prometheus-Operator CRD documentation.
  Type: Boolean
  Collector config field: enable_auto_discovery_of_prometheus_pod_monitors
  Default: false
- ENV_INPUT_CONTAINER_ENABLE_AUTO_DISCOVERY_OF_PROMETHEUS_SERVICE_MONITORS
  Whether to automatically discover Prometheus ServiceMonitor CRDs and collect metrics; see the Prometheus-Operator CRD documentation.
  Type: Boolean
  Collector config field: enable_auto_discovery_of_prometheus_service_monitors
  Default: false
- ENV_INPUT_CONTAINER_CONTAINER_MAX_CONCURRENT
  Maximum concurrency when collecting container data. Recommended only when collection latency is high.
  Type: Int
  Collector config field: container_max_concurrent
  Default: cpu cores + 1
- ENV_INPUT_CONTAINER_CONTAINER_INCLUDE_LOG
  Whitelist for container logs, filtered by image/namespace.
  Type: List
  Collector config field: container_include_log
  Example: "image:pubrepo.jiagouyun.com/datakit/logfwd*"
- ENV_INPUT_CONTAINER_CONTAINER_EXCLUDE_LOG
  Blacklist for container logs, filtered by image/namespace.
  Type: List
  Collector config field: container_exclude_log
  Example: "image:pubrepo.jiagouyun.com/datakit/logfwd*"
- ENV_INPUT_CONTAINER_POD_INCLUDE_METRIC
  Whitelist for Pod metrics, filtered by namespace.
  Type: List
  Collector config field: pod_include_metric
  Example: "namespace:datakit*"
- ENV_INPUT_CONTAINER_POD_EXCLUDE_METRIC
  Blacklist for Pod metrics, filtered by namespace.
  Type: List
  Collector config field: pod_exclude_metric
  Example: "namespace:kube-system"
- ENV_INPUT_CONTAINER_LOGGING_EXTRA_SOURCE_MAP
  Extra source matching for log collection; a source that matches the regular expression is renamed.
  Type: Map
  Collector config field: logging_extra_source_map
  Example: source_regex*=new_source,regex*=new_source2
- ENV_INPUT_CONTAINER_LOGGING_SOURCE_MULTILINE_MAP_JSON
  Multiline configuration for log collection, specified per source.
  Type: JSON
  Collector config field: logging_source_multiline_map
  Example: '{"source_nginx":"^\d{4}", "source_redis":"^[A-Za-z_]"}'
- ENV_INPUT_CONTAINER_LOGGING_AUTO_MULTILINE_DETECTION
  Whether to enable automatic multiline mode for log collection. When enabled, the applicable multiline rule is picked from the patterns list.
  Type: Boolean
  Collector config field: logging_auto_multiline_detection
  Default: false
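- ENV_INPUT_CONTAINER_LOGGING_AUTO_MULTILINE_EXTRA_PATTERNS_JSON
  Patterns list for automatic multiline mode; multiple multiline rules can be configured manually.
  Type: JSON
  Collector config field: logging_auto_multiline_extra_patterns
  Example: '["^\d{4}-\d{2}", "^[A-Za-z_]"]'
  Default: For more default rules, see doc
- ENV_INPUT_CONTAINER_LOGGING_REMOVE_ANSI_ESCAPE_CODES
  Removes color (ANSI escape) characters from collected logs; see the notes on handling special characters in logs.
  Type: Boolean
  Collector config field: logging_remove_ansi_escape_codes
  Default: false
- ENV_INPUT_CONTAINER_LOGGING_FILE_FROM_BEGINNING_THRESHOLD_SIZE
  Decides whether to use from_beginning based on file size: if the file is smaller than this value when it is discovered, it is collected from the beginning.
  Type: Int
  Collector config field: logging_file_from_beginning_threshold_size
  Default: 20,000,000
- ENV_INPUT_CONTAINER_LOGGING_FILE_FROM_BEGINNING
  Whether to collect logs from the beginning of the file.
  Type: Boolean
  Collector config field: logging_file_from_beginning
  Default: false
- ENV_INPUT_CONTAINER_LOGGING_MAX_OPEN_FILES
  Maximum number of files opened for log collection; -1 means no limit.
  Type: Int
  Collector config field: logging_max_open_files
  Default: 500
- ENV_INPUT_CONTAINER_LOGGING_FIELD_WHITE_LIST
  Retains only the fields listed in the whitelist.
  Type: List
  Collector config field: logging_field_white_list
  Example: '["service","container_id"]'
- ENV_INPUT_CONTAINER_TAGS
  Custom tags. A tag with the same name in the configuration file is overridden.
  Type: Map
  Collector config field: tags
  Example: tag1=value1,tag2=value2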
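The List and Map examples in the list above are comma-separated strings, with key=value pairs for Map. As a rough illustration of how such values break down, here is a minimal Python sketch. The helper names parse_list and parse_map are hypothetical, and DataKit's own parser may treat whitespace or malformed items differently.

```python
def parse_list(value):
    """Split a comma-separated List value, dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_map(value):
    """Split a 'k1=v1,k2=v2' Map value into a dict."""
    result = {}
    for pair in parse_list(value):
        key, sep, val = pair.partition("=")
        if sep:  # skip malformed items that have no '='
            result[key.strip()] = val.strip()
    return result


endpoints = parse_list(
    "unix:///var/run/docker.sock,unix:///var/run/containerd/containerd.sock"
)
tags = parse_map("tag1=value1,tag2=value2")
print(endpoints)
print(tags)
```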
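Additional notes on environment variables:
- ENV_INPUT_CONTAINER_TAGS: if the configuration file (container.conf) contains a tag with the same name, it is overridden by the value set here.
- ENV_INPUT_CONTAINER_LOGGING_EXTRA_SOURCE_MAP: specifies source replacement. The format is "regular expression=new_source": when a source matches the regular expression, it is replaced by new_source. If the replacement succeeds, the source configured in annotations/labels is no longer used (Version-1.4.7). For an exact match, wrap the pattern in ^ and $. For example, the regular expression datakit matches not only datakit but also datakit123, whereas ^datakit$ matches only datakit.
- ENV_INPUT_CONTAINER_LOGGING_SOURCE_MULTILINE_MAP_JSON: maps a source to a multiline configuration. If a log has no multiline_match configured, the collector looks up its source here and uses the corresponding multiline_match. Because multiline_match values are regular expressions and can be complex, the value is a JSON string; json.cn can help with writing it and compressing it to a single line.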
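The difference between an unanchored and an anchored pattern, and the single-line JSON form of the multiline map, can be tried out with the Python sketch below. The function rename_source is a hypothetical stand-in for the collector's behavior (first matching pattern wins is an assumption), and Python's re syntax is used here, which may differ in details from the regular expression engine DataKit uses.

```python
import json
import re


def rename_source(source, extra_source_map):
    """Return new_source for the first pattern that matches, else the original source."""
    for pattern, new_source in extra_source_map.items():
        if re.search(pattern, source):
            return new_source
    return source


# An unanchored pattern also matches longer names such as datakit123.
print(rename_source("datakit123", {"datakit": "dk"}))
# An anchored pattern matches only the exact name, so datakit123 is kept.
print(rename_source("datakit123", {"^datakit$": "dk"}))

# Build the source-to-multiline map and compress it to one line of JSON.
multiline = {"source_nginx": r"^\d{4}", "source_redis": r"^[A-Za-z_]"}
compact = json.dumps(multiline, separators=(",", ":"))
print(compact)
```

Note that json.dumps writes each backslash in a pattern as an escaped backslash, which is what makes the string valid JSON.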
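Note
- The object collection interval is 5 minutes and the metric collection interval is 60 seconds; these are not configurable.
- A single collected log line (including after multiline_match processing) has a default maximum length of about 800KB; anything beyond that is split into multiple log entries.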
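Docker and Containerd sock file configuration
If the sock path of Docker or Containerd is not the default one, the sock file path must be specified. How to do this depends on how DataKit is deployed. Taking Containerd as an example:
For a host installation, modify the endpoints option in container.conf and set it to the corresponding sock path.
For a Kubernetes deployment, change the containerd-socket entry under volumes in datakit.yaml so that the new path is mounted into DataKit, and also set the environment variable ENV_INPUT_CONTAINER_ENDPOINTS: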
# Add the env (a List value is passed as a comma-separated string)
- env:
  - name: ENV_INPUT_CONTAINER_ENDPOINTS
    value: "unix:///path/to/new/containerd/containerd.sock"
# Modify the mountPath
  - mountPath: /path/to/new/containerd/containerd.sock
    name: containerd-socket
    readOnly: true
# Modify the volumes
volumes:
- hostPath:
    path: /path/to/new/containerd/containerd.sock
  name: containerd-socket
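The environment variable ENV_INPUT_CONTAINER_ENDPOINTS is appended to the existing endpoints configuration, so the final endpoints configuration may contain many entries. The collector deduplicates them and then connects to and collects from each one in turn.
The default endpoints configuration is: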
endpoints = [
"unix:///var/run/docker.sock",
"unix:///var/run/containerd/containerd.sock",
"unix:///var/run/crio/crio.sock",
]
With the environment variable ENV_INPUT_CONTAINER_ENDPOINTS set to "unix:///path/to/new//run/containerd.sock", the final endpoints configuration is:
endpoints = [
"unix:///var/run/docker.sock",
"unix:///var/run/containerd/containerd.sock",
"unix:///var/run/crio/crio.sock",
"unix:///path/to/new//run/containerd.sock",
]
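The collector connects to and collects from each of these container runtimes. If a sock file does not exist, an error is logged when the first connection fails; subsequent collection is not affected.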
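The append-then-deduplicate behavior described in this section can be sketched in a few lines of Python. This is only an illustration of the described result: merge_endpoints is a hypothetical helper, not a DataKit function, and keeping the original order is an assumption.

```python
DEFAULT_ENDPOINTS = [
    "unix:///var/run/docker.sock",
    "unix:///var/run/containerd/containerd.sock",
    "unix:///var/run/crio/crio.sock",
]


def merge_endpoints(defaults, extra):
    """Append extra endpoints to the defaults, dropping duplicates and keeping order."""
    merged = []
    for endpoint in [*defaults, *extra]:
        if endpoint not in merged:
            merged.append(endpoint)
    return merged


# The extra list stands in for the value of ENV_INPUT_CONTAINER_ENDPOINTS.
extra = [
    "unix:///path/to/new//run/containerd.sock",
    "unix:///var/run/docker.sock",  # already a default, so it is dropped
]
final = merge_endpoints(DEFAULT_ENDPOINTS, extra)
for endpoint in final:
    print(endpoint)
```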
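Prometheus Exporter metrics collection
If a Pod or container exposes Prometheus metrics, there are two ways to collect them; see here.
Log collection
For the configuration related to log collection, see here.
Metrics
All data collected below gets a global tag named host by default (its value is the hostname of the machine where DataKit runs). Other tags can be specified in the configuration through [inputs.container.tags]: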
kube_statefulset
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| statefulset (tag) | Name must be unique within a namespace. |
| uid (tag) | The UID of StatefulSet. |
| replicas | The number of Pods created by the StatefulSet controller. Type: int Unit: count |
| replicas_available | Total number of available pods (ready for at least minReadySeconds) targeted by this StatefulSet. Type: int Unit: count |
| replicas_current | The number of Pods created by the StatefulSet controller from the StatefulSet version indicated by currentRevision. Type: int Unit: count |
| replicas_desired | The desired number of replicas of the given Template. Type: int Unit: count |
| replicas_ready | The number of pods created for this StatefulSet with a Ready Condition. Type: int Unit: count |
| replicas_updated | The number of Pods created by the StatefulSet controller from the StatefulSet version indicated by updateRevision. Type: int Unit: count |
kube_replicaset
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| replicaset (tag) | Name must be unique within a namespace. |
| uid (tag) | The UID of ReplicaSet. |
| fully_labeled_replicas | The number of fully labeled replicas per ReplicaSet. Type: int Unit: count |
| replicas | The most recently observed number of replicas. Type: int Unit: count |
| replicas_available | The number of available replicas (ready for at least minReadySeconds) for this replica set. Type: int Unit: count |
| replicas_desired | The number of desired replicas. Type: int Unit: count |
| replicas_ready | The number of ready replicas for this replica set. Type: int Unit: count |
kube_service
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| service (tag) | Name must be unique within a namespace. |
| uid (tag) | The UID of Service. |
| ports | Total number of ports that are exposed by this service. Type: int Unit: count |
kube_pod
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| daemonset (tag) | The name of the DaemonSet which the object belongs to. |
| deployment (tag) | The name of the Deployment which the object belongs to. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| node_name (tag) | NodeName is a request to schedule this pod onto a specific node. |
| pod (tag) | Name must be unique within a namespace. |
| pod_name (tag) | Renamed from 'pod'. |
| statefulset (tag) | The name of the StatefulSet which the object belongs to. |
| uid (tag) | The UID of pod. |
| cpu_limit_millicores | The total CPU limit (in millicores) across all containers in this Pod. Note: This value is the sum of all container limit values, as Pods do not have a direct limit value. Type: int Unit: milli-cores |
| cpu_number | The total number of CPUs on the node where the Pod is running. Type: int Unit: count |
| cpu_request_millicores | The total CPU request (in millicores) across all containers in this Pod. Note: This value is the sum of all container request values, as Pods do not have a direct request value. Type: int Unit: milli-cores |
| cpu_usage | The total CPU usage across all containers in this Pod. Type: float Unit: percent,percent |
| cpu_usage_base100 | The normalized CPU usage, with a maximum of 100%. Type: float Unit: percent,percent |
| cpu_usage_base_limit | The normalized CPU usage, with a maximum of 100%, based on the CPU limit. Type: float Unit: percent,percent |
| cpu_usage_base_request | The normalized CPU usage, with a maximum of 100%, based on the CPU request. Type: float Unit: percent,percent |
| cpu_usage_millicores | The total CPU usage (in millicores) averaged over the sample window for all containers. Type: int Unit: milli-cores |
| ephemeral_storage_available_bytes | The storage space available (bytes) for the filesystem. Type: int Unit: digital,B |
| ephemeral_storage_capacity_bytes | The total capacity (bytes) of the filesystems underlying storage. Type: int Unit: digital,B |
| ephemeral_storage_used_bytes | The bytes used for a specific task on the filesystem. Type: int Unit: digital,B |
| mem_capacity | The total memory capacity of the host machine. Type: int Unit: digital,B |
| mem_limit | The total memory limit across all containers in this Pod. Note: This value is the sum of all container limit values, as Pods do not have a direct limit value. Type: int Unit: digital,B |
| mem_request | The total memory request across all containers in this Pod. Note: This value is the sum of all container request values, as Pods do not have a direct request value. Type: int Unit: digital,B |
| mem_rss | The total RSS memory usage of all containers in this Pod, which is not supported by metrics-server. Type: int Unit: digital,B |
| mem_usage | The total memory usage of all containers in this Pod. Type: int Unit: digital,B |
| mem_used_percent | The percentage of memory usage based on the host machine’s total memory capacity. Type: float Unit: percent,percent |
| mem_used_percent_base_limit | The percentage of memory usage based on the memory limit. Type: float Unit: percent,percent |
| mem_used_percent_base_request | The percentage of memory usage based on the memory request. Type: float Unit: percent,percent |
| memory_capacity | The total memory in the host machine (Deprecated, use mem_capacity). Type: int Unit: digital,B |
| memory_usage_bytes | The sum of the memory usage of all containers in this Pod (Deprecated, use mem_usage). Type: int Unit: digital,B |
| memory_used_percent | The percentage usage of the memory (refer to mem_used_percent). Type: float Unit: percent,percent |
| network_bytes_rcvd | Cumulative count of bytes received. Type: int Unit: digital,B |
| network_bytes_sent | Cumulative count of bytes transmitted. Type: int Unit: digital,B |
| ready | Describes whether the pod is ready to serve requests. Type: int Unit: count |
| restarts | The number of times the container has been restarted. Type: int Unit: count |
kubernetes
Resource counts in Kubernetes.
| Tags & Fields | Description |
|---|---|
| namespace (tag) | namespace |
| node_name (tag) | NodeName is a request to schedule this pod onto a specific node (only supported for Pod and Container). |
| container | Container count Type: int (count) Unit: - |
| cronjob | CronJob count Type: int (count) Unit: - |
| daemonset | DaemonSet count Type: int (count) Unit: - |
| deployment | Deployment count Type: int (count) Unit: - |
| endpoint | Endpoint count Type: int (count) Unit: - |
| job | Job count Type: int (count) Unit: - |
| node | Node count Type: int (count) Unit: - |
| pod | Pod count Type: int (count) Unit: - |
| replicaset | ReplicaSet count Type: int (count) Unit: - |
| service | Service count Type: int (count) Unit: - |
| statefulset | StatefulSet count Type: int (count) Unit: - |
kube_cronjob
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| cronjob (tag) | Name must be unique within a namespace. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| uid (tag) | The UID of CronJob. |
| spec_suspend | This flag tells the controller to suspend subsequent executions. Type: bool Unit: - |
kube_daemonset
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| daemonset (tag) | Name must be unique within a namespace. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| uid (tag) | The UID of DaemonSet. |
| daemons_available | The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and available (ready for at least spec.minReadySeconds). Type: int Unit: count |
| daemons_unavailable | The number of nodes that should be running the daemon pod and have none of the daemon pod running and available (ready for at least spec.minReadySeconds). Type: int Unit: count |
| desired | The total number of nodes that should be running the daemon pod (including nodes correctly running the daemon pod). Type: int Unit: count |
| misscheduled | The number of nodes that are running the daemon pod, but are not supposed to run the daemon pod. Type: int Unit: count |
| ready | The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and ready. Type: int Unit: count |
| scheduled | The number of nodes that are running at least one daemon pod and are supposed to run the daemon pod. Type: int Unit: count |
| updated | The total number of nodes that are running updated daemon pod. Type: int Unit: count |
kube_deployment
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| deployment (tag) | Name must be unique within a namespace. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| uid (tag) | The UID of Deployment. |
| replicas | Total number of non-terminated pods targeted by this deployment (their labels match the selector). Type: int Unit: count |
| replicas_available | Total number of available pods (ready for at least minReadySeconds) targeted by this deployment. Type: int Unit: count |
| replicas_desired | Number of desired pods for a Deployment. Type: int Unit: count |
| replicas_ready | The number of pods targeted by this Deployment with a Ready Condition. Type: int Unit: count |
| replicas_unavailable | Total number of unavailable pods targeted by this deployment. Type: int Unit: count |
| replicas_updated | Total number of non-terminated pods targeted by this deployment that have the desired template spec. Type: int Unit: count |
| rollingupdate_max_surge | The maximum number of pods that can be scheduled above the desired number of pods. Type: int Unit: count |
| rollingupdate_max_unavailable | The maximum number of pods that can be unavailable during the update. Type: int Unit: count |
kube_dfpv
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| name (tag) | The dfpv name, consisting of the PVC name and the Pod name. |
| namespace (tag) | The namespace of Pod and PVC. |
| node_name (tag) | Reference to the Node. |
| pod_name (tag) | Reference to the Pod. |
| pvc_name (tag) | Reference to the PVC. |
| volume_mount_name (tag) | The name given to the Volume. |
| available | AvailableBytes represents the storage space available (bytes) for the filesystem. Type: int Unit: digital,B |
| capacity | CapacityBytes represents the total capacity (bytes) of the filesystems underlying storage. Type: int Unit: digital,B |
| inodes | Inodes represents the total inodes in the filesystem. Type: int Unit: count |
| inodes_free | InodesFree represents the free inodes in the filesystem. Type: int Unit: count |
| inodes_used | InodesUsed represents the inodes used by the filesystem. Type: int Unit: count |
| used | UsedBytes represents the bytes used for a specific task on the filesystem. Type: int Unit: digital,B |
kube_endpoint
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| endpoint (tag) | Name must be unique within a namespace. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| uid (tag) | The UID of Endpoint. |
| address_available | Number of addresses available in endpoint. Type: int Unit: count |
| address_not_ready | Number of addresses not ready in endpoint. Type: int Unit: count |
kube_job
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| job (tag) | Name must be unique within a namespace. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| uid (tag) | The UID of Job. |
| active | The number of actively running pods. Type: int Unit: count |
| completion_failed | The job has failed its execution. Type: int Unit: count |
| completion_succeeded | The job has completed its execution. Type: int Unit: count |
| failed | The number of pods which reached phase Failed. Type: int Unit: count |
| succeeded | The number of pods which reached phase Succeeded. Type: int Unit: count |
kube_node
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| node (tag) | Name must be unique within a namespace. |
| uid (tag) | The UID of Node. |
| cpu_allocatable | The allocatable CPU of a node that is available for scheduling. Type: int Unit: - |
| cpu_capacity | The CPU capacity of a node. Type: int Unit: - |
| ephemeral_storage_allocatable | The allocatable ephemeral-storage of a node that is available for scheduling. Type: int Unit: - |
| ephemeral_storage_capacity | The ephemeral-storage capacity of a node. Type: int Unit: - |
| memory_allocatable | The allocatable memory of a node that is available for scheduling. Type: int Unit: - |
| memory_capacity | The memory capacity of a node. Type: int Unit: - |
| pods_allocatable | The allocatable pods of a node that is available for scheduling. Type: int Unit: - |
| pods_capacity | The pods capacity of a node. Type: int Unit: - |
docker_containers
Container metrics (collected only for running containers).
| Tags & Fields | Description |
|---|---|
| aws_ecs_cluster_name (tag) | Cluster name of the AWS ECS. |
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| container_id (tag) | Container ID |
| container_name (tag) | Container name from k8s (label io.kubernetes.container.name). If empty then use $container_runtime_name. |
| container_runtime (tag) | Container runtime (this container from Docker/Containerd/cri-o). |
| container_runtime_name (tag) | Container name from runtime (like 'docker ps'). If empty then use 'unknown'. |
| container_runtime_version (tag) | Container runtime version. |
| container_type (tag) | The type of the container (this container is created by Kubernetes/Docker/Containerd/cri-o). |
| daemonset (tag) | The name of the DaemonSet which the object belongs to. |
| deployment (tag) | The name of the Deployment which the object belongs to. |
| image (tag) | The full name of the container image, for example nginx.org/nginx:1.21.0. |
| image_name (tag) | The name of the container image, for example nginx.org/nginx. |
| image_short_name (tag) | The short name of the container image, for example nginx. |
| image_tag (tag) | The tag of the container image, for example 1.21.0. |
| namespace (tag) | The namespace of the container (label io.kubernetes.pod.namespace). |
| pod_name (tag) | The pod name of the container (label io.kubernetes.pod.name). |
| pod_uid (tag) | The pod uid of the container (label io.kubernetes.pod.uid). |
| state (tag) | Container status (only Running). |
| statefulset (tag) | The name of the StatefulSet which the object belongs to. |
| task_arn (tag) | The task ARN of the AWS Fargate. |
| task_family (tag) | The task family of the AWS Fargate. |
| task_version (tag) | The task version of the AWS Fargate. |
| block_read_byte | Total number of bytes read from the container file system (Docker only). Type: int Unit: digital,B |
| block_write_byte | Total number of bytes written to the container file system (Docker only). Type: int Unit: digital,B |
| cpu_limit_millicores | The CPU limit of the container, measured in milli-cores. Type: int Unit: milli-cores |
| cpu_numbers | The number of CPU cores on the system host. Type: int Unit: count |
| cpu_request_millicores | The CPU request of the container, measured in milli-cores (only supported in Kubernetes). Type: int Unit: milli-cores |
| cpu_usage | The actual CPU usage on the system host (percentage). Type: float Unit: percent,percent |
| cpu_usage_base100 | The normalized CPU usage, with a maximum value of 100%. It is calculated as the number of CPU cores multiplied by 100. Type: float Unit: percent,percent |
| cpu_usage_base_limit | The CPU usage based on the CPU limit (percentage). Type: float Unit: percent,percent |
| cpu_usage_base_request | The CPU usage based on the CPU request (percentage) (only supported in Kubernetes). Type: float Unit: percent,percent |
| cpu_usage_millicores | The CPU usage of the container, measured in milli-cores. Type: int Unit: milli-cores |
| mem_capacity | The total memory on the system host. Type: int Unit: digital,B |
| mem_limit | The memory limit of the container. Type: int Unit: digital,B |
| mem_request | The memory request of the container (only supported in Kubernetes). Type: int Unit: digital,B |
| mem_usage | The actual memory usage of the container. Type: int Unit: digital,B |
| mem_used_percent | The memory usage percentage based on the total memory of the system host. Type: float Unit: percent,percent |
| mem_used_percent_base_limit | The memory usage percentage based on the memory limit. Type: float Unit: percent,percent |
| mem_used_percent_base_request | The memory usage percentage based on the memory request (only supported in Kubernetes). Type: float Unit: percent,percent |
| network_bytes_rcvd | Total number of bytes received from the network (counts only the usage of the main process in the container, excluding loopback). Type: int Unit: digital,B |
| network_bytes_sent | Total number of bytes sent to the network (counts only the usage of the main process in the container, excluding loopback). Type: int Unit: digital,B |
Objects
kubernetes_statefulsets
| Tags & Fields | Description |
|---|---|
| <ALL-SELECTOR-MATCH-LABELS> (tag) | Represents the selector.matchLabels for Kubernetes resources. |
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| name (tag) | The UID of StatefulSet. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| statefulset_name (tag) | Name must be unique within a namespace. |
| uid (tag) | The UID of StatefulSet. |
| workload_name (tag) | The name of the workload resource. |
| age | Age (seconds) Type: int Unit: time,s |
| message | Object details Type: string Unit: - |
| replicas | The number of Pods created by the StatefulSet controller. Type: int Unit: count |
| replicas_available | Total number of available pods (ready for at least minReadySeconds) targeted by this StatefulSet. Type: int Unit: count |
| replicas_current | The number of Pods created by the StatefulSet controller from the StatefulSet version indicated by currentRevision. Type: int Unit: count |
| replicas_desired | The desired number of replicas of the given Template. Type: int Unit: count |
| replicas_ready | The number of pods created for this StatefulSet with a Ready Condition. Type: int Unit: count |
| replicas_updated | The number of Pods created by the StatefulSet controller from the StatefulSet version indicated by updateRevision. Type: int Unit: count |
kubernetes_replica_sets
| Tags & Fields | Description |
|---|---|
| <ALL-SELECTOR-MATCH-LABELS> (tag) | Represents the selector.matchLabels for Kubernetes resources. |
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| deployment (tag) | The name of the Deployment which the object belongs to. |
| name (tag) | The UID of ReplicaSet. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| replicaset_name (tag) | Name must be unique within a namespace. |
| statefulset (tag) | The name of the StatefulSet which the object belongs to. |
| uid (tag) | The UID of ReplicaSet. |
| workload_name (tag) | The name of the workload resource. |
| age | Age (seconds) Type: int Unit: time,s |
| available | The number of available replicas (ready for at least minReadySeconds) for this replica set. (Deprecated) Type: int Unit: - |
| message | Object details Type: string Unit: - |
| ready | The number of ready replicas for this replica set. (Deprecated) Type: int Unit: - |
| replicas | The most recently observed number of replicas. Type: int Unit: count |
| replicas_available | The number of available replicas (ready for at least minReadySeconds) for this replica set. Type: int Unit: count |
| replicas_desired | The number of desired replicas. Type: int Unit: count |
| replicas_ready | The number of ready replicas for this replica set. Type: int Unit: count |
kubernetes_services
| Tags & Fields | Description |
|---|---|
| <ALL-SELECTOR> (tag) | Represents the selector for Kubernetes resources. |
| cluster_name_k8s (tag) | K8s cluster name (the default is "default"). It can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| name (tag) | The UID of Service. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| service_name (tag) | Name must be unique within a namespace. |
| type (tag) | Type determines how the Service is exposed. Defaults to ClusterIP. (ClusterIP/NodePort/LoadBalancer/ExternalName) |
| uid (tag) | The UID of Service. |
| workload_name (tag) | The name of the workload resource. |
| age | Age (seconds) Type: int Unit: time,s |
| cluster_ip | ClusterIP is the IP address of the service and is usually assigned randomly by the master. Type: string Unit: - |
| external_ips | ExternalIPs is a list of IP addresses for which nodes in the cluster will also accept traffic for this service. Type: string Unit: - |
| external_name | ExternalName is the external reference that kubedns or equivalent will return as a CNAME record for this service. Type: string Unit: - |
| external_traffic_policy | ExternalTrafficPolicy denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints. Type: string Unit: - |
| message | Object details Type: string Unit: - |
| session_affinity | Supports "ClientIP" and "None". Type: string Unit: - |
kubelet_pod¶
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (default is default). Can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| daemonset (tag) | The name of the DaemonSet which the object belongs to. |
| deployment (tag) | The name of the Deployment which the object belongs to. |
| host (tag) | The node where the Pod is located. |
| name (tag) | The UID of Pod. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| node_name (tag) | NodeName is a request to schedule this pod onto a specific node. |
| phase (tag) | The phase of a Pod is a simple, high-level summary of where the Pod is in its lifecycle. (Pending/Running/Succeeded/Failed/Unknown) |
| pod_name (tag) | Name must be unique within a namespace. |
| qos_class (tag) | The Quality of Service (QoS) classification assigned to the pod based on resource requirements. |
| statefulset (tag) | The name of the StatefulSet which the object belongs to. |
| status (tag) | Reason the container is not yet running. |
| uid (tag) | The UID of Pod. |
| workload_name (tag) | The name of the workload resource. |
| age | Age (seconds) Type: int Unit: time,s |
| available | Number of containers Type: int Unit: count |
| cpu_limit_millicores | The total CPU limit (in millicores) across all containers in this Pod. Note: This value is the sum of all container limit values, as Pods do not have a direct limit value. Type: int Unit: milli-cores |
| cpu_number | The total number of CPUs on the node where the Pod is running. Type: int Unit: count |
| cpu_request_millicores | The total CPU request (in millicores) across all containers in this Pod. Note: This value is the sum of all container request values, as Pods do not have a direct request value. Type: int Unit: milli-cores |
| cpu_usage | The total CPU usage across all containers in this Pod. Type: float Unit: percent,percent |
| cpu_usage_base100 | The normalized CPU usage, with a maximum of 100%. Type: float Unit: percent,percent |
| cpu_usage_base_limit | The normalized CPU usage, with a maximum of 100%, based on the CPU limit. Type: float Unit: percent,percent |
| cpu_usage_base_request | The normalized CPU usage, with a maximum of 100%, based on the CPU request. Type: float Unit: percent,percent |
| cpu_usage_millicores | The total CPU usage (in millicores) averaged over the sample window for all containers. Type: int Unit: milli-cores |
| ephemeral_storage_available_bytes | The storage space available (bytes) for the filesystem. Type: int Unit: digital,B |
| ephemeral_storage_capacity_bytes | The total capacity (bytes) of the filesystem's underlying storage. Type: int Unit: digital,B |
| ephemeral_storage_used_bytes | The bytes used for a specific task on the filesystem. Type: int Unit: digital,B |
| mem_capacity | The total memory capacity of the host machine. Type: int Unit: digital,B |
| mem_limit | The total memory limit across all containers in this Pod. Note: This value is the sum of all container limit values, as Pods do not have a direct limit value. Type: int Unit: digital,B |
| mem_request | The total memory request across all containers in this Pod. Note: This value is the sum of all container request values, as Pods do not have a direct request value. Type: int Unit: digital,B |
| mem_rss | The total RSS memory usage of all containers in this Pod, which is not supported by metrics-server. Type: int Unit: digital,B |
| mem_usage | The total memory usage of all containers in this Pod. Type: int Unit: digital,B |
| mem_used_percent | The percentage of memory usage based on the host machine’s total memory capacity. Type: float Unit: percent,percent |
| mem_used_percent_base_100 | The percentage usage of the memory (refer to mem_used_percent). Type: float Unit: percent,percent |
| mem_used_percent_base_limit | The percentage of memory usage based on the memory limit. Type: float Unit: percent,percent |
| mem_used_percent_base_request | The percentage of memory usage based on the memory request. Type: float Unit: percent,percent |
| memory_capacity | The total memory in the host machine (Deprecated, use mem_capacity). Type: int Unit: digital,B |
| memory_usage_bytes | The sum of the memory usage of all containers in this Pod (Deprecated, use mem_usage). Type: int Unit: digital,B |
| memory_used_percent | The percentage usage of the memory (refer to mem_used_percent). Type: float Unit: percent,percent |
| message | Object details Type: string Unit: - |
| network_bytes_rcvd | Cumulative count of bytes received. Type: int Unit: digital,B |
| network_bytes_sent | Cumulative count of bytes transmitted. Type: int Unit: digital,B |
| ready | Describes whether the pod is ready to serve requests. Type: int Unit: count |
| restarts | The number of times the container has been restarted. Type: int Unit: count |
kubernetes_persistentvolumeclaims¶
| Tags & Fields | Description |
|---|---|
| <ALL-SELECTOR-MATCH-LABELS> (tag) | Represents the selector.matchLabels for Kubernetes resources. |
| cluster_name_k8s (tag) | K8s cluster name (default is default). Can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| name (tag) | The UID of PersistentVolumeClaim. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| persistentvolumeclaim_name (tag) | Name must be unique within a namespace. |
| uid (tag) | The UID of PersistentVolumeClaim. |
| workload_name (tag) | The name of the workload resource. |
| access_modes | AccessModes contains the desired access modes the volume should have. Type: string Unit: - |
| age | Age (seconds) Type: int Unit: time,s |
| message | Object details Type: string Unit: - |
| phase | The phase indicates if a volume is available, bound to a claim, or released by a claim.(Pending/Bound/Lost) Type: string Unit: - |
| requests_storage | Specifies the maximum storage capacity of a PersistentVolume (PV), which Kubernetes uses for scheduling and resource allocation. Type: string Unit: - |
| storage_class_name | StorageClassName is the name of the StorageClass required by the claim. Type: string Unit: - |
| volume_mode | VolumeMode defines what type of volume is required by the claim.(Block/Filesystem) Type: string Unit: - |
| volume_name | VolumeName is the binding reference to the PersistentVolume backing this claim. Type: string Unit: - |
kubernetes_persistentvolumes¶
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (default is default). Can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| name (tag) | The UID of PersistentVolume. |
| persistentvolume_name (tag) | The name of PersistentVolume. |
| uid (tag) | The UID of PersistentVolume. |
| workload_name (tag) | The name of the workload resource. |
| access_modes | AccessModes contains the desired access modes the volume should have. Type: string Unit: - |
| age | Age (seconds) Type: int Unit: time,s |
| capacity_storage | Specifies the maximum storage capacity of a PersistentVolume (PV), which Kubernetes uses for scheduling and resource allocation. Type: string Unit: - |
| claimRef_name | Name of the bound PersistentVolumeClaim. Type: string Unit: - |
| claimRef_namespace | Namespace of the PersistentVolumeClaim. Type: string Unit: - |
| message | Object details Type: string Unit: - |
| phase | The phase indicates if a volume is available, bound to a claim, or released by a claim.(Pending/Available/Bound/Released/Failed) Type: string Unit: - |
kubernetes_cron_jobs¶
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (default is default). Can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| cron_job_name (tag) | Name must be unique within a namespace. |
| name (tag) | The UID of CronJob. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| uid (tag) | The UID of CronJob. |
| workload_name (tag) | The name of the workload resource. |
| active_jobs | The number of pointers to currently running jobs. Type: int Unit: count |
| age | Age (seconds) Type: int Unit: time,s |
| message | Object details Type: string Unit: - |
| schedule | The schedule in Cron format, see doc Type: string Unit: - |
| suspend | This flag tells the controller to suspend subsequent executions, it does not apply to already started executions. Type: bool Unit: - |
kubernetes_daemonset¶
| Tags & Fields | Description |
|---|---|
| <ALL-SELECTOR-MATCH-LABELS> (tag) | Represents the selector.matchLabels for Kubernetes resources. |
| cluster_name_k8s (tag) | K8s cluster name (default is default). Can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| daemonset_name (tag) | Name must be unique within a namespace. |
| name (tag) | The UID of DaemonSet. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| uid (tag) | The UID of DaemonSet. |
| workload_name (tag) | The name of the workload resource. |
| age | Age (seconds) Type: int Unit: time,s |
| daemons_available | The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and available (ready for at least spec.minReadySeconds). Type: int Unit: count |
| daemons_unavailable | The number of nodes that should be running the daemon pod and have none of the daemon pod running and available (ready for at least spec.minReadySeconds). Type: int Unit: count |
| desired | The total number of nodes that should be running the daemon pod (including nodes correctly running the daemon pod). Type: int Unit: count |
| message | Object details Type: string Unit: - |
| misscheduled | The number of nodes that are running the daemon pod, but are not supposed to run the daemon pod. Type: int Unit: count |
| ready | The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and ready. Type: int Unit: count |
| scheduled | The number of nodes that are running at least one daemon pod and are supposed to run the daemon pod. Type: int Unit: count |
| updated | The total number of nodes that are running updated daemon pod. Type: int Unit: count |
kubernetes_deployments¶
| Tags & Fields | Description |
|---|---|
| <ALL-SELECTOR-MATCH-LABELS> (tag) | Represents the selector.matchLabels for Kubernetes resources. |
| cluster_name_k8s (tag) | K8s cluster name (default is default). Can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| deployment_name (tag) | Name must be unique within a namespace. |
| name (tag) | The UID of Deployment. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| uid (tag) | The UID of Deployment. |
| workload_name (tag) | The name of the workload resource. |
| age | Age (seconds) Type: int Unit: time,s |
| available | Total number of available pods (ready for at least minReadySeconds) targeted by this deployment. (Deprecated) Type: int Unit: count |
| max_surge | The maximum number of pods that can be scheduled above the desired number of pods. (Deprecated) Type: int Unit: count |
| max_unavailable | The maximum number of pods that can be unavailable during the update. (Deprecated) Type: int Unit: count |
| message | Object details Type: string Unit: - |
| paused | Indicates that the deployment is paused (true or false). Type: bool Unit: - |
| ready | The number of pods targeted by this Deployment with a Ready Condition. (Deprecated) Type: int Unit: count |
| replicas | Total number of non-terminated pods targeted by this deployment (their labels match the selector). Type: int Unit: count |
| replicas_available | Total number of available pods (ready for at least minReadySeconds) targeted by this deployment. Type: int Unit: count |
| replicas_desired | Number of desired pods for a Deployment. Type: int Unit: count |
| replicas_ready | The number of pods targeted by this Deployment with a Ready Condition. Type: int Unit: count |
| replicas_unavailable | Total number of unavailable pods targeted by this deployment. Type: int Unit: count |
| replicas_updated | Total number of non-terminated pods targeted by this deployment that have the desired template spec. Type: int Unit: count |
| rollingupdate_max_surge | The maximum number of pods that can be scheduled above the desired number of pods. Type: int Unit: count |
| rollingupdate_max_unavailable | The maximum number of pods that can be unavailable during the update. Type: int Unit: count |
| strategy | Type of deployment. Can be "Recreate" or "RollingUpdate". Default is RollingUpdate. Type: string Unit: - |
| unavailable | Total number of unavailable pods targeted by this deployment. (Deprecated) Type: int Unit: count |
| up_dated | Total number of non-terminated pods targeted by this deployment that have the desired template spec. (Deprecated) Type: int Unit: count |
kubernetes_dfpv¶
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (default is default). Can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| name (tag) | The dfpv name, consisting of the PVC name and the Pod name. |
| namespace (tag) | The namespace of Pod and PVC. |
| node_name (tag) | Reference to the Node. |
| pod_name (tag) | Reference to the Pod. |
| pvc_name (tag) | Reference to the PVC. |
| volume_mount_name (tag) | The name given to the Volume. |
| available | AvailableBytes represents the storage space available (bytes) for the filesystem. Type: int Unit: digital,B |
| capacity | CapacityBytes represents the total capacity (bytes) of the filesystem's underlying storage. Type: int Unit: digital,B |
| inodes | Inodes represents the total inodes in the filesystem. Type: int Unit: count |
| inodes_free | InodesFree represents the free inodes in the filesystem. Type: int Unit: count |
| inodes_used | InodesUsed represents the inodes used by the filesystem. Type: int Unit: count |
| message | Object details Type: string Unit: - |
| used | UsedBytes represents the bytes used for a specific task on the filesystem. Type: int Unit: digital,B |
kubernetes_jobs¶
| Tags & Fields | Description |
|---|---|
| <ALL-SELECTOR-MATCH-LABELS> (tag) | Represents the selector.matchLabels for Kubernetes resources. |
| cluster_name_k8s (tag) | K8s cluster name (default is default). Can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| job_name (tag) | Name must be unique within a namespace. |
| name (tag) | The UID of Job. |
| namespace (tag) | Namespace defines the space within which each name must be unique. |
| uid (tag) | The UID of Job. |
| workload_name (tag) | The name of the workload resource. |
| active | The number of actively running pods. Type: int Unit: count |
| active_deadline | Specifies the duration in seconds relative to the startTime that the job may be active before the system tries to terminate it Type: int Unit: time,s |
| age | Age (seconds) Type: int Unit: time,s |
| backoff_limit | Specifies the number of retries before marking this job failed. Type: int Unit: count |
| completions | Specifies the desired number of successfully finished pods the job should be run with. Type: int Unit: count |
| failed | The number of pods which reached phase Failed. Type: int Unit: count |
| message | Object details Type: string Unit: - |
| parallelism | Specifies the maximum desired number of pods the job should run at any given time. Type: int Unit: count |
| succeeded | The number of pods which reached phase Succeeded. Type: int Unit: count |
kubernetes_nodes¶
| Tags & Fields | Description |
|---|---|
| cluster_name_k8s (tag) | K8s cluster name (default is default). Can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| internal_ip (tag) | Node internal IP. |
| name (tag) | The UID of Node. |
| node_name (tag) | Name must be unique within a namespace. |
| role (tag) | Node role. (master/node) |
| status (tag) | NodePhase is the recently observed lifecycle phase of the node. (Pending/Running/Terminated) |
| uid (tag) | The UID of Node. |
| workload_name (tag) | The name of the workload resource. |
| age | Age (seconds). Type: int Unit: time,s |
| kubelet_version | Kubelet Version reported by the node. Type: string Unit: - |
| message | Object details. Type: string Unit: - |
| node_ready | NodeReady means kubelet is healthy and ready to accept pods (true/false/unknown). Type: string Unit: - |
| taints | Node's taints. Type: string Unit: - |
| unschedulable | Unschedulable controls node schedulability of new pods (yes/no). Type: string Unit: - |
docker_containers¶
Container object fields (only running containers can be collected)
| Tags & Fields | Description |
|---|---|
| aws_ecs_cluster_name (tag) | Cluster name of the AWS ECS. |
| cluster_name_k8s (tag) | K8s cluster name (default is default). Can be renamed in datakit.yaml via ENV_CLUSTER_NAME_K8S. |
| container_id (tag) | Container ID. |
| container_name (tag) | Container name from k8s (label io.kubernetes.container.name). If empty then use $container_runtime_name. |
| container_runtime (tag) | Container runtime (this container from Docker/Containerd/cri-o). |
| container_runtime_name (tag) | Container name from runtime (like 'docker ps'). If empty then use 'unknown'. |
| container_runtime_version (tag) | Container runtime version. |
| container_type (tag) | The type of the container (this container is created by Kubernetes/Docker/Containerd/cri-o). |
| daemonset (tag) | The name of the DaemonSet which the object belongs to. |
| deployment (tag) | The name of the Deployment which the object belongs to. |
| image (tag) | The full name of the container image, for example nginx.org/nginx:1.21.0. |
| image_name (tag) | The name of the container image, for example nginx.org/nginx. |
| image_short_name (tag) | The short name of the container image, for example nginx. |
| image_tag (tag) | The tag of the container image, for example 1.21.0. |
| name (tag) | The ID of the container. |
| namespace (tag) | The namespace of the container (label io.kubernetes.pod.namespace). |
| pod_name (tag) | The pod name of the container (label io.kubernetes.pod.name). |
| pod_uid (tag) | The pod uid of the container (label io.kubernetes.pod.uid). |
| state (tag) | The state of the Container (only Running). |
| statefulset (tag) | The name of the StatefulSet which the object belongs to. |
| status (tag) | The status of the container, for example Up 5 hours. |
| task_arn (tag) | The task ARN of the AWS Fargate. |
| task_family (tag) | The task family of the AWS Fargate. |
| task_version (tag) | The task version of the AWS Fargate. |
| workload_name (tag) | The name of the workload resource. |
| age | Age (seconds). Type: int Unit: time,s |
| block_read_byte | Total number of bytes read from the container file system (only supported on Docker). Type: int Unit: digital,B |
| block_write_byte | Total number of bytes written to the container file system (only supported on Docker). Type: int Unit: digital,B |
| cpu_limit_millicores | The CPU limit of the container, measured in milli-cores. Type: int Unit: milli-cores |
| cpu_numbers | The number of CPU cores on the system host. Type: int Unit: count |
| cpu_request_millicores | The CPU request of the container, measured in milli-cores (only supported in Kubernetes). Type: int Unit: milli-cores |
| cpu_usage | The actual CPU usage on the system host (percentage). Type: float Unit: percent,percent |
| cpu_usage_base100 | The normalized CPU usage, with a maximum value of 100%. It is calculated as the number of CPU cores multiplied by 100. Type: float Unit: percent,percent |
| cpu_usage_base_limit | The CPU usage based on the CPU limit (percentage). Type: float Unit: percent,percent |
| cpu_usage_base_request | The CPU usage based on the CPU request (percentage) (only supported in Kubernetes). Type: float Unit: percent,percent |
| cpu_usage_millicores | The CPU usage of the container, measured in milli-cores. Type: int Unit: milli-cores |
| mem_capacity | The total memory on the system host. Type: int Unit: digital,B |
| mem_limit | The memory limit of the container. Type: int Unit: digital,B |
| mem_request | The memory request of the container (only supported in Kubernetes). Type: int Unit: digital,B |
| mem_usage | The actual memory usage of the container. Type: int Unit: digital,B |
| mem_used_percent | The memory usage percentage based on the total memory of the system host. Type: float Unit: percent,percent |
| mem_used_percent_base_limit | The memory usage percentage based on the memory limit. Type: float Unit: percent,percent |
| mem_used_percent_base_request | The memory usage percentage based on the memory request (only supported in Kubernetes). Type: float Unit: percent,percent |
| message | Object details. Type: string Unit: - |
| network_bytes_rcvd | Total number of bytes received from the network (only count the usage of the main process in the container, excluding loopback). Type: int Unit: digital,B |
| network_bytes_sent | Total number of bytes sent to the network (only counts the usage of the main process in the container, excluding loopback). Type: int Unit: digital,B |
Logs¶
kubernetes_events¶
| Tags & Fields | Description |
|---|---|
| reason (tag) | A short, machine-understandable string that gives the reason for the transition into the object's current status. |
| type (tag) | Type of this event. |
| uid (tag) | The UID of event. |
| involved_kind | Kind of the referent for involved object. Type: string Unit: - |
| involved_name | Name of the involved object (unique within its namespace). Type: string Unit: - |
| involved_namespace | Namespace defines the space within which each name must be unique for involved object. Type: string Unit: - |
| involved_uid | The UID of involved object. Type: string Unit: - |
| message | Details of event log Type: string Unit: - |
| source_component | Component from which the event is generated. Type: string Unit: - |
| source_host | Node name on which the event is generated. Type: string Unit: - |
<CONTAINER-NAME>¶
Container log collection
| Tags & Fields | Description |
|---|---|
| container_id (tag) | Container ID. |
| container_name (tag) | Container name from k8s (label io.kubernetes.container.name). If empty then use $container_runtime_name. |
| daemonset (tag) | The name of the DaemonSet which the object belongs to. |
| deployment (tag) | The name of the Deployment which the object belongs to. |
| filepath (tag) | The filepath to the log file on the host system where the log is stored. |
| host (tag) | Host name. |
| image (tag) | The full name of the container image, for example nginx.org/nginx:1.21.0. |
| inside_filepath (tag) | The path to the log file inside the container (only applicable for log collection from within containers). |
| namespace (tag) | The namespace of the container (label io.kubernetes.pod.namespace). |
| pod_ip (tag) | The pod IP of the container. |
| pod_name (tag) | The pod name of the container (label io.kubernetes.pod.name). |
| service (tag) | The name of the service. If service is empty, source is used instead. |
| statefulset (tag) | The name of the StatefulSet which the object belongs to. |
| log_file_inode | The inode of the log file, which uniquely identifies it on the file system (requires enabling the global configuration enable_debug_fields). Type: int Unit: count |
| log_read_lines | The number of lines read from the file. Type: int Unit: count |
| log_read_offset | The current offset in the log file where reading has occurred, used to track progress during log collection (requires enabling the global configuration enable_debug_fields). Type: int Unit: count |
| message | The text of the log. Type: string Unit: - |
| status | The status of the log, default is info. Type: string Unit: - |
Change events¶
event¶
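Changes to the main Kubernetes resources (Pod, Deployment, Service, and so on) trigger change events of the following form. For the complete list of changes, see the full change list.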
| Tags & Fields | Description |
|---|---|
| class (tag) | The type of Kubernetes resource, e.g. kubernetes_deployments/kubernetes_nodes/.. |
| deployment_name/node_name/.. (tag) | The name of Kubernetes resource, e.g. deployment-abc-123 |
| df_event_id (tag) | The event ID is generated by UUIDv4, e.g. event-<lowercase UUIDv4>. |
| df_source (tag) | The event source is always change. |
| df_status (tag) | The event status is always info. |
| df_sub_status (tag) | Always info. |
| namespace (tag) | The namespace of Kubernetes resource. |
| uid (tag) | The UID of Kubernetes resource. |
| df_message | This is a template field, concatenated from other values: [{{df_resource_type}}] {{df_resource}} configuration changed. Type: string Unit: - |
| df_title | Diff text of resource changes. Type: string Unit: - |
| diff | Diff text of resource changes. Type: string Unit: - |
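Integration with Dataway Sink¶
For details on Dataway Sink, see its documentation.
All Kubernetes resource collection adds the Labels that match the CustomerKey. For example, if the CustomerKey is name, resources such as DaemonSet, Deployment, and Pod look up name in their own Labels and add it to their tags.
Containers add the Customer Labels of the Pod they belong to.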
FAQ¶
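Filtering metric collection by Pod namespace¶
After Kubernetes Pod metric collection is enabled (enable_pod_metric = true), DataKit collects metrics for every Pod in the cluster. Because this can produce a large volume of data, collection can be filtered on the Pod's namespace field so that only Pods in specific namespaces are collected.
The pod_include_metric and pod_exclude_metric options control which namespaces' Pods are included in or excluded from metric collection.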
## Collect metrics for a Pod when its namespace matches `datakit`
pod_include_metric = ["namespace:datakit"]
## Ignore all Pods whose namespace is `kodo`
pod_exclude_metric = ["namespace:kodo"]
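- The include and exclude options must start with a field name and use a glob-like expression of the form "<field name>:<glob rule>".
- Currently, namespace is the only supported filter field, for example namespace:datakit-ns.
If both include and exclude are set, a Pod must satisfy both conditions:
- It must match the include rules
- It must not match the exclude rules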
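For example, a configuration whose exclude rule matches every namespace, such as pod_include_metric = ["namespace:datakit"] combined with pod_exclude_metric = ["namespace:*"], filters out all Pods.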
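In a Kubernetes environment, the same options can be set through the following environment variables:
- ENV_INPUT_CONTAINER_POD_INCLUDE_METRIC
- ENV_INPUT_CONTAINER_POD_EXCLUDE_METRIC
For example, to collect metrics only for Pods in the kube-system namespace, set the ENV_INPUT_CONTAINER_POD_INCLUDE_METRIC environment variable to a rule such as namespace:kube-system.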
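As a minimal sketch of that setting, the fragment below adds the variable to the DataKit DaemonSet manifest. The container name datakit and the surrounding structure are assumptions for illustration, and the value is assumed to take the same "<field name>:<glob rule>" form as the container.conf options. Check both against the datakit.yaml actually deployed.

```yaml
# Fragment of a DaemonSet spec; only the env entry matters here.
# The container name "datakit" is an assumption, not taken from this document.
spec:
  template:
    spec:
      containers:
      - name: datakit
        env:
        - name: ENV_INPUT_CONTAINER_POD_INCLUDE_METRIC
          # Assumed to use the same "<field name>:<glob rule>" form as pod_include_metric
          value: "namespace:kube-system"
```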
This gives fine-grained control over which Pod metrics DataKit collects, which avoids gathering unneeded data and reduces resource usage.
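NODE_LOCAL requires new permissions¶
The ENV_INPUT_CONTAINER_ENABLE_K8S_NODE_LOCAL mode is recommended only for DaemonSet deployments. This mode needs access to the kubelet, so the nodes/stats permission must be added to the RBAC rules. For example: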
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: datakit
rules:
- apiGroups: [""]
  resources: ["nodes", "nodes/stats"]
  verbs: ["get", "list", "watch"]
In addition, the DataKit Pod must enable the hostNetwork: true option.
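The fragment below is a hedged sketch of where hostNetwork sits in the DaemonSet Pod template. The container name datakit, the env value "true", and the dnsPolicy line are assumptions for illustration. ClusterFirstWithHostNet is the usual Kubernetes companion to hostNetwork when the Pod still needs cluster DNS, but this document does not state that DataKit requires it.

```yaml
# Fragment of a DaemonSet spec; names and values marked below are assumptions.
spec:
  template:
    spec:
      hostNetwork: true
      # Assumption: keeps cluster DNS resolution working while on the host network
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: datakit            # assumed container name
        env:
        - name: ENV_INPUT_CONTAINER_ENABLE_K8S_NODE_LOCAL
          value: "true"          # assumed value format
```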
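Collecting PersistentVolumes and PersistentVolumeClaims requires new permissions¶
Starting with version 1.25.0, DataKit supports collecting object data for Kubernetes PersistentVolumes and PersistentVolumeClaims. Collecting these two resources requires new RBAC permissions, as follows: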
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: datakit
rules:
- apiGroups: [""]
  resources: ["persistentvolumes", "persistentvolumeclaims"]
  verbs: ["get", "list", "watch"]
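Masking sensitive fields in Kubernetes YAML¶
DataKit collects the yaml configuration of resources such as Kubernetes Pods and Services and stores it in the yaml field of the object data. If that yaml contains sensitive data (such as passwords), DataKit does not currently support manually configured masking of sensitive fields. The recommended approach is the official Kubernetes one: use a ConfigMap or Secret to hide sensitive fields.
For example, suppose a password needs to be added to env. Ordinarily it would be written as a literal value on the env entry.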
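A sketch of what such a plain-text entry might look like is shown here. It reuses the container, image, and variable names from the Secret example later in this section, and the literal value is purely illustrative.

```yaml
# Illustrative only: the password is written directly into the manifest.
containers:
- name: mycontainer
  image: redis
  env:
  - name: SECRET_PASSWORD
    value: password123   # plain text, readable by anyone who can view this yaml
```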
The orchestration yaml then stores the password in plain text, which is insecure. A Kubernetes Secret can be used to hide it, as follows:
Create a Secret:
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
# stringData accepts plain-text values; the data field would require base64-encoded values
stringData:
  username: username123
  password: password123
Apply the Secret to the cluster, for example with kubectl apply -f followed by the manifest file.
Use the Secret in env:
containers:
- name: mycontainer
  image: redis
  env:
  - name: SECRET_PASSWORD
    valueFrom:
      secretKeyRef:
        name: mysecret
        key: password
        optional: false
See the official documentation for details.