ClickHouse


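The ClickHouse collector gathers a range of metrics that a ClickHouse server instance actively exposes, such as the number of statements executed, memory usage, and IO activity, and reports them to Guance so that you can monitor and analyze abnormal ClickHouse behavior.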

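Configuration

Prerequisites

  • ClickHouse version >= v20.1.2.4
  • In the ClickHouse Server configuration file config.xml, find the following section, uncomment it, and set the port on which metrics are exposed (any port works as long as it is not already in use). Restart the server after the change (in a cluster, repeat this on every machine).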
vim /etc/clickhouse-server/config.xml
<prometheus>
    <endpoint>/metrics</endpoint>
    <port>9363</port>
    <metrics>true</metrics>
    <events>true</events>
    <asynchronous_metrics>true</asynchronous_metrics>
</prometheus>

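Field descriptions:

  • endpoint: HTTP route that the Prometheus server scrapes for metrics
  • port: port number of the endpoint
  • metrics: flag that exposes metrics from the ClickHouse system.metrics table
  • events: flag that exposes events from the ClickHouse system.events table
  • asynchronous_metrics: flag that exposes asynchronous metrics from the ClickHouse system.asynchronous_metrics table

For details, see the official ClickHouse documentation.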
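Once the server has been restarted, it can help to confirm that the endpoint really serves the expected metric families before configuring the collector. The following is a minimal sketch, not part of DataKit: the helper names `count_samples_by_prefix` and `fetch_metrics` are hypothetical, the URL assumes the port 9363 from the example above, and the prefixes are the ones used in the sample collector configuration on this page.

```python
from collections import Counter
from urllib.request import urlopen

# Metric name prefixes, as used in the sample collector configuration on this page.
PREFIXES = (
    "ClickHouseProfileEvents_",
    "ClickHouseMetrics_",
    "ClickHouseAsyncMetrics_",
    "ClickHouseStatusInfo_",
)


def count_samples_by_prefix(exposition_text: str) -> Counter:
    """Count sample lines per known prefix in Prometheus text exposition output."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        for prefix in PREFIXES:
            if line.startswith(prefix):
                counts[prefix] += 1
                break
    return counts


def fetch_metrics(url: str = "http://127.0.0.1:9363/metrics", timeout: float = 5.0) -> str:
    """Download the exposition text (hypothetical helper; URL assumes the example port)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `count_samples_by_prefix(fetch_metrics())` on the ClickHouse host should then report a non-zero count for each family whose flag is set to true; a count of zero suggests the corresponding flag is disabled or the section is still commented out.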

Collector configuration

Go to the conf.d/samples directory under the DataKit installation directory, copy clickhousev1.conf.sample, and rename the copy to clickhousev1.conf. For example:

[[inputs.clickhousev1]]
  ## Exporter URLs.
  urls = ["http://127.0.0.1:9363/metrics"]

  ## Unix Domain Socket URL. Using socket to request data when not empty.
  uds_path = ""

  ## Ignore URL request errors.
  ignore_req_err = false

  ## Collect data output.
  ## Set this to write the collected data to a local file instead of sending it to the center.
  ## Once set, use 'datakit debug --prom-conf /path/to/this/conf' to debug the locally stored measurement set.
  ## With '--prom-conf', the data in the 'output' path is debugged first.
  # output = "/abs/path/to/file"

  ## Collect data upper limit as bytes.
  ## Only available when set output to local file.
  ## If collect data exceeded the limit, the data would be dropped.
  ## Default is 32MB.
  # max_file_size = 0

  ## Metrics type whitelist. Optional: counter, gauge, histogram, summary
  ## Example: metric_types = ["counter", "gauge"], only collect 'counter' and 'gauge'.
  ## Default collect all.
  # metric_types = []

  ## Metrics name whitelist.
  ## Regex supported. Multi supported, conditions met when one matched.
  ## Collect all if empty.
  # metric_name_filter = ["cpu"]

  ## Metrics name blacklist.
  ## If a word both in blacklist and whitelist, blacklist priority.
  ## Regex supported. Multi supported, conditions met when one matched.
  ## Collect all if empty.
  # metric_name_filter_ignore = ["foo","bar"]

  ## Measurement prefix.
  ## Add prefix to measurement set name.
  measurement_prefix = ""

  ## Measurement name.
  ## If measurement_name is empty, split metric name by '_', the first field after split as measurement set name, the rest as current metric name.
  ## If measurement_name is not empty, using this as measurement set name.
  ## Always add 'measurement_prefix' prefix at last.
  # measurement_name = "clickhouse"

  ## TLS configuration.
  tls_open = false
  # tls_ca = "/tmp/ca.crt"
  # tls_cert = "/tmp/peer.crt"
  # tls_key = "/tmp/peer.key"

  ## Set to 'true' to enable election.
  election = true

  ## disable setting host tag for this input
  disable_host_tag = false

  ## disable setting instance tag for this input
  disable_instance_tag = false

  ## disable info tag for this input
  disable_info_tag = false

  ## Ignore tags. Multi supported.
  ## The matched tags would be dropped, but the item would still be sent.
  # tags_ignore = ["xxxx"]

  ## Customize authentication. Only Bearer Token is supported for now.
  ## Filling in 'token' or 'token_file' is acceptable.
  # [inputs.clickhousev1.auth]
    # type = "bearer_token"
    # token = "xxxxxxxx"
    # token_file = "/tmp/token"

  ## Customize measurement set name.
  ## Treat those metrics with prefix as one set.
  ## Takes priority over the 'measurement_name' configuration.
  [[inputs.clickhousev1.measurements]]
    prefix = "ClickHouseProfileEvents_"
    name = "ClickHouseProfileEvents"

  [[inputs.clickhousev1.measurements]]
    prefix = "ClickHouseMetrics_"
    name = "ClickHouseMetrics"

  [[inputs.clickhousev1.measurements]]
    prefix = "ClickHouseAsyncMetrics_"
    name = "ClickHouseAsyncMetrics"

  [[inputs.clickhousev1.measurements]]
    prefix = "ClickHouseStatusInfo_"
    name = "ClickHouseStatusInfo"

  ## Not collecting those data when tag matched.
  [inputs.clickhousev1.ignore_tag_kv_match]
    # key1 = [ "val1.*", "val2.*"]
    # key2 = [ "val1.*", "val2.*"]

  ## Add HTTP headers to data pulling.
  [inputs.clickhousev1.http_headers]
    # Root = "passwd"
    # Michael = "1234"

  ## Rename tag key in clickhouse data.
  [inputs.clickhousev1.tags_rename]
    overwrite_exist_tags = false
  [inputs.clickhousev1.tags_rename.mapping]
    # tag1 = "new-name-1"
    # tag2 = "new-name-2"
    # tag3 = "new-name-3"

  ## Customize tags.
  [inputs.clickhousev1.tags]
    # some_tag = "some_value"
    # more_tag = "some_other_value"

  ## (Optional) Collect interval: (defaults to "30s").
  # interval = "30s"

  ## (Optional) Timeout: (defaults to "30s").
  # timeout = "30s"

After the configuration is complete, restart DataKit.

Currently, the collector can be enabled by injecting its configuration through a ConfigMap.
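The comments on `metric_name_filter` and `metric_name_filter_ignore` in the sample configuration describe a whitelist and a blacklist of regular expressions, where the blacklist wins when both match. The sketch below illustrates that rule only; `keep_metric` is a hypothetical helper, and the use of unanchored, case-sensitive `re.search` is an assumption, so the actual DataKit matching may differ in detail.

```python
import re
from typing import Iterable


def keep_metric(name: str,
                whitelist: Iterable[str] = (),
                blacklist: Iterable[str] = ()) -> bool:
    """Return True if a metric name passes the whitelist/blacklist rules.

    Assumption: patterns are matched with re.search (unanchored, case-sensitive).
    """
    # Blacklist has priority: one matching pattern is enough to drop the metric.
    if any(re.search(pattern, name) for pattern in blacklist):
        return False
    patterns = list(whitelist)
    # An empty whitelist means "collect everything".
    if not patterns:
        return True
    # Otherwise at least one whitelist pattern must match.
    return any(re.search(pattern, name) for pattern in patterns)
```

For example, `keep_metric("ClickHouseAsyncMetrics_CPUFrequencyMHz", ["CPU"], ["Frequency"])` returns False under these assumptions, because the blacklist entry matches even though the whitelist entry does too.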

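Metrics

All data collected below has the global election tags appended by default. You can also specify additional tags in the configuration through [inputs.clickhousev1.tags]:

[inputs.clickhousev1.tags]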
  # some_tag = "some_value"
  # more_tag = "some_other_value"
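The measurement set names used in the tables below, such as ClickHouseAsyncMetrics, follow from the naming rules described in the comments of the sample configuration: a matching `measurements` prefix rule takes priority, then `measurement_name`, then splitting on the first `_`, with `measurement_prefix` always prepended. The sketch below restates those rules; `resolve_names` is a hypothetical helper, and two details are assumptions rather than documented behavior: the field name is kept unchanged when `measurement_name` is used, and a metric name without `_` is used as both set name and field name.

```python
from typing import Dict, Tuple

# Prefix rules copied from the sample configuration on this page.
RULES: Dict[str, str] = {
    "ClickHouseProfileEvents_": "ClickHouseProfileEvents",
    "ClickHouseMetrics_": "ClickHouseMetrics",
    "ClickHouseAsyncMetrics_": "ClickHouseAsyncMetrics",
    "ClickHouseStatusInfo_": "ClickHouseStatusInfo",
}


def resolve_names(metric: str,
                  rules: Dict[str, str],
                  measurement_name: str = "",
                  measurement_prefix: str = "") -> Tuple[str, str]:
    """Return (measurement set name, field name) for a raw metric name."""
    # 1. Prefix rules take priority; the matched prefix is stripped from the field name.
    for prefix, set_name in rules.items():
        if metric.startswith(prefix):
            return measurement_prefix + set_name, metric[len(prefix):]
    # 2. A non-empty measurement_name is used as the set name.
    #    Assumption: the field name stays the full metric name.
    if measurement_name:
        return measurement_prefix + measurement_name, metric
    # 3. Otherwise split on the first '_': head is the set name, the rest is the field.
    head, sep, rest = metric.partition("_")
    if not sep:
        # Assumption: with no '_' the whole name serves as both set and field name.
        return measurement_prefix + metric, metric
    return measurement_prefix + head, rest
```

With the rules above, `resolve_names("ClickHouseAsyncMetrics_Jitter", RULES)` yields the set ClickHouseAsyncMetrics and the field Jitter, which is how the fields in the following table are named.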

ClickHouseAsyncMetrics

Tags & Fields Description
cpu (tag) CPU ID
disk (tag) Disk name
eth (tag) Eth ID
host (tag) Host name
instance (tag) Instance endpoint
unit (tag) Unit name
AsynchronousHeavyMetricsCalculationTimeSpent Time in seconds spent calculating the asynchronous heavy (tables related) metrics; this is the overhead of asynchronous metrics.
Type: float | (gauge)
Unit: time,s
AsynchronousHeavyMetricsUpdateInterval Heavy (tables related) metrics update interval
Type: float | (gauge)
Unit: time,s
AsynchronousMetricsCalculationTimeSpent Time in seconds spent calculating the asynchronous metrics; this is the overhead of asynchronous metrics.
Type: float | (gauge)
Unit: time,s
AsynchronousMetricsUpdateInterval Metrics update interval
Type: float | (gauge)
Unit: time,s
BlockActiveTime Time in seconds the block device had the IO requests queued. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: time,s
BlockDiscardBytes Number of discarded bytes on the block device. These operations are relevant for SSD. Discard operations are not used by ClickHouse, but can be used by other processes on the system. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: digital,B
BlockDiscardMerges Number of discard operations requested from the block device and merged together by the OS IO scheduler. These operations are relevant for SSD. Discard operations are not used by ClickHouse, but can be used by other processes on the system. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: count
BlockDiscardOps Number of discard operations requested from the block device. These operations are relevant for SSD. Discard operations are not used by ClickHouse, but can be used by other processes on the system. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: count
BlockDiscardTime Time in seconds spent in discard operations requested from the block device, summed across all the operations. These operations are relevant for SSD. Discard operations are not used by ClickHouse, but can be used by other processes on the system. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: time,s
BlockInFlightOps This value counts the number of I/O requests that have been issued to the device driver but have not yet completed. It does not include IO requests that are in the queue but not yet issued to the device driver. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: count
BlockQueueTime This value counts the number of milliseconds that IO requests have waited on this block device. If there are multiple IO requests waiting, this value will increase as the product of the number of milliseconds times the number of requests waiting. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: time,ms
BlockReadBytes Number of bytes read from the block device. It can be lower than the number of bytes read from the filesystem due to the usage of the OS page cache, that saves IO. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: digital,B
BlockReadMerges Number of read operations requested from the block device and merged together by the OS IO scheduler. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: count
BlockReadOps Number of read operations requested from the block device. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: count
BlockReadTime Time in seconds spent in read operations requested from the block device, summed across all the operations. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: time,s
BlockWriteBytes Number of bytes written to the block device. It can be lower than the number of bytes written to the filesystem due to the usage of the OS page cache, that saves IO. A write to the block device may happen later than the corresponding write to the filesystem due to write-through caching. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: digital,B
BlockWriteMerges Number of write operations requested from the block device and merged together by the OS IO scheduler. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: count
BlockWriteOps Number of write operations requested from the block device. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: count
BlockWriteTime Time in seconds spent in write operations requested from the block device, summed across all the operations. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Source: /sys/block.
Type: float | (gauge)
Unit: time,s
CPUFrequencyMHz The current frequency of the CPU, in MHz. Most of the modern CPUs adjust the frequency dynamically for power saving and Turbo Boosting.
Type: float | (gauge)
Unit: frequency,MHz
CompiledExpressionCacheBytes Total bytes used for the cache of JIT-compiled code.
Type: float | (gauge)
Unit: digital,B
CompiledExpressionCacheCount Total entries in the cache of JIT-compiled code.
Type: float | (gauge)
Unit: count
DiskAvailable Available bytes on the disk (virtual filesystem). Remote filesystems can show a large value like 16 EiB.
Type: float | (gauge)
Unit: digital,B
DiskTotal The total size in bytes of the disk (virtual filesystem). Remote filesystems can show a large value like 16 EiB.
Type: float | (gauge)
Unit: digital,B
DiskUnreserved Available bytes on the disk (virtual filesystem) without the reservations for merges, fetches, and moves. Remote filesystems can show a large value like 16 EiB.
Type: float | (gauge)
Unit: digital,B
DiskUsed Used bytes on the disk (virtual filesystem). Remote filesystems do not always provide this information.
Type: float | (gauge)
Unit: digital,B
FilesystemCacheBytes Total bytes in the cache virtual filesystem. This cache is held on disk.
Type: float | (gauge)
Unit: digital,B
FilesystemCacheFiles Total number of cached file segments in the cache virtual filesystem. This cache is held on disk.
Type: float | (gauge)
Unit: count
FilesystemLogsPathAvailableBytes Available bytes on the volume where ClickHouse logs path is mounted. If this value approaches zero, you should tune the log rotation in the configuration file.
Type: float | (gauge)
Unit: digital,B
FilesystemLogsPathAvailableINodes The number of available inodes on the volume where ClickHouse logs path is mounted.
Type: float | (gauge)
Unit: count
FilesystemLogsPathTotalBytes The size of the volume where ClickHouse logs path is mounted, in bytes. It's recommended to have at least 10 GB for logs.
Type: float | (gauge)
Unit: digital,B
FilesystemLogsPathTotalINodes The total number of inodes on the volume where ClickHouse logs path is mounted.
Type: float | (gauge)
Unit: count
FilesystemLogsPathUsedBytes Used bytes on the volume where ClickHouse logs path is mounted.
Type: float | (gauge)
Unit: digital,B
FilesystemLogsPathUsedINodes The number of used inodes on the volume where ClickHouse logs path is mounted.
Type: float | (gauge)
Unit: count
FilesystemMainPathAvailableBytes Available bytes on the volume where the main ClickHouse path is mounted.
Type: float | (gauge)
Unit: digital,B
FilesystemMainPathAvailableINodes The number of available inodes on the volume where the main ClickHouse path is mounted. If it is close to zero, it indicates a misconfiguration, and you will get 'no space left on device' even when the disk is not full.
Type: float | (gauge)
Unit: count
FilesystemMainPathTotalBytes The size of the volume where the main ClickHouse path is mounted, in bytes.
Type: float | (gauge)
Unit: digital,B
FilesystemMainPathTotalINodes The total number of inodes on the volume where the main ClickHouse path is mounted. If it is less than 25 million, it indicates a misconfiguration.
Type: float | (gauge)
Unit: count
FilesystemMainPathUsedBytes Used bytes on the volume where the main ClickHouse path is mounted.
Type: float | (gauge)
Unit: digital,B
FilesystemMainPathUsedINodes The number of used inodes on the volume where the main ClickHouse path is mounted. This value mostly corresponds to the number of files.
Type: float | (gauge)
Unit: count
HTTPThreads Number of threads in the server of the HTTP interface (without TLS).
Type: float | (gauge)
Unit: count
InterserverThreads Number of threads in the server of the replicas communication protocol (without TLS).
Type: float | (gauge)
Unit: count
Jitter The difference in time the thread for calculation of the asynchronous metrics was scheduled to wake up and the time it was in fact, woken up. A proxy-indicator of overall system latency and responsiveness.
Type: float | (gauge)
Unit: time,s
LoadAverage The whole system load, averaged with exponential smoothing over 1 minute. The load represents the number of threads across all the processes (the scheduling entities of the OS kernel), that are currently running by CPU or waiting for IO, or ready to run but not being scheduled at this point of time. This number includes all the processes, not only clickhouse-server. The number can be greater than the number of CPU cores, if the system is overloaded, and many processes are ready to run but waiting for CPU or IO.
Type: float | (gauge)
Unit: count
MMapCacheCells The number of files opened with mmap (mapped in memory). This is used for queries with the setting local_filesystem_read_method set to mmap. The files opened with mmap are kept in the cache to avoid costly TLB flushes.
Type: float | (gauge)
Unit: count
MarkCacheBytes Total size of mark cache in bytes
Type: float | (gauge)
Unit: digital,B
MarkCacheFiles Total number of mark files cached in the mark cache
Type: float | (gauge)
Unit: count
MaxPartCountForPartition Maximum number of parts per partition across all partitions of all tables of MergeTree family. Values larger than 300 indicate misconfiguration, overload, or massive data loading.
Type: float | (gauge)
Unit: count
MemoryCode The amount of virtual memory mapped for the pages of machine code of the server process, in bytes.
Type: float | (gauge)
Unit: digital,B
MemoryDataAndStack The amount of virtual memory mapped for the use of stack and for the allocated memory, in bytes. It is unspecified whether it includes the per-thread stacks and most of the allocated memory, that is allocated with the mmap system call. This metric exists only for completeness reasons. It is recommended to use the MemoryResident metric for monitoring.
Type: float | (gauge)
Unit: digital,B
MemoryResident The amount of physical memory used by the server process, in bytes.
Type: float | (gauge)
Unit: digital,B
MemoryShared The amount of memory used by the server process, that is also shared by other processes, in bytes. ClickHouse does not use shared memory, but some memory can be labeled by OS as shared for its own reasons. This metric does not make a lot of sense to watch, and it exists only for completeness reasons.
Type: float | (gauge)
Unit: digital,B
MemoryVirtual The size of the virtual address space allocated by the server process, in bytes. The size of the virtual address space is usually much greater than the physical memory consumption, and should not be used as an estimate for the memory consumption. Large values of this metric are totally normal, and make only technical sense.
Type: float | (gauge)
Unit: digital,B
MySQLThreads Number of threads in the server of the MySQL compatibility protocol.
Type: float | (gauge)
Unit: count
NetworkReceiveBytes Number of bytes received via the network interface. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: digital,B
NetworkReceiveDrop Number of times a packet was dropped while being received via the network interface. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: digital,B
NetworkReceiveErrors Number of times an error happened while receiving via the network interface. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: count
NetworkReceivePackets Number of network packets received via the network interface. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: count
NetworkSendBytes Number of bytes sent via the network interface. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: digital,B
NetworkSendDrop Number of times a packet was dropped while sending via the network interface. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: count
NetworkSendErrors Number of times an error (e.g. TCP retransmit) happened while sending via the network interface. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: count
NetworkSendPackets Number of network packets sent via the network interface. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: count
NumberOfDatabases Total number of databases on the server.
Type: float | (gauge)
Unit: count
NumberOfDetachedByUserParts The total number of parts detached from MergeTree tables by users with the ALTER TABLE DETACH query (as opposed to unexpected, broken or ignored parts). The server does not care about detached parts and they can be removed.
Type: float | (gauge)
Unit: count
NumberOfDetachedParts The total number of parts detached from MergeTree tables. A part can be detached by a user with the ALTER TABLE DETACH query or by the server itself if the part is broken, unexpected or unneeded. The server does not care about detached parts and they can be removed.
Type: float | (gauge)
Unit: count
NumberOfTables Total number of tables summed across the databases on the server, excluding the databases that cannot contain MergeTree tables. The excluded database engines are those that generate the set of tables on the fly, like Lazy, MySQL, PostgreSQL, SQLite.
Type: float | (gauge)
Unit: count
OSContextSwitches The number of context switches that the system underwent on the host machine. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: count
OSGuestNiceTime The ratio of time spent running a virtual CPU for guest operating systems under the control of the Linux kernel, when a guest was set to a higher priority (See man procfs). This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. This metric is irrelevant for ClickHouse, but still exists for completeness. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSGuestNiceTimeCPU The ratio of time spent running a virtual CPU for guest operating systems under the control of the Linux kernel, when a guest was set to a higher priority (See man procfs). This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. This metric is irrelevant for ClickHouse, but still exists for completeness. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSGuestNiceTimeNormalized The value is similar to OSGuestNiceTime but divided by the number of CPU cores to be measured in the [0..1] interval regardless of the number of cores. This allows you to average the values of this metric across multiple servers in a cluster even if the number of cores is non-uniform, and still get the average resource utilization metric.
Type: float | (gauge)
Unit: rate
OSGuestTime The ratio of time spent running a virtual CPU for guest operating systems under the control of the Linux kernel (See man procfs). This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. This metric is irrelevant for ClickHouse, but still exists for completeness. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSGuestTimeCPU The ratio of time spent running a virtual CPU for guest operating systems under the control of the Linux kernel (See man procfs). This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. This metric is irrelevant for ClickHouse, but still exists for completeness. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSGuestTimeNormalized The value is similar to OSGuestTime but divided by the number of CPU cores to be measured in the [0..1] interval regardless of the number of cores. This allows you to average the values of this metric across multiple servers in a cluster even if the number of cores is non-uniform, and still get the average resource utilization metric.
Type: float | (gauge)
Unit: rate
OSIOWaitTime The ratio of time the CPU core was not running the code but when the OS kernel did not run any other process on this CPU as the processes were waiting for IO. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSIOWaitTimeCPU The ratio of time the CPU core was not running the code but when the OS kernel did not run any other process on this CPU as the processes were waiting for IO. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSIOWaitTimeNormalized The value is similar to OSIOWaitTime but divided by the number of CPU cores to be measured in the [0..1] interval regardless of the number of cores. This allows you to average the values of this metric across multiple servers in a cluster even if the number of cores is non-uniform, and still get the average resource utilization metric.
Type: float | (gauge)
Unit: rate
OSIdleTime The ratio of time the CPU core was idle (not even ready to run a process waiting for IO) from the OS kernel standpoint. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. This does not include the time when the CPU was under-utilized due to the reasons internal to the CPU (memory loads, pipeline stalls, branch mispredictions, running another SMT core). The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSIdleTimeCPU The ratio of time the CPU core was idle (not even ready to run a process waiting for IO) from the OS kernel standpoint. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. This does not include the time when the CPU was under-utilized due to the reasons internal to the CPU (memory loads, pipeline stalls, branch mispredictions, running another SMT core). The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSIdleTimeNormalized The value is similar to OSIdleTime but divided by the number of CPU cores to be measured in the [0..1] interval regardless of the number of cores. This allows you to average the values of this metric across multiple servers in a cluster even if the number of cores is non-uniform, and still get the average resource utilization metric.
Type: float | (gauge)
Unit: rate
OSInterrupts The number of interrupts on the host machine. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: count
OSIrqTime The ratio of time spent for running hardware interrupt requests on the CPU. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. A high number of this metric may indicate hardware misconfiguration or a very high network load. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSIrqTimeCPU The ratio of time spent for running hardware interrupt requests on the CPU. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. A high number of this metric may indicate hardware misconfiguration or a very high network load. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSIrqTimeNormalized The value is similar to OSIrqTime but divided by the number of CPU cores to be measured in the [0..1] interval regardless of the number of cores. This allows you to average the values of this metric across multiple servers in a cluster even if the number of cores is non-uniform, and still get the average resource utilization metric.
Type: float | (gauge)
Unit: rate
OSMemoryAvailable The amount of memory available to be used by programs, in bytes. This is very similar to the OSMemoryFreePlusCached metric. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: digital,B
OSMemoryBuffers The amount of memory used by OS kernel buffers, in bytes. This should be typically small, and large values may indicate a misconfiguration of the OS. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: digital,B
OSMemoryCached The amount of memory used by the OS page cache, in bytes. Typically, almost all available memory is used by the OS page cache - high values of this metric are normal and expected. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: digital,B
OSMemoryFreePlusCached The amount of free memory plus OS page cache memory on the host system, in bytes. This memory is available to be used by programs. The value should be very similar to OSMemoryAvailable. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: digital,B
OSMemoryFreeWithoutCached The amount of free memory on the host system, in bytes. This does not include the memory used by the OS page cache memory, in bytes. The page cache memory is also available for usage by programs, so the value of this metric can be confusing. See the OSMemoryAvailable metric instead. For convenience we also provide the OSMemoryFreePlusCached metric, that should be somewhat similar to OSMemoryAvailable. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: digital,B
OSMemorySwapCached The amount of memory in swap that was also loaded in RAM. Swap should be disabled on production systems. If the value of this metric is large, it indicates a misconfiguration. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: digital,B
OSMemoryTotal The total amount of memory on the host system, in bytes.
Type: float | (gauge)
Unit: digital,B
OSNiceTime The ratio of time the CPU core was running userspace code with higher priority. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSNiceTimeCPU The ratio of time the CPU core was running userspace code with higher priority. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSNiceTimeNormalized The value is similar to OSNiceTime but divided by the number of CPU cores to be measured in the [0..1] interval regardless of the number of cores. This allows you to average the values of this metric across multiple servers in a cluster even if the number of cores is non-uniform, and still get the average resource utilization metric.
Type: float | (gauge)
Unit: rate
OSOpenFiles The total number of opened files on the host machine. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: count
OSProcessesBlocked Number of threads blocked waiting for I/O to complete (man procfs). This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: count
OSProcessesCreated The number of processes created. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: count
OSProcessesRunning The number of runnable (running or ready to run) threads by the operating system. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
Type: float | (gauge)
Unit: count
OSSoftIrqTime The ratio of time spent for running software interrupt requests on the CPU. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. A high number of this metric may indicate inefficient software running on the system. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSSoftIrqTimeCPU The ratio of time spent for running software interrupt requests on the CPU. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. A high number of this metric may indicate inefficient software running on the system. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSSoftIrqTimeNormalized The value is similar to OSSoftIrqTime but divided by the number of CPU cores, so it is measured in the [0..1] interval regardless of the number of cores. This allows you to average the values of this metric across multiple servers in a cluster even if the number of cores is non-uniform, and still get the average resource utilization metric.
Type: float | (gauge)
Unit: rate
OSStealTime The ratio of time spent in other operating systems by the CPU when running in a virtualized environment. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Not every virtualized environment presents this metric, and most of them don't. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSStealTimeCPU The ratio of time spent in other operating systems by the CPU when running in a virtualized environment. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. Not every virtualized environment presents this metric, and most of them don't. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSStealTimeNormalized The value is similar to OSStealTime but divided by the number of CPU cores, so it is measured in the [0..1] interval regardless of the number of cores. This allows you to average the values of this metric across multiple servers in a cluster even if the number of cores is non-uniform, and still get the average resource utilization metric.
Type: float | (gauge)
Unit: rate
OSSystemTime The ratio of time the CPU core was running OS kernel (system) code. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSSystemTimeCPU The ratio of time the CPU core was running OS kernel (system) code. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSSystemTimeNormalized The value is similar to OSSystemTime but divided by the number of CPU cores, so it is measured in the [0..1] interval regardless of the number of cores. This allows you to average the values of this metric across multiple servers in a cluster even if the number of cores is non-uniform, and still get the average resource utilization metric.
Type: float | (gauge)
Unit: rate
OSThreadsRunnable The total number of 'runnable' threads, as seen by the OS kernel scheduler.
Type: float | (gauge)
Unit: count
OSThreadsTotal The total number of threads, as seen by the OS kernel scheduler.
Type: float | (gauge)
Unit: count
OSUptime The uptime of the host server (the machine where ClickHouse is running), in seconds.
Type: float | (gauge)
Unit: time,s
OSUserTime The ratio of time the CPU core was running userspace code. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. This includes also the time when the CPU was under-utilized due to the reasons internal to the CPU (memory loads, pipeline stalls, branch mispredictions, running another SMT core). The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSUserTimeCPU The ratio of time the CPU core was running userspace code. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. This includes also the time when the CPU was under-utilized due to the reasons internal to the CPU (memory loads, pipeline stalls, branch mispredictions, running another SMT core). The value for a single CPU core will be in the interval [0..1]. The value for all CPU cores is calculated as a sum across them [0..num cores].
Type: float | (gauge)
Unit: rate
OSUserTimeNormalized The value is similar to OSUserTime but divided by the number of CPU cores, so it is measured in the [0..1] interval regardless of the number of cores. This allows you to average the values of this metric across multiple servers in a cluster even if the number of cores is non-uniform, and still get the average resource utilization metric.
Type: float | (gauge)
Unit: rate
PostgreSQLThreads Number of threads in the server of the PostgreSQL compatibility protocol.
Type: float | (gauge)
Unit: count
PrometheusThreads Number of threads in the server of the Prometheus endpoint. Note: Prometheus endpoints can be also used via the usual HTTP/HTTPs ports.
Type: float | (gauge)
Unit: count
ReplicasMaxAbsoluteDelay Maximum difference in seconds between the most fresh replicated part and the most fresh data part still to be replicated, across Replicated tables. A very high value indicates a replica with no data.
Type: float | (gauge)
Unit: time,s
ReplicasMaxInsertsInQueue Maximum number of INSERT operations in the queue (still to be replicated) across Replicated tables.
Type: float | (gauge)
Unit: count
ReplicasMaxMergesInQueue Maximum number of merge operations in the queue (still to be applied) across Replicated tables.
Type: float | (gauge)
Unit: count
ReplicasMaxQueueSize Maximum queue size (in the number of operations like get, merge) across Replicated tables.
Type: float | (gauge)
Unit: count
ReplicasMaxRelativeDelay Maximum difference between the replica delay and the delay of the most up-to-date replica of the same table, across Replicated tables.
Type: float | (gauge)
Unit: time,s
ReplicasSumInsertsInQueue Sum of INSERT operations in the queue (still to be replicated) across Replicated tables.
Type: float | (gauge)
Unit: count
ReplicasSumMergesInQueue Sum of merge operations in the queue (still to be applied) across Replicated tables.
Type: float | (gauge)
Unit: count
ReplicasSumQueueSize Sum of queue sizes (in the number of operations like get, merge) across Replicated tables.
Type: float | (gauge)
Unit: count
TCPThreads Number of threads in the server of the TCP protocol (without TLS).
Type: float | (gauge)
Unit: count
Temperature The temperature of the corresponding device in ℃. A sensor can return an unrealistic value. Source: /sys/class/thermal
Type: float | (gauge)
Unit: temperature,C
TotalBytesOfMergeTreeTables Total amount of bytes (compressed, including data and indices) stored in all tables of MergeTree family.
Type: float | (gauge)
Unit: digital,B
TotalPartsOfMergeTreeTables Total amount of data parts in all tables of MergeTree family. Numbers larger than 10 000 will negatively affect the server startup time and it may indicate unreasonable choice of the partition key.
Type: float | (gauge)
Unit: count
TotalRowsOfMergeTreeTables Total amount of rows (records) stored in all tables of MergeTree family.
Type: float | (gauge)
Unit: count
UncompressedCacheBytes Total size of uncompressed cache in bytes. Uncompressed cache does not usually improve the performance and should be mostly avoided.
Type: float | (gauge)
Unit: digital,B
UncompressedCacheCells Total number of entries in the uncompressed cache. Each entry represents a decompressed block of data. Uncompressed cache does not usually improve performance and should be mostly avoided.
Type: float | (gauge)
Unit: count
Uptime The server uptime in seconds. It includes the time spent for server initialization before accepting connections.
Type: float | (gauge)
Unit: time,s
jemalloc_active Total number of bytes in active pages allocated by the application. This is a multiple of the page size, and greater than or equal to jemalloc_allocated.
Type: float | (gauge)
Unit: digital,B
jemalloc_allocated Total number of bytes allocated by the application.
Type: float | (gauge)
Unit: digital,B
jemalloc_arenas_all_dirty_purged Number of madvise() or similar calls made to purge dirty pages.
Type: float | (gauge)
Unit: count
jemalloc_arenas_all_muzzy_purged Number of muzzy page purge sweeps performed.
Type: float | (gauge)
Unit: count
jemalloc_arenas_all_pactive Number of pages in active extents.
Type: float | (gauge)
Unit: count
jemalloc_arenas_all_pdirty Number of pages within unused extents that are potentially dirty, and for which madvise() or similar has not been called.
Type: float | (gauge)
Unit: count
jemalloc_arenas_all_pmuzzy Number of pages within unused extents that are muzzy.
Type: float | (gauge)
Unit: count
jemalloc_background_thread_num_runs Total number of runs from all background threads.
Type: float | (gauge)
Unit: count
jemalloc_background_thread_num_threads Number of background threads running currently.
Type: float | (gauge)
Unit: count
jemalloc_background_thread_run_intervals Average run interval in nanoseconds of background threads.
Type: float | (gauge)
Unit: time,ns
jemalloc_epoch An internal incremental update number of the statistics of jemalloc (Jason Evans' memory allocator), used in all other jemalloc metrics.
Type: float | (gauge)
Unit: count
jemalloc_mapped Total number of bytes in active extents mapped by the allocator.
Type: float | (gauge)
Unit: digital,B
jemalloc_metadata Total number of bytes dedicated to metadata, which comprise base allocations used for bootstrap-sensitive allocator metadata structures.
Type: float | (gauge)
Unit: digital,B
jemalloc_metadata_thp Number of transparent huge pages (THP) used for metadata.
Type: float | (gauge)
Unit: count
jemalloc_resident Maximum number of bytes in physically resident data pages mapped by the allocator.
Type: float | (gauge)
Unit: digital,B
jemalloc_retained Total number of bytes in virtual memory mappings.
Type: float | (gauge)
Unit: digital,B
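
The `*Normalized` metrics above (for example OSUserTimeNormalized) are described as the summed per-core value divided by the number of CPU cores. The arithmetic can be sketched as follows. This is a minimal illustration, not DataKit or ClickHouse code: the function names and the sample readings are hypothetical.

```python
from typing import Sequence


def normalize_cpu_ratio(summed_ratio: float, num_cores: int) -> float:
    """Divide a ratio summed across cores (range [0..num_cores]) by the core count."""
    if num_cores <= 0:
        raise ValueError("num_cores must be positive")
    return summed_ratio / num_cores


def cluster_average(normalized_values: Sequence[float]) -> float:
    """Average already-normalized ratios across servers."""
    if not normalized_values:
        raise ValueError("at least one server value is required")
    return sum(normalized_values) / len(normalized_values)


# Hypothetical sample readings: (OSUserTime summed over all cores, core count).
servers = [(2.0, 4), (12.0, 16)]
normalized = [normalize_cpu_ratio(total, cores) for total, cores in servers]
print(normalized)                   # [0.5, 0.75]
print(cluster_average(normalized))  # 0.625
```

Averaging the raw sums (2.0 and 12.0) would be misleading because the two hosts have different core counts; normalizing first puts both servers on the same [0..1] scale.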

ClickHouseMetrics

Tags & Fields Description
host
(tag)
Host name
instance
(tag)
Instance endpoint
ActiveAsyncDrainedConnections Number of active connections drained asynchronously.
Type: float | (gauge)
Unit: count
ActiveSyncDrainedConnections Number of active connections drained synchronously.
Type: float | (gauge)
Unit: count
AggregatorThreads Number of threads in the Aggregator thread pool.
Type: float | (gauge)
Unit: count
AggregatorThreadsActive Number of threads in the Aggregator thread pool running a task.
Type: float | (gauge)
Unit: count
AsyncDrainedConnections Number of connections drained asynchronously.
Type: float | (gauge)
Unit: count
AsyncInsertCacheSize Number of async insert hash IDs in the cache.
Type: float | (gauge)
Unit: count
AsynchronousInsertThreads Number of threads in the AsynchronousInsert thread pool.
Type: float | (gauge)
Unit: count
AsynchronousInsertThreadsActive Number of threads in the AsynchronousInsert thread pool running a task.
Type: float | (gauge)
Unit: count
AsynchronousReadWait Number of threads waiting for asynchronous read.
Type: float | (gauge)
Unit: count
BackgroundBufferFlushSchedulePoolSize Limit on number of tasks in BackgroundBufferFlushSchedulePool
Type: float | (gauge)
Unit: count
BackgroundBufferFlushSchedulePoolTask Number of active tasks in BackgroundBufferFlushSchedulePool. This pool is used for periodic Buffer flushes
Type: float | (gauge)
Unit: count
BackgroundCommonPoolSize Limit on number of tasks in an associated background pool
Type: float | (gauge)
Unit: count
BackgroundCommonPoolTask Number of active tasks in an associated background pool
Type: float | (gauge)
Unit: count
BackgroundDistributedSchedulePoolSize Limit on number of tasks in BackgroundDistributedSchedulePool
Type: float | (gauge)
Unit: count
BackgroundDistributedSchedulePoolTask Number of active tasks in BackgroundDistributedSchedulePool. This pool is used for distributed sends that are done in the background.
Type: float | (gauge)
Unit: count
BackgroundFetchesPoolSize Limit on number of simultaneous fetches in an associated background pool
Type: float | (gauge)
Unit: count
BackgroundFetchesPoolTask Number of active fetches in an associated background pool
Type: float | (gauge)
Unit: count
BackgroundMergesAndMutationsPoolSize Limit on number of active merges and mutations in an associated background pool
Type: float | (gauge)
Unit: count
BackgroundMergesAndMutationsPoolTask Number of active merges and mutations in an associated background pool
Type: float | (gauge)
Unit: count
BackgroundMessageBrokerSchedulePoolSize Limit on number of tasks in BackgroundProcessingPool for message streaming
Type: float | (gauge)
Unit: count
BackgroundMessageBrokerSchedulePoolTask Number of active tasks in BackgroundProcessingPool for message streaming
Type: float | (gauge)
Unit: count
BackgroundMovePoolSize Limit on number of tasks in BackgroundProcessingPool for moves
Type: float | (gauge)
Unit: count
BackgroundMovePoolTask Number of active tasks in BackgroundProcessingPool for moves
Type: float | (gauge)
Unit: count
BackgroundPoolTask Number of active tasks in BackgroundProcessingPool (merges, mutations, or replication queue bookkeeping)
Type: float | (gauge)
Unit: count
BackgroundSchedulePoolSize Limit on number of tasks in BackgroundSchedulePool. This pool is used for periodic ReplicatedMergeTree tasks, like cleaning old data parts, altering data parts, replica re-initialization, etc.
Type: float | (gauge)
Unit: count
BackgroundSchedulePoolTask Number of active tasks in BackgroundSchedulePool. This pool is used for periodic ReplicatedMergeTree tasks, like cleaning old data parts, altering data parts, replica re-initialization, etc.
Type: float | (gauge)
Unit: count
BackupsIOThreads Number of threads in the BackupsIO thread pool.
Type: float | (gauge)
Unit: count
BackupsIOThreadsActive Number of threads in the BackupsIO thread pool running a task.
Type: float | (gauge)
Unit: count
BackupsThreads Number of threads in the thread pool for BACKUP.
Type: float | (gauge)
Unit: count
BackupsThreadsActive Number of threads in thread pool for BACKUP running a task.
Type: float | (gauge)
Unit: count
BrokenDistributedFilesToInsert Number of files for asynchronous insertion into Distributed tables that have been marked as broken. This metric starts from 0 on server start. Number of files for every shard is summed.
Type: float | (gauge)
Unit: count
CacheDetachedFileSegments Number of existing detached cache file segments
Type: float | (gauge)
Unit: count
CacheDictionaryThreads Number of threads in the CacheDictionary thread pool.
Type: float | (gauge)
Unit: count
CacheDictionaryThreadsActive Number of threads in the CacheDictionary thread pool running a task.
Type: float | (gauge)
Unit: count
CacheDictionaryUpdateQueueBatches Number of 'batches' (a set of keys) in update queue in CacheDictionaries.
Type: float | (gauge)
Unit: count
CacheDictionaryUpdateQueueKeys Exact number of keys in update queue in CacheDictionaries.
Type: float | (gauge)
Unit: count
CacheFileSegments Number of existing cache file segments
Type: float | (gauge)
Unit: count
ContextLockWait Number of threads waiting for the lock in Context. This is a global lock.
Type: float | (gauge)
Unit: count
DDLWorkerThreads Number of threads in the DDLWorker thread pool for ON CLUSTER queries.
Type: float | (gauge)
Unit: count
DDLWorkerThreadsActive Number of threads in the DDLWorker thread pool for ON CLUSTER queries running a task.
Type: float | (gauge)
Unit: count
DatabaseCatalogThreads Number of threads in the DatabaseCatalog thread pool.
Type: float | (gauge)
Unit: count
DatabaseCatalogThreadsActive Number of threads in the DatabaseCatalog thread pool running a task.
Type: float | (gauge)
Unit: count
DatabaseOnDiskThreads Number of threads in the DatabaseOnDisk thread pool.
Type: float | (gauge)
Unit: count
DatabaseOnDiskThreadsActive Number of threads in the DatabaseOnDisk thread pool running a task.
Type: float | (gauge)
Unit: count
DatabaseOrdinaryThreads Number of threads in the Ordinary database thread pool.
Type: float | (gauge)
Unit: count
DatabaseOrdinaryThreadsActive Number of threads in the Ordinary database thread pool running a task.
Type: float | (gauge)
Unit: count
DelayedInserts Number of INSERT queries that are throttled due to high number of active data parts for partition in a MergeTree table.
Type: float | (gauge)
Unit: count
DestroyAggregatesThreads Number of threads in the thread pool for destroy aggregate states.
Type: float | (gauge)
Unit: count
DestroyAggregatesThreadsActive Number of threads in the thread pool for destroy aggregate states running a task.
Type: float | (gauge)
Unit: count
DictCacheRequests Number of requests in flight to data sources of dictionaries of cache type.
Type: float | (gauge)
Unit: count
DiskObjectStorageAsyncThreads Number of threads in the async thread pool for DiskObjectStorage.
Type: float | (gauge)
Unit: count
DiskObjectStorageAsyncThreadsActive Number of threads in the async thread pool for DiskObjectStorage running a task.
Type: float | (gauge)
Unit: count
DiskSpaceReservedForMerge Disk space reserved for currently running background merges. It is slightly more than the total size of currently merging parts.
Type: float | (gauge)
Unit: digital,B
DistributedFilesToInsert Number of pending files to process for asynchronous insertion into Distributed tables. Number of files for every shard is summed.
Type: float | (gauge)
Unit: count
DistributedInsertThreads Number of threads used for INSERT into Distributed.
Type: float | (gauge)
Unit: count
DistributedInsertThreadsActive Number of threads used for INSERT into Distributed running a task.
Type: float | (gauge)
Unit: count
DistributedSend Number of connections to remote servers sending data that was INSERTed into Distributed tables. Both synchronous and asynchronous mode.
Type: float | (gauge)
Unit: count
EphemeralNode Number of ephemeral nodes held in ZooKeeper.
Type: float | (gauge)
Unit: count
FilesystemCacheElements Filesystem cache elements (file segments)
Type: float | (gauge)
Unit: count
FilesystemCacheReadBuffers Number of active cache buffers
Type: float | (gauge)
Unit: count
FilesystemCacheSize Filesystem cache size in bytes
Type: float | (gauge)
Unit: digital,B
GlobalThread Number of threads in global thread pool.
Type: float | (gauge)
Unit: count
GlobalThreadActive Number of threads in global thread pool running a task.
Type: float | (gauge)
Unit: count
HTTPConnection Number of connections to HTTP server
Type: float | (gauge)
Unit: count
HashedDictionaryThreads Number of threads in the HashedDictionary thread pool.
Type: float | (gauge)
Unit: count
HashedDictionaryThreadsActive Number of threads in the HashedDictionary thread pool running a task.
Type: float | (gauge)
Unit: count
IOPrefetchThreads Number of threads in the IO prefetch thread pool.
Type: float | (gauge)
Unit: count
IOPrefetchThreadsActive Number of threads in the IO prefetch thread pool running a task.
Type: float | (gauge)
Unit: count
IOThreads Number of threads in the IO thread pool.
Type: float | (gauge)
Unit: count
IOThreadsActive Number of threads in the IO thread pool running a task.
Type: float | (gauge)
Unit: count
IOUringInFlightEvents Number of io_uring SQEs in flight
Type: float | (gauge)
Unit: count
IOUringPendingEvents Number of io_uring SQEs waiting to be submitted
Type: float | (gauge)
Unit: count
IOWriterThreads Number of threads in the IO writer thread pool.
Type: float | (gauge)
Unit: count
IOWriterThreadsActive Number of threads in the IO writer thread pool running a task.
Type: float | (gauge)
Unit: count
InterserverConnection Number of connections from other replicas to fetch parts
Type: float | (gauge)
Unit: count
KafkaAssignedPartitions Number of partitions currently assigned to Kafka tables
Type: float | (gauge)
Unit: count
KafkaBackgroundReads Number of background reads currently working (populating materialized views from Kafka)
Type: float | (gauge)
Unit: count
KafkaConsumers Number of active Kafka consumers
Type: float | (gauge)
Unit: count
KafkaConsumersInUse Number of consumers which are currently used by direct or background reads
Type: float | (gauge)
Unit: count
KafkaConsumersWithAssignment Number of active Kafka consumers which have some partitions assigned.
Type: float | (gauge)
Unit: count
KafkaLibrdkafkaThreads Number of active librdkafka threads
Type: float | (gauge)
Unit: count
KafkaProducers Number of active Kafka producers created
Type: float | (gauge)
Unit: count
KafkaWrites Number of currently running inserts to Kafka
Type: float | (gauge)
Unit: count
KeeperAliveConnections Number of alive connections
Type: float | (gauge)
Unit: count
KeeperOutstandingRequets Number of outstanding requests
Type: float | (gauge)
Unit: count
LocalThread Number of threads in local thread pools. The threads in local thread pools are taken from the global thread pool.
Type: float | (gauge)
Unit: count
LocalThreadActive Number of threads in local thread pools running a task.
Type: float | (gauge)
Unit: count
MMappedAllocBytes Sum bytes of mmapped allocations
Type: float | (gauge)
Unit: digital,B
MMappedAllocs Total number of mmapped allocations
Type: float | (gauge)
Unit: count
MMappedFileBytes Sum size of mmapped file regions.
Type: float | (gauge)
Unit: digital,B
MMappedFiles Total number of mmapped files.
Type: float | (gauge)
Unit: count
MarksLoaderThreads Number of threads in thread pool for loading marks.
Type: float | (gauge)
Unit: count
MarksLoaderThreadsActive Number of threads in the thread pool for loading marks running a task.
Type: float | (gauge)
Unit: count
MaxDDLEntryID Max processed DDL entry of DDLWorker.
Type: float | (gauge)
Unit: count
MaxPushedDDLEntryID Max DDL entry of DDLWorker that was pushed to ZooKeeper.
Type: float | (gauge)
Unit: count
MemoryTracking Total amount of memory (bytes) allocated by the server.
Type: float | (gauge)
Unit: digital,B
Merge Number of executing background merges
Type: float | (gauge)
Unit: count
MergeTreeAllRangesAnnouncementsSent The current number of announcements in flight from the remote server to the initiator server about the set of data parts (for MergeTree tables). Measured on the remote server side.
Type: float | (gauge)
Unit: count
MergeTreeBackgroundExecutorThreads Number of threads in the MergeTreeBackgroundExecutor thread pool.
Type: float | (gauge)
Unit: count
MergeTreeBackgroundExecutorThreadsActive Number of threads in the MergeTreeBackgroundExecutor thread pool running a task.
Type: float | (gauge)
Unit: count
MergeTreeDataSelectExecutorThreads Number of threads in the MergeTreeDataSelectExecutor thread pool.
Type: float | (gauge)
Unit: count
MergeTreeDataSelectExecutorThreadsActive Number of threads in the MergeTreeDataSelectExecutor thread pool running a task.
Type: float | (gauge)
Unit: count
MergeTreePartsCleanerThreads Number of threads in the MergeTree parts cleaner thread pool.
Type: float | (gauge)
Unit: count
MergeTreePartsCleanerThreadsActive Number of threads in the MergeTree parts cleaner thread pool running a task.
Type: float | (gauge)
Unit: count
MergeTreePartsLoaderThreads Number of threads in the MergeTree parts loader thread pool.
Type: float | (gauge)
Unit: count
MergeTreePartsLoaderThreadsActive Number of threads in the MergeTree parts loader thread pool running a task.
Type: float | (gauge)
Unit: count
MergeTreeReadTaskRequestsSent The current number of callback requests in flight from the remote server back to the initiator server to choose the read task (for MergeTree tables). Measured on the remote server side.
Type: float | (gauge)
Unit: count
Move Number of currently executing moves
Type: float | (gauge)
Unit: count
MySQLConnection Number of client connections using MySQL protocol
Type: float | (gauge)
Unit: count
NetworkReceive Number of threads receiving data from network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.
Type: float | (gauge)
Unit: count
NetworkSend Number of threads sending data to network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.
Type: float | (gauge)
Unit: count
OpenFileForRead Number of files open for reading
Type: float | (gauge)
Unit: count
OpenFileForWrite Number of files open for writing
Type: float | (gauge)
Unit: count
ParallelFormattingOutputFormatThreads Number of threads in the ParallelFormattingOutputFormatThreads thread pool.
Type: float | (gauge)
Unit: count
ParallelFormattingOutputFormatThreadsActive Number of threads in the ParallelFormattingOutputFormatThreads thread pool running a task.
Type: float | (gauge)
Unit: count
ParallelParsingInputFormatThreads Number of threads in the ParallelParsingInputFormat thread pool.
Type: float | (gauge)
Unit: count
ParallelParsingInputFormatThreadsActive Number of threads in the ParallelParsingInputFormat thread pool running a task.
Type: float | (gauge)
Unit: count
ParquetDecoderThreads Number of threads in the ParquetBlockInputFormat thread pool.
Type: float | (gauge)
Unit: count
ParquetDecoderThreadsActive Number of threads in the ParquetBlockInputFormat thread pool running a task.
Type: float | (gauge)
Unit: count
PartMutation Number of mutations (ALTER DELETE/UPDATE)
Type: float | (gauge)
Unit: count
PartsActive Active data part, used by current and upcoming SELECTs.
Type: float | (gauge)
Unit: count
PartsCommitted Deprecated. See PartsActive.
Type: float | (gauge)
Unit: count
PartsCompact Compact parts.
Type: float | (gauge)
Unit: count
PartsDeleteOnDestroy Part was moved to another disk and should be deleted in own destructor.
Type: float | (gauge)
Unit: count
PartsDeleting Inactive data part with identity refcounter; it is being deleted right now by a cleaner.
Type: float | (gauge)
Unit: count
PartsInMemory In-memory parts.
Type: float | (gauge)
Unit: count
PartsOutdated Inactive data part that can still be used, but only by current SELECTs; it can be deleted after those SELECTs finish.
Type: float | (gauge)
Unit: count
PartsPreActive The part is in data_parts, but not used for SELECTs.
Type: float | (gauge)
Unit: count
PartsPreCommitted Deprecated. See PartsPreActive.
Type: float | (gauge)
Unit: count
PartsTemporary The part is being generated now; it is not in the data_parts list.
Type: float | (gauge)
Unit: count
PartsWide Wide parts.
Type: float | (gauge)
Unit: count
PendingAsyncInsert Number of asynchronous inserts that are waiting for flush.
Type: float | (gauge)
Unit: count
PostgreSQLConnection Number of client connections using PostgreSQL protocol
Type: float | (gauge)
Unit: count
Query Number of executing queries
Type: float | (count)
Unit: count
QueryPipelineExecutorThreads Number of threads in the PipelineExecutor thread pool.
Type: float | (gauge)
Unit: count
QueryPipelineExecutorThreadsActive Number of threads in the PipelineExecutor thread pool running a task.
Type: float | (gauge)
Unit: count
QueryPreempted Number of queries that are stopped and waiting due to 'priority' setting.
Type: float | (gauge)
Unit: count
QueryThread Number of query processing threads
Type: float | (gauge)
Unit: count
RWLockActiveReaders Number of threads holding read lock in a table RWLock.
Type: float | (gauge)
Unit: count
RWLockActiveWriters Number of threads holding write lock in a table RWLock.
Type: float | (gauge)
Unit: count
RWLockWaitingReaders Number of threads waiting for read on a table RWLock.
Type: float | (gauge)
Unit: count
RWLockWaitingWriters Number of threads waiting for write on a table RWLock.
Type: float | (gauge)
Unit: count
Read Number of read (read, pread, io_getevents, etc.) syscalls in flight
Type: float | (gauge)
Unit: count
ReadTaskRequestsSent The current number of callback requests in flight from the remote server back to the initiator server to choose the read task (for s3Cluster table function and similar). Measured on the remote server side.
Type: float | (gauge)
Unit: count
ReadonlyReplica Number of Replicated tables that are currently in readonly state due to re-initialization after ZooKeeper session loss or due to startup without ZooKeeper configured.
Type: float | (gauge)
Unit: count
RemoteRead Number of reads with a remote reader in flight
Type: float | (gauge)
Unit: count
ReplicatedChecks Number of data parts checking for consistency
Type: float | (gauge)
Unit: count
ReplicatedFetch Number of data parts being fetched from replica
Type: float | (gauge)
Unit: count
ReplicatedSend Number of data parts being sent to replicas
Type: float | (gauge)
Unit: count
RestartReplicaThreads Number of threads in the RESTART REPLICA thread pool.
Type: float | (gauge)
Unit: count
RestartReplicaThreadsActive Number of threads in the RESTART REPLICA thread pool running a task.
Type: float | (gauge)
Unit: count
RestoreThreads Number of threads in the thread pool for RESTORE.
Type: float | (gauge)
Unit: count
RestoreThreadsActive Number of threads in the thread pool for RESTORE running a task.
Type: float | (gauge)
Unit: count
Revision Revision of the server. It is a number incremented for every release or release candidate except patch releases.
Type: float | (gauge)
Unit: count
S3Requests S3 requests
Type: float | (gauge)
Unit: count
SendExternalTables Number of connections that are sending data for external tables to remote servers. External tables are used to implement GLOBAL IN and GLOBAL JOIN operators with distributed subqueries.
Type: float | (gauge)
Unit: count
SendScalars Number of connections that are sending data for scalars to remote servers.
Type: float | (gauge)
Unit: count
StartupSystemTablesThreads Number of threads in the StartupSystemTables thread pool.
Type: float | (gauge)
Unit: count
StartupSystemTablesThreadsActive Number of threads in the StartupSystemTables thread pool running a task.
Type: float | (gauge)
Unit: count
StorageBufferBytes Number of bytes in buffers of Buffer tables
Type: float | (gauge)
Unit: digital,B
StorageBufferRows Number of rows in buffers of Buffer tables
Type: float | (gauge)
Unit: count
StorageDistributedThreads Number of threads in the StorageDistributed thread pool.
Type: float | (gauge)
Unit: count
StorageDistributedThreadsActive Number of threads in the StorageDistributed thread pool running a task.
Type: float | (gauge)
Unit: count
StorageHiveThreads Number of threads in the StorageHive thread pool.
Type: float | (gauge)
Unit: count
StorageHiveThreadsActive Number of threads in the StorageHive thread pool running a task.
Type: float | (gauge)
Unit: count
StorageS3Threads Number of threads in the StorageS3 thread pool.
Type: float | (gauge)
Unit: count
StorageS3ThreadsActive Number of threads in the StorageS3 thread pool running a task.
Type: float | (gauge)
Unit: count
SyncDrainedConnections Number of connections drained synchronously.
Type: float | (gauge)
Unit: count
SystemReplicasThreads Number of threads in the system.replicas thread pool.
Type: float | (gauge)
Unit: count
SystemReplicasThreadsActive Number of threads in the system.replicas thread pool running a task.
Type: float | (gauge)
Unit: count
TCPConnection Number of connections to the TCP server (clients with native interface), including server-to-server distributed query connections.
Type: float | (gauge)
Unit: count
TablesLoaderThreads Number of threads in the tables loader thread pool.
Type: float | (gauge)
Unit: count
TablesLoaderThreadsActive Number of threads in the tables loader thread pool running a task.
Type: float | (gauge)
Unit: count
TablesToDropQueueSize Number of dropped tables, that are waiting for background data removal.
Type: float | (gauge)
Unit: count
TemporaryFilesForAggregation Number of temporary files created for external aggregation
Type: float | (gauge)
Unit: count
TemporaryFilesForJoin Number of temporary files created for JOIN
Type: float | (gauge)
Unit: count
TemporaryFilesForSort Number of temporary files created for external sorting
Type: float | (gauge)
Unit: count
TemporaryFilesUnknown Number of temporary files created without known purpose
Type: float | (gauge)
Unit: count
ThreadPoolFSReaderThreads Number of threads in the thread pool for local_filesystem_read_method=threadpool.
Type: float | (gauge)
Unit: count
ThreadPoolFSReaderThreadsActive Number of threads in the thread pool for local_filesystem_read_method=threadpool running a task.
Type: float | (gauge)
Unit: count
ThreadPoolRemoteFSReaderThreads Number of threads in the thread pool for remote_filesystem_read_method=threadpool.
Type: float | (gauge)
Unit: count
ThreadPoolRemoteFSReaderThreadsActive Number of threads in the thread pool for remote_filesystem_read_method=threadpool running a task.
Type: float | (gauge)
Unit: count
ThreadsInOvercommitTracker Number of waiting threads inside of OvercommitTracker
Type: float | (gauge)
Unit: count
TotalTemporaryFiles Number of temporary files created
Type: float | (gauge)
Unit: count
VersionInteger Version of the server in a single integer number in base-1000. For example, version 11.22.33 is translated to 11022033.
Type: float | (gauge)
Unit: count
Write Number of write (write/pwrite/io_getevents, etc.) syscalls in flight.
Type: float | (gauge)
Unit: count
ZooKeeperRequest Number of requests to ZooKeeper in flight.
Type: float | (gauge)
Unit: count
ZooKeeperSession Number of sessions (connections) to ZooKeeper. Should be no more than one, because using more than one connection to ZooKeeper may lead to bugs due to lack of linearizability (stale reads) that ZooKeeper consistency model allows.
Type: float | (gauge)
Unit: count
ZooKeeperWatch Number of watches (event subscriptions) in ZooKeeper.
Type: float | (gauge)
Unit: count
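
The `VersionInteger` entry above packs the server version into one base-1000 integer. As a minimal sketch (the helper names `decode_version_integer` and `encode_version_integer` are hypothetical and not part of DataKit or ClickHouse), the value can be split back into its components as follows. The gauge arrives as a float, so the sketch casts it to an integer first.

```python
def decode_version_integer(value):
    """Split a base-1000 packed version into (major, minor, patch).

    Hypothetical helper for illustration; assumes three components,
    each below 1000, as in the 11.22.33 -> 11022033 example.
    """
    packed = int(value)  # the metric is reported as a float gauge
    major, rest = divmod(packed, 1_000_000)
    minor, patch = divmod(rest, 1_000)
    return major, minor, patch


def encode_version_integer(major, minor, patch):
    """Inverse of decode_version_integer (also hypothetical)."""
    return major * 1_000_000 + minor * 1_000 + patch


if __name__ == "__main__":
    print(decode_version_integer(11022033.0))  # (11, 22, 33)
```

A decoded tuple is convenient for comparisons in alert rules, for example checking that every instance in a cluster reports the same version.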

ClickHouseProfileEvents

Tags & Fields Description
host
(tag)
Host name
instance
(tag)
Instance endpoint
AIORead Number of reads with Linux or FreeBSD AIO interface
Type: float | (gauge)
Unit: count
AIOReadBytes Number of bytes read with Linux or FreeBSD AIO interface
Type: float | (gauge)
Unit: digital,B
AIOWrite Number of writes with Linux or FreeBSD AIO interface
Type: float | (gauge)
Unit: count
AIOWriteBytes Number of bytes written with Linux or FreeBSD AIO interface
Type: float | (gauge)
Unit: digital,B
AggregationHashTablesInitializedAsTwoLevel How many hash tables were initialized as two-level for aggregation.
Type: float | (gauge)
Unit: count
AggregationPreallocatedElementsInHashTables How many elements were preallocated in hash tables for aggregation.
Type: float | (gauge)
Unit: count
ArenaAllocBytes Number of bytes allocated for memory Arena (used for GROUP BY and similar operations)
Type: float | (count)
Unit: digital,B
ArenaAllocChunks Number of chunks allocated for memory Arena (used for GROUP BY and similar operations)
Type: float | (count)
Unit: count
AsyncInsertBytes Data size in bytes of asynchronous INSERT queries.
Type: float | (count)
Unit: digital,B
AsyncInsertCacheHits Number of times a duplicate hash id has been found in asynchronous INSERT hash id cache.
Type: float | (count)
Unit: count
AsyncInsertQuery Same as InsertQuery, but only for asynchronous INSERT queries.
Type: float | (count)
Unit: count
AsynchronousReadWaitMicroseconds Time spent in waiting for asynchronous reads.
Type: float | (count)
Unit: time,ms
AsynchronousRemoteReadWaitMicroseconds Time spent in waiting for asynchronous remote reads.
Type: float | (count)
Unit: time,ms
BackgroundLoadingMarksTasks Number of background tasks for loading marks
Type: float | (count)
Unit: count
CachedReadBufferCacheWriteBytes Bytes written from source (remote fs, etc) to filesystem cache
Type: float | (count)
Unit: digital,B
CachedReadBufferCacheWriteMicroseconds Time spent writing data into filesystem cache
Type: float | (count)
Unit: time,ms
CachedReadBufferReadFromCacheBytes Bytes read from filesystem cache
Type: float | (count)
Unit: digital,B
CachedReadBufferReadFromCacheMicroseconds Time reading from filesystem cache
Type: float | (count)
Unit: time,ms
CachedReadBufferReadFromSourceBytes Bytes read from filesystem cache source (from remote fs, etc)
Type: float | (count)
Unit: digital,B
CachedReadBufferReadFromSourceMicroseconds Time reading from filesystem cache source (from remote filesystem, etc)
Type: float | (count)
Unit: time,ms
CachedWriteBufferCacheWriteBytes Bytes written from source (remote fs, etc) to filesystem cache
Type: float | (count)
Unit: digital,B
CachedWriteBufferCacheWriteMicroseconds Time spent writing data into filesystem cache
Type: float | (count)
Unit: time,ms
CannotRemoveEphemeralNode Number of times an error happened while trying to remove an ephemeral node. This is not an issue, because our implementation of the ZooKeeper library guarantees that the session will expire and the node will be removed.
Type: float | (count)
Unit: count
CannotWriteToWriteBufferDiscard Number of stack traces dropped by query profiler or signal handler because pipe is full or cannot write to pipe.
Type: float | (count)
Unit: count
CompileExpressionsBytes Number of bytes used for expressions compilation.
Type: float | (count)
Unit: digital,B
CompileExpressionsMicroseconds Total time spent for compilation of expressions to LLVM code.
Type: float | (count)
Unit: time,ms
CompileFunction Number of times a compilation of generated LLVM code (to create fused function for complex expressions) was initiated.
Type: float | (count)
Unit: count
CompiledFunctionExecute Number of times a compiled function was executed.
Type: float | (count)
Unit: count
CompressedReadBufferBlocks Number of compressed blocks (the blocks of data that are compressed independent of each other) read from compressed sources (files, network).
Type: float | (count)
Unit: count
CompressedReadBufferBytes Number of uncompressed bytes (the number of bytes after decompression) read from compressed sources (files, network).
Type: float | (count)
Unit: digital,B
ContextLock Number of times the lock of Context was acquired or an attempt was made to acquire it. This is a global lock.
Type: float | (count)
Unit: count
CreatedHTTPConnections Total number of created HTTP connections (the counter increases every time a connection is created).
Type: float | (count)
Unit: count
CreatedLogEntryForMerge Successfully created log entry to merge parts in ReplicatedMergeTree.
Type: float | (count)
Unit: count
CreatedLogEntryForMutation Successfully created log entry to mutate parts in ReplicatedMergeTree.
Type: float | (count)
Unit: count
CreatedReadBufferAIO Created read buffer AIO
Type: float | (count)
Unit: count
CreatedReadBufferAIOFailed Created read buffer AIO Failed
Type: float | (count)
Unit: count
CreatedReadBufferDirectIO Number of times a read buffer with O_DIRECT was created for reading data (while choosing among other read methods).
Type: float | (count)
Unit: count
CreatedReadBufferDirectIOFailed Number of times a read buffer with O_DIRECT was attempted to be created for reading data (while choosing among other read methods), but the OS did not allow it (due to lack of filesystem support or other reasons) and we fell back to the ordinary reading method.
Type: float | (count)
Unit: count
CreatedReadBufferMMap Number of times a read buffer using mmap was created for reading data (while choosing among other read methods).
Type: float | (count)
Unit: count
CreatedReadBufferMMapFailed Number of times a read buffer with mmap was attempted to be created for reading data (while choosing among other read methods), but the OS did not allow it (due to lack of filesystem support or other reasons) and we fell back to the ordinary reading method.
Type: float | (count)
Unit: count
CreatedReadBufferOrdinary Number of times ordinary read buffer was created for reading data (while choosing among other read methods).
Type: float | (count)
Unit: count
DNSError Total count of errors in DNS resolution
Type: float | (count)
Unit: count
DataAfterMergeDiffersFromReplica Number of times data after merge is not byte-identical to the data on other replicas. There could be several reasons.
Type: float | (count)
Unit: count
DataAfterMutationDiffersFromReplica Number of times data after mutation is not byte-identical to the data on other replicas. In addition to the reasons described in 'DataAfterMergeDiffersFromReplica', it is also possible due to non-deterministic mutation.
Type: float | (count)
Unit: count
DelayedInserts Number of times the INSERT of a block to a MergeTree table was throttled due to high number of active data parts for partition.
Type: float | (count)
Unit: count
DelayedInsertsMilliseconds Total number of milliseconds spent while the INSERT of a block to a MergeTree table was throttled due to high number of active data parts for partition.
Type: float | (count)
Unit: time,ms
DictCacheKeysExpired Number of keys looked up in the dictionaries of 'cache' types and found in the cache but they were obsolete.
Type: float | (count)
Unit: count
DictCacheKeysHit Number of keys looked up in the dictionaries of 'cache' types and found in the cache.
Type: float | (count)
Unit: count
DictCacheKeysNotFound Number of keys looked up in the dictionaries of 'cache' types and not found.
Type: float | (count)
Unit: count
DictCacheKeysRequested Number of keys requested from the data source for the dictionaries of 'cache' types.
Type: float | (count)
Unit: count
DictCacheKeysRequestedFound Number of keys requested from the data source for dictionaries of 'cache' types and found in the data source.
Type: float | (count)
Unit: count
DictCacheKeysRequestedMiss Number of keys requested from the data source for dictionaries of 'cache' types but not found in the data source.
Type: float | (count)
Unit: count
DictCacheLockReadNs Number of nanoseconds spent waiting for the read lock to look up data for dictionaries of 'cache' types.
Type: float | (count)
Unit: time,ms
DictCacheLockWriteNs Number of nanoseconds spent waiting for the write lock to update data for dictionaries of 'cache' types.
Type: float | (count)
Unit: time,ms
DictCacheRequestTimeNs Number of nanoseconds spent querying the external data sources for dictionaries of 'cache' types.
Type: float | (count)
Unit: time,ms
DictCacheRequests Number of bulk requests to the external data sources for the dictionaries of 'cache' types.
Type: float | (count)
Unit: count
DirectorySync Number of times the F_FULLFSYNC/fsync/fdatasync function was called for directories.
Type: float | (count)
Unit: count
DirectorySyncElapsedMicroseconds Total time spent waiting for F_FULLFSYNC/fsync/fdatasync syscall for directories.
Type: float | (count)
Unit: time,ms
DiskReadElapsedMicroseconds Total time spent waiting for read syscall. This includes reads from page cache.
Type: float | (count)
Unit: time,ms
DiskS3AbortMultipartUpload Number of DiskS3 API AbortMultipartUpload calls.
Type: float | (count)
Unit: count
DiskS3CompleteMultipartUpload Number of DiskS3 API CompleteMultipartUpload calls.
Type: float | (count)
Unit: count
DiskS3CopyObject Number of DiskS3 API CopyObject calls.
Type: float | (count)
Unit: count
DiskS3CreateMultipartUpload Number of DiskS3 API CreateMultipartUpload calls.
Type: float | (count)
Unit: count
DiskS3DeleteObjects Number of DiskS3 API DeleteObject(s) calls.
Type: float | (count)
Unit: count
DiskS3GetObject Number of DiskS3 API GetObject calls.
Type: float | (count)
Unit: count
DiskS3GetObjectAttributes Number of DiskS3 API GetObjectAttributes calls.
Type: float | (count)
Unit: count
DiskS3GetRequestThrottlerCount Number of DiskS3 GET and SELECT requests passed through throttler.
Type: float | (count)
Unit: count
DiskS3GetRequestThrottlerSleepMicroseconds Total time a query was sleeping to conform DiskS3 GET and SELECT request throttling.
Type: float | (count)
Unit: time,ms
DiskS3HeadObject Number of DiskS3 API HeadObject calls.
Type: float | (count)
Unit: count
DiskS3ListObjects Number of DiskS3 API ListObjects calls.
Type: float | (count)
Unit: count
DiskS3PutObject Number of DiskS3 API PutObject calls.
Type: float | (count)
Unit: count
DiskS3PutRequestThrottlerCount Number of DiskS3 PUT, COPY, POST and LIST requests passed through throttler.
Type: float | (count)
Unit: count
DiskS3PutRequestThrottlerSleepMicroseconds Total time a query was sleeping to conform DiskS3 PUT, COPY, POST and LIST request throttling.
Type: float | (count)
Unit: time,ms
DiskS3ReadMicroseconds Time of GET and HEAD requests to DiskS3 storage.
Type: float | (count)
Unit: time,ms
DiskS3ReadRequestsCount Number of GET and HEAD requests to DiskS3 storage.
Type: float | (count)
Unit: count
DiskS3ReadRequestsErrors Number of non-throttling errors in GET and HEAD requests to DiskS3 storage.
Type: float | (count)
Unit: count
DiskS3ReadRequestsRedirects Number of redirects in GET and HEAD requests to DiskS3 storage.
Type: float | (count)
Unit: count
DiskS3ReadRequestsThrottling Number of 429 and 503 errors in GET and HEAD requests to DiskS3 storage.
Type: float | (count)
Unit: count
DiskS3UploadPart Number of DiskS3 API UploadPart calls.
Type: float | (count)
Unit: count
DiskS3UploadPartCopy Number of DiskS3 API UploadPartCopy calls.
Type: float | (count)
Unit: count
DiskS3WriteMicroseconds Time of POST, DELETE, PUT and PATCH requests to DiskS3 storage.
Type: float | (count)
Unit: time,ms
DiskS3WriteRequestsCount Number of POST, DELETE, PUT and PATCH requests to DiskS3 storage.
Type: float | (count)
Unit: count
DiskS3WriteRequestsErrors Number of non-throttling errors in POST, DELETE, PUT and PATCH requests to DiskS3 storage.
Type: float | (count)
Unit: count
DiskS3WriteRequestsRedirects Number of redirects in POST, DELETE, PUT and PATCH requests to DiskS3 storage.
Type: float | (count)
Unit: count
DiskS3WriteRequestsThrottling Number of 429 and 503 errors in POST, DELETE, PUT and PATCH requests to DiskS3 storage.
Type: float | (count)
Unit: count
DiskWriteElapsedMicroseconds Total time spent waiting for write syscall. This includes writes to page cache.
Type: float | (count)
Unit: time,ms
DistributedConnectionFailAtAll Total count when distributed connection fails after all retries finished.
Type: float | (count)
Unit: count
DistributedConnectionFailTry Total count when distributed connection fails with retry.
Type: float | (count)
Unit: count
DistributedConnectionMissingTable Number of times we rejected a replica from a distributed query, because it did not contain a table needed for the query.
Type: float | (count)
Unit: count
DistributedConnectionStaleReplica Number of times we rejected a replica from a distributed query, because some table needed for a query had replication lag higher than the configured threshold.
Type: float | (count)
Unit: count
DistributedDelayedInserts Number of times the INSERT of a block to a Distributed table was throttled due to high number of pending bytes.
Type: float | (count)
Unit: count
DistributedDelayedInsertsMilliseconds Total number of milliseconds spent while the INSERT of a block to a Distributed table was throttled due to high number of pending bytes.
Type: float | (count)
Unit: time,ms
DistributedRejectedInserts Number of times the INSERT of a block to a Distributed table was rejected with 'Too many bytes' exception due to high number of pending bytes.
Type: float | (count)
Unit: count
DistributedSyncInsertionTimeoutExceeded Number of times a timeout was exceeded while waiting for shards during synchronous insertion into a Distributed table (with 'insert_distributed_sync' = 1).
Type: float | (count)
Unit: count
DuplicatedInsertedBlocks Number of times the INSERTed block to a ReplicatedMergeTree table was deduplicated.
Type: float | (count)
Unit: count
ExecuteShellCommand Number of shell command executions.
Type: float | (count)
Unit: count
ExternalAggregationCompressedBytes Number of bytes written to disk for aggregation in external memory.
Type: float | (count)
Unit: digital,B
ExternalAggregationMerge Number of times temporary files were merged for aggregation in external memory.
Type: float | (count)
Unit: count
ExternalAggregationUncompressedBytes Amount of data (uncompressed, before compression) written to disk for aggregation in external memory.
Type: float | (count)
Unit: digital,B
ExternalAggregationWritePart Number of times a temporary file was written to disk for aggregation in external memory.
Type: float | (count)
Unit: count
ExternalDataSourceLocalCacheReadBytes Bytes read from local cache buffer in RemoteReadBufferCache
Type: float | (count)
Unit: digital,B
ExternalJoinCompressedBytes Number of compressed bytes written for JOIN in external memory.
Type: float | (count)
Unit: digital,B
ExternalJoinMerge Number of times temporary files were merged for JOIN in external memory.
Type: float | (count)
Unit: count
ExternalJoinUncompressedBytes Amount of data (uncompressed, before compression) written for JOIN in external memory.
Type: float | (count)
Unit: digital,B
ExternalJoinWritePart Number of times a temporary file was written to disk for JOIN in external memory.
Type: float | (count)
Unit: count
ExternalProcessingCompressedBytesTotal Number of compressed bytes written by external processing (sorting/aggregating/joining)
Type: float | (count)
Unit: digital,B
ExternalProcessingFilesTotal Number of files used by external processing (sorting/aggregating/joining)
Type: float | (count)
Unit: count
ExternalProcessingUncompressedBytesTotal Amount of data (uncompressed, before compression) written by external processing (sorting/aggregating/joining)
Type: float | (count)
Unit: digital,B
ExternalSortCompressedBytes Number of compressed bytes written for sorting in external memory.
Type: float | (count)
Unit: digital,B
ExternalSortMerge Number of times temporary files were merged for sorting in external memory.
Type: float | (count)
Unit: count
ExternalSortUncompressedBytes Amount of data (uncompressed, before compression) written for sorting in external memory.
Type: float | (count)
Unit: digital,B
ExternalSortWritePart Number of times a temporary file was written to disk for sorting in external memory.
Type: float | (count)
Unit: count
FailedAsyncInsertQuery Number of failed ASYNC INSERT queries.
Type: float | (count)
Unit: count
FailedInsertQuery Same as FailedQuery, but only for INSERT queries.
Type: float | (count)
Unit: count
FailedQuery Number of failed queries.
Type: float | (count)
Unit: count
FailedSelectQuery Same as FailedQuery, but only for SELECT queries.
Type: float | (count)
Unit: count
FileOpen Number of files opened.
Type: float | (count)
Unit: count
FileSegmentCacheWriteMicroseconds Metric per file segment. Time spent writing data to cache
Type: float | (count)
Unit: time,ms
FileSegmentPredownloadMicroseconds Metric per file segment. Time spent predownloading data to cache (predownloading - finishing file segment download (after someone who failed to do that) up to the point current thread was requested to do)
Type: float | (count)
Unit: time,ms
FileSegmentReadMicroseconds Metric per file segment. Time spent reading from file
Type: float | (count)
Unit: time,ms
FileSegmentUsedBytes Metric per file segment. How many bytes were actually used from current file segment
Type: float | (count)
Unit: digital,B
FileSegmentWaitReadBufferMicroseconds Metric per file segment. Time spent waiting for internal read buffer (includes cache waiting)
Type: float | (count)
Unit: time,ms
FileSegmentWriteMicroseconds Metric per file segment. Time spent writing cache
Type: float | (count)
Unit: time,ms
FileSync Number of times the F_FULLFSYNC/fsync/fdatasync function was called for files.
Type: float | (count)
Unit: count
FileSyncElapsedMicroseconds Total time spent waiting for F_FULLFSYNC/fsync/fdatasync syscall for files.
Type: float | (count)
Unit: time,ms
FunctionExecute Number of SQL ordinary function calls (SQL functions are called on per-block basis, so this number represents the number of blocks).
Type: float | (count)
Unit: count
HardPageFaults The number of hard page faults in query execution threads. High values indicate either that you forgot to turn off swap on your server, or eviction of memory pages of the ClickHouse binary during very high memory pressure, or successful usage of the mmap read method for the tables data.
Type: float | (count)
Unit: count
HedgedRequestsChangeReplica Total count when timeout for changing replica expired in hedged requests.
Type: float | (count)
Unit: count
IOBufferAllocBytes Number of bytes allocated for IO buffers (for ReadBuffer/WriteBuffer).
Type: float | (count)
Unit: digital,B
IOBufferAllocs Number of allocations of IO buffers (for ReadBuffer/WriteBuffer).
Type: float | (count)
Unit: count
IOUringCQEsCompleted Total number of successfully completed io_uring CQEs
Type: float | (count)
Unit: count
IOUringCQEsFailed Total number of completed io_uring CQEs with failures
Type: float | (count)
Unit: count
IOUringSQEsResubmits Total number of io_uring SQE resubmits performed
Type: float | (count)
Unit: count
IOUringSQEsSubmitted Total number of io_uring SQEs submitted
Type: float | (count)
Unit: count
InsertQuery Same as Query, but only for INSERT queries.
Type: float | (count)
Unit: count
InsertQueryTimeMicroseconds Total time of INSERT queries.
Type: float | (count)
Unit: time,ms
InsertedBytes Number of bytes (uncompressed; for columns as they are stored in memory) INSERTed to all tables.
Type: float | (count)
Unit: digital,B
InsertedCompactParts Number of parts inserted in Compact format.
Type: float | (count)
Unit: count
InsertedInMemoryParts Number of parts inserted in InMemory format.
Type: float | (count)
Unit: count
InsertedRows Number of rows INSERTed to all tables.
Type: float | (count)
Unit: count
InsertedWideParts Number of parts inserted in Wide format.
Type: float | (count)
Unit: count
InvoluntaryContextSwitches Involuntary context switches
Type: float | (count)
Unit: count
KafkaBackgroundReads Number of background reads populating materialized views from Kafka since server start
Type: float | (count)
Unit: count
KafkaCommitFailures Number of failed commits of consumed offsets to Kafka (usually is a sign of some data duplication)
Type: float | (count)
Unit: count
KafkaCommits Number of successful commits of consumed offsets to Kafka (normally should be the same as KafkaBackgroundReads)
Type: float | (count)
Unit: count
KafkaConsumerErrors Number of errors reported by librdkafka during polls
Type: float | (count)
Unit: count
KafkaDirectReads Number of direct selects from Kafka tables since server start
Type: float | (count)
Unit: count
KafkaMessagesFailed Number of Kafka messages ClickHouse failed to parse
Type: float | (count)
Unit: count
KafkaMessagesPolled Number of Kafka messages polled from librdkafka to ClickHouse
Type: float | (count)
Unit: count
KafkaMessagesProduced Number of messages produced to Kafka
Type: float | (count)
Unit: count
KafkaMessagesRead Number of Kafka messages already processed by ClickHouse
Type: float | (count)
Unit: count
KafkaProducerErrors Number of errors during producing the messages to Kafka
Type: float | (count)
Unit: count
KafkaProducerFlushes Number of explicit flushes to Kafka producer
Type: float | (count)
Unit: count
KafkaRebalanceAssignments Number of partition assignments (the final stage of consumer group rebalance)
Type: float | (count)
Unit: count
KafkaRebalanceErrors Number of failed consumer group rebalances
Type: float | (count)
Unit: count
KafkaRebalanceRevocations Number of partition revocations (the first stage of consumer group rebalance)
Type: float | (count)
Unit: count
KafkaRowsRead Number of rows parsed from Kafka messages
Type: float | (count)
Unit: count
KafkaRowsRejected Number of parsed rows which were later rejected (due to rebalances / errors or similar reasons). Those rows will be consumed again after the rebalance.
Type: float | (count)
Unit: count
KafkaRowsWritten Number of rows inserted into Kafka tables
Type: float | (count)
Unit: count
KafkaWrites Number of writes (inserts) to Kafka tables
Type: float | (count)
Unit: count
KeeperCheckRequest Number of check requests
Type: float | (count)
Unit: count
KeeperCommits Number of successful commits
Type: float | (count)
Unit: count
KeeperCommitsFailed Number of failed commits
Type: float | (count)
Unit: count
KeeperCreateRequest Number of create requests
Type: float | (count)
Unit: count
KeeperExistsRequest Number of exists requests
Type: float | (count)
Unit: count
KeeperGetRequest Number of get requests
Type: float | (count)
Unit: count
KeeperLatency Keeper latency
Type: float | (count)
Unit: count
KeeperListRequest Number of list requests
Type: float | (count)
Unit: count
KeeperMultiReadRequest Number of multi read requests
Type: float | (count)
Unit: count
KeeperMultiRequest Number of multi requests
Type: float | (count)
Unit: count
KeeperPacketsReceived Packets received by keeper server
Type: float | (count)
Unit: count
KeeperPacketsSent Packets sent by keeper server
Type: float | (count)
Unit: count
KeeperReadSnapshot Number of snapshot reads (serialization)
Type: float | (count)
Unit: count
KeeperRemoveRequest Number of remove requests
Type: float | (count)
Unit: count
KeeperRequestTotal Total requests number on keeper server
Type: float | (count)
Unit: count
KeeperSaveSnapshot Number of snapshot saves
Type: float | (count)
Unit: count
KeeperSetRequest Number of set requests
Type: float | (count)
Unit: count
KeeperSnapshotApplys Number of snapshot applications
Type: float | (count)
Unit: count
KeeperSnapshotApplysFailed Number of failed snapshot applications
Type: float | (count)
Unit: count
KeeperSnapshotCreations Number of snapshot creations
Type: float | (count)
Unit: count
KeeperSnapshotCreationsFailed Number of failed snapshot creations
Type: float | (count)
Unit: count
LoadedMarksCount Number of marks loaded (total across columns).
Type: float | (count)
Unit: count
LoadedMarksMemoryBytes Size of in-memory representations of loaded marks.
Type: float | (count)
Unit: digital,B
LocalReadThrottlerBytes Bytes passed through 'max_local_read_bandwidth_for_server'/'max_local_read_bandwidth' throttler.
Type: float | (count)
Unit: digital,B
LocalReadThrottlerSleepMicroseconds Total time a query was sleeping to conform 'max_local_read_bandwidth_for_server'/'max_local_read_bandwidth' throttling.
Type: float | (count)
Unit: time,ms
LocalWriteThrottlerBytes Bytes passed through 'max_local_write_bandwidth_for_server'/'max_local_write_bandwidth' throttler.
Type: float | (count)
Unit: digital,B
LocalWriteThrottlerSleepMicroseconds Total time a query was sleeping to conform 'max_local_write_bandwidth_for_server'/'max_local_write_bandwidth' throttling.
Type: float | (count)
Unit: time,ms
LogDebug Number of log messages with level Debug
Type: float | (count)
Unit: count
LogError Number of log messages with level Error
Type: float | (count)
Unit: count
LogFatal Number of log messages with level Fatal
Type: float | (count)
Unit: count
LogInfo Number of log messages with level Info
Type: float | (count)
Unit: count
LogTest Number of log messages with level Test
Type: float | (count)
Unit: count
LogTrace Number of log messages with level Trace
Type: float | (count)
Unit: count
LogWarning Number of log messages with level Warning
Type: float | (count)
Unit: count
MMappedFileCacheHits Number of times a file has been found in the MMap cache (for the mmap read_method), so we didn't have to mmap it again.
Type: float | (count)
Unit: count
MMappedFileCacheMisses Number of times a file has not been found in the MMap cache (for the mmap read_method), so we had to mmap it again.
Type: float | (count)
Unit: count
MainConfigLoads Number of times the main configuration was reloaded.
Type: float | (count)
Unit: count
MarkCacheHits Number of times an entry has been found in the mark cache, so we didn't have to load a mark file.
Type: float | (count)
Unit: count
MarkCacheMisses Number of times an entry has not been found in the mark cache, so we had to load a mark file in memory, which is a costly operation, adding to query latency.
Type: float | (count)
Unit: count
MemoryAllocatorPurge Total number of times memory allocator purge was requested
Type: float | (count)
Unit: count
MemoryAllocatorPurgeTimeMicroseconds Total time spent on memory allocator purge
Type: float | (count)
Unit: time,ms
MemoryOvercommitWaitTimeMicroseconds Total time spent in waiting for memory to be freed in OvercommitTracker.
Type: float | (count)
Unit: time,ms
Merge Number of launched background merges.
Type: float | (count)
Unit: count
MergeTreeAllRangesAnnouncementsSent The number of announcements sent from the remote server to the initiator server about the set of data parts (for MergeTree tables). Measured on the remote server side.
Type: float | (count)
Unit: count
MergeTreeAllRangesAnnouncementsSentElapsedMicroseconds Time spent in sending the announcement from the remote server to the initiator server about the set of data parts (for MergeTree tables). Measured on the remote server side.
Type: float | (count)
Unit: time,ms
MergeTreeDataProjectionWriterBlocks Number of blocks INSERTed to MergeTree tables projection. Each block forms a data part of level zero.
Type: float | (count)
Unit: count
MergeTreeDataProjectionWriterBlocksAlreadySorted Number of blocks INSERTed to MergeTree tables projection that appeared to be already sorted.
Type: float | (count)
Unit: count
MergeTreeDataProjectionWriterCompressedBytes Bytes written to filesystem for data INSERTed to MergeTree tables projection.
Type: float | (count)
Unit: digital,B
MergeTreeDataProjectionWriterRows Number of rows INSERTed to MergeTree tables projection.
Type: float | (count)
Unit: count
MergeTreeDataProjectionWriterUncompressedBytes Uncompressed bytes (for columns as they are stored in memory) INSERTed to MergeTree tables projection.
Type: float | (count)
Unit: digital,B
MergeTreeDataWriterBlocks Number of blocks INSERTed to MergeTree tables. Each block forms a data part of level zero.
Type: float | (count)
Unit: count
MergeTreeDataWriterBlocksAlreadySorted Number of blocks INSERTed to MergeTree tables that appeared to be already sorted.
Type: float | (count)
Unit: count
MergeTreeDataWriterCompressedBytes Bytes written to filesystem for data INSERTed to MergeTree tables.
Type: float | (count)
Unit: digital,B
MergeTreeDataWriterRows Number of rows INSERTed to MergeTree tables.
Type: float | (count)
Unit: count
MergeTreeDataWriterUncompressedBytes Uncompressed bytes (for columns as they are stored in memory) INSERTed to MergeTree tables.
Type: float | (count)
Unit: digital,B
MergeTreeMetadataCacheDelete Number of RocksDB deletes (used for the MergeTree metadata cache)
Type: float | (count)
Unit: count
MergeTreeMetadataCacheGet Number of rocksdb reads(used for merge tree metadata cache)
Type: float | (count)
Unit: count
MergeTreeMetadataCacheHit Number of times the read of meta file was done from MergeTree metadata cache
Type: float | (count)
Unit: count
MergeTreeMetadataCacheMiss Number of times the read of meta file was not done from MergeTree metadata cache
Type: float | (count)
Unit: count
MergeTreeMetadataCachePut Number of rocksdb puts(used for merge tree metadata cache)
Type: float | (count)
Unit: count
MergeTreeMetadataCacheSeek Number of rocksdb seeks(used for merge tree metadata cache)
Type: float | (count)
Unit: count
MergeTreePrefetchedReadPoolInit Time spent preparing tasks in MergeTreePrefetchedReadPool
Type: float | (count)
Unit: time,ms
MergeTreeReadTaskRequestsReceived The number of callbacks requested from the remote server back to the initiator server to choose the read task (for MergeTree tables). Measured on the initiator server side.
Type: float | (count)
Unit: count
MergeTreeReadTaskRequestsSent The number of callbacks requested from the remote server back to the initiator server to choose the read task (for MergeTree tables). Measured on the remote server side.
Type: float | (count)
Unit: count
MergeTreeReadTaskRequestsSentElapsedMicroseconds Time spent in callbacks requested from the remote server back to the initiator server to choose the read task (for MergeTree tables). Measured on the remote server side.
Type: float | (count)
Unit: count
MergedIntoCompactParts Number of parts merged into Compact format.
Type: float | (count)
Unit: count
MergedIntoInMemoryParts Number of parts merged into InMemory format.
Type: float | (count)
Unit: count
MergedIntoWideParts Number of parts merged into Wide format.
Type: float | (count)
Unit: count
MergedRows Rows read for background merges. This is the number of rows before merge.
Type: float | (count)
Unit: count
MergedUncompressedBytes Uncompressed bytes (for columns as they are stored in memory) that were read for background merges. This is the number before merge.
Type: float | (count)
Unit: digital,B
MergesTimeMilliseconds Total time spent for background merges.
Type: float | (count)
Unit: time,ms
NetworkReceiveBytes Total number of bytes received from network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.
Type: float | (count)
Unit: digital,B
NetworkReceiveElapsedMicroseconds Total time spent waiting for data to receive or receiving data from network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.
Type: float | (count)
Unit: time,ms
NetworkSendBytes Total number of bytes sent to network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.
Type: float | (count)
Unit: digital,B
NetworkSendElapsedMicroseconds Total time spent waiting for data to send to network or sending data to network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.
Type: float | (count)
Unit: time,ms
NotCreatedLogEntryForMerge Log entry to merge parts in ReplicatedMergeTree is not created due to concurrent log update by another replica.
Type: float | (count)
Unit: count
NotCreatedLogEntryForMutation Log entry to mutate parts in ReplicatedMergeTree is not created due to concurrent log update by another replica.
Type: float | (count)
Unit: count
OSCPUVirtualTimeMicroseconds CPU time spent seen by OS. Does not include involuntary waits due to virtualization.
Type: float | (count)
Unit: time,ms
OSCPUWaitMicroseconds Total time a thread was ready for execution but waiting to be scheduled by OS, from the OS point of view.
Type: float | (count)
Unit: time,ms
OSIOWaitMicroseconds Total time a thread spent waiting for a result of IO operation, from the OS point of view. This is real IO that does not include page cache.
Type: float | (count)
Unit: time,ms
OSReadBytes Number of bytes read from disks or block devices. Does not include bytes read from page cache. May include excessive data due to block size, readahead, etc.
Type: float | (count)
Unit: digital,B
OSReadChars Number of bytes read from filesystem, including page cache.
Type: float | (count)
Unit: digital,B
OSWriteBytes Number of bytes written to disks or block devices. Does not include bytes that are in page cache dirty pages. May not include data that was written by OS asynchronously.
Type: float | (count)
Unit: digital,B
OSWriteChars Number of bytes written to filesystem, including page cache.
Type: float | (count)
Unit: digital,B
ObsoleteReplicatedParts Number of times a data part was covered by another data part that has been fetched from a replica (so, we have marked a covered data part as obsolete and no longer needed).
Type: float | (count)
Unit: count
OpenedFileCacheHits Number of times a file has been found in the opened file cache, so we didn't have to open it again.
Type: float | (count)
Unit: count
OpenedFileCacheMisses Number of times a file has not been found in the opened file cache, so we had to open it again.
Type: float | (count)
Unit: count
OtherQueryTimeMicroseconds Total time of queries that are not SELECT or INSERT.
Type: float | (count)
Unit: time,ms
OverflowAny Number of times approximate GROUP BY was in effect: when aggregation was performed only on top of first 'max_rows_to_group_by' unique keys and other keys were ignored due to 'group_by_overflow_mode' = 'any'.
Type: float | (count)
Unit: count
OverflowBreak Number of times data processing was canceled by a query complexity limitation with setting '*_overflow_mode' = 'break', leaving the result incomplete.
Type: float | (count)
Unit: count
OverflowThrow Number of times data processing was canceled by a query complexity limitation with setting '*_overflow_mode' = 'throw' and an exception was thrown.
Type: float | (count)
Unit: count
PerfAlignmentFaults Number of alignment faults. These happen when unaligned memory accesses happen; the kernel can handle these but it reduces performance. This happens only on some architectures (never on x86).
Type: float | (count)
Unit: count
PerfBranchInstructions Retired branch instructions. Prior to Linux 2.6.35, this used the wrong event on AMD processors.
Type: float | (count)
Unit: count
PerfBranchMisses Mispredicted branch instructions.
Type: float | (count)
Unit: count
PerfBusCycles Bus cycles, which can be different from total cycles.
Type: float | (count)
Unit: count
PerfCacheMisses Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates.
Type: float | (count)
Unit: count
PerfCacheReferences Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include prefetches and coherency messages; again this depends on the design of your CPU.
Type: float | (count)
Unit: count
PerfContextSwitches Number of context switches
Type: float | (count)
Unit: count
PerfCpuClock The CPU clock, a high-resolution per-CPU timer
Type: float | (count)
Unit: count
PerfCpuCycles Total cycles. Be wary of what happens during CPU frequency scaling.
Type: float | (count)
Unit: count
PerfCpuMigrations Number of times the process has migrated to a new CPU
Type: float | (count)
Unit: count
PerfDataTLBMisses Data TLB misses
Type: float | (count)
Unit: count
PerfDataTLBReferences Data TLB references
Type: float | (count)
Unit: count
PerfEmulationFaults Number of emulation faults. The kernel sometimes traps on unimplemented instructions and emulates them for user space. This can negatively impact performance.
Type: float | (count)
Unit: count
PerfInstructionTLBMisses Instruction TLB misses
Type: float | (count)
Unit: count
PerfInstructionTLBReferences Instruction TLB references
Type: float | (count)
Unit: count
PerfInstructions Retired instructions. Be careful, these can be affected by various issues, most notably hardware interrupt counts.
Type: float | (count)
Unit: count
PerfLocalMemoryMisses Local NUMA node memory read misses
Type: float | (count)
Unit: digital,B
PerfLocalMemoryReferences Local NUMA node memory reads
Type: float | (count)
Unit: digital,B
PerfMinEnabledRunningTime Running time for event with minimum enabled time. Used to track the amount of event multiplexing
Type: float | (count)
Unit: time,ms
PerfMinEnabledTime For all events, minimum time that an event was enabled. Used to track event multiplexing influence
Type: float | (count)
Unit: time,ms
PerfRefCpuCycles Total cycles; not affected by CPU frequency scaling.
Type: float | (count)
Unit: count
PerfStalledCyclesBackend Stalled cycles during retirement.
Type: float | (count)
Unit: count
PerfStalledCyclesFrontend Stalled cycles during issue.
Type: float | (count)
Unit: count
PerfTaskClock A clock count specific to the task that is running
Type: float | (count)
Unit: count
PolygonsAddedToPool A polygon has been added to the cache (pool) for the 'pointInPolygon' function.
Type: float | (count)
Unit: count
PolygonsInPoolAllocatedBytes The number of bytes for polygons added to the cache (pool) for the 'pointInPolygon' function.
Type: float | (count)
Unit: digital,B
Query Number of queries to be interpreted and potentially executed. Does not include queries that failed to parse or were rejected due to AST size limits, quota limits or limits on the number of simultaneously running queries. May include internal queries initiated by ClickHouse itself. Does not count subqueries.
Type: float | (count)
Unit: count
QueryCacheHits Number of times a query result has been found in the query cache (and query computation was avoided).
Type: float | (count)
Unit: count
QueryCacheMisses Number of times a query result has not been found in the query cache (and required query computation).
Type: float | (count)
Unit: count
QueryMaskingRulesMatch Number of times query masking rules were successfully matched.
Type: float | (count)
Unit: count
QueryMemoryLimitExceeded Number of times when memory limit exceeded for query.
Type: float | (count)
Unit: count
QueryProfilerRuns Number of times QueryProfiler had been run.
Type: float | (count)
Unit: count
QueryProfilerSignalOverruns Number of times we drop processing of a query profiler signal due to overrun plus the number of signals that OS has not delivered due to overrun.
Type: float | (count)
Unit: count
QueryTimeMicroseconds Total time of all queries.
Type: float | (count)
Unit: time,ms
RWLockAcquiredReadLocks Number of times a read lock was acquired (in a heavy RWLock).
Type: float | (count)
Unit: count
RWLockAcquiredWriteLocks Number of times a write lock was acquired (in a heavy RWLock).
Type: float | (count)
Unit: count
RWLockReadersWaitMilliseconds Total time spent waiting for a read lock to be acquired (in a heavy RWLock).
Type: float | (count)
Unit: time,ms
RWLockWritersWaitMilliseconds Total time spent waiting for a write lock to be acquired (in a heavy RWLock).
Type: float | (count)
Unit: time,ms
ReadBackoff Number of times the number of query processing threads was lowered due to slow reads.
Type: float | (count)
Unit: count
ReadBufferAIORead Read buffer AIO read
Type: float | (count)
Unit: count
ReadBufferAIOReadBytes Read buffer AIO read bytes
Type: float | (count)
Unit: digital,B
ReadBufferFromFileDescriptorRead Number of reads (read/pread) from a file descriptor. Does not include sockets.
Type: float | (count)
Unit: count
ReadBufferFromFileDescriptorReadBytes Number of bytes read from file descriptors. If the file is compressed, this will show the compressed data size.
Type: float | (count)
Unit: digital,B
ReadBufferFromFileDescriptorReadFailed Number of times the read (read/pread) from a file descriptor has failed.
Type: float | (count)
Unit: count
ReadBufferFromS3Bytes Bytes read from S3.
Type: float | (count)
Unit: digital,B
ReadBufferFromS3InitMicroseconds Time spent initializing connection to S3.
Type: float | (count)
Unit: time,ms
ReadBufferFromS3Microseconds Time spent on reading from S3.
Type: float | (count)
Unit: time,ms
ReadBufferFromS3RequestsErrors Number of exceptions while reading from S3.
Type: float | (count)
Unit: count
ReadBufferSeekCancelConnection Number of seeks which lead to new connection (s3, http)
Type: float | (count)
Unit: count
ReadCompressedBytes Number of bytes (the number of bytes before decompression) read from compressed sources (files, network).
Type: float | (count)
Unit: digital,B
ReadTaskRequestsReceived The number of callbacks requested from the remote server back to the initiator server to choose the read task (for s3Cluster table function and similar). Measured on the initiator server side.
Type: float | (count)
Unit: count
ReadTaskRequestsSent The number of callbacks requested from the remote server back to the initiator server to choose the read task (for s3Cluster table function and similar). Measured on the remote server side.
Type: float | (count)
Unit: count
ReadTaskRequestsSentElapsedMicroseconds Time spent in callbacks requested from the remote server back to the initiator server to choose the read task (for s3Cluster table function and similar). Measured on the remote server side.
Type: float | (count)
Unit: time,ms
RealTimeMicroseconds Total (wall clock) time spent in processing (queries and other tasks) threads (note that this is a sum).
Type: float | (count)
Unit: time,ms
RegexpCreated Compiled regular expressions. Identical regular expressions compiled just once and cached forever.
Type: float | (count)
Unit: count
RejectedInserts Number of times the INSERT of a block to a MergeTree table was rejected with 'Too many parts' exception due to high number of active data parts for partition.
Type: float | (count)
Unit: count
RemoteFSBuffers Number of buffers created for asynchronous reading from remote filesystem
Type: float | (count)
Unit: count
RemoteFSCancelledPrefetches Number of canceled prefetches (because of seek)
Type: float | (count)
Unit: count
RemoteFSLazySeeks Number of lazy seeks
Type: float | (count)
Unit: count
RemoteFSPrefetchedBytes Number of bytes from prefetched buffer
Type: float | (count)
Unit: count
RemoteFSPrefetchedReads Number of reads from prefetched buffer
Type: float | (count)
Unit: count
RemoteFSPrefetches Number of prefetches made with asynchronous reading from remote filesystem
Type: float | (count)
Unit: count
RemoteFSSeeks Total number of seeks for async buffer
Type: float | (count)
Unit: count
RemoteFSSeeksWithReset Number of seeks which lead to a new connection
Type: float | (count)
Unit: count
RemoteFSUnprefetchedBytes Number of bytes from unprefetched buffer
Type: float | (count)
Unit: digital,B
RemoteFSUnprefetchedReads Number of reads from unprefetched buffer
Type: float | (count)
Unit: count
RemoteFSUnusedPrefetches Number of prefetches pending at buffer destruction
Type: float | (count)
Unit: count
RemoteReadThrottlerBytes Bytes passed through 'max_remote_read_network_bandwidth_for_server'/'max_remote_read_network_bandwidth' throttler.
Type: float | (count)
Unit: digital,B
RemoteReadThrottlerSleepMicroseconds Total time a query was sleeping to conform to 'max_remote_read_network_bandwidth_for_server'/'max_remote_read_network_bandwidth' throttling.
Type: float | (count)
Unit: time,ms
RemoteWriteThrottlerBytes Bytes passed through 'max_remote_write_network_bandwidth_for_server'/'max_remote_write_network_bandwidth' throttler.
Type: float | (count)
Unit: digital,B
RemoteWriteThrottlerSleepMicroseconds Total time a query was sleeping to conform to 'max_remote_write_network_bandwidth_for_server'/'max_remote_write_network_bandwidth' throttling.
Type: float | (count)
Unit: time,ms
ReplicaPartialShutdown How many times Replicated table has to deinitialize its state due to session expiration in ZooKeeper. The state is reinitialized every time when ZooKeeper is available again.
Type: float | (count)
Unit: count
ReplicatedDataLoss Number of times a data part that we wanted does not exist on any replica (even on replicas that are offline right now). Those data parts are definitely lost. This is normal with asynchronous replication (if quorum inserts were not enabled): the replica on which the data part was written failed, and when it came back online it no longer contained that data part.
Type: float | (count)
Unit: count
ReplicatedPartChecks Number of times we had to perform advanced search for a data part on replicas or to clarify the need of an existing data part.
Type: float | (count)
Unit: count
ReplicatedPartChecksFailed Number of times the advanced search for a data part on replicas did not give result or when unexpected part has been found and moved away.
Type: float | (count)
Unit: count
ReplicatedPartFailedFetches Number of times a data part failed to download from a replica of a ReplicatedMergeTree table.
Type: float | (count)
Unit: count
ReplicatedPartFetches Number of times a data part was downloaded from replica of a ReplicatedMergeTree table.
Type: float | (count)
Unit: count
ReplicatedPartFetchesOfMerged Number of times we preferred to download an already merged part from a replica of a ReplicatedMergeTree table instead of performing the merge ourselves (usually we prefer doing the merge ourselves to save network traffic). This happens when we do not have all source parts to perform a merge or when the data part is old enough.
Type: float | (count)
Unit: count
ReplicatedPartMerges Number of times data parts of ReplicatedMergeTree tables were successfully merged.
Type: float | (count)
Unit: count
ReplicatedPartMutations Number of times data parts of ReplicatedMergeTree tables were successfully mutated.
Type: float | (count)
Unit: count
S3AbortMultipartUpload Number of S3 API AbortMultipartUpload calls.
Type: float | (count)
Unit: count
S3CompleteMultipartUpload Number of S3 API CompleteMultipartUpload calls.
Type: float | (count)
Unit: count
S3CopyObject Number of S3 API CopyObject calls.
Type: float | (count)
Unit: count
S3CreateMultipartUpload Number of S3 API CreateMultipartUpload calls.
Type: float | (count)
Unit: count
S3DeleteObjects Number of S3 API DeleteObject(s) calls.
Type: float | (count)
Unit: count
S3GetObject Number of S3 API GetObject calls.
Type: float | (count)
Unit: count
S3GetObjectAttributes Number of S3 API GetObjectAttributes calls.
Type: float | (count)
Unit: count
S3GetRequestThrottlerCount Number of S3 GET and SELECT requests passed through throttler.
Type: float | (count)
Unit: count
S3GetRequestThrottlerSleepMicroseconds Total time a query was sleeping to conform to S3 GET and SELECT request throttling.
Type: float | (count)
Unit: time,ms
S3HeadObject Number of S3 API HeadObject calls.
Type: float | (count)
Unit: count
S3ListObjects Number of S3 API ListObjects calls.
Type: float | (count)
Unit: count
S3PutObject Number of S3 API PutObject calls.
Type: float | (count)
Unit: count
S3PutRequestThrottlerCount Number of S3 PUT, COPY, POST and LIST requests passed through throttler.
Type: float | (count)
Unit: count
S3PutRequestThrottlerSleepMicroseconds Total time a query was sleeping to conform to S3 PUT, COPY, POST and LIST request throttling.
Type: float | (count)
Unit: time,ms
S3ReadBytes Read bytes (incoming) in GET and HEAD requests to S3 storage.
Type: float | (count)
Unit: digital,B
S3ReadMicroseconds Time of GET and HEAD requests to S3 storage.
Type: float | (count)
Unit: time,ms
S3ReadRequestsCount Number of GET and HEAD requests to S3 storage.
Type: float | (count)
Unit: count
S3ReadRequestsErrors Number of non-throttling errors in GET and HEAD requests to S3 storage.
Type: float | (count)
Unit: count
S3ReadRequestsRedirects Number of redirects in GET and HEAD requests to S3 storage.
Type: float | (count)
Unit: count
S3ReadRequestsThrottling Number of 429 and 503 errors in GET and HEAD requests to S3 storage.
Type: float | (count)
Unit: count
S3UploadPart Number of S3 API UploadPart calls.
Type: float | (count)
Unit: count
S3UploadPartCopy Number of S3 API UploadPartCopy calls.
Type: float | (count)
Unit: count
S3WriteBytes Write bytes (outgoing) in POST, DELETE, PUT and PATCH requests to S3 storage.
Type: float | (count)
Unit: digital,B
S3WriteMicroseconds Time of POST, DELETE, PUT and PATCH requests to S3 storage.
Type: float | (count)
Unit: time,ms
S3WriteRequestsCount Number of POST, DELETE, PUT and PATCH requests to S3 storage.
Type: float | (count)
Unit: count
S3WriteRequestsErrors Number of non-throttling errors in POST, DELETE, PUT and PATCH requests to S3 storage.
Type: float | (count)
Unit: count
S3WriteRequestsRedirects Number of redirects in POST, DELETE, PUT and PATCH requests to S3 storage.
Type: float | (count)
Unit: count
S3WriteRequestsThrottling Number of 429 and 503 errors in POST, DELETE, PUT and PATCH requests to S3 storage.
Type: float | (count)
Unit: count
ScalarSubqueriesCacheMiss Number of times a read from a scalar subquery was not cached and had to be calculated completely
Type: float | (count)
Unit: count
ScalarSubqueriesGlobalCacheHit Number of times a read from a scalar subquery was done using the global cache
Type: float | (count)
Unit: count
ScalarSubqueriesLocalCacheHit Number of times a read from a scalar subquery was done using the local cache
Type: float | (count)
Unit: count
SchemaInferenceCacheEvictions Number of times a schema from cache was evicted due to overflow
Type: float | (count)
Unit: count
SchemaInferenceCacheHits Number of times a schema from cache was used for schema inference
Type: float | (count)
Unit: count
SchemaInferenceCacheInvalidations Number of times a schema in cache became invalid due to changes in data
Type: float | (count)
Unit: count
SchemaInferenceCacheMisses Number of times a schema was not found in cache during schema inference
Type: float | (count)
Unit: count
Seek Number of times the 'lseek' function was called.
Type: float | (count)
Unit: count
SelectQuery Same as Query, but only for SELECT queries.
Type: float | (count)
Unit: count
SelectQueryTimeMicroseconds Total time of SELECT queries.
Type: float | (count)
Unit: time,ms
SelectedBytes Number of bytes (uncompressed; for columns as they are stored in memory) SELECTed from all tables.
Type: float | (count)
Unit: digital,B
SelectedMarks Number of marks (index granules) selected to read from a MergeTree table.
Type: float | (count)
Unit: count
SelectedParts Number of data parts selected to read from a MergeTree table.
Type: float | (count)
Unit: count
SelectedRanges Number of (non-adjacent) ranges in all data parts selected to read from a MergeTree table.
Type: float | (count)
Unit: count
SelectedRows Number of rows SELECTed from all tables.
Type: float | (count)
Unit: count
ServerStartupMilliseconds Time elapsed from starting server to listening to sockets in milliseconds
Type: float | (count)
Unit: time,ms
SleepFunctionCalls Number of times a sleep function (sleep, sleepEachRow) has been called.
Type: float | (count)
Unit: count
SleepFunctionMicroseconds Time spent sleeping due to a sleep function call.
Type: float | (count)
Unit: time,ms
SlowRead Number of reads from a file that were slow. This indicates system overload. Thresholds are controlled by read_backoff_* settings.
Type: float | (count)
Unit: count
SoftPageFaults The number of soft page faults in query execution threads. Soft page fault usually means a miss in the memory allocator cache which required a new memory mapping from the OS and subsequent allocation of a page of physical memory.
Type: float | (count)
Unit: count
StorageBufferErrorOnFlush Number of times a buffer in the 'Buffer' table has not been able to flush due to error writing in the destination table.
Type: float | (count)
Unit: count
StorageBufferFlush Number of times a buffer in a 'Buffer' table was flushed.
Type: float | (count)
Unit: count
StorageBufferLayerLockReadersWaitMilliseconds Time for waiting for Buffer layer during reading.
Type: float | (count)
Unit: time,ms
StorageBufferLayerLockWritersWaitMilliseconds Time for waiting free Buffer layer to write to (can be used to tune Buffer layers).
Type: float | (count)
Unit: time,ms
StorageBufferPassedAllMinThresholds Number of times the criteria on min thresholds have been reached to flush a buffer in a 'Buffer' table.
Type: float | (count)
Unit: count
StorageBufferPassedBytesFlushThreshold Number of times the background-only flush threshold on bytes has been reached to flush a buffer in a 'Buffer' table. This is an expert-only metric. If you read this and you are not an expert, stop reading.
Type: float | (count)
Unit: count
StorageBufferPassedBytesMaxThreshold Number of times the criterion on the max bytes threshold has been reached to flush a buffer in a 'Buffer' table.
Type: float | (count)
Unit: count
StorageBufferPassedRowsFlushThreshold Number of times the background-only flush threshold on rows has been reached to flush a buffer in a 'Buffer' table. This is an expert-only metric. If you read this and you are not an expert, stop reading.
Type: float | (count)
Unit: count
StorageBufferPassedRowsMaxThreshold Number of times the criterion on the max rows threshold has been reached to flush a buffer in a 'Buffer' table.
Type: float | (count)
Unit: count
StorageBufferPassedTimeFlushThreshold Number of times the background-only flush threshold on time has been reached to flush a buffer in a 'Buffer' table. This is an expert-only metric. If you read this and you are not an expert, stop reading.
Type: float | (count)
Unit: count
StorageBufferPassedTimeMaxThreshold Number of times the criterion on the max time threshold has been reached to flush a buffer in a 'Buffer' table.
Type: float | (count)
Unit: count
SuspendSendingQueryToShard Total count when sending query to shard was suspended when async_query_sending_for_remote is enabled.
Type: float | (count)
Unit: count
SynchronousRemoteReadWaitMicroseconds Time spent in waiting for synchronous remote reads.
Type: float | (count)
Unit: time,ms
SystemTimeMicroseconds Total time spent in processing (queries and other tasks) threads executing CPU instructions in OS kernel space. This includes time the CPU pipeline was stalled due to cache misses, branch mispredictions, hyper-threading, etc.
Type: float | (count)
Unit: time,ms
TableFunctionExecute Number of table function calls.
Type: float | (count)
Unit: count
ThreadPoolReaderPageCacheHit Number of times the read inside ThreadPoolReader was done from page cache.
Type: float | (count)
Unit: count
ThreadPoolReaderPageCacheHitBytes Number of bytes read inside ThreadPoolReader when it was done from page cache.
Type: float | (count)
Unit: digital,B
ThreadPoolReaderPageCacheHitElapsedMicroseconds Time spent reading data from page cache in ThreadPoolReader.
Type: float | (count)
Unit: time,ms
ThreadPoolReaderPageCacheMiss Number of times the read inside ThreadPoolReader was not done from page cache and was handed off to the thread pool.
Type: float | (count)
Unit: count
ThreadPoolReaderPageCacheMissBytes Number of bytes read inside ThreadPoolReader when the read was not done from page cache and was handed off to the thread pool.
Type: float | (count)
Unit: digital,B
ThreadPoolReaderPageCacheMissElapsedMicroseconds Time spent reading data inside the asynchronous job in ThreadPoolReader - when read was not done from page cache.
Type: float | (count)
Unit: time,ms
ThreadpoolReaderReadBytes Bytes read from a threadpool task in asynchronous reading
Type: float | (count)
Unit: digital,B
ThreadpoolReaderSubmit Bytes read from a threadpool task in asynchronous reading
Type: float | (count)
Unit: digital,B
ThreadpoolReaderTaskMicroseconds Time spent getting the data in asynchronous reading
Type: float | (count)
Unit: time,ms
ThrottlerSleepMicroseconds Total time a query was sleeping to conform to all throttling settings.
Type: float | (count)
Unit: time,ms
UncompressedCacheHits Number of times a block of data has been found in the uncompressed cache (and decompression was avoided).
Type: float | (count)
Unit: count
UncompressedCacheMisses Number of times a block of data has not been found in the uncompressed cache (and required decompression).
Type: float | (count)
Unit: count
UncompressedCacheWeightLost Number of bytes evicted from the uncompressed cache.
Type: float | (count)
Unit: digital,B
UserTimeMicroseconds Total time spent in processing (queries and other tasks) threads executing CPU instructions in user space. This includes time the CPU pipeline was stalled due to cache misses, branch mispredictions, hyper-threading, etc.
Type: float | (count)
Unit: time,ms
VoluntaryContextSwitches Voluntary context switches
Type: float | (count)
Unit: count
WaitMarksLoadMicroseconds Time spent loading marks
Type: float | (count)
Unit: time,ms
WaitPrefetchTaskMicroseconds Time spent waiting for prefetched reader
Type: float | (count)
Unit: time,ms
WriteBufferAIOWrite Write Buffer AIO Write
Type: float | (count)
Unit: count
WriteBufferAIOWriteBytes Write buffer AIO write bytes
Type: float | (count)
Unit: digital,B
WriteBufferFromFileDescriptorWrite Number of writes (write/pwrite) to a file descriptor. Does not include sockets.
Type: float | (count)
Unit: count
WriteBufferFromFileDescriptorWriteBytes Number of bytes written to file descriptors. If the file is compressed, this will show compressed data size.
Type: float | (count)
Unit: digital,B
WriteBufferFromFileDescriptorWriteFailed Number of times the write (write/pwrite) to a file descriptor have failed.
Type: float | (count)
Unit: count
WriteBufferFromS3Bytes Bytes written to S3.
Type: float | (count)
Unit: digital,B
WriteBufferFromS3Microseconds Time spent on writing to S3.
Type: float | (count)
Unit: time,ms
WriteBufferFromS3RequestsErrors Number of exceptions while writing to S3.
Type: float | (count)
Unit: count
ZooKeeperBytesReceived Number of bytes received over network while communicating with ZooKeeper.
Type: float | (count)
Unit: digital,B
ZooKeeperBytesSent Number of bytes sent over network while communicating with ZooKeeper.
Type: float | (count)
Unit: digital,B
ZooKeeperCheck Number of 'check' requests to ZooKeeper. Usually they don't make sense in isolation, only as part of a complex transaction.
Type: float | (count)
Unit: count
ZooKeeperClose Number of times the connection with ZooKeeper has been closed voluntarily.
Type: float | (count)
Unit: count
ZooKeeperCreate Number of 'create' requests to ZooKeeper.
Type: float | (count)
Unit: count
ZooKeeperExists Number of 'exists' requests to ZooKeeper.
Type: float | (count)
Unit: count
ZooKeeperGet Number of 'get' requests to ZooKeeper.
Type: float | (count)
Unit: count
ZooKeeperHardwareExceptions Number of exceptions while working with ZooKeeper related to network (connection loss or similar).
Type: float | (count)
Unit: count
ZooKeeperInit Number of times connection with ZooKeeper has been established.
Type: float | (count)
Unit: count
ZooKeeperList Number of 'list' (getChildren) requests to ZooKeeper.
Type: float | (count)
Unit: count
ZooKeeperMulti Number of 'multi' requests to ZooKeeper (compound transactions).
Type: float | (count)
Unit: count
ZooKeeperOtherExceptions Number of exceptions while working with ZooKeeper other than ZooKeeperUserExceptions and ZooKeeperHardwareExceptions.
Type: float | (count)
Unit: count
ZooKeeperRemove Number of 'remove' requests to ZooKeeper.
Type: float | (count)
Unit: count
ZooKeeperSet Number of 'set' requests to ZooKeeper.
Type: float | (count)
Unit: count
ZooKeeperSync Number of 'sync' requests to ZooKeeper. These requests are rarely needed or usable.
Type: float | (count)
Unit: count
ZooKeeperTransactions Number of ZooKeeper operations, which include both read and write operations as well as multi-transactions.
Type: float | (count)
Unit: count
ZooKeeperUserExceptions Number of exceptions while working with ZooKeeper related to the data (no node, bad version or similar).
Type: float | (count)
Unit: count
ZooKeeperWaitMicroseconds Number of microseconds spent waiting for responses from ZooKeeper after creating a request, summed across all the requesting threads.
Type: float | (count)
Unit: time,ms
ZooKeeperWatchResponse Number of times watch notification has been received from ZooKeeper.
Type: float | (count)
Unit: count
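
The fields above are listed as counts, and ClickHouse profile events accumulate from server start, so a ratio such as a cache hit rate is usually more meaningful when computed over the increase between two samples than over the raw totals. The following is a minimal Python sketch of that calculation, not part of the collector. It assumes the two samples are plain dictionaries keyed by field name, and that a counter going backwards means the server restarted; the helper names `counter_delta` and `hit_ratio` and the sample values are hypothetical.

```python
from typing import Mapping, Optional


def counter_delta(prev: float, curr: float) -> float:
    """Increase of a cumulative counter between two samples.

    Assumption: if the counter went backwards, the server restarted and
    the current value is the whole increase since the reset.
    """
    return curr - prev if curr >= prev else curr


def hit_ratio(prev: Mapping[str, float], curr: Mapping[str, float],
              hits: str, misses: str) -> Optional[float]:
    """Hit ratio over the interval between two samples.

    Returns None when there were no lookups in the interval.
    """
    h = counter_delta(prev[hits], curr[hits])
    m = counter_delta(prev[misses], curr[misses])
    total = h + m
    return h / total if total > 0 else None


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    earlier = {"UncompressedCacheHits": 100.0, "UncompressedCacheMisses": 50.0}
    later = {"UncompressedCacheHits": 190.0, "UncompressedCacheMisses": 60.0}
    print(hit_ratio(earlier, later,
                    "UncompressedCacheHits", "UncompressedCacheMisses"))
```

The same helper can be pointed at other hit/miss pairs from the table, such as `QueryCacheHits` and `QueryCacheMisses` or `OpenedFileCacheHits` and `OpenedFileCacheMisses`.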

ClickHouseStatusInfo

Tags & Fields Description
host
(tag)
Host name
instance
(tag)
Instance endpoint
DictionaryStatus Dictionary Status.
Type: float | (gauge)
Unit: -
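
The measurement set name here, `ClickHouseStatusInfo`, looks like the part before the first underscore of a metric name exposed on the ClickHouse Prometheus endpoint (for example `ClickHouseStatusInfo_DictionaryStatus`), with the remainder as the field name. This page does not confirm that mapping, so the Python sketch below treats it as an assumption. It parses a small piece of Prometheus text exposition and groups samples that way; `group_by_measurement` is a hypothetical helper and the sample text is made up.

```python
import re
from collections import defaultdict
from typing import Dict

# Matches "name{labels} value" or "name value"; labels are matched but not parsed.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(\{.*\})?\s+(\S+)')


def group_by_measurement(exposition: str) -> Dict[str, Dict[str, float]]:
    """Group samples by the metric-name prefix before the first underscore.

    Assumption: prefix = measurement set, remainder = field name.
    Samples sharing a name but differing in labels overwrite each other
    here, because labels are ignored in this sketch.
    """
    grouped: Dict[str, Dict[str, float]] = defaultdict(dict)
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a sample line
        name, _labels, value = match.groups()
        measurement, sep, field = name.partition("_")
        if not sep or not field:
            continue  # no underscore: nothing to split on
        grouped[measurement][field] = float(value)
    return dict(grouped)


if __name__ == "__main__":
    # Made-up exposition text, for illustration only.
    sample = (
        "# TYPE ClickHouseProfileEvents_Query counter\n"
        "ClickHouseProfileEvents_Query 42\n"
        '# TYPE ClickHouseStatusInfo_DictionaryStatus gauge\n'
        'ClickHouseStatusInfo_DictionaryStatus{dictionary="d"} 1\n'
    )
    print(group_by_measurement(sample))
```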

collector

Tags & Fields Description
instance
(tag)
Server addr of the instance
job
(tag)
Server name of the instance
up
Type: int | (gauge)
Unit: -
