etcd


The etcd collector gathers metrics from etcd instances, such as etcd server status and network status, and reports them to DataFlux to help you monitor and analyze etcd anomalies.

Configuration

Prerequisites

etcd version >= 3. Tested versions:

  • 3.5.7
  • 3.4.24
  • 3.3.27

Collector configuration

Start etcd. The default metrics endpoint is http://localhost:2379/metrics; you can change it in the configuration file.
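Before enabling the collector, it can help to confirm that the metrics endpoint responds and returns Prometheus text format. The following is a minimal sketch using only the Python standard library. The helper names `fetch_metrics`, `parse_sample`, and `parse_metrics` are hypothetical and not part of DataKit or etcd, and the parser is simplified: it ignores HELP/TYPE comments and assumes label values contain no `}` characters.

```python
import re
from urllib.request import urlopen

# Matches one label pair such as grpc_type="unary" (hypothetical helper pattern).
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def fetch_metrics(url="http://localhost:2379/metrics", timeout=5):
    """Download the raw metrics text from an etcd metrics endpoint."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_sample(line):
    """Parse one sample line into (name, labels, value)."""
    if "{" in line:
        name, rest = line.split("{", 1)
        label_text, value_text = rest.rsplit("}", 1)
        labels = dict(LABEL_RE.findall(label_text))
    else:
        name, value_text = line.split(None, 1)
        labels = {}
    # The first token after the name/labels is the value; a timestamp may follow.
    return name.strip(), labels, float(value_text.split()[0])


def parse_metrics(text):
    """Parse Prometheus text exposition into a list of samples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        samples.append(parse_sample(line))
    return samples


if __name__ == "__main__":
    # Requires a running etcd instance on the default address.
    for name, labels, value in parse_metrics(fetch_metrics()):
        if name == "etcd_server_has_leader":
            print(name, labels, value)
```

If the request fails or the output is empty, check the etcd listen address and the `urls` setting before looking at the collector itself.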

Go to the conf.d/etcd directory under the DataKit installation directory, copy etcd.conf.sample, and rename the copy to etcd.conf. For example:

[[inputs.etcd]]
  ## Exporter URLs.
  urls = ["http://127.0.0.1:2379/metrics"]

  ## TLS configuration.
  tls_open = false
  # tls_ca = "/tmp/ca.crt"
  # tls_cert = "/tmp/peer.crt"
  # tls_key = "/tmp/peer.key"

  ## Set to 'true' to enable election.
  election = true

  ## Ignore tags. Multi supported.
  ## The matched tags would be dropped, but the item would still be sent.
  # tags_ignore = ["xxxx"]

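  ## (Optional) Collect interval: (defaults to "30s").
  ## Keep this key above the [inputs.etcd.tags] table; placed below it, TOML would parse it as a tag.
  # interval = "30s"

  ## Customize tags.
  [inputs.etcd.tags]
  # some_tag = "some_value"
  # more_tag = "some_other_value"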

After configuring, restart DataKit.
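The `tags_ignore` comment in the sample configuration says that matched tags are dropped while the data point is still sent. The sketch below illustrates that behavior only. It is not DataKit's implementation: the point layout (a dict with `measurement`, `tags`, and `fields`) and the function name `drop_ignored_tags` are hypothetical, and it assumes tags are matched by exact key name.

```python
def drop_ignored_tags(point, tags_ignore):
    """Return a copy of the point without the tags named in tags_ignore.

    The fields and all other tags are left untouched, so the point itself
    is still reported.
    """
    ignored = set(tags_ignore)
    kept = {key: value for key, value in point["tags"].items() if key not in ignored}
    return {**point, "tags": kept}


# Hypothetical point as it might look after scraping one sample.
point = {
    "measurement": "etcd",
    "tags": {"host": "node-1", "grpc_method": "Range"},
    "fields": {"grpc_server_started_total": 42.0},
}
print(drop_ignored_tags(point, ["grpc_method"]))
```

Dropping a high-cardinality tag this way reduces the number of distinct series without losing the metric values.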

In Kubernetes, the collector can currently be enabled by injecting the collector configuration through a ConfigMap.

Metrics

etcd

  • Tags
| Tag | Description |
| --- | --- |
| action | Action. |
| cluster_version | Cluster version. |
| code | Code. |
| grpc_code | GRPC code. |
| grpc_method | GRPC method. |
| grpc_service | GRPC service name. |
| grpc_type | GRPC type. |
| host | Hostname. |
| instance | Instance. |
| server_go_version | Server go version. |
| server_id | Server ID. |
| server_version | Server version. |
| version | Version. |
  • Metric list
| Metric | Description | Type | Unit |
| --- | --- | --- | --- |
| etcd_cluster_version | Which version is running. 1 for 'cluster_version' label with current cluster version | float | count |
| etcd_debugging_auth_revision | The current revision of auth store. | float | count |
| etcd_debugging_disk_backend_commit_rebalance_duration_seconds | The latency distributions of commit.rebalance called by bboltdb backend. | float | count |
| etcd_debugging_disk_backend_commit_spill_duration_seconds | The latency distributions of commit.spill called by bboltdb backend. | float | count |
| etcd_debugging_disk_backend_commit_write_duration_seconds | The latency distributions of commit.write called by bboltdb backend. | float | count |
| etcd_debugging_lease_granted_total | The total number of granted leases. | float | count |
| etcd_debugging_lease_renewed_total | The number of renewed leases seen by the leader. | float | count |
| etcd_debugging_lease_revoked_total | The total number of revoked leases. | float | count |
| etcd_debugging_lease_ttl_total | Bucketed histogram of lease TTLs. | float | count |
| etcd_debugging_mvcc_compact_revision | The revision of the last compaction in store. | float | count |
| etcd_debugging_mvcc_current_revision | The current revision of store. | float | count |
| etcd_debugging_mvcc_db_compaction_keys_total | Total number of db keys compacted. | float | count |
| etcd_debugging_mvcc_db_compaction_last | The unix time of the last db compaction. Resets to 0 on start. | float | count |
| etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds | Bucketed histogram of db compaction pause duration. | float | count |
| etcd_debugging_mvcc_db_compaction_total_duration_milliseconds | Bucketed histogram of db compaction total duration. | float | count |
| etcd_debugging_mvcc_db_total_size_in_bytes | Total size of the underlying database physically allocated in bytes. | float | count |
| etcd_debugging_mvcc_delete_total | Total number of deletes seen by this member. | float | count |
| etcd_debugging_mvcc_events_total | Total number of events sent by this member. | float | count |
| etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds | Bucketed histogram of index compaction pause duration. | float | count |
| etcd_debugging_mvcc_keys_total | Total number of keys. | float | count |
| etcd_debugging_mvcc_pending_events_total | Total number of pending events to be sent. | float | count |
| etcd_debugging_mvcc_put_total | Total number of puts seen by this member. | float | count |
| etcd_debugging_mvcc_range_total | Total number of ranges seen by this member. | float | count |
| etcd_debugging_mvcc_slow_watcher_total | Total number of unsynced slow watchers. | float | count |
| etcd_debugging_mvcc_total_put_size_in_bytes | The total size of put kv pairs seen by this member. | float | count |
| etcd_debugging_mvcc_txn_total | Total number of txns seen by this member. | float | count |
| etcd_debugging_mvcc_watch_stream_total | Total number of watch streams. | float | count |
| etcd_debugging_mvcc_watcher_total | Total number of watchers. | float | count |
| etcd_debugging_server_alarms | Alarms for every member in cluster. 1 for 'server_id' label with current ID. 2 for 'alarm_type' label with type of this alarm | float | count |
| etcd_debugging_server_lease_expired_total | The total number of expired leases. | float | count |
| etcd_debugging_snap_save_marshalling_duration_seconds | The marshaling cost distributions of save called by snapshot. | float | count |
| etcd_debugging_snap_save_total_duration_seconds | The total latency distributions of save called by snapshot. | float | count |
| etcd_debugging_store_expires_total | Total number of expired keys. | float | count |
| etcd_debugging_store_reads_failed_total | Failed read actions by (get/getRecursive), local to this member. | float | count |
| etcd_debugging_store_reads_total | Total number of reads action by (get/getRecursive), local to this member. | float | count |
| etcd_debugging_store_watch_requests_total | Total number of incoming watch requests (new or reestablished). | float | count |
| etcd_debugging_store_watchers | Count of currently active watchers. | float | count |
| etcd_debugging_store_writes_failed_total | Failed write actions (e.g. set/compareAndDelete), seen by this member. | float | count |
| etcd_debugging_store_writes_total | Total number of writes (e.g. set/compareAndDelete) seen by this member. | float | count |
| etcd_disk_backend_commit_duration_seconds | The latency distributions of commit called by backend. | float | count |
| etcd_disk_backend_defrag_duration_seconds | The latency distribution of backend defragmentation. | float | count |
| etcd_disk_backend_snapshot_duration_seconds | The latency distribution of backend snapshots. | float | count |
| etcd_disk_defrag_inflight | Whether or not defrag is active on the member. 1 means active, 0 means not. | float | count |
| etcd_disk_wal_fsync_duration_seconds | The latency distributions of fsync called by WAL. | float | count |
| etcd_disk_wal_write_bytes_total | Total number of bytes written in WAL. | float | count |
| etcd_grpc_proxy_cache_hits_total | Total number of cache hits | float | count |
| etcd_grpc_proxy_cache_keys_total | Total number of keys/ranges cached | float | count |
| etcd_grpc_proxy_cache_misses_total | Total number of cache misses | float | count |
| etcd_grpc_proxy_events_coalescing_total | Total number of events coalescing | float | count |
| etcd_grpc_proxy_watchers_coalescing_total | Total number of current watchers coalescing | float | count |
| etcd_mvcc_db_open_read_transactions | The number of currently open read transactions | float | count |
| etcd_mvcc_db_total_size_in_bytes | Total size of the underlying database physically allocated in bytes. | float | count |
| etcd_mvcc_db_total_size_in_use_in_bytes | Total size of the underlying database logically in use in bytes. | float | count |
| etcd_mvcc_delete_total | Total number of deletes seen by this member. | float | count |
| etcd_mvcc_hash_duration_seconds | The latency distribution of storage hash operation. | float | count |
| etcd_mvcc_hash_rev_duration_seconds | The latency distribution of storage hash by revision operation. | float | count |
| etcd_mvcc_put_total | Total number of puts seen by this member. | float | count |
| etcd_mvcc_range_total | Total number of ranges seen by this member. | float | count |
| etcd_mvcc_txn_total | Total number of txns seen by this member. | float | count |
| etcd_network_active_peers | The current number of active peer connections. | float | count |
| etcd_network_client_grpc_received_bytes_total | The total number of bytes received from grpc clients. | float | count |
| etcd_network_client_grpc_sent_bytes_total | The total number of bytes sent to grpc clients. | float | count |
| etcd_network_disconnected_peers_total | The total number of disconnected peers. | float | count |
| etcd_network_known_peers | The current number of known peers. | float | count |
| etcd_network_peer_received_bytes_total | The total number of bytes received from peers. | float | count |
| etcd_network_peer_received_failures_total | The total number of receive failures from peers. | float | count |
| etcd_network_peer_round_trip_time_seconds | Round-Trip-Time histogram between peers | float | count |
| etcd_network_peer_sent_bytes_total | The total number of bytes sent to peers. | float | count |
| etcd_network_peer_sent_failures_total | The total number of send failures from peers. | float | count |
| etcd_network_server_stream_failures_total | The total number of stream failures from the local server. | float | count |
| etcd_network_snapshot_receive_failures | Total number of snapshot receive failures | float | count |
| etcd_network_snapshot_receive_inflights_total | Total number of inflight snapshot receives | float | count |
| etcd_network_snapshot_receive_success | Total number of successful snapshot receives | float | count |
| etcd_network_snapshot_receive_total_duration_seconds | Total latency distributions of v3 snapshot receives | float | count |
| etcd_network_snapshot_send_failures | Total number of snapshot send failures | float | count |
| etcd_network_snapshot_send_inflights_total | Total number of inflight snapshot sends | float | count |
| etcd_network_snapshot_send_success | Total number of successful snapshot sends | float | count |
| etcd_network_snapshot_send_total_duration_seconds | Total latency distributions of v3 snapshot sends | float | count |
| etcd_server_apply_duration_seconds | The latency distributions of v2 apply called by backend. | float | count |
| etcd_server_client_requests_total | The total number of client requests per client version. | float | count |
| etcd_server_go_version | Which Go version server is running with. 1 for 'server_go_version' label with current version. | float | count |
| etcd_server_has_leader | Whether or not a leader exists. 1 is existence, 0 is not. | float | count |
| etcd_server_health_failures | The total number of failed health checks | float | count |
| etcd_server_health_success | The total number of successful health checks | float | count |
| etcd_server_heartbeat_send_failures_total | The total number of leader heartbeat send failures (likely overloaded from slow disk). | float | count |
| etcd_server_id | Server or member ID in hexadecimal format. 1 for 'server_id' label with current ID. | float | count |
| etcd_server_is_leader | Whether or not this member is a leader. 1 if is, 0 otherwise. | float | count |
| etcd_server_is_learner | Whether or not this member is a learner. 1 if is, 0 otherwise. | float | count |
| etcd_server_leader_changes_seen_total | The number of leader changes seen. | float | count |
| etcd_server_learner_promote_failures | The total number of failed learner promotions (likely learner not ready) while this member is leader. | float | count |
| etcd_server_learner_promote_successes | The total number of successful learner promotions while this member is leader. | float | count |
| etcd_server_proposals_applied_total | The total number of consensus proposals applied. | float | count |
| etcd_server_proposals_committed_total | The total number of consensus proposals committed. | float | count |
| etcd_server_proposals_failed_total | The total number of failed proposals seen. | float | count |
| etcd_server_proposals_pending | The current number of pending proposals to commit. | float | count |
| etcd_server_quota_backend_bytes | Current backend storage quota size in bytes. | float | count |
| etcd_server_read_indexes_failed_total | The total number of failed read indexes seen. | float | count |
| etcd_server_slow_apply_total | The total number of slow apply requests (likely overloaded from slow disk). | float | count |
| etcd_server_slow_read_indexes_total | The total number of pending read indexes not in sync with leader's or timed out read index requests. | float | count |
| etcd_server_snapshot_apply_in_progress_total | 1 if the server is applying the incoming snapshot. 0 if none. | float | count |
| etcd_server_version | Which version is running. 1 for 'server_version' label with current version. | float | count |
| etcd_snap_db_fsync_duration_seconds | The latency distributions of fsyncing .snap.db file | float | count |
| etcd_snap_db_save_total_duration_seconds | The total latency distributions of v3 snapshot save | float | count |
| etcd_snap_fsync_duration_seconds | The latency distributions of fsync called by snap. | float | count |
| go_gc_duration_seconds | A summary of the pause duration of garbage collection cycles. | float | count |
| go_goroutines | Number of goroutines that currently exist. | float | count |
| go_info | Information about the Go environment. | float | count |
| go_memstats_alloc_bytes | Number of bytes allocated and still in use. | float | count |
| go_memstats_alloc_bytes_total | Total number of bytes allocated, even if freed. | float | count |
| go_memstats_buck_hash_sys_bytes | Number of bytes used by the profiling bucket hash table. | float | count |
| go_memstats_frees_total | Total number of frees. | float | count |
| go_memstats_gc_cpu_fraction | The fraction of this program's available CPU time used by the GC since the program started. | float | count |
| go_memstats_gc_sys_bytes | Number of bytes used for garbage collection system metadata. | float | count |
| go_memstats_heap_alloc_bytes | Number of heap bytes allocated and still in use. | float | count |
| go_memstats_heap_idle_bytes | Number of heap bytes waiting to be used. | float | count |
| go_memstats_heap_inuse_bytes | Number of heap bytes that are in use. | float | count |
| go_memstats_heap_objects | Number of allocated objects. | float | count |
| go_memstats_heap_released_bytes | Number of heap bytes released to OS. | float | count |
| go_memstats_heap_sys_bytes | Number of heap bytes obtained from system. | float | count |
| go_memstats_last_gc_time_seconds | Number of seconds since 1970 of last garbage collection. | float | count |
| go_memstats_lookups_total | Total number of pointer lookups. | float | count |
| go_memstats_mallocs_total | Total number of mallocs. | float | count |
| go_memstats_mcache_inuse_bytes | Number of bytes in use by mcache structures. | float | count |
| go_memstats_mcache_sys_bytes | Number of bytes used for mcache structures obtained from system. | float | count |
| go_memstats_mspan_inuse_bytes | Number of bytes in use by mspan structures. | float | count |
| go_memstats_mspan_sys_bytes | Number of bytes used for mspan structures obtained from system. | float | count |
| go_memstats_next_gc_bytes | Number of heap bytes when next garbage collection will take place. | float | count |
| go_memstats_other_sys_bytes | Number of bytes used for other system allocations. | float | count |
| go_memstats_stack_inuse_bytes | Number of bytes in use by the stack allocator. | float | count |
| go_memstats_stack_sys_bytes | Number of bytes obtained from system for stack allocator. | float | count |
| go_memstats_sys_bytes | Number of bytes obtained from system. | float | count |
| go_threads | Number of OS threads created. | float | count |
| grpc_server_handled_total | Total number of RPCs completed on the server, regardless of success or failure. | float | count |
| grpc_server_msg_received_total | Total number of RPC stream messages received on the server. | float | count |
| grpc_server_msg_sent_total | Total number of gRPC stream messages sent by the server. | float | count |
| grpc_server_started_total | Total number of RPCs started on the server. | float | count |
| os_fd_limit | The file descriptor limit. | float | count |
| os_fd_used | The number of used file descriptors. | float | count |
| process_cpu_seconds_total | Total user and system CPU time spent in seconds | float | count |
| process_max_fds | Maximum number of open file descriptors | float | count |
| process_open_fds | Number of open file descriptors | float | count |
| process_resident_memory_bytes | Resident memory size in bytes | float | count |
| process_start_time_seconds | Start time of the process since unix epoch in seconds | float | count |
| process_virtual_memory_bytes | Virtual memory size in bytes | float | count |
| process_virtual_memory_max_bytes | Maximum amount of virtual memory available in bytes | float | count |
| promhttp_metric_handler_requests_in_flight | Current number of scrapes being served. | float | count |
| promhttp_metric_handler_requests_total | Total number of scrapes by HTTP status code. | float | count |
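A few of the metrics above lend themselves to simple health checks. The sketch below shows one way to combine them, assuming the latest values have already been collected into a plain dict keyed by metric name. The function name `check_etcd` and the 0.8 warning thresholds are illustrative assumptions, not values recommended by etcd or DataKit.

```python
def check_etcd(metrics, db_usage_warn=0.8, fd_usage_warn=0.8):
    """Return a list of warning strings derived from a metric snapshot.

    `metrics` maps metric names to their latest numeric values. Metrics
    missing from the snapshot are skipped rather than reported as failures.
    """
    warnings = []

    # etcd_server_has_leader is 1 when a leader exists and 0 otherwise.
    if metrics.get("etcd_server_has_leader") == 0:
        warnings.append("no leader")

    # Compare the physically allocated database size with the backend quota.
    quota = metrics.get("etcd_server_quota_backend_bytes", 0)
    db_size = metrics.get("etcd_mvcc_db_total_size_in_bytes", 0)
    if quota > 0 and db_size / quota >= db_usage_warn:
        warnings.append(f"db usage {db_size / quota:.0%} of quota")

    # Compare open file descriptors with the process limit.
    max_fds = metrics.get("process_max_fds", 0)
    open_fds = metrics.get("process_open_fds", 0)
    if max_fds > 0 and open_fds / max_fds >= fd_usage_warn:
        warnings.append(f"fd usage {open_fds / max_fds:.0%} of limit")

    return warnings


# Hypothetical snapshot values for illustration only.
snapshot = {
    "etcd_server_has_leader": 1,
    "etcd_server_quota_backend_bytes": 2 * 1024**3,
    "etcd_mvcc_db_total_size_in_bytes": 512 * 1024**2,
    "process_max_fds": 65536,
    "process_open_fds": 120,
}
print(check_etcd(snapshot))
```

In practice, checks like these are usually expressed as monitor or alert rules on the platform; the code only makes the arithmetic behind them explicit.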
