StatsD


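Metrics collected by the DDTrace agent are sent in StatsD format to port 8125 on DataKit (DK). They include JVM runtime CPU, memory, thread, and class-loading information, as well as JMX metrics from any integrations that have been enabled, such as Kafka, Tomcat, and RabbitMQ.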

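Configuration

Prerequisites

When DDTrace runs as an agent, you do not need to open a JMX port yourself. If no port has been opened, the agent opens a random local port.

DDTrace collects JVM information by default and sends it to localhost:8125.

In a Kubernetes environment, configure the StatsD host and port: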

DD_JMXFETCH_STATSD_HOST=datakit_url
DD_JMXFETCH_STATSD_PORT=8125
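
To check that the DataKit listener is reachable from the application host or pod, one option is to send a hand-built gauge in the Datadog StatsD line format over UDP. The following is a minimal sketch, not part of DDTrace or DataKit: the metric name jvm.test_gauge, the service:demo tag, and the helper names are hypothetical, and the host and port fall back to localhost:8125 when the two environment variables above are not set.

```python
import os
import socket


def format_gauge(name, value, tags=None):
    """Build one DogStatsD gauge line: <name>:<value>|g|#<key>:<value>,..."""
    line = f"{name}:{value}|g"
    if tags:
        line += "|#" + ",".join(f"{k}:{v}" for k, v in tags.items())
    return line


def statsd_target(env=None):
    """Resolve the StatsD host and port, defaulting to localhost:8125."""
    env = os.environ if env is None else env
    host = env.get("DD_JMXFETCH_STATSD_HOST", "localhost")
    port = int(env.get("DD_JMXFETCH_STATSD_PORT", "8125"))
    return host, port


def send_test_gauge(env=None):
    """Send a single test datagram. UDP gives no delivery confirmation."""
    # Hypothetical metric name and tag, used only for this connectivity check.
    payload = format_gauge("jvm.test_gauge", 1, {"service": "demo"}).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, statsd_target(env))
```

Because UDP is connectionless, a successful send_test_gauge() call only shows that the datagram left the sender; whether it arrived has to be confirmed on the DataKit side, for example with datakit monitor.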

Use dd.jmxfetch.<INTEGRATION_NAME>.enabled=true to enable a specific integration.

Before filling in INTEGRATION_NAME, check the list of third-party software supported by default.

For example, Tomcat or Kafka:

-Ddd.jmxfetch.tomcat.enabled=true
# or
-Ddd.jmxfetch.kafka.enabled=true
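
When several integrations are enabled at once, the flags can be generated rather than typed by hand. The sketch below is only an illustration: the helper names and the jar paths passed to java_command are hypothetical, and it assumes the tracer is attached with the standard -javaagent JVM option.

```python
def jmxfetch_flags(integrations):
    """Return one -Ddd.jmxfetch.<name>.enabled=true flag per integration name."""
    return [
        f"-Ddd.jmxfetch.{name.strip().lower()}.enabled=true"
        for name in integrations
    ]


def java_command(agent_jar, app_jar, integrations):
    """Assemble a java command line with the agent and the integration flags."""
    return [
        "java",
        f"-javaagent:{agent_jar}",
        *jmxfetch_flags(integrations),
        "-jar",
        app_jar,
    ]
```

For example, java_command("dd-java-agent.jar", "app.jar", ["tomcat", "kafka"]) yields an argument list that can be passed to a process launcher without shell quoting.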

Collector configuration

Go to the conf.d/samples directory under the DataKit installation directory, copy statsd.conf.sample, and name the copy statsd.conf. An example:

[[inputs.statsd]]
  ## Collector alias.
  # source = "statsd/-/-"

  ## Collect interval, default is 10 seconds. (optional)
  # interval = '10s'

  protocol = "udp"

  ## Address to host unix listener on, linux only
  service_unix_address = "/var/run/datakit/statsd.sock"

  ## Address and port to host UDP listener on
  service_address = ":8125"

  ## Tag request metric. Used for distinguish feed metric name.
  ## eg, DD_TAGS=source_key:tomcat,host_key:cn-shanghai-sq5ei
  ## eg, -Ddd.tags=source_key:tomcat,host_key:cn-shanghai-sq5ei
  # statsd_source_key = "source_key"
  # statsd_host_key   = "host_key"
  ## Indicate whether report tag statsd_source_key and statsd_host_key.
  # save_above_key    = false

  delete_gauges = true
  delete_counters = true
  delete_sets = true
  delete_timings = true

  ## Counter metric is float in new Datakit version, set true if want be int.
  # set_counter_int = false

  ## Percentiles to calculate for timing & histogram stats
  percentiles = [50.0, 90.0, 99.0, 99.9, 99.95, 100.0]

  ## separator to use between elements of a statsd metric
  metric_separator = "_"

  ## Parses tags in the datadog statsd format
  ## http://docs.datadoghq.com/guides/dogstatsd/
  parse_data_dog_tags = true

  ## Parses datadog extensions to the statsd format
  datadog_extensions = true

  ## Parses distributions metric as specified in the datadog statsd format
  ## https://docs.datadoghq.com/developers/metrics/types/?tab=distribution#definition
  datadog_distributions = true

  ## We do not need following tags(they may create tremendous of time-series under influxdb's logic)
  # Examples:
  # "runtime-id", "metric-type"
  drop_tags = [ ]

  # All metric-name prefixed with 'jvm_' are set to influxdb's measurement 'jvm'
  # All metric-name prefixed with 'stats_' are set to influxdb's measurement 'stats'
  # Examples:
  # "stats_:stats", "jvm_:jvm", "tomcat_:tomcat",
  metric_mapping = [ ]

  ## Number of UDP messages allowed to queue up, once filled,
  ## the statsd server will start dropping packets, default is 128.
  # allowed_pending_messages = 128

  ## Number of timing/histogram values to track per-measurement in the
  ## calculation of percentiles. Raising this limit increases the accuracy
  ## of percentiles but also increases the memory usage and cpu time.
  percentile_limit = 1000

  ## Max duration (TTL) for each metric to stay cached/reported without being updated.
  #max_ttl = "1000h"

  [inputs.statsd.tags]
  # some_tag = "some_value"
  # more_tag = "some_other_value"

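After configuring, restart DataKit.

In Kubernetes, the collector can currently be enabled by injecting its configuration through a ConfigMap.

Info

If the log shows a large number of Feed: io busy messages, set interval = '1s'. The minimum is 1s.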
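
The metric_mapping option in the sample configuration routes metrics to a measurement by name prefix, for example "jvm_:jvm". The sketch below illustrates that idea only and is not DataKit's implementation. It assumes that dots in the incoming metric name are first replaced by metric_separator and that the matched prefix is stripped from the field name, which is consistent with the field names listed under the jvm measurement in this document.

```python
def parse_mapping(rules):
    """Turn ["jvm_:jvm", ...] into [(prefix, measurement), ...]."""
    pairs = []
    for rule in rules:
        prefix, _, measurement = rule.partition(":")
        if prefix and measurement:
            pairs.append((prefix, measurement))
    return pairs


def route_metric(metric_name, rules, separator="_"):
    """Return (measurement, field) for a metric, or (None, name) if no rule matches."""
    # Assumption: "jvm.heap_memory" becomes "jvm_heap_memory" before matching.
    name = metric_name.replace(".", separator)
    for prefix, measurement in parse_mapping(rules):
        if name.startswith(prefix):
            # Assumption: the prefix is dropped from the resulting field name.
            return measurement, name[len(prefix):]
    return None, name
```

With the rules ["stats_:stats", "jvm_:jvm"], the metric jvm.heap_memory would be routed to the measurement jvm with the field heap_memory under these assumptions.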

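Tagging the data source

To mark the host that DDTrace collects from, inject tags in one of two ways:

  • Through the DD_TAGS environment variable, for example: DD_TAGS=source_key:tomcat,host_key:cn-shanghai-sq5ei
  • Through the dd.tags command-line option, for example: -Ddd.tags=source_key:tomcat,host_key:cn-shanghai-sq5ei

For the examples above, the DataKit configuration must specify source_key as the key for the source and host_key as the key for the host. Other key names also work, as long as the key names in the DataKit configuration match those used in DDTrace.

As a result, datakit monitor shows statsd/tomcat/cn-shanghai-sq5ei, which distinguishes this source from other sources that also report to the statsd collector. Without this configuration, datakit monitor shows the default: statsd/-/-

In addition, the save_above_key switch determines whether the tags corresponding to statsd_source_key and statsd_host_key are reported to the center. By default they are not reported (false).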
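
The relationship between the injected tags and the name shown in datakit monitor can be sketched as follows. This is an illustration of the behavior described above, not DataKit's code: the function names are hypothetical, and the source_key and host_key defaults stand in for the statsd_source_key and statsd_host_key settings.

```python
def parse_dd_tags(raw):
    """Parse a DD_TAGS / dd.tags string such as "a:b,c:d" into a dict."""
    tags = {}
    for item in raw.split(","):
        key, sep, value = item.strip().partition(":")
        if sep and key:
            tags[key] = value
    return tags


def feed_name(raw_tags, source_key="source_key", host_key="host_key"):
    """Build the statsd/<source>/<host> label, using "-" for a missing tag."""
    tags = parse_dd_tags(raw_tags)
    return "statsd/{}/{}".format(tags.get(source_key, "-"), tags.get(host_key, "-"))
```

If the key names configured in DataKit differ from those in the tag string, neither lookup matches and the label falls back to statsd/-/-, which is why both sides must use the same key names.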

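Metrics

All data collected below has a global tag named host appended by default (its value is the host name of the machine running DataKit). Other tags can be specified in the configuration through [inputs.statsd.tags]: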

 [inputs.statsd.tags]
  # some_tag = "some_value"
  # more_tag = "some_other_value"
  # ...

jvm

Tags & Fields Description
host (tag) Host name.
instance (tag) Instance name.
jmx_domain (tag) JMX domain.
metric_type (tag) Metric type.
name (tag) Type name.
runtime-id (tag) Runtime id.
service (tag) Service name.
type (tag) Object type.
buffer_pool_direct_capacity Measure of total memory capacity of direct buffers.
Type: float | (gauge)
Unit: digital,B
buffer_pool_direct_count Number of direct buffers in the pool.
Type: float | (gauge)
Unit: count
buffer_pool_direct_used Measure of memory used by direct buffers.
Type: float | (gauge)
Unit: digital,B
buffer_pool_mapped_capacity Measure of total memory capacity of mapped buffers.
Type: float | (gauge)
Unit: digital,B
buffer_pool_mapped_count Number of mapped buffers in the pool.
Type: float | (gauge)
Unit: count
buffer_pool_mapped_used Measure of memory used by mapped buffers.
Type: float | (gauge)
Unit: digital,B
cpu_load_process Recent CPU utilization for the process.
Type: float | (gauge)
Unit: percent,percent
cpu_load_system Recent CPU utilization for the whole system.
Type: float | (gauge)
Unit: percent,percent
daemon_code_cache_used The number of daemon threads.
Type: float | (count)
Unit: count
daemon_thread_count Daemon thread count.
Type: float | (gauge)
Unit: count
gc_code_cache_used GC code cache used.
Type: float | (gauge)
Unit: count
gc_eden_size The 'eden' size in garbage collection.
Type: float | (gauge)
Unit: digital,B
gc_major_collection_count The rate of major garbage collections.
Type: float | (gauge)
Unit: count
gc_major_collection_time The fraction of time spent(rate) in major garbage collection.
Type: float | (gauge)
Unit: time,ms
gc_metaspace_size The metaspace size in garbage collection.
Type: float | (gauge)
Unit: digital,B
gc_minor_collection_count The rate of minor garbage collections.
Type: float | (gauge)
Unit: count
gc_minor_collection_time The fraction of time spent(rate) in minor garbage collection.
Type: float | (gauge)
Unit: time,ms
gc_old_gen_size The old gen size in garbage collection.
Type: float | (gauge)
Unit: digital,B
gc_survivor_size The survivor size in garbage collection.
Type: float | (gauge)
Unit: digital,B
heap_memory The total Java heap memory used.
Type: float | (gauge)
Unit: digital,B
heap_memory_committed The total Java heap memory committed to be used.
Type: float | (gauge)
Unit: digital,B
heap_memory_init The initial Java heap memory allocated.
Type: float | (gauge)
Unit: digital,B
heap_memory_max The maximum Java heap memory available.
Type: float | (gauge)
Unit: digital,B
loaded_classes Number of classes currently loaded.
Type: float | (gauge)
Unit: count
non_heap_memory The total Java non-heap memory used. Non-heap memory is: Metaspace + CompressedClassSpace + CodeCache.
Type: float | (gauge)
Unit: digital,B
non_heap_memory_committed The total Java non-heap memory committed to be used.
Type: float | (gauge)
Unit: digital,B
non_heap_memory_init The initial Java non-heap memory allocated.
Type: float | (gauge)
Unit: digital,B
non_heap_memory_max The maximum Java non-heap memory available.
Type: float | (gauge)
Unit: digital,B
os_open_file_descriptors The number of file descriptors used by this process (only available for processes run as the dd-agent user)
Type: float | (gauge)
Unit: count
peak_thread_count The peak number of live threads.
Type: float | (count)
Unit: count
thread_count The number of live threads.
Type: float | (count)
Unit: count
total_thread_count The number of total threads.
Type: float | (count)
Unit: count

jmx

Tags & Fields Description
host (tag) Host name.
instance (tag) Instance name.
jmx_domain (tag) JMX domain.
metric_type (tag) Metric type.
name (tag) Type name.
runtime-id (tag) Runtime id.
service (tag) Service name.
type (tag) Object type.
gc_cms.count The total number of garbage collections that have occurred.
Type: float | (count)
Unit: count
gc_major_collection_count The rate of major garbage collections.
Type: float | (gauge)
Unit: count
gc_major_collection_time The fraction of time spent in major garbage collection. Set new_gc_metrics: true to receive this metric.
Type: float | (gauge)
Unit: PPM
gc_minor_collection_count The rate of minor garbage collections.
Type: float | (gauge)
Unit: count
gc_minor_collection_time The fraction of time spent in minor garbage collection. Set new_gc_metrics: true to receive this metric.
Type: float | (gauge)
Unit: PPM
gc_parnew.time The approximate accumulated garbage collection time elapsed.
Type: float | (gauge)
Unit: time,ms
heap_memory The total Java heap memory used.
Type: float | (gauge)
Unit: digital,B
heap_memory_committed The total Java heap memory committed to be used.
Type: float | (gauge)
Unit: digital,B
heap_memory_init The initial Java heap memory allocated.
Type: float | (gauge)
Unit: digital,B
heap_memory_max The maximum Java heap memory available.
Type: float | (gauge)
Unit: digital,B
non_heap_memory The total Java non-heap memory used. Non-heap memory is calculated as follows: 'Metaspace' + CompressedClassSpace + CodeCache
Type: float | (gauge)
Unit: digital,B
non_heap_memory_committed The total Java non-heap memory committed to be used.
Type: float | (gauge)
Unit: digital,B
non_heap_memory_init The initial Java non-heap memory allocated.
Type: float | (gauge)
Unit: digital,B
non_heap_memory_max The maximum Java non-heap memory available.
Type: float | (gauge)
Unit: digital,B
thread_count The number of live threads.
Type: float | (count)
Unit: count

ddtrace

Tags & Fields Description
endpoint (tag) Endpoint.
host (tag) Host name.
lang (tag) Lang type.
lang_interpreter (tag) Lang interpreter.
lang_interpreter_vendor (tag) Lang interpreter vendor.
lang_version (tag) Lang version.
metric_type (tag) Metric type.
priority (tag) Priority.
service (tag) Service name.
stat (tag) Stat.
tracer_version (tag) Tracer version.
tracer_agent_discovery_time Tracer agent discovery time.
Type: float | (gauge)
Unit: time,ms
tracer_api_errors_total Tracer api errors total.
Type: float | (gauge)
Unit: count
tracer_api_requests_total Tracer api requests total.
Type: float | (gauge)
Unit: count
tracer_flush_bytes_total Tracer flush bytes total.
Type: float | (gauge)
Unit: count
tracer_flush_traces_total Tracer flush traces total.
Type: float | (gauge)
Unit: count
tracer_queue_enqueued_bytes Tracer queue enqueued bytes.
Type: float | (gauge)
Unit: count
tracer_queue_enqueued_spans Tracer queue enqueued spans.
Type: float | (gauge)
Unit: count
tracer_queue_enqueued_traces Tracer queue enqueued traces.
Type: float | (gauge)
Unit: count
tracer_queue_max_length Tracer queue max length.
Type: float | (gauge)
Unit: count
tracer_scope_activate_count Tracer scope activate count.
Type: float | (gauge)
Unit: count
tracer_scope_close_count Tracer scope close count.
Type: float | (gauge)
Unit: count
tracer_span_pending_created Tracer span pending created.
Type: float | (gauge)
Unit: count
tracer_span_pending_finished Tracer span pending finished.
Type: float | (gauge)
Unit: count
tracer_trace_agent_discovery_time Tracer trace agent discovery time.
Type: float | (gauge)
Unit: count
tracer_trace_agent_send_time Tracer trace agent send time.
Type: float | (gauge)
Unit: count
tracer_trace_pending_created Tracer trace pending created.
Type: float | (gauge)
Unit: count
tracer_tracer_trace_buffer_fill_time Tracer trace buffer fill time.
Type: float | (gauge)
Unit: count
