StatsD
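Metrics collected by the DDTrace agent are sent in StatsD format to port 8125 of DataKit (DK). They include JVM runtime CPU, memory, thread, and class-loading information, as well as metrics from any enabled JMX integrations, such as Kafka, Tomcat, and RabbitMQ.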
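To make the data path concrete, here is a minimal sketch of what a StatsD datagram with Datadog-style tags looks like and how one could be sent to the DataKit listener over UDP. The DDTrace agent does this internally; the helper names `format_metric` and `send_metric`, the metric name, and the tag values are hypothetical and only serve to check that the listener on port 8125 is reachable.

```python
import socket


def format_metric(name, value, metric_type="g", tags=None):
    """Build one StatsD line: <name>:<value>|<type>[|#key:value,...]."""
    line = f"{name}:{value}|{metric_type}"
    if tags:
        # Datadog-style tag section appended after "|#".
        tag_part = ",".join(f"{k}:{v}" for k, v in tags.items())
        line += f"|#{tag_part}"
    return line


def send_metric(line, host="localhost", port=8125):
    """Send one datagram over UDP (fire-and-forget, no delivery guarantee)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical gauge; the name and tag are examples only.
    payload = format_metric("jvm.heap_memory", 1024, "g", {"service": "demo"})
    print(payload)
    # send_metric(payload)  # uncomment to send to a local DataKit
```

Because UDP is connectionless, `send_metric` does not report whether anything was listening; check the DataKit side to confirm the packet arrived.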
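Configuration
Prerequisites
When DDTrace runs as an agent, you do not need to open a JMX port explicitly. If no port has been opened, the agent opens a random local port.
DDTrace collects JVM information by default and sends it to localhost:8125.
In a Kubernetes (k8s) environment, the StatsD host and port must be configured explicitly so that metrics reach DataKit instead of localhost.
Use dd.jmxfetch.<INTEGRATION_NAME>.enabled=true to enable a specific collector.
Before filling in INTEGRATION_NAME, check the list of third-party software supported by default.
For example, to enable the Tomcat or Kafka collector, set dd.jmxfetch.tomcat.enabled=true or dd.jmxfetch.kafka.enabled=true.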
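The options in this section can be assembled into JVM flags. The following is a minimal sketch, assuming the tracer reads the system properties `dd.jmxfetch.statsd.host` and `dd.jmxfetch.statsd.port` for the StatsD address. Those two names are not given in this document, so verify them against your DDTrace version. The helper `jmxfetch_flags` and the DataKit address are hypothetical.

```python
def jmxfetch_flags(statsd_host, statsd_port=8125, integrations=()):
    """Return -D flags for the StatsD target and the enabled collectors."""
    flags = [
        # Assumed property names for the StatsD target; check your tracer docs.
        f"-Ddd.jmxfetch.statsd.host={statsd_host}",
        f"-Ddd.jmxfetch.statsd.port={statsd_port}",
    ]
    # Pattern from the text: dd.jmxfetch.<INTEGRATION_NAME>.enabled=true
    flags += [f"-Ddd.jmxfetch.{name}.enabled=true" for name in integrations]
    return flags


if __name__ == "__main__":
    # "datakit-service.datakit" is a hypothetical in-cluster address.
    for flag in jmxfetch_flags("datakit-service.datakit", 8125, ["tomcat", "kafka"]):
        print(flag)
```

The printed flags would be appended to the `java` command line of the monitored application, next to the agent option.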
Collector configuration
Go to the conf.d/samples directory under the DataKit installation directory, copy statsd.conf.sample, and rename the copy to statsd.conf. A sample follows:
```toml
[[inputs.statsd]]
  ## Collector alias.
  # source = "statsd/-/-"

  ## Collect interval, default is 10 seconds. (optional)
  # interval = '10s'

  protocol = "udp"

  ## Address to host unix listener on, linux only
  service_unix_address = "/var/run/datakit/statsd.sock"

  ## Address and port to host UDP listener on
  service_address = ":8125"

  ## Tag request metric. Used for distinguish feed metric name.
  ## eg, DD_TAGS=source_key:tomcat,host_key:cn-shanghai-sq5ei
  ## eg, -Ddd.tags=source_key:tomcat,host_key:cn-shanghai-sq5ei
  # statsd_source_key = "source_key"
  # statsd_host_key = "host_key"
  ## Indicate whether report tag statsd_source_key and statsd_host_key.
  # save_above_key = false

  delete_gauges = true
  delete_counters = true
  delete_sets = true
  delete_timings = true

  ## Counter metric is float in new Datakit version, set true if want be int.
  # set_counter_int = false

  ## Percentiles to calculate for timing & histogram stats
  percentiles = [50.0, 90.0, 99.0, 99.9, 99.95, 100.0]

  ## separator to use between elements of a statsd metric
  metric_separator = "_"

  ## Parses tags in the datadog statsd format
  ## http://docs.datadoghq.com/guides/dogstatsd/
  parse_data_dog_tags = true

  ## Parses datadog extensions to the statsd format
  datadog_extensions = true

  ## Parses distributions metric as specified in the datadog statsd format
  ## https://docs.datadoghq.com/developers/metrics/types/?tab=distribution#definition
  datadog_distributions = true

  ## We do not need following tags(they may create tremendous of time-series under influxdb's logic)
  # Examples:
  # "runtime-id", "metric-type"
  drop_tags = [ ]

  # All metric-name prefixed with 'jvm_' are set to influxdb's measurement 'jvm'
  # All metric-name prefixed with 'stats_' are set to influxdb's measurement 'stats'
  # Examples:
  # "stats_:stats", "jvm_:jvm", "tomcat_:tomcat",
  metric_mapping = [ ]

  ## Number of UDP messages allowed to queue up, once filled,
  ## the statsd server will start dropping packets, default is 128.
  # allowed_pending_messages = 128

  ## Number of timing/histogram values to track per-measurement in the
  ## calculation of percentiles. Raising this limit increases the accuracy
  ## of percentiles but also increases the memory usage and cpu time.
  percentile_limit = 1000

  ## Max duration (TTL) for each metric to stay cached/reported without being updated.
  #max_ttl = "1000h"

  [inputs.statsd.tags]
  # some_tag = "some_value"
  # more_tag = "some_other_value"
```
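After editing the configuration, restart DataKit.
In Kubernetes, the collector can currently be enabled by injecting the collector configuration through a ConfigMap.
Info
If the log shows a large number of Feed: io busy messages, set interval = '1s'. The minimum value is 1s.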
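The `metric_mapping` option in the sample configuration uses `"prefix:measurement"` rules such as `"jvm_:jvm"`. The sketch below illustrates that idea only and is not DataKit's implementation. It assumes the matched prefix is stripped from the field name, which is consistent with field names such as `heap_memory` under the `jvm` measurement. The function name `apply_metric_mapping` is hypothetical.

```python
def apply_metric_mapping(metric_name, metric_mapping):
    """Split a metric name into (measurement, field) using "prefix:measurement" rules."""
    for rule in metric_mapping:
        prefix, _, measurement = rule.partition(":")
        if prefix and metric_name.startswith(prefix):
            # Assumption: the matched prefix is removed from the field name.
            return measurement, metric_name[len(prefix):]
    # No rule matched; what DataKit does in this case is not covered here.
    return None, metric_name


if __name__ == "__main__":
    rules = ["stats_:stats", "jvm_:jvm", "tomcat_:tomcat"]
    print(apply_metric_mapping("jvm_heap_memory", rules))
    print(apply_metric_mapping("unmapped_metric", rules))
```

Rules are checked in order, so if two prefixes overlap, the first matching rule in the list wins in this sketch.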
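Marking the data source
To mark the host that DDTrace collects from, inject tags in one of the following ways:
- Use the environment variable DD_TAGS, for example: DD_TAGS=source_key:tomcat,host_key:cn-shanghai-sq5ei
- Use the command-line option dd.tags, for example: -Ddd.tags=source_key:tomcat,host_key:cn-shanghai-sq5ei
In the examples above, the DataKit configuration must specify that the source key is source_key and the host key is host_key (statsd_source_key and statsd_host_key). Other key names also work, but the names configured in DataKit must match the names used in DDTrace.
As a result, datakit monitor shows statsd/tomcat/cn-shanghai-sq5ei, which distinguishes this source from other data sources that also report to the statsd collector. Without this configuration, datakit monitor shows the default: statsd/-/-.
In addition, the save_above_key switch determines whether the tags named by statsd_source_key and statsd_host_key are reported to the center. By default they are not reported (false).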
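The behavior described in this section can be sketched as follows. This is an illustration under stated assumptions, not DataKit's code: `parse_dd_tags` and `monitor_label` are hypothetical helpers, and the sketch assumes that a missing key falls back to `-`, as in the default label `statsd/-/-`.

```python
def parse_dd_tags(raw):
    """Parse "k1:v1,k2:v2" (the DD_TAGS / dd.tags format) into a dict."""
    tags = {}
    for item in raw.split(","):
        key, sep, value = item.strip().partition(":")
        if sep and key:
            tags[key] = value
    return tags


def monitor_label(tags, source_key="source_key", host_key="host_key",
                  save_above_key=False):
    """Return (label, reported_tags) for one metric's tags."""
    source = tags.get(source_key, "-")
    host = tags.get(host_key, "-")
    label = f"statsd/{source}/{host}"
    if save_above_key:
        reported = dict(tags)
    else:
        # Drop the two marker tags so they are not reported.
        reported = {k: v for k, v in tags.items()
                    if k not in (source_key, host_key)}
    return label, reported


if __name__ == "__main__":
    tags = parse_dd_tags("source_key:tomcat,host_key:cn-shanghai-sq5ei")
    print(monitor_label(tags))
    print(monitor_label({}))
```

With `save_above_key=True`, the same call keeps `source_key` and `host_key` in the reported tags.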
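Metrics
All data collected below has a global tag named host appended by default (its value is the hostname of the machine where DataKit runs). Additional tags can be specified in the configuration through [inputs.statsd.tags].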
jvm
| Tags & Fields | Description |
|---|---|
| host (tag) | Host name. |
| instance (tag) | Instance name. |
| jmx_domain (tag) | JMX domain. |
| metric_type (tag) | Metric type. |
| name (tag) | Type name. |
| runtime-id (tag) | Runtime id. |
| service (tag) | Service name. |
| type (tag) | Object type. |
| buffer_pool_direct_capacity | Measure of total memory capacity of direct buffers. Type: float (gauge). Unit: digital,B |
| buffer_pool_direct_count | Number of direct buffers in the pool. Type: float (gauge). Unit: count |
| buffer_pool_direct_used | Measure of memory used by direct buffers. Type: float (gauge). Unit: digital,B |
| buffer_pool_mapped_capacity | Measure of total memory capacity of mapped buffers. Type: float (gauge). Unit: digital,B |
| buffer_pool_mapped_count | Number of mapped buffers in the pool. Type: float (gauge). Unit: count |
| buffer_pool_mapped_used | Measure of memory used by mapped buffers. Type: float (gauge). Unit: digital,B |
| cpu_load_process | Recent CPU utilization for the process. Type: float (gauge). Unit: percent,percent |
| cpu_load_system | Recent CPU utilization for the whole system. Type: float (gauge). Unit: percent,percent |
| daemon_code_cache_used | The number of daemon threads. Type: float (count). Unit: count |
| daemon_thread_count | Daemon thread count. Type: float (gauge). Unit: count |
| gc_code_cache_used | GC code cache used. Type: float (gauge). Unit: count |
| gc_eden_size | The 'eden' size in garbage collection. Type: float (gauge). Unit: digital,B |
| gc_major_collection_count | The rate of major garbage collections. Type: float (gauge). Unit: count |
| gc_major_collection_time | The fraction of time spent (rate) in major garbage collection. Type: float (gauge). Unit: time,ms |
| gc_metaspace_size | The metaspace size in garbage collection. Type: float (gauge). Unit: digital,B |
| gc_minor_collection_count | The rate of minor garbage collections. Type: float (gauge). Unit: count |
| gc_minor_collection_time | The fraction of time spent (rate) in minor garbage collection. Type: float (gauge). Unit: time,ms |
| gc_old_gen_size | The old gen size in garbage collection. Type: float (gauge). Unit: digital,B |
| gc_survivor_size | The survivor size in garbage collection. Type: float (gauge). Unit: digital,B |
| heap_memory | The total Java heap memory used. Type: float (gauge). Unit: digital,B |
| heap_memory_committed | The total Java heap memory committed to be used. Type: float (gauge). Unit: digital,B |
| heap_memory_init | The initial Java heap memory allocated. Type: float (gauge). Unit: digital,B |
| heap_memory_max | The maximum Java heap memory available. Type: float (gauge). Unit: digital,B |
| loaded_classes | Number of classes currently loaded. Type: float (gauge). Unit: count |
| non_heap_memory | The total Java non-heap memory used. Non-heap memory is: Metaspace + CompressedClassSpace + CodeCache. Type: float (gauge). Unit: digital,B |
| non_heap_memory_committed | The total Java non-heap memory committed to be used. Type: float (gauge). Unit: digital,B |
| non_heap_memory_init | The initial Java non-heap memory allocated. Type: float (gauge). Unit: digital,B |
| non_heap_memory_max | The maximum Java non-heap memory available. Type: float (gauge). Unit: digital,B |
| os_open_file_descriptors | The number of file descriptors used by this process (only available for processes run as the dd-agent user). Type: float (gauge). Unit: count |
| peak_thread_count | The peak number of live threads. Type: float (count). Unit: count |
| thread_count | The number of live threads. Type: float (count). Unit: count |
| total_thread_count | The number of total threads. Type: float (count). Unit: count |
jmx
| Tags & Fields | Description |
|---|---|
| host (tag) | Host name. |
| instance (tag) | Instance name. |
| jmx_domain (tag) | JMX domain. |
| metric_type (tag) | Metric type. |
| name (tag) | Type name. |
| runtime-id (tag) | Runtime id. |
| service (tag) | Service name. |
| type (tag) | Object type. |
| gc_cms.count | The total number of garbage collections that have occurred. Type: float (count). Unit: count |
| gc_major_collection_count | The rate of major garbage collections. Type: float (gauge). Unit: count |
| gc_major_collection_time | The fraction of time spent in major garbage collection. Set new_gc_metrics: true to receive this metric. Type: float (gauge). Unit: PPM |
| gc_minor_collection_count | The rate of minor garbage collections. Type: float (gauge). Unit: count |
| gc_minor_collection_time | The fraction of time spent in minor garbage collection. Set new_gc_metrics: true to receive this metric. Type: float (gauge). Unit: PPM |
| gc_parnew.time | The approximate accumulated garbage collection time elapsed. Type: float (gauge). Unit: time,ms |
| heap_memory | The total Java heap memory used. Type: float (gauge). Unit: digital,B |
| heap_memory_committed | The total Java heap memory committed to be used. Type: float (gauge). Unit: digital,B |
| heap_memory_init | The initial Java heap memory allocated. Type: float (gauge). Unit: digital,B |
| heap_memory_max | The maximum Java heap memory available. Type: float (gauge). Unit: digital,B |
| non_heap_memory | The total Java non-heap memory used. Non-heap memory is calculated as follows: Metaspace + CompressedClassSpace + CodeCache. Type: float (gauge). Unit: digital,B |
| non_heap_memory_committed | The total Java non-heap memory committed to be used. Type: float (gauge). Unit: digital,B |
| non_heap_memory_init | The initial Java non-heap memory allocated. Type: float (gauge). Unit: digital,B |
| non_heap_memory_max | The maximum Java non-heap memory available. Type: float (gauge). Unit: digital,B |
| thread_count | The number of live threads. Type: float (count). Unit: count |
ddtrace
| Tags & Fields | Description |
|---|---|
| endpoint (tag) | Endpoint. |
| host (tag) | Host name. |
| lang (tag) | Lang type. |
| lang_interpreter (tag) | Lang interpreter. |
| lang_interpreter_vendor (tag) | Lang interpreter vendor. |
| lang_version (tag) | Lang version. |
| metric_type (tag) | Metric type. |
| priority (tag) | Priority. |
| service (tag) | Service name. |
| stat (tag) | Stat. |
| tracer_version (tag) | Tracer version. |
| tracer_agent_discovery_time | Tracer agent discovery time. Type: float (gauge). Unit: time,ms |
| tracer_api_errors_total | Tracer api errors total. Type: float (gauge). Unit: count |
| tracer_api_requests_total | Tracer api requests total. Type: float (gauge). Unit: count |
| tracer_flush_bytes_total | Tracer flush bytes total. Type: float (gauge). Unit: count |
| tracer_flush_traces_total | Tracer flush traces total. Type: float (gauge). Unit: count |
| tracer_queue_enqueued_bytes | Tracer queue enqueued bytes. Type: float (gauge). Unit: count |
| tracer_queue_enqueued_spans | Tracer queue enqueued spans. Type: float (gauge). Unit: count |
| tracer_queue_enqueued_traces | Tracer queue enqueued traces. Type: float (gauge). Unit: count |
| tracer_queue_max_length | Tracer queue max length. Type: float (gauge). Unit: count |
| tracer_scope_activate_count | Tracer scope activate count. Type: float (gauge). Unit: count |
| tracer_scope_close_count | Tracer scope close count. Type: float (gauge). Unit: count |
| tracer_span_pending_created | Tracer span pending created. Type: float (gauge). Unit: count |
| tracer_span_pending_finished | Tracer span pending finished. Type: float (gauge). Unit: count |
| tracer_trace_agent_discovery_time | Tracer trace agent discovery time. Type: float (gauge). Unit: count |
| tracer_trace_agent_send_time | Tracer trace agent send time. Type: float (gauge). Unit: count |
| tracer_trace_pending_created | Tracer trace pending created. Type: float (gauge). Unit: count |
| tracer_tracer_trace_buffer_fill_time | Tracer trace buffer fill time. Type: float (gauge). Unit: count |