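DDTrace
DDTrace is DataDog's open-source APM product. The DDTrace Agent embedded in Datakit receives, processes, and analyzes data in the DataDog Tracing protocol.
DDTrace documentation and examples
Tip
We have extended DDTrace with additional features to support more mainstream frameworks and finer-grained tracing.
Configuration
Go to the conf.d/ddtrace directory under the DataKit installation directory, copy ddtrace.conf.sample, and rename the copy to ddtrace.conf. An example follows: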
[[inputs.ddtrace]]
## DDTrace Agent endpoints, registered per protocol version.
## An endpoint stops being served once it is removed from the list.
## NOTE: DO NOT EDIT.
endpoints = ["/v0.3/traces", "/v0.4/traces", "/v0.5/traces"]
## ignore_tags works as a blacklist that prevents matching tags from being sent to the data center.
## Every value in this list must be a valid regular expression.
# ignore_tags = ["block1", "block2"]
## Switch for keeping rare tracing resources.
## If a resource is rare enough (not present within 1 hour), it is always sent
## to the data center, bypassing samplers and filters.
# keep_rare_resource = false
## By default, every span that carries an error is sent to the data center, bypassing all filters
## and samplers. To drop spans with certain error statuses, list those statuses here.
# omit_err_status = ["404"]
## compatible_otel: make DDTrace traces compatible with OTEL traces
## by encoding span_id and parent_id as hex.
# compatible_otel=true
## Map of tracing resources to ignore, in the form service:[resources...].
## The service name is the full service name in the current application.
## The resource list holds regular expressions used to block resource names.
## To block some resources under all services, set the
## service name to "*". Note: the double quotes "" cannot be omitted.
# [inputs.ddtrace.close_resource]
# service1 = ["resource1", "resource2", ...]
# service2 = ["resource1", "resource2", ...]
# "*" = ["close_resource_under_all_services"]
# ...
## The sampler config sets the global sampling strategy.
## sampling_rate sets the global sampling rate.
# [inputs.ddtrace.sampler]
# sampling_rate = 1.0
# [inputs.ddtrace.tags]
# key1 = "value1"
# key2 = "value2"
# ...
## The threads config controls how many goroutines the agent can start to handle HTTP requests.
## buffer is the size of the worker channel's job buffer.
## threads is the total number of goroutines at run time.
# [inputs.ddtrace.threads]
# buffer = 100
# threads = 8
## The storage config sets up local storage space on the hard drive to cache trace data.
## path is the local file path used to cache data.
## capacity is the total space (MB) used to store data.
# [inputs.ddtrace.storage]
# path = "./ddtrace_storage"
# capacity = 5120
After configuring, restart DataKit.
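The close_resource semantics described in the sample configuration (per-service lists of regular expressions, with "*" applying to every service) can be illustrated with a small Python sketch. This is not Datakit's implementation: the rule values, the is_closed helper, and the choice of re.search instead of a full match are all assumptions made for illustration.

```python
import re

# Hypothetical rules, shaped like the commented [inputs.ddtrace.close_resource]
# table in ddtrace.conf. The patterns themselves are made up for this sketch.
CLOSE_RESOURCE = {
    "service1": [r"^GET /healthz$", r"^SELECT 1$"],
    "*": [r"^OPTIONS "],  # applies to every service
}


def is_closed(service, resource, rules=CLOSE_RESOURCE):
    """Return True if `resource` matches a pattern listed for `service` or for "*"."""
    patterns = rules.get(service, []) + rules.get("*", [])
    # Assumption: a pattern blocks a resource when it matches anywhere in the name.
    return any(re.search(pattern, resource) for pattern in patterns)


if __name__ == "__main__":
    print(is_closed("service1", "GET /healthz"))  # blocked by the service1 rule
    print(is_closed("service2", "GET /healthz"))  # no rule for service2 matches
    print(is_closed("service2", "OPTIONS /api"))  # blocked by the "*" rule
```

Anchoring the patterns with ^ and $ keeps a rule from blocking more resource names than intended.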
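Currently, the collector can be enabled by injecting its configuration through a ConfigMap.
The environment variables supported in Kubernetes are listed in the following table: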
Environment variable | Type | Example |
---|---|---|
ENV_INPUT_DDTRACE_ENDPOINTS | JSON string | ["/v0.3/traces", "/v0.4/traces", "/v0.5/traces"] |
ENV_INPUT_DDTRACE_IGNORE_TAGS | JSON string | ["block1", "block2"] |
ENV_INPUT_DDTRACE_KEEP_RARE_RESOURCE | bool | true |
ENV_INPUT_DDTRACE_COMPATIBLE_OTEL | bool | true |
ENV_INPUT_DDTRACE_OMIT_ERR_STATUS | JSON string | ["404", "403", "400"] |
ENV_INPUT_DDTRACE_CLOSE_RESOURCE | JSON string | {"service1":["resource1"], "service2":["resource2"], "service3":["resource3"]} |
ENV_INPUT_DDTRACE_SAMPLER | float | 0.3 |
ENV_INPUT_DDTRACE_TAGS | JSON string | {"k1":"v1", "k2":"v2", "k3":"v3"} |
ENV_INPUT_DDTRACE_THREADS | JSON string | {"buffer":1000, "threads":100} |
ENV_INPUT_DDTRACE_STORAGE | JSON string | {"storage":"./ddtrace_storage", "capacity": 5120} |
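Because most of these variables take JSON strings, hand-written values are easy to get wrong (quoting, trailing commas). A minimal sketch of generating them with Python's json module follows; the values are copied from the example column above, and the to_env_value helper is a hypothetical name used only here.

```python
import json

# Values taken from the example column of the table above.
settings = {
    "ENV_INPUT_DDTRACE_IGNORE_TAGS": ["block1", "block2"],
    "ENV_INPUT_DDTRACE_KEEP_RARE_RESOURCE": True,
    "ENV_INPUT_DDTRACE_OMIT_ERR_STATUS": ["404", "403", "400"],
    "ENV_INPUT_DDTRACE_CLOSE_RESOURCE": {"service1": ["resource1"]},
    "ENV_INPUT_DDTRACE_SAMPLER": 0.3,
    "ENV_INPUT_DDTRACE_THREADS": {"buffer": 1000, "threads": 100},
}


def to_env_value(value):
    """Serialize a Python value into the string form an environment variable carries."""
    # json.dumps renders booleans as true/false, numbers as plain literals,
    # and lists/dicts as JSON strings, which matches the types in the table.
    return json.dumps(value)


env = {name: to_env_value(value) for name, value in settings.items()}

if __name__ == "__main__":
    for name, value in env.items():
        print(f"{name}={value}")
```

The resulting strings can then be pasted into the env section of the DataKit workload manifest.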
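Attention
- Do not modify the endpoints list here unless you clearly understand the configuration logic and its effects.
- To disable sampling (that is, to collect all data), do not merely comment out the sampling_rate = 1.0 line; the [inputs.ddtrace.sampler] line must be commented out along with it. Otherwise the collector treats sampling_rate as 0.0 and all data is dropped.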
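HTTP settings
If trace data is sent from other machines, DataKit's HTTP settings must be configured accordingly.
Once DDTrace data is being sent to Datakit, it shows up in the DataKit monitor.
Enabling the disk cache
If the volume of trace data is large, it can be cached temporarily on disk and processed later, so that it does not impose heavy resource overhead on the host. The [inputs.ddtrace.storage] section of the sample configuration above controls this cache.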
DDtrace SDK configuration
After configuring the collector, you can also configure the DDtrace SDK side.
Environment variables
- DD_TRACE_ENABLED: Enable global tracer (supported on some language platforms)
- DD_AGENT_HOST: DDtrace agent host address
- DD_TRACE_AGENT_PORT: DDtrace agent host port
- DD_SERVICE: Service name
- DD_TRACE_SAMPLE_RATE: Set sampling rate
- DD_VERSION: Application version (optional)
- DD_TRACE_STARTUP_LOGS: DDtrace logger
- DD_TRACE_DEBUG: DDtrace debug mode
- DD_ENV: Application env values
- DD_TAGS: Application
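One way to pass these variables to a traced application is to launch it with an extended environment. The sketch below does this with Python's subprocess module. All values are placeholders chosen for illustration (including the host and port, which depend on where DataKit listens in your deployment), and build_env and run_with_tracing_env are hypothetical helper names.

```python
import os
import subprocess
import sys

# Placeholder values for illustration; adjust them to your deployment.
DD_ENV_VARS = {
    "DD_AGENT_HOST": "localhost",
    "DD_TRACE_AGENT_PORT": "9529",  # assumed DataKit HTTP port
    "DD_SERVICE": "my-service",
    "DD_ENV": "testing",
    "DD_VERSION": "1.0.0",
    "DD_TRACE_SAMPLE_RATE": "0.5",
}


def build_env(overrides):
    """Copy the current environment and layer the DD_* variables on top."""
    env = dict(os.environ)
    env.update(overrides)
    return env


def run_with_tracing_env(cmd):
    """Run `cmd` with the DD_* variables set and capture its output."""
    return subprocess.run(
        cmd, env=build_env(DD_ENV_VARS), capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    # Stand-in for the real application: a child process that echoes one variable.
    child = [sys.executable, "-c", "import os; print(os.environ['DD_SERVICE'])"]
    print(run_with_tracing_env(child).stdout.strip())
```

Copying os.environ first keeps PATH and other inherited variables intact for the child process.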
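Besides setting the project name, environment name, and version number when the application initializes, they can be set in the following two ways:
- Inject environment variables on the command line
- Configure custom tags directly in ddtrace.conf. This affects all data sent to the Datakit tracing service, so consider it carefully:
# tags holds the key-value pairs configured for ddtrace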
[inputs.ddtrace.tags]
some_tag = "some_value"
more_tag = "some_other_value"
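Adding business tags in code
In application code, business-specific tags can be set with calls such as span.SetTag(some-tag-key, some-tag-value) (the exact form differs by language). Datakit recognizes and extracts these tags once they are listed under customer_tags in ddtrace.conf.
Note that these tag keys must not contain the character '.'; any '.' in a tag key is replaced with _.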
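The key rewrite described above can be previewed on the application side before choosing tag names. The sketch below is a stand-alone illustration of the stated rule ('.' becomes '_'), not Datakit's code; normalize_tag_key and normalize_tags are hypothetical helper names.

```python
def normalize_tag_key(key: str) -> str:
    """Apply the documented rule: every '.' in a tag key becomes '_'."""
    return key.replace(".", "_")


def normalize_tags(tags: dict) -> dict:
    """Normalize every key of a tag dict, rejecting keys that would collide."""
    result = {}
    for key, value in tags.items():
        new_key = normalize_tag_key(key)
        if new_key in result:
            # Two distinct keys such as "order.id" and "order_id" end up identical.
            raise ValueError(f"tag key collision after normalization: {new_key}")
        result[new_key] = value
    return result


if __name__ == "__main__":
    print(normalize_tags({"order.id": "42", "tenant": "acme"}))
```

Choosing dot-free keys from the start avoids the collision case entirely and keeps the names in code identical to the ones listed under customer_tags.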
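Notes on adding business tags in application code
- After adding a tag in application code, the corresponding tag key must also be added to the customer_tags list in ddtrace.conf; otherwise Datakit does not extract that business tag.
- When sampling is enabled, some spans that carry the tag may be dropped.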
Trace fields
ddtrace
- Tags
Tag | Description |
---|---|
container_host | Container hostname. Available in OpenTelemetry. Optional. |
endpoint | Endpoint info. Available in SkyWalking, Zipkin. Optional. |
env | Application environment info. Available in Jaeger. Optional. |
host | Hostname. |
http_method | HTTP request method name. Available in DDTrace, OpenTelemetry. Optional. |
http_route | HTTP route. Optional. |
http_status_code | HTTP response code. Available in DDTrace, OpenTelemetry. Optional. |
http_url | HTTP URL. Optional. |
operation | Span name |
project | Project name. Available in Jaeger. Optional. |
service | Service name. Optional. |
source_type | Tracing source type |
span_type | Span type |
status | Span status |
version | Application version info. Available in Jaeger. Optional. |
- Metric list
Metric | Description | Type | Unit |
---|---|---|---|
duration | Duration of span | int | μs |
message | Original content of span | string | - |
parent_id | Parent span ID of current span | string | - |
pid | Application process ID. Available in DDTrace, OpenTelemetry. Optional. | string | - |
priority | Optional. | int | - |
resource | Resource name that produced the current span | string | - |
span_id | Span ID | string | - |
start | Start time of span | int | μs |
trace_id | Trace ID | string | - |
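Since both start and duration are integers in microseconds, converting them to timestamps takes a little care with units. The sketch below shows one way to do it in Python. It assumes that start counts microseconds since the Unix epoch, which the table does not state explicitly, and span_window is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: `start` is measured from the Unix epoch (1970-01-01 UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def span_window(start_us: int, duration_us: int):
    """Return (start, end) datetimes for a span given microsecond fields."""
    # timedelta arithmetic stays exact for integer microseconds,
    # unlike dividing by 1e6 and going through a float timestamp.
    start = EPOCH + timedelta(microseconds=start_us)
    end = start + timedelta(microseconds=duration_us)
    return start, end


if __name__ == "__main__":
    begin, finish = span_window(1_700_000_000_000_000, 1_500)
    print(begin.isoformat(), finish.isoformat())
    print((finish - begin) / timedelta(milliseconds=1), "ms")
```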