ebpftrace¶
Configuration¶
Preconditions¶
The collector enables sampling by default. The default sampling rate is 0.1, meaning that 10% of traces (links) are sampled.
If the data volume is 1e6 span/min, at least 4 CPU cores and 4 GB of memory are currently required.
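As a rough illustration of what a 0.1 sampling rate means at the trace level, the sketch below keeps or drops a whole trace based on a hash of its ID, and converts the 1e6 span/min figure into a per-second rate. The hash-based decision and the function names are assumptions made for this example; the collector's actual sampling algorithm is not described here.

```python
import hashlib

SAMPLING_RATE = 0.1  # default rate from the text: keep roughly 10% of traces


def keep_trace(trace_id: str, rate: float = SAMPLING_RATE) -> bool:
    """Decide deterministically whether a whole trace is kept.

    Hashing the trace ID (an assumption of this sketch) gives every span
    of the same trace the same keep/drop decision.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a fraction between 0 and 1.
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


def spans_per_second(spans_per_minute: float) -> float:
    """Convert a per-minute span volume into a per-second rate."""
    return spans_per_minute / 60.0


if __name__ == "__main__":
    ids = [f"trace-{i}" for i in range(100_000)]
    kept = sum(keep_trace(t) for t in ids)
    print(f"kept {kept} of {len(ids)} traces ({kept / len(ids):.1%})")
    print(f"1e6 span/min is about {spans_per_second(1e6):.0f} span/s")
```

Deciding per trace rather than per span keeps sampled traces complete, which matters for a collector whose job is to rebuild parent-child relationships.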
The ebpftrace collector receives and links eBPF spans, generates the trace_id of each trace, and establishes the parent-child relationships between spans.
Deploy it as follows: the data generated by the ebpf-trace plug-in of every ebpf external collector must be sent to the same DataKit, namely the one that has the ebpftrace collector enabled. That DataKit reprocesses all the eBPF span data produced by the eBPF collectors and uploads it to Guance Cloud in a unified manner.
Suppose three applications, App 1 to App 3, of a service are located on two different nodes. ebpftrace currently uses TCP sequence numbers (tcp seq) to confirm the network call relationships between processes, so the related eBPF spans from both nodes must be brought together and linked in order to generate trace_id and set parent_id.
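The linking step described above can be sketched roughly as follows. This is a simplified illustration, not the collector's implementation: the Span fields (conn, tcp_seq, local_parent) and the way trace_id is derived from the root span are hypothetical choices for this example. The idea is that the client side and the server side of one network call observe the same connection and TCP sequence number, so the server-side span can take the client-side span as its parent.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Span:
    span_id: str
    direction: str                      # "outgoing" (client side) or "incoming" (server side)
    conn: Tuple[str, str]               # hypothetical connection key: (client addr, server addr)
    tcp_seq: int                        # TCP sequence number of the request
    local_parent: Optional[str] = None  # hypothetical: incoming span handled by the same process
    parent_id: Optional[str] = None
    trace_id: Optional[str] = None


def link_spans(spans: List[Span]) -> None:
    """Set parent_id and trace_id in place for one batch of spans."""
    by_id = {s.span_id: s for s in spans}
    # Index client-side spans by what both ends of the call can observe.
    outgoing = {(s.conn, s.tcp_seq): s for s in spans if s.direction == "outgoing"}

    for s in spans:
        if s.direction == "incoming":
            caller = outgoing.get((s.conn, s.tcp_seq))
            s.parent_id = caller.span_id if caller else None
        elif s.local_parent in by_id:
            s.parent_id = s.local_parent

    # Walk up to the root span; every span under one root shares a trace_id.
    for s in spans:
        root, seen = s, {s.span_id}
        while root.parent_id is not None and root.parent_id not in seen:
            root = by_id[root.parent_id]
            seen.add(root.span_id)
        s.trace_id = f"trace-{root.span_id}"


if __name__ == "__main__":
    # App 1 calls App 2, which calls App 3.
    c12 = ("10.0.0.1:40000", "10.0.0.2:8080")
    c23 = ("10.0.0.2:40001", "10.0.0.2:9090")
    spans = [
        Span("app1-out", "outgoing", c12, 1000),
        Span("app2-in", "incoming", c12, 1000),
        Span("app2-out", "outgoing", c23, 5000, local_parent="app2-in"),
        Span("app3-in", "incoming", c23, 5000),
    ]
    link_spans(spans)
    for s in spans:
        print(s.span_id, "parent:", s.parent_id, "trace:", s.trace_id)
```

Because the two halves of a call are reported from different nodes, the match can only be made where both halves are visible, which is why all eBPF spans are sent to one DataKit.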
Collector Configuration¶
Go to the conf.d/ebpftrace directory under the DataKit installation directory, copy ebpftrace.conf.sample, and name the copy ebpftrace.conf. An example follows:
[[inputs.ebpftrace]]
sqlite_path = "/usr/local/datakit/ebpf_spandb"
use_app_trace_id = true
window = "20s"
sampling_rate = 0.1
The default configuration does not enable eBPF-bash. To enable it, add ebpf-bash to the enabled_plugins configuration item.
After configuration, restart DataKit.
When deploying the collector in Kubernetes, limit the number of replicas to 1. Use the following YAML as a reference; you need to set ENV_DATAWAY and image in it:
apiVersion: v1
kind: Namespace
metadata:
  name: datakit-ebpftrace
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: datakit-ebpftrace
  labels:
    app: deployment-datakit-ebpftrace
  namespace: datakit-ebpftrace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: deployment-datakit-ebpftrace
  template:
    metadata:
      labels:
        app: deployment-datakit-ebpftrace
    spec:
      containers:
      - name: datakit-ebpftrace
        image:
        imagePullPolicy: Always
        ports:
        - containerPort: 9529
          protocol: TCP
        - containerPort: 6060
        resources:
          requests:
            cpu: "200m"
            memory: "256Mi"
          limits:
            cpu: "4000m"
            memory: "8Gi"
        env:
        - name: HOST_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP
        - name: ENV_K8S_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: ENV_HTTP_LISTEN
          value: 0.0.0.0:9529
        - name: ENV_DATAWAY
          value: https://openway.guance.com?token=<xxx>
        - name: ENV_GLOBAL_TAGS
          value: host=__datakit_hostname,host_ip=__datakit_ip
        - name: ENV_DEFAULT_ENABLED_INPUTS
          value: ebpftrace
        - name: ENV_INPUT_EBPFTRACE_WINDOW
          value: "20s"
        - name: ENV_INPUT_EBPFTRACE_SAMPLING_RATE
          value: "0.1"
        - name: ENV_ENABLE_PPROF
          value: "true"
        - name: ENV_PPROF_LISTEN
          value: "0.0.0.0:6060"
---
apiVersion: v1
kind: Service
metadata:
  name: datakit-ebpftrace-service
  namespace: datakit-ebpftrace
spec:
  selector:
    app: deployment-datakit-ebpftrace
  ports:
    - protocol: TCP
      port: 9529
      targetPort: 9529
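Since every eBPF collector has to reach this single DataKit, it can help to know the address under which the Service above is reachable from other pods. The following small sketch builds the conventional in-cluster DNS name from the Service name, namespace, and port used in the manifest. It assumes the common cluster domain cluster.local, which a cluster may override; which setting on the eBPF collector side takes this address is not covered in this section.

```python
def service_address(service: str, namespace: str, port: int,
                    cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster DNS address of a Kubernetes Service.

    The cluster domain defaults to "cluster.local" (an assumption; it is
    configurable per cluster).
    """
    return f"{service}.{namespace}.svc.{cluster_domain}:{port}"


if __name__ == "__main__":
    # Names and port taken from the Service definition in the manifest.
    print(service_address("datakit-ebpftrace-service", "datakit-ebpftrace", 9529))
```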
The ebpftrace collection configuration in Kubernetes can be adjusted through the following environment variables:
- ENV_INPUT_EBPFTRACE_SQLITE_PATH
  SQLite database file storage path
  Type: String
  ConfField: sqlite_path
  Example: /usr/local/datakit/ebpf_spandb/
- ENV_INPUT_EBPFTRACE_USE_APP_TRACE_ID
  Use application-side trace id instead of eBPF trace id
  Type: Boolean
  ConfField: use_app_trace_id
  Default: false
- ENV_INPUT_EBPFTRACE_WINDOW
  Span's link time window
  Type: TimeDuration
  ConfField: window
  Default: 20s
- ENV_INPUT_EBPFTRACE_SAMPLING_RATE
  Link sampling rate
  Type: Float
  ConfField: sampling_rate
  Example: 0.1
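To make the mapping between the environment variables and the configuration fields concrete, here is a minimal sketch that reads the four variables and converts them according to the types listed above. It only illustrates the correspondence and is not DataKit's own loader; the accepted spellings for boolean values and the use of None for fields without a listed default are assumptions of this example.

```python
import os
from typing import Mapping, Optional


def parse_bool(value: str) -> bool:
    """Interpret common truthy spellings (an assumption of this sketch)."""
    return value.strip().lower() in ("1", "true", "yes", "on")


# (environment variable, ConfField, converter, default) from the list above.
# None means the list gives an example but no default.
FIELDS = [
    ("ENV_INPUT_EBPFTRACE_SQLITE_PATH", "sqlite_path", str, None),
    ("ENV_INPUT_EBPFTRACE_USE_APP_TRACE_ID", "use_app_trace_id", parse_bool, False),
    ("ENV_INPUT_EBPFTRACE_WINDOW", "window", str, "20s"),
    ("ENV_INPUT_EBPFTRACE_SAMPLING_RATE", "sampling_rate", float, None),
]


def config_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the ebpftrace settings found in env, keyed by ConfField."""
    env = os.environ if env is None else env
    conf = {}
    for name, conf_field, convert, default in FIELDS:
        raw = env.get(name)
        conf[conf_field] = convert(raw) if raw is not None else default
    return conf


if __name__ == "__main__":
    example_env = {
        "ENV_INPUT_EBPFTRACE_WINDOW": "20s",
        "ENV_INPUT_EBPFTRACE_SAMPLING_RATE": "0.1",
    }
    print(config_from_env(example_env))
```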
Metric¶
For all of the following data collections, a global tag named host is appended by default (the tag value is the host name of the DataKit). Other tags can be specified in the configuration through [inputs.ebpftrace.tags]: