Skip to content

Datakit Tracing Data Structure

Brief

This article is used to explain the data structure of the mainstream Telemetry platform and the mapping relationship with the data structure of Datakit platform. Currently supported data structures: DataDog, Jaeger, OpenTelemetry, SkyWalking, Zipkin.

Data conversion steps:

  1. External Tracing data structure access
  2. Datakit Span transformation
  3. span data operations
  4. Line Protocol transformation

Datakit Point Protocol Data Structure

  • Tags
Tag Description
container_host Host name of container
endpoint End point of resource
env Environment arguments
http_host HTTP host
http_method HTTP method
http_route HTTP route
http_status_code HTTP status code
http_url HTTP URL
operation Operation of resource
pid Process id
project Project name
service Service name
source_type Source types [app/framework/cache/message_queue/custom/db/web/...]
span_type Span types
status Span status [ok/info/warning/error/critical]
  • Field
Metric Description Type Unit
create_time Guancedb storage create timestamp 1 int s
duration Span duration int us
message Raw data content string
parent_id Parent ID of span string
priority priority rules string
resource Resource of service string
span_id Span ID string
start Span start timestamp int us
time Datakit received timestamp int ns
trace_id Trace ID string

span_type is the relative position of the current span in trace, and its value is described as follows:

  • entry: the current API is the first call after the entry of the link into the service
  • local: the current API is the api after the entrance and before the exit
  • exit: the current API is the link's last call on the service
  • unknown: the relative position state of the current api is not clear

priority are samples priority rules for clients:

  • PRIORITY_USER_REJECT = -1 User chooses to reject reporting
  • PRIORITY_AUTO_REJECT = 0 Client sampler chooses to reject reporting
  • PRIORITY_AUTO_KEEP = 1 Client sampler select report
  • PRIORITY_USER_KEEP = 2 User chooses to report

Datakit Tracing Span Data Structure

TraceID    string                 `json:"trace_id"`
ParentID   string                 `json:"parent_id"`
SpanID     string                 `json:"span_id"`
Service    string                 `json:"service"`     // service name
Resource   string                 `json:"resource"`    // resource or api under service
Operation  string                 `json:"operation"`   // api name
Source     string                 `json:"source"`      // client tracer name
SpanType   string                 `json:"span_type"`   // relative span position in tracing: entry, local, exit or unknown
SourceType string                 `json:"source_type"` // service type
Tags       map[string]string      `json:"tags"`
Metrics    map[string]interface{} `json:"metrics"`
Start      int64                  `json:"start"`    // unit: nano sec
Duration   int64                  `json:"duration"` // unit: nano sec
Status     string                 `json:"status"`   // span status like error, ok, info etc.
Content    string                 `json:"content"`  // raw tracing data in json

Datakit Span is a data structure used internally by Datakit. The third-party Tracing Agent data structure is converted into a Datakit Span structure and sent to the data center.

Hereinafter referred to as dkspan

Field Name Data Type Unit Description Correspond To
TraceID string Trace ID dkproto.fields.trace_id
ParentID string Parent Span ID dkproto.fields.parent_id
SpanID string Span ID dkproto.fields.span_id
Service string Service Name dkproto.tags.service
Resource string Resource Name(.e.g /get/data/from/some/api) dkproto.fields.resource
Operation string The method name that produces this Span dkproto.tags.operation
Source string Span source(.e.g ddtrace) dkproto.name
SpanType string Span Type(.e.g Entry) dkproto.tags.span_type
SourceType string Span Source Type(.e.g Web) dkproto.tags.type
Tags map[string, string] Span Tags dkproto.tags
Metrics map[string, interface{}] Span Metrics(for calculation) dkproto.fields
Start int64 Nanosecond Span Starting time dkproto.fields.start
Duration int64 Nanosecond Time consuming dkproto.fields.duration
Status string Span status field dkproto.tags.status
Content string Span raw data dkproto.fields.message

DDTrace Trace&Span Data Structure

DDTrace Trace Data Structure

DataDog Trace Structure

Trace: []*span

DataDog Traces Structure

Traces: []Trace

DDTrace Span Data Structure

Field Name Data Type Unit Description Correspond To
TraceID uint64 Trace ID dkspan.TraceID
ParentID uint64 Parent Span ID dkspan.ParentID
SpanID uint64 Span ID dkspan.SpanID
Service string Server name dkspan.Service
Resource string Resource name dkspan.Resource
Name string The name of the method to produce this Span dkspan.Operation
Start int64 Nanosecond Span starting time dkspan.Start
Duration int64 Nanosecond Time consuming dkspan.Duration
Error int32 Span Status field 0: No error 1: Error dkspan.Status
Meta map[string, string] Span process metadata, environment-related, and service-related fields are obtained from here dkspan.Project, dkspan.Env, dkspan.Version, dkspan.ContainerHost, dkspan.HTTPMethod, dkspan.HTTPStatusCode
Metrics map[string, float64] Span sampling, computing related data Indirect correspondence to dkspan
Type string Span Type dkspan.SourceType

OpenTelemetry Tracing Data Structure

When DataKit collects data sent from OpenTelemetry exporter: Otlp, the abbreviated raw data, after serialization by json, looks like this:

resource_spans:{
    resource:{
        attributes:{key:"message.type"  value:{string_value:"message-name"}}
        attributes:{key:"service.name"  value:{string_value:"test-name"}}
    }
    instrumentation_library_spans:{instrumentation_library:{name:"test-tracer"}
    spans:{
        trace_id:"\x94<\xdf\x00zx\x82\xe7Wy\xfe\x93\xab\x19\x95a"
        span_id:".\xbd\x06c\x10ɫ*"
        parent_span_id:"\xa7*\x80Z#\xbeL\xf6"
        name:"Sample-0"
        kind:SPAN_KIND_INTERNAL
        start_time_unix_nano:1644312397453313100
        end_time_unix_nano:1644312398464865900
        status:{}
    }
    spans:{
           ...
        }
}

The correspondence between resource_spans and dkspan in otel is as follows:

Field Name Data Type Unit Description Correspond To
trace_id [16]byte Trace ID dkspan.TraceID
span_id [8]byte Span ID dkspan.SpanID
parent_span_id [8]byte Parent Span ID dkspan.ParentID
name string Span Name dkspan.Operation
kind string Span Type dkspan.SpanType
start_time_unix_nano int64 Nanosecond Span starting time dkspan.Start
end_time_unix_nano int64 Nanosecond Span ending time dkspan.Duration = end - start
status string Span Status dkspan.Status
name string resource Name dkspan.Resource
resource.attributes map[string]string resource tag dkspan.tags.service, dkspan.tags.project, dkspan.tags.env, dkspan.tags.version, dkspan.tags.container_host, dkspan.tags.http_method, dkspan.tags.http_status_code
span.attributes map[string]string Span tag dkspan.tags

otel has some unique fields, but DKspan has no corresponding fields, so it is placed in the label and will only be displayed if these values are not 0, such as:

Field Date Type Uint Description Correspond
span.dropped_attributes_count int Span number of tags removed dkspan.tags.dropped_attributes_count
span.dropped_events_count int Span number of events deleted dkspan.tags.dropped_events_count
span.dropped_links_count int Span number of connections deleted dkspan.tags.dropped_links_count
span.events_count int Number of Span associated events dkspan.tags.events_count
span.links_count int The number of spans associated with a span dkspan.tags.links_count

Jaeger Tracing Data Structure

Jaeger Thrift Protocol Batch Data Structure

Field Name Data Type Unit Description Correspond to
Process structure pointer Process-related data structure dkspan.Service
SeqNo int64 pointer Serial number Disconnected mapping relation dkspan
Spans array Span array structure See the table below
Stats structure pointer Client statistical structure not directly correspond to dkspan

Jaeger Thrift Protocol Span Data Structure

Field Name Data Type Unit Description Correspond To
TraceIdHigh int64 Trace ID High and TraceIdLow make up Trace ID dkspan.TraceID
TraceIdLow int64 Trace ID Low and TraceIdHigh make up Trace ID dkspan.TraceID
ParentSpanId int64 Parent Span ID dkspan.ParentID
SpanId int64 Span ID dkspan.SpanID
OperationName string The name of the method to produce this Span dkspan.Operation
Flags int32 Span Flags not directly correspond to dkspan
Logs array Span Logs not directly correspond to dkspan
References array Span References not directly correspond to dkspan
StartTime int64 Nanosecond Span Starting time dkspan.Start
Duration int64 Nanosecond Time consuming dkspan.Duration
Tags array Span Tags currently only takes the Span status field dkspan.Status

SkyWalking Tracing Data Data Structure

SkyWalking Segment Object Generated By Protocol Buffer Protocol V3

Field Name Data Type Unit Description Correspond To
TraceId string Trace ID dkspan.TraceID
TraceSegmentId string The Segment ID is used with the Span ID to uniquely identify a Span dkspan.SpanID high order.
Service string service dkspan.Service
ServiceInstance string Node logical relationship name Fields not used
Spans array Tracing Span Array See the table below
IsSizeLimited bool whether includes all Spans on the link Span Fields not used

SkyWalking Span Object Data Structure in Segment Object

Field Name Data Type Unit Description Correspond To
ComponentId int32 Numerical definition of third-party framework Fields not used
Refs array Storing Parent Segment across threads and processes dkspan.ParentID high position
ParentSpanId int32 The Parent Span ID is used with the Segment ID to uniquely identify a Parent Span dkspan.ParentID low position
SpanId int32 The Span ID is used with the Segment ID to uniquely identify a Span dkspan.SpanID low position
OperationName string Span Operation Name dkspan.Operation
Peer string Communication peer dkspan.Endpoint
IsError bool Span Status field dkspan.Status
SpanType int32 Span Type Numerical definition dkspan.SpanType
StartTime int64 Milliseconds Span Starting time dkspan.Start
EndTime int64 Milliseconds Span end time subtracted from StartTime represents elapsed time dkspan.Duration
Logs array Span Logs Fields not used
SkipAnalysis bool Skip back-end analysis Fields not used
SpanLayer int32 Span technology stack numerical definition Fields not used
Tags array Span Tags Fields not used

Zipkin Tracing Data Data Structure

Zipkin Thrift Protocol Span Data Structure V1

Field Name Data Type Unit Description Correspond To
TraceIDHigh uint64 Trace ID high position There is no direct mapping relationship
TraceID uint64 Trace ID dkspan.TraceID
ID uint64 Span ID dkspan.SpanID
ParentID uint64 Parent Span ID dkspan.ParentID
Annotations array get Service Name dkspan.Service
Name string Span Operation Name dkspan.Operation
BinaryAnnotations array Get Span status field dkspan.Status
Timestamp uint64 Microsecond Span Starting time dkspan.Start
Duration uint64 Microsecond Span Time consuming dkspan.Duration
Debug bool Debug status field Fields not used

Zipkin Span Data Structure V2

Field Name Data Type Unit Description Correspond To
TraceID structure Trace ID dkspan.TraceID
ID uint64 Span ID dkspan.SpanID
ParentID uint64 Parent Span ID dkspan.ParentID
Name string Span Operation Name dkspan.Operation
Debug bool Debug status Fields not used
Sampled bool Sampling status field Fields not used
Err string Error Message Indirect correspondence to dkspan
Kind string Span Type dkspan.SpanType
Timestamp structure Microsecond Microsecond time structure representation span starting time dkspan.Start
Duration int64 Microsecond Span Time consuming dkspan.Duration
Shared bool Shared state Fields not used
LocalEndpoint structure to get Service Name dkspan.Service
RemoteEndpoint structure Communication peer dkspan.Endpoint
Annotations array Used to explain delay-related events Fields not used
Tags map to get Span status dkspan.Status

  1. This field created by the Guancedb storage, not exist in Datakit side. 

Feedback

Is this page helpful? ×