Ingestion Canary

This collector checks data ingestion availability and latency. It generates probe data (metrics, logs, traces), runs DQL queries to verify that the data was ingested, and measures the latency from the moment the data is sent until it becomes queryable.

Requirements

  • DataWay must be configured for data reporting and DQL queries
  • If result_workspace is configured, ensure the workspace URL is accessible

Configuration

Go to the conf.d/samples directory under the DataKit installation directory, copy ingestion_canary.conf.sample, and name the copy ingestion_canary.conf. An example configuration:

[[inputs.ingestion_canary]]
  ## Collect interval, default 10m
  interval = "10m"

  ## Query timeout, should be less than interval, default 5m
  query_timeout = "5m"

  ## Poll interval for DQL query, default 500ms
  poll_interval = "500ms"

  ## Max retry count for query errors, default 10
  error_retries = 10

  ## Result workspace URL to report metrics
  # result_workspace = "https://openway.guance.com?token=xxx"

  ## Data categories to collect (metric, logging, tracing)
  ## If not specified, all categories will be collected
  categories = ["metric", "logging", "tracing"]

  ## Logging configuration
  [inputs.ingestion_canary.logging]
    storage_index = "default"

  ## Enable election mode
  election = true

  ## Extra tags
  [inputs.ingestion_canary.tags]
  # some_tag = "some_value"
  # more_tag = "some_other_value"

Once configured, restart DataKit.
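
To make the timing options above more concrete, the following is a minimal Python sketch of one probe round: send the probe, poll until it becomes queryable, and stop on timeout or after too many query errors. It is not the collector's actual implementation. The callables `send_probe` and `probe_visible` are hypothetical placeholders for "report data through DataWay" and "run a DQL query for the probe", and the exact retry and timeout semantics shown here are assumptions.

```python
import time
from typing import Callable, Optional, Tuple


def measure_ingestion_latency(
    send_probe: Callable[[], None],      # hypothetical: sends one probe point
    probe_visible: Callable[[], bool],   # hypothetical: True once a query finds the probe
    query_timeout: float = 300.0,        # seconds, mirrors query_timeout = "5m"
    poll_interval: float = 0.5,          # seconds, mirrors poll_interval = "500ms"
    error_retries: int = 10,             # mirrors error_retries = 10
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, Optional[int]]:
    """Run one probe round and return (status, latency_ms).

    status is "ok", "timeout", or "error"; latency_ms is only set for "ok"
    (an assumption of this sketch).
    """
    start = clock()
    send_probe()
    errors = 0
    while clock() - start < query_timeout:
        try:
            if probe_visible():
                return "ok", int((clock() - start) * 1000)
        except Exception:
            # Assumption: query errors are counted in total, not consecutively.
            errors += 1
            if errors > error_retries:
                return "error", None
        sleep(poll_interval)
    return "timeout", None
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting. With the defaults, a round gives up after 5 minutes, which is why query_timeout should stay below interval.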

Measurements

This collector generates two types of data:

  1. Probe Data: Probe data points (metrics, logs, traces) used to test data ingestion availability
  2. Result Metric: Test result metrics containing latency and test status

Probe data points do not carry global tags. They include only the data point's own fields and tags, plus any tags specified in the configuration.

Probe Data

ingestion_canary (Metric)

| Tags & Fields | Description |
| --- | --- |
| test_type (tag) | Test type: collect (collector) or cmd (CLI tool) |
| round | Round number of the ingestion canary probe. Type: int (gauge). Unit: - |

ingestion_canary (Logging)

| Tags & Fields | Description |
| --- | --- |
| test_type (tag) | Test type: collect (collector) or cmd (CLI tool) |
| message | Synthetic freshness probe message. Type: string. Unit: - |
| round | Round number of the ingestion canary probe. Type: int. Unit: - |

ingestion_canary (Tracing)

| Tags & Fields | Description |
| --- | --- |
| service (tag) | Service name |
| source (tag) | Source name |
| span_type (tag) | Span type |
| test_type (tag) | Test type: collect (collector) or cmd (CLI tool) |
| duration | Duration in microseconds. Type: int. Unit: time,μs |
| parent_id | Parent span ID. Type: string. Unit: - |
| resource | Resource name. Type: string. Unit: - |
| round | Round number of the ingestion canary probe. Type: int. Unit: - |
| span_id | Span ID. Type: string. Unit: - |
| start | Start time in microseconds. Type: int. Unit: timeStamp,usec |
| status | Span status. Type: string. Unit: - |
| trace_id | Trace ID. Type: string. Unit: - |

Result Metric

ingestion_canary_result

| Tags & Fields | Description |
| --- | --- |
| category (tag) | Data category: M (metric), L (logging), T (tracing) |
| status (tag) | Test status: ok, timeout, error |
| storage_index (tag) | Storage index for logging data (optional) |
| latency_ms | Latency from feed to queryable in milliseconds. Type: int (gauge). Unit: time,ms |
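
As an illustration of how the tags and field above fit together, here is a small Python sketch that assembles one result point as a plain dictionary. The function name `build_result_point` and the dictionary layout are hypothetical and used only for illustration. The mapping from configured category names to M, L, and T follows the table above, while omitting latency_ms when no latency was measured is an assumption.

```python
from typing import Any, Dict, Optional

# Category codes as listed for the `category` tag.
CATEGORY_CODES = {"metric": "M", "logging": "L", "tracing": "T"}
VALID_STATUSES = ("ok", "timeout", "error")


def build_result_point(
    category: str,
    status: str,
    latency_ms: Optional[int],
    storage_index: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble an ingestion_canary_result point (illustrative layout only)."""
    if category not in CATEGORY_CODES:
        raise ValueError(f"unknown category: {category}")
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")

    tags = {"category": CATEGORY_CODES[category], "status": status}
    # storage_index is optional and only meaningful for logging data.
    if category == "logging" and storage_index:
        tags["storage_index"] = storage_index

    fields: Dict[str, Any] = {}
    if latency_ms is not None:  # assumption: no latency field without a measurement
        fields["latency_ms"] = int(latency_ms)

    return {"measurement": "ingestion_canary_result", "tags": tags, "fields": fields}
```

For example, `build_result_point("logging", "ok", 1234, "default")` yields a point tagged with category L, status ok, and storage_index default, carrying a latency_ms field of 1234.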

CLI Tool

In addition to the collector mode, a CLI tool is provided for on-demand testing:

# Use default configuration
datakit tool --ingestion-canary

# Specify storage index for logging data
datakit tool --ingestion-canary --ingestion-canary-index my_index

Options:

  • --ingestion-canary: Enable ingestion canary test tool
  • --ingestion-canary-index: Specify storage index for logging data, default is "default" (only applies to logging data)

Description:

In each round, the tool generates probe data (metrics, logs, traces), sends it to DataWay, then queries repeatedly until the data is found or the user interrupts, and prints the latency for each data type. The tool keeps running, with a 10-second interval between rounds.
