Ingestion Canary
This collector checks data ingestion availability and latency. It automatically generates probe data (metrics, logs, traces), verifies through DQL queries that the data was ingested, and measures the latency between sending the data and the data becoming queryable.
Requirements
- DataWay must be configured for data reporting and DQL queries
- If result_workspace is configured, ensure the workspace URL is accessible
Configuration
Go to the conf.d/samples directory under the DataKit installation directory, copy ingestion_canary.conf.sample, and rename the copy to ingestion_canary.conf. An example follows:
[[inputs.ingestion_canary]]
## Collect interval, default 10m
interval = "10m"
## Query timeout, should be less than interval, default 5m
query_timeout = "5m"
## Poll interval for DQL query, default 500ms
poll_interval = "500ms"
## Max retry count for query errors, default 10
error_retries = 10
## Result workspace URL to report metrics
# result_workspace = "https://openway.guance.com?token=xxx"
## Data categories to collect (metric, logging, tracing)
## If not specified, all categories will be collected
categories = ["metric", "logging", "tracing"]
## Logging configuration
[inputs.ingestion_canary.logging]
storage_index = "default"
## Enable election mode
election = true
## Extra tags
[inputs.ingestion_canary.tags]
# some_tag = "some_value"
# more_tag = "some_other_value"
Once configured, restart DataKit.
The collector can also be enabled through ConfigMap injection of the collector configuration or by configuring ENV_DATAKIT_INPUTS.
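The sample configuration notes that query_timeout should be less than interval. A minimal sketch of checking that rule before deploying a configuration is shown here. It assumes the values are Go-style duration strings (such as "10m" or "500ms"); the helper names `parse_duration` and `check_canary_timing` are hypothetical and not part of DataKit, and the poll_interval check is an extra sanity check assumed for this sketch rather than a documented rule.

```python
import re

# Units accepted by this sketch (assumed Go-style duration strings).
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")


def parse_duration(text: str) -> float:
    """Convert a duration string such as "10m" or "1m30s" to seconds."""
    pos = 0
    total = 0.0
    for match in _TOKEN.finditer(text):
        if match.start() != pos:
            # A gap between tokens means unparseable characters.
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return total


def check_canary_timing(interval: str, query_timeout: str, poll_interval: str) -> list:
    """Return a list of problems; an empty list means the values are consistent."""
    problems = []
    interval_s = parse_duration(interval)
    timeout_s = parse_duration(query_timeout)
    poll_s = parse_duration(poll_interval)
    # Rule stated in the sample configuration comments.
    if timeout_s >= interval_s:
        problems.append("query_timeout should be less than interval")
    # Assumption of this sketch, not a documented rule.
    if poll_s >= timeout_s:
        problems.append("poll_interval should be less than query_timeout")
    return problems
```

With the default values, `check_canary_timing("10m", "5m", "500ms")` reports no problems, while setting query_timeout equal to interval reports the first rule.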
Measurements
This collector generates two types of data:
- Probe Data: Probe data points (metrics, logs, traces) used to test data ingestion availability
- Result Metric: Test result metrics containing latency and test status
Probe data points do not carry global tags; they include only the data point's own fields and tags, plus any tags specified in the configuration.
Probe Data
ingestion_canary (Metric)
| Tags & Fields | Description |
|---|---|
| test_type (tag) | Test type: collect (collector) or cmd (CLI tool) |
| round | Round number of the ingestion canary probe. Type: int (gauge). Unit: - |
ingestion_canary (Logging)
| Tags & Fields | Description |
|---|---|
| test_type (tag) | Test type: collect (collector) or cmd (CLI tool) |
| message | Synthetic freshness probe message. Type: string. Unit: - |
| round | Round number of the ingestion canary probe. Type: int. Unit: - |
ingestion_canary (Tracing)
| Tags & Fields | Description |
|---|---|
| service (tag) | Service name |
| source (tag) | Source name |
| span_type (tag) | Span type |
| test_type (tag) | Test type: collect (collector) or cmd (CLI tool) |
| duration | Duration in microseconds. Type: int. Unit: time,μs |
| parent_id | Parent span ID. Type: string. Unit: - |
| resource | Resource name. Type: string. Unit: - |
| round | Round number of the ingestion canary probe. Type: int. Unit: - |
| span_id | Span ID. Type: string. Unit: - |
| start | Start time in microseconds. Type: int. Unit: timeStamp,usec |
| status | Span status. Type: string. Unit: - |
| trace_id | Trace ID. Type: string. Unit: - |
Result Metric
ingestion_canary_result
| Tags & Fields | Description |
|---|---|
| category (tag) | Data category: M (metric), L (logging), T (tracing) |
| status (tag) | Test status: ok, timeout, error |
| storage_index (tag) | Storage index for logging data (optional) |
| latency_ms | Latency from feed to queryable in milliseconds. Type: int (gauge). Unit: time,ms |
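To illustrate how the poll_interval, query_timeout, and error_retries settings could relate to the status and latency_ms values in the result metric, here is a rough sketch of a polling loop. It is not DataKit's implementation: the function `wait_until_queryable` and its `query` callback are hypothetical, the clock is assumed to start when the probe is sent, and the handling of errors (give up once the error count exceeds error_retries) and the reporting of latency_ms for timeout and error results are assumptions.

```python
import time
from typing import Callable


def wait_until_queryable(
    query: Callable[[], bool],
    query_timeout: float = 300.0,   # seconds, mirrors query_timeout = "5m"
    poll_interval: float = 0.5,     # seconds, mirrors poll_interval = "500ms"
    error_retries: int = 10,        # mirrors error_retries = 10
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll `query` until it returns True; report a status and elapsed milliseconds."""
    start = clock()  # assumed to be the moment the probe was sent
    errors = 0

    def result(status: str) -> dict:
        return {"status": status, "latency_ms": int((clock() - start) * 1000)}

    while True:
        try:
            found = query()
        except Exception:
            # Assumption: a failed query counts against error_retries.
            errors += 1
            if errors > error_retries:
                return result("error")
            found = False
        if found:
            return result("ok")
        if clock() - start >= query_timeout:
            return result("timeout")
        sleep(poll_interval)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised with a fake clock; in normal use the defaults apply and `query` would wrap whatever DQL lookup checks for the probe's round number.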
CLI Tool
In addition to the collector mode, a CLI tool is provided for one-time testing:
# Use default configuration
datakit tool --ingestion-canary
# Specify storage index for logging data
datakit tool --ingestion-canary --ingestion-canary-index my_index
Options:
- --ingestion-canary: Enable ingestion canary test tool
- --ingestion-canary-index: Specify storage index for logging data, default is "default" (only applies to logging data)
Description:
In each round, the tool generates probe data (metrics, logs, traces), sends it to DataWay, then queries repeatedly until the data is found or the user interrupts, and prints the latency for each data type. The tool keeps running, with a 10-second interval between rounds.