
DQL Function Reference


DQL provides a rich set of functions for data aggregation, transformation, and matching. This document details the semantics, parameters, and usage of each function.

Aggregation Functions

Aggregation functions collapse multiple rows of data into a single value and are typically used with time windows (time-expr) and grouping (BY clause).

Basic Aggregation

sum

Calculates the sum of field values.

Syntax:

sum(field)

Parameters:

  • field: A numeric field

Examples:

// Calculate total request count
M::http_requests:(sum(request_count)) [1h]

// Calculate total request count grouped by service
M::http_requests:(sum(request_count)) [1h] BY service

avg

Calculates the average of field values.

Syntax:

avg(field)

Parameters:

  • field: A numeric field

Examples:

// Calculate average response time
M::response_time:(avg(duration)) [1h] BY endpoint

// Calculate average CPU usage
M::cpu:(avg(usage)) [1h] BY host

count

Counts the number of data rows.

Syntax:

count(field)
count(*)

Parameters:

  • field: Any field, counts the number of non-null values
  • *: Counts all rows

Examples:

// Count log entries
L::nginx:(count(*)) [1h]

// Count requests that have a response time
M::response_time:(count(duration)) [1h] BY service

min / max

Calculates the minimum or maximum value of a field.

Syntax:

min(field)
max(field)

Parameters:

  • field: A numeric field

Examples:

// Find the maximum response time
M::response_time:(max(duration)) [1h] BY endpoint

// Find the range of CPU usage
M::cpu:(min(usage), max(usage)) [1h] BY host

first / last

Gets the first or last value (in chronological order).

Syntax:

first(field)
last(field)
last_row(field)

Parameters:

  • field: Any field

Notes:

  • first: Returns the earliest value in time
  • last: Returns the latest value in time; if the field is an array type, it will be expanded
  • last_row: Returns the latest value in time; array types are not expanded

Examples:

// Get the latest status value
M::system:(last(status)) [1h] BY host

// Get initial and final values
M::counter:(first(value), last(value)) [1h] BY metric

any

Returns any non-null value. Suitable for obtaining sample data or scenarios where a specific aggregation order is not required.

Syntax:

any(field)

Parameters:

  • field: Any field

Examples:

// Get a sample message
L::logs:(any(message)) [1h] BY service

// Get a sample error stack trace
L::error_logs:(any(stack_trace)) [1h] BY error_type

spread

Calculates the range (difference between maximum and minimum).

Syntax:

spread(field)

Parameters:

  • field: A numeric field

Examples:

// Calculate the fluctuation range of response time
M::response_time:(spread(duration)) [1h] BY endpoint

stddev

Calculates the standard deviation.

Syntax:

stddev(field)

Parameters:

  • field: A numeric field

Examples:

// Calculate the standard deviation of response time
M::response_time:(stddev(duration)) [1h] BY endpoint

mode

Calculates the mode (the most frequently occurring value).

Syntax:

mode(field)

Parameters:

  • field: Any field

Examples:

// Find the most common response status code
M::http:(mode(status)) [1h] BY endpoint

count_series

Counts the number of time series (groupings). Returns how many independent time series exist within the current query scope.

Syntax:

count_series(field)

Parameters:

  • field: Any field (typically * or any existing field)

Examples:

// Count how many hosts are reporting CPU metrics
M::cpu:(count_series(*)) [1h]

// Count how many instances each service has
M::http_requests:(count_series(*)) [1h] BY service

Statistical Aggregation (Estimation Functions)

The following functions use probabilistic data structures for estimation, suitable for large-scale data scenarios, balancing accuracy and performance.

count_distinct

Estimates the number of distinct values in a field.

Syntax:

count_distinct(field)

Parameters:

  • field: Any field

Algorithm Notes:

Uses the HyperLogLog algorithm for cardinality estimation:

  • Number of registers: 2¹⁶ = 65536
  • Uses LogLog-Beta estimation method
  • Standard error: approximately 0.4%
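The ~0.4% figure follows from the theoretical HyperLogLog error bound of roughly 1.04/√m, where m is the register count. A quick check (the helper name is illustrative, not part of DQL):

```python
import math

def hll_standard_error(register_bits: int) -> float:
    """Theoretical HyperLogLog relative standard error: ~1.04 / sqrt(m),
    where m = 2**register_bits is the number of registers."""
    m = 2 ** register_bits
    return 1.04 / math.sqrt(m)

# 2^16 = 65536 registers, as used by count_distinct
print(hll_standard_error(16))  # ~0.004, i.e. about 0.4%
```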

Use Cases:

  • Counting unique users (UV)
  • Calculating the number of distinct IP addresses
  • Analyzing the number of unique request IDs

Examples:

// Count unique users
L::access_logs:(count_distinct(user_id)) [1d] BY service

// Count distinct accessing IPs
L::nginx:(count_distinct(client_ip)) [1h] BY endpoint

percentile

Estimates the percentile of a field.

Syntax:

percentile(field, n)
pXX(field)  // Shorthand form

Parameters:

  • field: A numeric field
  • n: Percentile, range 0-100

Shorthand Forms:

  • p50(field) is equivalent to percentile(field, 50)
  • p95(field) is equivalent to percentile(field, 95)
  • p99(field) is equivalent to percentile(field, 99)

Algorithm Notes:

Uses a logarithmic linear interpolation histogram for estimation:

  • Bucket range: 10⁻⁹ to 10¹⁸, covering the vast majority of numerical scenarios
  • Each order of magnitude is divided into 128 buckets
  • Uses linear interpolation in logarithmic space to improve accuracy
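As an illustration of the bucketing scheme only (a hypothetical sketch, not the engine's actual code), a positive value's bucket in a log-linear histogram with 128 buckets per decade can be computed in log space:

```python
import math

BUCKETS_PER_DECADE = 128  # each power of ten is split into 128 buckets

def bucket_index(x: float) -> int:
    """Map a positive value to a global bucket index in log space.
    Values within one decade land in one of 128 evenly spaced
    logarithmic buckets (illustrative sketch only)."""
    assert x > 0
    return math.floor(BUCKETS_PER_DECADE * math.log10(x))

# Values a factor of 10 apart are exactly 128 buckets apart:
print(bucket_index(250.0) - bucket_index(25.0))  # 128
```

Linear interpolation inside each narrow logarithmic bucket is what keeps the relative error of the estimated percentile small.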

Use Cases:

  • Calculating P99, P95 for response times
  • Analyzing tail latency of performance metrics
  • Evaluating Service Level Agreement (SLA) compliance

Examples:

// Calculate P99 response time
M::response_time:(percentile(duration, 99)) [1h] BY service

// Using shorthand form
M::response_time:(p99(duration)) [1h] BY service

// Calculate multiple percentiles simultaneously
M::response_time:(p50(duration), p95(duration), p99(duration)) [1h] BY service

median

Calculates the median, equivalent to percentile(field, 50).

Syntax:

median(field)

Examples:

// Calculate the median response time
M::response_time:(median(duration)) [1h] BY service

Histogram Functions

DQL provides three histogram-related functions for different data sources and scenarios:

| Function | Use Case | Data Source Type | Recommendation |
| --- | --- | --- | --- |
| histogram_auto | Numerical distribution statistics for detailed data like logs and Trace | Detail model (Log/Trace) | ⭐⭐⭐ Recommended |
| histogram | Histogram with fixed bucket boundaries | Detail model (Log/Trace) | ⭐⭐ Deprecated |
| histogram_quantile | Calculate quantiles from Prometheus histogram metrics | Prometheus metrics | ⭐⭐⭐ Recommended |

histogram_auto

Automatically generates a distribution histogram, specifically designed for numerical distribution statistics of detailed data like logs and Trace.

Features:

  • No need to specify bucket boundaries; automatically adapts to data distribution
  • Uses logarithmic linear interpolation histogram algorithm, covering numerical range from 10⁻⁹ to 10¹⁸
  • Returns both quantile statistics and bucket distribution information

Syntax:

histogram_auto(field)

Parameters:

  • field: A numeric field

Return Values:

Column Name Description
lower_bounds Array of lower bounds for each bucket
upper_bounds Array of upper bounds for each bucket
counts Array of counts for each bucket
min Minimum value
p50 Median (50th percentile)
p75 75th percentile
p90 90th percentile
p95 95th percentile
p99 99th percentile
max Maximum value

Algorithm Notes:

Uses an estimation histogram (logarithmic linear interpolation), dividing each order of magnitude into 128 buckets, suitable for distribution statistics on large datasets.

Use Cases:

  • Analyzing response time distribution in logs
  • Statistics on latency distribution in Trace
  • Exploratory data analysis without preset bucket boundaries

Examples:

// Analyze response time distribution in Nginx access logs
L::nginx:(histogram_auto(response_time)) [1h]

// Statistics on request duration distribution by service
L::app_logs:(histogram_auto(duration)) [1h] BY service

// Statistics on call duration distribution in Trace
T::http_client:(histogram_auto(elapsed)) [1h] BY operation

Result Example:

| lower_bounds | upper_bounds | counts | min | p50 | p75 | p90 | p95 | p99 | max |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [0, 10, 100] | [10, 100, 1000] | [1000, 500, 100] | 0.5 | 45 | 120 | 280 | 450 | 850 | 1200 |
Note

lower_bounds, upper_bounds, counts are array types, representing the boundaries and counts of each bucket.


histogram (Deprecated)

Generates a histogram with specified bucket boundaries. This function is deprecated; use histogram_auto instead.

Notes:

histogram requires manually specifying bucket boundary parameters, making it less flexible. histogram_auto automatically adapts to data distribution, covers a wider numerical range, and returns richer statistical information.

Syntax:

histogram(field, left_bound, right_bound, bucket_size [, threshold])

Parameters:

  • field: A numeric field
  • left_bound: Left boundary
  • right_bound: Right boundary
  • bucket_size: Bucket size
  • threshold (optional): Minimum count per bucket; buckets below this value are not returned

Return Values:

Returns two columns: bucket_le (right boundary of the bucket) and count (number of values in the bucket)

Examples:

// Generate a histogram in the range 0-1000ms, with 100ms per bucket
M::response_time:(histogram(duration, 0, 1000, 100)) [1h]

// Recommended: Use histogram_auto instead
M::response_time:(histogram_auto(duration)) [1h]

Result Example:

bucket_le count
100 1500
200 2800
300 3500
... ...
1000 5000

histogram_quantile

Calculates quantiles from Prometheus histogram metrics.

Features:

  • Specifically designed to handle histogram-type metrics reported by Prometheus
  • Relies on the le label (or VictoriaMetrics' vmrange label) to identify bucket boundaries
  • Input data should be cumulative counts

Syntax:

histogram_quantile(field, q)

Parameters:

  • field: Histogram count field (e.g., http_request_duration_bucket)
  • q: Quantile, range 0-1 (e.g., 0.99 for P99)

Use Case Comparison:

| Scenario | Recommended Function | Notes |
| --- | --- | --- |
| Analyzing response time distribution in logs | histogram_auto | Logs are detailed data with no pre-aggregated histogram |
| Analyzing P99 of Prometheus histogram metrics | histogram_quantile | Metrics are pre-aggregated by the le label |
| Statistics on call duration distribution in Trace | histogram_auto | Trace is detailed data |

le Label Processing Mechanism:

histogram_quantile relies on the le label (less than or equal) to identify histogram bucket boundaries:

  1. Prometheus format (default): Uses the le label directly to represent the bucket upper boundary

    • le value is numeric (e.g., "0.1", "1", "10") or "+Inf" (infinity)
    • Data should be cumulative counts
  2. VictoriaMetrics format: Uses the vmrange label to represent ranges

    • Format is "lower...upper" (e.g., "0.1...0.2")
    • Data is range counts (non-cumulative)
    • The function automatically converts range counts to cumulative counts

Calculation Process:

  1. Sort all buckets by le value
  2. If in vmrange format, accumulate counts to convert to cumulative distribution
  3. Ensure bucket counts are monotonically increasing (fix possible abnormal data)
  4. Use linear interpolation to calculate the target quantile
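The calculation process above can be sketched in a few lines. This is a simplified illustration (assumed, not the engine's exact code): it takes Prometheus-style cumulative bucket counts as (le, count) pairs, so the vmrange-to-cumulative conversion in step 2 is omitted.

```python
def histogram_quantile(buckets, q):
    """Simplified sketch of the steps above: sort by le, enforce
    monotonic counts, then linearly interpolate the target quantile.
    `buckets` is a list of (le, cumulative_count) pairs; q is in [0, 1]."""
    if q < 0:
        return float("-inf")
    if q > 1:
        return float("inf")
    # Step 1: sort all buckets by their le upper bound
    buckets = sorted(buckets, key=lambda b: b[0])
    # Step 3: fix possible abnormal data so counts never decrease
    fixed, running = [], 0.0
    for le, count in buckets:
        running = max(running, count)
        fixed.append((le, running))
    total = fixed[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    # Step 4: find the bucket containing the target rank and interpolate
    prev_le, prev_count = 0.0, 0.0
    for le, count in fixed:
        if count >= rank:
            if le == float("inf"):
                return prev_le  # cannot interpolate into the +Inf bucket
            if count == prev_count:
                return le
            return prev_le + (le - prev_le) * (rank - prev_count) / (count - prev_count)
        prev_le, prev_count = le, count
    return prev_le

# Cumulative counts for buckets le = 0.1, 0.5, 1, +Inf
buckets = [(0.1, 50), (0.5, 90), (1.0, 99), (float("inf"), 100)]
print(histogram_quantile(buckets, 0.5))   # 0.1  (rank 50 sits at the first bucket's edge)
print(histogram_quantile(buckets, 0.95))  # interpolated between 0.5 and 1.0
```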

Differences from PromQL:

| Feature | DQL | PromQL |
| --- | --- | --- |
| Function type | Aggregation function | Transformation function |
| Input data | Directly reads metrics with the le label | Requires combination with sum(rate(...)) by (le) |
| Usage | histogram_quantile(field, 0.99) | histogram_quantile(0.99, sum(rate(...)) by (le)) |
| Data format | Supports both le and vmrange labels | Only supports the le label |
| Grouping | Through DQL's BY clause | Through explicit by (le) grouping |

Equivalent Examples:

Assume a histogram metric http_request_duration_bucket, containing le labels (e.g., 0.1, 0.5, 1, 5, +Inf) and service label.

Scenario 1: Calculate P99 Latency

DQL:

M::http_request_duration:(histogram_quantile(duration_bucket, 0.99)) [1h] BY service

PromQL equivalent:

histogram_quantile(0.99, sum(rate(http_request_duration_bucket[1h])) by (le, service))

Scenario 2: Calculate P95 Latency per Service (Multiple Groups)

DQL:

M::http_request_duration:(
    histogram_quantile(duration_bucket, 0.95)
) [1h] BY service, endpoint

PromQL equivalent:

histogram_quantile(0.95, sum(rate(http_request_duration_bucket[1h])) by (le, service, endpoint))

Scenario 3: Calculate P50 (Median) and P99

DQL:

M::http_request_duration:(
    histogram_quantile(duration_bucket, 0.50) as p50,
    histogram_quantile(duration_bucket, 0.99) as p99
) [1h] BY service

PromQL equivalent:

label_join(
  histogram_quantile(0.50, sum(rate(http_request_duration_bucket[1h])) by (le, service)), "quantile", "", "0.50"
)
or
label_join(
  histogram_quantile(0.99, sum(rate(http_request_duration_bucket[1h])) by (le, service)), "quantile", "", "0.99"
)

Note

PromQL requires label_join or label_replace to distinguish results from different quantiles.

Notes:

  • Input data must contain le or vmrange labels, otherwise calculation is impossible
  • If there is no +Inf bucket, the upper boundary of the last bucket will be used as the maximum value
  • Buckets with count 0 or NaN are skipped
  • Returns -Inf when quantile < 0, and +Inf when > 1

TopN Functions

top

Gets the top N largest values.

Syntax:

top(field, n)

Parameters:

  • field: A numeric field
  • n: Number of values to return

Examples:

// Get the 5 requests with the longest response time
M::response_time:(top(duration, 5)) [1h] BY service

// Get the 10 hosts with the highest traffic
M::network:(top(bytes, 10)) [1h]

Result Example:

service top(duration, 5)
api 1250
api 1180
api 1050
api 980
api 920
Note

Returns multiple rows, each containing one TopN value.


bottom

Gets the top N smallest values.

Syntax:

bottom(field, n)

Parameters:

  • field: A numeric field
  • n: Number of values to return

Examples:

// Get the 5 requests with the shortest response time
M::response_time:(bottom(duration, 5)) [1h] BY service

Result Example:

service bottom(duration, 5)
api 12
api 18
api 25
api 32
api 45
Note

Returns multiple rows, each containing one BottomN value.


Value Collection Functions

distinct

Returns all distinct values of a field.

Syntax:

distinct(field)

Examples:

// Get all distinct status codes
M::http:(distinct(status)) [1h] BY endpoint

Result Example:

endpoint distinct(status)
/api/v1 200
/api/v1 404
/api/v1 500
/health 200
Note

Returns multiple rows, each containing one distinct value.


distinct_by_collapse

Gets distinct values of a field according to a collapse strategy, retaining the last value of other fields during deduplication.

Syntax:

distinct_by_collapse(field, [last_field1, last_field2, ...])

Parameters:

  • field: The field used as the basis for deduplication
  • last_fields (optional): List of fields for which the last value should be retained

Notes:

Unlike distinct, distinct_by_collapse retains information from other related fields (taking the last value) during deduplication, suitable for scenarios requiring contextual information.
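A minimal sketch of these semantics, with rows modeled as dicts and a hypothetical helper name (an illustration of the described behavior, not the engine's implementation):

```python
def distinct_by_collapse(rows, key, last_fields):
    """Deduplicate rows by `key`, keeping the last seen value of each
    field in `last_fields` for every distinct key (sketch of the
    semantics described above)."""
    out = {}
    for row in rows:
        entry = out.setdefault(row[key], {key: row[key]})
        for f in last_fields:
            entry[f] = row[f]  # later rows overwrite: "last value" wins
    return list(out.values())

rows = [
    {"user_id": "user001", "timestamp": 1, "path": "/home"},
    {"user_id": "user002", "timestamp": 2, "path": "/product"},
    {"user_id": "user001", "timestamp": 3, "path": "/checkout"},
]
print(distinct_by_collapse(rows, "user_id", ["timestamp", "path"]))
```

user001 appears twice; only its latest timestamp and path survive deduplication.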

Examples:

// Get distinct user IDs, retaining the last access time and path for each user
L::access_logs:(distinct_by_collapse(user_id, [timestamp, path])) [1h]

// Get distinct hosts, retaining the last status and message
O::HOST:(distinct_by_collapse(host, [status, message])) [1h]

Result Example:

user_id last(timestamp) last(path)
user001 1704067200000 /checkout
user002 1704067100000 /product
user003 1704067000000 /home
Note

Returns the deduplicated main field values, and the last value of other fields specified by last_fields.


collect

Collects all values (including duplicates).

Syntax:

collect(field [, limit])

Parameters:

  • field: Any field
  • limit (optional): Maximum number of values to collect

Examples:

// Collect all response times
M::response_time:(collect(duration)) [1h] BY service

// Collect at most 100 values
M::response_time:(collect(duration, 100)) [1h] BY service

Result Example:

service collect(duration)
api [120, 135, 98, 142, ...]
web [45, 52, 48, 61, ...]
Note

Returns an array type, containing all collected values (may include duplicates).


collect_distinct

Collects all distinct values.

Syntax:

collect_distinct(field [, limit])

Parameters:

  • field: Any field
  • limit (optional): Maximum number of values to collect

Examples:

// Collect all distinct error types
L::error_logs:(collect_distinct(error_type)) [1h] BY service

Result Example:

service collect_distinct(error_type)
api ["timeout", "connection refused", "404"]
web ["200", "301", "404"]
Note

Returns an array type, containing all deduplicated values.


field_values

Gets all values of a field, returning an array type.

Syntax:

field_values(field)

Examples:

// Get all tag values
M::metrics:(field_values(tags)) [1h] BY metric_name

Result Example:

metric_name field_values(tags)
cpu_usage ["host:A", "env:prod", "team:backend"]
memory_used ["host:B", "env:staging", "team:frontend"]
Note

Returns an array type, containing all values of the field.


Filter Aggregation

count_filter

Counts the number of field values within a specified list.

Syntax:

count_filter(field, [value1, value2, ...])

Parameters:

  • field: Any field
  • values: List of values

Examples:

// Count requests with specific status codes
M::http:(count_filter(status, [200, 201, 204])) [1h] BY endpoint

// Count error-level logs
L::logs:(count_filter(level, ["error", "critical"])) [1h] BY service

Helper Functions

default

Sets a default value for a field, returning the default value when the field is empty.

Syntax:

default(field, default_value)

Parameters:

  • field: Any field
  • default_value: Default value (can be numeric, string, boolean, or null)

Examples:

// Set a default response time for null values
M::response_time:(default(duration, 0)) [1h] BY service

Time Series Functions

Time series functions are used to process data that changes over time, especially Counter-type metrics.

Rollup Functions

Rollup functions are used to preprocess raw time series data within a time window. For detailed explanation, refer to Rollup Functions in this document.

Writing Notes:

  • Rollup is written in the time clause, e.g., [rate], [1h::5m:rate].
  • Writing in the outer query (e.g., rate(DQL)) belongs to outer functions, not Rollup.

Supported Rollup Functions:

Function Description
rate Calculate growth rate (per second)
irate Calculate instantaneous growth rate
increase Calculate increase amount
deriv Calculate derivative (rate of change)
difference Calculate difference
non_negative_derivative Calculate non-negative derivative
non_negative_difference Calculate non-negative difference
rate_over_sum Calculate average per second
rate_over_count Calculate count per second
sum Sum
avg Average
min Minimum
max Maximum
count Count
first First value
last Last value
stddev Standard deviation
mode Mode
spread Range
any Any value

Examples:

// Calculate request QPS
M::http_requests:(sum(request_count)) [1h::5m:rate] BY service

// Shorthand form
M::cpu:(max(usage)) [rate]

Growth Rate Calculation

rate

Calculates the growth rate of a metric (per second).

Syntax:

rate(field)

Notes:

rate calculates the average growth rate of a Counter metric within a time window. For monotonically increasing Counter-type metrics, aggregating raw values directly is meaningless; the growth rate needs to be calculated first.

Use Cases:

  • Calculating request QPS
  • Calculating data write rate
  • Analyzing traffic growth trends

Examples:

// Calculate request QPS
M::http_requests:(sum(request_count)) [rate] BY service

// Calculate data ingestion rate
M::data_ingestion:(sum(bytes)) [rate] BY source

irate

Calculates the instantaneous growth rate of a metric.

Syntax:

irate(field)

Notes:

Unlike rate, irate uses only the last two data points to calculate the growth rate. It reflects the instantaneous rate of change and is better suited to alerting scenarios.

Examples:

// Calculate instantaneous request QPS
M::http_requests:(sum(request_count)) [irate] BY service

increase

Calculates the increase amount of a metric.

Syntax:

increase(field)

Notes:

increase returns the total increase within the time window, not the growth rate.

Examples:

// Calculate total request increase
M::http_requests:(sum(request_count)) [increase] BY service

rate_over_sum

Calculates the average per second (sum / time window seconds).

Syntax:

rate_over_sum(field)

Notes:

Equivalent to sum(field) / time window (seconds), used to calculate the average per second. Often used in the Rollup stage to convert cumulative values to per-second rates.

Difference from rate:

  • rate: Calculates the growth rate of a Counter (handles resets)
  • rate_over_sum: Simply divides sum by time window seconds

Examples:

// Calculate average requests per second
M::http_requests:(rate_over_sum(request_count)) [1h] BY service

rate_over_count

Calculates the count per second (count / time window seconds).

Syntax:

rate_over_count(field)

Notes:

Equivalent to count(field) / time window (seconds), used to calculate occurrences per second.

Examples:

// Calculate error log entries per second
L::error_logs:(rate_over_count(*)) [1h] BY error_type
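The two definitions above (sum divided by window seconds, count divided by window seconds) are simple enough to state directly; the helper names below are illustrative:

```python
def rate_over_sum(values, window_seconds):
    """sum(values) / window length in seconds, per the definition above."""
    return sum(values) / window_seconds

def rate_over_count(values, window_seconds):
    """count(values) / window length in seconds."""
    return len(values) / window_seconds

# A 3600 s window holding 60 per-minute samples of 60 requests each:
samples = [60] * 60
print(rate_over_sum(samples, 3600))    # 1.0 request per second
print(rate_over_count(samples, 3600))  # 60/3600 samples per second
```

Note that neither function inspects adjacent deltas, which is exactly why, unlike rate, they cannot account for counter resets.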

Difference Calculation

This section describes function semantics. The same functions can be used both as Rollup (e.g., [rate], [increase]) and within query expressions (e.g., rate(field), increase(field)); they execute at different stages, so choose based on business needs.

rate / deriv

Calculates the rate of change (derivative). rate is used for Counter-type metrics (ignores negative values), deriv is used for Gauge-type metrics (preserves negative values).

Aliases: non_negative_derivative is an alias for rate; derivative (PromQL style) is an alias for deriv.

Syntax:

// Counter metrics: Calculate non-negative rate of change (ignores negative values from resets)
rate(field)

// Gauge metrics: Calculate full rate of change (including negative values)
deriv(field)

Function Selection:

| Function | Description | Use Case |
| --- | --- | --- |
| rate | Calculates only the non-negative rate of change | Counter-type metrics (monotonically increasing) |
| deriv | Calculates the full rate of change (including negative values) | Gauge-type metrics (can increase or decrease) |

Examples:

// Counter metrics: Calculate request QPS
M::requests:(rate(count)) [1h::5m] BY service

// Gauge metrics: Calculate memory usage change rate
M::memory:(deriv(used)) [1h::5m] BY host

increase / difference

Calculates the difference between adjacent values. increase is used for Counter-type metrics (ignores negative values), difference is used for Gauge-type metrics (preserves negative values).

Notes: increase and difference are two independent functions with different behaviors, not aliases.

Syntax:

// Counter metrics: Calculate non-negative difference (ignores negative values from resets)
increase(field)

// Gauge metrics: Calculate full difference (including negative values)
difference(field)

Function Selection:

| Function | Description | Use Case |
| --- | --- | --- |
| increase | Calculates only the non-negative difference | Counter-type metrics (monotonically increasing) |
| difference | Calculates the full difference (including negative values) | Gauge-type metrics (can increase or decrease) |

Examples:

// Counter metrics: Calculate request increase amount
M::requests:(increase(count)) [1h::5m] BY service

// Gauge metrics: Calculate request count change (may increase or decrease)
M::requests:(difference(count)) [1h::5m] BY service
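The contrast between the two delta styles can be shown in a few lines. This is a simplified model of the semantics (negative jumps clamped to zero for Counter metrics), not the engine's exact reset handling:

```python
def non_negative_deltas(values):
    """increase-style: adjacent deltas with negative jumps (treated here
    as counter resets) clamped to 0 -- a simplified model."""
    return [max(b - a, 0) for a, b in zip(values, values[1:])]

def deltas(values):
    """difference-style: signed adjacent deltas, negatives preserved."""
    return [b - a for a, b in zip(values, values[1:])]

series = [10, 14, 19, 3, 8]  # the drop 19 -> 3 resembles a counter reset
print(non_negative_deltas(series))  # [4, 5, 0, 5]
print(deltas(series))               # [4, 5, -16, 5]
```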

Moving Calculation

moving_average

Calculates the moving average.

Syntax:

moving_average(field, n)

Parameters:

  • field: A numeric field
  • n: Window size (number of data points)

Examples:

// Calculate 5-point moving average
M::cpu:(moving_average(usage, 5)) [1h::1m] BY host

cumsum

Calculates the cumulative sum.

Syntax:

cumsum(field)

Examples:

// Calculate cumulative request count
M::requests:(cumsum(count)) [1h::5m] BY service
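Both moving calculations above can be sketched briefly. The sketch assumes a trailing window that emits its first value once n points have accumulated; the engine's exact windowing behavior may differ:

```python
from itertools import accumulate

def moving_average(values, n):
    """n-point trailing moving average (assumed windowing: first output
    appears after n points have been seen)."""
    return [sum(values[i - n + 1:i + 1]) / n for i in range(n - 1, len(values))]

def cumsum(values):
    """Running cumulative sum over the series."""
    return list(accumulate(values))

data = [2, 4, 6, 8, 10]
print(moving_average(data, 3))  # [4.0, 6.0, 8.0]
print(cumsum(data))             # [2, 6, 12, 20, 30]
```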

Transformation Functions

Transformation functions are used for mathematical operations, type conversions, or string processing on field values.

Mathematical Functions

abs

Calculates the absolute value.

Syntax:

abs(field)

Examples:

// Calculate absolute value of temperature deviation
M::temperature:(abs(deviation)) [1h] BY sensor

round / ceil / floor

Rounding functions.

Syntax:

round(field)  // Round to nearest integer
ceil(field)   // Round up
floor(field)  // Round down

Examples:

// Round up response time
M::response_time:(ceil(duration)) [1h]

// Round percentage
M::cpu:(round(usage)) [1h] BY host

log / log2 / log10

Logarithmic functions.

Syntax:

log(field)    // Natural logarithm
log2(field)   // Base-2 logarithm
log10(field)  // Base-10 logarithm

Examples:

// Calculate log-transformed values
M::metrics:(log(value)) [1h] BY metric_name

Type Conversion

int / uint / float / string / bool

Type conversion functions.

Syntax:

int(field)    // Convert to signed integer
uint(field)   // Convert to unsigned integer
float(field)  // Convert to floating-point
string(field) // Convert to string
bool(field)   // Convert to boolean

Examples:

// Convert string to numeric
L::logs:(int(response_time)) [1h] BY service

// Convert numeric to string for concatenation
M::metrics:(string(value)) [1h] BY metric_name

String Functions

substr

Extracts a substring.

Syntax:

substr(field, start)
substr(field, start, length)

Parameters:

  • field: A string field
  • start: Starting position (0-indexed, negative numbers count from the end)
  • length (optional): Length of substring

Return Value:

Returns the extracted substring.

Examples:

// Extract first 100 characters of a message
L::logs:(substr(message, 0, 100)) [1h]

// Extract last 10 characters
L::logs:(substr(message, -10)) [1h]

Result Example:

message substr(message, 0, 10) substr(message, -5)
"Error: connection timeout" "Error: con" "meout"

regexp_extract

Regular expression extraction.

Syntax:

regexp_extract(field, pattern)
regexp_extract(field, pattern, n)

Parameters:

  • field: A string field
  • pattern: Regular expression
  • n (optional): Extract the nth capture group, defaults to 0 (entire match)

Return Value:

Returns a single string, the content of the nth capture group. Returns null if no match.

Examples:

// Extract error code
L::logs:(regexp_extract(message, 'error_code: (\d+)', 1)) [1h] BY service

// Extract IP address
L::nginx:(regexp_extract(message, '(\d+\.\d+\.\d+\.\d+)', 1)) [1h]

Result Example:

service regexp_extract(message, 'error_code: (\d+)', 1)
api "404"
api "500"
web null

regexp_extract_all

Extracts all matching results.

Syntax:

regexp_extract_all(field, pattern)
regexp_extract_all(field, pattern, n)

Return Value:

Returns a string array containing all matching substrings.

Examples:

// Extract all numbers
L::logs:(regexp_extract_all(message, '\d+', 0)) [1h]

// Extract all IP addresses
L::logs:(regexp_extract_all(message, '\d+\.\d+\.\d+\.\d+', 0)) [1h]

Result Example:

message regexp_extract_all(message, '\d+\.\d+\.\d+\.\d+', 0)
Request from 192.168.1.1 to 10.0.0.1 ["192.168.1.1", "10.0.0.1"]

md5

Calculates the MD5 hash.

Syntax:

md5(field)

Return Value:

Returns a 32-character hexadecimal string (lowercase).

Examples:

// Calculate MD5 of message
L::logs:(md5(message)) [1h] BY service

Result Example:

service md5(message)
api 5d41402abc4b2a76b9719d911017c592
web 098f6bcd4621d373cade4e832627b4f6

concat

String concatenation.

Syntax:

concat(field, ...)

Return Value:

Returns a single concatenated string.

Examples:

// Concatenate multiple fields
L::logs:(concat(service, ":", level)) [1h]

// Results: "api:error", "web:info", etc.

Result Example:

service level concat(service, ":", level)
api error "api:error"
web info "web:info"

set

Deduplicates and sorts an array field.

Syntax:

set(field)

Return Value:

Returns a deduplicated and sorted array.

Examples:

// Get all distinct tags
M::metrics:(set(tags)) [1h] BY metric_name

// Deduplicate collect results
M::http:(set(collect(status))) [1h] BY endpoint

Result Example:

metric_name set(tags)
cpu_usage ["env:prod", "host:A", "team:backend"]
memory_used ["env:staging", "host:B", "team:frontend"]

Log Clustering

drain

Uses the Drain algorithm to cluster logs and generate log templates.

Syntax:

drain(field, similarity_threshold, max_clusters)

Parameters:

  • field: A string field (usually message)
  • similarity_threshold: Similarity threshold, range (0, 1], higher values mean stricter clustering
  • max_clusters: Maximum number of clusters, range [1, 10000]

Algorithm Notes:

Drain is a log clustering algorithm based on a parse tree, capable of automatically identifying constant and variable parts in logs to generate log templates.
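Drain itself builds a parse tree; as a toy illustration of the underlying idea only — measuring token-level similarity between logs and masking the variable tokens with <*> — consider this sketch (hypothetical helper names, not the actual algorithm):

```python
def similarity(a_tokens, b_tokens):
    """Fraction of positions with identical tokens (same-length logs only)."""
    same = sum(x == y for x, y in zip(a_tokens, b_tokens))
    return same / len(a_tokens)

def merge_template(a, b, threshold):
    """Merge two same-length log lines into a template when they are
    similar enough, replacing differing tokens with <*>. A toy sketch
    of the idea behind Drain, not its parse-tree implementation."""
    ta, tb = a.split(), b.split()
    if len(ta) != len(tb) or similarity(ta, tb) < threshold:
        return None
    return " ".join(x if x == y else "<*>" for x, y in zip(ta, tb))

t = merge_template(
    "Request from 10.0.0.1 took 120ms",
    "Request from 10.0.0.9 took 87ms",
    0.5,
)
print(t)  # Request from <*> took <*>
```

A higher similarity_threshold makes the merge condition harder to satisfy, which is why stricter thresholds produce more, smaller clusters.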

Use Cases:

  • Log pattern recognition
  • Anomaly log clustering analysis
  • Log noise reduction

Examples:

// Cluster logs, similarity 0.7, up to 1000 clusters
L::logs:(drain(message, 0.7, 1000)) [1h] BY service

// Stricter clustering
L::logs:(drain(message, 0.9, 500)) [1h] BY service

Result Example:

service drain(message, 0.7, 1000)
api "Request from <*> to <*> took <*>"
db "Query executed in <*>"
Note

Returns a string-type log template, where <*> indicates variable parts.


Matching Functions

Matching functions are used for text matching in WHERE clauses and can also be used as expressions returning boolean values.

Substring Matching

match

Checks if a field contains a specified substring.

Syntax:

match(pattern)
match(field, pattern)

Parameters:

  • pattern: Substring to match
  • field: Field name (optional, can be omitted in WHERE clause)

Examples:

// Use in WHERE
L::logs:(message) {match(message, "error")} [1h]

// Shorthand form
L::logs:(message) {match("error")} [1h]

// As an expression
L::logs:(match(message, "timeout")) [1h] BY match_result

Phrase Matching

phrase / search

Tokenized phrase matching; supports mixed Chinese and English text.

Syntax:

phrase(query)
phrase(field, query)
search(query)
search(field, query)

Parameters:

  • query: Query phrase
  • field: Field name (optional)

Matching Rules:

  • Chinese: Tokenized by character for matching
  • English: Tokenized by word boundaries (space, punctuation)
  • Case-insensitive

Examples:

// Match logs containing "connection timeout"
L::logs:(message) {phrase("connection timeout")} [1h]

// Chinese matching
L::logs:(message) {phrase("连接超时")} [1h]

// Mixed Chinese and English
L::logs:(message) {search("error 错误")} [1h]

Regular Expression Matching

re / regex / regexp

Regular expression matching.

Syntax:

re(pattern)
re(field, pattern)
regex(pattern)
regex(field, pattern)
regexp(pattern)
regexp(field, pattern)

Parameters:

  • pattern: Regular expression (supports PromRegex syntax)
  • field: Field name (optional)

Examples:

// Match logs starting with error
L::logs:(message) {re("error.*")} [1h]

// Match error codes of specific format
L::logs:(message) {regexp(message, "ERR-\d{4}")} [1h]

// Use regex in data source
M::re('cpu.*'):(usage) [1h]

Wildcard Matching

wildcard

Wildcard pattern matching.

Syntax:

wildcard(pattern)
wildcard(field, pattern)

Parameters:

  • pattern: Wildcard pattern

    • *: Matches any number of characters
    • ?: Matches a single character
  • field: Field name (optional)
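The * and ? semantics above mirror shell-style globbing, so Python's standard fnmatch module can illustrate which values a pattern accepts (an illustration of the matching rules only; DQL evaluates wildcard server-side):

```python
from fnmatch import fnmatchcase

# * matches any number of characters, ? matches exactly one:
print(fnmatchcase("error: disk full", "error*"))    # True
print(fnmatchcase("ERR-1042", "ERR-????"))          # True
print(fnmatchcase("ERR-042", "ERR-????"))           # False (only 3 chars after the dash)
```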

Examples:

// Match messages starting with error
L::logs:(message) {wildcard("error*")} [1h]

// Match specific format
L::logs:(message) {wildcard(message, "ERR-????")} [1h]

CIDR Matching

cidr

IP address CIDR range matching.

Syntax:

cidr(cidr)
cidr(field, cidr)

Parameters:

  • cidr: Network segment in CIDR notation, e.g., 192.168.1.0/24
  • field: IP address field (optional)
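The matching semantics can be illustrated with Python's standard ipaddress module (an illustration only; DQL evaluates cidr server-side):

```python
import ipaddress

def cidr_match(ip: str, cidr: str) -> bool:
    """True when the address falls inside the CIDR network."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)

print(cidr_match("10.1.2.3", "10.0.0.0/8"))         # True: inside the /8
print(cidr_match("192.168.2.7", "192.168.1.0/24"))  # False: different /24
```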

Examples:

// Match internal IPs
L::nginx:(*) {cidr(client_ip, "10.0.0.0/8")} [1h]

// Match specific network segment
L::nginx:(*) {cidr(client_ip, "192.168.1.0/24")} [1h]

Field Existence Check

exists

Checks if a field exists.

Syntax:

exists(field)

Examples:

// Find logs that have an error_type field
L::logs:(message) {exists(error_type)} [1h]

Query String Syntax

query_string

Uses query string syntax for complex matching.

Syntax:

query_string(query)
query_string(field, query)

Parameters:

  • query: Query string
  • field: Field name (optional, defaults to full-text matching)

Query String Syntax:

1. Term Matching

foo              # Match content containing foo
"foo bar"        # Exact phrase match, terms must appear consecutively
foo\ bar         # Escape the space, match "foo bar" as a whole

2. Wildcards

foo*             # Match content starting with foo
foo?bar          # ? matches a single character
"foo*bar"        # Wildcards inside quotes are not parsed, matched literally

3. Regular Expressions

/foo.*bar/       # Regular expression enclosed in slashes
/joh?n(ath[oa]n)/  # Complex regex

4. Boolean Operators

foo AND bar      # Logical AND, both terms must be present
foo OR bar       # Logical OR, at least one term must be present
NOT foo          # Logical NOT, must not contain foo

# Shorthand forms
foo && bar       # Equivalent to foo AND bar
foo || bar       # Equivalent to foo OR bar
!foo             # Equivalent to NOT foo

5. Grouping

(foo OR bar) AND baz       # Use parentheses to change precedence
!(status 429 reading)      # Negate the entire expression

6. Default Operator

When multiple terms are separated by spaces, they are joined with OR by default (configurable to AND):

foo bar          # Equivalent to foo OR bar

Examples:

// Simple term matching
L::logs:(message) {query_string("error timeout")} [1h]

// Boolean combination
L::logs:(message) {query_string("error AND NOT timeout")} [1h]

// Regular expression
L::logs:(message) {query_string("/ERR-\d{4}/")} [1h]

// Complex query
L::logs:(message) {query_string("(error OR warn) AND service")} [1h]

// Specify field
L::logs:(*) {query_string(message, "error AND timeout")} [1h]

// Chinese query
L::logs:(message) {query_string("错误 AND 超时")} [1h]

Outer Functions

Usage Recommendation

Outer functions are a legacy design. For scenarios that Rollup + aggregation functions can handle (e.g., [rate], [last], [increase]), prefer the Rollup approach. Use outer functions only where Rollup does not apply (e.g., secondary calculation on already-aggregated results).

Outer functions operate on the entire DQL query result, performing secondary calculation on the time series the query outputs. They wrap the whole DQL expression rather than appearing inside the Select clause.

In-Query Functions vs. Outer Functions

  • In-Query Functions: Used inside DQL expressions, e.g., sum, avg, max, etc.
  • Outer Functions: Wrap the entire DQL query result, performing post-processing on the output time series.

Writing Comparison:

// Rollup (time clause): First calculate growth rate for each time series, then aggregate
M::http_requests:(sum(request_count)) [rate] BY service

// Outer function: First get query result, then perform secondary calculation
rate(M::http_requests:(sum(request_count)) [1h::1m] BY service)

Syntax:

outer_function(DQL_expression)

Examples:

// In-query function: Average raw data
M::cpu:(avg(usage)) [1h::5m] BY host

// Outer function: Calculate moving average on query result
moving_average(M::cpu:(avg(usage)) [1h::5m] BY host, 5)

Cumulative Calculation

cumsum

Calculates the cumulative sum, computing the sum of all previous points for each point in the time series.

Syntax:

cumsum(DQL_expression)

Examples:

// Calculate cumulative request count
cumsum(M::requests:(sum(count)) [1h::5m] BY service)

Difference and Derivative

Recommended Usage

Prefer the Rollup form of these functions (e.g., [rate], [deriv]); use the outer function form only for secondary calculation on aggregated results.

The following functions can be used as outer functions:

Function Description
derivative(DQL) Calculate derivative (rate of change)
difference(DQL) Calculate difference from previous value
non_negative_derivative(DQL) Calculate non-negative derivative
non_negative_difference(DQL) Calculate non-negative difference
rate(DQL) Calculate growth rate (per second)
irate(DQL) Calculate instantaneous growth rate

Examples:

// Outer function: Calculate derivative on query result
derivative(M::cpu:(avg(usage)) [1h::5m] BY host)

// Recommended: Use Rollup approach
M::cpu:(deriv(usage)) [1h::5m:last] BY host

Moving Calculation

moving_average

Calculates the moving average on query results.

Recommended Usage

Prefer the Rollup form moving_average(field, n); use the outer function form only for secondary smoothing of aggregated results.

Syntax:

moving_average(DQL_expression, n)

Parameters:

  • DQL_expression: DQL query expression
  • n: Window size (number of data points)

Examples:

// Outer function: Calculate moving average on query result
moving_average(M::cpu:(avg(usage)) [1h::1m] BY host, 5)

// Recommended: Use Rollup approach
M::cpu:(moving_average(usage, 5)) [1h::1m] BY host

TopN

top / bottom

Recommended Usage

Prefer the Rollup form top(field, n) or bottom(field, n); use the outer function form only for secondary filtering of aggregated results.

Gets TopN or BottomN from query results.

Syntax:

top(DQL_expression, n)
bottom(DQL_expression, n)

Examples:

// Outer function: Get TopN from query result
top(M::response_time:(max(duration)) [1h::5m] BY service, 5)

// Recommended: Use Rollup approach
M::response_time:(top(duration, 5)) [1h::5m] BY service

Null Value Filling

fill

Fills null values in query results.

For a detailed explanation, see the fill function section.

Examples:

// Fill nulls with 0
fill(M::cpu:(avg(usage)) [1h::5m] BY host, 0)

// Fill with linear interpolation
fill(M::cpu:(avg(usage)) [1h::5m] BY host, LINEAR)

Other Outer Functions

The following functions can also be used as outer functions:

Function Description
abs(DQL) Absolute value
round(DQL) Round
ceil(DQL) Round up
floor(DQL) Round down
log(DQL) / log2(DQL) / log10(DQL) Logarithmic transformation
set(DQL) Deduplicate and sort
concat(DQL, ...) String concatenation

Combined Usage

Outer functions can be combined:

// Calculate moving average then round
round(moving_average(M::cpu:(avg(usage)) [1h::1m] BY host, 5))

// Calculate moving average of growth rate
moving_average(rate(M::requests:(sum(count)) [1h::5m] BY service), 3)

eval Expression Calculation

eval is a special function that allows expression calculation outside the query, referencing results from multiple subqueries for combined operations.

Syntax

eval(expression, name1=query1, name2=query2, ..., alias="result_name")

Parameters:

  • expression: Mathematical expression, referencing subquery results using name.field
  • name=query: Named subquery
  • alias: Result alias (optional)

How It Works

  1. Execute all named subqueries
  2. Align results of each subquery by time
  3. Calculate the expression for each time point
  4. Return the calculation result

Use Cases

  • Calculating ratios of multiple metrics (e.g., error rate, utilization)
  • Comparing metrics across different time periods
  • Combining calculation results from multiple data sources

Examples

Calculate Error Rate

// Error rate = error count / total requests * 100
eval(a / b * 100,
     a=M::http:(sum(error_count)) [1h] BY service,
     b=M::http:(sum(request_count)) [1h] BY service,
     alias="error_rate")

Calculate Memory Usage

// Usage = used / total * 100
eval(used / total * 100,
     used=M::memory:(sum(used_bytes)) [1h] BY host,
     total=M::memory:(sum(total_bytes)) [1h] BY host,
     alias="memory_usage_percent")

Calculate Week-over-Week Growth

// Compare this week with last week
eval(this_week / last_week - 1,
     this_week=M::sales:(sum(amount)) [7d],
     last_week=M::sales:(sum(amount)) [7d offset 7d],
     alias="week_over_week_growth")

Reference Subquery Fields

// Reference specific fields of subqueries
eval(a.usage / b.total * 100,
     a=M::cpu:(avg(usage)) [1h] BY host,
     b=M::cpu:(avg(total)) [1h] BY host,
     alias="cpu_percent")

Notes

  • Time windows for all subqueries must be compatible
  • Grouping dimensions of subqueries should be consistent
  • Field names referenced in expressions use name.field format
  • If there is only one subquery, field names can be used directly
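As an illustrative sketch of the last point, a single subquery can have its field referenced without the name. prefix (the measurement, field, and alias here are hypothetical):

// Single subquery: reference the field directly, no "a." prefix needed
eval(usage * 100,
     a=M::cpu:(avg(usage)) [1h] BY host,
     alias="cpu_percent_scaled")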

Other Functions

fill

Fills specified values for null values in query results.

Recommended Usage

fill should preferably be used as an outer function, operating on the entire query result:

fill(M::cpu:(avg(usage)) [1h::5m] BY host, 0)

Although fill(avg(usage), 0) is also supported inside the Select clause, fill actually fills the result after aggregation completes, so the outer function form better matches how it works.

Syntax (Outer Function):

fill(DQL_expression, value)
fill(DQL_expression, LINEAR)
fill(DQL_expression, PREVIOUS)

Parameters:

  • DQL_expression: DQL query expression
  • value: Fill value, supports multiple modes:
  • Specific value: numeric, string, null
  • LINEAR: Linear interpolation
  • PREVIOUS: Fill with previous non-null value

Examples:

// Recommended: Use as outer function
fill(M::cpu:(avg(usage)) [1h::5m] BY host, 0)

// Fill with linear interpolation
fill(M::cpu:(avg(usage)) [1h::5m] BY host, LINEAR)

// Fill with previous value
fill(M::cpu:(avg(usage)) [1h::5m] BY host, PREVIOUS)

now

Returns the current timestamp (milliseconds).

Syntax:

now()

Examples:

// Query recently updated data
O::HOST:(*) {last_update_time > now() - 600000}  // Updated within 10 minutes

unwrap

Unwraps the packaging of aggregated results.

Syntax:

unwrap(field)

Examples:

// Unwrap aggregated field
M::cpu:(unwrap(usage)) [1h] BY host

Show Functions

Show functions are used to view metadata (e.g., measurements, tag keys, field keys, cardinality, and series counts), and are commonly used for data modeling, troubleshooting, and pre-query exploration.

General Syntax

show_xxx(arg1=..., arg2=...){ where_conditions } [time_window] LIMIT n OFFSET m

  • where, time_window, LIMIT, and OFFSET are all optional;
  • LIMIT/OFFSET cannot be negative.
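A sketch combining the optional clauses (the measurement name and limits here are hypothetical):

// Restrict to one measurement, scan the last hour, paginate the result
show_tag_key(from=['cpu']) [1h] LIMIT 10 OFFSET 0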

M Namespace Built-in Show Functions

Function Parameters Return Columns Description
show_measurement Optional re('pattern') name List measurements
show_tag_key Optional from=['measurement'] tagKey List tag keys
show_field_key Optional from=['measurement'] fieldKey, fieldType List field keys (current fieldType is float)
show_tag_value keyin=['tagKey'] (required), from=['measurement'] optional key, value List tag values
show_measurement_cardinality No mandatory parameters count Number of measurements
show_series_cardinality No mandatory parameters count Series cardinality (estimated)
show_tag_key_cardinality No mandatory parameters count Tag key cardinality (estimated)
show_tag_value_cardinality keyin=['tagKey'] (required) count Value cardinality for specified tag key (estimated)
show_field_key_cardinality No mandatory parameters count Field key cardinality (estimated)
show_series_count_by_field_key from=['measurement'] (recommended) name, count Count series by field key
show_series_count_by_tag_key from=['measurement'] (recommended) name, count, value_count Count series and values by tag key
show_series_count_by_tag_value keyin=['tagKey'] (required), from=['measurement'] (recommended) name, count Count series by value of specified tag key

Cardinality-related functions use HyperLogLog merging under the hood and return estimated values.

Non-M Namespace Show Functions (Suffix Pattern)

For non-M namespaces, the following suffix patterns are supported:

  • show_<namespace>_source
  • show_<namespace>_class
  • show_<namespace>_type
  • show_<namespace>_field
  • show_<namespace>_label

Where <namespace> is automatically mapped from the middle segment of the function name, for example:

  • show_logging_source -> L
  • show_tracing_field -> T
  • show_object_source -> O

Common examples:

show_logging_source()
show_tracing_field('mysql')
show_logging_field('*')
show_logging_label(name='env')
show_logging_label(names=['env', 'team'])

Parameter and Behavior Notes

  • from: List of measurements, supports string or string array;
  • keyin: List of tag keys, supports string or string array;
  • field: List of fields, supports string or string array (for metric show field filtering);
  • For show_*_field:

    • Unnamed parameters (e.g., 'mysql') are usually treated as source filters
    • '*' is equivalent to not specifying a source
    • Named parameters are converted to where filter conditions
  • For show_*_label:

    • Requires named parameters; names is treated as an alias for name.

Constraints and Notes

  • show_tag_value and show_tag_value_cardinality must provide keyin;
  • show_series_count_by_tag_value must provide keyin;
  • show_series_count_by_* must provide from, or equivalent source constraints in where (e.g., @__source__ condition);
  • show_<namespace>_source, show_<namespace>_class, show_<namespace>_type currently share the same execution path, returning deduplicated source list;
  • The parser does not currently support the show_<namespace>_index syntax (even though the execution layer has corresponding branches);
  • When the Query API does not provide a show time range, some log show queries fall back to the last 30-minute window.
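As a sketch of the from requirement for show_series_count_by_* (the measurement name is hypothetical; the @__source__ form is the equivalent where constraint mentioned above):

// Explicit from parameter
show_series_count_by_field_key(from=['cpu'])

// Equivalent source constraint expressed in where
show_series_count_by_field_key() {@__source__ = 'cpu'}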

Return Examples

The following examples only show typical column structure and sample rows; actual results are affected by tenant data, filter conditions, time range, and LIMIT/OFFSET.

Query Typical Columns Sample Rows (Illustrative)
show_measurement() name cpu, disk, memory
show_tag_value(from=['cpu'], keyin=['host']) key, value host, web-01; host, web-02
show_series_count_by_tag_key(from=['cpu']) name, count, value_count host, 3200, 120; service, 2800, 35
show_tag_value_cardinality(keyin=['host']) count 120
show_logging_field('*') fieldKey, fieldType, fieldIndices service, keyword, ["idx_service"]
show_logging_source() source nginx, mysql, redis

Function Classification Quick Reference

Basic Aggregation

Function Description Exact/Estimated
sum Sum Exact
avg Average Exact
count Count Exact
count_distinct Distinct count Estimated (HyperLogLog, error≈0.4%)
min / max Minimum/Maximum Exact
first / last First/Last value Exact
any Any value Exact

Statistical Aggregation

Function Description Exact/Estimated
percentile / pXX Percentile Estimated (log histogram)
median Median Estimated
stddev Standard deviation Exact
mode Mode Exact
spread Range Exact
count_series Time series count Exact

Filter Aggregation

Function Description Exact/Estimated
top / bottom TopN / BottomN Exact
count_filter Conditional count Exact

Histogram Functions

Function Description Data Source Exact/Estimated
histogram_auto Automatic histogram (Recommended) Log, Trace detail data Estimated
histogram Fixed-boundary histogram (Deprecated) Log, Trace detail data Exact
histogram_quantile Calculate quantile from Prometheus histogram Prometheus metrics Estimated

Set Functions

Function Description Exact/Estimated
distinct Distinct value list Exact
distinct_by_collapse Collapse deduplication (retain other fields) Exact
collect Collect all values Exact
collect_distinct Collect distinct values Exact

Helper Functions

Function Description Exact/Estimated
default Set default value Exact

Time Series Functions

Function Description
rate Growth rate (per second)
irate Instantaneous growth rate
increase Increase amount
derivative Derivative
difference Difference
non_negative_derivative Non-negative derivative
non_negative_difference Non-negative difference
moving_average Moving average
cumsum Cumulative sum

Rollup Functions

Function Description
rate Calculate growth rate (per second)
irate Calculate instantaneous growth rate
increase Calculate increase amount
rate_over_sum Calculate average per second
rate_over_count Calculate count per second
deriv Calculate derivative
difference Calculate difference
sum Sum
avg Average
min Minimum
max Maximum
count Count
first First value
last Last value
stddev Standard deviation
mode Mode
spread Range
any Any value

Transformation Functions

Function Description
abs Absolute value
round / ceil / floor Rounding
log / log2 / log10 Logarithm
int / uint / float / string / bool Type conversion
substr Substring
regexp_extract Regex extraction
regexp_extract_all Regex extract all
md5 MD5 hash
concat String concatenation
set Array deduplicate and sort
drain Log clustering

Matching Functions

Function Description
match Substring match
phrase / search Tokenized phrase match
re / regex / regexp Regex match
wildcard Wildcard match
cidr CIDR range match
query_string Query string syntax
exists Field existence check

Outer Functions (Prioritize Rollup)

Function Description
cumsum Cumulative sum
rate / irate Growth rate (non-negative)
deriv Derivative (allows negative)
increase Increase amount (non-negative)
difference Difference (allows negative)
moving_average Moving average
top / bottom TopN
fill Null fill (recommended as outer)
abs / round / ceil / floor Mathematical operations
set Deduplicate and sort
concat String concatenation

Expression Calculation

Function Description
eval Multi-query expression calculation
