
Service Data Source Definitions and DQL Queries

Service Performance Data Source Definitions

The TM index space stores data related to service lists and performance Metrics. The data displayed on APM > Performance Metrics is queried primarily from this index space. To improve query efficiency, TM aggregates each service's Metrics data at three granularities: minute, hour, and day.

For example, to query all service Metrics data for the 15-minute period from 2024-03-19 15:00:00 to 2024-03-19 15:15:00 (the millisecond timestamps in these examples correspond to UTC+8), you can use the DQL:

TM::`*`:(){source="service_list_1m"} [1710831600000:1710832500000]

This will return results similar to the following:

Query Result Example
[
  {
    "time": 1710835681000,
    "time_us": 1710835681000000,
    "__docid": "T_cnskg71jdosvib6m44s0",
    "__source": "service_list_1m",
    "source": "service_list_1m",
    "__namespace": "tracing",
    "r_env": "demo",
    "r_error_count": 0,
    "r_max_duration": 2293857,
    "r_psketch": "Av1KgVq/UvA/AAAAAAAAAAAJAbgL",
    "r_request_count": 1,
    "r_resp_time": 2293857,
    "r_service": "go-profiling-demo-1",
    "r_service_sub": "go-profiling-demo-1:demo:v0.8.888",
    "r_type": "custom",
    "r_version": "v0.8.888",
    "create_time": 1710835740447,
    "date": 1710835681000,
    "date_ns": 0
  },
  {
    "time": 1710835201000,
    "time_us": 1710835201000000,
    "__docid": "T_cnskcf9jdosvib6jl4kg",
    "__source": "service_list_1m",
    "source": "service_list_1m",
    "__namespace": "tracing",
    "r_env": "demo",
    "r_error_count": 0,
    "r_max_duration": 2370648,
    "r_psketch": "Av1KgVq/UvA/AAAAAAAAAAAJAboL",
    "r_request_count": 1,
    "r_resp_time": 2370648,
    "r_service": "go-profiling-demo-1",
    "r_service_sub": "go-profiling-demo-1:demo:v0.8.888",
    "r_type": "custom",
    "r_version": "v0.8.888",
    "create_time": 1710835261477,
    "date": 1710835201000,
    "date_ns": 0
  }
]

The main fields are described below:

Field Type Description
source string Data aggregation granularity, one of:
  • Per minute (source="service_list_1m")
  • Per hour (source="service_list_1h")
  • Per day (source="service_list_1d")
r_env string Service deployment environment
r_error_count int Number of service errors
r_max_duration int Maximum response time within the time granularity, unit: microseconds
r_request_count int Number of requests
r_resp_time int Sum of response times aggregated within the time granularity
r_service string Service name
r_service_sub string Service name, environment, and version joined by colons, e.g., go-profiling-demo-1:demo:v0.8.888
r_type string Service type, e.g., http/web/db/gateway...
r_version string Service version
date int Millisecond timestamp, corresponding to:
  • Zero seconds of each minute (hh:mm:00, when source="service_list_1m")
  • Zero minutes of each hour (hh:00, when source="service_list_1h")
  • Midnight of each day (00:00:00, when source="service_list_1d")

Similarly, to query data for the two-hour period from 2024-03-19 15:00:00 to 2024-03-19 17:00:00, you can use the DQL:

    TM::`*`:(){source="service_list_1h"} [1710831600000:1710838800000]
    

To query data for the two-day period from 2024-03-19 00:00:00 to 2024-03-21 00:00:00, you can use the DQL:

    TM::`*`:(){source="service_list_1d"} [1710777600000:1710950400000]
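
The millisecond timestamps in the queries above correspond to wall-clock times in UTC+8 (for instance, 1710831600000 is 2024-03-19 15:00:00 UTC+8). A minimal Python sketch for building such range queries is shown here. The helper names `to_ms` and `range_query` are hypothetical and not part of DQL, and the UTC+8 offset is an assumption inferred from the example values:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the examples use UTC+8 wall-clock time.
TZ = timezone(timedelta(hours=8))

def to_ms(text: str) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS' (UTC+8) to a millisecond Unix timestamp."""
    dt = datetime.strptime(text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=TZ)
    return int(dt.timestamp()) * 1000

def range_query(source: str, start: str, end: str) -> str:
    """Build a TM range query string for one aggregation source (hypothetical helper)."""
    return f'TM::`*`:(){{source="{source}"}} [{to_ms(start)}:{to_ms(end)}]'

print(range_query("service_list_1h", "2024-03-19 15:00:00", "2024-03-19 17:00:00"))
```

Run as written, this prints the two-hour query shown earlier.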
    

Different time granularities can be combined. For example, to query data for the two-and-a-half-hour period from 2024-03-19 15:00:00 to 2024-03-19 17:30:00, you can use the DQL:

    TM::`*`:(){ (source="service_list_1h" and date >= 1710831600000 and date < 1710838800000) or (source="service_list_1m" and date >= 1710838800000 and date <= 1710840600000) }
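
The split between hourly and per-minute data in the query above can be computed programmatically. The following is a sketch only: `combined_filter` is a hypothetical helper, it assumes the start of the range is aligned to an hour boundary, and it keeps the inclusive upper bound (`date <=`) used in the example:

```python
HOUR_MS = 3_600_000  # one hour in milliseconds

def combined_filter(start_ms: int, end_ms: int) -> str:
    """Cover whole hours with service_list_1h and the remainder with service_list_1m.

    Assumes start_ms falls exactly on an hour boundary, as in the example above.
    """
    if start_ms % HOUR_MS != 0:
        raise ValueError("start_ms must be aligned to the hour")
    if end_ms <= start_ms:
        raise ValueError("end_ms must be later than start_ms")
    # Last hour boundary that is not after end_ms.
    split_ms = end_ms - (end_ms - start_ms) % HOUR_MS
    parts = []
    if split_ms > start_ms:
        parts.append(
            f'(source="service_list_1h" and date >= {start_ms} and date < {split_ms})'
        )
    if end_ms > split_ms:
        parts.append(
            f'(source="service_list_1m" and date >= {split_ms} and date <= {end_ms})'
        )
    return "TM::`*`:(){ " + " or ".join(parts) + " }"

print(combined_filter(1710831600000, 1710840600000))
```

For the range 1710831600000 to 1710840600000, the helper places the split at 1710838800000 (17:00:00) and reproduces the query above.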
    
Note: You could also query the entire range from 2024-03-19 15:00:00 to 2024-03-19 17:30:00 using only the minute granularity (source="service_list_1m"), even when the range spans hours or days. However, this significantly reduces query efficiency and greatly increases the amount of data returned (up to 60 one-minute records in place of each hourly record), so it is strongly discouraged.

You can post-process the query results to calculate service-level Metrics, for example:

    total_count = SUM(r_request_count)
    error_count = SUM(r_error_count)
    error_rate = SUM(r_error_count) / SUM(r_request_count)
    max_duration = MAX(r_max_duration)
    sum_resp_time = SUM(r_resp_time)
    avg_per_second = SUM(r_request_count) / <query time range in seconds>
    avg_resp_time = SUM(r_resp_time) / SUM(r_request_count)
    p50 (approximate): build an array in which each record contributes its average response time (r_resp_time / r_request_count), repeated r_request_count times; sort the array and take the element at index SUM(r_request_count) * 0.5
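
As an illustration, the formulas above can be applied on the client side to rows returned by a TM query. This is a minimal Python sketch, assuming the rows are available as a list of dictionaries with the field names shown in the result example; `service_metrics` is a hypothetical helper, not part of any SDK:

```python
def service_metrics(rows, range_seconds):
    """Compute service-level metrics from TM result rows (a list of dicts)."""
    total = sum(r["r_request_count"] for r in rows)
    errors = sum(r["r_error_count"] for r in rows)
    resp_sum = sum(r["r_resp_time"] for r in rows)

    # p50 approximation: each row contributes its mean response time,
    # repeated once per request; the sorted list is then indexed at 50%.
    samples = sorted(
        r["r_resp_time"] / r["r_request_count"]
        for r in rows
        if r["r_request_count"] > 0
        for _ in range(r["r_request_count"])
    )
    p50 = samples[min(int(len(samples) * 0.5), len(samples) - 1)] if samples else None

    return {
        "total_count": total,
        "error_count": errors,
        "error_rate": errors / total if total else None,
        "max_duration": max((r["r_max_duration"] for r in rows), default=None),
        "sum_resp_time": resp_sum,
        "avg_per_second": total / range_seconds,
        "avg_resp_time": resp_sum / total if total else None,
        "p50": p50,
    }
```

Expanding one sample per request is the literal reading of the p50 recipe and can use a lot of memory for high-traffic services; a weighted selection over the sorted per-row averages would give the same element without the expansion.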
    

For example, to query the QPS (queries per second) for each service over a period of time:

    TM::`*`:(r_service, sum(r_request_count) / (1737099000000 - 1737093600000) * 1000 as QPS){ (source="service_list_1h" and date >= 1737093600000 and date < 1737097200000) or (source="service_list_1m" and date >= 1737097200000 and date <= 1737099000000) } by r_service
    

Service Topology Data Source Definitions

The TSM index space primarily stores data on the calling relationships between services, pre-aggregated at minute granularity. For example, to query all service call relationships for the 15-minute period from 2024-03-19 15:00:00 to 2024-03-19 15:15:00, you can use the DQL:

    TSM::`*`:(){} [1710831600000:1710832500000]
    

This returns query results similar to the following:

Query Result Example
    [
        {
            "time": 1710835700438,
            "time_us": 1710835700438064,
            "__docid": "6340252d-331c-6e1dd9338-0ed04e818c4d",
            "__source": "relationship",
            "source_service": "go-profiling-demo-1",
            "source_wsuuid": "wksp_8d351d83bdf14b8b8270ab75fe29a990",
            "source_env": "demo",
            "source_project": "",
            "source_version": "v0.8.888",
            "source_type": "custom",
            "source_organization": "",
            "source_status": "ok",
            "source_start": 1710835700059433,
            "source_duration": 220210272,
            "target_service": "go-profiling-demo-2",
            "target_wsuuid": "wksp_8d351d83bdf14b8b8270ab75fe29a990",
            "target_env": "demo",
            "target_project": "",
            "target_version": "v0.8.888",
            "target_type": "custom",
            "target_organization": "",
            "target_status": "ok",
            "target_start": 1710835700438064,
            "target_duration": 886040,
            "count": 96,
            "unique_id": "XTzHH-jScNjXgBSXNIdFcSOVpHwWKyAZroh71ttyPnXK9nl3jW0re0hlKOeHj6PYgo-profiling-demo-1go-profiling-demo-2",
            "unique_id_env_version": "Zp-9KKvEb4m9aU0OeUMon8MiH2isxqXU742YFlVtokgL2Vy73NwtykkG3vA3X0z1go-profiling-demo-1go-profiling-demo-2",
            "error_count": 0
        },
        {
            "time": 1710835243699,
            "time_us": 1710835243699376,
            "__docid": "f4c7de67-ca1b-6e17fd78e-e8bcb4d1d64d",
            "__source": "relationship",
            "source_service": "go-profiling-demo-1",
            "source_wsuuid": "wksp_8d351d83bdf14b8b8270ab75fe29a990",
            "source_env": "demo",
            "source_project": "",
            "source_version": "v0.8.888",
            "source_type": "custom",
            "source_organization": "",
            "source_status": "ok",
            "source_start": 1710835243379392,
            "source_duration": 227582208,
            "target_service": "go-profiling-demo-2",
            "target_wsuuid": "wksp_8d351d83bdf14b8b8270ab75fe29a990",
            "target_env": "demo",
            "target_project": "",
            "target_version": "v0.8.888",
            "target_type": "custom",
            "target_organization": "",
            "target_status": "ok",
            "target_start": 1710835243699376,
            "target_duration": 856217,
            "count": 96,
            "unique_id": "XTzHH-jScNjXgBSXNIdFcSOVpHwWKyAZroh71ttyPnXK9nl3jW0re0hlKOeHj6PYgo-profiling-demo-1go-profiling-demo-2",
            "unique_id_env_version": "Zp-9KKvEb4m9aU0OeUMon8MiH2isxqXU742YFlVtokgL2Vy73NwtykkG3vA3X0z1go-profiling-demo-1go-profiling-demo-2",
            "error_count": 0
        }
    ]
    

The main fields are described below:

Field Type Description
time int Millisecond timestamp, time the service call occurred
time_us int Time the service call occurred, microsecond precision
source_service string Name of the calling service
source_wsuuid string Workspace ID from which the calling service reported
source_env string Deployment environment of the calling service
source_project string Project name of the calling service
source_version string Version of the calling service
source_type string Type of the calling service
source_organization string Organization to which the workspace reporting the calling service belongs
source_status string Status of the calling service, ok/error
source_start int Start time of the calling service's Span, microsecond timestamp
source_duration int Sum of calling service Span durations per minute, unit: microseconds
target_service string Name of the called service
target_wsuuid string Workspace ID from which the called service reported
target_env string Deployment environment of the called service
target_project string Project name of the called service
target_version string Version of the called service
target_type string Type of the called service
target_organization string Organization to which the workspace reporting the called service belongs
target_status string Status of the called service, ok/error
target_start int Start time of the called service's Span (time the service call occurred), microsecond timestamp
target_duration int Sum of called service Span durations per minute, unit: microseconds
count int Number of calls per minute
unique_id string Unique ID generated only from the calling service name and the called service name
unique_id_env_version string Unique ID generated from the service name, environment, and version of both the calling service and the called service
error_count int Number of failed calls per minute

To obtain the call relationships between services together with call-level Metrics, you can use the following DQL:

    TSM::`*`:(first(source_service) as source_service,
    first(source_wsuuid) as source_wsuuid,
    first(target_service) as target_service,
    first(target_wsuuid) as target_wsuuid,
    sum(count) as total_count,
    sum(error_count) as total_error_count,
    sum(target_duration) as total_duration
    ){} by unique_id
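
The same grouping can be reproduced on the client side when raw TSM rows are already at hand. The sketch below mirrors the DQL above in Python; `build_edges` is a hypothetical helper, and the derived `error_rate` and `avg_duration_us` values are illustrative additions that follow from the field definitions (sum of durations divided by number of calls), not fields returned by TSM:

```python
def build_edges(rows):
    """Group TSM rows by unique_id and sum call count, error count, and target duration."""
    edges = {}
    for r in rows:
        # The first row seen for a unique_id supplies the service names,
        # matching the first(...) aggregations in the DQL.
        edge = edges.setdefault(r["unique_id"], {
            "source_service": r["source_service"],
            "source_wsuuid": r["source_wsuuid"],
            "target_service": r["target_service"],
            "target_wsuuid": r["target_wsuuid"],
            "total_count": 0,
            "total_error_count": 0,
            "total_duration": 0,
        })
        edge["total_count"] += r["count"]
        edge["total_error_count"] += r["error_count"]
        edge["total_duration"] += r["target_duration"]

    for edge in edges.values():
        calls = edge["total_count"]
        # Derived values (illustrative): failure ratio and mean called-span duration.
        edge["error_rate"] = edge["total_error_count"] / calls if calls else None
        edge["avg_duration_us"] = edge["total_duration"] / calls if calls else None
    return edges
```

Each value in the returned dictionary describes one edge of the service topology, keyed by `unique_id`.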
    
