Skip to content

Network Dial Test



The collector collects the data of network dialing test results, and all the data generated by dialing test are reported to Guance Cloud.

Configuration

Environment Variables

By default, the dialtesting service can dial up any network, which may pose certain security risks. If you need to prohibit to connect to certain network, you can restrict it by setting the following environmental variables:

Environment variable name Parameter example Description
ENV_INPUT_DIALTESTING_DISABLE_INTERNAL_NETWORK_TASK true Enable or disable internal network dialing test. Default is false
ENV_INPUT_DIALTESTING_DISABLED_INTERNAL_NETWORK_CIDR_LIST ["192.168.0.0/16"] List of network CIDRs that prohibit testing, which supports multiple entries. If left empty, all private networks will be disabled.

Private Test Node Deployment

To deploy private dial-test nodes, you need to create private dial-test nodes on Guance Cloud page. When you're done, fill in the page with the relevant information in conf.d/network/dialtesting.conf:

Go to the conf.d/network directory under the DataKit installation directory, copy dialtesting.conf.sample and name it dialtesting.conf. Examples are as follows:

[[inputs.dialtesting]]
  # We can also configure a JSON path like "file:///your/dir/json-file-name"
  server = "https://dflux-dial.guance.com"

  # [require] node ID
  region_id = "default"

  # if server are dflux-dial.guance.com, ak/sk required
  ak = ""
  sk = ""

  # The interval to pull the tasks.
  pull_interval = "1m"

  # The timeout for the HTTP request.
  time_out = "30s"

  # The number of the workers.
  workers = 6

  # Collect related metric when job execution time error interval is larger than task_exec_time_interval
  task_exec_time_interval = "5s"

  # Stop the task when the task failed to send data to dataway over max_send_fail_count.
  max_send_fail_count = 16

  # The max number of jobs sending data to dataway in parallel. Default 10.
  max_job_number = 10

  # The max number of job chan. Default 1000.
  max_job_chan_number = 1000

  # Disable internal network task.
  disable_internal_network_task = true

  # Disable internal network cidr list.
  disabled_internal_network_cidr_list = []

  # Custom tags.
  [inputs.dialtesting.tags]
  # some_tag = "some_value"
  # more_tag = "some_other_value"
  # ...

Once configured, restart DataKit.

The collector can now be turned on by ConfigMap injection collector configuration.


Attention

Currently, only Linux dial-up nodes support, and the tracing data is stored in the traceroute field of the relevant metrics.

Dial Test Deployment Map

dialtesting-net-arch

Metric

Dialtesting collector could expose some Prometheus metrics. You can upload these metrics to Guance Cloud through Datakit collector. The relevant configuration is as follows:

[[inputs.dk]]
  ......

  metric_name_filter = [

  ### others...

  ### dialtesting
  "datakit_dialtesting_.*",

  ]

  ......

Log

All of the following data collections are appended with a global tag named host by default (the tag value is the host name of the DataKit), or can be named in the configuration by [[inputs.dialtesting.tags]] alternative host.

http_dial_testing

  • tag
Tag Description
city The name of the city
country The name of the country
datakit_version The DataKit version
dest_ip The IP address of the destination
internal The boolean value, true for domestic and false for overseas
isp ISP, such as chinamobile, chinaunicom, chinatelecom
method HTTP method, such as GET
name The name of the task
owner The owner name
proto The protocol of the HTTP, such as 'HTTP/1.1'
province The name of the province
status The status of the task, either 'OK' or 'FAIL'
status_code_class The class of the status code, such as '2xx'
status_code_string The status string, such as '200 OK'
url The URL of the endpoint to be monitored
  • metric list
Metric Description Type Unit
fail_reason The reason that leads to the failure of the task string -
message The message string which includes the header and the body of the request or the response string -
response_body_size The length of the body of the response int B
response_connection HTTP connection time float μs
response_dns HTTP DNS parsing time float μs
response_download HTTP downloading time float μs
response_ssl HTTP ssl handshake time float μs
response_time The time of the response int μs
response_ttfb HTTP response ttfb float μs
seq_number The sequence number of the test int count
status_code The response code int -
success The number to specify whether is successful, 1 for success, -1 for failure int -

tcp_dial_testing

  • tag
Tag Description
city The name of the city
country The name of the country
datakit_version The DataKit version
dest_host The name of the host to be monitored
dest_ip The IP address
dest_port The port of the TCP connection
internal The boolean value, true for domestic and false for overseas
isp ISP, such as chinamobile, chinaunicom, chinatelecom
name The name of the task
owner The owner name
proto The protocol of the task
province The name of the province
status The status of the task, either 'OK' or 'FAIL'
  • metric list
Metric Description Type Unit
fail_reason The reason that leads to the failure of the task string -
message The message string includes the response time or fail reason string -
response_time The time of the response int μs
response_time_with_dns The time of the response, which contains DNS time int μs
seq_number The sequence number of the test int count
success The number to specify whether is successful, 1 for success, -1 for failure int -
traceroute The json string fo the traceroute result string -

icmp_dial_testing

  • tag
Tag Description
city The name of the city
country The name of the country
datakit_version The DataKit version
dest_host The name of the host to be monitored
internal The boolean value, true for domestic and false for overseas
isp ISP, such as chinamobile, chinaunicom, chinatelecom
name The name of the task
owner The owner name
proto The protocol of the task
province The name of the province
status The status of the task, either 'OK' or 'FAIL'
  • metric list
Metric Description Type Unit
average_round_trip_time The average time of the round trip(RTT) float μs
average_round_trip_time_in_millis The average time of the round trip(RTT), deprecated float ms
fail_reason The reason that leads to the failure of the task string -
max_round_trip_time The maximum time of the round trip(RTT) float μs
max_round_trip_time_in_millis The maximum time of the round trip(RTT), deprecated float ms
message The message string includes the average time of the round trip or the failure reason string -
min_round_trip_time The minimum time of the round trip(RTT) float μs
min_round_trip_time_in_millis The minimum time of the round trip(RTT), deprecated float ms
packet_loss_percent The loss percent of the packets float -
packets_received The number of the packets received int count
packets_sent The number of the packets sent int count
seq_number The sequence number of the test int count
std_round_trip_time The standard deviation of the round trip float μs
std_round_trip_time_in_millis The standard deviation of the round trip, deprecated float ms
success The number to specify whether is successful, 1 for success, -1 for failure int -
traceroute The json string fo the traceroute result string -

websocket_dial_testing

  • tag
Tag Description
city The name of the city
country The name of the country
datakit_version The DataKit version
internal The boolean value, true for domestic and false for overseas
isp ISP, such as chinamobile, chinaunicom, chinatelecom
name The name of the task
owner The owner name
proto The protocol of the task
province The name of the province
status The status of the task, either 'OK' or 'FAIL'
url The URL string, such as ws://www.abc.com
  • metric list
Metric Description Type Unit
fail_reason The reason that leads to the failure of the task string -
message The message string includes the response time or the failure reason string -
response_message The message of the response string -
response_time The time of the response int μs
response_time_with_dns The time of the response, include DNS int μs
sent_message The sent message string -
seq_number The sequence number of the test int count
success The number to specify whether is successful, 1 for success, -1 for failure int -

traceroute Field Description

traceroute is the JSON text of the "route trace" data, and the entire data is an array object in which each array element records a route probe, as shown in the following example:

[
    {
        "total": 2,
        "failed": 0,
        "loss": 0,
        "avg_cost": 12700395,
        "min_cost": 11902041,
        "max_cost": 13498750,
        "std_cost": 1129043,
        "items": [
            {
                "ip": "10.8.9.1",
                "response_time": 13498750
            },
            {
                "ip": "10.8.9.1",
                "response_time": 11902041
            }
        ]
    },
    {
        "total": 2,
        "failed": 0,
        "loss": 0,
        "avg_cost": 13775021,
        "min_cost": 13740084,
        "max_cost": 13809959,
        "std_cost": 49409,
        "items": [
            {
                "ip": "10.12.168.218",
                "response_time": 13740084
            },
            {
                "ip": "10.12.168.218",
                "response_time": 13809959
            }
        ]
    }
]

Field description:

Field Type Description
total number Total number of detections
failed number Number of failures
loss number Percentage of failure
avg_cost number Average time spent (μs)
min_cost number Minimum time consumption (μs)
max_cost number Maximum time consumption(μs)
std_cost number Standard deviation of time consumption(μs)
items Array of items Per probe information (see)

Item

Field Type Description
ip string IP address, if it fails, the value is *
response_time number Response time (μs)

Feedback

Is this page helpful? ×