Dataway¶

Introduction¶

Dataway is Guance's data gateway. All data gathered by collectors must pass through Dataway before it is reported to Guance.

Installing Dataway¶

Creating a Dataway

In the Guance admin console, navigate to the "Data Gateway" page and click on "New Dataway". Enter a name and bind an address, then click "Create".

After successful creation, an installation script for the new Dataway is generated automatically.

Info

The bound address is the Dataway gateway address. It must be a complete HTTP address that includes the protocol, host, and port, e.g., http(s)://1.2.3.4:9528. The host is typically the IP of the machine where Dataway is deployed, or a domain name that resolves correctly.

Note: Ensure that the collector can access this address, otherwise data collection will fail.
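The address rule above (protocol, host, and port must all be present) can be checked before you paste a value into the console. The following is a minimal sketch using only the Python standard library; the function name `check_dataway_address` is hypothetical and is not part of Dataway or Guance.

```python
from typing import Tuple
from urllib.parse import urlsplit


def check_dataway_address(address: str) -> Tuple[str, str, int]:
    """Return (protocol, host, port), or raise ValueError if a part is missing."""
    parts = urlsplit(address)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"protocol must be http or https: {address!r}")
    if not parts.hostname:
        raise ValueError(f"host is missing: {address!r}")
    # Accessing .port also raises ValueError when the port is not a valid number.
    if parts.port is None:
        raise ValueError(f"port is missing: {address!r}")
    return parts.scheme, parts.hostname, parts.port
```

This only validates the shape of the address. It does not verify that the collector can actually reach it over the network.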

Installing Dataway

Host Installation

DW_KODO=http://kodo_ip:port \
   DW_TOKEN=<tkn_XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX> \
   DW_UUID=<YOUR_UUID> \
   bash -c "$(curl https://static.guance.com/dataway/install.sh)"

After installation, a dataway.yaml file is generated in the installation directory. An example of its content is shown below. You can edit it manually; changes take effect after the service is restarted.

dataway.yaml

# ============= DATAWAY CONFIG =============

# Dataway UUID, obtained when creating a new Dataway
uuid:

# The workspace token. Most of the time, this is the
# system workspace's token.
token:

# secret_token is used in sinker mode to check whether incoming Datakit
# requests are valid.
secret_token:

# Is the __internal__ token allowed? If so, the data/request is directed to
# the workspace with the token above
enable_internal_token: false

# Is an empty token allowed? If so, the data/request is directed to
# the workspace with the token above
enable_empty_token: false

# Is Dataway cascaded? For a cascaded Dataway, its remote_host is
# another Dataway and not Kodo.
cascaded: false

# Kodo (or next Dataway) related settings
remote_host:
http_timeout: 30s

http_max_idle_conn_perhost: 0 # default to CPU cores
http_max_conn_perhost: 0      # default no limit

insecure_skip_verify: false
http_client_trace: false
max_conns_per_host: 0
sni: ""

# Dataway API settings
bind: 0.0.0.0:9528

# disable 404 page
disable_404page: false

# dataway TLS file path
tls_crt:
tls_key:

# enable pprof
pprof_bind: localhost:6060

api_limit_rate : 100000         # 100K
max_http_body_bytes : 67108864  # 64MB
copy_buffer_drop_size : 262144  # 256KB, copy buffers larger than this are released
reserved_pool_size: 4096        # reserved pool size for better GC

within_docker: false

log_level: info
log: log
gin_log: gin.log

cache_cfg:
  # cache disk path
  dir: "disk_cache"

  # disable cache
  disabled: false

  clean_interval: "10s"

  # in MB, max single data package size in disk cache, such as HTTP body
  max_data_size: 100

  # in MB, single disk-batch(single file) size
  batch_size: 128

  # in MB, max disk size allowed to cache data
  max_disk_size: 65535

  # expire duration, default 7 days
  expire_duration: "168h"

prometheus:
  listen: "localhost:9090"
  url: "/metrics"
  enable: true

#sinker:
#  etcd:
#    urls:
#    - http://localhost:2379 # one or multiple etcd host
#    dial_timeout: 30s
#    key_space: "/dw_sinker" # subscribe to the etcd key
#    username: "dataway"
#    password: "<PASSWORD>"
#  #file:
#  #  path: /path/to/sinker.json

Kubernetes

The Deployment and Service YAML for Dataway is as follows:

??? info "*dataway-deploy.yaml* (Click to expand)"

    ```yaml linenums="1"
    ---

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: deployment-utils-dataway
      name: dataway
      namespace: utils
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: deployment-utils-dataway
      template:
        metadata:
          labels:
            app: deployment-utils-dataway
          annotations:
            datakit/logs: |
              [
                {
                  "disable": false,
                  "source": "dataway",
                  "service": "dataway",
                  "multiline_match": "^\\d{4}|^\\[GIN\\]"
                }
              ]
            datakit/prom.instances: |
              [[inputs.prom]]
                url = "http://$IP:9090/metrics"

                source = "dataway"
                measurement_name = "dw"
                interval = "10s"
                disable_instance_tag = true
              [inputs.prom.tags]
                service = "dataway"
                instance = "$PODNAME" # we can set as "guangzhou-$PODNAME"
        spec:
          affinity:
            podAffinity: {}
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: app
                        operator: In
                        values:
                          - deployment-utils-dataway
                  topologyKey: kubernetes.io/hostname

          containers:
          - image: pubrepo.guance.com/dataflux/dataway:1.9.0
            imagePullPolicy: IfNotPresent
            name: dataway
            env:
            - name: DW_REMOTE_HOST
              value: "http://kodo.forethought-kodo:9527"
            - name: DW_BIND
              value: "0.0.0.0:9528"
            - name: DW_UUID
              value: "agnt_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" # Fill in the actual Dataway UUID here
            - name: DW_TOKEN
              value: "tkn_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" # Fill in the actual Dataway token here, usually the system workspace token
            - name: DW_PROM_LISTEN
              value: "0.0.0.0:9090"

            ports:
            - containerPort: 9528
              name: 9528tcp01
              protocol: TCP
            volumeMounts:
              - mountPath: /usr/local/cloudcare/dataflux/dataway/cache
                name: dataway-cache
            resources:
              limits:
                cpu: '4'
                memory: 4Gi
              requests:
                cpu: 100m
                memory: 512Mi
          # nodeSelector:
          #   key: string              
          imagePullSecrets:
          - name: registry-key
          restartPolicy: Always
          volumes:
          - hostPath:
              path: /root/dataway_cache
              type: DirectoryOrCreate
            name: dataway-cache
    ---

    apiVersion: v1
    kind: Service
    metadata:
      name: dataway
      namespace: utils
    spec:
      ports:
      - name: 9528tcp02
        port: 9528
        protocol: TCP
        targetPort: 9528
        nodePort: 30928
      selector:
        app: deployment-utils-dataway
      type: NodePort
    ```

In *dataway-deploy.yaml*, you can modify Dataway configurations via environment variables. See [here](dataway.md#img-envs).

Alternatively, you can mount an external *dataway.yaml* file through a ConfigMap. It must be mounted at */usr/local/cloudcare/dataflux/dataway/dataway.yaml*:

```yaml
containers:
- name: dataway
  volumeMounts:
    - name: dataway-config
      mountPath: /usr/local/cloudcare/dataflux/dataway/dataway.yaml
      subPath: config.yaml # key in the ConfigMap that holds the configuration
volumes:
- configMap:
    defaultMode: 256
    name: dataway-config
    optional: false
  name: dataway-config
```
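The `defaultMode: 256` value in the volume definition is a decimal number, while file permissions are usually read in octal. The short sketch below shows the conversion. The helper name `describe_mode` is hypothetical and exists only for illustration.

```python
import stat
from typing import Tuple


def describe_mode(mode: int) -> Tuple[str, str]:
    """Return the octal form and the ls-style permission string of a file mode."""
    # stat.S_IFREG marks the mode as belonging to a regular file.
    return oct(mode), stat.filemode(stat.S_IFREG | mode)


octal, permissions = describe_mode(256)
print(octal, permissions)
```

Decimal 256 corresponds to octal 0400, so the mounted configuration file is readable by its owner only.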

Notes

Dataway only runs on Linux systems (currently only Linux arm64/amd64 binaries are published).
When installing on a host, the Dataway installation path is /usr/local/cloudcare/dataflux/dataway.
The sample Kubernetes Deployment sets resource limits of 4000m CPU and 4Gi memory by default; adjust them according to actual needs. The minimum (requested) resources are 100m CPU and 512Mi memory.

Verifying Dataway Installation

After installation, wait briefly and refresh the "Data Gateway" page. If a version number appears in the "Version Information" column for the newly added data gateway, this Dataway has successfully connected to Guance, and front-end users can start reporting data through it.

Once Dataway has successfully connected to Guance, log into the Guance console and go to the "Integrations"/"DataKit" page. You can view all Dataway addresses there, select the desired Dataway gateway address, and obtain the DataKit installation instructions to execute on your server to start collecting data.

Managing Dataway¶

Deleting Dataway¶

In the Guance admin console, navigate to the "Data Gateway" page, select the Dataway you want to delete, click "Configure", and then click the "Delete" button at the bottom left of the Dataway edit dialog.

Warning

Deleting a Dataway in the console does not remove it from the server. To remove it completely, log in to the server where the Dataway gateway is deployed, stop the Dataway process, and delete the installation directory.

Upgrading Dataway¶

On the "Data Gateway" page in the Guance admin console, if an upgrade is available for a Dataway, an upgrade prompt appears in its version information.

To upgrade a host installation:

DW_UPGRADE=1 bash -c "$(curl https://static.guance.com/dataway/install.sh)"

To upgrade on Kubernetes, simply replace the image version:

- image: pubrepo.guance.com/dataflux/dataway:<VERSION>

Managing the Dataway Service¶

When installing Dataway on a host, you can manage the Dataway service using the following commands.

# Starting
$ systemctl start dataway

# Restarting
$ systemctl restart dataway

# Stopping
$ systemctl stop dataway

For Kubernetes, simply restart the corresponding Pod.

Environment Variables¶

Host Installation Supported Environment Variables¶

We no longer recommend the host installation method. New configuration items are no longer supported via command-line parameters. If changing the deployment method is not possible, it is recommended to manually modify the corresponding configuration after installation (or upgrade). Default configurations are shown in the example above.

When installing on a host, you can inject the following environment variables during the installation command:

Env	Type	Required	Description	Example Value
DW_BIND	string	N	Dataway HTTP API binding address, defaults to `0.0.0.0:9528`
DW_CASCADED	boolean	N	Whether Dataway is cascaded	`true`
DW_HTTP_CLIENT_TRACE	boolean	N	Dataway acts as an HTTP client itself, enabling the collection of some related metrics, which will eventually be output in its Prometheus metrics	`true`
DW_KODO	string	Y	Kodo address, or the next Dataway address, in the format `http://host:port`
DW_TOKEN	string	Y	Usually the data token for the system workspace
DW_UPGRADE	boolean	N	Specify as 1 when upgrading
DW_UUID	string	Y	Dataway UUID, which is generated by the system workspace when creating a new Dataway
DW_TLS_CRT	file-path	N	Specify the HTTPS/TLS crt file path Version-1.4.1
DW_TLS_KEY	file-path	N	Specify the HTTPS/TLS key file path Version-1.4.1
DW_PROM_EXPORTOR_BIND	string	N	Specify the HTTP port (default 9090) where Dataway exposes its own metrics Version-1.5.0
DW_PPROF_BIND	string	N	Specify the HTTP port (default 6060) where Dataway exposes its pprof functionality Version-1.5.0
DW_DISK_CACHE_CAP_MB	int	N	Specify disk cache size (in MB), default 65535MB Version-1.5.0

Warning

Settings related to Sinker require manual modification after installation. Currently, specifying Sinker configurations during installation is not supported. Version-1.5.0

Image Environment Variables¶

When running Dataway in a Kubernetes environment, the following environment variables are supported.

Compatibility with existing dataway.yaml

Since some older versions of Dataway inject configurations via ConfigMap (the filename mounted into the container is generally dataway.yaml), if the Dataway image finds a file injected via ConfigMap in the installation directory after startup, the DW_* environment variables mentioned below will not take effect. These environment variables will only become effective after removing the existing ConfigMap mount.

If the environment variables take effect, there will be a hidden file named .dataway.yaml in the Dataway installation directory (visible with ls -a). You can cat this file to confirm whether the environment variables have taken effect.

HTTP Server Settings¶

Env	Type	Required	Description	Example Value
DW_REMOTE_HOST	string	Y	Kodo address, or the next Dataway address, in the format `http://host:port`
DW_WHITE_LIST	string	N	Client IP whitelist for Dataway, separated by commas `,`
DW_HTTP_TIMEOUT	string	N	Timeout setting for Dataway requests to Kodo or the next Dataway, default 30s
DW_HTTP_MAX_IDLE_CONN_PERHOST	int	N	Maximum idle connection setting for Dataway requests to Kodo, default value is CPU cores Version-1.6.2
DW_HTTP_MAX_CONN_PERHOST	int	N	Maximum connection count setting for Dataway requests to Kodo, default is unlimited Version-1.6.2
DW_BIND	string	N	Dataway HTTP API binding address, defaults to `0.0.0.0:9528`
DW_API_LIMIT	int	N	Rate limiting setting for Dataway API. For example, if set to 1000, each specific API is allowed to be requested up to 1000 times per second, default is 100K
DW_HEARTBEAT	string	N	Heartbeat interval between Dataway and the center, default 60s
DW_MAX_HTTP_BODY_BYTES	int	N	Maximum allowed HTTP Body for Dataway API (unit bytes), default 64MB
DW_TLS_INSECURE_SKIP_VERIFY	boolean	N	Ignore HTTPS/TLS certificate errors	`true`
DW_HTTP_CLIENT_TRACE	boolean	N	Dataway acts as an HTTP client itself, enabling the collection of some related metrics, which will eventually be output in its Prometheus metrics	`true`
DW_ENABLE_TLS	boolean	N	Enable HTTPS Version-1.4.1
DW_TLS_CRT	file-path	N	Specify the HTTPS/TLS crt file path Version-1.4.0
DW_TLS_KEY	file-path	N	Specify the HTTPS/TLS key file path Version-1.4.0
DW_SNI	string	N	Specify current Dataway SNI information Version-1.6.0
DW_DISABLE_404PAGE	boolean	N	Disable 404 page Version-1.6.1

HTTP TLS Settings¶

To generate a TLS certificate valid for one year, you can use the following OpenSSL command:

# Generate a TLS certificate valid for one year
$ openssl req -new -newkey rsa:4096 -x509 -sha256 -days 365 -nodes -out tls.crt -keyout tls.key
...

After executing this command, you will be prompted to enter some necessary information, including your country, region, city, organization name, department name, and email address. This information will be included in your certificate.

After completing the input, two files will be generated: tls.crt (certificate file) and tls.key (private key file). Please keep your private key file secure.

For Dataway to use this certificate, set the absolute paths of the two files in its environment variables, as in the example below.

You must enable DW_ENABLE_TLS first; only then do the other two variables (DW_TLS_CRT/DW_TLS_KEY) take effect. Version-1.4.1

env:
- name: DW_ENABLE_TLS
  value: "true"
- name: DW_TLS_CRT
  value: "/path/to/your/tls.crt"
- name: DW_TLS_KEY
  value: "/path/to/your/tls.key"

Replace /path/to/your/tls.crt and /path/to/your/tls.key with the actual paths where you store the tls.crt and tls.key files.

After setting up, you can test if TLS is working using the following command:

$ curl -k https://localhost:9528

If successful, the response is an ASCII-art message reading "It's working!". If the certificate does not exist, the Dataway logs will show an error similar to the following:

server listen(TLS) failed: open /path/to/your/tls.{crt,key}: no such file or directory

At this point, Dataway cannot start, and the above curl command will also report an error:

$ curl -vvv -k https://localhost:9528
curl: (7) Failed to connect to localhost port 9528 after 6 ms: Couldn't connect to server
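The same check can be scripted. The sketch below is a rough Python counterpart of `curl -k`, assuming Dataway is serving HTTPS on localhost:9528. It skips certificate verification, which suits a self-signed test certificate but should not be used against production endpoints. The function names are hypothetical.

```python
import http.client
import ssl
import urllib.error
import urllib.request


def insecure_context() -> ssl.SSLContext:
    """Build a TLS context that skips certificate checks, like curl -k."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before setting CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def dataway_tls_ok(url: str = "https://localhost:9528", timeout: float = 3.0) -> bool:
    """Return True if any HTTP response arrives over TLS, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=insecure_context()):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still means the TLS handshake succeeded.
        return True
    except (OSError, http.client.HTTPException):
        # Connection refused, handshake failure, timeout, or a malformed reply.
        return False
```

When Dataway failed to start because of a missing certificate, `dataway_tls_ok()` returns False, matching the curl failure shown above.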

Logging Settings¶

Env	Type	Required	Description
DW_LOG	string	N	Log path, default is log
DW_LOG_LEVEL	string	N	Default is `info`
DW_GIN_LOG	string	N	Default is gin.log

Token/UUID Settings¶

Env	Type	Required	Description
DW_UUID	string	Y	Dataway UUID, which is generated by the system workspace when creating a new Dataway
DW_TOKEN	string	Y	Usually the data upload token for the system workspace
DW_SECRET_TOKEN	string	N	Set this token when enabling the Sinker function
DW_ENABLE_INTERNAL_TOKEN	boolean	N	Allow using `__internal__` as the client token, at which point the system workspace token is used by default
DW_ENABLE_EMPTY_TOKEN	boolean	N	Allow uploading data without a token, at which point the system workspace token is used by default

Sinker Settings¶

Env	Type	Required	Description	Example Value
DW_SECRET_TOKEN	string	N	Set this token when enabling the Sinker function
DW_CASCADED	string	N	Whether Dataway is cascaded	`true`
DW_SINKER_ETCD_URLS	string	N	List of etcd addresses, separated by commas `,`, e.g., `http://1.2.3.4:2379,http://1.2.3.4:2380`
DW_SINKER_ETCD_DIAL_TIMEOUT	string	N	etcd connection timeout, default 30s
DW_SINKER_ETCD_KEY_SPACE	string	N	Name of the etcd key where Sinker configurations are located (default `/dw_sinker`)
DW_SINKER_ETCD_USERNAME	string	N	etcd username
DW_SINKER_ETCD_PASSWORD	string	N	etcd password
DW_SINKER_FILE_PATH	file-path	N	Specify Sinker rule configurations via a local file

Warning

If both a local file and etcd methods are specified, the Sinker rules in the local file will take precedence.
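The precedence rule can be pictured with a small sketch that reads the two environment variables from the table above. This illustrates the documented behaviour only and is not Dataway's actual implementation; the function name `sinker_rule_source` is hypothetical.

```python
import os
from typing import List, Mapping, Optional, Tuple, Union


def sinker_rule_source(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, Union[str, List[str], None]]:
    """Decide where Sinker rules come from: a local file wins over etcd."""
    env = os.environ if env is None else env
    file_path = env.get("DW_SINKER_FILE_PATH", "").strip()
    # DW_SINKER_ETCD_URLS is a comma-separated list of etcd addresses.
    etcd_urls = [u.strip() for u in env.get("DW_SINKER_ETCD_URLS", "").split(",") if u.strip()]
    if file_path:
        return "file", file_path
    if etcd_urls:
        return "etcd", etcd_urls
    return "none", None
```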

Prometheus Metric Exposure¶

Env	Type	Required	Description	Example Value
DW_PROM_URL	string	N	URL Path for Prometheus metrics (default `/metrics`)
DW_PROM_LISTEN	string	N	Address for exposing Prometheus metrics (default `localhost:9090`)
DW_PROM_DISABLED	boolean	N	Disable Prometheus metric exposure	`true`

Disk Cache Settings¶

Env	Type	Required	Description	Example Value
DW_DISKCACHE_DIR	file-path	N	Set the cache directory, this directory should generally be an external storage	path/to/your/cache
DW_DISKCACHE_DISABLE	boolean	N	Disable disk caching. To keep caching enabled, remove this environment variable	`true`
DW_DISKCACHE_CLEAN_INTERVAL	string	N	Cache cleanup interval, default 30s	Duration string
DW_DISKCACHE_EXPIRE_DURATION	string	N	Cache expiration time, default 168h (7d)	Duration string, e.g., `72h` for three days
DW_DISKCACHE_CAPACITY_MB	int	N	Version-1.6.0 Set the available disk space size in MB, default 20GB	Specifying `1024` equals 1GB
DW_DISKCACHE_BATCH_SIZE_MB	int	N	Version-1.6.0 Set the maximum size of a single disk cache file in MB, default 64MB	Specifying `1024` equals 1GB
DW_DISKCACHE_MAX_DATA_SIZE_MB	int	N	Version-1.6.0 Set the maximum size (in MB) of a single cached item (e.g., a single HTTP body), default 64MB. Any packet larger than this size will be discarded	Specifying `1024` equals 1GB

Tips

Setting DW_DISKCACHE_DISABLE will disable disk caching.

Performance-related Settings¶

Version-1.6.0

Env	Type	Required	Description	Example Value
DW_COPY_BUFFER_DROP_SIZE	int	N	HTTP body buffers exceeding the specified size (in bytes) will be immediately cleared to avoid consuming too much memory. Default value is 256KB	524288

Dataway API List¶

Details of the following APIs will be added later.

`GET /v1/ntp/`¶

Version-1.6.0

API description: Get the current Unix timestamp (in seconds) of Dataway

`POST /v1/write/:category`¶

API description: Receive various types of collected data uploaded by Datakit

`GET /v1/datakit/pull`¶

API description: Process Datakit's request to pull central configurations (blacklist/Pipeline)

`POST /v1/write/rum/replay`¶

API description: Receive Session Replay data uploaded by Datakit

`POST /v1/upload/profiling`¶

API description: Receive Profiling data uploaded by Datakit

`POST /v1/election`¶

API description: Process Datakit election requests

`POST /v1/election/heartbeat`¶

API description: Process heartbeat requests for Datakit elections

`POST /v1/query/raw`¶

API description: Process DQL query requests. A simple example follows:

POST /v1/query/raw?token=<workspace-token> HTTP/1.1
Content-Type: application/json

{
    "token": "workspace-token",
    "queries": [
        {
            "query": "M::cpu LIMIT 1"
        }
    ],
    "echo_explain": <true/false>
}

Example response:

{
  "content": [
    {
      "series": [
        {
          "name": "cpu",
          "columns": [
            "time",
            "usage_iowait",
            "usage_total",
            "usage_user",
            "usage_guest",
            "usage_system",
            "usage_steal",
            "usage_guest_nice",
            "usage_irq",
            "load5s",
            "usage_idle",
            "usage_nice",
            "usage_softirq",
            "global_tag1",
            "global_tag2",
            "host",
            "cpu"
          ],
          "values": [
            [
              1709782208662,
              0,
              7.421875,
              3.359375,
              0,
              4.0625,
              0,
              0,
              0,
              1,
              92.578125,
              0,
              0,
              null,
              null,
              "WIN-JCHUL92N9IP",
              "cpu-total"
            ]
          ]
        }
      ],
      "points": null,
      "cost": "24.558375ms",
      "is_running": false,
      "async_id": "",
      "query_parse": {
        "namespace": "metric",
        "sources": {
          "cpu": "exact"
        },
        "fields": {},
        "funcs": {}
      },
      "index_name": "",
      "index_store_type": "",
      "query_type": "guancedb",
      "complete": false,
      "index_names": "",
      "scan_completed": false,
      "scan_index": "",
      "next_cursor_time": -1,
      "sample": 1,
      "interval": 0,
      "window": 0
    }
  ]
}

Response explanation:

The actual data is located in the inner series field.
name is the name of the metric set (here, the cpu metric set; for log data this field is absent).
columns lists the names of the returned columns.
values holds the result rows; each row contains one value per entry in columns, in the same order.
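Because columns and values are parallel lists, a response is easier to work with once each row is paired with its column names. The following sketch assumes the response has already been decoded from JSON into a Python dict with the layout shown above. The function name `series_to_rows` is hypothetical.

```python
from typing import Any, Dict, List


def series_to_rows(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten every series in a /v1/query/raw response into one dict per row."""
    rows = []
    for result in response.get("content") or []:
        # "series" may be missing or null when a query returns no data.
        for series in result.get("series") or []:
            columns = series["columns"]
            for values in series["values"]:
                rows.append(dict(zip(columns, values)))
    return rows
```

For the example response above, the single resulting row maps "host" to "WIN-JCHUL92N9IP" and "usage_idle" to 92.578125.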

Info

The token in the URL request parameter can differ from the token in the JSON body. The former is used to verify that the query request is legitimate, while the latter determines the target workspace in which the data resides.
The queries field can carry multiple queries, and each query can carry additional fields. For the list of available fields, refer to here.
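To make the two-token layout concrete, the sketch below assembles such a request with the Python standard library without sending it. The base URL and both token values are placeholders you must supply, and the function name `build_query_request` is hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_query_request(
    base_url: str,
    auth_token: str,
    workspace_token: str,
    dql: str,
    echo_explain: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/query/raw request."""
    # The URL token authenticates the request itself.
    url = f"{base_url.rstrip('/')}/v1/query/raw?{urlencode({'token': auth_token})}"
    body = {
        "token": workspace_token,  # selects the workspace that holds the data
        "queries": [{"query": dql}],
        "echo_explain": echo_explain,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to a running Dataway.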

`POST /v1/workspace`¶

API description: Process workspace query requests initiated by Datakit

`POST /v1/object/labels`¶

API description: Process requests to modify object labels

`DELETE /v1/object/labels`¶

API description: Process requests to delete object labels

`GET /v1/check/:token`¶

API description: Check if the token is valid

Dataway Metric Collection¶

HTTP Client Metric Collection

To collect metrics of Dataway's HTTP requests to Kodo (or the next hop Dataway), you need to manually enable the http_client_trace configuration or specify the environment variable DW_HTTP_CLIENT_TRACE=true.

Host Deployment

Dataway itself exposes Prometheus metrics, which can be collected using the built-in prom collector of Datakit. Here's an example configuration for the collector:

[[inputs.prom]]
  ## Exporter URLs.
  urls = [ "http://localhost:9090/metrics", ]
  source = "dataway"
  election = true
  measurement_name = "dw" # The metric set name for dataway is fixed as dw, do not change it
[inputs.prom.tags]
  service = "dataway"

In Kubernetes, if a Datakit (version 1.14.2 or higher) is deployed in the cluster, you can enable Prometheus metric exposure in Dataway through Pod annotations (the default Pod YAML already includes them):

annotations: # The following annotation is already added by default
   datakit/prom.instances: |
     [[inputs.prom]]
       url = "http://$IP:9090/metrics" # The port here (default 9090) depends on the situation
       source = "dataway"
       measurement_name = "dw" # Fixed metric set
       interval = "10s"
       disable_instance_tag = true

     [inputs.prom.tags]
       service = "dataway"
       instance = "$PODNAME"

...
env:
- name: DW_PROM_LISTEN
  value: "0.0.0.0:9090" # Keep this port consistent with the one in the url above

If the collection is successful, you can search for dataway in the "Scenarios"/"Built-in Views" section of Guance to see the corresponding monitoring views.

Dataway Metric List¶

Below are the metrics exposed by Dataway. You can retrieve them by requesting http://localhost:9090/metrics, or use the following command to watch a specific metric in real time (refreshed every 3 seconds):

watch -n 3 'curl -s http://localhost:9090/metrics | grep -a <METRIC-NAME>'

Some metrics may not appear if their related business modules have not run yet. Some newer metrics exist only in the latest versions; per-metric version information is not listed here. Treat the list returned by the /metrics interface as authoritative.

TYPE	NAME	LABELS	HELP
SUMMARY	`dataway_http_api_elapsed_seconds`	`api,method,status`	API request latency
SUMMARY	`dataway_http_api_body_buffer_utilization`	`api`	API body buffer utilization (Len/Cap)
SUMMARY	`dataway_http_api_body_copy`	`api`	API body copy
SUMMARY	`dataway_http_api_resp_size_bytes`	`api,method,status`	API response size
SUMMARY	`dataway_http_api_req_size_bytes`	`api,method,status`	API request size
COUNTER	`dataway_http_api_total`	`api,status`	API request count
COUNTER	`dataway_http_api_body_too_large_dropped_total`	`api,method`	API request too large dropped
COUNTER	`dataway_http_api_with_inner_token`	`api,method`	API request with inner token
COUNTER	`dataway_http_api_dropped_total`	`api,method`	API request dropped when sinker rule match failed
COUNTER	`dataway_syncpool_stats`	`name,type`	sync.Pool usage stats
COUNTER	`dataway_http_api_copy_body_failed_total`	`api`	API copy body failed count
COUNTER	`dataway_http_api_signed_total`	`api,method`	API signature count
SUMMARY	`dataway_http_api_cached_bytes`	`api,cache_type,method,reason`	API cached body bytes
SUMMARY	`dataway_http_api_reusable_body_read_bytes`	`api,method`	API re-read body on forking request
SUMMARY	`dataway_http_api_recv_points`	`api`	API /v1/write/:category received points
SUMMARY	`dataway_http_api_send_points`	`api`	API /v1/write/:category send points
SUMMARY	`dataway_http_api_cache_points`	`api,cache_type`	Disk cached /v1/write/:category points
SUMMARY	`dataway_http_api_cache_cleaned_points`	`api,cache_type,status`	Disk cache cleaned /v1/write/:category points
COUNTER	`dataway_http_api_forked_total`	`api,method,token`	API request forked total
GAUGE	`dataway_http_info`	`cascaded,docker,http_client_trace,listen,max_body,release_date,remote,version`	Dataway API basic info
GAUGE	`dataway_last_heartbeat_time`	`N/A`	Dataway last heartbeat with Kodo timestamp
GAUGE	`dataway_cpu_usage`	`N/A`	Dataway CPU usage(%)
GAUGE	`dataway_mem_stat`	`type`	Dataway memory usage stats
SUMMARY	`dataway_http_api_copy_buffer_drop_total`	`max`	API copy buffer dropped(too large cached buffer) count
GAUGE	`dataway_open_files`	`N/A`	Dataway open files
GAUGE	`dataway_cpu_cores`	`N/A`	Dataway CPU cores
GAUGE	`dataway_uptime`	`N/A`	Dataway uptime
COUNTER	`dataway_process_ctx_switch_total`	`type`	Dataway process context switch count(Linux only)
COUNTER	`dataway_process_io_count_total`	`type`	Dataway process IO count
COUNTER	`dataway_process_io_bytes_total`	`type`	Dataway process IO bytes count
SUMMARY	`dataway_http_api_dropped_expired_cache`	`api,method`	Dropped expired cache data
SUMMARY	`dataway_httpcli_tls_handshake_seconds`	`server`	HTTP TLS handshake cost
SUMMARY	`dataway_httpcli_http_connect_cost_seconds`	`server`	HTTP connect cost
SUMMARY	`dataway_httpcli_got_first_resp_byte_cost_seconds`	`server`	Got first response byte cost
SUMMARY	`http_latency`	`api,server`	HTTP latency
COUNTER	`dataway_httpcli_tcp_conn_total`	`server,remote,type`	HTTP TCP connection count
COUNTER	`dataway_httpcli_conn_reused_from_idle_total`	`server`	HTTP connection reused from idle count
SUMMARY	`dataway_httpcli_conn_idle_time_seconds`	`server`	HTTP connection idle time
SUMMARY	`dataway_httpcli_dns_cost_seconds`	`server`	HTTP DNS cost
SUMMARY	`dataway_sinker_rule_cost_seconds`	`N/A`	Rule cost time seconds
SUMMARY	`dataway_sinker_cache_key_len`	`N/A`	cache key length(bytes)
SUMMARY	`dataway_sinker_cache_val_len`	`N/A`	cache value length(bytes)
COUNTER	`dataway_sinker_pull_total`	`event,source`	Sinker pulled or pushed counter
GAUGE	`dataway_sinker_rule_cache_miss`	`N/A`	Sinker rule cache miss
GAUGE	`dataway_sinker_rule_cache_hit`	`N/A`	Sinker rule cache hit
GAUGE	`dataway_sinker_rule_cache_size`	`N/A`	Sinker rule cache size
GAUGE	`dataway_sinker_rule_error`	`error`	Rule errors
GAUGE	`dataway_sinker_default_rule_hit`	`info`	Default sinker rule hit count
GAUGE	`dataway_sinker_rule_last_applied_time`	`source`	Rule last applied time(Unix timestamp)
COUNTER	`diskcache_put_bytes_total`	`path`	Cache Put() bytes count
COUNTER	`diskcache_get_total`	`path`	Cache Get() count
COUNTER	`diskcache_wakeup_total`	`path`	Wakeup count on sleeping write file
COUNTER	`diskcache_seek_back_total`	`path`	Seek back when Get() got any error
COUNTER	`diskcache_get_bytes_total`	`path`	Cache Get() bytes count
GAUGE	`diskcache_capacity`	`path`	Current capacity(in bytes)
GAUGE	`diskcache_max_data`	`path`	Max data to Put(in bytes), default 0
GAUGE	`diskcache_batch_size`	`path`	Data file size(in bytes)
GAUGE	`diskcache_size`	`path`	Current cache size(in bytes)
GAUGE	`diskcache_open_time`	`no_fallback_on_error,no_lock,no_pos,no_sync,path`	Current cache Open time in unix timestamp(second)
GAUGE	`diskcache_last_close_time`	`path`	Current cache last Close time in unix timestamp(second)
GAUGE	`diskcache_datafiles`	`path`	Current un-read data files
SUMMARY	`diskcache_get_latency`	`path`	Get() time cost(micro-second)
SUMMARY	`diskcache_put_latency`	`path`	Put() time cost(micro-second)
COUNTER	`diskcache_dropped_bytes_total`	`path`	Dropped bytes during Put() when capacity reached.
COUNTER	`diskcache_dropped_total`	`path,reason`	Dropped files during Put() when capacity reached.
COUNTER	`diskcache_rotate_total`	`path`	Cache rotate count, i.e., the number of times the file was rotated from data to data.0000xxx
COUNTER	`diskcache_remove_total`	`path`	Removed file count, if some file read EOF, remove it from un-read list
COUNTER	`diskcache_put_total`	`path`	Cache Put() count
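The metrics listed above are served in the Prometheus text exposition format, so a script can pick out one metric the same way the `grep` command does. The sketch below assumes the default address http://localhost:9090/metrics. The function names are hypothetical.

```python
import urllib.request
from typing import List


def fetch_metrics(url: str = "http://localhost:9090/metrics", timeout: float = 3.0) -> str:
    """Download the raw metrics text from Dataway."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def filter_metrics(exposition_text: str, metric_name: str) -> List[str]:
    """Return sample lines whose metric name starts with metric_name."""
    matches = []
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the label block "{" or at the first space.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(metric_name):
            matches.append(line)
    return matches
```

Prefix matching is deliberate: a SUMMARY metric is exported as several series (for example with `_sum` and `_count` suffixes), and all of them are returned together.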

Metric Collection in Docker Mode¶

There are two modes for host installation: one is native host installation, and the other is Docker installation. Here we specifically explain the differences in metric collection when installing through Docker.

When installing through Docker, the HTTP port for exposing metrics will be mapped to port 19090 of the host machine (by default). At this point, the metric collection address will be http://localhost:19090/metrics.

If a different port is specified, then during Docker installation, the port will be increased by 10000. Therefore, ensure that the specified port does not exceed 45535.

In addition, during Docker installation, the profile collection port will also be exposed. By default, it is mapped to port 16060 on the host machine. Its mechanism is similar, adding 10000 to the specified port.
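The port arithmetic described above can be written down as a small helper. This is a sketch of the documented rule only (add 10000, and keep the configured port at or below 45535); the constant and function names are hypothetical.

```python
DOCKER_PORT_OFFSET = 10000   # added to the configured port in Docker mode
MAX_CONFIGURED_PORT = 45535  # upper bound stated in the text above


def docker_host_port(configured_port: int) -> int:
    """Return the host port that a configured Dataway port is mapped to."""
    if not 1 <= configured_port <= MAX_CONFIGURED_PORT:
        raise ValueError(
            f"configured port must be between 1 and {MAX_CONFIGURED_PORT}: {configured_port}"
        )
    return configured_port + DOCKER_PORT_OFFSET


print(docker_host_port(9090))  # metrics port on the host
print(docker_host_port(6060))  # pprof port on the host
```

With the default ports, the helper reproduces the values given above: 19090 for metrics and 16060 for profiling.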

Collecting and Processing Dataway's Own Logs¶

Dataway's own logs fall into two categories: gin logs and the program's own logs. The following Pipeline separates them:

# Pipeline for dataway logging

# Sample log lines for testing
'''
2023-12-14T11:27:06.744+0800    DEBUG   apis    apis/api_upload_profile.go:272  save profile file to disk [ok] /v1/upload/profiling?token=****************a4e3db8481c345a94fe5a
[GIN] 2021/10/25 - 06:48:07 | 200 |   30.890624ms |  114.215.200.73 | POST     "/v1/write/logging?token=tkn_5c862a11111111111111111111111111"
'''

add_pattern("TOKEN", "tkn_\\w+")
add_pattern("GINTIME", "%{YEAR}/%{MONTHNUM}/%{MONTHDAY}%{SPACE}-%{SPACE}%{HOUR}:%{MINUTE}:%{SECOND}")
grok(_,"\\[GIN\\]%{SPACE}%{GINTIME:timestamp}%{SPACE}\\|%{SPACE}%{NUMBER:dataway_code}%{SPACE}\\|%{SPACE}%{NOTSPACE:cost_time}%{SPACE}\\|%{SPACE}%{NOTSPACE:client_ip}%{SPACE}\\|%{SPACE}%{NOTSPACE:method}%{SPACE}%{GREEDYDATA:http_url}")

# gin logging
if cost_time != nil {
  if http_url != nil  {
    grok(http_url, "%{TOKEN:token}")
    cover(token, [5, 15])
    replace(message, "tkn_\\w{0,5}\\w{6}", "****************$4")
    replace(http_url, "tkn_\\w{0,5}\\w{6}", "****************$4")
  }

  group_between(dataway_code, [200,299], "info", status)
  group_between(dataway_code, [300,399], "notice", status)
  group_between(dataway_code, [400,499], "warning", status)
  group_between(dataway_code, [500,599], "error", status)

  if !sample(0.1) { # sample(0.1) is true for about 10% of logs, so about 90% of gin logs are dropped
    drop()
    exit()
  } else {
    set_tag(sample_rate, "0.1")
  }

  parse_duration(cost_time)
  duration_precision(cost_time, "ns", "ms")

  set_measurement('gin', true)
  set_tag(service,"dataway")
  exit()
}

# app logging
if cost_time == nil {
  grok(_,"%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{NOTSPACE:status}%{SPACE}%{NOTSPACE:module}%{SPACE}%{NOTSPACE:code}%{SPACE}%{GREEDYDATA:msg}")
  if status == nil { # the grok above captures the log level into `status`
    grok(message,"Error%{SPACE}%{DATA:errormsg}")
    if errormsg != nil {
      add_key(status,"error")
      drop_key(errormsg)
    }
  }
  lowercase(status)

  # if debug level enabled, drop most of them
  if status == 'debug' {
    if !sample(0.1) { # sample(0.1) is true for about 10% of logs, so about 90% of debug logs are dropped
      drop()
      exit()
    } else {
      set_tag(sample_rate, "0.1")
    }
  }

  group_in(status, ["error", "panic", "dpanic", "fatal","err","fat"], "error", status) # mark them as 'error'

  if msg != nil {
    grok(msg, "%{TOKEN:token}")
    cover(token, [5, 15])
    replace(message, "tkn_\\w{0,5}\\w{6}", "****************$4")
    replace(msg, "tkn_\\w{0,5}\\w{6}", "****************$4")
  }

  set_measurement("dataway-log", true)
  set_tag(service,"dataway")
}
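To preview how a log file splits into the two categories before applying the Pipeline, a rough shell sketch follows. It assumes, as in the sample logs above, that gin lines start with the literal prefix `[GIN]` and that every other line is program logging. `classify_dataway_log` is a hypothetical helper; its output names simply mirror the measurement names used in the Pipeline.

```shell
#!/bin/sh
# Sketch: classify one Dataway log line as gin logging or program logging.
# classify_dataway_log is a hypothetical helper, not part of Dataway.
classify_dataway_log() {
  case "$1" in
    "[GIN]"*) echo "gin" ;;         # gin access log line
    *)        echo "dataway-log" ;; # everything else: program logging
  esac
}

classify_dataway_log '[GIN] 2021/10/25 - 06:48:07 | 200 |   30.890624ms |  114.215.200.73 | POST     "/v1/write/logging"'
classify_dataway_log '2023-12-14T11:27:06.744+0800    DEBUG   apis    apis/api_upload_profile.go:272  save profile file to disk [ok]'
```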

Dataway Bug Report¶

Dataway itself exposes metrics and profiling collection endpoints, which we can gather to aid in troubleshooting.

The following information gathering assumes default configured ports and addresses; adjust accordingly based on your actual setup.

dw-bug-report.sh

br_dir="dw-br-$(date +%s)"
mkdir -p "$br_dir"

echo "save bug report to ${br_dir}"

# Modify the following configurations as needed
dw_ip="localhost" # IP address where Dataway metrics/profile are exposed
metric_port=9090  # Port for metrics exposure
profile_port=6060 # Port for profile exposure
dw_yaml_conf="/usr/local/cloudcare/dataflux/dataway/dataway.yaml"
dw_dot_yaml_conf="/usr/local/cloudcare/dataflux/dataway/.dataway.yaml" # Present in container installations

# Collect runtime metrics
curl -v "http://${dw_ip}:${metric_port}/metrics" -o "$br_dir/metrics"

# Collect profiling information
curl -v "http://${dw_ip}:${profile_port}/debug/pprof/allocs" -o "$br_dir/allocs"
curl -v "http://${dw_ip}:${profile_port}/debug/pprof/heap" -o "$br_dir/heap"
curl -v "http://${dw_ip}:${profile_port}/debug/pprof/profile" -o "$br_dir/profile" # This command will run for about 30 seconds

# Copy the configuration files (the second one may be absent on host installations)
cp "$dw_yaml_conf" "$br_dir/dataway.yaml.copy"
cp "$dw_dot_yaml_conf" "$br_dir/.dataway.yaml.copy"

# Pack everything and remove the temporary directory
tar czvf "${br_dir}.tar.gz" "${br_dir}"
rm -rf "${br_dir}"

Run the script:

$ sh dw-bug-report.sh
...

After execution, a file like dw-br-1721188604.tar.gz will be generated. Retrieve this file and provide it for analysis.

FAQ¶

Request Body Too Large Issue¶

Version-1.3.7

Dataway limits the request body size (default 64MB). When a request body exceeds the limit, the client receives an HTTP 413 error (Request Entity Too Large). If request bodies of that size are expected, you can increase the limit (in bytes) in either of the following ways:

Set the environment variable DW_MAX_HTTP_BODY_BYTES
Set max_http_body_bytes in dataway.yaml
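Because the value is given in bytes, it can help to compute it rather than type it by hand. The following is a minimal sketch: `mb_to_bytes` is a hypothetical helper that assumes 1 MB = 1024 * 1024 bytes, and the 128 MB target is chosen only for illustration.

```shell
#!/bin/sh
# Sketch: express a request-body limit in bytes.
# mb_to_bytes is a hypothetical helper; it assumes 1 MB = 1024 * 1024 bytes.
mb_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Example: raise the limit to 128 MB (illustrative value only).
DW_MAX_HTTP_BODY_BYTES="$(mb_to_bytes 128)"
export DW_MAX_HTTP_BODY_BYTES
echo "DW_MAX_HTTP_BODY_BYTES=${DW_MAX_HTTP_BODY_BYTES}"
```

The same number can be written to max_http_body_bytes in dataway.yaml instead of exporting the environment variable.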

If large request packets occur during operation, they will be reflected in both metrics and logs:

The metric dataway_http_too_large_dropped_total shows the number of large requests dropped.
Search the Dataway logs with cat log | grep 'drop too large request'. The logs will output detailed HTTP request headers for further client analysis.
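The metric check can be scripted roughly as follows. This sketch assumes the metrics endpoint returns the usual Prometheus text format without trailing timestamps, so the sample value is the last field on each line. `sum_too_large_dropped` is a hypothetical helper, and the host and port in the usage comment are the defaults assumed elsewhere in this document.

```shell
#!/bin/sh
# Sketch: sum all samples of dataway_http_too_large_dropped_total.
# Reads Prometheus text-format metrics on stdin and prints the total.
# sum_too_large_dropped is a hypothetical helper, not part of Dataway.
sum_too_large_dropped() {
  awk '
    /^#/ { next }                                   # skip HELP/TYPE comments
    $1 ~ /^dataway_http_too_large_dropped_total([{]|$)/ { sum += $NF }
    END { printf "%d\n", sum }
  '
}

# Usage against a running Dataway (adjust host and port to your setup):
#   curl -s "http://localhost:9090/metrics" | sum_too_large_dropped
```

A result that keeps growing between two runs suggests that clients are still sending oversized requests.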

Warning

There is also a maximum data block write limit in the disk caching module (default 64MB). If you increase the maximum request body configuration, make sure to adjust this configuration (ENV_DISKCACHE_MAX_DATA_SIZE) to ensure that large requests can be correctly written to the disk cache.


This limit prevents the Dataway container/Pod from running into system-imposed limits, under which only about 20,000 connections are available. Increasing the limit may affect Dataway's data upload efficiency. When Dataway traffic is high, consider increasing the CPU allocation of a single Dataway or scaling Dataway instances horizontally.
