
Pipeline Offload


The PlOffload collector receives pending data offloaded by the DataKit Pipeline Offload function.

The collector registers the route /v1/write/ploffload/:category on the HTTP service enabled by DataKit, where the category parameter can be logging, network, etc. After receiving data, the collector processes it asynchronously; if the Pipeline scripts cannot process the data in time, the data is cached to disk.
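As a sketch of how a sender would use this route, the snippet below builds the write URL for a category and POSTs a payload to it. The endpoint path comes from this document; the host and port (9529 is DataKit's default HTTP listen port) and the payload are illustrative assumptions:

```python
# Sketch: writing data to the ploffload route of a receiving DataKit.
# Assumptions: DataKit listens on 127.0.0.1:9529 (its default HTTP port);
# the payload format below is only a placeholder.
import urllib.request

DATAKIT_HOST = "http://127.0.0.1:9529"  # assumed receiver address


def ploffload_url(category: str) -> str:
    """Build the ploffload write URL for a category such as 'logging'."""
    return f"{DATAKIT_HOST}/v1/write/ploffload/{category}"


def send(category: str, payload: bytes) -> int:
    """POST the payload to the ploffload route; return the HTTP status code."""
    req = urllib.request.Request(ploffload_url(category), data=payload, method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    print(ploffload_url("logging"))
```

With a DataKit instance running and the ploffload collector enabled, `send("logging", data)` would hand the data to the collector for asynchronous processing.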

Configuration

Collector Configuration

Go to the conf.d/ploffload directory under the DataKit installation directory, copy ploffload.conf.sample, and rename it to ploffload.conf. Example:

[inputs.ploffload]

  ## The storage config sets up a local space on the hard drive to cache data.
  ## path is the local file path used to cache data.
  ## capacity is the total space size (MB) used to store data.
  # [inputs.ploffload.storage]
    # path = "./ploffload_storage"
    # capacity = 5120

After configuration, restart DataKit.

Kubernetes supports modifying configuration parameters in the form of environment variables:

Environment Variable Name            | Configuration Item | Example Value
ENV_INPUT_PLOFFLOAD_STORAGE_PATH     | storage.path       | ./ploffload_storage
ENV_INPUT_PLOFFLOAD_STORAGE_CAPACITY | storage.capacity   | 5120
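In a Kubernetes deployment these variables would be set on the DataKit container. A minimal sketch of the env entries (the surrounding DaemonSet/container spec is assumed and omitted):

```yaml
# Sketch: ploffload storage options as container environment variables.
# Only the env entries are shown; the enclosing container spec is assumed.
- name: ENV_INPUT_PLOFFLOAD_STORAGE_PATH
  value: "./ploffload_storage"
- name: ENV_INPUT_PLOFFLOAD_STORAGE_CAPACITY
  value: "5120"
```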

Usage

After the configuration is completed, on the DataKit whose data processing is to be offloaded, change the value of the configuration item pipeline.offload.receiver in its datakit.yaml main configuration file to ploffload.
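As a sketch, the sender-side setting might look like the following TOML fragment; the section layout and the addresses field are assumptions based on common DataKit configuration shape and may differ by version, so check your own main configuration file:

```toml
# Sketch (section layout and field names assumed; verify against your DataKit version):
[pipeline]
  [pipeline.offload]
    receiver  = "ploffload"                  # offload to a ploffload collector
    addresses = ["http://192.168.1.10:9529"] # assumed address of the receiving DataKit
```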

Please check whether the host address of the listen configuration item under [http_api] in the DataKit main configuration file is 0.0.0.0 (or a LAN or WAN IP). If it is in 127.0.0.0/8, the service cannot be reached externally and the address needs to be changed.

To enable the disk cache function, uncomment the storage-related items in the collector configuration, for example:

[inputs.ploffload]
  [inputs.ploffload.storage]
    path = "./ploffload_storage"
    capacity = 5120
