Pipeline Offload
The PlOffload collector receives pending data offloaded by the DataKit Pipeline Offload function.
The collector registers the route /v1/write/ploffload/:category on the HTTP service enabled by DataKit, where the category parameter can be logging, network, etc. It is mainly used to process received data asynchronously, and to cache the data to disk when the Pipeline script fails to process it in time.
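To illustrate the shape of the route, here is a minimal Python sketch that builds a POST request against it. The address 127.0.0.1:9529 assumes DataKit's default HTTP listen address, and the payload is a placeholder rather than a real data format:

```python
import urllib.request

# Assumed DataKit address; 9529 is DataKit's default HTTP port.
DATAKIT = "http://127.0.0.1:9529"

def ploffload_request(category: str, body: bytes) -> urllib.request.Request:
    """Build a POST request for the ploffload route.

    `category` is the :category path parameter, e.g. "logging" or "network".
    """
    url = f"{DATAKIT}/v1/write/ploffload/{category}"
    return urllib.request.Request(url, data=body, method="POST")

req = ploffload_request("logging", b"<offloaded pipeline data>")
print(req.method, req.full_url)
```

In normal operation this endpoint is called by another DataKit instance whose Pipeline data is offloaded, not by end users directly.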
Configuration
Collector Configuration
Go to the conf.d/ploffload directory under the DataKit installation directory, copy ploffload.conf.sample and rename it to ploffload.conf. An example is as follows:
[inputs.ploffload]
## Storage: configure a local disk cache used to buffer data.
## path is the local directory used to cache data.
## capacity is the total disk space (in MB) used to store data.
# [inputs.ploffload.storage]
# path = "./ploffload_storage"
# capacity = 5120
After configuration, restart DataKit.
In Kubernetes, the configuration parameters above can be modified via environment variables:

Environment Variable Name | Corresponding Configuration Parameter | Example Value
---|---|---
ENV_INPUT_PLOFFLOAD_STORAGE_PATH | storage.path | ./ploffload_storage
ENV_INPUT_PLOFFLOAD_STORAGE_CAPACITY | storage.capacity | 5120
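As a sketch, these variables would be set as `env` entries on the DataKit container in the datakit.yaml DaemonSet spec (the values shown are the defaults from the table above):

```yaml
# Fragment of the DataKit container spec in datakit.yaml (illustrative).
- name: ENV_INPUT_PLOFFLOAD_STORAGE_PATH
  value: "./ploffload_storage"
- name: ENV_INPUT_PLOFFLOAD_STORAGE_CAPACITY
  value: "5120"
```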
Usage
After the configuration is complete, change the value of the configuration item pipeline.offload.receiver in the datakit.yaml main configuration file of the DataKit whose Pipeline data is to be offloaded to ploffload.
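A sketch of the relevant section on the offloading side, assuming the DataKit running the PlOffload collector is reachable at the hypothetical address 192.168.0.10:9529:

```toml
[pipeline]
  [pipeline.offload]
    receiver = "ploffload"
    # Address of the DataKit running the PlOffload collector
    # (hypothetical LAN address; 9529 is DataKit's default HTTP port).
    addresses = ["http://192.168.0.10:9529"]
```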
Please check whether the host address of the listen configuration item under [http_api] in the DataKit main configuration file is 0.0.0.0 (or a LAN or WAN IP). If it is in 127.0.0.0/8, the service is not accessible externally and the address needs to be modified.
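For example, on the receiving DataKit the listen item would bind an externally reachable address:

```toml
[http_api]
  # Bind all interfaces so other DataKits can reach the ploffload route;
  # 9529 is DataKit's default HTTP port.
  listen = "0.0.0.0:9529"
```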
If you need to enable the disk cache function, uncomment the storage-related lines in the collector configuration, for example:
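With the comments removed, the storage section from the sample above would look like:

```toml
[inputs.ploffload]
  [inputs.ploffload.storage]
    path = "./ploffload_storage"
    capacity = 5120
```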