Dataway Sink¶
Datakit 1.14.0 and later can use the Sinker feature described here.
Dataway Sinker Features¶
In daily data collection, since there are multiple workspaces, we may need to send different data to different workspaces. For example, in a shared Kubernetes cluster, the collected data may involve different teams or business departments. In this case, we can send data with specific attributes to different workspaces, achieving fine-grained collection in shared infrastructure scenarios.
The processing flow of Sink requests is as follows:
```mermaid
sequenceDiagram
  autonumber
  participant dk as DataKit
  box Dataway server
    participant etcd
    participant dw as DataWay
    participant rmatch as Rule matching
    participant drop as Drop
  end
  box Workspaces
    participant wksp1 as Workspace
    participant wkspx as Default workspace
  end

  etcd ->> dw: pull sinker rules
  activate dw
  dk ->> dw: upload
  deactivate dw
  alt non-sink request
    dw ->> wksp1: write
  else sink request
    dw ->> rmatch: matching
  end
  alt match ok (at least 1 workspace)
    rmatch ->> wksp1: write
  else match failed but default workspace enabled
    rmatch ->> wkspx: write
  else no default workspace
    rmatch ->> drop: drop
  end
```
Since Dataway 1.8.0, a single Dataway can receive both Sinker and non-Sinker requests, so only one Dataway needs to be deployed.
Dataway Cascaded Mode¶
SaaS users can deploy a Dataway locally (in their Kubernetes cluster) dedicated to traffic splitting, and have it forward the data to Openway:
Warning
In cascaded mode, the Dataway within the cluster needs to enable the cascaded option. See the environment variable description in the installation documentation.
```mermaid
sequenceDiagram
  autonumber
  participant dk as DataKit
  box local Dataway server
    participant etcd
    participant dw1 as DataWay
  end
  box SaaS Dataway server
    participant dw2 as DataWay
  end
  box Workspaces
    participant wksp1 as Workspace
  end

  etcd ->> dw1: pull sinker rules
  dk ->> dw1: upload data
  dw1 ->> dw1: sink rule matching
  dw1 ->> dw2: deliver request
  dw2 ->> wksp1: write
```
Impact of cascading:
- The behavior of some APIs differs. For historical reasons, the request URLs sent by Datakit and those on Kodo are different, and Dataway normally acts as an API translator; in cascaded scenarios, this API translation is disabled.
- The cascaded Dataway does not send heartbeat requests to the center, because the next-level Dataway does not handle that request (it would return 404).
- When the cascaded Dataway forwards received requests to the next Dataway, the API requests are not signed.
Dataway Installation¶
See here
Dataway Settings¶
In addition to the regular settings of Dataway, several additional configurations need to be set (located in /usr/local/cloudcare/dataflux/dataway/dataway.yaml):
```yaml
# Set the address where Dataway uploads data, usually Kodo, but it can also be another Dataway.
remote_host: https://kodo.guance.com

# If the upload address is another Dataway, set this to true, indicating Dataway cascading.
cascaded: false

# This token is set arbitrarily on the Dataway side; fill the same value into Datakit's datakit.conf.
# It must have a certain length and format.
secret_token: tkn_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

# sinker rule settings
sinker:
  etcd: # supports etcd
    urls:
      - http://localhost:2379
    dial_timeout: 30s
    key_space: /dw_sinker
    username: "dataway"
    password: "<PASSWORD>"
  #file: # local file mode is also supported, often used for debugging
  #  path: /path/to/sinker.json
```
Warning
If secret_token is not set, any request sent by Datakit will be accepted; this does not cause data problems. However, if Dataway is deployed on the public network, setting secret_token is recommended.
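On the Datakit side, this secret_token is then used as the token in the Dataway address configured in datakit.conf. A minimal sketch, assuming the local Dataway listens at the hypothetical address my-dataway:9528:

```toml
# /usr/local/datakit/conf.d/datakit.conf
[dataway]
  # The token here is the secret_token configured on Dataway above.
  urls = ["https://my-dataway:9528?token=tkn_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"]
```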
Sinker Rule Settings¶
Dataway Sinker rules are a set of JSON-formatted configurations. Matching rules are written in the same way as blacklist rules; see here.
Currently, two configuration sources are supported:
- Local JSON file: mainly used for debugging Sinker rules. In this mode, Dataway must be restarted after the JSON file is updated for the changes to take effect.
- etcd: store the debugged rule file in etcd. Subsequent rule adjustments only require updating etcd directly, with no Dataway restart.

The JSON stored in etcd is identical to the JSON content of the local file. Only the etcd hosting method is introduced below.
etcd Settings¶
The following commands are run on Linux.
Dataway acts as an etcd client. Create the following user and role for it in etcd (etcd 3.5+); see here.
Create the dataway user and the corresponding role:

```shell
# Add a user; you will be prompted to enter a password.
$ etcdctl user add dataway

# Add the sinker role.
$ etcdctl role add sinker

# Grant the sinker role to the dataway user.
$ etcdctl user grant-role dataway sinker

# Restrict the role's key permissions (/dw_sinker and /ping are the two keys used by default).
$ etcdctl role grant-permission sinker readwrite /dw_sinker
$ etcdctl role grant-permission sinker readwrite /ping # used to check connectivity
```
Why create a role?
The role is used to control the corresponding user's permissions on specific keys. The etcd service here may be an existing one belonging to the user, so the Dataway user's data permissions need to be restricted.
Warning
If etcd has authentication enabled, the etcdctl commands must carry the corresponding username and password:
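For example (a sketch; the key and credentials depend on your setup):

```shell
$ etcdctl --user dataway --password '<PASSWORD>' get /dw_sinker
```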
Write Sinker Rules¶
Dataway 1.3.6 and later supports managing the Sinker rules in etcd through the `dataway` command.
Assume the sinker.json rule definition is as follows:
```json
{
  "strict": true,
  "rules": [
    {
      "rules": [
        "{ host = 'my-host' }"
      ],
      "url": "https://kodo.guance.com?token=tkn_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    },
    {
      "rules": [
        "{ host = 'my-host' OR cluster = 'cluster-A' }"
      ],
      "url": "https://kodo.guance.com?token=tkn_yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy"
    }
  ]
}
```
Use the following command to write the Sinker rule configuration:
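Presumably, based on the `--put` and `--file` flags of the `dataway sink` command documented later in this section:

```shell
$ ./dataway sink --put --file sinker.json
```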
Identify workspace information
Since sinker.json does not support comments, we can add an info field to the JSON as a memo to achieve the effect of a comment:
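A sketch reusing the rule above (the info text is arbitrary):

```json
{
  "rules": [
    "{ host = 'my-host' }"
  ],
  "info": "Workspace for team A",
  "url": "https://kodo.guance.com?token=tkn_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
}
```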
Default Rule¶
Add the as_default flag to a specific rule entry to make it the default fallback rule. The fallback rule must not set any matching conditions (do not configure the rules field), and it does not participate in regular rule matching. A suggested fallback rule is as follows:
```json
{
  "as_default": true,
  "info": "This is the default fallback workspace",
  "url": "https://kodo.guance.com?token=tkn_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
}
```
Note: only one fallback rule should be set. If the sinker configuration contains multiple fallback rules, the last one takes effect.
Token Specification¶
Since Datakit validates the token against Dataway, the token set here (including secret_token) must meet the following conditions:

- Start with `token_` or `tkn_`, followed by exactly 32 characters.

If the token does not meet these conditions, Datakit installation will fail.
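A quick sanity check of the format (a sketch; it only verifies the prefix and the 32-character suffix described above):

```shell
$ echo "tkn_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" | grep -qE '^(token|tkn)_.{32}$' && echo valid || echo invalid
valid
```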
Datakit End Settings¶
In Datakit, we need to make several settings so that it can tag the collected data with specific labels for grouping.
- Configure the global custom Key list.
Datakit will look for fields with these Keys in the data it collects (only looking for string-type fields), extract them, and use them as the basis for grouping and sending.
- Configure "Global Host Tags" and "Global Election Tags"
In all data uploaded by Datakit, these global tags (including tag key and tag value) will be attached as the basis for grouping and sending.
Datakit End Customer Key Settings¶
If you want the data collected by a specific Datakit to meet the needs of traffic splitting, you need to ensure a few points:
- Datakit has enabled the Sinker function.
- Datakit has configured valid Global Customer Keys.
These two configurations are as follows:
```toml
# /usr/local/datakit/conf.d/datakit.conf
[dataway]
  # Specify a set of customer keys.
  global_customer_keys = [
    # Example: add two keys, category and class.
    # Do not configure too many keys here; 2 ~ 3 is generally enough.
    "category",
    "class",
  ]

  # Enable the sinker feature.
  enable_sinker = true
```
In addition to regular data categories, binary data such as Session Replay and Profiling is also supported, so any field name can be selected here. Note that non-string fields should not be configured: valid Keys generally come from Tags (all Tag values are strings), and Datakit will not use non-string fields as the basis for traffic splitting.
Impact of Global Tags¶
In addition to global_customer_keys, the global Tags configured on Datakit (both global election Tags and global host Tags) also affect the traffic-splitting mark: if a data point carries fields whose keys appear in the global Tags (and whose values are strings), those fields are also counted for traffic splitting.

Suppose the global election Tags include the key cluster. For a data point that itself carries a cluster Tag, the key-value pair from the point, say cluster=cluster_A, will be appended to the final X-Global-Tags header, regardless of the value configured for the global Tag. If global_customer_keys also configures the app key, the final split header carries both key-value pairs (the order of the two pairs does not matter).
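A hedged illustration (the Tag values, the sample point, and the exact header layout are hypothetical; the X-Global-Tags format follows the error message shown in the FAQ below):

```text
# Global election Tag configured in datakit.conf; its value deliberately
# differs from the point's own value:
cluster = "cluster_T"

# A collected data point (line protocol) carrying its own cluster and app Tags:
some_metric,cluster=cluster_A,app=web field=1i

# Final split header, with "app" also listed in global_customer_keys
# (the order of the two pairs does not matter):
X-Global-Tags: cluster=cluster_A,app=web
```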
Note
In this example, the value of cluster configured in datakit.conf deliberately differs from the value of the cluster field in the data point, to emphasize that only the Tag Key matters here: once a matching global Tag Key appears in a data point, the effect is equivalent to adding that Key to global_customer_keys.
Dataway sink Command¶
Starting from version 1.3.6, Dataway supports managing the sinker configuration through the command line. Usage is as follows:
```shell
$ ./dataway sink --help
Usage of sink:
  -add string
        single rule json file
  -cfg-file string
        configure file (default "/usr/local/cloudcare/dataflux/dataway/dataway.yaml")
  -file string
        file path of the rule json, only used for command put and get
  -get
        get the rule json
  -list
        list rules
  -log string
        log file path (default "/dev/null")
  -put
        save the rule json
  -token string
        rules filtered by token, eg: xx,yy
```
Specify configuration file
When the command is executed, the default loaded configuration file is /usr/local/cloudcare/dataflux/dataway/dataway.yaml. If you need to load other configurations, you can specify it through --cfg-file.
Command log settings
By default, the command output log is disabled. If you need to view it, you can set the --log parameter.
```shell
# output the log to stdout
$ ./dataway sink --list --log stdout

# output the log to a file
$ ./dataway sink --list --log /tmp/log
```
View rule list
```shell
# list all rules
$ ./dataway sink --list

# list rules filtered by token
$ ./dataway sink --list --token=token1,token2
CreateRevision: 2
ModRevision: 41
Version: 40
Rules:
[
  {
    "rules": [
      "{ workspace = 'zhengb-test' }"
    ],
    "url": "https://openway.guance.com?token=token1"
  }
]
```
Add rule
Create a rule file rule.json, the content can refer to the following:
```json
[
  {
    "rules": [
      "{ host = 'HOST1' }"
    ],
    "url": "https://openway.guance.com?token=tkn_xxxxxxxxxxxxx"
  },
  {
    "rules": [
      "{ host = 'HOST2' }"
    ],
    "url": "https://openway.guance.com?token=tkn_yyyyyyyyyyyyy"
  }
]
```
Add rule
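Presumably, based on the `--add` flag in the help output above:

```shell
$ ./dataway sink --add rule.json
```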
Export configuration
Exporting the configuration can save the sinker configuration to a local file.
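Presumably, using the `--get` and `--file` flags:

```shell
# export the current sinker configuration to a local file
$ ./dataway sink --get --file /path/to/sinker.json
```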
Write configuration
Writing rules synchronizes the local rule file to sinker.
Create a rule file sink-put.json, the content can refer to the following:
```json
{
  "rules": [
    {
      "rules": [
        "{ workspace = 'test' }"
      ],
      "url": "https://openway.guance.com?token=tkn_xxxxxxxxxxxxxx"
    }
  ],
  "strict": true
}
```
Write configuration
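Presumably, using the `--put` and `--file` flags:

```shell
$ ./dataway sink --put --file sink-put.json
```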
Configuration Examples¶
Kubernetes dataway.yaml example (expand to view)
Specify the sinker JSON directly in the yaml:
```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: deployment-utils-dataway
  name: dataway
  namespace: utils
spec:
  replicas: 1
  selector:
    matchLabels:
      app: deployment-utils-dataway
  template:
    metadata:
      labels:
        app: deployment-utils-dataway
      annotations:
        datakit/logs: |
          [{"disable": true}]
        datakit/prom.instances: |
          [[inputs.prom]]
            url = "http://$IP:9090/metrics" # The port here (default 9090) depends on the situation.
            source = "dataway"
            measurement_name = "dw" # Fixed to this measurement.
            interval = "10s"
            [inputs.prom.tags]
              namespace = "$NAMESPACE"
              pod_name = "$PODNAME"
              node_name = "$NODENAME"
    spec:
      affinity:
        podAffinity: {}
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - deployment-utils-dataway
              topologyKey: kubernetes.io/hostname
      containers:
        - image: registry.jiagouyun.com/dataway/dataway:1.3.6 # Choose the appropriate version number here.
          #imagePullPolicy: IfNotPresent
          imagePullPolicy: Always
          name: dataway
          env:
            - name: DW_REMOTE_HOST
              value: "http://kodo.forethought-kodo:9527" # Fill in the real Kodo address here, or the next Dataway address.
            - name: DW_BIND
              value: "0.0.0.0:9528"
            - name: DW_UUID
              value: "agnt_xxxxx" # Fill in the real Dataway UUID here.
            - name: DW_TOKEN
              value: "tkn_oooooooooooooooooooooooooooooooo" # Fill in the real Dataway token here, usually the token of the system workspace.
            - name: DW_PROM_LISTEN
              value: "0.0.0.0:9090"
            - name: DW_SECRET_TOKEN
              value: "tkn_zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz"
            - name: DW_SINKER_FILE_PATH
              value: "/usr/local/cloudcare/dataflux/dataway/sinker.json"
          ports:
            - containerPort: 9528
              name: 9528tcp01
              protocol: TCP
          volumeMounts:
            - mountPath: /usr/local/cloudcare/dataflux/dataway/cache
              name: dataway-cache
            - mountPath: /usr/local/cloudcare/dataflux/dataway/sinker.json
              name: sinker
              subPath: sinker.json
          resources:
            limits:
              cpu: '4'
              memory: 4Gi
            requests:
              cpu: 100m
              memory: 512Mi
      # nodeSelector:
      #   key: string
      imagePullSecrets:
        - name: registry-key
      restartPolicy: Always
      volumes:
        - hostPath:
            path: /root/dataway_cache
          name: dataway-cache
        - configMap:
            name: sinker
          name: sinker
---
apiVersion: v1
kind: Service
metadata:
  name: dataway
  namespace: utils
spec:
  ports:
    - name: 9528tcp02
      port: 9528
      protocol: TCP
      targetPort: 9528
      nodePort: 30928
  selector:
    app: deployment-utils-dataway
  type: NodePort
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: sinker
  namespace: utils
data:
  sinker.json: |
    {
      "strict": true,
      "rules": [
        {
          "rules": [
            "{ project = 'xxxxx'}"
          ],
          "url": "http://kodo.forethought-kodo:9527?token=tkn_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
        },
        {
          "rules": [
            "{ project = 'xxxxx'}"
          ],
          "url": "http://kodo.forethought-kodo:9527?token=tkn_yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy"
        }
      ]
    }
```
Ingress configuration example (expand to view)
FAQ¶
View details of dropped requests¶
When a request does not match the Sinker rules, Dataway discards it and increments a drop counter in its metrics. During debugging, however, we need to know the specifics of a discarded request, especially the X-Global-Tags information carried in the request header.
We can search the Dataway log with the following command:
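A sketch of such a search (the log path and the keyword are assumptions; adjust them to your deployment):

```shell
$ grep 'X-Global-Tags' /usr/local/cloudcare/dataflux/dataway/log/dataway.log
```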
In the output results, we can see output similar to the following:
Datakit request dropped troubleshooting¶
When a Datakit request is dropped by Dataway, Dataway will return the corresponding HTTP error. In the Datakit log, there will be an error similar to the following:
```text
post 3641 to http://dataway-ip:9528/v1/write/metric failed(HTTP: 406 Not Acceptable):
{"error_code":"dataway.sinkRulesNotMatched","message":"X-Global-Tags: `host=my-host',
URL: `/v1/write/metric'"}, data dropped
```
This error indicates that the request to /v1/write/metric was dropped because its X-Global-Tags did not match any of the rules on Dataway.
Meanwhile, in the Datakit monitor (datakit monitor -V), the Status column of the DataWay APIs panel (lower right) will show Not Acceptable, indicating that the corresponding Dataway API request was dropped.
Viewing Datakit's own metrics, you can also see the corresponding metrics:
```shell
$ curl -s http://localhost:9529/metrics | grep datakit_io_dataway_api_latency_seconds_count
datakit_io_dataway_api_latency_seconds_count{api="/v1/datakit/pull",status="Not Acceptable"} 50
datakit_io_dataway_api_latency_seconds_count{api="/v1/write/metric",status="Not Acceptable"} 301
```
Datakit reports error 403¶
If the sinker configuration on Dataway is incorrect, all Datakit requests fall back to using secret_token; since the center (Kodo) does not recognize this token, a 403 error (kodo.tokenNotFound) is reported.

A likely cause is an incorrect etcd username or password, which prevents Dataway from fetching the Sinker configuration. Dataway then considers the current sinker invalid and sends all data directly to the center.
etcd permission configuration problem¶
If an error like the following appears in the Dataway log, the permission settings are likely at fault:
If the permissions are not configured properly, you can delete all existing Dataway-related permissions and reconfigure them. For details, see here.
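A sketch of such a reset, reusing the role and keys from the etcd setup above:

```shell
# revoke the existing key permissions from the sinker role
$ etcdctl role revoke-permission sinker /dw_sinker
$ etcdctl role revoke-permission sinker /ping

# grant them again
$ etcdctl role grant-permission sinker readwrite /dw_sinker
$ etcdctl role grant-permission sinker readwrite /ping
```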
Datakit end Key coverage relationship¶
When configuring the "global custom Key list", if the "global host Tags" or "global election Tags" contain a Key with the same name, the Key-Value pair from the collected data takes precedence.

For example, suppose the "global custom Key list" contains key1,key2,key3, and the "global host Tags" or "global election Tags" also configure these Keys with values such as key1=value-1. If a piece of collected data also carries the field key1=value-from-data, then the final grouping uses key1=value-from-data from the data, ignoring the value of the same-named Key configured in the "global host Tags" and "global election Tags".

If the "global host Tags" and "global election Tags" contain Keys with the same name, the Key in the "global election Tags" takes precedence. In summary, the priority of Key value sources is as follows (decreasing):
- Collected data
- Global election Tag
- Global host Tag
Built-in "Global Custom Key"¶
Datakit has several built-in custom Keys. They generally do not appear in the collected data, but Datakit can use them to group data. If traffic splitting along these Key dimensions is needed, add them to the "global custom Key" list (none of them are configured by default).
Warning
Adding "global custom Key" will cause data to be packaged when sending. If the granularity is too fine, it will cause Datakit's upload efficiency to drop rapidly. Generally, it is not recommended to have more than 3 "global custom Keys".
`class`: for object data. Once enabled, traffic is split based on the object's classification. For example, the object classification of a Pod is `kubelet_pod`, so we can write traffic-splitting rules for Pods:
```json
{
  "strict": true,
  "rules": [
    {
      "rules": [
        "{ class = 'kubelet_pod' AND other_condition = 'some-value' }"
      ],
      "url": "https://kodo.guance.com?token=<YOUR-TOKEN>"
    },
    {
      ... # other rules
    }
  ]
}
```
`measurement`: for metric data; we can send specific measurement sets to specific workspaces. For example, the measurement name for disk metrics is `disk`, so we can write the rule like this:
```json
{
  "strict": true,
  "rules": [
    {
      "rules": [
        "{ measurement = 'disk' AND other_condition = 'some-value' }"
      ],
      "url": "https://kodo.guance.com?token=<YOUR-TOKEN>"
    },
    {
      ... # other rules
    }
  ]
}
```
`source`: for logging (L), eBPF network (N), event (E), and RUM data.

`service`: for Tracing, Scheck, and Profiling data.

`category`: for all regular data categories; its value is the "name" column of the corresponding data category (such as `metric` for time series, `object` for objects, etc.). Taking logs as an example, we can write traffic-splitting rules specifically for logs like this:
```json
{
  "strict": true,
  "rules": [
    {
      "rules": [
        "{ category = 'logging' AND other_condition = 'some-value' }"
      ],
      "url": "https://kodo.guance.com?token=<YOUR-TOKEN>"
    },
    {
      ... # other rules
    }
  ]
}
```
Special traffic splitting behavior¶
Some requests initiated by Datakit are for pulling resources from the center or for identifying itself. Their behavior is atomic and cannot be split further, and such requests cannot be distributed to multiple workspaces, because Datakit must process the responses of these API requests to decide its subsequent behavior. Therefore, these APIs can be routed to at most one workspace.

If multiple traffic-splitting rules match, these APIs are routed only to the workspace pointed to by the first matching rule.

We suggest adding a rule like the following to the Sinker rules, so that these API requests from Datakit are assigned to a specific workspace:
```json
{
  "strict": true,
  "info": "Special workspace (only for data pulling and other APIs)",
  "rules": [
    {
      "rules": [
        "{ __dataway_api in ['/v1/datakit/pull', '/v1/election', '/v1/election/heartbeat', '/v1/query/raw', '/v1/workspace', '/v1/object/labels', '/v1/check/token'] }"
      ],
      "url": "https://kodo.guance.com?token=<SOME-SPECIAL-WORKSPACE-TOKEN>"
    }
  ]
}
```
Info
These API URLs are explained as follows:
- /v1/election: election request.
- /v1/election/heartbeat: election heartbeat request.
- /v1/datakit/pull: pull center-configured Pipelines and blacklists.
- /v1/query/raw: DQL query.
- /v1/workspace: get workspace information.
- /v1/object/labels: update/delete object data.
- /v1/check/token: check workspace Token information.
Here, the key __dataway_api does not need to be configured in the global_customer_keys of datakit.conf: Dataway uses it as a traffic-splitting Key by default, taking the current request's API route as its value. That is, for a given API, the effect on traffic splitting is the same as if the request itself carried a matching __dataway_api key-value pair.
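For instance, a pull request is matched as if it carried the following header (an illustration; the header format follows the FAQ example below):

```text
X-Global-Tags: __dataway_api=/v1/datakit/pull
```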
Therefore, we can use the __dataway_api key-value pair directly in Sinker rules for matching. This also means that this special rule should not include regular data-upload APIs, such as the /v1/write/... interfaces; otherwise, which workspace the data ultimately lands in is undefined.