Event Data Sharding Practice: Implementation Based on Dataway Sink
This document explains how to shard event data (keyevent) by injecting HTTP headers from DataFlux Func and configuring Dataway Sinker rules. With this approach, events with different business attributes or environment characteristics can be routed to specified workspaces.
Identifier injection on the DataFlux Func side: when event data is reported, Func dynamically generates the X-Global-Tags header according to its configuration. The header carries the key-value pairs required for sharding (e.g., env=prod).
Dataway Routing Match: Dataway forwards events carrying specific identifiers to corresponding workspaces based on the rules defined in sinker.json.
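The two steps above hinge on a header that carries key-value pairs. The following is a minimal sketch of how such a header value could be assembled, assuming the pairs are serialized as comma-separated key=value entries. The helper name build_x_global_tags is hypothetical and is not part of DataFlux Func; Func generates the header itself once configured.

```python
def build_x_global_tags(tags):
    """Serialize a dict of tags into a comma-separated key=value string.

    Assumption: the header value uses the form "k1=v1,k2=v2".
    """
    parts = []
    for key, value in sorted(tags.items()):
        if value is None:
            continue  # skip tags that have no value
        parts.append(f"{key}={value}")
    return ",".join(parts)


# Hypothetical request headers for an event reported from the "monitor" source
headers = {
    "X-Global-Tags": build_x_global_tags({"env": "prod", "df_source": "monitor"}),
}
```

Sorting the keys keeps the header value deterministic, which makes it easier to compare against the rules in sinker.json when debugging.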
Route all workspace events to the "Event Central Management" workspace:
Access the Launcher console;
Open the menu in the top-right corner and select Modify Application Configuration;
Locate the func2Config configuration item under the func2 namespace;
Add configuration:
    CUSTOM_INTERNAL_DATAWAY_X_GLOBAL_TAGS:
      - category: keyevent   # Data category
        fields: df_source    # Field used for sharding; here, enter the fixed identifier field for the event
Configure the Dataway Sinker rule: Modify the sinker.json configuration file and set the data routing rules:
    {
      "strict": true,
      "rules": [
        {
          "rules": ["{ df_source = 'monitor' }"],
          "url": "workspace data reporting address"
        }
      ]
    }
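To make the matching step concrete, here is a simplified model of how a rule such as { df_source = 'monitor' } could be checked against the tags in X-Global-Tags. It is a sketch under stated assumptions, not Dataway's actual implementation: it only understands the single equality form used above, it assumes the header is a comma-separated key=value list, and the names parse_header and matching_urls are hypothetical.

```python
import json
import re

SINKER_JSON = """
{
  "strict": true,
  "rules": [
    {"rules": ["{ df_source = 'monitor' }"], "url": "workspace data reporting address"}
  ]
}
"""

# Recognizes only the simple form { key = 'value' } (an assumption of this sketch)
_RULE_RE = re.compile(r"^\{\s*(\w+)\s*=\s*'([^']*)'\s*\}$")


def parse_header(value):
    """Turn "k1=v1,k2=v2" into a dict."""
    tags = {}
    for pair in value.split(","):
        if "=" in pair:
            key, _, val = pair.partition("=")
            tags[key.strip()] = val.strip()
    return tags


def matching_urls(sinker, header_value):
    """Return the url of every rule entry with at least one matching condition."""
    tags = parse_header(header_value)
    urls = []
    for entry in sinker["rules"]:
        for rule in entry["rules"]:
            m = _RULE_RE.match(rule.strip())
            if m and tags.get(m.group(1)) == m.group(2):
                urls.append(entry["url"])
                break  # one matching condition is enough for this entry
    return urls


sinker = json.loads(SINKER_JSON)
```

Calling matching_urls(sinker, "df_source=monitor") returns the single configured url, while a header without that pair matches nothing.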
Extracts data fields (including Tags and Fields); supports direct extraction and rule-based extraction
| Configuration item | Type | Default | Description |
| --- | --- | --- | --- |
| [#].fields[#] | string | - | Extracts field names and supports additional extraction fields (see table below) |
| [#].fields[#] | dict | - | Extraction field rules |
| [#].fields[#].src | string | - | Extracts field names and supports additional extraction fields (see table below) |
| [#].fields[#].dest | string | Same as src | Field name to write into the Header after extraction |
| [#].fields[#].default | string | - | Default value to write into the Header when the specified field does not exist |
| [#].fields[#].fixed | string | - | Fixed value written to the Header |
| [#].fields[#].remap | dict | null | Maps extracted field values |
| [#].fields[#].remap_default | string | - | Default value used when an extracted field value has no corresponding mapping. If not specified, the original value is retained. Specifying null ignores this field |
| [#].filter | dict/string | null | Filters data. Supports Tag filtering and filterString filtering |
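The interplay of src, dest, default, fixed, remap, and remap_default is easier to follow with an example. The sketch below is one possible reading of the table, not the Func implementation: the function apply_field_rule and the sample point are hypothetical, and looking in fields before tags is an assumption.

```python
_UNSET = object()  # distinguishes "option not given" from an explicit None (null)


def apply_field_rule(rule, point):
    """Return (header_key, header_value), or None when the field is ignored."""
    if isinstance(rule, str):
        rule = {"src": rule}  # string form: extract the field directly
    src = rule.get("src")
    dest = rule.get("dest") or src  # dest defaults to the same name as src
    if "fixed" in rule:
        return dest, str(rule["fixed"])  # fixed value, nothing is extracted
    # Assumption: fields take precedence over tags when both define the key
    data = {**point.get("tags", {}), **point.get("fields", {})}
    value = data.get(src, rule.get("default"))
    if value is None:
        return None  # field missing and no default configured
    remap = rule.get("remap")
    if remap:
        if value in remap:
            value = remap[value]
        else:
            fallback = rule.get("remap_default", _UNSET)
            if fallback is None:
                return None  # explicit null: ignore this field
            if fallback is not _UNSET:
                value = fallback  # otherwise the original value is kept
    return dest, str(value)


point = {"fields": {"df_source": "monitor"}, "tags": {"env": "production"}}
rule = {"src": "env", "dest": "environment", "remap": {"production": "prod"}}
result = apply_field_rule(rule, point)  # ("environment", "prod")
```

A sentinel object is used for remap_default because the table gives "not specified" and null different meanings, which a plain dict lookup with a None default could not tell apart.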
Example Function located in script set my_script_set, script my_script
    def make_global_tags(category, point, extra_fields):
        # Only process event type data
        if category != 'keyevent':
            return

        global_tags_list = {}

        # Get name and region fields from the data's fields or tags
        name = point['fields'].get('name') or point['tags'].get('name')
        region = point['fields'].get('region') or point['tags'].get('region')

        # Get the prefix of name
        if name:
            prefix = str(name).split('-')[0]
            global_tags_list['name_prefix'] = prefix

        # Get the suffix of region
        if region:
            suffix = str(region).split('-').pop()
            global_tags_list['region_suffix'] = suffix

        # Return
        return global_tags_list