DataKit Operator¶
DataKit Operator is a companion project for running DataKit under Kubernetes orchestration. It makes DataKit easier to deploy and provides additional functions such as verification and injection.
Overview¶
DataKit Operator provides automated injection capabilities for Kubernetes clusters through the Kubernetes Admission Controller mechanism, helping users integrate observability more easily. Key features include:
- DDTrace Injection: Automatically inject APM tracing agents for Java applications
- Log Collection: Automatically collect container logs via logfwd sidecar
- Performance Profiling: Inject Flameshot or Profiler components for application performance monitoring
- Configuration Management: Support both global configuration and declarative configuration injection methods
Core Advantages:
- Automated Deployment: No need to manually modify application YAML, reducing configuration errors
- Batch Management: Implement batch injection through namespace and label selectors
- Flexible Configuration: Support JSON configuration and fine-grained control via Annotations
- Version Compatibility: Maintain backward compatibility and support smooth upgrades
Prerequisites¶
- Kubernetes v1.24.1 or higher is recommended, with internet access (to download yaml files and pull the corresponding images)
- Ensure the MutatingAdmissionWebhook and ValidatingAdmissionWebhook controllers are enabled
- Ensure the admissionregistration.k8s.io/v1 API is enabled
Installation¶
DataKit Operator can be installed from a downloaded datakit-operator.yaml or with Helm. The Helm installation is as follows:
Prerequisites
- Kubernetes >= 1.14
- Helm >= 3.0
$ helm install datakit-operator datakit-operator \
--repo https://pubrepo.guance.com/chartrepo/datakit-operator \
-n datakit --create-namespace
Check deployment status:
$ kubectl get pod -n datakit
Upgrade using the following command:
$ helm -n datakit get values datakit-operator -a -o yaml > values.yaml
$ helm upgrade datakit-operator datakit-operator \
--repo https://pubrepo.guance.com/chartrepo/datakit-operator \
-n datakit \
-f values.yaml
Uninstall using the following command:
$ helm uninstall datakit-operator -n datakit
Attention
- The DataKit Operator program and its yaml must correspond exactly. If an outdated yaml is used, the new version of DataKit-Operator may fail to install. Please download the latest yaml.
- If an InvalidImageName error occurs, you can pull the image manually.
Configuration Explanation¶
DataKit Operator configuration is in JSON format, stored separately as a ConfigMap in Kubernetes, and loaded into the container as environment variables.
Starting from DataKit-Operator v1.7.0, it is recommended to use the admission_inject_v2 configuration item. The new configuration uses an array structure, supporting more flexible configuration methods.
{
"server_listen": "0.0.0.0:9543", // Operator service listening address
"log_level": "info", // Operator log level
"admission_inject_v2": { // Injection configuration v2
"ddtraces": [...], // DDTrace configuration array
"logfwds": [...], // Log forwarding configuration array
"flameshots": [...] // Profiling configuration array
},
"admission_mutate": { // Configuration mutation
"loggings": [...] // Log configuration mutation
}
}
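The `//` comments in the listing above are explanatory; strict JSON parsers reject comments, so they should not be copied into a real configuration unless the loader is known to tolerate them. As a minimal sketch, the same skeleton can be assembled programmatically and serialized to strict JSON. The helper name `build_operator_config` is hypothetical, and only fields named in this document are used; the remaining per-item fields (elided as `...` above) are deliberately left out.

```python
import json


def build_operator_config(namespace_patterns, label_selectors):
    """Assemble a DataKit Operator config skeleton as a plain dict.

    Hypothetical helper: it only uses field names that appear in this
    document and leaves every other per-item field out.
    """
    return {
        "server_listen": "0.0.0.0:9543",
        "log_level": "info",
        "admission_inject_v2": {
            "ddtraces": [
                {
                    "namespace_selectors": list(namespace_patterns),
                    "label_selectors": list(label_selectors),
                }
            ],
            "logfwds": [],
            "flameshots": [],
        },
        "admission_mutate": {"loggings": []},
    }


config = build_operator_config(["^testns$"], ["app=log-output"])
text = json.dumps(config, indent=2)  # strict JSON, no comments
print(text)
```

The printed text can then be placed in the ConfigMap that holds the Operator configuration.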
Injection Methods¶
DataKit Operator supports two resource input methods:
- Selector Configuration Injection (Imperative)
  Specify the Namespace and Selector of the target Pods in the DataKit-Operator configuration. Any Pod that meets the conditions is injected.
  Advantages: No need to add Annotations to the target Pod (but the target Pod needs to be restarted)
  Disadvantages: The scope is not precise enough, which can lead to unwanted injections
- Annotation Configuration Injection (Declarative)
  Add Annotations to the target Pod to control its own injection.
  Advantages: Injection can be rejected precisely, Pod by Pod, via Annotation
  Disadvantages: An Annotation alone cannot trigger injection. Besides enabling injection in the target Pod's Annotation, the matching rules and the related fields must still be set in the Operator configuration.
Selector Configuration Injection¶
Batch injection can be achieved by configuring namespace_selectors and label_selectors.
In the admission_inject_v2 configuration, namespace_selectors and label_selectors are configured directly in the array item. Taking DDTrace injection as an example:
{
"admission_inject_v2": {
"ddtraces": [
{
"namespace_selectors": ["testns"],
"label_selectors": ["app=log-output"],
...
}
]
}
}
- namespace_selectors: Namespace selector array, supports regular expression matching. For exact matching, surround the pattern with ^ and $, e.g., ^testns$
- label_selectors: Label selector array, uses Kubernetes Label Selector syntax
If both selectors are configured, the target Pod must satisfy both conditions. For guidance on writing label selectors, refer to the official Kubernetes documentation on labels and selectors.
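To make the matching rules concrete, here is a small sketch of how a selector item could be evaluated against a Pod. It is not the Operator's implementation. It assumes that one matching entry in an array is enough, that an item with no selectors selects nothing, and it handles only the `key=value` (equality) subset of label selector syntax. All function names are hypothetical.

```python
import re


def namespace_matches(patterns, namespace):
    # re.search matches anywhere in the string, so the unanchored pattern
    # "testns" also matches "testns-dev"; "^testns$" matches only "testns".
    return any(re.search(p, namespace) for p in patterns)


def selector_matches(selector, labels):
    # Equality-only subset: "k=v,k2=v2" means every term must hold.
    for term in selector.split(","):
        key, _, value = term.strip().partition("=")
        if labels.get(key) != value:
            return False
    return True


def pod_selected(item, namespace, labels):
    ns_patterns = item.get("namespace_selectors", [])
    label_sels = item.get("label_selectors", [])
    if not ns_patterns and not label_sels:
        return False  # assumption: nothing configured selects nothing
    if ns_patterns and not namespace_matches(ns_patterns, namespace):
        return False
    # Assumption: any one label selector entry matching is sufficient.
    if label_sels and not any(selector_matches(s, labels) for s in label_sels):
        return False
    return True


loose = {"namespace_selectors": ["testns"], "label_selectors": ["app=log-output"]}
strict = {"namespace_selectors": ["^testns$"], "label_selectors": ["app=log-output"]}
pod_labels = {"app": "log-output"}
print(pod_selected(loose, "testns-dev", pod_labels))
print(pod_selected(strict, "testns-dev", pod_labels))
```

The two calls differ only in anchoring, which is why the exact-match form with ^ and $ is worth using when namespaces share a prefix.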
Annotation Configuration Injection¶
Adding specific Annotations to a Deployment controls whether injection is allowed. Note that the Annotations must be added to the Pod template (spec.template.metadata.annotations), not to the Deployment's own metadata.
Supported Annotations are as follows:
| Annotation | Description | Values | Priority |
|---|---|---|---|
| admission.datakit/ddtrace.enabled | Controls ddtrace injection | "true"/"false" | Medium |
| admission.datakit/logfwd.enabled | Controls logfwd injection | "true"/"false" | Medium |
| admission.datakit/flameshot.enabled | Controls flameshot injection | "true"/"false" | Medium |
| admission.datakit/enabled | Controls all injection functions | "true"/"false" | Highest |
Example:
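As a hedged illustration of where the Annotation belongs, the sketch below builds a JSON merge patch that sets an Annotation on a Deployment's Pod template. The helper name `build_annotation_patch` is hypothetical; the nesting under spec.template.metadata.annotations is standard Kubernetes Deployment structure. Annotation values must be strings, so "false" stays quoted.

```python
import json


def build_annotation_patch(annotations):
    """Build a merge patch that sets annotations on the Pod template.

    Hypothetical helper: the annotations land in
    spec.template.metadata.annotations, not in the Deployment's metadata.
    """
    return {
        "spec": {
            "template": {
                "metadata": {"annotations": dict(annotations)}
            }
        }
    }


patch = build_annotation_patch({"admission.datakit/ddtrace.enabled": "false"})
print(json.dumps(patch))
```

The printed JSON could be applied with `kubectl patch deployment <name> -n <namespace> --type merge -p '<json>'`. Changing the Pod template triggers a rollout, so the replacement Pods pass through admission with the new Annotation.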
Tip
Annotations can be used to reject injection (set to "false"), but for injection to actually happen, both of the following are required:
- Matching rules (namespace_selectors/label_selectors) and the corresponding configuration fields are set in the DataKit-Operator configuration
- The Pod matches the configured selectors
Injection Method Summary
- Global Configuration: Suitable for batch scenarios, controlling injection scope via Operator configuration
- Annotation Configuration: Suitable for fine-grained control, deciding whether to inject via Pod annotations
- Priority: Annotation configuration takes precedence over global configuration, useful for rejecting injection
- Compatibility: Supported features vary slightly across versions, please refer to specific version notes
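The interaction between the two methods can be summarized as a small decision function. This is a sketch of the rules as described in this section, not the Operator's actual code: the function name `should_inject` and the `selectors_match` argument are hypothetical, and it assumes a feature-level "false" rejects injection even when the global Annotation is "true".

```python
GLOBAL_KEY = "admission.datakit/enabled"


def should_inject(feature, annotations, selectors_match):
    """Decide whether to inject one feature into a Pod.

    feature: "ddtrace", "logfwd" or "flameshot".
    annotations: the Pod's annotations as a dict of strings.
    selectors_match: whether the Operator configuration selected this Pod.
    """
    # Highest priority: the global switch rejects every injection.
    if annotations.get(GLOBAL_KEY) == "false":
        return False
    # Medium priority: the per-feature switch rejects that feature.
    if annotations.get(f"admission.datakit/{feature}.enabled") == "false":
        return False
    # An Annotation set to "true" cannot trigger injection by itself;
    # the configured selectors still have to match.
    return selectors_match


print(should_inject("ddtrace", {"admission.datakit/ddtrace.enabled": "true"}, False))
```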
Supported Injection Function List¶
| Function | Brief Description |
|---|---|
| DDTrace Agent | Inject DDTrace component, see DDTrace |
| logfwd | Inject logfwd component to collect logs inside containers, see logfwd |
| Flameshot | Inject Flameshot component for dynamic application Profiling, see Flameshot |
| async-profiler | Inject async-profiler for periodic Profiling of Java applications, see async-profiler |
| py-spy | Inject py-spy for Profiling of Python applications, see py-spy |
| logging | Inject log collection configuration, see Logging |
Downward API¶
In DataKit Operator v1.4.2 and later, values in envs can reference Kubernetes Downward API fields. The following fields are currently supported:
| Field | Description | Example |
|---|---|---|
| metadata.name | The name of the Pod | nginx-123 |
| metadata.namespace | The namespace of the Pod | middleware |
| metadata.uid | The unique ID of the Pod | 12345678-1234-1234-1234-123456789abc |
| metadata.annotations['<KEY>'] | The value of the Pod's annotation <KEY> | metadata.annotations['myannotation'] |
| metadata.labels['<KEY>'] | The value of the Pod's label <KEY> | metadata.labels['app'] |
| spec.serviceAccountName | The name of the Pod's service account | default |
| spec.nodeName | The name of the node where the Pod is running | node-01 |
| status.hostIP | The primary IP address of the node where the Pod is located | 192.168.1.1 |
| status.hostIPs | Dual-stack version of status.hostIP | ["192.168.1.1", "2001:db8::1"] |
| status.podIP | The primary IP address of the Pod | 10.0.0.1 |
| status.podIPs | Dual-stack version of status.podIP | ["10.0.0.1", "2001:db8::2"] |
For example, if there is a Pod named nginx-123 in the middleware namespace, and you want to inject the environment variables POD_NAME and POD_NAMESPACE, refer to the following:
{
"admission_inject": {
"ddtrace": {
"envs": {
"POD_NAME": "{fieldRef:metadata.name}",
"POD_NAMESPACE": "{fieldRef:metadata.namespace}"
}
}
}
}
Ultimately, the following environment variables are visible in that Pod:
POD_NAME=nginx-123
POD_NAMESPACE=middleware
Note
If a value placeholder is not recognized, it is added to the environment variable as a plain string. For example, "POD_NAME": "{fieldRef:metadata.PODNAME}" uses an unsupported field, so the environment variable becomes POD_NAME={fieldRef:metadata.PODNAME}.
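The placeholder rule can be sketched as follows. This is an illustration under stated assumptions, not the Operator's source: the function name `to_env_var` is hypothetical, the supported fields are taken from the table above, and the output shape (`valueFrom.fieldRef.fieldPath`) is the standard Kubernetes container env var form for Downward API references.

```python
import re

# Field paths listed as supported in this document.
SUPPORTED_FIELDS = {
    "metadata.name", "metadata.namespace", "metadata.uid",
    "spec.serviceAccountName", "spec.nodeName",
    "status.hostIP", "status.hostIPs", "status.podIP", "status.podIPs",
}
# metadata.annotations['<KEY>'] and metadata.labels['<KEY>']
KEYED_FIELD = re.compile(r"^metadata\.(annotations|labels)\['[^']+'\]$")
PLACEHOLDER = re.compile(r"^\{fieldRef:(.+)\}$")


def to_env_var(name, value):
    """Turn one envs entry into a Kubernetes container env var dict."""
    match = PLACEHOLDER.match(value)
    if match:
        path = match.group(1)
        if path in SUPPORTED_FIELDS or KEYED_FIELD.match(path):
            return {"name": name, "valueFrom": {"fieldRef": {"fieldPath": path}}}
    # Unrecognized placeholders are kept as plain strings.
    return {"name": name, "value": value}


print(to_env_var("POD_NAME", "{fieldRef:metadata.name}"))
print(to_env_var("POD_NAME", "{fieldRef:metadata.PODNAME}"))
```

The first call yields a Downward API reference that Kubernetes resolves at Pod start; the second keeps the literal text, matching the behavior described in the note.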
FAQ¶
How to disable injection for a specific Pod?¶
Add Annotation "admission.datakit/enabled": "false" to that Pod, and no operations will be performed for it. This has the highest priority.
How does it work?¶
DataKit-Operator uses the Kubernetes Admission Controller mechanism to inject resources. For details on the mechanism, see the official Kubernetes documentation.
What to note in AWS EKS environment?¶
When deployed in an AWS EKS environment, DataKit-Operator may not take effect. In that case, open port 9543 in the security group.
Troubleshooting Guide¶
| Issue | Possible Cause | Solution |
|---|---|---|
| Injection not taking effect | Webhook not configured correctly | Check that the MutatingAdmissionWebhook and ValidatingAdmissionWebhook controllers are enabled |
| Image pull failed | Image address or permission issue | Verify image address, check image repository access permissions |
| Port unreachable | Network or security group configuration | Open port 9543, check network policies |
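For the "Port unreachable" row, a quick TCP check can help separate network problems from configuration problems. The sketch below uses only the Python standard library; the function name `port_reachable` is hypothetical, and the host to test is an assumption that depends on how the Operator is exposed in your cluster.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Example call (replace the placeholder host with a real address):
# port_reachable("<operator-host>", 9543)
```

A successful check from a workstation only approximates the path the Kubernetes API server takes to the webhook, so a check run from a node or from inside the cluster is more representative.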