
Enable Observability for the Deployment Plan

Overview

This document helps users of the Deployment Plan implement observability for their deployment, improving the overall reliability of the Guance service. It covers two classic observability patterns and explains how to deploy DataKit data collection, log collection and parsing, APM, Synthetic Tests, and RUM in a Kubernetes environment. It also provides one-click import template files for infrastructure and middleware observability and for application service observability, so you can monitor your environment more effectively.

Deployment Plan Observability Patterns

Pattern 1: Self-Observability

In this pattern, the deployment monitors itself: data is sent to the deployment's own workspace. The advantage is ease of deployment. The disadvantages are that the collected telemetry is itself ingested and generates further telemetry, creating a closed feedback loop, and that if the cluster goes down, the observability data goes down with it, leaving you unable to troubleshoot the failure.

Pattern 2: Mutual Observability

In this pattern, multiple Guance deployments send their data to a single designated instance. The advantage is that it avoids the closed data loop and allows the cluster's status to be monitored in real time, even if the monitored cluster itself fails.


Account Information Preparation

| Name | Type | Description | Creation Syntax (note: change the passwords) | Importance |
| --- | --- | --- | --- | --- |
| Private FUNC Database Account | DB & USER | FUNC service connection account | `CREATE DATABASE private_func;`<br>`CREATE USER 'private_func'@'%' IDENTIFIED BY 'V4KySbFhzDkxxxx';`<br>`GRANT ALL PRIVILEGES ON private_func.* TO 'private_func'@'%';`<br>`FLUSH PRIVILEGES;` | Optional |
| MySQL Self-Observability Account | USER | Account used by DataKit to collect MySQL metrics | `CREATE USER 'datakit'@'%' IDENTIFIED WITH caching_sha2_password BY 'SFGS&DFxxxx32!';`<br>`GRANT PROCESS ON *.* TO 'datakit'@'%';`<br>`GRANT SELECT ON *.* TO 'datakit'@'%';`<br>`SHOW DATABASES LIKE 'performance_schema';`<br>`GRANT SELECT ON performance_schema.* TO 'datakit'@'%';`<br>`GRANT SELECT ON mysql.user TO 'datakit'@'%';`<br>`GRANT REPLICATION CLIENT ON *.* TO 'datakit'@'%';` | Important |
| Business Data Collection Account | USER | Used by FUNC to collect business data | `CREATE USER 'read'@'%' IDENTIFIED BY 'u19e0LmkL8Fxxxx';`<br>`GRANT SELECT ON df_core.* TO 'read'@'%';`<br>`FLUSH PRIVILEGES;` | Optional |
| PostgreSQL Self-Observability Account | USER | Used for GuanceDB 3.0 monitoring | `CREATE USER datakit WITH PASSWORD 'Z7ZdQ326EeexxxxP';`<br>`GRANT pg_monitor TO datakit;`<br>`GRANT CONNECT ON DATABASE scopedb_meta TO datakit;`<br>`GRANT SELECT ON pg_stat_database TO datakit;` | Optional |
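After creating the accounts, you can verify the grants before wiring the accounts into the collectors. A quick sanity check (host names are placeholders):

mysql -h <mysql-host> -u datakit -p -e "SHOW GRANTS FOR CURRENT_USER();"
psql -h <pg-host> -U datakit -d scopedb_meta -c "SELECT 1;"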

Configure Data Collection

Deploy DataKit

1) Download datakit.yaml

Note

The datakit.yaml template already contains the default middleware collection configuration; only minor modifications are required for use.

2) Modify the DaemonSet template file in datakit.yaml

   - name: ENV_DATAWAY
     value: https://openway.guance.com?token=tkn_a624xxxxxxxxxxxxxxxxxxxxxxxx74 ## Fill in your actual DataWay address and token
   - name: ENV_GLOBAL_TAGS
     value: host=__datakit_hostname,host_ip=__datakit_ip,guance_site=guance,cluster_name_k8s=guance # Adjust tag values to your environment; these tags drive the dashboard variables
   - name: ENV_GLOBAL_ELECTION_TAGS
     value: guance_site=guance,cluster_name_k8s=guance     # Adjust tag values to your environment
   image: pubrepo.guance.com/datakit/datakit:1.65.2     ## Change to the latest image version

3) Modify the related configuration for ConfigMap in datakit.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: datakit-conf
  namespace: datakit
data:
    mysql.conf: |-
        [[inputs.mysql]]
          host = "xxxxxxxxxxxxxxx"      ## Modify the corresponding MySQL connection address
          user = "ste3"                 ## Modify the MySQL username
          pass = "Test1234"             ## Modify the MySQL password
          ......

    redis.conf: |-
        [[inputs.redis]]
          host = "r-xxxxxxxxx.redis.rds.ops.ste3.com"            ## Modify the Redis connection address
          port = 6379                                                   
          # unix_socket_path = "/var/run/redis/redis.sock"
          # Configure multiple dbs. If dbs is configured, db will also be included in the collection list. If dbs=[] or not configured, all non-empty dbs in Redis will be collected.
          # dbs=[]
          # username = "<USERNAME>"
          password = "Test1234"                                        ## Modify the Redis password
          ......

    openes.conf: |-
        [[inputs.elasticsearch]]
          ## Elasticsearch server configuration
          # Supports Basic authentication:
          # servers = ["http://user:pass@localhost:9200"]
          servers = ["http://guance:123.com@opensearch-cluster-client.middleware:9200"]   ## Modify the username, password, etc.
          ......

4) Mount the configuration files

        - mountPath: /usr/local/datakit/conf.d/db/mysql.conf
          name: datakit-conf
          subPath: mysql.conf
          readOnly: false

Note: Additional collector configurations are mounted the same way; add one volumeMount entry per file. The matching volume declaration is shown below.
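For the mount to resolve, the same ConfigMap must be declared under volumes in the DaemonSet pod spec. A minimal sketch of how the two pieces fit together (if datakit.yaml already declares a datakit-conf volume, only the volumeMount needs to be added):

        volumeMounts:
        - mountPath: /usr/local/datakit/conf.d/db/mysql.conf
          name: datakit-conf
          subPath: mysql.conf
      volumes:
      - name: datakit-conf
        configMap:
          name: datakit-conf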

5) Deploy DataKit after completing the modifications

kubectl apply -f datakit.yaml
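To confirm the rollout, check that the DaemonSet pods are running and inspect collector status with DataKit's built-in monitor command (the pod name is a placeholder):

kubectl get pods -n datakit -o wide
kubectl exec -it <datakit-pod-name> -n datakit -- datakit monitor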

Environment Variable Specification

| Variable Name | Description |
| --- | --- |
| ENV_DEFAULT_ENABLED_INPUTS | Default inputs to enable: self,cpu,disk,diskio,mem,swap,system,hostobject,net,host_processes,container,zipkin |
| ENV_ENABLE_ELECTION | Enables election; election-aware inputs such as Prometheus collection run in master or candidate mode |
| ENV_GLOBAL_ELECTION_TAGS | Adds extra tag dimensions to election-based inputs, e.g. for tagging Prometheus-collected data (effective only when election is enabled) |
| ENV_INPUT_DDTRACE_COMPATIBLE_OTEL | Enables compatibility between OpenTelemetry traces and DDTrace traces |
| ENV_INPUT_DISK_USE_NSENTER | Collects disk usage via nsenter so that dynamically provisioned storage volumes are included; required when using dynamic storage |
| ENV_INPUT_HOSTOBJECT_USE_NSENTER | Same as above, applied to the hostobject input |
| ENV_INPUT_CONTAINER_ENABLE_CONTAINER_METRIC | Enables container metric collection |
| ENV_INPUT_CONTAINER_ENABLE_POD_METRIC | Enables Pod metric collection (CPU and memory usage) |
| ENV_INPUT_CONTAINER_ENABLE_K8S_METRIC | Enables Kubernetes object metric collection |
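These variables are set in the env section of the DataKit DaemonSet in datakit.yaml, alongside ENV_DATAWAY shown earlier. A short sketch (values are illustrative):

   - name: ENV_DEFAULT_ENABLED_INPUTS
     value: self,cpu,disk,diskio,mem,swap,system,hostobject,net,host_processes,container,zipkin
   - name: ENV_INPUT_CONTAINER_ENABLE_POD_METRIC
     value: "true"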

Import Views, Monitor Templates, and Pipelines

Note

After importing the monitor template, manually set the alert strategy and notification targets.

Import Views and Monitor Templates

Download View and Monitor Template

**Manage** > **Workspace Settings** > **Import**


Note

After importing, update the jump-link configuration in Monitoring: in each URL, replace dsbd_xxxx with the ID of the corresponding dashboard and wksp_xxxx with the ID of the workspace being monitored.

Import Pipelines

Unzip guance-self-observing-latest.zip; the Pipeline files are under guance-self-observing-latest/pipeline.

**Manage** > **Pipelines** > **Import**

Application Service Observability

Configure Prometheus Collection for Services

Download Prometheus Configuration File

Unzip guance-self-observing-prom-latest.zip and execute the following commands:

cd guance-self-observing-prom-latest
kubectl patch deploy kodo-x -n forethought-kodo --type merge --patch "$(cat kodo-x-prom.yaml)"
kubectl patch deploy kodo -n forethought-kodo --type merge --patch "$(cat kodo-prom.yaml)"
kubectl patch sts kodo-servicemap -n forethought-kodo --type merge --patch "$(cat kodo-servicemap-prom.yaml)"
kubectl patch sts kodo-x-backuplog -n forethought-kodo --type merge --patch "$(cat kodo-x-backuplog-prom.yaml)"
kubectl patch deploy inner -n forethought-core --type merge --patch "$(cat core-inner-prom.yaml)"
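Each patch file overlays Prometheus-related settings onto the workload's pod template. As a purely hypothetical illustration of the shape of such a merge patch (the actual files in the archive may differ; the annotations and port below are assumptions, not the shipped contents):

spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"   # assumption: annotation-based discovery is enabled
        prometheus.io/port: "9527"     # assumption: the service's metrics port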

Configure APM

Inject forethought-core Configuration

#!/bin/bash
set -euo pipefail

# Namespace
NAMESPACE="${NAMESPACE:-forethought-core}"

# —— Per-deployment env settings are defined inline here (no external file) ——
# One entry per deployment: "<deploy> KEY=VAL KEY=VAL ..."
DEPLOY_ENV_CONFIG=(

  'front-backend DD_PATCH_MODULES=redis:true,urllib3:true,httplib:true,sqlalchemy:true,httpx:true DD_AGENT_PORT=9529 DD_GEVENT_PATCH_ALL=true DD_SERVICE=front-backend DD_TAGS=pod_name:$(POD_NAME),project:dataflux'
  'inner DD_PATCH_MODULES=redis:true,urllib3:true,httplib:true,sqlalchemy:true,httpx:true DD_AGENT_PORT=9529 DD_GEVENT_PATCH_ALL=true DD_SERVICE=inner DD_TAGS=pod_name:$(POD_NAME),project:dataflux'
  'management-backend DD_PATCH_MODULES=redis:true,urllib3:true,httplib:true,sqlalchemy:true,httpx:true DD_AGENT_PORT=9529 DD_GEVENT_PATCH_ALL=true DD_SERVICE=management-backend DD_TAGS=pod_name:$(POD_NAME),project:dataflux'
  'open-api DD_PATCH_MODULES=redis:true,urllib3:true,httplib:true,sqlalchemy:true,httpx:true DD_AGENT_PORT=9529 DD_GEVENT_PATCH_ALL=true DD_SERVICE=open-api DD_TAGS=pod_name:$(POD_NAME),project:dataflux'
  'sse DD_PATCH_MODULES=redis:true,urllib3:true,httplib:true,sqlalchemy:true,httpx:true DD_AGENT_PORT=9529 DD_GEVENT_PATCH_ALL=true DD_SERVICE=sse DD_TAGS=pod_name:$(POD_NAME),project:dataflux'
  'core-worker DD_TRACE_ENABLED=false'
  'core-worker-0 DD_TRACE_ENABLED=false'
  'core-worker-beat DD_TRACE_ENABLED=false'
  'core-worker-correlation DD_TRACE_ENABLED=false'
)

# —— Only prefix ddtrace-run to args[0] (do not modify command)——
prefix_ddtrace_run_args_only() { # $1 deploy
  local d="$1"

  # If tracing is explicitly disabled, do not add
  local trace_enabled
  trace_enabled="$(kubectl get deploy "$d" -n "$NAMESPACE" \
     -o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="DD_TRACE_ENABLED")].value}' 2>/dev/null || true)"
  if [[ "$trace_enabled" == "false" ]]; then
    echo "  • DD_TRACE_ENABLED=false,skip ddtrace-run."
    return 0
  fi

  # Read args[0]
  local first_arg
  first_arg="$(kubectl get deploy "$d" -n "$NAMESPACE" \
     -o jsonpath='{.spec.template.spec.containers[0].args[0]}' 2>/dev/null || true)"

  # Skip if already starts with ddtrace-run
  if [[ "$first_arg" == "ddtrace-run" ]]; then
    echo "  • args already starts with ddtrace-run, skip."
    return 0
  fi

  # Check if there is already an args array
  local has_args
  has_args="$(kubectl get deploy "$d" -n "$NAMESPACE" \
     -o jsonpath='{.spec.template.spec.containers[0].args}' 2>/dev/null || true)"

  if [[ -n "$has_args" ]]; then
    # Insert ddtrace-run at the beginning of existing args
    kubectl patch deploy "$d" -n "$NAMESPACE" --type='json' -p='[
      {"op":"add","path":"/spec/template/spec/containers/0/args/0","value":"ddtrace-run"}
    ]' >/dev/null
    echo "  • Inserted ddtrace-run at args[0]"
  else
    # If no args, create args and set ddtrace-run as the first element
    kubectl patch deploy "$d" -n "$NAMESPACE" --type='json' -p='[
      {"op":"add","path":"/spec/template/spec/containers/0/args","value":["ddtrace-run"]}
    ]' >/dev/null
    echo "  • No args: created args=[\"ddtrace-run\"]"
  fi
}


# —— Utility functions —— #
has_env() { # $1 deploy  $2 KEY
  kubectl get deploy "$1" -n "$NAMESPACE" \
    -o jsonpath="{.spec.template.spec.containers[0].env[?(@.name=='$2')].name}" 2>/dev/null | grep -qx "$2"
}
ensure_env_array() { # $1 deploy
  local has_array
  has_array="$(kubectl get deploy "$1" -n "$NAMESPACE" -o jsonpath="{.spec.template.spec.containers[0].env}" 2>/dev/null || true)"
  if [[ -z "${has_array}" ]]; then
    kubectl patch deploy "$1" -n "$NAMESPACE" --type='json' -p="[
      {\"op\":\"add\",\"path\":\"/spec/template/spec/containers/0/env\",\"value\":[]}
    ]" >/dev/null
  fi
}

for item in "${DEPLOY_ENV_CONFIG[@]}"; do
  deploy="${item%% *}"
  # If the line only contains the deployment name, skip
  rest="${item#* }"; [[ "$rest" == "$deploy" ]] && rest=""

  echo "→ Processing: $deploy"

  # Check if it exists
  if ! kubectl get deploy "$deploy" -n "$NAMESPACE" >/dev/null 2>&1; then
    echo "  - Not found, skip."
    continue
  fi

  # Ensure there is an env array (otherwise /env/- append will fail)
  ensure_env_array "$deploy"

  # Append Downward API env vars (add if missing): DD_AGENT_HOST=status.hostIP, POD_NAME=metadata.name
  if ! has_env "$deploy" "DD_AGENT_HOST"; then
    kubectl patch deploy "$deploy" -n "$NAMESPACE" --type='json' -p='[
      {"op":"add","path":"/spec/template/spec/containers/0/env/-",
       "value":{"name":"DD_AGENT_HOST","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"status.hostIP"}}}}
    ]' >/dev/null
    echo "  • add DD_AGENT_HOST (status.hostIP)"
  else
    echo "  • DD_AGENT_HOST exists, skip."
  fi

  if ! has_env "$deploy" "POD_NAME"; then
    kubectl patch deploy "$deploy" -n "$NAMESPACE" --type='json' -p='[
      {"op":"add","path":"/spec/template/spec/containers/0/env/-",
       "value":{"name":"POD_NAME","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.name"}}}}
    ]' >/dev/null
    echo "  • add POD_NAME (metadata.name)"
  else
    echo "  • POD_NAME exists, skip."
  fi

  # Static KEY=VAL (add if missing; skip if exists)
  for kv in $rest; do
    key="${kv%%=*}"
    val="${kv#*=}"
    if has_env "$deploy" "$key"; then
      echo "  • $key exists, skip."
    else
      kubectl set env deploy/"$deploy" -n "$NAMESPACE" "$key=$val" >/dev/null
      echo "  • add $key=$val"
    fi
  done
  # Ensure args starts with ddtrace-run (command is left untouched)
  prefix_ddtrace_run_args_only "$deploy"
  echo "  -> Done: $deploy"
done
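Save the script (for example as inject-core-apm.sh, a name chosen here for illustration) and run it against the target cluster; afterwards, spot-check one deployment's injected variables:

NAMESPACE=forethought-core bash inject-core-apm.sh
kubectl get deploy inner -n forethought-core \
  -o jsonpath='{.spec.template.spec.containers[0].env[*].name}'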

Inject forethought-kodo Configuration

#!/bin/bash
set -euo pipefail

# Namespace
NAMESPACE="${NAMESPACE:-forethought-kodo}"

# —— Per-deployment env settings are defined inline here (no external file) ——
# One entry per deployment: "<deploy> KEY=VAL KEY=VAL ..."
DEPLOY_ENV_CONFIG=(
  'kodo DD_TRACE_ENABLED=true DD_TRACE_AGENT_PORT=9529 DD_TRACE_SAMPLE_RATE=0 DD_SERVICE=kodo DD_TAGS=pod_name:$(POD_NAME),project:dataflux'
  'kodo-inner DD_TRACE_ENABLED=true DD_TRACE_AGENT_PORT=9529 DD_SERVICE=kodo-inner DD_TAGS=pod_name:$(POD_NAME),project:dataflux'

)

# —— Utility functions —— #
has_env() { # $1 deploy  $2 KEY
  kubectl get deploy "$1" -n "$NAMESPACE" \
    -o jsonpath="{.spec.template.spec.containers[0].env[?(@.name=='$2')].name}" 2>/dev/null | grep -qx "$2"
}
ensure_env_array() { # $1 deploy
  local has_array
  has_array="$(kubectl get deploy "$1" -n "$NAMESPACE" -o jsonpath="{.spec.template.spec.containers[0].env}" 2>/dev/null || true)"
  if [[ -z "${has_array}" ]]; then
    kubectl patch deploy "$1" -n "$NAMESPACE" --type='json' -p="[
      {\"op\":\"add\",\"path\":\"/spec/template/spec/containers/0/env\",\"value\":[]}
    ]" >/dev/null
  fi
}

for item in "${DEPLOY_ENV_CONFIG[@]}"; do
  deploy="${item%% *}"
  # If the line only contains the deployment name, skip
  rest="${item#* }"; [[ "$rest" == "$deploy" ]] && rest=""

  echo "→ Processing: $deploy"

  # Check if it exists
  if ! kubectl get deploy "$deploy" -n "$NAMESPACE" >/dev/null 2>&1; then
    echo "  - Not found, skip."
    continue
  fi

  # Ensure there is an env array (otherwise /env/- append will fail)
  ensure_env_array "$deploy"

  # Append Downward API env vars (add if missing): DD_AGENT_HOST=status.hostIP, POD_NAME=metadata.name
  if ! has_env "$deploy" "DD_AGENT_HOST"; then
    kubectl patch deploy "$deploy" -n "$NAMESPACE" --type='json' -p='[
      {"op":"add","path":"/spec/template/spec/containers/0/env/-",
       "value":{"name":"DD_AGENT_HOST","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"status.hostIP"}}}}
    ]' >/dev/null
    echo "  • add DD_AGENT_HOST (status.hostIP)"
  else
    echo "  • DD_AGENT_HOST exists, skip."
  fi

  if ! has_env "$deploy" "POD_NAME"; then
    kubectl patch deploy "$deploy" -n "$NAMESPACE" --type='json' -p='[
      {"op":"add","path":"/spec/template/spec/containers/0/env/-",
       "value":{"name":"POD_NAME","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.name"}}}}
    ]' >/dev/null
    echo "  • add POD_NAME (metadata.name)"
  else
    echo "  • POD_NAME exists, skip."
  fi

  # Static KEY=VAL (add if missing; skip if exists)
  for kv in $rest; do
    key="${kv%%=*}"
    val="${kv#*=}"
    if has_env "$deploy" "$key"; then
      echo "  • $key exists, skip."
    else
      kubectl set env deploy/"$deploy" -n "$NAMESPACE" "$key=$val" >/dev/null
      echo "  • add $key=$val"
    fi
  done

  echo "  -> Done: $deploy"
done
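As with the forethought-core script, you can verify the result, for example:

kubectl get deploy kodo -n forethought-kodo \
  -o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="DD_SERVICE")].value}'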

Configure Synthetic Tests

1) Create a new website to monitor


2) Configure Synthetic Testing tasks


Note

Adjust the test URLs to match your actual domain names.

| Name | Test URL | Type | Task Status |
| --- | --- | --- | --- |
| xx-dataflux-api | https://xx-console-api.guance.com | HTTP | Started |
| xx-open-api | https://xx-open-api.guance.com | HTTP | Started |
| xx-dataway | https://xx-dataway.guance.com | HTTP | Started |
