The OpenTelemetry Protocol, defined under the CNCF (Cloud Native Computing Foundation), represents the latest generation of observability specifications (still in incubation). The specification defines the three pillars of observability: metrics, traces, and logs. However, merely collecting these three kinds of data without correlating them does not distinguish modern observability from traditional monitoring tools (APM, logs, Zabbix, etc.); the result would be just a collection of monitoring tools. This is where an important concept comes in: the tag (TAG). For example, a traceID that connects the front end and back end can be considered a tag, as can the host that provides the initial correlation between metrics, traces, and logs. Other examples include project, environment, and version number, each of which is a tag.
In short, tags make data correlation possible and enable more customized observability, which is why they are crucial. In Guance's current architecture, all observable items support tag settings, with theoretically no upper limit on the number of tags.
Example: A common real-life scenario is job hunting or HR recruitment, where specific requirements such as programming skills, computer knowledge, a bachelor's degree, and years of experience act like tags. Only candidates who match these tags qualify for the position. Similarly, in IT systems, if a server runs a specific application, database, and NGINX in a certain environment and has a responsible person, attaching enough tags makes it possible to quickly identify problematic servers, affected services, and responsible parties, which speeds up problem resolution.
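The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Guance or DataKit code; the `Record` class, the `select` helper, and all hosts, owners, and messages are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One observability data point: a metric, a trace span, or a log line."""
    kind: str       # "metric", "trace", or "log"
    message: str
    tags: dict = field(default_factory=dict)


def select(records, **wanted):
    """Return the records whose tags contain every wanted key/value pair."""
    return [r for r in records
            if all(r.tags.get(k) == v for k, v in wanted.items())]


# Hypothetical sample data: three data types share the same host/env tags.
records = [
    Record("metric", "cpu_usage=93%", {"host": "web-01", "env": "prod", "owner": "alice"}),
    Record("log", "upstream timed out", {"host": "web-01", "env": "prod", "service": "nginx"}),
    Record("trace", "GET /order took 4.2s", {"host": "web-01", "env": "prod", "service": "order"}),
    Record("log", "started", {"host": "web-02", "env": "test", "service": "nginx"}),
]

# Everything observed on the problematic host, regardless of data type.
on_host = select(records, host="web-01", env="prod")
print(sorted({r.kind for r in on_host}))
```

Because all three data types carry the same `host` and `env` tags, a single query returns the metric, the log line, and the trace together.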
This article will explore the extensibility and flexibility of tags through four examples using Guance.
Companies often have multiple project teams or business units. Each team or unit may use its own infrastructure for business development. If observability is implemented using Guance from infrastructure to applications, how can resources be distinguished beyond workspace separation?
Guance anticipated this scenario in its design. The default DataKit main configuration file includes a global_tag setting, which defines tags at the infrastructure level. All components on this infrastructure, such as applications and databases, inherit these tags by default.
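The inheritance described here can be pictured as a simple dictionary merge. The sketch below is not DataKit's implementation: the `effective_tags` helper, the tag values, and the precedence rule (a tag set on an individual input overrides a global tag with the same key) are all assumptions made for illustration; check the DataKit documentation for the actual behavior.

```python
def effective_tags(global_tags, input_tags):
    """Merge infrastructure-level tags into one input's own tags.

    Assumption: a tag set on the input wins over a global tag that uses
    the same key. The real precedence in DataKit may differ.
    """
    merged = dict(global_tags)   # copy, so the global tags stay untouched
    merged.update(input_tags)
    return merged


# Hypothetical infrastructure-level tags.
global_tags = {"project": "shop", "env": "prod"}

nginx_tags = effective_tags(global_tags, {"service": "nginx"})
mysql_tags = effective_tags(global_tags, {"service": "mysql", "env": "staging"})

print(nginx_tags)
print(mysql_tags)
```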
By default, DataKit collects the hostname at the host level and uses it as a global tag to correlate all metrics, traces, logs, and objects. However, in many enterprise environments, hostnames are random strings with no practical meaning. Changing the hostname at the operating-system level might break connections to applications or databases, so companies are hesitant to do so. DataKit's built-in ENV_HOSTNAME setting handles this situation without that risk: it changes the hostname that DataKit reports, not the hostname of the machine itself.
Warning
Note: After applying this method, data will be uploaded under the new hostname, and data under the old hostname will no longer be updated. Recommendation: If you need to change the hostname, it is best to do so during the initial installation of DataKit.
1 Modify datakit-inputs to Configure [environments]
Internal company NGINX servers typically handle domain forwarding or service forwarding. They may forward frontend requests to multiple backend subdomains or different ports, or directly serve multiple domains. Unified NGINX monitoring cannot meet these needs. How does Guance address this issue?
Scenario: NGINX exposes ports 18889 and 80, forwarding to internal server 118.178.57.79 on ports 8999 and 18999 respectively.
Requirement: Collect separate statistics for NGINX ports 18889 and 80, such as PV, UV, and error counts.
Prerequisite: Access logs for NGINX ports 80 and 18889 are configured in separate directories (or different log file names).
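To make the requirement concrete, the following Python sketch computes PV, UV, and error counts per port from access-log lines that already carry a port tag. It is not how Guance computes these figures. It assumes NGINX's default "combined" log format, counts UV as distinct client IPs, and counts any status of 400 or above as an error; the `summarize` helper and the sample lines are hypothetical.

```python
import re
from collections import defaultdict

# Matches the beginning of NGINX's default "combined" access-log format.
LINE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) ')


def summarize(tagged_lines):
    """tagged_lines: iterable of (port_tag, raw_log_line) pairs."""
    stats = defaultdict(lambda: {"pv": 0, "ips": set(), "errors": 0})
    for port, line in tagged_lines:
        m = LINE.match(line)
        if not m:
            continue  # skip lines that are not in the expected format
        s = stats[port]
        s["pv"] += 1                      # every parsed request is one page view
        s["ips"].add(m["ip"])             # assumption: UV = distinct client IPs
        if int(m["status"]) >= 400:       # assumption: 4xx and 5xx are errors
            s["errors"] += 1
    return {p: {"pv": s["pv"], "uv": len(s["ips"]), "errors": s["errors"]}
            for p, s in stats.items()}


# Hypothetical log lines, each paired with the port tag of its log file.
sample = [
    ("80", '10.0.0.1 - - [01/Jan/2024:10:00:00 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"'),
    ("80", '10.0.0.1 - - [01/Jan/2024:10:00:01 +0800] "GET /a HTTP/1.1" 404 153 "-" "curl/8.0"'),
    ("80", '10.0.0.2 - - [01/Jan/2024:10:00:02 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"'),
    ("18889", '10.0.0.3 - - [01/Jan/2024:10:00:03 +0800] "POST /api HTTP/1.1" 502 157 "-" "curl/8.0"'),
]

result = summarize(sample)
print(result)
```

The port tag is what keeps the two log streams apart; without it, all four lines would collapse into one set of NGINX statistics.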
Refer to the integration documentation <Nginx> for detailed configuration.
Enable the nginx.conf performance metrics module
Check if the http_stub_status_module is enabled in nginx.
(This example already has it enabled.)
Add nginx_status location in nginx.conf
$ cd /etc/nginx    # adjust the nginx path as needed
$ vim nginx.conf

server {
    listen 80;    # port can be customized
    server_name localhost;
    location /nginx_status {
        stub_status on;
        allow 127.0.0.1;
        deny all;
    }
}
Execute nginx -s reload to reload nginx
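After the reload, requesting /nginx_status from the server itself (for example with curl http://127.0.0.1/nginx_status) should return a short plain-text page. The Python sketch below shows what that page contains by parsing a sample of it; the numbers in `SAMPLE` are made up, and `parse_stub_status` is a hypothetical helper, not part of DataKit.

```python
import re

# Example of the text returned by the stub_status module (values are made up).
SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Turn the plain-text stub_status page into a dict of integers."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    # The totals line consists of exactly three numbers.
    totals = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    states = re.search(r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text)
    if not (active and totals and states):
        raise ValueError("input does not look like a stub_status page")
    accepts, handled, requests = map(int, totals.groups())
    reading, writing, waiting = map(int, states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


status = parse_stub_status(SAMPLE)
print(status)
```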
Enable nginx.inputs in DataKit
$ cd /usr/local/datakit/conf.d/nginx/
$ cp nginx.conf.sample nginx.conf
$ vim nginx.conf
Similarly, different tags can be used to distinguish different projects, owners, business modules, environments, and so on. What tags can do is limited mainly by how you choose to apply them.
Experiment Four: Confirm Specific Service Owner via Tag for Alert Notifications
As businesses grow, microservices and containers become widely used, which increases both the number of service components and the number of development and operations staff responsible for them. With this finer division of labor, the best alerting practice is to notify the responsible person directly when a business or IT system fails, so that alerts are closed faster. This can be done by sending alerts only to the relevant individuals or by assigning tickets in Jira. How does Guance handle this? In Guance, simply add a tag to the relevant input (unlimited tags are supported), for example a custom tag owner = "xxx" in nginx-inputs, then set owner as a variable in anomaly detection. Anomaly detection automatically recognizes this field and sends notifications to DingTalk or WeCom groups, as shown below:
For example, add the following in the custom NGINX logs: