Nginx
The NGINX collector gathers metrics from NGINX instances, such as total requests, connections, and cache statistics, and reports them to Guance so that you can monitor NGINX and analyze abnormal behavior.
Config¶
Requirements¶
- NGINX version >= `1.8.0`. Tested versions:
    - 1.23.2
    - 1.22.1
    - 1.21.6
    - 1.18.0
    - 1.14.2
    - 1.8.0
- By default, the collector reads data from `http_stub_status_module`. For how to enable `http_stub_status_module`, see here. Once the module is enabled, the collector reports the NGINX measurements.
- If you are using VTS or want to monitor more data, turn on VTS-related data collection by setting the option `use_vts` to `true` in the collector's `nginx.conf`. For how to start VTS, see here.
- After the VTS function is turned on, the following measurements can be generated:
    - `nginx`
    - `nginx_server_zone`
    - `nginx_upstream_zone` (NGINX needs `upstream`-related configuration)
    - `nginx_cache_zone` (NGINX needs `cache`-related configuration)
- Take generating the `nginx_upstream_zone` measurement as an example. The related NGINX configuration looks like this:
...
http {
    ...
    upstream your-upstream-name {
        server upstream-ip:upstream-port;
    }
    server {
        ...
        location / {
            root html;
            index index.html index.htm;
            # Must match the upstream name defined above
            proxy_pass http://your-upstream-name;
        }
    }
}
- After the VTS function has been turned on, it is no longer necessary to collect data from `http_stub_status_module`, because the VTS module data already includes it.
- NGINX Plus users can still use `http_stub_status_module` to collect basic data. Additionally, enable `http_api_module` in the NGINX configuration file (Reference) and set `status_zone` in the server blocks you want to monitor. The configuration example is as follows:
# enable http_api_module
server {
    listen 8080;
    location /api {
        api write=on;
    }
}
# monitor more detailed metrics
server {
    listen 80;
    status_zone <ZONE_NAME>;
    ...
}
- To enable NGINX Plus collection, set the option `use_plus_api` to `true` in the collector's `nginx.conf` file and uncomment the `plus_api_url` option. (Note: VTS does not support NGINX Plus.)
- NGINX Plus can generate the following measurements:
    - `nginx_location_zone`
Configuration¶
Go to the conf.d/samples directory under the DataKit installation directory, copy nginx.conf.sample, and name the copy nginx.conf. An example follows:
[[inputs.nginx]]
  ## Nginx status URL.
  ## (Default) Without VTS, the URL looks like this: "http://localhost:80/basic_status".
  ## With VTS, the URL looks like this: "http://localhost:80/status/format/json".
  url = "http://localhost:80/basic_status"

  ## With Nginx Plus, the URL looks like this: "http://localhost:8080/api/<api_version>".
  ## Note: Nginx Plus does not support VTS and should be used with http_stub_status_module (default).
  # plus_api_url = "http://localhost:8080/api/9"

  ## Optional. Set ports as [<from>,<to>]; DataKit will collect from all ports in that range.
  # ports = [80,80]

  ## Optional collection interval, default is 10s
  # interval = "30s"

  use_vts = false
  use_plus_api = false

  ## Optional TLS config
  # tls_ca = "/xxx/ca.pem"
  # tls_cert = "/xxx/cert.cer"
  # tls_key = "/xxx/key.key"
  ## Use TLS but skip chain & host verification
  insecure_skip_verify = false

  ## HTTP response timeout (default: 5s)
  response_timeout = "20s"

  ## Set true to enable election
  election = true

  # [inputs.nginx.log]
  #   files = ["/var/log/nginx/access.log","/var/log/nginx/error.log"]
  #   ## grok pipeline script path
  #   pipeline = "nginx.p"

  # [inputs.nginx.tags]
  #   some_tag = "some_value"
  #   more_tag = "some_other_value"
After configuration, restart DataKit.
In Kubernetes, the collector can be enabled by injecting its configuration through a ConfigMap.
Note: The `url` value depends on your NGINX configuration. A common choice is the `/basic_status` route.
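As a rough illustration of the URL patterns listed in the sample comments, the following sketch builds the status URL for each mode. The helper names `status_url`, `plus_api_url`, and `check_modes` are hypothetical and are not part of DataKit. The mode check only mirrors the note that VTS does not support NGINX Plus.

```python
def status_url(host: str = "localhost", port: int = 80, use_vts: bool = False) -> str:
    """Build the status URL following the patterns in the sample config comments."""
    # Without VTS the sample uses /basic_status; with VTS it uses /status/format/json.
    path = "/status/format/json" if use_vts else "/basic_status"
    return f"http://{host}:{port}{path}"


def plus_api_url(api_version: int, host: str = "localhost", port: int = 8080) -> str:
    """Build the NGINX Plus API URL: http://<host>:<port>/api/<api_version>."""
    return f"http://{host}:{port}/api/{api_version}"


def check_modes(use_vts: bool, use_plus_api: bool) -> None:
    """Hypothetical sanity check: the document states VTS does not support NGINX Plus."""
    if use_vts and use_plus_api:
        raise ValueError("use_vts and use_plus_api should not both be true")


print(status_url())              # http://localhost:80/basic_status
print(status_url(use_vts=True))  # http://localhost:80/status/format/json
print(plus_api_url(9))           # http://localhost:8080/api/9
```

Whatever URL you choose still has to match a `location` that actually exists in your NGINX configuration.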
Metric¶
For all of the following data collections, the global election tags are added automatically. You can add extra tags in [inputs.nginx.tags] if needed:
nginx¶
| Tags & Fields | Description | Type | Unit |
|---|---|---|---|
| host (tag) | Host name of the machine where NGINX is installed | | |
| nginx_port (tag) | NGINX server port | | |
| nginx_server (tag) | NGINX server host | | |
| nginx_version (tag) | NGINX version; exists only when using VTS | | |
| connection_accepts | The total number of accepted client connections | int (count) | count |
| connection_active | The current number of active client connections | int (count) | count |
| connection_dropped | The total number of dropped client connections | int (count) | count |
| connection_handled | The total number of handled client connections | int (count) | count |
| connection_reading | The current number of connections where NGINX is reading the request header | int (count) | count |
| connection_requests | The total number of client requests | int (count) | count |
| connection_waiting | The current number of idle client connections waiting for a request | int (count) | count |
| connection_writing | The current number of connections where NGINX is writing the response back to the client | int (count) | count |
| load_timestamp | NGINX process load time in milliseconds; exists only when using VTS | int (gauge) | timeStamp,msec |
| pid | The PID of the NGINX process (NGINX Plus only) | int (count) | count |
| ppid | The PPID of the NGINX process (NGINX Plus only) | int (count) | count |
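To show where the `connection_*` fields come from, here is a minimal sketch that parses the plain-text page served by `http_stub_status_module`. The sample numbers are illustrative, the function name `parse_stub_status` is hypothetical, and deriving `connection_dropped` as accepts minus handled is an assumption rather than a confirmed description of the collector's logic.

```python
import re

# Illustrative stub_status output; the numbers are made up.
SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text: str) -> dict:
    """Map the stub_status text page to the connection_* field names."""
    active = int(re.search(r"Active connections:\s*(\d+)", text).group(1))
    # The line with three integers holds accepts, handled, and requests.
    accepts, handled, requests = map(
        int, re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.M).groups()
    )
    reading, writing, waiting = map(
        int,
        re.search(
            r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
        ).groups(),
    )
    return {
        "connection_active": active,
        "connection_accepts": accepts,
        "connection_handled": handled,
        "connection_requests": requests,
        # Assumption: dropped connections are accepted but never handled.
        "connection_dropped": accepts - handled,
        "connection_reading": reading,
        "connection_writing": writing,
        "connection_waiting": waiting,
    }


fields = parse_stub_status(SAMPLE)
print(fields["connection_active"])    # 291
print(fields["connection_requests"])  # 31070465
```

The accepts, handled, and requests values are cumulative counters, while the active, reading, writing, and waiting values are point-in-time readings.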
nginx_server_zone¶
| Tags & Fields | Description | Type | Unit |
|---|---|---|---|
| host (tag) | Host name of the machine where NGINX is installed | | |
| nginx_port (tag) | NGINX server port | | |
| nginx_server (tag) | NGINX server host | | |
| nginx_version (tag) | NGINX version | | |
| server_zone (tag) | Server zone | | |
| code_200 | The number of responses with status code 200 (NGINX Plus only) | int (count) | count |
| code_301 | The number of responses with status code 301 (NGINX Plus only) | int (count) | count |
| code_404 | The number of responses with status code 404 (NGINX Plus only) | int (count) | count |
| code_503 | The number of responses with status code 503 (NGINX Plus only) | int (count) | count |
| discarded | The number of discarded requests (NGINX Plus only) | int (count) | count |
| processing | The number of requests currently being processed (NGINX Plus only) | int (count) | count |
| received | The total amount of data received from clients | int (gauge) | digital,B |
| requests | The total number of requests received from clients | int (count) | count |
| response_1xx | The number of responses with status codes 1xx | int (count) | count |
| response_2xx | The number of responses with status codes 2xx | int (count) | count |
| response_3xx | The number of responses with status codes 3xx | int (count) | count |
| response_4xx | The number of responses with status codes 4xx | int (count) | count |
| response_5xx | The number of responses with status codes 5xx | int (count) | count |
| responses | The total number of responses (NGINX Plus only) | int (count) | count |
| send | The total amount of data sent to clients | int (gauge) | digital,B |
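The following sketch shows how a VTS JSON status document could be turned into `nginx_server_zone` points. The sample payload is hand-written and trimmed, the function name `server_zone_points` is hypothetical, and the mapping from VTS keys (`requestCounter`, `inBytes`, `outBytes`, `responses`) to the field names in the table is an assumption, not the collector's confirmed implementation.

```python
import json

# Hand-written, trimmed example of a VTS /status/format/json payload.
SAMPLE_VTS = json.loads("""
{
  "serverZones": {
    "example.org": {
      "requestCounter": 120,
      "inBytes": 34000,
      "outBytes": 560000,
      "responses": {"1xx": 0, "2xx": 100, "3xx": 10, "4xx": 8, "5xx": 2}
    }
  }
}
""")


def server_zone_points(status: dict) -> list:
    """Build one point per server zone (assumed mapping to the table's field names)."""
    points = []
    for zone, data in status.get("serverZones", {}).items():
        fields = {
            "requests": data["requestCounter"],
            "received": data["inBytes"],
            "send": data["outBytes"],
        }
        # Copy the per-class response counters into response_1xx .. response_5xx.
        for status_class in ("1xx", "2xx", "3xx", "4xx", "5xx"):
            fields[f"response_{status_class}"] = data["responses"][status_class]
        points.append({"tags": {"server_zone": zone}, "fields": fields})
    return points


for point in server_zone_points(SAMPLE_VTS):
    print(point["tags"]["server_zone"], point["fields"]["response_2xx"])  # example.org 100
```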
nginx_upstream_zone¶
| Tags & Fields | Description | Type | Unit |
|---|---|---|---|
| host (tag) | Host name of the machine where NGINX is installed | | |
| nginx_port (tag) | NGINX server port | | |
| nginx_server (tag) | NGINX server host | | |
| nginx_version (tag) | NGINX version | | |
| upstream_server (tag) | Upstream server | | |
| upstream_zone (tag) | Upstream zone | | |
| active | The number of active connections (NGINX Plus only) | int (count) | count |
| backup | Whether the server is configured as a backup server (NGINX Plus only) | int (count) | count |
| fails | The number of failed requests (NGINX Plus only) | int (count) | count |
| received | The total number of bytes received from this server | int (gauge) | digital,B |
| request_count | The total number of client requests forwarded to this server | int (count) | count |
| response_1xx | The number of responses with status codes 1xx | int (count) | count |
| response_2xx | The number of responses with status codes 2xx | int (count) | count |
| response_3xx | The number of responses with status codes 3xx | int (count) | count |
| response_4xx | The number of responses with status codes 4xx | int (count) | count |
| response_5xx | The number of responses with status codes 5xx | int (count) | count |
| send | The total number of bytes sent to this server | int (gauge) | digital,B |
| state | The current state of the server (NGINX Plus only) | int (count) | count |
| unavail | The number of times the server became unavailable (NGINX Plus only) | int (count) | count |
| weight | The weight used for load balancing (NGINX Plus only) | int (count) | count |
nginx_cache_zone¶
| Tags & Fields | Description | Type | Unit |
|---|---|---|---|
| cache_zone (tag) | Cache zone | | |
| host (tag) | Host name of the machine where NGINX is installed | | |
| nginx_port (tag) | NGINX server port | | |
| nginx_server (tag) | NGINX server host | | |
| nginx_version (tag) | NGINX version | | |
| max_size | The maximum cache size specified in the configuration | int (gauge) | digital,B |
| received | The total number of bytes received from the cache | int (gauge) | digital,B |
| responses_bypass | The number of cache bypasses | int (count) | count |
| responses_expired | The number of expired cache responses | int (count) | count |
| responses_hit | The number of cache hits | int (count) | count |
| responses_miss | The number of cache misses | int (count) | count |
| responses_revalidated | The number of revalidated cache responses | int (count) | count |
| responses_scarce | The number of scarce cache responses | int (count) | count |
| responses_stale | The number of stale cache responses | int (count) | count |
| responses_updating | The number of updating cache responses | int (count) | count |
| send | The total number of bytes sent from the cache | int (gauge) | digital,B |
| used_size | The current size of the cache | int (gauge) | digital,B |
nginx_location_zone¶
| Tags & Fields | Description | Type | Unit |
|---|---|---|---|
| host (tag) | Host name of the machine where NGINX is installed | | |
| location_zone (tag) | Location zone | | |
| nginx_port (tag) | NGINX server port | | |
| nginx_server (tag) | NGINX server host | | |
| nginx_version (tag) | NGINX version | | |
| code_200 | The number of responses with status code 200 (NGINX Plus only) | int (count) | count |
| code_301 | The number of responses with status code 301 (NGINX Plus only) | int (count) | count |
| code_404 | The number of responses with status code 404 (NGINX Plus only) | int (count) | count |
| code_503 | The number of responses with status code 503 (NGINX Plus only) | int (count) | count |
| discarded | The total number of discarded requests (NGINX Plus only) | int (gauge) | digital,B |
| received | The total number of bytes received (NGINX Plus only) | int (gauge) | digital,B |
| requests | The number of requests (NGINX Plus only) | int (gauge) | digital,B |
| response | The number of responses (NGINX Plus only) | int (gauge) | digital,B |
| response_1xx | The number of responses with status codes 1xx (NGINX Plus only) | int (count) | count |
| response_2xx | The number of responses with status codes 2xx (NGINX Plus only) | int (count) | count |
| response_3xx | The number of responses with status codes 3xx (NGINX Plus only) | int (count) | count |
| response_4xx | The number of responses with status codes 4xx (NGINX Plus only) | int (count) | count |
| response_5xx | The number of responses with status codes 5xx (NGINX Plus only) | int (count) | count |
| sent | The total number of bytes sent (NGINX Plus only) | int (count) | count |
collector¶
| Tags & Fields | Description | Type | Unit |
|---|---|---|---|
| instance (tag) | Server address of the instance | | |
| job (tag) | Server name of the instance | | |
| up | | int (gauge) | - |
Custom Object¶
web_server¶
| Tags & Fields | Description | Type | Unit |
|---|---|---|---|
| col_co_status (tag) | Current status of the collector on this instance (OK/NotOK) | | |
| host (tag) | The server host address | | |
| ip (tag) | Connection IP of the instance | | |
| name (tag) | Unique object ID | | |
| reason (tag) | If the status is not OK, the reason for the status | | |
| display_name | Name displayed in the UI | string (gauge) | N/A |
| uptime | Current instance uptime | int (gauge) | time,s |
| version | Current version of the instance | string (gauge) | N/A |
Log¶
To collect NGINX logs, uncomment `files` in the collector's nginx.conf and set it to the absolute paths of the NGINX log files. For example:
[[inputs.nginx]]
...
[inputs.nginx.log]
files = ["/var/log/nginx/access.log","/var/log/nginx/error.log"]
When log collection is turned on, logs with a log source of nginx are generated by default.
Note: DataKit must be installed on the NGINX host to collect NGINX logs.
Log Pipeline Parsed Fields¶
- NGINX error log parsing
Example error log text:
2021/04/21 09:24:04 [alert] 7#7: *168 write() to "/var/log/nginx/access.log" failed (28: No space left on device) while logging request, client: 120.204.196.129, server: localhost, request: "GET / HTTP/1.1", host: "47.98.103.73"
The parsed fields are as follows:
| Field Name | Field Value | Description |
|---|---|---|
| status | error | Log level (alert is changed to error) |
| client_ip | 120.204.196.129 | Client IP address |
| server | localhost | Server address |
| http_method | GET | HTTP request method |
| http_url | / | HTTP request URL |
| http_version | 1.1 | HTTP version |
| ip_or_host | 47.98.103.73 | Requester IP or host |
| msg | 7#7: *168 write()...host: \"47.98.103.73 | Log content |
| time | 1618968244000000000 | Nanosecond timestamp (used as the line protocol time) |
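The field extraction above can be approximated with a pair of regular expressions. This is only a sketch and not the actual Pipeline script: the function name `parse_error_log` is hypothetical, and the UTC+8 offset is an assumption inferred from the sample, since the error log timestamp carries no time zone.

```python
import re
from datetime import datetime, timedelta, timezone

LINE = (
    '2021/04/21 09:24:04 [alert] 7#7: *168 write() to "/var/log/nginx/access.log" failed '
    '(28: No space left on device) while logging request, client: 120.204.196.129, '
    'server: localhost, request: "GET / HTTP/1.1", host: "47.98.103.73"'
)

# Prefix: timestamp, level in brackets, then the rest of the message.
PREFIX = re.compile(
    r"^(?P<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?P<level>\w+)\] (?P<msg>.*)$"
)
# Optional request details that appear at the end of some error lines.
DETAIL = re.compile(
    r'client: (?P<client_ip>[^,]+), server: (?P<server>[^,]+), '
    r'request: "(?P<http_method>\S+) (?P<http_url>\S+) HTTP/(?P<http_version>[\d.]+)", '
    r'host: "(?P<ip_or_host>[^"]+)"'
)
# The tables show alert and emerg being reported as error.
LEVELS = {"alert": "error", "emerg": "error"}
# Assumption: the log was written in UTC+8.
UTC8 = timezone(timedelta(hours=8))


def parse_error_log(line: str, tz: timezone = UTC8):
    """Return the parsed fields, or None when the line does not match."""
    m = PREFIX.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m["time"], "%Y/%m/%d %H:%M:%S").replace(tzinfo=tz)
    fields = {
        "status": LEVELS.get(m["level"], m["level"]),
        "msg": m["msg"],
        "time": int(ts.timestamp()) * 1_000_000_000,  # nanoseconds
    }
    detail = DETAIL.search(m["msg"])
    if detail:
        fields.update(detail.groupdict())
    return fields


fields = parse_error_log(LINE)
print(fields["status"], fields["client_ip"], fields["time"])
# error 120.204.196.129 1618968244000000000
```

Lines without the trailing request details, such as configuration errors, still yield `status`, `msg`, and `time`.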
Example of error log text:
The parsed fields are as follows:
| Field Name | Field Value | Description |
|---|---|---|
| status | error | Log level (emerg is changed to error) |
| msg | 50102#0: unexpected \";\" in /usr/local/etc/nginx/nginx.conf:23 | Log content |
| time | 1619684678000000000 | Nanosecond timestamp (used as the line protocol time) |
- NGINX access log parsing
Example of access log text:
127.0.0.1 - - [24/Mar/2021:13:54:19 +0800] "GET /basic_status HTTP/1.1" 200 97 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.72 Safari/537.36"
The parsed fields are as follows:
| Field Name | Field Value | Description |
|---|---|---|
| client_ip | 127.0.0.1 | Client IP address |
| status | ok | Log level |
| status_code | 200 | HTTP status code |
| http_method | GET | HTTP request method |
| http_url | /basic_status | HTTP request URL |
| http_version | 1.1 | HTTP version |
| agent | Mozilla/5.0... Safari/537.36 | User-Agent |
| browser | Chrome | Browser |
| browserVer | 89.0.4389.72 | Browser version |
| isMobile | false | Whether the client is a mobile device |
| engine | AppleWebKit | Browser engine |
| os | Intel Mac OS X 11_1_0 | Operating system |
| time | 1619243659000000000 | Nanosecond timestamp (used as the line protocol time) |
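A comparable sketch for the access log, again not the actual Pipeline script. The function names are hypothetical, and the mapping from status code to `status` is an assumption: only the `200` to `ok` case is shown in the table. User-Agent parsing (`browser`, `engine`, `os`, and so on) is left out.

```python
import re

LINE = (
    '127.0.0.1 - - [24/Mar/2021:13:54:19 +0800] "GET /basic_status HTTP/1.1" 200 97 "-" '
    '"Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/89.0.4389.72 Safari/537.36"'
)

# Matches the default "combined" access log layout used in the sample line.
ACCESS = re.compile(
    r'^(?P<client_ip>\S+) \S+ \S+ \[(?P<time_local>[^\]]+)\] '
    r'"(?P<http_method>\S+) (?P<http_url>\S+) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<status_code>\d{3}) (?P<body_bytes>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def status_from_code(code: int) -> str:
    """Assumed mapping; the table only confirms that 200 is reported as ok."""
    if code >= 500:
        return "error"
    if code >= 400:
        return "warning"
    if code >= 300:
        return "notice"
    return "ok"


def parse_access_log(line: str):
    """Return the parsed fields, or None when the line does not match."""
    m = ACCESS.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["status_code"] = int(fields["status_code"])
    fields["status"] = status_from_code(fields["status_code"])
    return fields


fields = parse_access_log(LINE)
print(fields["client_ip"], fields["http_url"], fields["status_code"], fields["status"])
# 127.0.0.1 /basic_status 200 ok
```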
Tracing¶
Requirements¶
- Install nginx (>=1.9.13)
This module only supports the Linux operating system
Install Nginx OpenTracing Plugin¶
The Nginx OpenTracing plugin is an open-source distributed tracing plugin based on OpenTracing, written in C++. It works with Jaeger, Zipkin, LightStep, and Datadog.
- Download the plugin that matches your Nginx version. To check the current Nginx version, run `nginx -v`.
- Extract
- Install plugin
Add the following information at the top of the nginx.conf file
Install DDAgent Nginx OpenTracing plugin¶
The DDAgent Nginx OpenTracing plugin is a vendor implementation built on Nginx OpenTracing. Each APM vendor provides its own encoding and decoding implementation.
- Download dd-opentracing-cpp: `libdd_opentracing.so` or `linux-amd64-libdd_opentracing_plugin.so.gz`
- Configure Nginx
opentracing_load_tracer /etc/nginx/tracer/libdd_opentracing.so /etc/nginx/tracer/dd.json;
opentracing on; # Enable OpenTracing
opentracing_tag http_user_agent $http_user_agent;
opentracing_trace_locations off;
opentracing_propagate_context;
opentracing_operation_name nginx-$host;
`opentracing_load_tracer`: loads the OpenTracing tracer plugin.
`opentracing_propagate_context;`: indicates that the trace context must be passed on to downstream requests.
- Configure DDTrace
dd.json configures DDTrace settings such as service and agent_host. Its content is as follows:
{
"environment": "test",
"service": "nginx",
"operation_name_override": "nginx.handle",
"agent_host": "localhost",
"agent_port": 9529
}
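Before reloading Nginx, it can help to confirm that dd.json is valid JSON and points at the intended agent. The sketch below is a hypothetical pre-flight check, not something the plugin provides. The set of required keys is an assumption chosen for this example. The derived URL has the same form as the `agent_url` value the tracer prints at startup.

```python
import json

# Same content as the dd.json example above.
DD_JSON = """
{
  "environment": "test",
  "service": "nginx",
  "operation_name_override": "nginx.handle",
  "agent_host": "localhost",
  "agent_port": 9529
}
"""

# Assumption: the keys this sketch insists on; the plugin may accept fewer.
REQUIRED_KEYS = ("service", "agent_host", "agent_port")


def load_tracer_config(text: str) -> dict:
    """Parse dd.json text and fail early if an expected key is missing."""
    cfg = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in cfg]
    if missing:
        raise KeyError(f"dd.json is missing: {', '.join(missing)}")
    return cfg


def agent_url(cfg: dict) -> str:
    """Build the agent endpoint from agent_host and agent_port."""
    return f"http://{cfg['agent_host']}:{cfg['agent_port']}"


cfg = load_tracer_config(DD_JSON)
print(agent_url(cfg))  # http://localhost:9529
```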
- Nginx logging configuration
Inject Trace information into Nginx logs. You can edit as follows:
log_format with_trace_id '$remote_addr - $http_x_forwarded_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'"$opentracing_context_x_datadog_trace_id" "$opentracing_context_x_datadog_parent_id"';
access_log /var/log/nginx/access-with-trace.log with_trace_id;
Note: The `log_format` keyword defines a set of logging rules. `with_trace_id` is the rule name and can be changed; use the same name when you specify the log path in `access_log`. The path and file name in `access_log` can also be changed. Nginx usually ships with a default log rule. You can configure multiple rules and write different log formats to different files: keep the original `access_log` rule and path unchanged, and add a new rule containing trace information that writes to a separate log file, so that different logging tools can each read the file they need.
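To check that trace information actually reaches the log, you can pull the last two quoted fields out of each line written in the `with_trace_id` format. The sample line below is hypothetical (every value is made up), and `trace_ids` is an illustrative helper, not part of any tool.

```python
import re

# Hypothetical line in the with_trace_id format; all values are made up.
SAMPLE = (
    '10.0.0.5 - - [25/Sep/2023:14:40:01 +0800] "GET /api HTTP/1.1" '
    '200 512 "-" "curl/7.81.0" "-" "1234567890123456789" "987654321"'
)

# The format ends with two quoted fields: trace ID, then parent ID.
TRACE_TAIL = re.compile(r'"(?P<trace_id>[^"]*)" "(?P<parent_id>[^"]*)"\s*$')


def trace_ids(line: str):
    """Return (trace_id, parent_id), or None when the line has no such tail."""
    match = TRACE_TAIL.search(line)
    if match is None:
        return None
    return match["trace_id"], match["parent_id"]


print(trace_ids(SAMPLE))  # ('1234567890123456789', '987654321')
```

An empty trace ID in this position would suggest that the OpenTracing context variables are not being populated for that request.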
- Verify whether the plugin is working properly
Execute the following command to verify
$:/etc/nginx# nginx -t
info: DATADOG TRACER CONFIGURATION - {"agent_url":"http://localhost:9529","analytics_enabled":false,"analytics_sample_rate":null,"date":"2023-09-25T14:33:40+0800","enabled":true,"env":"prod","lang":"cpp","lang_version":"201402","operation_name_override":"nginx.handle","report_hostname":false,"sampling_rules":"[]","service":"nginx","version":"v1.3.7"}
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
The line `info: DATADOG TRACER CONFIGURATION` indicates that DDTrace has been loaded successfully.
Service trace propagation¶
After Nginx generates trace information, it must forward the relevant request headers to the backend so that the Nginx spans and the backend spans are joined into a single trace.
If the Nginx trace information does not match DDTrace, check that this step is configured correctly.
Add the following configuration to the location block under the corresponding server:
location ^~ / {
...
proxy_set_header X-datadog-trace-id $opentracing_context_x_datadog_trace_id;
proxy_set_header X-datadog-parent-id $opentracing_context_x_datadog_parent_id;
...
}
Load Nginx configuration¶
Execute the following command to apply the Nginx configuration:
root@liurui:/etc/nginx/tracer# nginx -s reload
info: DATADOG TRACER CONFIGURATION - {"agent_url":"http://localhost:9529","analytics_enabled":false,"analytics_sample_rate":null,"date":"2023-09-25T11:30:10+0800","enabled":true,"env":"prod","lang":"cpp","lang_version":"201402","operation_name_override":"nginx.handle","report_hostname":false,"sampling_rules":"[]","service":"nginx","version":"v1.3.7"}
root@liurui:/etc/nginx/tracer#
If the following error occurs:
root@liurui:/etc/nginx/conf.d# nginx -s reload
info: DATADOG TRACER CONFIGURATION - {"agent_url":"http://localhost:9529","analytics_enabled":false,"analytics_sample_rate":null,"date":"2023-09-25T12:28:53+0800","enabled":true,"env":"prod","lang":"cpp","lang_version":"201402","operation_name_override":"nginx.handle","report_hostname":false,"sampling_rules":"[]","service":"nginx","version":"v1.3.7"}
nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
Add the following configuration to the http block of nginx.conf: