NSQ
Collects NSQ runtime data and reports it to Guance as metrics.
Configuration
Prerequisites
The recommended NSQ version is >= 1.0.0. Tested versions:
- 1.2.1
- 1.1.0
- 0.3.8
Collector configuration
Go to the conf.d/samples directory under the DataKit installation directory, copy nsq.conf.sample, and name the copy nsq.conf. An example follows:
[[inputs.nsq]]
## NSQ Lookupd HTTP API endpoint
lookupd = "http://localhost:4161"
## NSQD HTTP API endpoint
## example:
## ["http://localhost:4151"]
nsqd = []
## time units are "ms", "s", "m", "h"
interval = "10s"
## Set true to enable election
election = true
## Optional TLS Config
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false
[inputs.nsq.tags]
# some_tag = "some_value"
# more_tag = "some_other_value"
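After the configuration is in place, restart DataKit.
In Kubernetes, the collector can currently be enabled by injecting its configuration through a ConfigMap.
The NSQ collector offers two configuration modes:
- lookupd: set the lookupd address of the NSQ cluster. The collector automatically discovers the NSQ servers and collects their data, which scales better.
- nsqd: set a fixed list of NSQ Daemon (nsqd) addresses. The collector collects data only from the NSQ servers in that list.
The two modes are mutually exclusive, and lookupd takes precedence. The lookupd mode is recommended.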
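As a rough sketch of the two modes, the fragment below keeps the lookupd mode active and shows the nsqd mode commented out. The nsqd host names are hypothetical, and leaving `lookupd` empty to select the nsqd mode is an assumption based on the precedence rule stated above, not a documented switch.

```toml
[[inputs.nsq]]
  ## Mode A (recommended): discover nsqd nodes through lookupd.
  lookupd = "http://localhost:4161"
  nsqd = []

  ## Mode B: fixed nsqd list.
  ## Assumption: lookupd is left empty here, because lookupd takes
  ## precedence whenever both settings are present.
  ## The host names below are hypothetical.
  # lookupd = ""
  # nsqd = ["http://nsqd-1.example.internal:4151", "http://nsqd-2.example.internal:4151"]

  interval = "10s"
  election = true
```

To switch modes, comment out the Mode A lines and uncomment the Mode B lines, then restart DataKit.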
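Metrics
All data collected below has the global election tags appended by default. Additional tags can be specified in the configuration through [inputs.nsq.tags].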
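A minimal sketch of adding custom tags follows. The tag names and values (`env`, `cluster`) are hypothetical and only illustrate the shape of the [inputs.nsq.tags] table.

```toml
[[inputs.nsq]]
  lookupd = "http://localhost:4161"
  interval = "10s"
  election = true

  [inputs.nsq.tags]
    # Hypothetical tags, for illustration only.
    env = "staging"
    cluster = "nsq-east"
```

Tags defined this way would be expected to appear alongside the built-in tags listed in the tables below.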
nsq_topics
Metrics of all topics in the NSQ cluster.
| Tags & Fields | Description |
|---|---|
| channel (tag) | Channel name |
| topic (tag) | Topic name |
| backend_depth | Total number of unconsumed messages exceeding the max-queue-size. Type: int. Unit: count |
| deferred_count | Number of messages that have been requeued and are not yet ready for re-sending. Type: int. Unit: count |
| depth | Total number of unconsumed messages in the current channel. Type: int. Unit: count |
| in_flight_count | Number of messages during the sending process or client processing that have not been sent FIN, REQ (requeued), or timed out. Type: int. Unit: count |
| message_count | Total number of messages processed in the current channel. Type: int. Unit: count |
| requeue_count | Number of messages that have timed out or have been sent REQ by the client. Type: int. Unit: count |
| timeout_count | Number of messages that have timed out and are still unprocessed. Type: int. Unit: count |
nsq_nodes
Metrics of all nodes in the NSQ cluster.
| Tags & Fields | Description |
|---|---|
| host (tag) | Hostname |
| server_host (tag) | Service address, that is host:ip. |
| backend_depth | Total number of unconsumed messages exceeding the max-queue-size. Type: int. Unit: count |
| depth | Total number of unconsumed messages in the current node. Type: int. Unit: count |
| message_count | Total number of messages processed by the current node. Type: int. Unit: count |
collector
| Tags & Fields | Description |
|---|---|
| instance (tag) | Server addr of the instance |
| job (tag) | Server name of the instance |
| up | Type: int (gauge). Unit: - |
Custom objects
mq
| Tags & Fields | Description |
|---|---|
| col_co_status (tag) | Current status of the collector on the instance (OK/NotOK) |
| host (tag) | The server host address |
| ip (tag) | Connection IP of the instance |
| name (tag) | Unique object ID |
| reason (tag) | The reason for the status when it is not OK |
| display_name | Displayed name in UI. Type: string (gauge). Unit: N/A |
| uptime | Current instance uptime. Type: int (gauge). Unit: time, s |
| version | Current version of the instance. Type: string (gauge). Unit: N/A |