RabbitMQ
The RabbitMQ collector monitors RabbitMQ by collecting data through the rabbitmq-management plug-in. It can:
- Collect a RabbitMQ overview, such as connections, queues, and total messages.
- Track RabbitMQ queue information, such as queue size and consumer count.
- Track RabbitMQ node information, such as socket and mem.
- Track RabbitMQ exchange information, such as message_publish_count.
Configuration¶
Preconditions¶
- RabbitMQ version >= 3.6.0. Already tested versions:
    - 3.11.x
    - 3.10.x
    - 3.9.x
    - 3.8.x
    - 3.7.x
    - 3.6.x
- Install rabbitmq, taking Ubuntu as an example.
- Start the REST API plug-in.
- Create a user for the collector.
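Once the plug-in is enabled and a user exists, it can help to confirm that the management API answers before configuring the collector. The following is a minimal sketch, assuming the API listens on the default `http://localhost:15672` and that the user may read `/api/overview`; the function names are illustrative and not part of DataKit.

```python
import base64
import json
import urllib.request


def build_overview_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request for the management API's /api/overview endpoint."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/overview",
        headers={"Authorization": f"Basic {token}"},
    )


def check_management_api(base_url: str, username: str, password: str, timeout: float = 5.0) -> str:
    """Return the RabbitMQ version reported by the API (raises on HTTP or network errors)."""
    request = build_overview_request(base_url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        overview = json.load(response)
    return overview["rabbitmq_version"]


# Example call (assumed URL and credentials; replace with your own):
# print(check_management_api("http://localhost:15672", "guest", "guest"))
```

An HTTP 401 from this call usually points at the credentials, while a connection error suggests the plug-in is not enabled or the port is not reachable.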
Collector Configuration¶
Go to the conf.d/samples directory under the DataKit installation directory, copy rabbitmq.conf.sample, and name the copy rabbitmq.conf. An example follows:
    [[inputs.rabbitmq]]
      # RabbitMQ URL, required
      url = "http://localhost:15672"
      # RabbitMQ user, required
      username = "guest"
      # RabbitMQ password, required
      password = "guest"
      ## (optional) collection interval, default is 30s
      # interval = "30s"
      ## Optional TLS Config
      # tls_ca = "/xxx/ca.pem"
      # tls_cert = "/xxx/cert.cer"
      # tls_key = "/xxx/key.key"
      ## Use TLS but skip chain & host verification
      insecure_skip_verify = false
      ## Set true to enable election
      election = true
      # [inputs.rabbitmq.log]
      # files = []
      # ## grok pipeline script path
      # pipeline = "rabbitmq.p"
      [inputs.rabbitmq.tags]
      # some_tag = "some_value"
      # more_tag = "some_other_value"
      # ...
After configuration, restart DataKit.
In Kubernetes, the collector can currently be enabled by injecting the collector configuration through a ConfigMap.
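The TLS options in the sample configuration only matter when `url` uses `https`. As a rough illustration of what options like these usually mean for an HTTPS client, the sketch below translates them into a Python `ssl.SSLContext`. This is not DataKit's own code, and the function name is hypothetical; it only mirrors the option names from the sample file.

```python
import ssl
from typing import Optional


def build_tls_context(
    tls_ca: Optional[str] = None,
    tls_cert: Optional[str] = None,
    tls_key: Optional[str] = None,
    insecure_skip_verify: bool = False,
) -> ssl.SSLContext:
    """Translate the four TLS options into a client-side SSLContext."""
    # Trust the given CA bundle, or the system defaults when tls_ca is unset.
    context = ssl.create_default_context(cafile=tls_ca)
    if tls_cert:
        # Present a client certificate (mutual TLS).
        context.load_cert_chain(certfile=tls_cert, keyfile=tls_key)
    if insecure_skip_verify:
        # Traffic is still encrypted, but any certificate and host name is accepted.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

Skipping verification removes protection against impersonation of the server, so it is generally best kept to test environments.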
Metric¶
For all of the following data collections, the global election tags are added automatically. Extra tags can be added in [inputs.rabbitmq.tags] if needed:
rabbitmq_overview¶
| Tags & Fields | Description |
|---|---|
| cluster_name (tag) | RabbitMQ cluster name |
| host (tag) | Hostname of the host RabbitMQ is running on |
| rabbitmq_version (tag) | RabbitMQ version |
| url (tag) | RabbitMQ URL |
| message_ack_count | Number of messages delivered to clients and acknowledged. Type: int (count). Unit: count |
| message_ack_rate | Rate of messages delivered to clients and acknowledged per second. Type: float (gauge). Unit: percent,percent |
| message_confirm_count | Count of messages confirmed. Type: int (count). Unit: count |
| message_confirm_rate | Rate of messages confirmed per second. Type: float (gauge). Unit: percent,percent |
| message_deliver_get_count | Sum of messages delivered in acknowledgement mode to consumers, in no-acknowledgement mode to consumers, in acknowledgement mode in response to basic.get, and in no-acknowledgement mode in response to basic.get. Type: int (count). Unit: count |
| message_deliver_get_rate | Rate per second of the sum of messages delivered in acknowledgement mode to consumers, in no-acknowledgement mode to consumers, in acknowledgement mode in response to basic.get, and in no-acknowledgement mode in response to basic.get. Type: float (gauge). Unit: percent,percent |
| message_publish_count | Count of messages published. Type: int (count). Unit: count |
| message_publish_in_count | Count of messages published from channels into this overview. Type: int (count). Unit: count |
| message_publish_in_rate | Rate of messages published from channels into this overview per second. Type: float (gauge). Unit: percent,percent |
| message_publish_out_count | Count of messages published from this overview into queues. Type: int (count). Unit: count |
| message_publish_out_rate | Rate of messages published from this overview into queues per second. Type: float (gauge). Unit: percent,percent |
| message_publish_rate | Rate of messages published per second. Type: float (gauge). Unit: percent,percent |
| message_redeliver_count | Count of the subset of messages in deliver_get which had the redelivered flag set. Type: int (count). Unit: count |
| message_redeliver_rate | Rate per second of the subset of messages in deliver_get which had the redelivered flag set. Type: float (gauge). Unit: percent,percent |
| message_return_unroutable_count | Count of messages returned to publisher as unroutable. Type: int (count). Unit: count |
| message_return_unroutable_count_rate | Rate of messages returned to publisher as unroutable per second. Type: float (gauge). Unit: percent,percent |
| object_totals_channels | Total number of channels. Type: int (count). Unit: count |
| object_totals_connections | Total number of connections. Type: int (count). Unit: count |
| object_totals_consumers | Total number of consumers. Type: int (count). Unit: count |
| object_totals_queues | Total number of queues. Type: int (count). Unit: count |
| queue_totals_messages_count | Total number of messages (ready plus unacknowledged). Type: int (count). Unit: count |
| queue_totals_messages_rate | Total rate of messages (ready plus unacknowledged). Type: float (gauge). Unit: percent,percent |
| queue_totals_messages_ready_count | Number of messages ready for delivery. Type: int (count). Unit: count |
| queue_totals_messages_ready_rate | Rate of number of messages ready for delivery. Type: float (gauge). Unit: percent,percent |
| queue_totals_messages_unacknowledged_count | Number of unacknowledged messages. Type: int (count). Unit: count |
| queue_totals_messages_unacknowledged_rate | Rate of number of unacknowledged messages. Type: float (gauge). Unit: percent,percent |
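To relate these field names to what the management API returns, the sketch below flattens a `/api/overview` payload into a handful of the fields listed above. The mapping is an assumption based on the names (for example `message_stats.publish` to `message_publish_count`), not a statement of how the collector is implemented, and the helper name is hypothetical.

```python
from typing import Any, Dict


def overview_fields(overview: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten an /api/overview payload into a few of the field names listed above."""
    stats = overview.get("message_stats", {})
    objects = overview.get("object_totals", {})
    queues = overview.get("queue_totals", {})
    return {
        # Counters come from the top-level key; rates from the matching *_details.rate.
        "message_publish_count": stats.get("publish", 0),
        "message_publish_rate": stats.get("publish_details", {}).get("rate", 0.0),
        "message_ack_count": stats.get("ack", 0),
        "message_ack_rate": stats.get("ack_details", {}).get("rate", 0.0),
        "object_totals_connections": objects.get("connections", 0),
        "object_totals_queues": objects.get("queues", 0),
        "queue_totals_messages_count": queues.get("messages", 0),
        "queue_totals_messages_rate": queues.get("messages_details", {}).get("rate", 0.0),
    }
```

Missing keys fall back to zero here because an idle broker may omit statistics it has not recorded yet.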
rabbitmq_queue¶
| Tags & Fields | Description |
|---|---|
| cluster_name (tag) | RabbitMQ cluster name |
| host (tag) | Hostname of the host RabbitMQ is running on |
| node_name (tag) | RabbitMQ node name |
| queue_name (tag) | RabbitMQ queue name |
| url (tag) | RabbitMQ host URL |
| vhost (tag) | RabbitMQ queue virtual host |
| bindings_count | Number of bindings for a specific queue. Type: int (count). Unit: count |
| consumer_utilization | The ratio of time that a queue's consumers can take new messages. Type: float (gauge). Unit: percent,percent |
| consumers | Number of consumers. Type: int (count). Unit: count |
| head_message_timestamp | Timestamp of the head message of the queue, in milliseconds. Type: int (gauge). Unit: timeStamp,msec |
| memory | Bytes of memory consumed by the Erlang process associated with the queue, including stack, heap and internal structures. Type: int (gauge). Unit: digital,B |
| message_ack_count | Number of messages in queues delivered to clients and acknowledged. Type: int (count). Unit: count |
| message_ack_rate | Number per second of messages delivered to clients and acknowledged. Type: float (gauge). Unit: percent,percent |
| message_deliver_count | Count of messages delivered in acknowledgement mode to consumers. Type: int (count). Unit: count |
| message_deliver_get_count | Sum of messages in queues delivered in acknowledgement mode to consumers, in no-acknowledgement mode to consumers, in acknowledgement mode in response to basic.get, and in no-acknowledgement mode in response to basic.get. Type: int (count). Unit: count |
| message_deliver_get_rate | Rate per second of the sum of messages in queues delivered in acknowledgement mode to consumers, in no-acknowledgement mode to consumers, in acknowledgement mode in response to basic.get, and in no-acknowledgement mode in response to basic.get. Type: float (gauge). Unit: percent,percent |
| message_deliver_rate | Rate of messages delivered in acknowledgement mode to consumers. Type: float (gauge). Unit: percent,percent |
| message_publish_count | Count of messages in queues published. Type: int (count). Unit: count |
| message_publish_rate | Rate per second of messages published. Type: float (gauge). Unit: percent,percent |
| message_redeliver_count | Count of the subset of messages in queues in deliver_get which had the redelivered flag set. Type: int (count). Unit: count |
| message_redeliver_rate | Rate per second of the subset of messages in deliver_get which had the redelivered flag set. Type: float (gauge). Unit: percent,percent |
| messages | Count of the total messages in the queue. Type: int (count). Unit: count |
| messages_rate | Count per second of the total messages in the queue. Type: float (gauge). Unit: percent,percent |
| messages_ready | Number of messages ready to be delivered to clients. Type: int (count). Unit: count |
| messages_ready_rate | Number per second of messages ready to be delivered to clients. Type: float (gauge). Unit: percent,percent |
| messages_unacknowledged | Number of messages delivered to clients but not yet acknowledged. Type: int (count). Unit: count |
| messages_unacknowledged_rate | Number per second of messages delivered to clients but not yet acknowledged. Type: float (gauge). Unit: percent,percent |
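When cross-checking one queue's values against the management API by hand, the `vhost` and `queue_name` tags identify the queue under `/api/queues/{vhost}/{name}`. A common stumbling block is that the default virtual host `/` has to be percent-encoded as `%2F`. The helper below is a small sketch of building that URL; its name is illustrative only.

```python
from urllib.parse import quote


def queue_api_url(base_url: str, vhost: str, queue_name: str) -> str:
    """Build the management API URL for a single queue."""
    # safe="" forces "/" to be encoded, so the default vhost "/" becomes "%2F".
    return "{}/api/queues/{}/{}".format(
        base_url.rstrip("/"), quote(vhost, safe=""), quote(queue_name, safe="")
    )
```

For example, `queue_api_url("http://localhost:15672", "/", "orders")` yields `http://localhost:15672/api/queues/%2F/orders`, where `orders` is a made-up queue name.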
rabbitmq_exchange¶
| Tags & Fields | Description |
|---|---|
| auto_delete (tag) | If set, the exchange is deleted when all queues have finished using it |
| cluster_name (tag) | RabbitMQ cluster name |
| durable (tag) | If set when creating a new exchange, the exchange will be marked as durable. Durable exchanges remain active when a server restarts. Non-durable exchanges (transient exchanges) are purged if/when a server restarts. |
| exchange_name (tag) | RabbitMQ exchange name |
| host (tag) | Hostname of the host RabbitMQ is running on |
| internal (tag) | If set, the exchange may not be used directly by publishers, but only when bound to other exchanges. Internal exchanges are used to construct wiring that is not visible to applications |
| type (tag) | RabbitMQ exchange type |
| url (tag) | RabbitMQ host URL |
| vhost (tag) | RabbitMQ exchange virtual host |
| message_ack_count | Number of messages in exchanges delivered to clients and acknowledged. Type: int (count). Unit: count |
| message_ack_rate | Rate of messages in exchanges delivered to clients and acknowledged per second. Type: float (gauge). Unit: percent,percent |
| message_confirm_count | Count of messages in exchanges confirmed. Type: int (count). Unit: count |
| message_confirm_rate | Rate of messages in exchanges confirmed per second. Type: float (gauge). Unit: percent,percent |
| message_deliver_get_count | Sum of messages in exchanges delivered in acknowledgement mode to consumers, in no-acknowledgement mode to consumers, in acknowledgement mode in response to basic.get, and in no-acknowledgement mode in response to basic.get. Type: int (count). Unit: count |
| message_deliver_get_rate | Rate per second of the sum of exchange messages delivered in acknowledgement mode to consumers, in no-acknowledgement mode to consumers, in acknowledgement mode in response to basic.get, and in no-acknowledgement mode in response to basic.get. Type: float (gauge). Unit: percent,percent |
| message_publish_count | Count of messages in exchanges published. Type: int (count). Unit: count |
| message_publish_in_count | Count of messages published from channels into this exchange. Type: int (count). Unit: count |
| message_publish_in_rate | Rate of messages published from channels into this exchange per second. Type: float (gauge). Unit: percent,percent |
| message_publish_out_count | Count of messages published from this exchange into queues. Type: int (count). Unit: count |
| message_publish_out_rate | Rate of messages published from this exchange into queues per second. Type: float (gauge). Unit: percent,percent |
| message_publish_rate | Rate of messages in exchanges published per second. Type: float (gauge). Unit: percent,percent |
| message_redeliver_count | Count of the subset of messages in exchanges in deliver_get which had the redelivered flag set. Type: int (count). Unit: count |
| message_redeliver_rate | Rate per second of the subset of messages in exchanges in deliver_get which had the redelivered flag set. Type: float (gauge). Unit: percent,percent |
| message_return_unroutable_count | Count of messages in exchanges returned to publisher as unroutable. Type: int (count). Unit: count |
| message_return_unroutable_count_rate | Rate of messages in exchanges returned to publisher as unroutable per second. Type: float (gauge). Unit: percent,percent |
rabbitmq_node¶
| Tags & Fields | Description |
|---|---|
| cluster_name (tag) | RabbitMQ cluster name |
| host (tag) | Hostname of the host RabbitMQ is running on |
| node_name (tag) | RabbitMQ node name |
| url (tag) | RabbitMQ URL |
| disk_free | Current free disk space. Type: int (gauge). Unit: digital,B |
| disk_free_alarm | Whether the node has a disk alarm. Type: bool (gauge). Unit: N/A |
| fd_used | Used file descriptors. Type: int (gauge). Unit: count |
| io_read_avg_time | Average wall time (milliseconds) for each disk read operation in the last statistics interval. Type: float (gauge). Unit: time,ms |
| io_seek_avg_time | Average wall time (milliseconds) for each seek operation in the last statistics interval. Type: float (gauge). Unit: time,ms |
| io_sync_avg_time | Average wall time (milliseconds) for each fsync() operation in the last statistics interval. Type: float (gauge). Unit: time,ms |
| io_write_avg_time | Average wall time (milliseconds) for each disk write operation in the last statistics interval. Type: float (gauge). Unit: time,ms |
| mem_alarm | Whether the node has a memory alarm. Type: bool (gauge). Unit: N/A |
| mem_limit | Memory usage high watermark in bytes. Type: int (gauge). Unit: digital,B |
| mem_used | Memory used in bytes. Type: int (gauge). Unit: digital,B |
| run_queue | Average number of Erlang processes waiting to run. Type: int (count). Unit: count |
| running | Whether the node is running. Type: bool (gauge). Unit: N/A |
| sockets_used | Number of file descriptors used as sockets. Type: int (count). Unit: count |
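Because `mem_limit` is the high watermark and `mem_used` is the current usage, dividing one by the other gives a rough view of how close a node is to its memory alarm. The sketch below shows that arithmetic; the function name and the sample numbers are illustrative only.

```python
def memory_watermark_ratio(mem_used: int, mem_limit: int) -> float:
    """Return mem_used as a fraction of the high watermark (mem_limit)."""
    if mem_limit <= 0:
        raise ValueError("mem_limit must be positive")
    return mem_used / mem_limit


# Made-up sample: 1 GiB used against a 4 GiB watermark.
ratio = memory_watermark_ratio(1 * 1024**3, 4 * 1024**3)
```

With these sample values the ratio is 0.25. Values approaching 1.0 suggest the node is close to the watermark at which `mem_alarm` would be expected to trigger.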
collector¶
| Tags & Fields | Description |
|---|---|
| instance (tag) | Server addr of the instance |
| job (tag) | Server name of the instance |
| up | Type: int (gauge). Unit: - |
Custom Object¶
mq¶
| Tags & Fields | Description |
|---|---|
| col_co_status (tag) | Current status of the collector on the instance (OK/NotOK) |
| host (tag) | The server host address |
| ip (tag) | Connection IP of the instance |
| name (tag) | Unique object ID |
| reason (tag) | If the status is not OK, the reasons for that status |
| display_name | Displayed name in UI. Type: string (gauge). Unit: N/A |
| uptime | Current instance uptime. Type: int (gauge). Unit: time,s |
| version | Current version of the instance. Type: string (gauge). Unit: N/A |
Log Collection¶
Note
DataKit must be installed on the host where RabbitMQ is located to collect RabbitMQ logs.
To collect the RabbitMQ log, enable files in rabbitmq.conf and set it to the absolute path of the RabbitMQ log file. For example:
    [[inputs.rabbitmq]]
      ...
      [inputs.rabbitmq.log]
        files = ["/var/log/rabbitmq/rabbit@your-hostname.log"]
When log collection is turned on, logs are generated with the log source rabbitmq by default.
Log Pipeline Field Extraction Description¶
- General RabbitMQ log parsing
Example of a typical log line:
2021-05-26 14:20:06.105 [warning] <0.12897.46> rabbitmqctl node_health_check and its HTTP API counterpart are DEPRECATED. See https://www.rabbitmq.com/monitoring.html#health-checks for replacement options.
The extracted fields are as follows:
| Field Name | Field Value | Description |
|---|---|---|
| status | warning | Log level |
| msg | <0.12897.46>...replacement options | Log content |
| time | 1622010006000000000 | Nanosecond timestamp (used as the line protocol time) |
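The split shown in the table can be reproduced with a short regular expression. The sketch below is not the `rabbitmq.p` pipeline script itself, only a Python approximation of the same three fields. The `utc_offset_hours` default of 8 is an assumption about the time zone the sample log was written in, and the function and pattern names are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

LOG_PATTERN = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "  # 2021-05-26 14:20:06.105
    r"\[(?P<status>\w+)\] "                                      # [warning]
    r"(?P<msg>.*)$"                                              # rest of the line
)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_rabbitmq_log(line: str, utc_offset_hours: int = 8) -> dict:
    """Split one log line into time (ns since epoch), status and msg."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError("line does not look like a RabbitMQ log entry")
    local = datetime.strptime(match["time"], "%Y-%m-%d %H:%M:%S.%f").replace(
        tzinfo=timezone(timedelta(hours=utc_offset_hours))
    )
    # Integer arithmetic avoids float rounding at nanosecond scale.
    nanoseconds = ((local - EPOCH) // timedelta(microseconds=1)) * 1000
    return {"time": nanoseconds, "status": match["status"], "msg": match["msg"]}
```

Under the assumed UTC+8 offset, the sample line's timestamp resolves to the same second as the value in the table; this sketch additionally keeps the millisecond part, which the table value does not show.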