Redis
Redis metrics collector, which collects the following data:
- AOF data persistence metrics (when AOF is enabled)
- RDB data persistence metrics
- Slow Log monitoring metrics
- Big Key scan monitoring
- Master-slave replication
Configuration
Tested versions:
- 7.0.11
- 6.2.12
- 6.0.8
- 5.0.14
- 4.0.14
Preconditions
- Redis version v5.0+
When collecting data in a master-slave architecture, configure the host information of either the slave node or the master node. The master-slave metrics you get differ depending on which node is collected.
Create Monitor User (optional)
For Redis 6.0+, open the redis-cli command line, then create the user and grant it permissions.
- To collect hotkey/bigkey statistics, open the redis-cli command line and grant the required permissions:
CONFIG SET maxmemory-policy allkeys-lfu
ACL SETUSER username on +get +@read +@connection +@keyspace ~*
- Collecting hotkey and bigkey data from a remote Redis instance requires redis-cli to be installed on the DataKit host; local collection does not.
Collector Configuration
Go to the conf.d/db directory under the DataKit installation directory, copy redis.conf.sample, and name the copy redis.conf. An example follows:
[[inputs.redis]]
host = "localhost"
port = 6379
## TLS connection config, redis-cli version must be 6.0+
## These tls configuration files should be the same as the ones used on the server.
## See also: https://redis.io/docs/latest/operate/oss_and_stack/management/security/encryption/
# insecure_skip_verify = true
# ca_certs = ["/opt/tls/ca.crt"]
# cert = "/opt/tls/redis.crt"
# cert_key = "/opt/tls/redis.key"
## The content of these files can also be provided base64-encoded:
# ca_certs_base64 = ["LONG_STRING......"]
# cert_base64 = "LONG_STRING......"
# cert_key_base64 = "LONG_STRING......"
# server_name = "your-SNI-name"
# unix_socket_path = "/var/run/redis/redis.sock"
## To collect specific dbs, list them in dbs; the listed dbs are added to the collection list.
## If dbs=[] or dbs is not configured, all non-empty dbs in Redis are collected.
# dbs=[0]
# username = "<USERNAME>"
# password = "<PASSWORD>"
## @param connect_timeout - number - optional - default: 10s
# connect_timeout = "10s"
## @param service - string - optional
service = "redis"
## @param interval - number - optional - default: 15
interval = "15s"
## @param redis_cli_path - string - optional - default: "redis-cli"
## If you want to use a custom redis-cli path for bigkey or hotkey, set this to the path of the redis-cli binary.
# redis_cli_path = "/usr/bin/redis-cli"
## @param hotkey - boolean - optional - default: false
## Set this to true to collect hotkey data
# hotkey = false
## @param bigkey - boolean - optional - default: false
## Set this to true to collect bigkey data
# bigkey = false
## @param key_interval - number - optional - default: 5m
## Interval between hotkey & bigkey collections
# key_interval = "5m"
## @param key_timeout - number - optional - default: 5m
## Timeout for hotkey & bigkey collection
# key_timeout = "5m"
## @param key_scan_sleep - string - optional - default: "0.1"
## Sleep 0.1 sec after every 100 SCAN commands
# key_scan_sleep = "0.1"
## @param keys - list of strings - optional
## The length is 1 for strings.
## The length is zero for keys that have a type other than list, set, hash, or sorted set.
#
# keys = ["KEY_1", "KEY_PATTERN"]
## @param warn_on_missing_keys - boolean - optional - default: true
## If you provide a list of 'keys', set this to true to have the Agent log a warning
## when keys are missing.
#
# warn_on_missing_keys = true
## @param slow_log - boolean - optional - default: true
slow_log = true
## @param all_slow_log - boolean - optional - default: false
## Collect all slowlogs returned by Redis. When set to false, will only collect slowlog
## that are generated after this input starts, and collect the same slowlog only once.
all_slow_log = false
## @param slowlog-max-len - integer - optional - default: 128
slowlog-max-len = 128
## @param command_stats - boolean - optional - default: false
## Collect INFO COMMANDSTATS output as metrics.
# command_stats = false
## @param latency_percentiles - boolean - optional - default: false
## Collect INFO LATENCYSTATS output as metrics.
# latency_percentiles = false
## Set true to enable election
election = true
# [inputs.redis.log]
# #required, glob logfiles
# files = ["/var/log/redis/*.log"]
## glob filter
#ignore = [""]
## grok pipeline script path
#pipeline = "redis.p"
## optional encodings:
## "utf-8", "utf-16le", "utf-16be", "gbk", "gb18030" or ""
#character_encoding = ""
## The pattern should be a regexp. Note the use of '''this regexp'''
## regexp link: https://golang.org/pkg/regexp/syntax/#hdr-Syntax
#match = '''^\S.*'''
[inputs.redis.tags]
# some_tag = "some_value"
# more_tag = "some_other_value"
After configuration, restart DataKit.
The collector can also be enabled by injecting the collector configuration through a ConfigMap or by configuring ENV_DATAKIT_INPUTS.
Note
If you use Alibaba Cloud Redis with both a username and a password set, set <PASSWORD> to your-user:your-password, for example datakit:Pa55W0rd.
Log Collection Configuration
To collect Redis logs, first enable log file output in the Redis configuration file (redis.config), then set the log path in the collector configuration:
[inputs.redis.log]
# Log path needs to be filled with absolute path
files = ["/var/log/redis/*.log"]
Note
When configuring log collection, DataKit must be installed on the same host as the Redis service; otherwise, mount the log files on the DataKit machine.
In K8s, Redis logs can be written to stdout, where DataKit finds them automatically.
Metrics
For all of the following data collections, the global election tags are added automatically. Extra tags can be added in [inputs.redis.tags] if needed:
redis_client
- Tags
Tag | Description |
---|---|
addr | Address without port of the client |
host | Hostname |
name | The name set by the client with CLIENT SETNAME , default unknown |
server | Server addr |
service_name | Service name |
- Metrics
Metric | Description |
---|---|
age | Total duration of the connection in seconds Type: float Unit: time,s |
argv_mem | Incomplete arguments for the next command (already extracted from query buffer). Type: float Unit: count |
db | Current database ID. Type: float Unit: count |
fd | File descriptor corresponding to the socket. Type: float Unit: count |
id | Unique 64-bit client ID. Type: float Unit: count |
idle | Idle time of the connection in seconds Type: float Unit: time,s |
multi | Number of commands in a MULTI/EXEC context. Type: float Unit: count |
multi_mem | Memory used by buffered MULTI commands. Added in Redis 7.0. Type: float Unit: count |
obl | Output buffer length. Type: float Unit: count |
oll | Output list length (replies are queued in this list when the buffer is full). Type: float Unit: count |
omem | Output buffer memory usage. Type: float Unit: count |
psub | Number of pattern matching subscriptions Type: float Unit: count |
qbuf | Query buffer length (0 means no query pending). Type: float Unit: count |
qbuf_free | Free space of the query buffer (0 means the buffer is full). Type: float Unit: count |
redir | Client id of current client tracking redirection. Type: float Unit: count |
resp | Client RESP protocol version. Added in Redis 7.0. Type: float Unit: count |
ssub | Number of shard channel subscriptions. Added in Redis 7.0.3. Type: float Unit: count |
sub | Number of channel subscriptions Type: float Unit: count |
tot_mem | Total memory consumed by this client in its various buffers. Type: float Unit: count |
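The fields in this table correspond to the `key=value` pairs that Redis prints for each connection in `CLIENT LIST`. The following is a minimal sketch, not DataKit's implementation, of how one such line could be split into the tags and metrics listed above. The function name `parse_client_line` and the sample line are hypothetical.

```python
def parse_client_line(line: str) -> dict:
    """Split one CLIENT LIST line into tags and numeric fields (illustrative only)."""
    # Each field is "key=value"; values may be empty (e.g. "name=").
    fields = dict(pair.split("=", 1) for pair in line.split())

    # The addr tag is documented as the address without the port.
    host = fields.get("addr", "").rsplit(":", 1)[0]

    # An empty client name falls back to "unknown", matching the tag table.
    tags = {"addr": host, "name": fields.get("name") or "unknown"}

    metrics = {}
    for key in ("age", "idle", "db", "qbuf", "omem"):
        if key in fields:
            metrics[key] = float(fields[key])
    return {"tags": tags, "metrics": metrics}


# Hypothetical sample line in the CLIENT LIST format.
sample = "id=3 addr=127.0.0.1:52555 fd=8 name= age=855 idle=0 db=0 qbuf=26 omem=0"
parsed = parse_client_line(sample)
```

Only a handful of numeric fields are converted here; a real collector would cover every metric in the table.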
redis_cluster
- Tags
Tag | Description |
---|---|
host | Hostname |
server_addr | Server addr |
service_name | Service name |
- Metrics
Metric | Description |
---|---|
cluster_current_epoch | The local Current Epoch variable. This is used in order to create unique increasing version numbers during fail overs. Type: int Unit: N/A |
cluster_known_nodes | The total number of known nodes in the cluster, including nodes in HANDSHAKE state that may not currently be proper members of the cluster. Type: int Unit: count |
cluster_my_epoch | The Config Epoch of the node we are talking with. This is the current configuration version assigned to this node. Type: int Unit: N/A |
cluster_size | The number of master nodes serving at least one hash slot in the cluster. Type: int Unit: count |
cluster_slots_assigned | Number of slots which are associated to some node (not unbound). This number should be 16384 for the node to work properly, which means that each hash slot should be mapped to a node. Type: int Unit: count |
cluster_slots_fail | Number of hash slots mapping to a node in FAIL state. If this number is not zero the node is not able to serve queries unless cluster-require-full-coverage is set to no in the configuration. Type: int Unit: count |
cluster_slots_ok | Number of hash slots mapping to a node not in FAIL or PFAIL state. Type: int Unit: count |
cluster_slots_pfail | Number of hash slots mapping to a node in PFAIL state. Note that those hash slots still work correctly, as long as the PFAIL state is not promoted to FAIL by the failure detection algorithm. PFAIL only means that we are currently not able to talk with the node, but this may be just a transient error. Type: int Unit: count |
cluster_state | State is 1(ok) if the node is able to receive queries. 0(fail) if there is at least one hash slot which is unbound (no node associated), in error state (node serving it is flagged with FAIL flag), or if the majority of masters can't be reached by this node. Type: int Unit: enum |
cluster_stats_messages_auth_ack_received | Message indicating a vote during leader election. Type: int Unit: count |
cluster_stats_messages_auth_ack_sent | Message indicating a vote during leader election. Type: int Unit: count |
cluster_stats_messages_auth_req_received | Replica initiated leader election to replace its master. Type: int Unit: count |
cluster_stats_messages_auth_req_sent | Replica initiated leader election to replace its master. Type: int Unit: count |
cluster_stats_messages_fail_received | Number of received FAIL messages (mark node xxx as failing). Type: int Unit: count |
cluster_stats_messages_fail_sent | Number of sent FAIL messages (mark node xxx as failing). Type: int Unit: count |
cluster_stats_messages_meet_received | Handshake message received from a new node, either through gossip or CLUSTER MEET. Type: int Unit: count |
cluster_stats_messages_meet_sent | Handshake message sent to a new node, either through gossip or CLUSTER MEET. Type: int Unit: count |
cluster_stats_messages_mfstart_received | Pause clients for manual failover. Type: int Unit: count |
cluster_stats_messages_mfstart_sent | Pause clients for manual failover. Type: int Unit: count |
cluster_stats_messages_module_received | Module cluster API message. Type: int Unit: count |
cluster_stats_messages_module_sent | Module cluster API message. Type: int Unit: count |
cluster_stats_messages_ping_received | Cluster bus PING received (not to be confused with the client command PING). Type: int Unit: count |
cluster_stats_messages_ping_sent | Cluster bus PING sent (not to be confused with the client command PING). Type: int Unit: count |
cluster_stats_messages_pong_received | PONG received (reply to PING). Type: int Unit: count |
cluster_stats_messages_pong_sent | PONG sent (reply to PING). Type: int Unit: count |
cluster_stats_messages_publish_received | Pub/Sub Publish propagation received. Type: int Unit: count |
cluster_stats_messages_publish_sent | Pub/Sub Publish propagation sent. Type: int Unit: count |
cluster_stats_messages_publishshard_received | Pub/Sub Publish shard propagation, see Sharded Pubsub. Type: int Unit: count |
cluster_stats_messages_publishshard_sent | Pub/Sub Publish shard propagation, see Sharded Pubsub. Type: int Unit: count |
cluster_stats_messages_received | Number of messages received via the cluster node-to-node binary bus. Type: int Unit: count |
cluster_stats_messages_sent | Number of messages sent via the cluster node-to-node binary bus. Type: int Unit: count |
cluster_stats_messages_update_received | UPDATE message received, carrying another node's slots configuration. Type: int Unit: count |
cluster_stats_messages_update_sent | UPDATE message sent, carrying another node's slots configuration. Type: int Unit: count |
total_cluster_links_buffer_limit_exceeded | Accumulated count of cluster links freed due to exceeding the cluster-link-sendbuf-limit configuration. Type: int Unit: count |
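Redis returns these values from `CLUSTER INFO` as `key:value` lines, with `cluster_state` reported as the text `ok` or `fail`. The sketch below, under the assumption that the collector maps that text to the 1/0 enum described in the table, shows one way to turn the reply into integer metrics. `parse_cluster_info` and the sample reply are hypothetical and not taken from DataKit.

```python
def parse_cluster_info(text: str) -> dict:
    """Convert a CLUSTER INFO reply into integer metrics (illustrative only)."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        if key == "cluster_state":
            # The table documents 1 for ok and 0 for fail.
            metrics[key] = 1 if value == "ok" else 0
        elif value.lstrip("-").isdigit():
            metrics[key] = int(value)
    return metrics


# Hypothetical reply for a healthy three-master cluster.
sample = (
    "cluster_state:ok\r\n"
    "cluster_slots_assigned:16384\r\n"
    "cluster_slots_ok:16384\r\n"
    "cluster_slots_pfail:0\r\n"
    "cluster_slots_fail:0\r\n"
    "cluster_known_nodes:6\r\n"
    "cluster_size:3\r\n"
)
info = parse_cluster_info(sample)

# All 16384 hash slots must be assigned for the node to work properly.
fully_covered = info["cluster_slots_assigned"] == 16384
```

A check such as `fully_covered` is the kind of condition an alert on `cluster_slots_assigned` could use.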
redis_command_stat
- Tags
Tag | Description |
---|---|
host | Hostname |
method | Command type |
server | Server addr |
service_name | Service name |
- Metrics
Metric | Description |
---|---|
calls | The number of calls that reached command execution. Type: float Unit: count |
failed_calls | The number of failed calls (errors within the command execution). Type: float Unit: count |
rejected_calls | The number of rejected calls (errors prior to command execution). Type: float Unit: count |
usec | The total CPU time consumed by these commands. Type: float Unit: time,μs |
usec_per_call | The average CPU time consumed per command execution. Type: float Unit: time,μs |
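These metrics come from `INFO COMMANDSTATS`, where each command appears as a `cmdstat_<name>:` line of comma-separated `key=value` pairs. As a rough sketch (the function name `parse_commandstats` and the sample text are hypothetical, and this is not DataKit's code), the lines can be parsed as follows, with the command name becoming the `method` tag:

```python
def parse_commandstats(text: str) -> dict:
    """Map command name -> metric dict from INFO COMMANDSTATS output (illustrative only)."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("cmdstat_"):
            continue  # skip the "# Commandstats" header and blank lines
        name, _, body = line[len("cmdstat_"):].partition(":")
        stats[name] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in body.split(","))
        }
    return stats


# Hypothetical sample in the INFO COMMANDSTATS format.
sample = (
    "# Commandstats\r\n"
    "cmdstat_get:calls=20,usec=170,usec_per_call=8.50,rejected_calls=0,failed_calls=2\r\n"
)
stats = parse_commandstats(sample)

# usec_per_call is the total usec divided by the number of calls.
average = stats["get"]["usec"] / stats["get"]["calls"]
```

In this sample, `average` equals the reported `usec_per_call`, which illustrates how the two time metrics relate.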
redis_db
- Tags
Tag | Description |
---|---|
db | DB name. |
host | Hostname. |
server | Server addr. |
service_name | Service name. |
- Metrics
Metric | Description |
---|---|
avg_ttl | Average TTL of the keys in this db. Type: int Unit: N/A |
expires | Number of keys with an expiration set. Type: int Unit: N/A |
keys | Number of keys. Type: int Unit: N/A |
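The three values correspond to the per-database lines in the Keyspace section of `INFO`, for example `db0:keys=...,expires=...,avg_ttl=...`. The following sketch shows one way to read them; `parse_keyspace` and the sample numbers are hypothetical and only illustrate the format.

```python
def parse_keyspace(text: str) -> dict:
    """Map db name -> integer metrics from the INFO keyspace section (illustrative only)."""
    dbs = {}
    for line in text.splitlines():
        name, sep, body = line.strip().partition(":")
        if not sep or not name.startswith("db"):
            continue  # skip the "# Keyspace" header and blank lines
        dbs[name] = {
            key: int(value)
            for key, value in (pair.split("=", 1) for pair in body.split(","))
        }
    return dbs


# Hypothetical sample in the INFO keyspace format.
sample = "# Keyspace\r\ndb0:keys=120,expires=30,avg_ttl=45000\r\n"
dbs = parse_keyspace(sample)

# Share of keys in db0 that have an expiration set.
expiring_share = dbs["db0"]["expires"] / dbs["db0"]["keys"]
```

The db name (`db0` here) plays the role of the `db` tag in the table above.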
redis_info
- Tags
Tag | Description |
---|---|
command_type | Command type. |
error_type | Error type. |
host | Hostname. |
maxmemory_policy | The value of the maxmemory-policy configuration directive. |
os | Operating system of the Redis server. |
process_id | Process ID of the Redis server. |
quantile | Histogram quantile. |
redis_build_id | Build ID of the Redis server. |
redis_mode | Mode of the Redis server. |
redis_version | Version of the Redis server. |
role | Value is master if the instance is replica of no one, or slave if the instance is a replica of some master instance. |
run_id | Random value identifying the Redis server (to be used by Sentinel and Cluster). |
server | Server addr. |
service_name | Service name. |
- Metrics
Metric | Description |
---|---|
acl_access_denied_auth | Number of authentication failures. Type: float Unit: count |
acl_access_denied_channel | Number of commands rejected because of access denied to a channel. Type: float Unit: count |
acl_access_denied_cmd | Number of commands rejected because of access denied to the command. Type: float Unit: count |
acl_access_denied_key | Number of commands rejected because of access denied to a key. Type: float Unit: count |
active_defrag_hits | Number of value reallocations performed by the active defragmentation process. Type: float Unit: count |
active_defrag_key_hits | Number of keys that were actively defragmented Type: float Unit: count |
active_defrag_key_misses | Number of keys that were skipped by the active defragmentation process Type: float Unit: count |
active_defrag_misses | Number of aborted value reallocations started by the active defragmentation process Type: float Unit: count |
active_defrag_running | Flag indicating if active defragmentation is active Type: float Unit: bool |
allocator_active | Total bytes in the allocator active pages, this includes external-fragmentation. Type: float Unit: digital,B |
allocator_allocated | Total bytes allocated from the allocator, including internal-fragmentation. Normally the same as used_memory. Type: float Unit: digital,B |
allocator_frag_bytes | Delta between allocator_active and allocator_allocated. See note about mem_fragmentation_bytes. Type: float Unit: digital,B |
allocator_frag_ratio | Ratio between allocator_active and allocator_allocated. This is the true (external) fragmentation metric (not mem_fragmentation_ratio). Type: float Unit: unknown |
allocator_resident | Total bytes resident (RSS) in the allocator, this includes pages that can be released to the OS (by MEMORY PURGE, or just waiting). Type: float Unit: digital,B |
allocator_rss_bytes | Delta between allocator_resident and allocator_active. Type: float Unit: digital,B |
allocator_rss_ratio | Ratio between allocator_resident and allocator_active. This usually indicates pages that the allocator can and probably will soon release back to the OS. Type: float Unit: unknown |
aof_base_size | AOF file size on latest startup or rewrite. Type: float Unit: digital,B |
aof_buffer_length | Size of the AOF buffer Type: float Unit: digital,B |
aof_current_rewrite_time_sec | Duration of the on-going AOF rewrite operation if any. Type: float Unit: time,s |
aof_current_size | AOF current file size Type: float Unit: digital,B |
aof_delayed_fsync | Delayed fsync counter. Type: float Unit: count |
aof_enabled | Flag indicating AOF logging is activated. Type: float Unit: bool |
aof_last_cow_size | The size in bytes of copy-on-write memory during the last AOF rewrite operation. Type: float Unit: digital,B |
aof_last_rewrite_time_sec | Duration of the last AOF rewrite operation in seconds Type: float Unit: time,s |
aof_pending_bio_fsync | Number of fsync pending jobs in background I/O queue. Type: float Unit: count |
aof_pending_rewrite | Flag indicating an AOF rewrite operation will be scheduled once the on-going RDB save is complete. Type: float Unit: bool |
aof_rewrite_buffer_length | Size of the AOF rewrite buffer. Note this field was removed in Redis 7.0. Type: float Unit: digital,B |
aof_rewrite_in_progress | Flag indicating an AOF rewrite operation is on-going. Type: float Unit: bool |
aof_rewrite_scheduled | Flag indicating an AOF rewrite operation will be scheduled once the on-going RDB save is complete. Type: float Unit: bool |
aof_rewrites | Number of AOF rewrites performed since startup. Type: float Unit: count |
arch_bits | Architecture (32 or 64 bits). Type: float Unit: count |
async_loading | Currently loading replication data-set asynchronously while serving old data. This means repl-diskless-load is enabled and set to swapdb. Added in Redis 7.0. Type: float Unit: bool |
blocked_clients | Number of clients pending on a blocking call (BLPOP/BRPOP/BRPOPLPUSH/BLMOVE/BZPOPMIN/BZPOPMAX). Type: float Unit: count |
client_biggest_input_buf | Biggest input buffer among current client connections Type: float Unit: digital,B |
client_longest_output_list | Longest output list among current client connections Type: float Unit: count |
client_recent_max_input_buffer | Biggest input buffer among current client connections. Type: float Unit: count |
client_recent_max_output_buffer | Biggest output buffer among current client connections. Type: float Unit: count |
clients_in_timeout_table | Number of clients in the clients timeout table. Type: float Unit: count |
cluster_connections | An approximation of the number of sockets used by the cluster's bus. Type: float Unit: count |
cluster_enabled | Indicate Redis cluster is enabled. Type: float Unit: bool |
configured_hz | The server's configured frequency setting. Type: float Unit: count |
connected_clients | Number of client connections (excluding connections from replicas) Type: float Unit: count |
connected_slaves | Number of connected replicas Type: float Unit: count |
current_active_defrag_time | The time passed since memory fragmentation last was over the limit, in milliseconds. Type: float Unit: time,ms |
current_cow_peak | The peak size in bytes of copy-on-write memory while a child fork is running. Type: float Unit: digital,B |
current_cow_size | The size in bytes of copy-on-write memory while a child fork is running. Type: float Unit: digital,B |
current_cow_size_age | The age, in seconds, of the current_cow_size value. Type: float Unit: time,s |
current_eviction_exceeded_time | The time passed since used_memory last rose above maxmemory, in milliseconds. Type: float Unit: time,ms |
current_fork_perc | The percentage of progress of the current fork process. For AOF and RDB forks it is the percentage of current_save_keys_processed out of current_save_keys_total. Type: float Unit: percent,percent |
current_save_keys_processed | Number of keys processed by the current save operation. Type: float Unit: count |
current_save_keys_total | Number of keys at the beginning of the current save operation. Type: float Unit: count |
dump_payload_sanitizations | Total number of dump payload deep integrity validations (see sanitize-dump-payload config). Type: float Unit: count |
errorstat | Track of the different errors that occurred within Redis. Type: int Unit: count |
eventloop_cycles | Total number of eventloop cycles. Type: float Unit: count |
eventloop_duration_cmd_sum | Total time spent on executing commands in microseconds. Type: float Unit: time,μs |
eventloop_duration_sum | Total time spent in the eventloop in microseconds (including I/O and command processing). Type: float Unit: time,μs |
evicted_clients | Number of evicted clients due to maxmemory-clients limit. Added in Redis 7.0. Type: float Unit: count |
evicted_keys | Number of evicted keys due to Max-Memory limit Type: float Unit: count |
expire_cycle_cpu_milliseconds | The cumulative amount of time spent on active expiry cycles. Type: float Unit: time,ms |
expired_keys | Total number of key expiration events Type: float Unit: count |
expired_stale_perc | The percentage of keys probably expired. Type: float Unit: percent,percent |
expired_time_cap_reached_count | The count of times that active expiry cycles have stopped early. Type: float Unit: count |
hz | The server's current frequency setting. Type: float Unit: count |
info_latency_ms | The latency of the redis INFO command. Type: float Unit: time,ms |
instantaneous_eventloop_cycles_per_sec | Number of eventloop cycles per second. Type: float Unit: count |
instantaneous_eventloop_duration_usec | Average time spent in a single eventloop cycle in microseconds. Type: float Unit: time,μs |
instantaneous_input_kbps | The network's read rate per second in KB/sec. Type: float Unit: traffic,B/S |
instantaneous_input_repl_kbps | The network's read rate per second in KB/sec for replication purposes. Type: float Unit: traffic,B/S |
instantaneous_ops_per_sec | Number of commands processed per second. Type: float Unit: count |
instantaneous_output_kbps | The network's write rate per second in KB/sec. Type: float Unit: traffic,B/S |
instantaneous_output_repl_kbps | The network's write rate per second in KB/sec for replication purposes. Type: float Unit: traffic,B/S |
io_threaded_reads_processed | Number of read events processed by the main and I/O threads. Type: float Unit: count |
io_threaded_writes_processed | Number of write events processed by the main and I/O threads. Type: float Unit: count |
io_threads_active | Flag indicating if I/O threads are active. Type: float Unit: bool |
keyspace_hits | Number of successful lookups of keys in the main dictionary. Type: float Unit: count |
keyspace_misses | Number of failed lookups of keys in the main dictionary. Type: float Unit: count |
latency_percentiles_usec | Latency percentile distribution statistics based on the command type. Type: float Unit: time,ms |
latest_fork_usec | Duration of the latest fork operation in microseconds Type: float Unit: time,μs |
lazyfree_pending_objects | The number of objects waiting to be freed (as a result of calling UNLINK, or FLUSHDB and FLUSHALL with the ASYNC option). Type: float Unit: count |
lazyfreed_objects | The number of objects that have been lazy freed. Type: float Unit: count |
loading | Flag indicating if the load of a dump file is on-going. Type: float Unit: bool |
loading_eta_seconds | ETA in seconds for the load to be complete Type: float Unit: time,s |
loading_loaded_bytes | Number of bytes already loaded Type: float Unit: digital,B |
loading_loaded_perc | Number of bytes already loaded, expressed as a percentage. Type: float Unit: percent,percent |
loading_rdb_used_mem | The memory usage of the server that had generated the RDB file at the time of the file's creation. Type: float Unit: digital,B |
loading_start_time | Epoch-based timestamp of the start of the load operation. Type: float Unit: timeStamp,sec |
loading_total_bytes | Total file size. Type: float Unit: digital,B |
lru_clock | Clock incrementing every minute, for LRU management. Type: float Unit: time,ms |
master_last_io_seconds_ago | Number of seconds since the last interaction with master Type: float Unit: time,s |
master_link_down_since_seconds | Number of seconds since the link is down. Type: float Unit: time,s |
master_repl_offset | The server's current replication offset Type: float Unit: count |
master_sync_in_progress | Indicate the master is syncing to the replica Type: float Unit: bool |
master_sync_last_io_seconds_ago | Number of seconds since last transfer I/O during a SYNC operation. Type: float Unit: time,s |
master_sync_left_bytes | Number of bytes left before syncing is complete (may be negative when master_sync_total_bytes is 0) Type: float Unit: digital,B |
master_sync_perc | The percentage of master_sync_read_bytes out of master_sync_total_bytes, or an approximation that uses loading_rdb_used_mem when master_sync_total_bytes is 0. Type: float Unit: percent,percent |
master_sync_read_bytes | Number of bytes already transferred. Type: float Unit: digital,B |
master_sync_total_bytes | Total number of bytes that need to be transferred. This may be 0 when the size is unknown (for example, when the repl-diskless-sync configuration directive is used). Type: float Unit: digital,B |
maxclients | The value of the maxclients configuration directive. This is the upper limit for the sum of connected_clients, connected_slaves and cluster_connections. Type: float Unit: count |
maxmemory | The value of the Max Memory configuration directive Type: float Unit: digital,B |
mem_aof_buffer | Transient memory used for AOF and AOF rewrite buffers. Type: float Unit: digital,B |
mem_clients_normal | Memory used by normal clients. Type: float Unit: digital,B |
mem_clients_slaves | Memory used by replica clients. Starting with Redis 7.0, replica buffers share memory with the replication backlog, so this field can show 0 when replicas don't trigger an increase of memory usage. Type: float Unit: digital,B |
mem_cluster_links | Memory used by links to peers on the cluster bus when cluster mode is enabled. Type: float Unit: digital,B |
mem_fragmentation_bytes | Delta between used_memory_rss and used_memory. Note that when the total fragmentation bytes is low (few megabytes), a high ratio (e.g. 1.5 and above) is not an indication of an issue. Type: float Unit: digital,B |
mem_fragmentation_ratio | Ratio between used_memory_rss and used_memory Type: float Unit: unknown |
mem_not_counted_for_evict | Used memory that's not counted for key eviction. This is basically transient replica and AOF buffers. Type: float Unit: digital,B |
mem_replication_backlog | Memory used by replication backlog. Type: float Unit: digital,B |
mem_total_replication_buffers | Total memory consumed for replication buffers. Added in Redis 7.0. Type: float Unit: digital,B |
migrate_cached_sockets | The number of sockets open for MIGRATE purposes. Type: float Unit: count |
min_slaves_good_slaves | Number of replicas currently considered good. Type: float Unit: count |
module_fork_in_progress | Flag indicating a module fork is on-going. Type: float Unit: bool |
module_fork_last_cow_size | The size in bytes of copy-on-write memory during the last module fork operation. Type: float Unit: digital,B |
pubsub_channels | Global number of pub/sub channels with client subscriptions Type: float Unit: count |
pubsub_patterns | Global number of pub/sub patterns with client subscriptions. Type: float Unit: count |
pubsubshard_channels | Global number of pub/sub shard channels with client subscriptions. Added in Redis 7.0.3. Type: float Unit: count |
rdb_bgsave_in_progress | Flag indicating an RDB save is on-going. Type: float Unit: bool |
rdb_changes_since_last_save | Number of operations that produced some kind of change in the dataset since the last time either SAVE or BGSAVE was called. Type: float Unit: count |
rdb_current_bgsave_time_sec | Duration of the on-going RDB save operation if any. Type: float Unit: time,s |
rdb_last_bgsave_time_sec | Duration of the last RDB save operation in seconds Type: float Unit: time,s |
rdb_last_cow_size | The size in bytes of copy-on-write memory during the last RDB save operation. Type: float Unit: digital,B |
rdb_last_load_keys_expired | Number of volatile keys deleted during the last RDB loading. Added in Redis 7.0. Type: float Unit: count |
rdb_last_load_keys_loaded | Number of keys loaded during the last RDB loading. Added in Redis 7.0. Type: float Unit: count |
rdb_last_save_time | Epoch-based timestamp of last successful RDB save. Type: float Unit: timeStamp,sec |
rdb_saves | Number of RDB snapshots performed since startup. Type: float Unit: count |
rejected_connections | Number of connections rejected because of Max-Clients limit Type: float Unit: count |
repl_backlog_active | Flag indicating replication backlog is active. Type: float Unit: bool |
repl_backlog_first_byte_offset | The master offset of the replication backlog buffer. Type: float Unit: count |
repl_backlog_histlen | Size in bytes of the data in the replication backlog buffer Type: float Unit: digital,B |
repl_backlog_size | Total size in bytes of the replication backlog buffer. Type: float Unit: digital,B |
replica_announced | Flag indicating if the replica is announced by Sentinel. Type: float Unit: count |
rss_overhead_bytes | Delta between used_memory_rss (the process RSS) and allocator_resident. Type: float Unit: digital,B |
rss_overhead_ratio | Ratio between used_memory_rss (the process RSS) and allocator_resident. This includes RSS overheads that are not allocator or heap related. Type: float Unit: unknown |
second_repl_offset | The offset up to which replication IDs are accepted. Type: float Unit: count |
server_time_usec | Epoch-based system time with microsecond precision. Type: float Unit: time,ms |
shutdown_in_milliseconds | The maximum time remaining for replicas to catch up the replication before completing the shutdown sequence. This field is only present during shutdown. Type: float Unit: time,ms |
slave_expires_tracked_keys | The number of keys tracked for expiry purposes (applicable only to writable replicas). Type: float Unit: count |
slave_priority | The priority of the instance as a candidate for failover. Type: float Unit: count |
slave_read_only | Flag indicating if the replica is read-only. Type: float Unit: count |
slave_read_repl_offset | The read replication offset of the replica instance. Type: float Unit: count |
slave_repl_offset | The replication offset of the replica instance Type: float Unit: count |
stat_reply_buffer_expands | Total number of output buffer expands. Type: float Unit: count |
stat_reply_buffer_shrinks | Total number of output buffer shrinks. Type: float Unit: count |
sync_full | The number of full resyncs with replicas. Type: float Unit: count |
sync_partial_err | The number of denied partial resync requests. Type: float Unit: count |
sync_partial_ok | The number of accepted partial resync requests. Type: float Unit: count |
tcp_port | TCP/IP listen port. Type: float Unit: time,ms |
total_active_defrag_time | Total time memory fragmentation was over the limit, in milliseconds. Type: float Unit: time,ms |
total_blocking_keys | Number of blocking keys. Type: float Unit: count |
total_blocking_keys_on_nokey | Number of blocking keys on which one or more clients would like to be unblocked when the key is deleted. Type: float Unit: count |
total_commands_processed | Total number of commands processed by the server. Type: float Unit: count |
total_connections_received | Total number of connections accepted by the server. Type: float Unit: count |
total_error_replies | Total number of issued error replies, that is the sum of rejected commands (errors prior to command execution) and failed commands (errors within the command execution). Type: float Unit: count |
total_eviction_exceeded_time | Total time used_memory was greater than maxmemory since server startup, in milliseconds. Type: float Unit: time,ms |
total_forks | Total number of fork operations since the server start. Type: float Unit: count |
total_net_input_bytes | The total number of bytes read from the network. Type: float Unit: digital,B |
total_net_output_bytes | The total number of bytes written to the network. Type: float Unit: digital,B |
total_net_repl_input_bytes | The total number of bytes read from the network for replication purposes. Type: float Unit: digital,B |
total_net_repl_output_bytes | The total number of bytes written to the network for replication purposes. Type: float Unit: digital,B |
total_reads_processed | Total number of read events processed. Type: float Unit: count |
total_system_memory | The total amount of memory that the Redis host has. Type: float Unit: digital,B |
total_writes_processed | Total number of write events processed. Type: float Unit: count |
tracking_clients | Number of clients being tracked (CLIENT TRACKING). Type: float Unit: count |
tracking_total_items | Number of items, that is the sum of clients number for each key, that are being tracked. Type: float Unit: count |
tracking_total_keys | Number of keys being tracked by the server. Type: float Unit: count |
tracking_total_prefixes | Number of tracked prefixes in server's prefix table (only applicable for broadcast mode). Type: float Unit: count |
unexpected_error_replies | Number of unexpected error replies, that are types of errors from an AOF load or replication. Type: float Unit: count |
uptime_in_days | Number of days since Redis server start. Type: float Unit: time,d |
uptime_in_seconds | Number of seconds since Redis server start. Type: float Unit: time,s |
used_cpu_sys | System CPU consumed by the Redis server, which is the sum of system CPU consumed by all threads of the server process (main thread and background threads). Type: float Unit: time,s |
used_cpu_sys_children | System CPU consumed by the background processes. Type: float Unit: time,s |
used_cpu_sys_main_thread | System CPU consumed by the Redis server main thread. Type: float Unit: time,s |
used_cpu_sys_percent | System CPU percentage consumed by the Redis server, which is the sum of system CPU consumed by all threads of the server process (main thread and background threads). Type: float Unit: percent,percent |
used_cpu_user | User CPU consumed by the Redis server, which is the sum of user CPU consumed by all threads of the server process (main thread and background threads). Type: float Unit: time,s |
used_cpu_user_children | User CPU consumed by the background processes. Type: float Unit: time,s |
used_cpu_user_main_thread | User CPU consumed by the Redis server main thread. Type: float Unit: time,s |
used_cpu_user_percent | User CPU percentage consumed by the Redis server, which is the sum of user CPU consumed by all threads of the server process (main thread and background threads). Type: float Unit: percent,percent |
used_memory | Total number of bytes allocated by Redis using its allocator (either standard libc, jemalloc, or an alternative allocator such as tcmalloc). Type: float Unit: digital,B |
used_memory_dataset | The size in bytes of the dataset (used_memory_overhead subtracted from used_memory). Type: float Unit: digital,B |
used_memory_dataset_perc | The percentage of used_memory_dataset out of the net memory usage (used_memory minus used_memory_startup). Type: float Unit: percent,percent |
used_memory_lua | Number of bytes used by the Lua engine. Type: float Unit: digital,B |
used_memory_overhead | The sum in bytes of all overheads that the server allocated for managing its internal data structures. Type: float Unit: digital,B |
used_memory_peak | Peak memory consumed by Redis (in bytes). Type: float Unit: digital,B |
used_memory_peak_perc | The percentage of used_memory out of used_memory_peak. Type: float Unit: percent,percent |
used_memory_rss | Number of bytes that Redis allocated as seen by the operating system (a.k.a. resident set size). Type: float Unit: digital,B |
used_memory_scripts | Number of bytes used by cached Lua scripts. Type: float Unit: digital,B |
used_memory_startup | Initial amount of memory consumed by Redis at startup, in bytes. Type: float Unit: digital,B |
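The derived memory metrics follow from the definitions in the table: `used_memory_dataset` is `used_memory` minus `used_memory_overhead`, and `used_memory_dataset_perc` is that value as a percentage of `used_memory` minus `used_memory_startup`. A minimal sketch of the arithmetic; the helper name `dataset_metrics` and the sample numbers are illustrative assumptions, not part of the collector:

```python
def dataset_metrics(used_memory, used_memory_overhead, used_memory_startup):
    """Return (used_memory_dataset, used_memory_dataset_perc) per the table's definitions."""
    # Dataset size: total allocation minus internal bookkeeping overhead.
    dataset = used_memory - used_memory_overhead
    # Net memory usage: total allocation minus what Redis used at startup.
    net_usage = used_memory - used_memory_startup
    # Guard against division by zero on a freshly started, empty instance.
    perc = 100.0 * dataset / net_usage if net_usage > 0 else 0.0
    return dataset, perc


# Illustrative values only (bytes).
dataset, perc = dataset_metrics(
    used_memory=1_000_000,
    used_memory_overhead=300_000,
    used_memory_startup=200_000,
)
print(dataset, perc)
```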
redis_replica¶
- Tags
Tag | Description |
---|---|
host | Hostname. |
master_addr | Master addr, only collected for slave redis. |
server | Server addr. |
service_name | Service name. |
slave_addr | Slave addr, only collected for master redis. |
slave_id | Slave ID, only collected for master redis. |
slave_state | Slave state, only collected for master redis. |
- Metrics
Metric | Description |
---|---|
master_link_down_since_seconds | Number of seconds since the link between master and replica went down, reported only while the link is down; only collected for slave redis. Type: int Unit: N/A |
master_link_status | Status of the link (up/down): 1 for up, 0 for down; only collected for slave redis. Type: int Unit: N/A |
master_repl_offset | The server's current replication offset. Type: int Unit: N/A |
slave_lag | Slave lag, only collected for master redis. Type: int Unit: N/A |
slave_offset | Slave offset, only collected for master redis. Type: int Unit: N/A |
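One common way to use these fields is to estimate how far a replica is behind the master in bytes, by subtracting `slave_offset` from `master_repl_offset` as reported on the master. A sketch under that assumption; `replica_byte_lag` is a hypothetical helper and the offsets are made-up values:

```python
def replica_byte_lag(master_repl_offset, slave_offset):
    """Bytes of the replication stream the replica has not yet acknowledged."""
    # Clamp at zero: the two offsets may be sampled at slightly different moments.
    return max(0, master_repl_offset - slave_offset)


# Illustrative values only.
lag = replica_byte_lag(master_repl_offset=1_500, slave_offset=1_200)
print(lag)
```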
collector¶
- Tags
Tag | Description |
---|---|
instance | Server addr of the instance |
job | Server name of the instance |
- Metrics
Metric | Description |
---|---|
up | Type: int Unit: - |
Custom Object¶
database¶
- Tags
Tag | Description |
---|---|
col_co_status | Current status of the collector on the instance (OK/NotOK) |
host | Host address (domain name) used for the connection |
ip | Connection IP of the instance |
name | Object unique ID |
reason | If the status is not OK, the reason for it |
- Metrics
Metric | Description |
---|---|
display_name | Displayed name in UI. Type: string Unit: N/A |
uptime | Current instance uptime. Type: int Unit: time,s |
version | Current version of the instance. Type: string Unit: N/A |
Logging¶
redis_bigkey¶
- Tags
Tag | Description |
---|---|
db_name | DB name. |
host | Hostname. |
key | Key name. |
key_type | Key type. |
server | Server addr. |
service_name | Service name. |
- Metrics
Metric | Description |
---|---|
keys_sampled | Sampled keys in the key space. Type: int Unit: count |
value_length | Key length. Type: int Unit: digital,B |
redis_hotkey¶
- Tags
Tag | Description |
---|---|
db_name | DB name. |
host | Hostname. |
key | Key name. |
server | Server addr. |
service_name | Service name. |
- Metrics
Metric | Description |
---|---|
key_count | Key count times. Type: int Unit: count |
keys_sampled | Sampled keys in the key space. Type: int Unit: count |
redis_latency¶
- Tags
Tag | Description |
---|---|
server | Server addr |
service_name | Service name |
- Metrics
Metric | Description |
---|---|
cost_time | Latest event latency in milliseconds. Type: int Unit: time,ms |
event_name | Event name. Type: string Unit: N/A |
max_cost_time | All-time maximum latency for this event. Type: int Unit: time,ms |
occur_time | Unix timestamp of the latest latency spike for the event. Type: int Unit: timeStamp,sec |
redis_slowlog¶
Redis slow query history logging
- Tags
Tag | Description |
---|---|
host | Hostname |
message | Log message |
server | Server addr |
service_name | Service name |
- Metrics
Metric | Description |
---|---|
client_addr | The client ip:port that ran the slow query. Type: string Unit: N/A |
client_name | The name of the client that ran the slow query (if CLIENT SETNAME was executed on the client side). Type: string Unit: N/A |
command | Slow command. Type: string Unit: N/A |
slowlog_95percentile | 95th percentile duration of slow queries. Type: int Unit: time,μs |
slowlog_avg | Average duration of slow queries. Type: float Unit: time,μs |
slowlog_id | Slow log unique ID. Type: int Unit: N/A |
slowlog_max | Maximum duration of slow queries. Type: int Unit: time,μs |
slowlog_median | Median duration of slow queries. Type: int Unit: time,μs |
slowlog_micros | Execution time of the slow query. Type: int Unit: time,μs |
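To make the relationship between `slowlog_micros` and the aggregate fields concrete, the sketch below summarises a batch of per-entry durations. It assumes the nearest-rank method for the 95th percentile; how the collector actually computes its percentile is not stated here, so treat `slowlog_summary` and its sample input as illustrative only:

```python
import math
import statistics


def slowlog_summary(durations_us):
    """Summarise slow-log durations given in microseconds."""
    if not durations_us:
        raise ValueError("no slow-log entries to summarise")
    ordered = sorted(durations_us)
    # Nearest-rank percentile (an assumption): smallest value such that
    # at least 95% of the samples are less than or equal to it.
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank
    return {
        "slowlog_avg": statistics.fmean(ordered),
        "slowlog_median": statistics.median(ordered),
        "slowlog_max": ordered[-1],
        "slowlog_95percentile": ordered[rank - 1],
    }


# Illustrative durations only (microseconds).
summary = slowlog_summary([12_000, 15_000, 11_000, 90_000, 13_000])
print(summary)
```

With only a handful of entries the nearest-rank 95th percentile coincides with the maximum, which is worth keeping in mind when reading these fields on a lightly loaded instance.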
Logging Pipeline¶
The original log is:
The extracted fields are as follows:
Field Name | Field Value | Description |
---|---|---|
pid | 122 | process id |
role | M | role |
serverity | * | service |
statu | notice | log level |
msg | Background saving terminated with success | log content |
time | 1557861100164000000 | Nanosecond timestamp (as line protocol time) |
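The field extraction can be illustrated outside DataKit with a small parser. This is a sketch, not the collector's Pipeline script: it assumes the standard Redis log layout (`pid:role day month year time level message`), assumes the timestamp is in UTC, and uses a sample line reconstructed from the field values in the table. The function name `parse_redis_log` is hypothetical; the output keys mirror the field names exactly as listed above.

```python
import re
from datetime import datetime, timezone

# Assumed layout: "<pid>:<role> <dd Mon yyyy HH:MM:SS.mmm> <level> <message>"
LINE_RE = re.compile(
    r"^(?P<pid>\d+):(?P<role>[XCSM]) "
    r"(?P<date>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[.\-*#]) (?P<msg>.*)$"
)

# Redis log level symbols and their names.
LEVELS = {".": "debug", "-": "verbose", "*": "notice", "#": "warning"}


def parse_redis_log(line):
    """Split one Redis log line into the fields listed in the table, or return None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Assumption: the log timestamp is UTC.
    ts = datetime.strptime(m["date"], "%d %b %Y %H:%M:%S.%f").replace(tzinfo=timezone.utc)
    # Build the nanosecond value from integer parts to avoid float rounding.
    time_ns = int(ts.timestamp()) * 1_000_000_000 + ts.microsecond * 1_000
    return {
        "pid": int(m["pid"]),
        "role": m["role"],
        "serverity": m["level"],
        "statu": LEVELS[m["level"]],
        "msg": m["msg"],
        "time": time_ns,
    }


# Sample line reconstructed from the table's values (an assumption).
sample = "122:M 14 May 2019 19:11:40.164 * Background saving terminated with success"
fields = parse_redis_log(sample)
print(fields)
```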