Redis


The Redis metrics collector gathers the following data:

  • AOF data persistence metrics (collected when AOF is enabled)
  • RDB data persistence metrics
  • Slow Log monitoring metrics
  • Big Key scan monitoring
  • Master-slave replication

Configuration

Tested versions:

  • 7.0.11
  • 6.2.12
  • 6.0.8
  • 5.0.14
  • 4.0.14

Precondition

  • Redis version v5.0+

In a master-slave (replication) architecture, configure the collector with the host of either a slave node or the master node; the replication metrics collected differ depending on the role of the configured node.

Create Monitor User (optional)

On Redis 6.0+, open the redis-cli command line, then create the user and grant permissions:

ACL SETUSER username >password
ACL SETUSER username on +@dangerous +ping
  • To collect hotkey/bigkey statistics, open the redis-cli command line and grant the following permissions:
CONFIG SET maxmemory-policy allkeys-lfu
ACL SETUSER username on +get +@read +@connection +@keyspace ~*
  • Collecting hotkey and bigkey data from a remote Redis requires redis-cli to be installed on the DataKit host (not needed when collecting locally):
# ubuntu 
apt-get install redis-tools

# centos
yum install -y  redis

Collector Configuration

Go to the conf.d/db directory under the DataKit installation directory, copy redis.conf.sample, and name the copy redis.conf. An example follows:

[[inputs.redis]]
  host = "localhost"
  port = 6379

  ## TLS connection config; redis-cli version must be 6.0 or later
  ## These tls configuration files should be the same as the ones used on the server. 
  ## See also: https://redis.io/docs/latest/operate/oss_and_stack/management/security/encryption/
  # insecure_skip_verify = true
  # ca_certs = ["/opt/tls/ca.crt"]
  # cert = "/opt/tls/redis.crt"
  # cert_key = "/opt/tls/redis.key"
  ## The file contents can also be provided base64-encoded:
  # ca_certs_base64 = ["LONG_STRING......"]
  # cert_base64 = "LONG_STRING......"
  # cert_key_base64 = "LONG_STRING......"
  # server_name = "your-SNI-name"

  # unix_socket_path = "/var/run/redis/redis.sock"
  ## Configure one or more dbs; the configured dbs are added to the collection list.
  ## If dbs=[] or dbs is not configured, all non-empty dbs in Redis are collected.
  # dbs=[0]
  # username = "<USERNAME>"
  # password = "<PASSWORD>"

  ## @param connect_timeout - number - optional - default: 10s
  # connect_timeout = "10s"

  ## @param service - string - optional
  service = "redis"

  ## @param interval - number - optional - default: 15
  interval = "15s"

  ## @param redis_cli_path - string - optional - default: "redis-cli"
  ## If you want to use a custom redis-cli path for bigkey or hotkey, set this to the path of the redis-cli binary.
  # redis_cli_path = "/usr/bin/redis-cli"

  ## @param hotkey - boolean - optional - default: false
  ## To collect hotkey data, set this to true
  # hotkey = false

  ## @param bigkey - boolean - optional - default: false
  ## To collect bigkey data, set this to true
  # bigkey = false

  ## @param key_interval - number - optional - default: 5m
  ## Interval for collecting hotkey & bigkey data
  # key_interval = "5m"

  ## @param key_timeout - number - optional - default: 5m
  ## Timeout for collecting hotkey & bigkey data
  # key_timeout = "5m"

  ## @param key_scan_sleep - string - optional - default: "0.1"
  ## Means sleep 0.1 sec per 100 SCAN commands
  # key_scan_sleep = "0.1"

  ## @param keys - list of strings - optional
  ## The length is 1 for strings.
  ## The length is zero for keys that have a type other than list, set, hash, or sorted set.
  #
  # keys = ["KEY_1", "KEY_PATTERN"]

  ## @param warn_on_missing_keys - boolean - optional - default: true
  ## If you provide a list of 'keys', set this to true to have the Agent log a warning
  ## when keys are missing.
  #
  # warn_on_missing_keys = true

  ## @param slow_log - boolean - optional - default: true
  slow_log = true

  ## @param all_slow_log - boolean - optional - default: false
  ## Collect all slowlogs returned by Redis. When set to false, will only collect slowlog
  ## that are generated after this input starts, and collect the same slowlog only once.
  all_slow_log = false

  ## @param slowlog-max-len - integer - optional - default: 128
  slowlog-max-len = 128

  ## @param command_stats - boolean - optional - default: false
  ## Collect INFO COMMANDSTATS output as metrics.
  # command_stats = false

  ## @param latency_percentiles - boolean - optional - default: false
  ## Collect INFO LATENCYSTATS output as metrics.
  # latency_percentiles = false

  ## Set true to enable election
  election = true

  # [inputs.redis.log]
  # #required, glob logfiles
  # files = ["/var/log/redis/*.log"]

  ## glob filter
  #ignore = [""]

  ## grok pipeline script path
  #pipeline = "redis.p"

  ## optional encodings:
  ##    "utf-8", "utf-16le", "gbk", "gb18030" or ""
  #character_encoding = ""

  ## The pattern should be a regexp. Note the use of '''this regexp'''
  ## regexp link: https://golang.org/pkg/regexp/syntax/#hdr-Syntax
  #match = '''^\S.*'''

  [inputs.redis.tags]
  # some_tag = "some_value"
  # more_tag = "some_other_value"

After configuration, restart DataKit.
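
The all_slow_log comment in the sample describes collecting each slow log entry only once and, when the option is false, only entries generated after the input starts. The Python sketch below illustrates that filtering idea under stated assumptions; it is not DataKit's actual implementation. The SlowlogTracker class and the dict-shaped entries are hypothetical. The only Redis behavior relied on is that each SLOWLOG GET entry carries a unique, increasing id and a Unix timestamp.

```python
import time


class SlowlogTracker:
    """Hypothetical helper: remembers the highest slow log id already seen."""

    def __init__(self, all_slow_log=False, start_time=None):
        self.all_slow_log = all_slow_log
        # Assumed meaning of "after this input starts": compare entry timestamps
        # with the moment the tracker was created.
        self.start_time = time.time() if start_time is None else start_time
        self.last_id = -1

    def filter_new(self, entries):
        """Return entries not reported before, oldest first."""
        fresh = []
        for entry in sorted(entries, key=lambda e: e["id"]):
            if entry["id"] <= self.last_id:
                continue  # already reported in an earlier round
            self.last_id = entry["id"]
            if not self.all_slow_log and entry["timestamp"] < self.start_time:
                continue  # generated before the collector started
            fresh.append(entry)
        return fresh


tracker = SlowlogTracker(all_slow_log=False, start_time=100)
first_round = tracker.filter_new([
    {"id": 1, "timestamp": 90, "duration_us": 12000},
    {"id": 2, "timestamp": 110, "duration_us": 15000},
])
second_round = tracker.filter_new([
    {"id": 2, "timestamp": 110, "duration_us": 15000},
    {"id": 3, "timestamp": 120, "duration_us": 11000},
])
```

A real collector would also have to handle ids restarting from zero after a server restart or SLOWLOG RESET, which this sketch ignores.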


Note

For Alibaba Cloud Redis with both a username and a password set, set <PASSWORD> to your-user:your-password, for example datakit:Pa55W0rd.

Log Collection Configuration

To collect Redis logs, enable log file output in the Redis configuration (redis.config), then set the log path in the collector configuration:

[inputs.redis.log]
    # Log path needs to be filled with absolute path
    files = ["/var/log/redis/*.log"]
Note

When configuring log collection, DataKit must be installed on the same host as the Redis service, or the log directory must otherwise be mounted on the DataKit machine.

In K8s, Redis logs can be exposed to stdout, and DataKit can automatically find its corresponding log.

Metrics

For all of the following data collections, the global election tags are added automatically. Extra tags can be added in [inputs.redis.tags] if needed:

 [inputs.redis.tags]
  # some_tag = "some_value"
  # more_tag = "some_other_value"
  # ...

redis_client

  • Tags
Tag Description
addr Address without port of the client
host Hostname
name The name set by the client with CLIENT SETNAME, default unknown
server Server addr
service_name Service name
  • Metrics
Metric Description
age Total duration of the connection in seconds
Type: float
Unit: time,s
argv_mem Incomplete arguments for the next command (already extracted from query buffer).
Type: float
Unit: count
db Current database ID.
Type: float
Unit: count
fd File descriptor corresponding to the socket.
Type: float
Unit: count
id Unique 64-bit client ID.
Type: float
Unit: count
idle Idle time of the connection in seconds
Type: float
Unit: time,s
multi Number of commands in a MULTI/EXEC context.
Type: float
Unit: count
multi_mem Memory used by buffered multi commands. Added in Redis 7.0.
Type: float
Unit: count
obl Output buffer length.
Type: float
Unit: count
oll Output list length (replies are queued in this list when the buffer is full).
Type: float
Unit: count
omem Output buffer memory usage.
Type: float
Unit: count
psub Number of pattern matching subscriptions
Type: float
Unit: count
qbuf Query buffer length (0 means no query pending).
Type: float
Unit: count
qbuf_free Free space of the query buffer (0 means the buffer is full).
Type: float
Unit: count
redir Client id of current client tracking redirection.
Type: float
Unit: count
resp Client RESP protocol version. Added in Redis 7.0.
Type: float
Unit: count
ssub Number of shard channel subscriptions. Added in Redis 7.0.3.
Type: float
Unit: count
sub Number of channel subscriptions
Type: float
Unit: count
tot_mem Total memory consumed by this client in its various buffers.
Type: float
Unit: count
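
The fields above correspond to the space-separated key=value pairs that Redis prints for each connection in CLIENT LIST. The Python sketch below shows how one such line could be split into fields and how the addr and name tags could be derived. The function names are hypothetical, and the hyphen-to-underscore mapping (for example qbuf-free to qbuf_free) is an assumption inferred from the metric names listed here, not taken from the collector's source.

```python
def parse_client_line(line: str) -> dict:
    """Split one CLIENT LIST line into a {field: value} dict of strings."""
    fields = {}
    for token in line.split():
        key, _, value = token.partition("=")
        # Assumed mapping: CLIENT LIST uses hyphens, the metric names use underscores.
        fields[key.replace("-", "_")] = value
    return fields


def client_tags(fields: dict) -> dict:
    """Build the addr (without port) and name tags described in the table."""
    # rpartition splits on the last colon only, so the port is dropped cleanly.
    host, _, _port = fields.get("addr", "").rpartition(":")
    return {"addr": host, "name": fields.get("name") or "unknown"}


sample = ("id=3 addr=127.0.0.1:52555 fd=8 name= age=855 idle=0 db=0 "
          "sub=0 psub=0 multi=-1 qbuf=26 qbuf-free=20448 obl=0 oll=0 omem=0")
fields = parse_client_line(sample)
tags = client_tags(fields)
```

In this sample the client never called CLIENT SETNAME, so the name field is empty and the tag falls back to unknown.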

redis_cluster

  • Tags
Tag Description
host Hostname
server_addr Server addr
service_name Service name
  • Metrics
Metric Description
cluster_current_epoch The local Current Epoch variable. This is used in order to create unique increasing version numbers during failovers.
Type: int
Unit: N/A
cluster_known_nodes The total number of known nodes in the cluster, including nodes in HANDSHAKE state that may not currently be proper members of the cluster.
Type: int
Unit: count
cluster_my_epoch The Config Epoch of the node we are talking with. This is the current configuration version assigned to this node.
Type: int
Unit: N/A
cluster_size The number of master nodes serving at least one hash slot in the cluster.
Type: int
Unit: count
cluster_slots_assigned Number of slots which are associated to some node (not unbound). This number should be 16384 for the node to work properly, which means that each hash slot should be mapped to a node.
Type: int
Unit: count
cluster_slots_fail Number of hash slots mapping to a node in FAIL state. If this number is not zero the node is not able to serve queries unless cluster-require-full-coverage is set to no in the configuration.
Type: int
Unit: count
cluster_slots_ok Number of hash slots mapping to a node not in FAIL or PFAIL state.
Type: int
Unit: count
cluster_slots_pfail Number of hash slots mapping to a node in PFAIL state. Note that those hash slots still work correctly, as long as the PFAIL state is not promoted to FAIL by the failure detection algorithm. PFAIL only means that we are currently not able to talk with the node, but may be just a transient error.
Type: int
Unit: count
cluster_state State is 1(ok) if the node is able to receive queries. 0(fail) if there is at least one hash slot which is unbound (no node associated), in error state (node serving it is flagged with FAIL flag), or if the majority of masters can't be reached by this node.
Type: int
Unit: enum
cluster_stats_messages_auth_ack_received Message indicating a vote during leader election.
Type: int
Unit: count
cluster_stats_messages_auth_ack_sent Message indicating a vote during leader election.
Type: int
Unit: count
cluster_stats_messages_auth_req_received Replica initiated leader election to replace its master.
Type: int
Unit: count
cluster_stats_messages_auth_req_sent Replica initiated leader election to replace its master.
Type: int
Unit: count
cluster_stats_messages_fail_received Received messages marking node xxx as failing.
Type: int
Unit: count
cluster_stats_messages_fail_sent Sent messages marking node xxx as failing.
Type: int
Unit: count
cluster_stats_messages_meet_received Handshake message received from a new node, either through gossip or CLUSTER MEET.
Type: int
Unit: count
cluster_stats_messages_meet_sent Handshake message sent to a new node, either through gossip or CLUSTER MEET.
Type: int
Unit: count
cluster_stats_messages_mfstart_received Pause clients for manual failover.
Type: int
Unit: count
cluster_stats_messages_mfstart_sent Pause clients for manual failover.
Type: int
Unit: count
cluster_stats_messages_module_received Module cluster API message.
Type: int
Unit: count
cluster_stats_messages_module_sent Module cluster API message.
Type: int
Unit: count
cluster_stats_messages_ping_received Cluster bus received PING (not to be confused with the client command PING).
Type: int
Unit: count
cluster_stats_messages_ping_sent Cluster bus sent PING (not to be confused with the client command PING).
Type: int
Unit: count
cluster_stats_messages_pong_received PONG received (reply to PING).
Type: int
Unit: count
cluster_stats_messages_pong_sent PONG sent (reply to PING).
Type: int
Unit: count
cluster_stats_messages_publish_received Pub/Sub Publish propagation received.
Type: int
Unit: count
cluster_stats_messages_publish_sent Pub/Sub Publish propagation sent.
Type: int
Unit: count
cluster_stats_messages_publishshard_received Pub/Sub Publish shard propagation, see Sharded Pubsub.
Type: int
Unit: count
cluster_stats_messages_publishshard_sent Pub/Sub Publish shard propagation, see Sharded Pubsub.
Type: int
Unit: count
cluster_stats_messages_received Number of messages received via the cluster node-to-node binary bus.
Type: int
Unit: count
cluster_stats_messages_sent Number of messages sent via the cluster node-to-node binary bus.
Type: int
Unit: count
cluster_stats_messages_update_received Another node's slots configuration.
Type: int
Unit: count
cluster_stats_messages_update_sent Another node's slots configuration.
Type: int
Unit: count
total_cluster_links_buffer_limit_exceeded Accumulated count of cluster links freed due to exceeding the cluster-link-sendbuf-limit configuration.
Type: int
Unit: count
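
The slot and state metrics above combine naturally into a basic health check. The Python sketch below parses CLUSTER INFO style text (one key:value pair per line) and lists the conditions the descriptions flag as problems. The function names are hypothetical, and encoding cluster_state as 1 for ok and 0 otherwise follows the description in the table rather than the collector's source.

```python
TOTAL_SLOTS = 16384  # every hash slot should be assigned to some node


def parse_cluster_info(text: str) -> dict:
    """Parse 'key:value' lines; cluster_state becomes 1 (ok) or 0 (fail)."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        if key == "cluster_state":
            info[key] = 1 if value == "ok" else 0
        elif value.isdigit():
            info[key] = int(value)
    return info


def cluster_problems(info: dict) -> list:
    """Return human-readable findings; an empty list means nothing was flagged."""
    problems = []
    if info.get("cluster_state") != 1:
        problems.append("cluster_state is not ok")
    if info.get("cluster_slots_assigned", 0) != TOTAL_SLOTS:
        problems.append("not all hash slots are assigned")
    if info.get("cluster_slots_fail", 0) > 0:
        problems.append("some slots map to a node in FAIL state")
    if info.get("cluster_slots_pfail", 0) > 0:
        problems.append("some slots map to a node in PFAIL state (may be transient)")
    return problems


sample = """cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3"""
info = parse_cluster_info(sample)
findings = cluster_problems(info)
```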

redis_command_stat

  • Tags
Tag Description
host Hostname
method Command type
server Server addr
service_name Service name
  • Metrics
Metric Description
calls The number of calls that reached command execution.
Type: float
Unit: count
failed_calls The number of failed calls (errors within the command execution).
Type: float
Unit: count
rejected_calls The number of rejected calls (errors prior to command execution).
Type: float
Unit: count
usec The total CPU time consumed by these commands.
Type: float
Unit: time,μs
usec_per_call The average CPU consumed per command execution.
Type: float
Unit: time,μs
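
These metrics mirror the fields of the INFO COMMANDSTATS output, where each line has the form cmdstat_<name>:calls=...,usec=...,usec_per_call=.... The Python sketch below parses one such line and recomputes the average from the totals. The function name is hypothetical, and treating the part after the cmdstat_ prefix as the method tag is an assumption.

```python
def parse_commandstat(line: str):
    """Parse one 'cmdstat_<name>:k=v,...' line into (method, stats)."""
    name, _, payload = line.strip().partition(":")
    prefix = "cmdstat_"
    # Assumed tag value: the command name without the cmdstat_ prefix.
    method = name[len(prefix):] if name.startswith(prefix) else name
    stats = {}
    for pair in payload.split(","):
        key, _, value = pair.partition("=")
        stats[key] = float(value)
    return method, stats


sample = ("cmdstat_get:calls=21,usec=175,usec_per_call=8.33,"
          "rejected_calls=0,failed_calls=0")
method, stats = parse_commandstat(sample)
# usec is the total CPU time, so dividing by calls gives the per-call average.
average = stats["usec"] / stats["calls"]
```

The recomputed average agrees with the reported usec_per_call up to rounding, which shows how the two time metrics relate.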

redis_db

  • Tags
Tag Description
db DB name.
host Hostname.
server Server addr.
service_name Service name.
  • Metrics
Metric Description
avg_ttl Average TTL of keys that have an expiration set.
Type: int
Unit: N/A
expires Number of keys with an expiration set.
Type: int
Unit: N/A
keys Number of keys in the database.
Type: int
Unit: N/A
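
These three values come from the Keyspace section of INFO, where each database is reported on a line such as db0:keys=10,expires=2,avg_ttl=3600000. The Python sketch below parses such a line and derives the share of keys that carry an expiration. The function names and the sample numbers are illustrative assumptions.

```python
def parse_keyspace_line(line: str):
    """Parse 'dbN:keys=..,expires=..,avg_ttl=..' into (db, metrics)."""
    db, _, payload = line.strip().partition(":")
    metrics = {}
    for pair in payload.split(","):
        key, _, value = pair.partition("=")
        metrics[key] = int(value)
    return db, metrics


def expiring_share(metrics: dict) -> float:
    """Fraction of keys in the database that have an expiration set."""
    keys = metrics.get("keys", 0)
    return metrics.get("expires", 0) / keys if keys else 0.0


db, metrics = parse_keyspace_line("db0:keys=10,expires=2,avg_ttl=3600000")
share = expiring_share(metrics)
```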

redis_info

  • Tags
Tag Description
command_type Command type.
error_type Error type.
host Hostname.
maxmemory_policy The value of the maxmemory-policy configuration directive.
os Operating system of the Redis server.
process_id Process ID of the Redis server.
quantile Histogram quantile.
redis_build_id Build ID of the Redis server.
redis_mode Mode of the Redis server.
redis_version Version of the Redis server.
role Value is master if the instance is replica of no one, or slave if the instance is a replica of some master instance.
run_id Random value identifying the Redis server (to be used by Sentinel and Cluster).
server Server addr.
service_name Service name.
  • Metrics
Metric Description
acl_access_denied_auth Number of authentication failures.
Type: float
Unit: count
acl_access_denied_channel Number of commands rejected because of access denied to a channel.
Type: float
Unit: count
acl_access_denied_cmd Number of commands rejected because of access denied to the command.
Type: float
Unit: count
acl_access_denied_key Number of commands rejected because of access denied to a key.
Type: float
Unit: count
active_defrag_hits Number of value reallocations performed by the active defragmentation process
Type: float
Unit: count
active_defrag_key_hits Number of keys that were actively defragmented
Type: float
Unit: count
active_defrag_key_misses Number of keys that were skipped by the active defragmentation process
Type: float
Unit: count
active_defrag_misses Number of aborted value reallocations started by the active defragmentation process
Type: float
Unit: count
active_defrag_running Flag indicating if active defragmentation is active
Type: float
Unit: bool
allocator_active Total bytes in the allocator active pages; this includes external fragmentation.
Type: float
Unit: digital,B
allocator_allocated Total bytes allocated from the allocator, including internal fragmentation. Normally the same as used_memory.
Type: float
Unit: digital,B
allocator_frag_bytes Delta between allocator_active and allocator_allocated. See note about mem_fragmentation_bytes.
Type: float
Unit: digital,B
allocator_frag_ratio Ratio between allocator_active and allocator_allocated. This is the true (external) fragmentation metric (not mem_fragmentation_ratio).
Type: float
Unit: unknown
allocator_resident Total bytes resident (RSS) in the allocator; this includes pages that can be released to the OS (by MEMORY PURGE, or just waiting).
Type: float
Unit: digital,B
allocator_rss_bytes Delta between allocator_resident and allocator_active.
Type: float
Unit: digital,B
allocator_rss_ratio Ratio between allocator_resident and allocator_active. This usually indicates pages that the allocator can and probably will soon release back to the OS.
Type: float
Unit: unknown
aof_base_size AOF file size on latest startup or rewrite.
Type: float
Unit: digital,B
aof_buffer_length Size of the AOF buffer
Type: float
Unit: digital,B
aof_current_rewrite_time_sec Duration of the on-going AOF rewrite operation if any.
Type: float
Unit: time,s
aof_current_size AOF current file size
Type: float
Unit: digital,B
aof_delayed_fsync Delayed fsync counter.
Type: float
Unit: count
aof_enabled Flag indicating AOF logging is activated.
Type: float
Unit: bool
aof_last_cow_size The size in bytes of copy-on-write memory during the last AOF rewrite operation.
Type: float
Unit: digital,B
aof_last_rewrite_time_sec Duration of the last AOF rewrite operation in seconds
Type: float
Unit: time,s
aof_pending_bio_fsync Number of fsync pending jobs in background I/O queue.
Type: float
Unit: count
aof_pending_rewrite Flag indicating an AOF rewrite operation will be scheduled once the on-going RDB save is complete.
Type: float
Unit: bool
aof_rewrite_buffer_length Size of the AOF rewrite buffer. Note this field was removed in Redis 7.0.
Type: float
Unit: digital,B
aof_rewrite_in_progress Flag indicating an AOF rewrite operation is on-going
Type: float
Unit: bool
aof_rewrite_scheduled Flag indicating an AOF rewrite operation will be scheduled once the on-going RDB save is complete.
Type: float
Unit: bool
aof_rewrites Number of AOF rewrites performed since startup.
Type: float
Unit: count
arch_bits Architecture (32 or 64 bits).
Type: float
Unit: count
async_loading Currently loading replication data-set asynchronously while serving old data. This means repl-diskless-load is enabled and set to swapdb. Added in Redis 7.0.
Type: float
Unit: bool
blocked_clients Number of clients pending on a blocking call (BLPOP/BRPOP/BRPOPLPUSH/BLMOVE/BZPOPMIN/BZPOPMAX)
Type: float
Unit: count
client_biggest_input_buf Biggest input buffer among current client connections
Type: float
Unit: digital,B
client_longest_output_list Longest output list among current client connections
Type: float
Unit: count
client_recent_max_input_buffer Biggest input buffer among current client connections.
Type: float
Unit: count
client_recent_max_output_buffer Biggest output buffer among current client connections.
Type: float
Unit: count
clients_in_timeout_table Number of clients in the clients timeout table.
Type: float
Unit: count
cluster_connections An approximation of the number of sockets used by the cluster's bus.
Type: float
Unit: count
cluster_enabled Indicate Redis cluster is enabled.
Type: float
Unit: bool
configured_hz The server's configured frequency setting.
Type: float
Unit: count
connected_clients Number of client connections (excluding connections from replicas)
Type: float
Unit: count
connected_slaves Number of connected replicas
Type: float
Unit: count
current_active_defrag_time The time passed since memory fragmentation last was over the limit, in milliseconds.
Type: float
Unit: time,ms
current_cow_peak The peak size in bytes of copy-on-write memory while a child fork is running.
Type: float
Unit: digital,B
current_cow_size The size in bytes of copy-on-write memory while a child fork is running.
Type: float
Unit: digital,B
current_cow_size_age The age, in seconds, of the current_cow_size value.
Type: float
Unit: time,s
current_eviction_exceeded_time The time passed since used_memory last rose above maxmemory, in milliseconds.
Type: float
Unit: time,ms
current_fork_perc The percentage of progress of the current fork process. For AOF and RDB forks it is the percentage of current_save_keys_processed out of current_save_keys_total.
Type: float
Unit: percent,percent
current_save_keys_processed Number of keys processed by the current save operation.
Type: float
Unit: count
current_save_keys_total Number of keys at the beginning of the current save operation.
Type: float
Unit: count
dump_payload_sanitizations Total number of dump payload deep integrity validations (see sanitize-dump-payload config).
Type: float
Unit: count
errorstat Track of the different errors that occurred within Redis.
Type: int
Unit: count
eventloop_cycles Total number of eventloop cycles.
Type: float
Unit: count
eventloop_duration_cmd_sum Total time spent on executing commands in microseconds.
Type: float
Unit: time,μs
eventloop_duration_sum Total time spent in the eventloop in microseconds (including I/O and command processing).
Type: float
Unit: time,μs
evicted_clients Number of evicted clients due to maxmemory-clients limit. Added in Redis 7.0.
Type: float
Unit: count
evicted_keys Number of evicted keys due to Max-Memory limit
Type: float
Unit: count
expire_cycle_cpu_milliseconds The cumulative amount of time spent on active expiry cycles.
Type: float
Unit: time,ms
expired_keys Total number of key expiration events
Type: float
Unit: count
expired_stale_perc The percentage of keys probably expired.
Type: float
Unit: percent,percent
expired_time_cap_reached_count The count of times that active expiry cycles have stopped early.
Type: float
Unit: count
hz The server's current frequency setting.
Type: float
Unit: count
info_latency_ms The latency of the redis INFO command.
Type: float
Unit: time,ms
instantaneous_eventloop_cycles_per_sec Number of eventloop cycles per second.
Type: float
Unit: count
instantaneous_eventloop_duration_usec Average time spent in a single eventloop cycle in microseconds.
Type: float
Unit: time,μs
instantaneous_input_kbps The network's read rate per second in KB/sec.
Type: float
Unit: traffic,B/S
instantaneous_input_repl_kbps The network's read rate per second in KB/sec for replication purposes.
Type: float
Unit: traffic,B/S
instantaneous_ops_per_sec Number of commands processed per second.
Type: float
Unit: count
instantaneous_output_kbps The network's write rate per second in KB/sec.
Type: float
Unit: traffic,B/S
instantaneous_output_repl_kbps The network's write rate per second in KB/sec for replication purposes.
Type: float
Unit: traffic,B/S
io_threaded_reads_processed Number of read events processed by the main and I/O threads.
Type: float
Unit: count
io_threaded_writes_processed Number of write events processed by the main and I/O threads.
Type: float
Unit: count
io_threads_active Flag indicating if I/O threads are active.
Type: float
Unit: bool
keyspace_hits Number of successful lookups of keys in the main dictionary
Type: float
Unit: count
keyspace_misses Number of failed lookups of keys in the main dictionary
Type: float
Unit: count
latency_percentiles_usec Latency percentile distribution statistics based on the command type.
Type: float
Unit: time,ms
latest_fork_usec Duration of the latest fork operation in microseconds
Type: float
Unit: time,μs
lazyfree_pending_objects The number of objects waiting to be freed (as a result of calling UNLINK, or FLUSHDB and FLUSHALL with the ASYNC option).
Type: float
Unit: count
lazyfreed_objects The number of objects that have been lazy freed.
Type: float
Unit: count
loading Flag indicating if the load of a dump file is on-going.
Type: float
Unit: bool
loading_eta_seconds ETA in seconds for the load to be complete
Type: float
Unit: time,s
loading_loaded_bytes Number of bytes already loaded
Type: float
Unit: digital,B
loading_loaded_perc Number of bytes already loaded (loading_loaded_bytes), expressed as a percentage
Type: float
Unit: percent,percent
loading_rdb_used_mem The memory usage of the server that had generated the RDB file at the time of the file's creation.
Type: float
Unit: digital,B
loading_start_time Epoch-based timestamp of the start of the load operation.
Type: float
Unit: timeStamp,sec
loading_total_bytes Total file size.
Type: float
Unit: digital,B
lru_clock Clock incrementing every minute, for LRU management.
Type: float
Unit: time,ms
master_last_io_seconds_ago Number of seconds since the last interaction with master
Type: float
Unit: time,s
master_link_down_since_seconds Number of seconds since the link is down.
Type: float
Unit: time,s
master_repl_offset The server's current replication offset
Type: float
Unit: count
master_sync_in_progress Indicate the master is syncing to the replica
Type: float
Unit: bool
master_sync_last_io_seconds_ago Number of seconds since last transfer I/O during a SYNC operation.
Type: float
Unit: time,s
master_sync_left_bytes Number of bytes left before syncing is complete (may be negative when master_sync_total_bytes is 0)
Type: float
Unit: digital,B
master_sync_perc The percentage of master_sync_read_bytes out of master_sync_total_bytes, or an approximation that uses loading_rdb_used_mem when master_sync_total_bytes is 0.
Type: float
Unit: percent,percent
master_sync_read_bytes Number of bytes already transferred.
Type: float
Unit: digital,B
master_sync_total_bytes Total number of bytes that need to be transferred. This may be 0 when the size is unknown (for example, when the repl-diskless-sync configuration directive is used).
Type: float
Unit: digital,B
maxclients The value of the maxclients configuration directive. This is the upper limit for the sum of connected_clients, connected_slaves and cluster_connections.
Type: float
Unit: count
maxmemory The value of the Max Memory configuration directive
Type: float
Unit: digital,B
mem_aof_buffer Transient memory used for AOF and AOF rewrite buffers.
Type: float
Unit: digital,B
mem_clients_normal Memory used by normal clients.
Type: float
Unit: digital,B
mem_clients_slaves Memory used by replica clients. Starting with Redis 7.0, replica buffers share memory with the replication backlog, so this field can show 0 when replicas don't trigger an increase of memory usage.
Type: float
Unit: digital,B
mem_cluster_links Memory used by links to peers on the cluster bus when cluster mode is enabled.
Type: float
Unit: digital,B
mem_fragmentation_bytes Delta between used_memory_rss and used_memory. Note that when the total fragmentation bytes is low (few megabytes), a high ratio (e.g. 1.5 and above) is not an indication of an issue.
Type: float
Unit: digital,B
mem_fragmentation_ratio Ratio between used_memory_rss and used_memory
Type: float
Unit: unknown
mem_not_counted_for_evict Used memory that's not counted for key eviction. This is basically transient replica and AOF buffers.
Type: float
Unit: digital,B
mem_replication_backlog Memory used by replication backlog.
Type: float
Unit: digital,B
mem_total_replication_buffers Total memory consumed for replication buffers. Added in Redis 7.0.
Type: float
Unit: digital,B
migrate_cached_sockets The number of sockets open for MIGRATE purposes.
Type: float
Unit: count
min_slaves_good_slaves Number of replicas currently considered good.
Type: float
Unit: count
module_fork_in_progress Flag indicating a module fork is on-going.
Type: float
Unit: bool
module_fork_last_cow_size The size in bytes of copy-on-write memory during the last module fork operation.
Type: float
Unit: digital,B
pubsub_channels Global number of pub/sub channels with client subscriptions
Type: float
Unit: count
pubsub_patterns Global number of pub/sub patterns with client subscriptions
Type: float
Unit: count
pubsubshard_channels Global number of pub/sub shard channels with client subscriptions. Added in Redis 7.0.3.
Type: float
Unit: count
rdb_bgsave_in_progress Flag indicating an RDB save is on-going
Type: float
Unit: bool
rdb_changes_since_last_save Refers to the number of operations that produced some kind of changes in the dataset since the last time either SAVE or BGSAVE was called.
Type: float
Unit: count
rdb_current_bgsave_time_sec Duration of the on-going RDB save operation if any.
Type: float
Unit: time,s
rdb_last_bgsave_time_sec Duration of the last RDB save operation in seconds
Type: float
Unit: time,s
rdb_last_cow_size The size in bytes of copy-on-write memory during the last RDB save operation.
Type: float
Unit: digital,B
rdb_last_load_keys_expired Number of volatile keys deleted during the last RDB loading. Added in Redis 7.0.
Type: float
Unit: count
rdb_last_load_keys_loaded Number of keys loaded during the last RDB loading. Added in Redis 7.0.
Type: float
Unit: count
rdb_last_save_time Epoch-based timestamp of last successful RDB save.
Type: float
Unit: timeStamp,sec
rdb_saves Number of RDB snapshots performed since startup.
Type: float
Unit: count
rejected_connections Number of connections rejected because of Max-Clients limit
Type: float
Unit: count
repl_backlog_active Flag indicating replication backlog is active.
Type: float
Unit: bool
repl_backlog_first_byte_offset The master offset of the replication backlog buffer.
Type: float
Unit: count
repl_backlog_histlen Size in bytes of the data in the replication backlog buffer
Type: float
Unit: digital,B
repl_backlog_size Total size in bytes of the replication backlog buffer.
Type: float
Unit: digital,B
replica_announced Flag indicating if the replica is announced by Sentinel.
Type: float
Unit: count
rss_overhead_bytes Delta between used_memory_rss (the process RSS) and allocator_resident.
Type: float
Unit: digital,B
rss_overhead_ratio Ratio between used_memory_rss (the process RSS) and allocator_resident. This includes RSS overheads that are not allocator or heap related.
Type: float
Unit: unknown
second_repl_offset The offset up to which replication IDs are accepted.
Type: float
Unit: count
server_time_usec Epoch-based system time with microsecond precision.
Type: float
Unit: time,ms
shutdown_in_milliseconds The maximum time remaining for replicas to catch up the replication before completing the shutdown sequence. This field is only present during shutdown.
Type: float
Unit: time,ms
slave_expires_tracked_keys The number of keys tracked for expiry purposes (applicable only to writable replicas).
Type: float
Unit: count
slave_priority The priority of the instance as a candidate for failover.
Type: float
Unit: count
slave_read_only Flag indicating if the replica is read-only.
Type: float
Unit: count
slave_read_repl_offset The read replication offset of the replica instance.
Type: float
Unit: count
slave_repl_offset The replication offset of the replica instance
Type: float
Unit: count
stat_reply_buffer_expands Total number of output buffer expands.
Type: float
Unit: count
stat_reply_buffer_shrinks Total number of output buffer shrinks.
Type: float
Unit: count
sync_full The number of full resyncs with replicas.
Type: float
Unit: count
sync_partial_err The number of denied partial resync requests.
Type: float
Unit: count
sync_partial_ok The number of accepted partial resync requests.
Type: float
Unit: count
tcp_port TCP/IP listen port.
Type: float
Unit: time,ms
total_active_defrag_time Total time memory fragmentation was over the limit, in milliseconds.
Type: float
Unit: time,ms
total_blocking_keys Number of blocking keys.
Type: float
Unit: count
total_blocking_keys_on_nokey Number of blocking keys on which one or more clients would like to be unblocked when the key is deleted.
Type: float
Unit: count
total_commands_processed Total number of commands processed by the server.
Type: float
Unit: count
total_connections_received Total number of connections accepted by the server.
Type: float
Unit: count
total_error_replies Total number of issued error replies, that is the sum of rejected commands (errors prior to command execution) and failed commands (errors within the command execution).
Type: float
Unit: count
total_eviction_exceeded_time Total time used_memory was greater than maxmemory since server startup, in milliseconds.
Type: float
Unit: time,ms
total_forks Total number of fork operations since the server start.
Type: float
Unit: count
total_net_input_bytes The total number of bytes read from the network
Type: float
Unit: digital,B
total_net_output_bytes The total number of bytes written to the network
Type: float
Unit: digital,B
total_net_repl_input_bytes The total number of bytes read from the network for replication purposes.
Type: float
Unit: digital,B
total_net_repl_output_bytes The total number of bytes written to the network for replication purposes.
Type: float
Unit: digital,B
total_reads_processed Total number of read events processed.
Type: float
Unit: count
total_system_memory The total amount of memory that the Redis host has.
Type: float
Unit: digital,B
total_writes_processed Total number of write events processed.
Type: float
Unit: count
tracking_clients Number of clients being tracked (CLIENT TRACKING).
Type: float
Unit: count
tracking_total_items Number of items being tracked, that is, the sum of the number of clients for each key.
Type: float
Unit: count
tracking_total_keys Number of keys being tracked by the server.
Type: float
Unit: count
tracking_total_prefixes Number of tracked prefixes in server's prefix table (only applicable for broadcast mode).
Type: float
Unit: count
unexpected_error_replies Number of unexpected error replies, that are types of errors from an AOF load or replication.
Type: float
Unit: count
uptime_in_days Number of days since Redis server start (the same value as uptime_in_seconds, expressed in days).
Type: float
Unit: time,d
uptime_in_seconds Number of seconds since Redis server start.
Type: float
Unit: time,s
used_cpu_sys System CPU consumed by the Redis server, which is the sum of system CPU consumed by all threads of the server process (main thread and background threads).
Type: float
Unit: time,s
used_cpu_sys_children System CPU consumed by the background processes.
Type: float
Unit: time,s
used_cpu_sys_main_thread System CPU consumed by the Redis server main thread.
Type: float
Unit: time,s
used_cpu_sys_percent System CPU percentage consumed by the Redis server, which is the sum of system CPU consumed by all threads of the server process (main thread and background threads)
Type: float
Unit: percent,percent
used_cpu_user User CPU consumed by the Redis server, which is the sum of user CPU consumed by all threads of the server process (main thread and background threads).
Type: float
Unit: time,s
used_cpu_user_children User CPU consumed by the background processes.
Type: float
Unit: time,s
used_cpu_user_main_thread User CPU consumed by the Redis server main thread.
Type: float
Unit: time,s
used_cpu_user_percent User CPU percentage consumed by the Redis server, which is the sum of user CPU consumed by all threads of the server process (main thread and background threads)
Type: float
Unit: percent,percent
used_memory Total number of bytes allocated by Redis using its allocator (either standard libc, jemalloc, or an alternative allocator such as tcmalloc)
Type: float
Unit: digital,B
used_memory_dataset The size in bytes of the dataset (used_memory_overhead subtracted from used_memory).
Type: float
Unit: digital,B
used_memory_dataset_perc The percentage of used_memory_dataset out of the net memory usage (used_memory minus used_memory_startup).
Type: float
Unit: percent,percent
used_memory_lua Number of bytes used by the Lua engine
Type: float
Unit: digital,B
used_memory_overhead The sum in bytes of all overheads that the server allocated for managing its internal data structures
Type: float
Unit: digital,B
used_memory_peak Peak memory consumed by Redis (in bytes)
Type: float
Unit: digital,B
used_memory_peak_perc The percentage of used_memory out of used_memory_peak.
Type: float
Unit: percent,percent
used_memory_rss Number of bytes that Redis allocated as seen by the operating system (a.k.a resident set size)
Type: float
Unit: digital,B
used_memory_scripts Number of bytes used by cached Lua scripts.
Type: float
Unit: digital,B
used_memory_startup Initial amount of memory consumed by Redis at startup in bytes
Type: float
Unit: digital,B

redis_replica

  • Tags
Tag Description
host Hostname.
master_addr Master addr, only collected for slave redis.
server Server addr.
service_name Service name.
slave_addr Slave addr, only collected for master redis.
slave_id Slave ID, only collected for master redis.
slave_state Slave state, only collected for master redis.
  • Metrics
Metric Description
master_link_down_since_seconds Number of seconds since the link between master and replica went down, reported only while the link is down; only collected for slave redis.
Type: int
Unit: N/A
master_link_status Status of the link (up/down), 1 for up, 0 for down, only collected for slave redis.
Type: int
Unit: N/A
master_repl_offset The server's current replication offset.
Type: int
Unit: N/A
slave_lag Slave lag, only collected for master redis.
Type: int
Unit: N/A
slave_offset Slave offset, only collected for master redis.
Type: int
Unit: N/A
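
The per-replica tags and metrics above resemble the `slaveN:ip=...,port=...,state=...,offset=...,lag=...` lines that Redis prints in `INFO replication` on a master. The following Python sketch shows one way such a line could be split into those fields. The mapping (for example, using the `slave0` prefix as `slave_id`) is an assumption for illustration, not the collector's confirmed behavior, and the sample line and offsets are made up:

```python
def parse_slave_line(line):
    """Split one 'slaveN:...' line from INFO replication into tag/metric fields."""
    slave_id, _, rest = line.partition(":")
    fields = dict(kv.split("=", 1) for kv in rest.split(","))
    return {
        "slave_id": slave_id,  # assumed mapping, see lead-in
        "slave_addr": f"{fields['ip']}:{fields['port']}",
        "slave_state": fields["state"],
        "slave_offset": int(fields["offset"]),
        "slave_lag": int(fields["lag"]),
    }


# Hypothetical sample line and master offset.
sample = "slave0:ip=10.0.0.2,port=6379,state=online,offset=1200,lag=1"
point = parse_slave_line(sample)
master_repl_offset = 1500

# Bytes of the replication stream the replica has not yet acknowledged.
gap = master_repl_offset - point["slave_offset"]
print(point["slave_addr"], point["slave_state"], gap)
```

Comparing `master_repl_offset` with `slave_offset` in this way gives a rough measure, in bytes, of how far a replica is behind.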

collector

  • Tags
Tag Description
instance Server addr of the instance
job Server name of the instance
  • Metrics
Metric Description
up
Type: int
Unit: -

Custom Object

database

  • Tags
Tag Description
col_co_status Current status of the collector on the instance (OK/NotOK)
host Connection host address (domain name)
ip Connection IP of the instance
name Object unique ID
reason If the status is not OK, the reason for that status
  • Metrics
Metric Description
display_name Displayed name in UI
Type: string
Unit: N/A
uptime Current instance uptime
Type: int
Unit: time,s
version Current version of the instance
Type: string
Unit: N/A

Logging

redis_bigkey

  • Tags
Tag Description
db_name DB name.
host Hostname.
key Key name.
key_type Key type.
server Server addr.
service_name Service name.
  • Metrics
Metric Description
keys_sampled Sampled keys in the key space.
Type: int
Unit: count
value_length Key length.
Type: int
Unit: digital,B

redis_hotkey

  • Tags
Tag Description
db_name DB name.
host Hostname.
key Key name.
server Server addr.
service_name Service name.
  • Metrics
Metric Description
key_count Key access count.
Type: int
Unit: count
keys_sampled Sampled keys in the key space.
Type: int
Unit: count

redis_latency

  • Tags
Tag Description
server Server addr
service_name Service name
  • Metrics
Metric Description
cost_time Latest event latency in milliseconds.
Type: int
Unit: time,ms
event_name Event name.
Type: string
Unit: N/A
max_cost_time All-time maximum latency for this event.
Type: int
Unit: time,ms
occur_time Unix timestamp of the latest latency spike for the event.
Type: int
Unit: timeStamp,sec

redis_slowlog

Redis slow query history logging

  • Tags
Tag Description
host Hostname
message Log message
server Server addr
service_name Service name
  • Metrics
Metric Description
client_addr The client ip:port that ran the slow query
Type: string
Unit: N/A
client_name The name of the client that ran the slow query (if CLIENT SETNAME was executed on the client side)
Type: string
Unit: N/A
command Slow command
Type: string
Unit: N/A
slowlog_95percentile 95th percentile duration of slow queries
Type: int
Unit: time,μs
slowlog_avg Average duration of slow queries
Type: float
Unit: time,μs
slowlog_id Slow log unique ID
Type: int
Unit: N/A
slowlog_max Maximum duration of slow queries
Type: int
Unit: time,μs
slowlog_median Median duration of slow queries
Type: int
Unit: time,μs
slowlog_micros Execution time of the slow query
Type: int
Unit: time,μs
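
To illustrate how the aggregate fields relate to individual `slowlog_micros` values, here is a minimal Python sketch. It assumes the nearest-rank method for the 95th percentile; the collector's actual percentile method is not stated here, and the function name and sample durations are hypothetical:

```python
import math
import statistics


def slowlog_summary(durations_us):
    """Summarize slow query durations (microseconds) into the aggregate fields."""
    if not durations_us:
        raise ValueError("at least one duration is required")
    ordered = sorted(durations_us)
    # Nearest-rank percentile: smallest value with at least 95% of samples at or below it.
    rank = math.ceil(0.95 * len(ordered))
    return {
        "slowlog_avg": statistics.fmean(ordered),
        "slowlog_median": statistics.median(ordered),
        "slowlog_95percentile": ordered[rank - 1],
        "slowlog_max": ordered[-1],
    }


# Hypothetical durations of five slow queries, in microseconds.
summary = slowlog_summary([12000, 15000, 11000, 90000, 13000])
print(summary)
```

A single outlier (90000 μs here) pulls the average well above the median, which is why the median and percentile fields are reported alongside it.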

Logging Pipeline

The original log is:

122:M 14 May 2019 19:11:40.164 * Background saving terminated with success

The list of cut fields is as follows:

Field Name Field Value Description
pid 122 process id
role M role
serverity * log severity marker
statu notice log level
msg Background saving terminated with success log content
time 1557861100164000000 Nanosecond timestamp (as line protocol time)
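
The cutting shown above can be approximated outside DataKit. The following Python sketch is not the collector's actual pipeline script; it is a regex-based illustration that assumes the log timestamp is in UTC and that the `*` marker maps to `notice` (following the Redis log level markers `.`, `-`, `*`, `#`). The names `LINE_RE`, `LEVELS`, and `cut_fields` are hypothetical:

```python
import calendar
import re
from datetime import datetime

# pid:role, timestamp, one-character severity marker, then the message.
LINE_RE = re.compile(
    r"^(?P<pid>\d+):(?P<role>[A-Z]) "
    r"(?P<time>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<serverity>\S) (?P<msg>.*)$"
)

# Redis log level markers.
LEVELS = {".": "debug", "-": "verbose", "*": "notice", "#": "warning"}


def cut_fields(line):
    """Return the cut fields for one Redis log line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Month names are parsed with the current locale; English is assumed.
    ts = datetime.strptime(fields["time"], "%d %b %Y %H:%M:%S.%f")
    # Integer arithmetic avoids float rounding in the nanosecond value (UTC assumed).
    seconds = calendar.timegm(ts.timetuple())
    fields["time"] = seconds * 10**9 + ts.microsecond * 1000
    fields["pid"] = int(fields["pid"])
    fields["statu"] = LEVELS.get(fields["serverity"], "unknown")
    return fields


line = "122:M 14 May 2019 19:11:40.164 * Background saving terminated with success"
fields = cut_fields(line)
print(fields)
```

Field names such as `serverity` and `statu` are kept exactly as they appear in the field list above.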
