Oracle
Oracle 监控指标采集,具有以下数据收集功能
- Process 相关
- Table Space 相关数据
- System 数据采集
- 自定义查询数据采集
已测试的版本:
- Oracle 19c
- Oracle 12c
- Oracle 11g
自 DataKit 1.32.0 版本 开始,支持通过 DataKit 直接采集和外部采集器两种方式采集 Oracle 指标。
配置¶
前置条件¶
- 创建监控账号
如果是使用单 PDB 或者非 CDB 实例,一个本地用户(local user)就足够了:
-- Create the datakit user. Replace the password placeholder with a secure password.
CREATE USER datakit IDENTIFIED BY <PASSWORD>;
-- Grant access to the datakit user.
GRANT CONNECT, CREATE SESSION TO datakit;
GRANT SELECT_CATALOG_ROLE to datakit;
GRANT SELECT ON DBA_TABLESPACE_USAGE_METRICS TO datakit;
GRANT SELECT ON DBA_TABLESPACES TO datakit;
GRANT SELECT ON DBA_USERS TO datakit;
GRANT SELECT ON SYS.DBA_DATA_FILES TO datakit;
GRANT SELECT ON V_$ACTIVE_SESSION_HISTORY TO datakit;
GRANT SELECT ON V_$ARCHIVE_DEST TO datakit;
GRANT SELECT ON V_$ASM_DISKGROUP TO datakit;
GRANT SELECT ON V_$DATABASE TO datakit;
GRANT SELECT ON V_$DATAFILE TO datakit;
GRANT SELECT ON V_$INSTANCE TO datakit;
GRANT SELECT ON V_$LOG TO datakit;
GRANT SELECT ON V_$OSSTAT TO datakit;
GRANT SELECT ON V_$PGASTAT TO datakit;
GRANT SELECT ON V_$PROCESS TO datakit;
GRANT SELECT ON V_$RECOVERY_FILE_DEST TO datakit;
GRANT SELECT ON V_$RESTORE_POINT TO datakit;
GRANT SELECT ON V_$SESSION TO datakit;
GRANT SELECT ON V_$SGASTAT TO datakit;
GRANT SELECT ON V_$SYSMETRIC TO datakit;
GRANT SELECT ON V_$SYSTEM_PARAMETER TO datakit;
如果想监控来自 CDB 和所有 PDB 中的表空间(Table Spaces),需要一个有合适权限的公共用户(common user):
-- Create the datakit user. Replace the password placeholder with a secure password.
CREATE USER datakit IDENTIFIED BY <PASSWORD>;
-- Grant access to the datakit user.
ALTER USER datakit SET CONTAINER_DATA=ALL CONTAINER=CURRENT;
GRANT CONNECT, CREATE SESSION TO datakit;
GRANT SELECT_CATALOG_ROLE to datakit;
GRANT SELECT ON v_$instance TO datakit;
GRANT SELECT ON v_$database TO datakit;
GRANT SELECT ON v_$sysmetric TO datakit;
GRANT SELECT ON v_$system_parameter TO datakit;
GRANT SELECT ON v_$session TO datakit;
GRANT SELECT ON v_$recovery_file_dest TO datakit;
GRANT SELECT ON v_$active_session_history TO datakit;
GRANT SELECT ON v_$osstat TO datakit;
GRANT SELECT ON v_$restore_point TO datakit;
GRANT SELECT ON v_$process TO datakit;
GRANT SELECT ON v_$datafile TO datakit;
GRANT SELECT ON v_$pgastat TO datakit;
GRANT SELECT ON v_$sgastat TO datakit;
GRANT SELECT ON v_$log TO datakit;
GRANT SELECT ON v_$archive_dest TO datakit;
GRANT SELECT ON v_$asm_diskgroup TO datakit;
GRANT SELECT ON sys.dba_data_files TO datakit;
GRANT SELECT ON DBA_TABLESPACES TO datakit;
GRANT SELECT ON DBA_TABLESPACE_USAGE_METRICS TO datakit;
GRANT SELECT ON DBA_USERS TO datakit;
注意:上述的 SQL 语句由于 Oracle 版本的原因部分可能会出现 "表不存在" 等错误,忽略即可。
- 安装依赖包
如果使用 DataKit 直接采集,可以跳过此步骤。
根据操作系统和 Oracle 版本选择安装对应的安装包,参考这里,如:
wget https://download.oracle.com/otn_software/linux/instantclient/2110000/instantclient-basiclite-linux.x64-21.10.0.0.0dbru.zip
unzip instantclient-basiclite-linux.x64-21.10.0.0.0dbru.zip
将解压后的目录文件路径添加到以下配置信息中的 LD_LIBRARY_PATH 环境变量路径中。
也可以直接下载我们预先准备好的依赖包:
wget https://static.guance.com/otn_software/instantclient/instantclient-basiclite-linux.x64-21.10.0.0.0dbru.zip \
-O /usr/local/datakit/externals/instantclient-basiclite-linux.zip \
&& unzip /usr/local/datakit/externals/instantclient-basiclite-linux.zip -d /opt/oracle \
&& mv /opt/oracle/instantclient_21_10 /opt/oracle/instantclient;
wget https://download.oracle.com/otn_software/linux/instantclient/1919000/instantclient-basiclite-linux.arm64-19.19.0.0.0dbru.zip
unzip instantclient-basiclite-linux.arm64-19.19.0.0.0dbru.zip
将解压后的目录文件路径添加到以下配置信息中的 LD_LIBRARY_PATH 环境变量路径中。
也可以直接下载我们预先准备好的依赖包:
wget https://static.guance.com/otn_software/instantclient/instantclient-basiclite-linux.arm64-19.19.0.0.0dbru.zip \
-O /usr/local/datakit/externals/instantclient-basiclite-linux.zip \
&& unzip /usr/local/datakit/externals/instantclient-basiclite-linux.zip -d /opt/oracle \
&& mv /opt/oracle/instantclient_19_19 /opt/oracle/instantclient;
部分系统需要安装额外的依赖库:
采集器配置¶
进入 DataKit 安装目录下的 conf.d/samples 目录,复制 oracle.conf.sample 并命名为 oracle.conf。示例如下:
[[inputs.oracle]]
# host name
host = "localhost"
## port
port = 1521
## user name
user = "datakit"
## password
password = "<PASS>"
## service
service = "XE"
## Interval (waiting event, locked session metrics).
interval = "10s"
## connection timeout
connect_timeout = "30s"
## slow query time threshold defined. If larger than this, the executed sql will be reported.
slow_query_time = "0s"
## Metric name in metric_exclude_list will not be collected.
metric_exclude_list = [""]
## Set true to enable election
election = true
## v2+ override all metric measurements to "oracle", default: v2
## If you want to use the old metric set, you can change it to "v1"
measurement_version = "v2"
## collect object
[inputs.oracle.object]
# Set true to enable collecting objects
enabled = true
# interval to collect oracle object which will be greater than collection interval
interval = "600s"
## tablespace collection
[inputs.oracle.tablespace]
# Set true to enable collecting tablespace metrics (default: true)
enabled = true
# Collection interval for tablespace metrics (default: 600s)
interval = "600s"
## slow query collection
[inputs.oracle.slow_query]
# Set true to enable collecting slow query metrics (default: true)
enabled = true
# Collection interval for slow query metrics (default: 60s)
interval = "60s"
## process collection
[inputs.oracle.process]
# Set true to enable collecting process metrics (default: true)
enabled = true
# Collection interval for process metrics (default: 60s)
interval = "60s"
## system metrics collection
[inputs.oracle.system]
# Set true to enable collecting system metrics (default: true)
enabled = true
# Collection interval for system metrics (default: 60s)
interval = "60s"
## Database Monitoring (DBM) configuration
## DBM provides deep visibility into database performance by collecting query metrics, activity, and execution plans
[inputs.oracle.dbm]
# Set true to enable DBM metrics collection
enabled = false
## Config DBM metric (query metrics)
## Collects cumulative execution statistics of SQL queries aggregated by query signature, plan hash, and PDB
[inputs.oracle.dbm.metric]
# Set true to enable collecting query metrics
enabled = true
# Collection interval for query metrics (default: 60s)
collection_interval = "60s"
# Maximum number of rows to collect from V$SQLSTATS (default: 10000)
# This limits the initial query result size before aggregation
db_rows_limit = 10000
# Maximum number of queries to report per collection interval (default: 500)
# Only the top N queries (sorted by derivative elapsed time) will be reported as metrics
max_queries = 500
# Lookback window in seconds for filtering queries (default: 300)
# Only queries that executed within this time window will be collected
lookback_window = 300
# Enable plan collection (default: true)
plan_enabled = true
# Plan object cache TTL (default: 1h)
plan_cache_ttl = "1h"
# Maximum runtime in seconds for statement metrics collection (default: 30)
# If collection takes longer than this, plan collection will be skipped
max_run_time = 30
# Disable last active time filter (default: false)
# If true, queries will be selected randomly instead of by last active time
disable_last_active = false
## Config DBM activity (current active queries)
## Collects information about currently executing queries and active sessions
[inputs.oracle.dbm.activity]
# Set true to enable collecting active query information
enabled = true
# Collection interval for activity metrics (default: 10s)
collection_interval = "10s"
# Maximum number of rows to collect from V$SESSION (default: 1000)
db_rows_limit = 1000
## Run a custom SQL query and collect corresponding metrics.
# [[inputs.oracle.custom_queries]]
# sql = '''
# SELECT
# GROUP_ID, METRIC_NAME, VALUE
# FROM GV$SYSMETRIC
# '''
# metric = "oracle_custom"
# tags = ["GROUP_ID", "METRIC_NAME"]
# fields = ["VALUE"]
# interval = "10s"
[inputs.oracle.tags]
# some_tag = "some_value"
# more_tag = "some_other_value"
配置好后,重启 DataKit 即可。
目前可以通过 ConfigMap 方式注入采集器配置来开启采集器。
外部采集器的配置示例如下:
[[inputs.external]]
daemon = true
name = "oracle"
cmd = "/usr/local/datakit/externals/oracle"
## Set true to enable election
election = true
## Modify below if necessary.
## The password use environment variable named "ENV_INPUT_ORACLE_PASSWORD".
args = [
"--interval" , "1m" ,
"--host" , "<your-oracle-host>" ,
"--port" , "1521" ,
"--username" , "<oracle-user-name>" ,
"--service-name" , "<oracle-service-name>" ,
"--slow-query-time" , "0s" ,
"--log" , "/var/log/datakit/oracle.log" ,
]
envs = [
"ENV_INPUT_ORACLE_PASSWORD=<oracle-password>",
"LD_LIBRARY_PATH=/opt/oracle/instantclient:$LD_LIBRARY_PATH",
]
[inputs.external.tags]
# some_tag = "some_value"
# more_tag = "some_other_value"
## Run a custom SQL query and collect corresponding metrics.
# [[inputs.external.custom_queries]]
# sql = '''
# SELECT
# GROUP_ID, METRIC_NAME, VALUE
# FROM GV$SYSMETRIC
# '''
# metric = "oracle_custom"
# tags = ["GROUP_ID", "METRIC_NAME"]
# fields = ["VALUE"]
#############################
# Parameter Description (Marked with * is required field)
#############################
# *--interval : Collect interval (Default is 1m).
# *--host : Oracle instance address (IP).
# *--port : Oracle listen port (Default is 1521).
# *--username : Oracle username.
# *--service-name : Oracle service name.
# *--slow-query-time : Oracle slow query time threshold defined. If larger than this, the executed sql will be reported.
# *--log : Collector log path.
# *ENV_INPUT_ORACLE_PASSWORD : Oracle password.
Tip
上述配置会以命令行形式展示在进程列表中(包括密码),如果想隐藏密码,可以通过将密码写进环境变量 ENV_INPUT_ORACLE_PASSWORD 形式实现,示例:
该环境变量在读取密码时有最高优先级,即只要出现该环境变量,那密码就以该环境变量中的值为准。如果密码中有特殊字符,可以参见这里的做法来处理。
Oracle RAC¶
对于 Oracle RAC,当前采集器推荐按“每实例一个 input”的方式部署,而不是通过单个 input 查询 GV$ 视图来聚合整个 RAC。
- 为每个 RAC 节点或实例分别配置一个
[[inputs.oracle]] - 每个 input 应连接固定节点地址、VIP 或实例专属 service
- 不建议通过 SCAN、负载均衡或连接池连接,否则同一个 input 可能漂移到不同实例
- 建议连接 CDB service,不要分别连接单独 PDB
可以通过自定义 tag 将同一套 RAC 的多个 input 关联起来,例如:
[[inputs.oracle]]
host = "rac-node-1-vip"
port = 1521
user = "datakit"
password = "<PASS>"
service = "CDB1_NODE1"
...
[inputs.oracle.tags]
rac_cluster = "prod-rac"
[[inputs.oracle]]
host = "rac-node-2-vip"
port = 1521
user = "datakit"
password = "<PASS>"
service = "CDB1_NODE2"
...
[inputs.oracle.tags]
rac_cluster = "prod-rac"
数据库监控 (DBM)¶
数据库监控(Database Monitoring,DBM)功能提供对 Oracle 数据库性能的深度可见性,通过收集查询指标、活动会话和执行计划来帮助分析和优化数据库性能。
启用 DBM¶
查询指标 (Metric)¶
收集 SQL 查询的累积执行统计信息,按 SQL ID、执行计划哈希值和 PDB 进行聚合。包含执行次数、CPU 时间、逻辑读取、物理读取、等待时间等,通过计算导数指标(两次采集之间的差值)来反映查询的实际执行情况。
注意:只有连续两个采集窗口都出现的查询才会上报指标。首次采集到的查询仅作为基线,不会上报指标。
[inputs.oracle.dbm.metric]
# 启用查询指标采集(默认: false)
enabled = true
# 采集间隔(默认: 60s)
collection_interval = "60s"
# 查询的最大行数限制(默认: 10000)
db_rows_limit = 10000
# 每个采集间隔报告的最大查询数(默认: 500)
# 按导数执行时间排序后,只报告前 N 个查询
max_queries = 500
# 查询过滤的回看窗口(秒,默认: 300)
# 只收集在此时间窗口内执行的查询
lookback_window = 300
# 启用执行计划采集(默认: true)
plan_enabled = true
# 执行计划对象缓存 TTL(默认: 1h)
plan_cache_ttl = "1h"
# 执行计划采集的最大运行时间(秒,默认: 30)
# 如果采集时间超过此值,将跳过执行计划采集
max_run_time = 30
# 禁用最后活动时间过滤(默认: false)
# 如果设置为 true,将收集所有查询,而不仅仅是最近活动的查询
disable_last_active = false
执行计划 (Plan):当 plan_enabled = true 时,会同时采集执行计划。
活动查询 (Activity)¶
收集当前正在执行的查询和活动会话信息。记录会话 ID、序列号、状态、等待事件、阻塞信息、SQL 文本(已脱敏)等,用于实时监控数据库的当前活动状态和问题诊断。
会话指标 (Session):基于活动查询数据,会自动生成会话指标。
连接指标 (Connection):独立查询数据库连接信息,按用户名、状态、PDB 等维度统计连接数。
[inputs.oracle.dbm.activity]
# 启用活动查询信息采集(默认: false)
enabled = true
# 活动指标采集间隔(默认: 10s)
collection_interval = "10s"
# 查询的最大行数限制(默认: 1000)
db_rows_limit = 1000
# 包含所有会话(默认: false)
# 如果设置为 true,将收集所有会话,而不仅仅是活动会话
include_all_sessions = false
指标¶
以下所有数据采集,默认会追加全局选举 tag,也可以在配置中通过 [inputs.oracle.tags] 指定其它标签:
oracle¶
Metric set including Oracle process, tablespace, system, locked session, waiting event, and DBM (metric/session/connection) statistics, unified in v2
| Tags & Fields | Description |
|---|---|
| cdb_name ( tag) |
The name of the CDB (Container Database) |
| client ( tag) |
The client machine name |
| con_id ( tag) |
The container ID (con_id) in Oracle multi tenant architecture |
| connection_status ( tag) |
Connection status: ACTIVE, INACTIVE, KILLED, etc. |
| database_instance ( tag) |
Oracle instance identifier from configured tag or v$instance.host_name. Common tag. |
| event ( tag) |
The wait event name. |
| event_type ( tag) |
Event type, such as USER/BACKGROUND |
| force_matching_signature ( tag) |
The force matching signature of the query |
| host ( tag) |
Host name. Common tag. |
| oracle_server ( tag) |
Server addr. Deprecated. Please use server. Common tag. |
| oracle_service ( tag) |
Server service. Common tag. |
| pdb_name ( tag) |
The name of the PDB (Pluggable Database) |
| plan_hash_value ( tag) |
The hash value of the query execution plan |
| program ( tag) |
The program name |
| query_signature ( tag) |
Hash signature generated to link metrics and objects |
| server ( tag) |
The address of the server. The value is host:port. Common tag. |
| session_status ( tag) |
Session status: active (ACTIVE status), idle (INACTIVE status), blocked (being blocked) |
| tablespace_name ( tag) |
Table space name |
| username ( tag) |
The name of the database user |
| wait_class ( tag) |
The wait event class. |
| wait_group ( tag) |
Datakit unified wait group: Lock, I/O, Concurrency, Memory, Network, CPU, Commit/Log, Other. |
| active_sessions | Number of active sessions Type: float | (gauge) Unit: count Tagged by: pdb_name |
| avg_elapsed_time | The average elapsed time (microseconds) per query execution during the collection interval (calculated from delta_elapsed_time / delta_executions). Type: int | (gauge) Unit: time,μs Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| buffer_cachehit_ratio | Ratio of buffer cache hits Type: float | (gauge) Unit: percent,percent Tagged by: pdb_name |
| cache_blocks_corrupt | Corrupt cache blocks Type: float | (gauge) Unit: count Tagged by: pdb_name |
| cache_blocks_lost | Lost cache blocks Type: float | (gauge) Unit: count Tagged by: pdb_name |
| connection_count | Number of user connections in this dimension group Type: int | (gauge) Unit: count Tagged by: cdb_name, connection_status, pdb_name, username |
| consistent_read_changes | Consistent read changes per second Type: float | (gauge) Unit: count Tagged by: pdb_name |
| consistent_read_gets | Consistent read gets per second Type: float | (gauge) Unit: count Tagged by: pdb_name |
| count | Waiting event count Type: int | (gauge) Unit: count Tagged by: event, event_type, program, username |
| cursor_cachehit_ratio | Ratio of cursor cache hits Type: float | (gauge) Unit: percent,percent Tagged by: pdb_name |
| database_cpu_time_ratio | Database CPU time ratio Type: float | (gauge) Unit: percent,percent Tagged by: pdb_name |
| database_wait_time_ratio | Memory sorts per second Type: float | (gauge) Unit: percent,percent Tagged by: pdb_name |
| db_block_changes | DB block changes per second Type: float | (gauge) Unit: count Tagged by: pdb_name |
| db_block_gets | DB block gets per second Type: float | (gauge) Unit: count Tagged by: pdb_name |
| delta_application_wait_time | The application wait time (microseconds) during the collection interval (delta value). Type: int | (count) Unit: time,μs Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_avoided_executions | The avoided executions during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_buffer_gets | The number of buffer gets during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_cluster_wait_time | The cluster wait time (microseconds) during the collection interval (delta value). Type: int | (count) Unit: time,μs Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_concurrency_wait_time | The concurrency wait time (microseconds) during the collection interval (delta value). Type: int | (count) Unit: time,μs Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_cpu_time | The CPU time (microseconds) consumed during the collection interval (delta value). Type: int | (count) Unit: time,μs Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_direct_reads | The number of direct reads during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_direct_writes | The number of direct writes during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_disk_reads | The number of disk reads during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_elapsed_time | The elapsed time (microseconds) for query executions during the collection interval (delta value). Type: int | (count) Unit: time,μs Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_end_of_fetch_count | The number of end of fetch operations during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_executions | The number of query executions during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_fetches | The number of fetches during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_invalidations | The number of invalidations during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_io_cell_offload_eligible_bytes | The I/O cell offload eligible bytes during the collection interval (delta value). Type: int | (count) Unit: digital,B Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_io_cell_offload_returned_bytes | The I/O cell offload returned bytes during the collection interval (delta value). Type: int | (count) Unit: digital,B Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_io_cell_uncompressed_bytes | The I/O cell uncompressed bytes during the collection interval (delta value). Type: int | (count) Unit: digital,B Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_io_interconnect_bytes | The I/O interconnect bytes during the collection interval (delta value). Type: int | (count) Unit: digital,B Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_java_exec_time | The Java execution time (microseconds) during the collection interval (delta value). Type: int | (count) Unit: time,μs Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_loads | The number of loads during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_obsolete_count | The obsolete count during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_parse_calls | The number of parse calls during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_physical_read_bytes | The number of physical read bytes during the collection interval (delta value). Type: int | (count) Unit: digital,B Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_physical_read_requests | The number of physical read requests during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_physical_write_bytes | The number of physical write bytes during the collection interval (delta value). Type: int | (count) Unit: digital,B Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_physical_write_requests | The number of physical write requests during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_plsql_exec_time | The PL/SQL execution time (microseconds) during the collection interval (delta value). Type: int | (count) Unit: time,μs Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_px_servers_executions | The number of parallel server executions during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_rows_processed | The number of rows processed during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_serializable_aborts | The number of serializable aborts during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_sorts | The number of sorts during the collection interval (delta value). Type: int | (count) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| delta_user_io_wait_time | The user I/O wait time (microseconds) during the collection interval (delta value). Type: int | (count) Unit: time,μs Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| disk_sorts | Disk sorts per second Type: float | (gauge) Unit: count Tagged by: pdb_name |
| enqueue_timeouts | Enqueue timeouts per second Type: float | (gauge) Unit: count Tagged by: pdb_name |
| execute_without_parse | Execute without parse ratio Type: float | (gauge) Unit: count Tagged by: pdb_name |
| gc_cr_block_received | GC CR block received Type: float | (gauge) Unit: count Tagged by: pdb_name |
| host_cpu_utilization | Host CPU utilization (%) Type: float | (gauge) Unit: percent,percent Tagged by: pdb_name |
| in_use | Percentage of used space,as a function of the maximum possible Tablespace size Type: float | (gauge) Unit: percent,percent Tagged by: pdb_name, tablespace_name |
| library_cachehit_ratio | Ratio of library cache hits Type: float | (gauge) Unit: percent,percent Tagged by: pdb_name |
| logical_reads | Logical reads per second Type: float | (gauge) Unit: count Tagged by: pdb_name |
| logons | Number of logon attempts Type: float | (gauge) Unit: count Tagged by: pdb_name |
| memory_sorts_ratio | Memory sorts ratio Type: float | (gauge) Unit: percent,percent Tagged by: pdb_name |
| off_use | Total space consumed by the Tablespace, in database blocks Type: float | (gauge) Unit: digital,B Tagged by: pdb_name, tablespace_name |
| pga_alloc_mem | PGA memory allocated by process Type: float | (gauge) Unit: digital,B Tagged by: pdb_name, program |
| pga_freeable_mem | PGA memory freeable by process Type: float | (gauge) Unit: digital,B Tagged by: pdb_name, program |
| pga_max_mem | PGA maximum memory ever allocated by process Type: float | (gauge) Unit: digital,B Tagged by: pdb_name, program |
| pga_over_allocation_count | Over-allocating PGA memory count Type: float | (gauge) Unit: count Tagged by: pdb_name |
| pga_used_mem | PGA memory used by process Type: float | (gauge) Unit: digital,B Tagged by: pdb_name, program |
| physical_reads | Physical reads per second Type: float | (gauge) Unit: count Tagged by: pdb_name |
| physical_reads_direct | Physical reads direct per second Type: float | (gauge) Unit: count Tagged by: pdb_name |
| physical_writes | Physical writes per second Type: float | (gauge) Unit: count Tagged by: pdb_name |
| pid | Oracle process identifier Type: int | (gauge) Unit: N/A Tagged by: pdb_name, program |
| redo_generated | Redo generated per second Type: float | (gauge) Unit: count Tagged by: pdb_name |
| redo_writes | Redo writes per second Type: float | (gauge) Unit: count Tagged by: pdb_name |
| rows_per_sort | Rows per sort Type: float | (gauge) Unit: count Tagged by: pdb_name |
| service_response_time | Service response time Type: float | (gauge) Unit: time,ms Tagged by: pdb_name |
| session_blocked_count | Number of sessions that are being blocked Type: int | (gauge) Unit: count Tagged by: cdb_name, client, event, pdb_name, program, session_status, username, wait_class, wait_group |
| session_blocking_count | Number of sessions that are blocking other sessions Type: int | (gauge) Unit: count Tagged by: cdb_name, client, event, pdb_name, program, session_status, username, wait_class, wait_group |
| session_count | Session count Type: float | (gauge) Unit: count Tagged by: pdb_name |
| session_group_count | Number of sessions in this dimension group Type: int | (gauge) Unit: count Tagged by: cdb_name, client, event, pdb_name, program, session_status, username, wait_class, wait_group |
| session_limit_usage | Session limit usage Type: float | (gauge) Unit: percent,percent Tagged by: pdb_name |
| session_total_wait_time | Total wait time for sessions in this group (milliseconds). Note: Only active sessions have this data. Type: int | (gauge) Unit: time,ms Tagged by: cdb_name, client, event, pdb_name, program, session_status, username, wait_class, wait_group |
| sharable_mem | The amount of sharable memory (bytes) used by the cursor. Type: int | (gauge) Unit: digital,B Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| shared_pool_free | Shared pool free memory % Type: float | (gauge) Unit: percent,percent Tagged by: pdb_name |
| soft_parse_ratio | Soft parse ratio Type: float | (gauge) Unit: percent,percent Tagged by: pdb_name |
| sorts_per_user_call | Sorts per user call Type: float | (gauge) Unit: count Tagged by: pdb_name |
| temp_space_used | Temp space used Type: float | (gauge) Unit: digital,B Tagged by: pdb_name |
| total_buffer_gets | The total number of buffer gets (cumulative value from Oracle). Type: int | (gauge) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| total_cpu_time | The total CPU time (microseconds) consumed by query executions (cumulative value from Oracle). Type: int | (gauge) Unit: time,μs Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| total_elapsed_time | The total elapsed time (microseconds) for query executions (cumulative value from Oracle). Type: int | (gauge) Unit: time,μs Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| total_executions | The total number of query executions (cumulative value from Oracle). Type: int | (gauge) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| total_rows_processed | The total number of rows processed (cumulative value from Oracle). Type: int | (gauge) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| ts_size | Table space size Type: float | (gauge) Unit: digital,B Tagged by: pdb_name, tablespace_name |
| typecheck_mem | The amount of typecheck memory (bytes) used by the cursor. Type: int | (gauge) Unit: digital,B Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| uptime | Instance uptime Type: int | (gauge) Unit: time,s Tagged by: pdb_name |
| used_space | Used space Type: float | (gauge) Unit: digital,B Tagged by: pdb_name, tablespace_name |
| user_rollbacks | Number of user rollbacks Type: float | (gauge) Unit: count Tagged by: pdb_name |
| version_count | The number of versions of the cursor in the shared pool. Type: int | (gauge) Unit: count Tagged by: cdb_name, con_id, force_matching_signature, pdb_name, plan_hash_value, query_signature |
| waiting_session_count | Locked session count Type: int | (gauge) Unit: count Tagged by: event |
collector¶
| Tags & Fields | Description |
|---|---|
| instance ( tag) |
Server addr of the instance |
| job ( tag) |
Server name of the instance |
| up | Type: int | (gauge) Unit: - |
对象¶
database¶
Oracle object metrics( Version-1.77.0)
| Tags & Fields | Description |
|---|---|
| database_instance ( tag) |
Oracle instance identifier from configured tag or v$instance.host_name. |
| database_type ( tag) |
The type of the database. The value is Oracle |
| host ( tag) |
The hostname of the Oracle server |
| name ( tag) |
The object identifier. The value is <server>-<database_instance> |
| port ( tag) |
The port of the Oracle server |
| server ( tag) |
The address of the server. The value is host:port |
| version ( tag) |
The version of the Oracle server |
| avg_query_time | The average time taken by a query to execute Type: float Unit: timeStamp,usec |
| qps | The number of queries executed by the database per second Type: float Unit: gauge |
| slow_queries | The number of queries that have taken more than slow_query_time.Type: int Unit: count |
| slow_query_log | Whether the slow query log is enabled according to whether slow_query_time is greater than 0 . The value can be OFF to disable the log or ON to enable the log.Type: string Unit: - |
| tps | The number of transactions executed by the database per second Type: float Unit: gauge |
| uptime | The number of seconds that the server has been up Type: int Unit: time,s |
db_query¶
Oracle DBM query objects. Each object represents a unique SQL query identified by server:database_instance:query_signature, containing the obfuscated SQL text.
| Tags & Fields | Description |
|---|---|
| cdb_name ( tag) |
The name of the CDB (Container Database) |
| con_id ( tag) |
The container ID (con_id) in Oracle multi tenant architecture |
| database_instance ( tag) |
Oracle instance identifier from configured tag or v$instance.host_name. |
| database_type ( tag) |
The type of the database. The value is Oracle |
| force_matching_signature ( tag) |
The force matching signature of the query |
| name ( tag) |
Object identifier generated from server:database_instance:query_signature |
| pdb_name ( tag) |
The name of the PDB (Pluggable Database) |
| query_signature ( tag) |
Hash signature generated from pdb_name:query_hash to link metrics and objects |
| server ( tag) |
The server address (host:port) |
| sql_id ( tag) |
The SQL ID of the query |
| message | The obfuscated/normalized SQL text (full text) Type: string Unit: - |
db_exec_plan¶
Oracle DBM plan objects. Each object represents a unique execution plan identified by server:database_instance:query_signature:plan_hash_value, containing the obfuscated plan content.
| Tags & Fields | Description |
|---|---|
| cdb_name ( tag) |
The name of the CDB (Container Database) |
| con_id ( tag) |
The container ID (con_id) in Oracle multi tenant architecture |
| database_instance ( tag) |
Oracle instance identifier from configured tag or v$instance.host_name. |
| database_type ( tag) |
The type of the database. The value is Oracle |
| force_matching_signature ( tag) |
The force matching signature of the query |
| name ( tag) |
Object identifier generated from server:database_instance:query_signature:plan_hash_value |
| pdb_name ( tag) |
The name of the PDB (Pluggable Database) |
| plan_hash_value ( tag) |
The hash value of the query execution plan |
| plan_type ( tag) |
The format of the plan content. The value is JSON |
| query_signature ( tag) |
Hash signature generated from pdb_name:query_hash to link metrics and objects |
| server ( tag) |
The server address (host:port) |
| sql_id ( tag) |
The SQL ID of the query |
| message | The obfuscated/normalized execution plan content (full content) Type: string Unit: - |
| optimizer_mode | The optimizer mode used for the execution plan Type: string Unit: - |
| other | Other information about the execution plan Type: string Unit: - |
| timestamp | The timestamp when the execution plan was created Type: string Unit: - |
慢查询采集¶
DataKit 可以将执行超过用户自定义时间的 SQL 语句报告给观测云,在日志中显示,来源名是 oracle_log。
该功能默认情况下是关闭的,用户可以在 Oracle 的配置文件中将其打开,方法如下:
将 slow_query_time 的值从 0s 改成用户心中的阈值,最小值 1 毫秒。一般推荐 10 秒。
字段说明
avg_elapsed: 该 SQL 语句执行的平均耗时username:执行该语句的用户名failed_obfuscate:SQL 脱敏失败的原因。只有在 SQL 脱敏失败才会出现。SQL 脱敏失败后原 SQL 会被上报- 如果
slow_query_time为空或小于 1 毫秒,则不会开启 Oracle 采集器的慢查询功能,即默认状态 - 没有执行完成的 SQL 语句不会被查询到
更多字段解释可以查看这里。
日志¶
oracle_log¶
For full and detailed field into, see here
| Tags & Fields | Description |
|---|---|
| action ( tag) |
Contains the name of the action that was executing when the SQL statement was first parsed as set by calling DBMS_APPLICATION_INFO.SET_ACTION |
| command_type ( tag) |
Oracle command type definition |
| database_instance ( tag) |
Oracle instance identifier from configured tag or v$instance.host_name. |
| module ( tag) |
Contains the name of the module that was executing when the SQL statement was first parsed as set by calling DBMS_APPLICATION_INFO.SET_MODULE |
| oracle_server ( tag) |
Server addr. Deprecated. Please use server |
| server ( tag) |
The address of the server. The value is host:port |
| sql_id ( tag) |
SQL identifier of the parent cursor in the library cache |
| status ( tag) |
Log level, always warning here |
| application_wait_time | Application wait time (in microseconds) Type: int Unit: time,μs |
| avg_elapsed | Average elapsed time of executions(elapsed_time/executions)Type: int Unit: time,μs |
| buffer_gets | Sum of buffer gets over all child cursors Type: int Unit: count |
| cluster_wait_time | Cluster wait time (in microseconds) Type: int Unit: time,μs |
| concurrency_wait_time | Concurrency wait time (in microseconds) Type: int Unit: time,μs |
| cpu_time | CPU time (in microseconds) used by this cursor for parsing, executing, and fetching Type: int Unit: time,μs |
| disk_reads | Sum of the number of disk reads over all child cursors Type: int Unit: count |
| elapsed_time | Elapsed time (in microseconds) used by this cursor for parsing, executing, and fetching. If the cursor uses parallel execution, then ELAPSED_TIME is the cumulative time...Type: int Unit: time,μs |
| executions | Total number of executions, totalled over all the child cursors Type: int Unit: count |
| last_active_time | Time at which the query plan was last active Type: string Unit: N/A |
| message | JSON dump of all queried fields of table V$SQLAREAType: string Unit: N/A |
| parse_calls | Sum of all parse calls to all the child cursors under this parent Type: int Unit: count |
| parsing_schema_name | Schema name that was used to parse this child cursor Type: string Unit: N/A |
| plan_hash_value | Numeric representation of the current SQL plan for this cursor. Comparing one PLAN_HASH_VALUE to another easily identifies whether or not two plans are the same (rather than comparing the two plans line by line).Type: int Unit: N/A |
| rows_processed | Total number of rows processed on behalf of this SQL statement Type: int Unit: count |
| sorts | Sum of the number of sorts that were done for all the child cursors Type: int Unit: count |
| sql_fulltext | All characters of the SQL text for the current cursor Type: string Unit: N/A |
| user_io_wait_time | User I/O Wait Time (in microseconds) Type: int Unit: time,μs |
| username | Name of the user Type: string Unit: N/A |
oracle_dbm_activity¶
Collect the currently active queries, session information, wait events, and blocking information for Oracle.
| Tags & Fields | Description |
|---|---|
| action ( tag) |
The action name |
| cdb_name ( tag) |
The name of the CDB (Container Database) |
| con_id ( tag) |
The container ID (con_id) in Oracle multi tenant architecture |
| database_instance ( tag) |
Oracle instance identifier from configured tag or v$instance.host_name. |
| force_matching_signature ( tag) |
The force matching signature |
| machine ( tag) |
The machine name |
| module ( tag) |
The module name |
| pdb_name ( tag) |
The name of the PDB (Pluggable Database) |
| plan_hash_value ( tag) |
The plan hash value |
| program ( tag) |
The program name |
| query_signature ( tag) |
Hash signature generated from normalized SQL text to link metrics and objects |
| server ( tag) |
The server address (host:port) |
| service_name ( tag) |
The service name |
| session_status ( tag) |
The session status (ACTIVE, INACTIVE, KILLED, etc.) |
| session_type ( tag) |
The session type (USER, BACKGROUND) |
| sql_id ( tag) |
The SQL ID |
| terminal ( tag) |
The terminal name |
| username ( tag) |
The Oracle username |
| wait_class ( tag) |
The wait event class. |
| wait_group ( tag) |
Datakit unified wait group: Lock, I/O, Concurrency, Memory, Network, CPU, Commit/Log, Other. |
| blocking_instance | The blocking instance ID (for RAC). 0 means not blocked. Type: int | (gauge) Unit: count |
| blocking_session_id | The blocking session ID. 0 means not blocked. Type: int | (gauge) Unit: count |
| client_identifier | The client identifier Type: string | (string) Unit: N/A |
| client_info | The client information set by DBMS_APPLICATION_INFO Type: string | (string) Unit: N/A |
| client_port | The client TCP port Type: string | (string) Unit: N/A |
| command_name | The command name (e.g., SELECT, INSERT, UPDATE) Type: string | (string) Unit: N/A |
| event | The wait event name Type: string | (string) Unit: N/A |
| final_blocking_instance | The final blocking instance ID in the blocking chain (for RAC). 0 means not blocked. Type: int | (gauge) Unit: count |
| final_blocking_session_id | The final blocking session ID in the blocking chain. 0 means not blocked. Type: int | (gauge) Unit: count |
| logon_time | The time when the session logged on Type: string | (string) Unit: N/A |
| message | The text of the normalized/obfuscated SQL text Type: string | (string) Unit: N/A |
| process | The client process ID Type: string | (string) Unit: N/A |
| serial_number | The session serial number (SERIAL#) Type: int | (gauge) Unit: count |
| session_id | The session ID (SID) Type: int | (gauge) Unit: count |
| sql_exec_start | The time when the SQL statement started executing Type: string | (string) Unit: N/A |
| wait_time | The wait time in microseconds Type: int | (gauge) Unit: time,μs |
FAQ¶
如何查看 Oracle 采集器的运行日志?¶
由于 Oracle 采集器是外部采集器,其日志是默认单独存放在 [DataKit 安装目录]/externals/oracle.log 中。
另外,可以在配置文件中通过 --log 参数来指定日志文件位置。
自 DataKit 1.32.0 开始,Oracle 采集器不再是外部采集器,它的日志跟 DataKit 自身日志(Linux 下为 /var/log/datakit/log)在一起。
为何 monitor 中无数据显示?¶
该问题只有当 Oracle 是外部采集器(需要依赖当前环境的动态库)时才会发生,大概原因有如下几种可能:
- Oracle 动态库依赖有问题
即使你本机当前可能已经有对应的 Oracle 包,仍然建议使用上面文档中指定的依赖包且确保其安装路径跟 LD_LIBRARY_PATH 所指定的路径一致。
- glibc 版本有问题
由于 Oracle 采集器是独立编译的,且开启了 CGO,故其运行时需要 glibc 的依赖在 Linux 上可通过如下命令检查当前机器的 glibc 依赖是否有问题:
$ ldd <DataKit 安装目录>/externals/oracle
linux-vdso.so.1 (0x00007ffed33f9000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f70144e1000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f70144be000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f70142cc000)
/lib64/ld-linux-x86-64.so.2 (0x00007f70144fc000)
如果有报告如下信息,则基本是当前机器上的 glibc 版本较低导致:
- Oracle 采集器只能在 Linux x86_64/ARM64 架构的 DataKit 使用,其它平台均不支持
这意味着 Oracle 这个采集器只能在 x86_64/ARM64 的 Linux 上运行,其它平台一律无法运行当前的 Oracle 采集器。
为什么看不到 oracle_system 指标集?¶
需要数据库运行起来之后,过 1 分钟才能看到。