
IPMI




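IPMI metrics show the current, voltage, power consumption, usage, fan speed, temperature, and device status of the monitored equipment.

IPMI (Intelligent Platform Management Interface) is an industry standard for managing the peripheral devices used in enterprise systems based on Intel architecture. The standard was developed by Intel, HP, NEC, Dell, SuperMicro, and other companies. With IPMI, users can monitor the physical health of a server, such as temperature, voltage, fan status, and power supply status.

IPMI lets an operations system obtain health metrics from monitored servers and other devices without installing anything on them, which helps keep information secure.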

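Configuration

Prerequisites

  • Install the ipmitool package

DataKit collects IPMI data through the ipmitool utility, so the utility must be installed on the machine. Install it with one of the following commands: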

# CentOS
yum -y install ipmitool

# Ubuntu
sudo apt-get update && sudo apt-get -y install ipmitool

# macOS
brew install ipmitool

  • Load the kernel modules
modprobe ipmi_msghandler
modprobe ipmi_devintf

Once installation succeeds, run the following command to view the output from the IPMI server:

ipmitool -I lanplus -H <IP address> -U <username> -P <password> sdr elist

SEL              | 72h | ns  |  7.1 | No Reading
Intrusion        | 73h | ok  |  7.1 | 
Fan1A RPM        | 30h | ok  |  7.1 | 2160 RPM
Fan2A RPM        | 32h | ok  |  7.1 | 2280 RPM
Fan3A RPM        | 34h | ok  |  7.1 | 2280 RPM
Fan4A RPM        | 36h | ok  |  7.1 | 2400 RPM
Fan5A RPM        | 38h | ok  |  7.1 | 2280 RPM
Fan6A RPM        | 3Ah | ok  |  7.1 | 2160 RPM
Inlet Temp       | 04h | ok  |  7.1 | 23 degrees C
Exhaust Temp     | 01h | ok  |  7.1 | 37 degrees C
Temp             | 0Fh | ok  |  3.2 | 45 degrees C
... more
Attention
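  1. The IP address is the address of the IPMI port on the server you are managing remotely.
  2. On the server, "IPMI Settings -> Enable IPMI over LAN" must be checked.
  3. The server's "Channel Privilege Level Limit" (for example, Operator) must match the privilege level of the user given by the username.
  4. The ipmitool package is installed on the machine that runs DataKit.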
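
To make the output format above concrete, here is a minimal Python sketch that splits sdr elist lines into their five pipe-separated columns. The function name parse_sdr_elist and the dictionary keys are hypothetical names chosen for illustration; this is not DataKit's own parser.

```python
def parse_sdr_elist(text):
    """Split `ipmitool ... sdr elist` output into one dict per sensor row."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 5:
            continue  # skip anything that is not a 5-column sensor row
        name, sensor_id, status, entity, reading = parts
        rows.append({
            "name": name,        # e.g. "Fan1A RPM"
            "id": sensor_id,     # e.g. "30h"
            "status": status,    # e.g. "ok" or "ns"
            "entity": entity,    # e.g. "7.1"
            "reading": reading,  # e.g. "2160 RPM"; may be empty
        })
    return rows


# A few rows copied from the sample output above.
sample = """\
SEL              | 72h | ns  |  7.1 | No Reading
Intrusion        | 73h | ok  |  7.1 |
Fan1A RPM        | 30h | ok  |  7.1 | 2160 RPM
Inlet Temp       | 04h | ok  |  7.1 | 23 degrees C
"""

for row in parse_sdr_elist(sample):
    print(row["name"], row["status"], row["reading"])
```

Rows that do not have exactly five columns (such as the "... more" placeholder) are skipped rather than treated as errors.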

Go to the conf.d/ipmi directory under the DataKit installation directory, copy ipmi.conf.sample, and name the copy ipmi.conf. An example follows:

[[inputs.ipmi]]
  ## If you have so many servers that collection can't finish within 10 seconds,
  ## you can start multiple collectors.

  ## (Optional) Collect interval: (defaults to "10s").
  interval = "10s"

  ## Set true to enable election
  election = true

  ## The binPath of ipmitool
  ## (Example) bin_path = "/usr/bin/ipmitool"
  bin_path = "/usr/bin/ipmitool"

  ## (Optional) The envs of LD_LIBRARY_PATH
  ## (Example) envs = [ "LD_LIBRARY_PATH=XXXX:$LD_LIBRARY_PATH" ]

  ## The ips of ipmi servers
  ## (Example) ipmi_servers = ["192.168.1.1"]
  ipmi_servers = ["192.168.1.1"]

  ## The interfaces of ipmi servers: (defaults to []string{"lan"}).
  ## If len(ipmi_interfaces)<len(ipmi_servers), will use ipmi_interfaces[0].
  ## (Example) ipmi_interfaces = ["lanplus"]
  ipmi_interfaces = ["lanplus"]

  ## The users name of ipmi servers: (defaults to []string{}).
  ## If len(ipmi_users)<len(ipmi_servers), will use ipmi_users[0].
  ## (Example) ipmi_users = ["root"]
  ## (Warning!) You'd better use hex_keys, it's more secure.
  ipmi_users = ["root"]

  ## The passwords of ipmi servers: (defaults to []string{}).
  ## If len(ipmi_passwords)<len(ipmi_servers), will use ipmi_passwords[0].
  ## (Example) ipmi_passwords = ["calvin"]
  ## (Warning!) You'd better use hex_keys, it's more secure.
  ipmi_passwords = ["calvin"]

  ## (Optional) Provide the hex key for the IPMI connection: (defaults to []string{}).
  ## If len(hex_keys)<len(ipmi_servers), will use hex_keys[0].
  ## (Example) hex_keys = ["XXXX"]
  # hex_keys = []

  ## (Optional) Schema Version: (defaults to [1]).
  ## If len(metric_versions)<len(ipmi_servers), will use metric_versions[0].
  ## (Example) metric_versions = [2]
  metric_versions = [2]

  ## (Optional) Exec ipmitool timeout: (defaults to "5s").
  timeout = "5s"

  ## (Optional) Ipmi server drop warning delay: (defaults to "300s").
  ## (Example) drop_warning_delay = "300s"
  drop_warning_delay = "300s"

  ## Key words of current.
  ## (Example) regexp_current = ["current"]
  regexp_current = ["current"]

  ## Key words of voltage.
  ## (Example) regexp_voltage = ["voltage"]
  regexp_voltage = ["voltage"]

  ## Key words of power.
  ## (Example) regexp_power = ["pwr","power"]
  regexp_power = ["pwr","power"]

  ## Key words of temp.
  ## (Example) regexp_temp = ["temp"]
  regexp_temp = ["temp"]

  ## Key words of fan speed.
  ## (Example) regexp_fan_speed = ["fan"]
  regexp_fan_speed = ["fan"]

  ## Key words of usage.
  ## (Example) regexp_usage = ["usage"]
  regexp_usage = ["usage"]

  ## Key words of count.
  ## (Example) regexp_count = []
  # regexp_count = []

  ## Key words of status.
  ## (Example) regexp_status = ["fan"]
  regexp_status = ["fan"]

[inputs.ipmi.tags]
  # some_tag = "some_value"
  # more_tag = "some_other_value"

After finishing the configuration, restart DataKit.
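
The comments in the sample configuration say that a per-server list shorter than the server list falls back to its first entry. The Python sketch below shows one plausible reading of that rule. The helper name pick is hypothetical, and the assumption that an in-range index uses its own entry is not spelled out in the sample, so treat this as an illustration rather than DataKit's actual logic.

```python
def pick(values, index, default=None):
    """Return values[index] if it exists, otherwise values[0].

    Assumption: an in-range index uses its own entry; only a missing
    entry falls back to the first one. An empty list yields `default`.
    """
    if index < len(values):
        return values[index]
    if values:
        return values[0]
    return default


# Example values only; the second and third addresses are made up.
servers = ["192.168.1.1", "192.168.1.2", "192.168.1.3"]
users = ["root"]            # one user shared by all three servers
interfaces = ["lanplus"]    # one interface shared by all three servers

for i, server in enumerate(servers):
    print(server, pick(interfaces, i, "lan"), pick(users, i))
```

With a single entry in each list, every server ends up with the same interface and user, which matches the single-element examples in the sample configuration.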

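In Kubernetes, the configuration parameters can also be changed through environment variables. This takes effect only when DataKit runs as a K8s DaemonSet; DataKit deployed directly on a host does not support it: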

| Environment variable | Configuration parameter | Example |
| --- | --- | --- |
| ENV_INPUT_IPMI_TAGS | tags | tag1=value1,tag2=value2 (overrides a tag of the same name in the configuration file) |
| ENV_INPUT_IPMI_INTERVAL | interval | 10s |
| ENV_INPUT_IPMI_TIMEOUT | timeout | 5s |
| ENV_INPUT_IPMI_DEOP_WARNING_DELAY | drop_warning_delay | 300s |
| ENV_INPUT_IPMI_BIN_PATH | bin_path | "/usr/bin/ipmitool" |
| ENV_INPUT_IPMI_ENVS | envs | ["LD_LIBRARY_PATH=XXXX:$LD_LIBRARY_PATH"] |
| ENV_INPUT_IPMI_SERVERS | ipmi_servers | ["192.168.1.1"] |
| ENV_INPUT_IPMI_INTERFACES | ipmi_interfaces | ["lanplus"] |
| ENV_INPUT_IPMI_USERS | ipmi_users | ["root"] |
| ENV_INPUT_IPMI_PASSWORDS | ipmi_passwords | ["calvin"] |
| ENV_INPUT_IPMI_HEX_KEYS | hex_keys | ["50415353574F5244"] |
| ENV_INPUT_IPMI_METRIC_VERSIONS | metric_versions | [2] |
| ENV_INPUT_IPMI_REGEXP_CURRENT | regexp_current | ["current"] |
| ENV_INPUT_IPMI_REGEXP_VOLTAGE | regexp_voltage | ["voltage"] |
| ENV_INPUT_IPMI_REGEXP_POWER | regexp_power | ["pwr","power"] |
| ENV_INPUT_IPMI_REGEXP_TEMP | regexp_temp | ["temp"] |
| ENV_INPUT_IPMI_REGEXP_FAN_SPEED | regexp_fan_speed | ["fan"] |
| ENV_INPUT_IPMI_REGEXP_USAGE | regexp_usage | ["usage"] |
| ENV_INPUT_IPMI_REGEXP_COUNT | regexp_count | [] |
| ENV_INPUT_IPMI_REGEXP_STATUS | regexp_status | ["fan"] |

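
The example value shown for hex_keys, 50415353574F5244, is the ASCII string PASSWORD written in hexadecimal. A minimal Python sketch for producing a value in that form follows. The helper name to_hex_key is hypothetical, and whether a given BMC expects its key encoded this way is an assumption to check against that device's settings.

```python
def to_hex_key(key: str) -> str:
    """Encode an ASCII key as an uppercase hexadecimal string."""
    return key.encode("ascii").hex().upper()


print(to_hex_key("PASSWORD"))  # 50415353574F5244
```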
Configuration tips
  • Write all classification keywords in lowercase
  • Choose the keywords based on the data returned by the ipmitool -I ... command
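
As an illustration of these tips, the Python sketch below lowercases a sensor name and checks it against keyword lists like those in the sample configuration. The function name classify, the plain substring matching, and the first-match-wins order are assumptions made for illustration. DataKit's actual matching may differ (the regexp_ prefix of the parameters suggests regular expressions), and "Pwr Consumption" is a made-up sensor name.

```python
# Keyword lists mirroring the regexp_* values in the sample configuration.
KEYWORDS = {
    "current": ["current"],
    "voltage": ["voltage"],
    "power_consumption": ["pwr", "power"],
    "temp": ["temp"],
    "fan_speed": ["fan"],
    "usage": ["usage"],
}


def classify(sensor_name, keywords=KEYWORDS):
    """Return the first metric whose keyword occurs in the lowercased name."""
    name = sensor_name.lower()  # keywords are lowercase, so lowercase the name too
    for metric, words in keywords.items():
        if any(word in name for word in words):
            return metric
    return None  # no keyword matched


for sensor in ["Fan1A RPM", "Inlet Temp", "Pwr Consumption", "Intrusion"]:
    print(sensor, "->", classify(sensor))
```

Because the comparison is done on the lowercased sensor name, an uppercase keyword such as "Fan" would never match in this sketch, which is the reason for keeping keywords lowercase.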

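Metrics

By default, all data collected below carries a global tag named host (its value is the hostname of the machine running DataKit). Other tags can be added in the configuration through [inputs.ipmi.tags]: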

 [inputs.ipmi.tags]
  # some_tag = "some_value"
  # more_tag = "some_other_value"
  # ...
  • Tags

| Tag | Description |
| --- | --- |
| host | Monitored host name |
| unit | Unit name in the host |

  • Metrics

| Metric | Description | Type | Unit |
| --- | --- | --- | --- |
| count | Count. | int | count |
| current | Current. | float | ampere |
| fan_speed | Fan speed. | int | RPM |
| power_consumption | Power consumption. | float | watt |
| status | Status of the unit. | int | - |
| temp | Temperature. | float | C |
| usage | Usage. | float | percent |
| voltage | Voltage. | float | volt |
| warning | Warning on/off. | int | - |
