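vsphere-monitor collects data from vSphere clusters through pyvmomi. It only needs a connection to vCenter to collect monitoring data for the ESXi hosts, datastores, VMs, and other objects in the cluster. The data is reported to open-falcon through open-falcon's push API.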
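As a rough illustration of the reporting step, the sketch below builds one data point in the JSON shape that open-falcon's push API accepts and shows how it could be posted. The helper names `build_payload` and `push` are hypothetical and are not taken from vsphere-monitor itself; the sample values are made up.

```python
import json
import time


def build_payload(endpoint, metric, value, step=60, tags="", ts=None):
    """Build one open-falcon data point as a plain dict (hypothetical helper)."""
    return {
        "endpoint": endpoint,
        "metric": metric,
        "timestamp": int(ts if ts is not None else time.time()),
        "step": step,
        "value": value,
        "counterType": "GAUGE",
        "tags": tags,
    }


def push(push_api, points, timeout=5):
    """POST a list of data points as JSON. Defined here but not called."""
    try:
        from urllib.request import Request, urlopen  # Python 3
    except ImportError:
        from urllib2 import Request, urlopen  # Python 2.7
    req = Request(
        push_api,
        data=json.dumps(points).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urlopen(req, timeout=timeout).read()


# Example data point with an assumed VM name and a fixed timestamp.
points = [build_payload("vcenter", "vm.power", 1, tags="vm=vm1", ts=1500000000)]
print(json.dumps(points, sort_keys=True))
```

Calling `push("http://127.0.0.1:6060/api/push", points)` would send the batch; the URL here simply mirrors the `push_api` value from the configuration file.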
Requires Python 2.7.
metric | tag | type | note |
---|---|---|---|
datastore.capacity | datacetner=datacenter,datastore=datastore,type=type | GAUGE | Datastore capacity |
datastore.free | datacetner=datacenter,datastore=datastore,type=type | GAUGE | Datastore free capacity |
datastore.freePercent | datacetner=datacenter,datastore=datastore,type=type | GAUGE | Datastore free capacity as a percentage |
esxi.alive | datacetner=datacenter,cluster_name=cluster_name,host=host | GAUGE | ESXi liveness; the value is 1; can be used for Nodata |
esxi.net.if.in | datacetner=datacenter,cluster_name=cluster_name,host=host | GAUGE | ESXi inbound network traffic (sum over all NICs) |
esxi.net.if.out | datacetner=datacenter,cluster_name=cluster_name,host=host | GAUGE | ESXi outbound network traffic (sum over all NICs) |
esxi.memory.freePercent | datacetner=datacenter,cluster_name=cluster_name,host=host | GAUGE | ESXi free memory as a percentage |
esxi.memory.usage | datacetner=datacenter,cluster_name=cluster_name,host=host | GAUGE | ESXi memory in use |
esxi.memory.capacity | datacetner=datacenter,cluster_name=cluster_name,host=host | GAUGE | ESXi total memory |
esxi.cpu.usage | datacetner=datacenter,cluster_name=cluster_name,host=host | GAUGE | ESXi CPU utilization |
esxi.uptime | datacetner=datacenter,cluster_name=cluster_name,host=host | GAUGE | ESXi uptime |
vm.power | vm=vm_name | GAUGE | Whether the VM is powered on: on = 1, off = 0; can be used for Nodata |
vm.net.if.in | vm=vm_name | GAUGE | VM inbound network traffic (sum over all NICs) |
vm.net.if.out | vm=vm_name | GAUGE | VM outbound network traffic (sum over all NICs) |
vm.datastore.io.write_latency | vm=vm_name | GAUGE | VM storage I/O write latency |
vm.datastore.io.read_latency | vm=vm_name | GAUGE | VM storage I/O read latency |
vm.datastore.io.write_numbers | vm=vm_name | GAUGE | VM storage write IOPS |
vm.datastore.io.read_numbers | vm=vm_name | GAUGE | VM storage read IOPS |
vm.datastore.io.write_bytes | vm=vm_name | GAUGE | VM storage write throughput |
vm.datastore.io.read_bytes | vm=vm_name | GAUGE | VM storage read throughput |
vm.memory.freePercent | vm=vm_name | GAUGE | VM free memory as a percentage |
vm.memory.usage | vm=vm_name | GAUGE | VM memory in use |
vm.memory.capacity | vm=vm_name | GAUGE | VM total memory |
vm.cpu.usage | vm=vm_name | GAUGE | VM CPU usage |
vm.uptime | vm=vm_name | GAUGE | VM uptime |
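To make the tag column and the `freePercent` metrics concrete, here is a minimal sketch of how a tag string in the format shown above could be assembled and how a free percentage could be derived from a free amount and a capacity. Both helpers (`format_tags`, `free_percent`) and the sample values are hypothetical; the actual tool may compute these differently.

```python
def format_tags(pairs):
    """Join (key, value) pairs into a comma-separated key=value tag string."""
    return ",".join("%s=%s" % (key, value) for key, value in pairs)


def free_percent(free, capacity):
    """Return free as a percentage of capacity; 0.0 if capacity is zero."""
    if not capacity:
        return 0.0
    return 100.0 * free / capacity


# Tag keys follow the table above; the values are made-up examples.
tags = format_tags([("datacetner", "dc1"), ("datastore", "ds1"), ("type", "VMFS")])
print(tags)
print(free_percent(250, 1000))
```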
Get the code
git clone https://github.com/freedomkk-qfeng/vsphere-monitor.git
Install the dependencies
yum install -y python-virtualenv
cd vsphere-monitor
virtualenv ./env
./env/bin/pip install -r requirement.txt
Edit the configuration file config.py
# falcon
endpoint = "vcenter" # endpoint name reported to open-falcon
push_api = "http://127.0.0.1:6060/api/push" # HTTP API used for pushing data
interval = 60 # reporting step, in seconds
# vcenter
host = "vcenter.host" # vCenter address
user = "administrator@vsphere.local" # vCenter user name
pwd = "password" # vCenter password
port = 443 # vCenter port
# esxi
esxi_names = [] # ESXi hosts to collect; leave empty to collect all
# datastore
datastore_names = [] # datastores to collect; leave empty to collect all
# vm
vm_enable = True # whether to collect virtual machine data
vm_names = [ # virtual machines to collect; leave empty to collect all
"vm1",
"vm2",
"vm3"
]
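The "leave empty to collect all" rule used by `esxi_names`, `datastore_names`, and `vm_names` can be sketched as a simple filter. This is an illustration of the documented behavior only; the `select` helper and the sample inventory are hypothetical and not taken from the tool's source.

```python
def select(discovered, wanted, enabled=True):
    """Return the names to collect.

    An empty `wanted` list means "collect everything"; `enabled=False`
    (as with vm_enable) disables collection for that object type.
    """
    if not enabled:
        return []
    if not wanted:
        return list(discovered)
    return [name for name in discovered if name in wanted]


inventory = ["vm1", "vm2", "vm3", "vm4"]  # made-up names found in vCenter
print(select(inventory, ["vm1", "vm3"]))
print(select(inventory, []))
print(select(inventory, [], enabled=False))
```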
Try a test run first, assuming vsphere-monitor is located under /opt:
/opt/vsphere-monitor/env/bin/python /opt/vsphere-monitor/vsphere-monitor.py
If it runs without problems, add it to cron:
crontab -e
0-59/1 * * * * /opt/vsphere-monitor/env/bin/python /opt/vsphere-monitor/vsphere-monitor.py
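- Some VMs may not report the `vm.net.if.in` and `vm.net.if.out` metrics. Network metrics are collected through vSphere's `PerformanceManager`, and for the affected VMs the performance charts in vCenter also show the error "未指定衡量指标" ("No Metric Specified"). This is a VMware bug that is fixed in vSphere 6.0 Update 1 and later. See VMware Knowledge Base article 2125021, which covers the "No Metric Specified" error shown when viewing real-time virtual machine network performance charts in VMware vSphere Client 6.0.
- Earlier vSphere versions require a specific pyvmomi version. See https://pypi.org/project/pyvmomi/, which states: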
Compatibility Policy

pyVmomi versions are marked vSphere_version-release. pyVmomi maintains minimum backward compatibility with the previous four releases of vSphere and its own previous four releases. Compatibility with much older versions may continue to work but will not be actively supported. For example, version v6.0.0 is most compatible with vSphere 6.0, 5.5, 5.1 and 5.0. Initial releases compatible with a version of vSphere will bear a naked version number of v6.0.0, indicating that version of pyVmomi was released simultaneously with the GA version of vSphere with the same version number.