-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PMON Voltage/Current Sensor Monitoring Enhancement #1394
Conversation
@bmridul why not use the existing
|
Hi @prgeor, Please note that the solution proposed is not mutually exclusive to Sensord/Lmsensors. The solution proposed here can be used alongside or without Sensord. There are 2 main reasons we are proposing this.
However, the issue is that there are a number of devices which report voltage/current measurements that are not supported by hwmon. An example is device LTC2497 which can report voltage measurements. The open source driver available for this device supports a different linux infrastructure – Industrial I/O The driver code is here : https://github.com/torvalds/linux/blob/master/drivers/iio/adc/ltc2497.c This device will not be monitored by hwmon unless we patch its standard kernel driver which is not something we would want to do. Sensormon can cover this gap and provide one CLI/rpcoess to view/monitor all the voltage and current sensors.
|
@bmridul I am still not quite convinced the use of sensormon, since devices using IIO is mostly used in consumer products like laptop, mobile phones. The operating range of the chip is best known to the chip manufacturer and hence its better to depend upon the chip based alarm rather than platform integrator arriving at the customized threshold to raise alarms. As you rightly pointed out, hwmon already includes almost all sensors barring few exceptions, so I would prefer if you make effort to include these few exception devices included in hwmon support. Nevertheless, please capture the reasons in the HLD for sensormon. |
@prgeor , Thanks for your comments. I will update the HLD with more details. Few responses
The uber point is that the sensord may not provide complete coverage. If Linux Sensord is good enough for a platform, it can choose not to use this infra. We can discuss further in community meeting. |
As discussed during the community call today -- please explore integration with SONiC PDDF (https://github.com/sonic-net/SONiC/blob/master/doc/platform/brcm_pdk_pddf.md) so that platform vendors can easily integrate support for voltage/current sensors using their PDDF JSON files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few general comments:
- How is the daemon connect to system health service?
- How is the daemon connect to snmp?
- How to build the sensor hierarchy relation? For example, should PSU voltage sensor object under PSU object or under chassis object?
- There is already logic in PSUD to handle PSU voltage threshold, do you plan to remove it?
- How to handle hot swappable sensors? For example, PSU sensor, module sensor?
I looked briefly into the PDDF documentation and some code. It seems the PDDF support for voltage and current sensors can be added in a similar way as as other platform components are supported. E,g, The code below is implementing the platform support for thermal sensors. Voltage and current sensors support can be similarly added. |
Added support for voltage and current sensor monitoring CLIs as mentioned in the feature HLD for PMON Voltage/Current Sensor Monitoring Enhancement. sonic-net/SONiC#1394 * Addressed review comments * Fix review comment * UT file * Fixed dependency for running unit test on platform common changes * Fixed standalone unit test failure * Addressed review comment
Sensormond daemon is introduced. It collects the voltage and current sensor information from the platform and populates in the StateDB. The sensor data is available to view using CLIs. Sensormond raises syslogs when the sensors report measurement outside the thresholds. sonic-net/SONiC#1394
|
It would be good to show the out of out of range sensor readings in system health o/p. Currently out of range temperature sensors are also not shown in system health. We should add those as well. I have added PR for system health integration. |
SNMP support will need to be added to represent the voltage/current sensors in Entity MIB. We will need to address this with changes in rfc2737 implmentation. |
SNMP support PR added. |
@Junchao-Mellanox @jeff-yin @prgeor , Pls check if ur comments have been addressed. |
I didn't really see any updates with regards to PDDF specifically. If it's not going to be covered, please indicate that in the HLD and provide a path for integration in the future. |
Hi Jeff, I had replied to your comment above - #1394 (comment) Currently our platforms donot support PDDF infra so I am not sure how to test PDDF integration. However it seems straightforward to me. I can open a git issue for this and future integration can be tracked that way. LMK. |
Added support for voltage and current sensor monitoring CLIs as mentioned in the feature HLD for PMON Voltage/Current Sensor Monitoring Enhancement. sonic-net/SONiC#1394 * Addressed review comments * Fix review comment * UT file * Fixed dependency for running unit test on platform common changes * Fixed standalone unit test failure * Addressed review comment
Sure, that's fine. Please do create the issue to track it. |
Enable Sensormon daemon in PMON container. Pls see HLD : sonic-net/SONiC#1394
Enable Sensormon daemon in PMON container. Pls see HLD : sonic-net/SONiC#1394
Enable Sensormon daemon in PMON container. Pls see HLD : sonic-net/SONiC#1394
HLD is merged but code PRs are still open, move to backlog |
HLD for Voltage and Current sensor monitoring.
Related PRs: