[FEAT] non-smartctl support for 3rd party tools #746

Open
lars18th opened this issue Feb 4, 2025 · 4 comments

lars18th commented Feb 4, 2025

Is your feature request related to a problem? Please describe.
When using a RAID controller, smartctl only works if it supports that specific controller. However, tools created by the controller's manufacturer, like perccli or storcli (for MegaRAID), can provide the same information and export it in JSON format. It would be useful for the collector to support these other tools.

Describe the solution you'd like
Currently the collector has these configuration values:

#  metrics_smartctl_bin: 'smartctl' # change to provide custom `smartctl` binary path, eg. `/usr/sbin/smartctl`
#  metrics_scan_args: '--scan --json' # used to detect devices
#  metrics_info_args: '--info --json' # used to determine device unique ID & register device with Scrutiny
#  metrics_smart_args: '--xall --json' # used to retrieve smart data for each device.

So we can already call a different binary (metrics_smartctl_bin) and replace the args for the different tool. But in that case the collector also needs a different parser for the data retrieved from these other tools. More or less all the data is present, so it's only a matter of picking the available information out of the JSON structure produced by the alternative tool.
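To make the "different parser" idea concrete, here is a minimal sketch in Python (the collector itself is not written in Python, so this only illustrates the shape). The smartctl field names are the ones I know from its --json output; the vendor layout, the `Normalized` keys and every function name are hypothetical and would have to be replaced after looking at real perccli/storcli output.

```python
import json
from typing import Callable, Dict

# Common shape every parser returns, whatever tool produced the JSON.
# These key names are illustrative, not Scrutiny's real schema.
Normalized = Dict[str, object]


def parse_smartctl(doc: dict) -> Normalized:
    # Field names as found in smartctl's --json output.
    return {
        "serial": doc.get("serial_number"),
        "model": doc.get("model_name"),
        "temperature_c": doc.get("temperature", {}).get("current"),
        "power_on_hours": doc.get("power_on_time", {}).get("hours"),
    }


def parse_vendor_example(doc: dict) -> Normalized:
    # HYPOTHETICAL layout, invented for this sketch. Real perccli/storcli
    # JSON is structured differently and must be mapped from real samples.
    drive = doc.get("drive", {})
    return {
        "serial": drive.get("sn"),
        "model": drive.get("model"),
        "temperature_c": drive.get("temp_c"),
        "power_on_hours": drive.get("hours"),
    }


# One parser per supported tool; adding a tool means adding an entry here.
PARSERS: Dict[str, Callable[[dict], Normalized]] = {
    "smartctl": parse_smartctl,
    "vendor-example": parse_vendor_example,
}


def parse_output(tool: str, raw_json: str) -> Normalized:
    """Decode the tool's JSON and normalize it with the matching parser."""
    try:
        parser = PARSERS[tool]
    except KeyError:
        raise ValueError(f"no parser registered for {tool!r}") from None
    return parser(json.loads(raw_json))
```

With something like this, the rest of the collector would only ever see the normalized record, and the choice of parser could follow the configured binary.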

To do this in the simplest way, the idea is to collect the output of smartctl and perccli (or any other tool) on the same physical machine, all in JSON format, and then compare the results. After that, it should be easy to add the alternative parser.
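The comparison step could be partly automated. The sketch below (Python, all names hypothetical) flattens both JSON documents into path/value pairs and lists the paths whose values coincide, which gives a first guess at which field in the vendor output corresponds to which smartctl field. The two sample documents are made up just to show the call; they are not real tool output.

```python
from typing import Any, Dict, List, Tuple


def flatten(node: Any, prefix: str = "") -> Dict[str, Any]:
    """Map every leaf value of a nested JSON document to a dotted path."""
    if isinstance(node, dict):
        items = [(f"{prefix}.{k}" if prefix else str(k), v) for k, v in node.items()]
    elif isinstance(node, list):
        items = [(f"{prefix}[{i}]", v) for i, v in enumerate(node)]
    else:
        return {prefix: node}
    flat: Dict[str, Any] = {}
    for path, value in items:
        flat.update(flatten(value, path))
    return flat


def candidate_mappings(reference: dict, other: dict) -> List[Tuple[str, str]]:
    """Pair up paths whose leaf values are identical in both documents."""
    other_flat = flatten(other)
    pairs = []
    for ref_path, ref_value in flatten(reference).items():
        # Booleans and nulls match far too often to be useful hints.
        if ref_value is None or isinstance(ref_value, bool):
            continue
        for other_path, other_value in other_flat.items():
            if type(ref_value) is type(other_value) and ref_value == other_value:
                pairs.append((ref_path, other_path))
    return pairs


# Made-up sample documents, only to show the call.
smartctl_doc = {"serial_number": "ABC123", "temperature": {"current": 31}}
vendor_doc = {"Drive": {"SN": "ABC123", "Temp": 31, "Slot": 0}}
pairs = candidate_mappings(smartctl_doc, vendor_doc)
```

Equal values are only hints (small numbers like 0 or 100 will produce false pairs), so the result still needs a manual review before writing the parser.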

What do you think about this idea?

Additional context
I can share the JSON output of these commands:

  • perccli /c0 show all: full info of the controller and the virtual disks
  • perccli /c0 /eall /sall show smart: status of SMART data of all disks in the controller
  • perccli /c0 /eall /s0 show all: detailed info of disk 0

I hope you think this is a good idea.

@bashers222

bashers222 commented Feb 11, 2025

If doing 3rd party integrations, Dell's BOSS driver would be handy.
It needs to run the BOSS 'mvcli' command and pull stats like this:

cd /opt/BOSS/mvcli_5.0.13.1111_A00/x64/static
./mvcli smart -p 0

This is based on the Marvell RAID controller.
There isn't a whole lot of support out there, but a Nagios plug-in has been developed that can poll most RAID controllers, and it has support for the Dell BOSS driver: t/check_mvcli.t

Sample output for Port 0:


Smart Info
ID      Attribute Name                                  Current Worst   Threshhold      RawValue
01      Raw read error rate                             100     1       50              00000000216E
05      Reallocated block count                         100     100     5               000000000000
09      Power-on hours count                            100     100     0               000000006B54
0C      Power cycle count                               100     100     0               0000000002C3
AD      Per block max erase count                       78      78      5               00000000009E
B5      Program fail count (total)                      100     100     0               000000000000
C2      Drive temperature                               47      71      0               000000470A2F
C6      Offline scan uncorrectable LBA count            100     100     0               000000000000
C7      CRC error count                                 100     100     0               000000000003
C9      Volatile memory backup source failure           78      78      5               00160000009E
F1      NAND sectors written (total)                    100     100     0               00000044FEEE
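
Since this output is a plain-text table rather than JSON, a parser for it would look different. A rough sketch in Python, assuming the layout is always the one shown above (two-digit hex ID, attribute name, three decimal columns, hex raw value); the `SmartAttribute` and `parse_mvcli_smart` names are made up, and reading the raw value as one hexadecimal integer is my assumption, not something mvcli documents here.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class SmartAttribute:
    attr_id: int    # attribute ID, converted from hex (e.g. "C2" -> 194)
    name: str
    current: int
    worst: int
    threshold: int
    raw: int        # raw value, read as one hex integer (assumption)


# ID, name (lazy), current, worst, threshold, raw value.
ROW = re.compile(
    r"^([0-9A-F]{2})\s+(.+?)\s+(\d+)\s+(\d+)\s+(\d+)\s+([0-9A-F]+)$"
)


def parse_mvcli_smart(text: str) -> List[SmartAttribute]:
    """Turn the table printed by `mvcli smart -p N` into records."""
    attributes = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # title, header or blank line
        attr_id, name, current, worst, threshold, raw = match.groups()
        attributes.append(
            SmartAttribute(
                attr_id=int(attr_id, 16),
                name=name,
                current=int(current),
                worst=int(worst),
                threshold=int(threshold),
                raw=int(raw, 16),
            )
        )
    return attributes
```

For the sample above this would give, for example, ID 9 (Power-on hours count) with a raw value of 0x6B54 = 27476. Some raw values, like the temperature one, probably pack several fields and would need per-attribute decoding.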

@lars18th
Author

Hi @bashers222 ,

Thank you for your comment. It seems this idea is useful for several users, and there are different tools to support.

Therefore, perhaps it's time to ask @AnalogJ whether he thinks it's a good idea to support it, and what the best roadmap would be.

@bashers222

probably best i suppose
I've created a feature request

@lars18th
Author

probably best i suppose I've created a feature request

Hi @bashers222 ,

Your issue #749 is the same as this one. I recommend closing that one to keep the repository clean.

In fact, the question to resolve right now is: does it make sense to support external source tools?

If the answer from @AnalogJ's point of view is that it makes sense, then he will provide more information on better ways to support it. In the meantime it's preferable to wait for an official response.

Or that's my point of view.
Regards.
