Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added SMART monitoring for disks #4

Closed
wants to merge 2 commits into from
Closed

Conversation

yooooki
Copy link
Contributor

@yooooki yooooki commented Jun 8, 2024

Added a feature to detect the disk failure based on smartmon (smartctl -H).
Turns off the led for empty disk slots, instead of turning red.

Added SMART monitoring based on smartctl. The led now turns red if the corresponding disk fails (smartctl -H does not show PASSED).
Optimized the pooling loop to only iterate the existing drives at the execution of the script
The led will be turned off if no disk in installed in the slot.
@meyergru
Copy link

meyergru commented Jun 8, 2024

There is a problem: I recently found that for my device (a DXP8800), the mapping of sda..sdh to the physical slots 1..8 is NOT 1:1, but sometimes random.

See my improved shell script in my fork on how the correct mapping can be done….

@yooooki
Copy link
Contributor Author

yooooki commented Jun 9, 2024

@meyergru
According to https://unix.stackexchange.com/questions/658998/how-the-h-c-t-l-numbers-read-from-in-lsscsi ,
H in HCTL for sata devices can change after reboot.
It's also not clear whether 0:0:0:0 is the first or the 4-th or the last drive, and whether this is consistent for all UGreen devices (e.g. 600 series and 800 series, plus and non-plus) since they may use different pcie to sata adapter.
A better solution might be mapping led to serial number of sata devices, but that would require install the disks once a time and manually configure them in the script, which is
The original script guarantees each led is mapped to different physical drives, and I think that's enough for casual use.

@meyergru
Copy link

meyergru commented Jun 9, 2024

No, as I said, it does not. It is actually misleading w/r to the Position of a defective Drive. As such, it is doing more harm than good.

Iff HCTL should be changing - and I cannot Imagine how it should - my approach would not work either, but using sda..sdh is VERIFIED to not work as it should, so the problem persists.

@miskcoo
Copy link
Owner

miskcoo commented Jun 9, 2024

Hi @yooooki, I've also encountered the same issue today, and with the new script using HCTL, the disks are correctly mapped to the LEDs in my 4600Pro. Maybe it is better to use HCTL, which is now confirmed to work under 4600Pro and 8800Plus. For other models, I do not know if these I2C commands are compatible for controlling LEDs, so maybe it is too early to consider the HCTL changing issue.

BTW, for this PR, I think you should move the counter cnt outside the for loop, and it would be good to remove the two echos of disk failure messages in the loop as their outputs will flood the log.

@yooooki yooooki closed this Jun 9, 2024
@yooooki
Copy link
Contributor Author

yooooki commented Jun 9, 2024

Since you both confirmed HCTL is more stable, I'll follow it. What I can find online is that every one says /dev/sdX is almost as unreliable as HCTL, and the "real" solution would be by-id or by-uuid, which I think is a little bit anonying for casual users as multiple reboots must be performed if those id are not listed on the package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants