[system-health] User-defined health checkers fail to start #12701

Open
antonptashnik opened this issue Nov 14, 2022 · 1 comment
Labels: NVIDIA, Triaged

Comments

@antonptashnik

Description

Per the HLD, a user can define their own checkers to be executed during health checks, in the form shown below, in the file /usr/share/sonic/device/<platform>/system_health_monitoring_config.json:

{
...
  "user_defined_checkers": ["program_name -option1 value1 -option2 value2"],
...
}

Any checker added this way fails to start.

Steps to reproduce the issue:

  1. Create a sample file at "~/checker_output.txt" with the content below, which follows the output format a user-defined checker should produce:
ExternalCategory
ExternalService:Service is not working
ExternalDevice:Device is broken
  2. Add a new checker that simply prints the created file, by adding cat ~/checker_output.txt to "user_defined_checkers" in /usr/share/sonic/device/<platform>/system_health_monitoring_config.json. For instance:
{
...
  "user_defined_checkers": ["cat ~/checker_output.txt"],
...
}
  3. Wait for a minute (the default health check interval) and check the results:
sudo show system-health detail

Describe the results you received:

No parsed check results are shown; an error is logged instead:

...
Reasons: Failed to get output of command "cat ~/checker_output.txt"
...

Describe the results you expected:

Parsed results are shown as:

ExternalService      Not OK    UserDefine
ExternalDevice       Not OK    UserDefine
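
To make the expected mapping concrete, here is a minimal sketch of how the sample checker output could be turned into the rows above. It assumes the format used in this report: the first line is the category, and each following line is `name:status`, where a status of exactly `OK` means healthy and anything else is treated as an error message. The function name `parse_checker_output` and the returned dictionary layout are hypothetical, not the actual system-health implementation.

```python
def parse_checker_output(output):
    """Parse user-defined checker output (hypothetical helper).

    Assumed format: first non-empty line is the category; every later
    line is "name:status". A status of exactly "OK" is healthy; any other
    text is treated as the failure message.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        return None  # nothing to parse

    category = lines[0]
    results = {}
    for line in lines[1:]:
        name, sep, status = line.partition(":")
        if not sep:
            continue  # skip malformed lines that lack a ":" separator
        status = status.strip()
        healthy = status == "OK"
        results[name.strip()] = {
            "status": "OK" if healthy else "Not OK",
            "message": "" if healthy else status,
            "type": "UserDefine",  # type column as shown in the expected output
        }
    return category, results


if __name__ == "__main__":
    sample = (
        "ExternalCategory\n"
        "ExternalService:Service is not working\n"
        "ExternalDevice:Device is broken\n"
    )
    category, results = parse_checker_output(sample)
    for name, info in results.items():
        print(name, info["status"], info["type"])
```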

Output of show version:

SONiC Software Version: SONiC.master.172539-dirty-20221110.095210
Distribution: Debian 11.5
Kernel: 5.10.0-12-2-amd64
Build commit: 7c746e67d
Build date: Thu Nov 10 16:18:12 UTC 2022
Built by: AzDevOps@sonic-build-workers-002D7H

Platform: x86_64-arista_7170_64c
HwSKU: Arista-7170-64C

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

Investigation showed that the issue is at https://github.com/sonic-net/sonic-buildimage/blob/master/src/system-health/health_checker/utils.py#L11 . Unless shell=True is set, subprocess.Popen requires a list containing the command name followed by its arguments, but a single string is provided instead, so on POSIX the whole string is treated as the name of the program to execute.
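
One possible direction for a fix, sketched under assumptions rather than taken from the repository: split the configured command string into an argument list with `shlex.split` before handing it to `subprocess.Popen`. The helper name `run_command` and its return convention (output string, or `None` on failure) are assumptions for illustration. Since no shell is involved, `~` would not be expanded either, so this sketch also applies `os.path.expanduser` to each argument; whether that is desirable is a design choice for the maintainers.

```python
import os
import shlex
import subprocess


def run_command(command):
    """Run a command given as a single string, without a shell.

    Hypothetical helper: returns the command's stdout as text, or None
    if the command could not be parsed or started.
    """
    try:
        # Split "prog -opt value" into ["prog", "-opt", "value"],
        # honouring shell-style quoting.
        argv = shlex.split(command)
        # Assumption: expand a leading "~" ourselves, since no shell does it.
        argv = [os.path.expanduser(arg) for arg in argv]
        process = subprocess.Popen(argv, text=True, stdout=subprocess.PIPE)
        return process.communicate()[0]
    except (OSError, ValueError, IndexError):
        # OSError: program not found or not executable.
        # ValueError: unbalanced quotes in the command string.
        # IndexError: empty command string (empty argv).
        return None
```

With this shape, a checker entry such as "cat ~/checker_output.txt" would be executed as `cat` with one expanded path argument, instead of being looked up as a program whose name contains spaces.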

@azure-pipelines-wrapper

Thanks for opening this issue!
