[Monit] Monitor multiple processes with the same name but using different arguments. #4257
different arguments. Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
name is valid or not. Signed-off-by: Yong Zhao <yozhao@microsoft.com>
@@ -0,0 +1,88 @@
#!/usr/bin/python
I think this file should be broken into two separate files, one for teamd and one for dhcp_relay. In the repo, the files should reside in the directories of their respective dockers.
I will break it into two separate files and place each one in its docker's directory in the repo.
teamd and dhcrelay processes. Signed-off-by: Yong Zhao <yozhao@microsoft.com>
check_teamd_processes. Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Since Monit can only monitor the process with unique name, it is unable to do
this monitoring for dhcrelay processes. Usually there will be multiple dhcrelay
processes which executes a same commad but with different arguments. The number
of dhcrelay processes is determined by Vlans which have non-empry list of dhcp servers.
s/non-empry/non-empty/
Fixed.
#!/usr/bin/python
'''
This script is used to monitor dhcrelay processes in dhcp_relay docker container.
Since Monit can only monitor the process with unique name, it is unable to do
s/the process with unique name/processes with unique names/
Reworded.
'''
This script is used to monitor dhcrelay processes in dhcp_relay docker container.
Since Monit can only monitor the process with unique name, it is unable to do
this monitoring for dhcrelay processes. Usually there will be multiple dhcrelay
s/Usually there will be multiple/There can exist multiple/
Reworded.
This script is used to monitor dhcrelay processes in dhcp_relay docker container.
Since Monit can only monitor the process with unique name, it is unable to do
this monitoring for dhcrelay processes. Usually there will be multiple dhcrelay
processes which executes a same commad but with different arguments. The number
s/commad/command/
Fixed.
processes which executes a same commad but with different arguments. The number
of dhcrelay processes is determined by Vlans which have non-empry list of dhcp servers.
As such, we let Monit to monitor this script which will read number of vlans with
no-empty list of dhcp servers form Config_DB, then find whether there exist a
s/no-empty/non-empty/
Fixed.
processes which executes a same commad but with different arguments. The number
of dhcrelay processes is determined by Vlans which have non-empry list of dhcp servers.
As such, we let Monit to monitor this script which will read number of vlans with
no-empty list of dhcp servers form Config_DB, then find whether there exist a
s/exist/exists/
Fixed.
of dhcrelay processes is determined by Vlans which have non-empry list of dhcp servers.
As such, we let Monit to monitor this script which will read number of vlans with
no-empty list of dhcp servers form Config_DB, then find whether there exist a
process in Linux corresponding to a vlan. If this script fails to find such process,
Remove extra space before "such"
Removed extra space.
def check_teamd_processes():
    port_channels = retrieve_portchannels()
    cmd = "sudo monit procmatch '/usr/bin/teamd -r -t '"
Rather than making a call to monit, I'd prefer if we use a Python library like psutil.
Good suggestion! I will do that. I also found that the psutil library is not installed by default in the host image.
I used the psutil library to check whether one of the teamd processes is running or not. Please help me review.
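As a rough illustration of the psutil-based approach discussed in this thread (not the code from this PR), the sketch below keeps the matching logic separate from the process listing. It assumes each teamd process is started as `/usr/bin/teamd -r -t <port channel name>`, which is inferred from the command prefix in the hunk above. The function names `find_missing_teamd` and `running_cmdlines` are hypothetical.

```python
def find_missing_teamd(port_channels, cmdlines):
    """Return the port channels that have no matching teamd command line.

    cmdlines is a list of argv lists, one per running process.
    Assumption: teamd is started as /usr/bin/teamd -r -t <port channel>.
    """
    missing = []
    for name in port_channels:
        expected = ["/usr/bin/teamd", "-r", "-t", name]
        # A process matches when its argv starts with the expected prefix.
        if not any(cmd[:len(expected)] == expected for cmd in cmdlines):
            missing.append(name)
    return missing


def running_cmdlines():
    """Collect the argv lists of all visible processes using psutil."""
    import psutil  # third-party; imported lazily so the matcher has no hard dependency

    cmdlines = []
    for proc in psutil.process_iter(["cmdline"]):
        # cmdline is None when the process information could not be read.
        cmdlines.append(proc.info["cmdline"] or [])
    return cmdlines
```

With psutil installed, `find_missing_teamd(port_channels, running_cmdlines())` would return the port channels that have no teamd process. The port channel names would come from Config_DB.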
def check_dhcrelay_processes():
    vlans = retrieve_vlans()
    cmd = "sudo monit procmatch '/usr/sbin/dhcrelay -d -m discard'"
Rather than making a call to monit, I'd prefer if we use a Python library like psutil.
I used the psutil library to check whether one of the dhcrelay processes is running or not. Please help me review.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
…relay processes is running or not. Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
from swsssdk import ConfigDBConnector

def retrieve_vlans():
The approach is complicated. I suggest using supervisorctl to check instead.
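One way to read this suggestion, sketched under assumptions: take the text printed by `supervisorctl status` and report the expected programs that are not in the `RUNNING` state. The helper names and the program names in the usage note are hypothetical. The sketch only parses text and leaves out how the command is invoked inside the container.

```python
def parse_supervisor_status(output):
    """Map each program name to its state from `supervisorctl status` text.

    Each status line starts with the program name followed by its state,
    for example: "rsyslogd   RUNNING   pid 12, uptime 0:10:00".
    """
    states = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            states[fields[0]] = fields[1]
    return states


def not_running(output, expected_programs):
    """Return the expected programs that are absent or not RUNNING."""
    states = parse_supervisor_status(output)
    return [name for name in expected_programs if states.get(name) != "RUNNING"]
```

For example, `not_running(status_text, ["relay-Vlan1000"])` would return `["relay-Vlan1000"]` if that (hypothetical) program has exited or is not listed at all.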
- What I did
This script monitors the teamd processes in the teamd docker container and the dhcrelay
processes in the dhcp_relay docker container. Monit can only monitor processes with unique names,
so it cannot monitor the teamd and dhcrelay processes directly: there can be multiple teamd and
dhcrelay processes that execute the same command with different arguments.
- How I did it
The number of teamd processes is determined by the number of port channels in Config_DB, and
the number of dhcrelay processes is determined by the number of vlans that have a non-empty list of
dhcp servers. As such, we let Monit monitor this script, which reads the port channels and the
vlans with a non-empty list of dhcp servers from Config_DB, then checks whether there exists a
process in Linux corresponding to each port channel or vlan. If the script fails to find
such a process, it writes an alert message to the syslog file.
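The dhcrelay half of this flow could look roughly like the sketch below. It is an illustration under stated assumptions, not the script in this PR: the VLAN table is taken to be a dict keyed by vlan name whose entries may carry a `dhcp_servers` list, and each dhcrelay process is assumed to have the vlan name somewhere in its arguments. All function names are hypothetical.

```python
import syslog


def vlans_with_dhcp_servers(vlan_table):
    """Return, sorted, the vlan names whose dhcp_servers list is non-empty.

    Assumption: vlan_table looks like {"Vlan1000": {"dhcp_servers": [...]}}.
    """
    return sorted(name for name, attrs in vlan_table.items()
                  if attrs.get("dhcp_servers"))


def missing_relays(vlans, cmdlines):
    """Return the vlans that have no dhcrelay process among cmdlines.

    cmdlines is a list of argv lists, one per running process.
    Assumption: the vlan name appears as one of the dhcrelay arguments.
    """
    missing = []
    for vlan in vlans:
        found = any(cmd and cmd[0] == "/usr/sbin/dhcrelay" and vlan in cmd
                    for cmd in cmdlines)
        if not found:
            missing.append(vlan)
    return missing


def report(missing):
    """Write one alert per missing dhcrelay process to syslog."""
    for vlan in missing:
        syslog.syslog(syslog.LOG_ERR,
                      "dhcrelay process for {} is not running".format(vlan))
```

The script run by Monit would then call `report(missing_relays(vlans_with_dhcp_servers(table), cmdlines))`, with `table` read from Config_DB and `cmdlines` taken from the process list.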
- How to verify it
We can explicitly kill a teamd or dhcrelay process and then check whether an alert
message is written to the syslog file.