-
Notifications
You must be signed in to change notification settings - Fork 580
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/execute shutdown command #7709
Feature/execute shutdown command #7709
Conversation
What exactly is a "shutdown host" and how does it behave? Which problem are you aiming to solve with this pull request? Please provide more details and the story behind this, also in regard of the technical design, compatibility and maintenance purpose (troubleshooting this feature) - since this involves API and cluster as I've seen from a quick look into the diff. |
Also, the code portions which involve sending a command to a command_endpoint duplicate the existing code parts we already have in checkable-check.cpp and likewise. To anyone who reviews this - this code duplication must be resolved before approval. |
macros = params->Get("macros"); | ||
|
||
Array::Ptr command_line_tmp = command->GetCommandLine(); | ||
Value shutdown_command_value = params->Get("shutdown_command"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This value is retrieved in various places, but it never gets set. Is that a new command object attribute or is the PR incomplete?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The endpoint is called by passing a parameter called shutdown_command, which represents a valid shutdown command for the target host. An example follows:
curl -s -k -u root:admin -H 'X-HTTP-Method-Override: POST' -H 'Accept: application/json' https://localhost:5665/v1/actions/shutdown-host -d '{ "type": "Host", "filter": "host.name==\"<hostname>\"", "shutdown_command": ["systemctl", "poweroff"]}'
The user scenario behind the feature is the need of being able to shut down a host in a clean way when a monitoring condition is met (it can be a temperature level above threshold, for example). The user can specify a shutdown command for a single host or a group of hosts. Then, when a condition is met, this command is triggered by calling the API endpoint we implemented. The logic behind the execution of the command resorts to the clustering infrastructure in order to distribute the message containing the command to execute to the target host. |
So basically this implements a remote command execution triggered with an API endpoint? Or do I still need a CheckCommand object on the agent for this? From a security perspective, I see quite a few points where this can be considered harmful. Also, the implementation mimics the command endpoint way of doing things, whose design was incomplete. Imho this requirement follows along the inventory and discovery ideas we had in the past, and should be taken into account in their design. Is there a specific reason not to use Ansible or Puppet's Bolt for this type of shutdown scenario? Especially since you are only adding a transport layer with using the Icinga cluster protocol for this, I'm not yet convinced that Icinga is the right environment for this. |
The Icinga2 clustering infrastructure gives us the possibility of using a secure channel to propagate the shutdown messages from the master to any agent in the cluster. Having this in place, it's a matter of sending the right message to the proper recipient without adding an extra ingredient as Ansible (which is, I agree on that, capable of doing similar things if properly used). To execute the commands, we deploy a dummy shutdown command in director-global, which is then used as placeholder for the correct command that is received by the agent via cluster messages |
Hi @dnsmichi , The main advantage for using icinga2 for this purpose is in fact the already set-up secure communication and to install/configure one single agent on every machine. Another future plan for us is developing a "service test" feature (based on this one) where a user can test a particular command by tuning the parameters before deploying them in the director. |
Please coordinate the design and concept with @lippserd to ensure this is something which adds properly to Icinga's feature set. Creating a pull request and code without discussing the design and technical details beforehand adds a burden for both sides investing time into it. In its current state, I'll stall the PR until the design is fixed. |
Hi @dnsmichi, The patch at the end was not reviewed, and during the following meeting in December, we were asked to open this PR for an open discussion with you about the design and where to go from here. For us, this is a starting point for opening a discussion to identify if this use case is interesting for the community. As you can imagine, we are now confused about what we should do now. Could you please help us to understand how is the best way to proceed so that we can learn how to contribute to Icinga in the correct way? |
Hi, unfortunately I'm not aware of these meetings nor patches, I just saw the PR without any details and was wondering what this is. That is why I started reviewing and analysing it, that's a normal thing in my daily work contributing to Icinga. Meanwhile I have learned that you are in discussion with Eric, that is why I have updated this issue since I've been mentioned here. Otherwise the conversation would look incomplete - please join Eric on the matter and clarify how we can proceed here. Cheers, |
Hello @lestat80 and thank you for contributing! Why a shutdown command and not a general command? What problem not solved by EventCommands does this solve? Best, |
I'm sorry but what the hell? Are you serious to implement a shutdown via an monitoring agent??? A event handler should be sufficient in every case to shutdown a machine. For what purpose should icinga2 do this? There are much better ways to stonith machines or automatically shut them down all at once. If there is a slightly bug which would allow me to initiate shutdowns without permissions, I could shutdown thousands of michines by hundrets of customers at once. Imho it's mental to merge such an pull request. |
This feature adds the shutdown-host endpoint to the API.
This endpoint allows submitting shutdown commands to Icinga2 nodes in a distributed monitoring environment.
Commands are distributed via cluster messages.