Centralized service to restart Wazuh #4277

Closed
AlexRuiz7 opened this issue Jun 17, 2022 · 6 comments · Fixed by #4365 or #4405
Labels: type/enhancement


AlexRuiz7 commented Jun 17, 2022

Related issue: #4181

Description

Following up on the discussion in issue #4181, we need to centralize the restart of Wazuh in a React service or component, so that any view of the App can restart the environment in the same way.

Restrictions and considerations

  • In cluster mode, a delay of at least 15 seconds must be applied when a cluster restart is triggered immediately after changing the ruleset files, to allow the cluster to synchronize the changes across the nodes. This is the safety margin the Framework team told us to use.
    For a detailed explanation, head to the Wazuh Cluster documentation.

    UPDATE: the @wazuh/framework team will improve their ruleset modification mechanism to distribute the changes across the cluster nodes immediately, so this delay will no longer be needed. Issue: "Improve ruleset modification in cluster environments" (wazuh#14492).

    UPDATE: the @wazuh/framework team decided to halt the development mentioned above and implement "Create mechanism to know if the cluster is synchronized after a specific event" (wazuh#14520) instead. We'll need to add a second polling mechanism to detect when the cluster is synchronized.

    Consequently, the following cleanup is needed:

    • Delete any delay applied to the requests to restart the cluster
  • It's possible that Wazuh will only exist as a cluster in the future, with the single-instance mode becoming a single-node cluster. Take this into EXTRAORDINARY consideration during design and coding, so we can easily adjust this service if this eventually happens.
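
One way to keep the service easy to adjust if the single-instance mode disappears is to isolate the mode decision in a single small function. The following is only a sketch under assumptions, not the plugin's code: the `ClusterStatus` shape, the name `resolveRestartTarget`, and the endpoint paths `/cluster/restart` and `/manager/restart` are illustrative and should be checked against the Wazuh API in use.

```typescript
// Assumed shape of the cluster status reported by the API (hypothetical).
interface ClusterStatus {
  enabled: 'yes' | 'no';
  running: 'yes' | 'no';
}

type DeploymentMode = 'cluster' | 'single-instance';

interface RestartTarget {
  mode: DeploymentMode;
  endpoint: string; // assumed API path that receives the restart request
}

// Only this function knows that two deployment modes exist. If Wazuh becomes
// cluster-only, it can be reduced to always return the cluster target.
function resolveRestartTarget(status: ClusterStatus): RestartTarget {
  const isCluster = status.enabled === 'yes' && status.running === 'yes';
  return isCluster
    ? { mode: 'cluster', endpoint: '/cluster/restart' }
    : { mode: 'single-instance', endpoint: '/manager/restart' };
}
```

Callers would only see a `RestartTarget`, so removing the single-instance branch later would not touch the views that trigger the restart.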

Requirements

  • During the restart process, the app must block any user interaction, including navigation, by deploying an overlay mask plus a modal (or similar) in which the user is provided with feedback about the restart and the actions being taken.
  • As Wazuh can be deployed as a single instance or as a cluster, the actions to be taken differ slightly. The restart process must be able to detect the mode in which Wazuh is deployed, and perform the restart accordingly.
  • Once the restart order has been sent to Wazuh, the app's restart process will start a polling routine, pinging the API every 2 seconds, up to a maximum of 30 attempts. As soon as the API responds that Wazuh is ready, the restart process ends and the UI elements that were added are cleared. Otherwise, if the maximum number of attempts is reached, the App will automatically navigate to the Healthcheck after 5 seconds, as something did not go as expected during the restart.
  • No errors must be raised during this polling routine. Request failures are expected (the API will be down for some time).
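
The polling requirement above could be sketched roughly as follows. This is a minimal illustration, not the implemented service: `pollUntilReady`, `PollOptions`, and the `isReady` callback (which would ping the API and report whether Wazuh is ready) are hypothetical names, and the `sleep` function is injectable only so the routine can be exercised without real timers.

```typescript
interface PollOptions {
  intervalMs: number; // 2000 in the requirements above
  maxAttempts: number; // 30 in the requirements above
  sleep?: (ms: number) => Promise<void>; // injectable for tests (assumption)
  onAttempt?: (attempt: number) => void; // e.g. to update a progress bar
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Resolves to true as soon as `isReady` reports success, or to false once
// the maximum number of attempts has been used. It never rejects.
async function pollUntilReady(
  isReady: () => Promise<boolean>,
  { intervalMs, maxAttempts, sleep = defaultSleep, onAttempt }: PollOptions,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    onAttempt?.(attempt);
    try {
      if (await isReady()) return true;
    } catch {
      // Request failures are expected while the API is down: ignore and retry.
    }
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  return false;
}
```

When the result is `false`, the caller would be responsible for waiting 5 seconds and navigating to the Healthcheck; when it is `true`, the caller clears the overlay and the modal.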

Design

Flow

The current flow to restart Wazuh has been modeled in the following activity diagram:

Outdated. Reason: the delay before restarting the cluster is no longer required.
[Image: AD_Wazuh_restart (activity diagram)]
Note: rev.2 - Last updated: Thu, 04 Aug 2022 13:40:42 +0200

User Interface

Note: be aware the UI design might change over time, do not take this design as final, unless explicitly specified so.

New custom UI components will be needed. We'll work on a PoC using several built-in components from EUI, which will include:

  • A modal-like element to display the restart status. We'll use the EUI Empty prompt component.
    [Image: modal_restart]

  • An overlay mask component, used to move the focus to the modal, block user interaction and reinforce the feeling of a task that takes some time to complete.

  • A progress bar. There are two options here:
    a) countdown, starting at delay * total_attempts (2 * 30), and updated each second.
    b) current attempt, starting at 0 until total_attempts (30), and updated on each attempt.
    We need to discuss which design we like the most. Selected: option B.
    Note: the progress bar will only reach 0% (option A) or 100% (option B) in the worst case scenario. Wazuh should be completely restarted before this happens.
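
To make the two progress bar options concrete, here is a small sketch of how each value could be computed. The function names and constants are hypothetical; only the numbers (2-second interval, 30 attempts) come from the requirements.

```typescript
const INTERVAL_SECONDS = 2; // delay between attempts
const TOTAL_ATTEMPTS = 30; // maximum number of attempts

// Option A: countdown. Starts at 100% (60 s remaining) and is updated each
// second, reaching 0% only if every attempt is used.
function countdownPercent(elapsedSeconds: number): number {
  const totalSeconds = INTERVAL_SECONDS * TOTAL_ATTEMPTS;
  const remaining = Math.max(totalSeconds - elapsedSeconds, 0);
  return (remaining / totalSeconds) * 100;
}

// Option B: current attempt. Starts at 0% and is updated on each attempt,
// reaching 100% only if every attempt is used.
function attemptPercent(currentAttempt: number): number {
  const clamped = Math.min(Math.max(currentAttempt, 0), TOTAL_ATTEMPTS);
  return (clamped / TOTAL_ATTEMPTS) * 100;
}
```

Option B needs no extra timer, since its value can be derived directly from the attempt counter of the polling routine.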

Preview

Work in progress

This is a demo for the desired design: https://codesandbox.io/s/wazuh-restart-forked-7mnxj1?file=/demo.js

[Image: modal_restart]


Desvelao commented Jun 23, 2022

Research

In the current plugin, there are some restart methods that are reused in the following places:

  • Management/Configuration/Edit configuration: Restart <node_name> / Restart manager
  • When adding/editing some rule/decoder/cdb list file and importing

Restart selected manager node: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L270-L287
Restart manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L293-L309

Restart cluster or manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L528-L549

Restart cluster: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L315-L340

Restart node: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L346-L369

⚠️ This could be similar to another method. We should review whether we can unify the behavior.

On the other hand, there is separate logic used by Management/Status that restarts the manager/cluster.
Restart manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/status/actions-buttons-main.js#L97-L117

Restart cluster: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/status/actions-buttons-main.js#L68-L92

We could try to refactor to use the same service/functions and unify the behavior.

Possible tasks:

  • Use the same service/functions in the different sections where it is required
    • Move the location of the reusable logic
  • Review if there is unused logic and remove it.

@Desvelao Desvelao self-assigned this Jun 24, 2022

@snaow snaow moved this to Triage in Release 4.4.0 Jun 27, 2022
@snaow snaow moved this from Triage to Todo in Release 4.4.0 Jun 27, 2022
@snaow snaow removed this from Release 4.3.6 Jun 27, 2022

@yenienserrano yenienserrano moved this from Todo to In Progress in Release 4.3.7 Jul 29, 2022
@yenienserrano yenienserrano linked a pull request Jul 29, 2022 that will close this issue
@yenienserrano

These functions were moved to a service that handles the restart (wz-restart.js).

restartManager and restartCluster were replaced by a single function, restart.

Restart selected manager node: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L270-L287

Restart cluster or manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L528-L549

Restart node: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L346-L369

Change

Restart manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L293-L309

Restart cluster: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L315-L340

On the other hand, there is separate logic used by Management/Status that restarts the manager/cluster.

These 2 functions were removed because they did the same thing, and we now use the new service instead.

Change

Restart manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/status/actions-buttons-main.js#L97-L117

Restart cluster: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/status/actions-buttons-main.js#L68-L92

The service is also called in the file restart-cluster-manager-callout.tsx:

https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/components/common/restart-cluster-manager-callout.tsx#L64


AlexRuiz7 commented Sep 6, 2022

Blocked by wazuh/wazuh#14776
Blocked by wazuh/wazuh#14918

@snaow snaow modified the milestones: Release 4.4.0, Release 4.5.0 Nov 16, 2022
@snaow snaow removed this from the Release 4.5.0 milestone Dec 21, 2022
@yenienserrano yenienserrano linked a pull request Feb 6, 2023 that will close this issue
@gdiazlo gdiazlo removed the cat-3 label Mar 9, 2023
@yenienserrano

Closed until priorities change

@wazuhci wazuhci moved this to Done in Release 4.5.0 Apr 21, 2023
@AlexRuiz7 AlexRuiz7 closed this as not planned May 9, 2023
@wazuhci wazuhci moved this to Done in Release 4.6.0 Jun 26, 2023
@wazuhci wazuhci removed this from Release 4.5.0 Jun 26, 2023