Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Inspur][HLD] Add cli_wrapper hld #1091

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

inspurSDN
Copy link

Description:
    cli_wrapper high-level design

    Description:
        cli_wrapper high-level design
@zhangyanzhao
Copy link
Collaborator

@zhangyanzhao zhangyanzhao removed the request for review from richardyu-ms May 30, 2023 15:20

* Disable CLI wrapper service

![Encode format](images/wrapper_disable.jpg)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this done? Does a user have to manually rename the file and stop the server? A global rules flag set to "n" should take care of these steps instead of manual intervention on every switch.

@zhangyanzhao
Copy link
Collaborator

@inspurSDN can you please update the HLD by using the HLD template https://github.com/sonic-net/SONiC/blob/master/doc/hld_template.md? Some required info/sections are missed in your current HLD. Thanks.

@zhangyanzhao
Copy link
Collaborator

@venkatmahalingam registered as reviewer.


CLI_wrapper server is responsible for the execution of click command. It starts when system boots up, and loads all needed libraries to server. Then setup a socket to wait client's request. Once it accepts a request, it spawns a CLI_worker process to deal with this request.

In order to avoid that some command may stuck in CLI worker and cause resource leakage, a watchdog thread is designed to watch the execution time of CLI worker, once the execution time exceeds maximum execution time, the CLI worker will be gracefully terminated. But if the CLI worker is still alive after one minute, a kill signal will be sent to CLI worker to tear down it.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are config/show commands that can go beyond 1 min, hope you have a configuration to increase the timeout if required.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the show output has more than 1 page, how are you handling the pagination? can we make sure the current behavior is not disturbed with this new design?


In order to avoid that some command may stuck in CLI worker and cause resource leakage, a watchdog thread is designed to watch the execution time of CLI worker, once the execution time exceeds maximum execution time, the CLI worker will be gracefully terminated. But if the CLI worker is still alive after one minute, a kill signal will be sent to CLI worker to tear down it.

The local DB is used to store the record of spawned CLI worker, it records the steady timestamp and pid of the CLI worker.
Copy link
Collaborator

@venkatmahalingam venkatmahalingam May 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this local DB is some kind of a Q?

When there are more than client, how is server handling the requests from clients at a time? How's config/show errors are being responded back to the client?


![Command execution flow](images/command_execution_flow.jpg)

* System ctrl bring up CLI_wrapper server when system starts
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep this feature disabled by default, if we're comfortable making it enabled some point in time in the future, we can have it enabled by default.

Copy link
Collaborator

@venkatmahalingam venkatmahalingam May 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an option to use current click commands (without wrapper) even after this feature is enabled? I just want to make sure we're not blocked because of some bug in the wrapper routines?


### 1.3.2 Command execution flow

![Command execution flow](images/command_execution_flow.jpg)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change the diagram to include more than one client, also, not sure how the context for the particular command is carried end to end, please clarify.

* CLI worker decodes the command to get parameters and environment setting, then execute this command based on those settings. In the end, it sends return code and data back to CLI wrapper client

### 1.3.3 Watchdog execution flow

Copy link
Collaborator

@venkatmahalingam venkatmahalingam May 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to accept optional value for each command for execution time? sometimes, it's hard to predict the execution time of the command.


* For less powerful CPUs, cli_wrapper can significantly reduce execution time.

# 3. Open questions
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please clarify the design around the interactive commands (yes/no).


* CLI_wrapper server

CLI_wrapper server is responsible for the execution of click command. It starts when system boots up, and loads all needed libraries to server. Then setup a socket to wait client's request. Once it accepts a request, it spawns a CLI_worker process to deal with this request.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the case of warmboot/fastboot while spawning CLI wrapper server. Are there any CLI commands that are executed during the warmboot/fastboot? If not I recommend delaying the CLI wrapper server until reconciliation is complete.


CLI_wrapper server is responsible for the execution of click command. It starts when system boots up, and loads all needed libraries to server. Then setup a socket to wait client's request. Once it accepts a request, it spawns a CLI_worker process to deal with this request.

In order to avoid that some command may stuck in CLI worker and cause resource leakage, a watchdog thread is designed to watch the execution time of CLI worker, once the execution time exceeds maximum execution time, the CLI worker will be gracefully terminated. But if the CLI worker is still alive after one minute, a kill signal will be sent to CLI worker to tear down it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this needs to be rephrased. If I am not mistaken the watchdog will terminate only after 10 minutes and send kill at 11th minute if there is no resposne.


![Encode format](images/wrapper_enable.jpg)

1. Replace main.py with wrapped main.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we enable disable at run time through a CLI?


* TODO:

* Dynamically generating wrapped files
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will this feature work with auto CLI generation feature? Please check https://github.com/sonic-net/SONiC/blob/master/doc/cli_auto_generation/cli_auto_generation.md

* TODO:

* Dynamically generating wrapped files

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also check the interoperability with GCU feature. Since GCU loads the yang models once for every command, can we think about loading the entire yang only once and then validating?

@zhangyanzhao
Copy link
Collaborator

@dgsudharsan registered as reviewer of this feature.

@zhangyanzhao
Copy link
Collaborator

@inspurSDN can you please help to add the code PR by referring to #806? Thanks.

@zhangyanzhao
Copy link
Collaborator

code PR is not ready, move to backlog for future release

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: DeferredForNextRelease
Status: MovedToBacklog
Development

Successfully merging this pull request may close these issues.

6 participants