Suite API - On-The-Fly #2123

Open
2 of 3 tasks
Tracked by #1249
oliver-sanders opened this issue Jan 20, 2017 · 6 comments

@oliver-sanders
Member

oliver-sanders commented Jan 20, 2017

Automatically generate user interfaces based on the GraphQL API:

Self Documenting Comms Layer

#2020 introduced a new way to access the cylc "suite API".

With the command line, the network steps are currently:

  1. Submit command
  2. Await response

Whereas the HTML API in #2020 works somewhat differently:

  1. Request API (provided as an HTML form in a web page)
  2. Fill in form (user input)
  3. Submit command
  4. Await response

The subtle difference is that the user is in effect requesting the API from the suite rather than using the API as it is known to the cylc command. This reflects the way that a cylc suite can now be thought of as a web service (as of #1923). It isolates the suite API from the particular cylc version employed by the user, improving robustness.
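The four-step flow can be sketched without any networking. This is only an illustration: `FakeSuite`, its `get_api` and `submit` methods, and the spec format are all hypothetical stand-ins, not part of cylc or of #2020.

```python
import json


class FakeSuite:
    """Stand-in for a running suite; a real one would answer over HTTP(S)."""

    # Hypothetical spec format: command name -> argument name -> type name.
    API = {"ping": {"reg": "str", "task": "str"}}

    def get_api(self, command):
        # Request API: the suite describes the command it actually supports.
        return json.dumps(self.API[command])

    def submit(self, command, payload):
        # Submit command / await response: the suite answers with a result.
        args = json.loads(payload)
        return json.dumps({"ok": True, "command": command, "args": args})


def run_command(suite, command, **user_args):
    """Ask the suite for its API, check the input against it, then submit."""
    spec = json.loads(suite.get_api(command))
    unknown = set(user_args) - set(spec)
    if unknown:
        raise ValueError(f"suite does not accept: {sorted(unknown)}")
    return json.loads(suite.submit(command, json.dumps(user_args)))


result = run_command(FakeSuite(), "ping", reg="my.suite")
```

Because the client checks its input against whatever the suite returns, a client and a suite running different cylc versions would still agree on the arguments.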

The Proposal

The proposal is that the command line API be rewritten to follow the pattern of the HTML API. It applies to the commands involved in user-suite interaction (i.e. cylc help control plus show, dump, get-suite-version, ...). System commands (i.e. cylc help task minus submit) would not be included in this change, although the command line interface could still be auto-generated (not at runtime) from the comms layer.

Advantages:

  • Cylc always uses the correct API to communicate with a suite.
  • The network API is fundamentally tethered to the command line and HTML APIs, preventing them from getting out of sync.
  • Arguments, types and documentation only have to be written once in one place but are used for all APIs.
  • Arguments, which are converted to strings for transfer over HTTP, can be automatically "cast" back on the network side, removing the need to do this manually (which has caused bugs).
  • A future web-technology cylc GUI could use the HTML API for forms (e.g. cylc run) rather than relying on hard-coded ones, which prevents the GUI from getting out of sync with the network interface.

Disadvantages:

  • Each command now results in four network calls rather than two. I see this as being OK as the usage is user-interaction rather than system function so the suite should never be overwhelmed by such requests.
  • Only suite commands can be implemented in this way. Other commands (e.g. cylc help preparation) will stay as they are, meaning that the suite API is implemented separately from the remainder of the cylc API.

Philosophy

Since #1923 the "suite interface" has effectively moved from the command line into the network layer. Suite commands are now more or less simple bash scripts which serve as wrappers for the comms layer. At present the "interface" is defined by the arguments/options of those bash scripts. With the addition of the HTML API it would make sense to move this "interface" into the comms layer, where it could be written in a more abstract form, for example as a docstring:

def cylc_ping(...):
   """
   Exit with success if the suite (or task) provided is currently running.

   Args:
       reg (str): The name of the suite to check.
       task (optional - str): The name of the task within the suite <reg> to check.
       comms-timeout (optional - int): The connection timeout.
   """

This interface could then be used to build the command line / HTML APIs on-the-fly.
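As a rough sketch of how a command line interface might be built from such a docstring, the following parses the "Args:" section with a regular expression and feeds it to `argparse`. The function signature, the `TYPES` table, the regex, and `parser_from_docstring` are assumptions made for illustration; only the docstring text comes from the example above.

```python
import argparse
import re

TYPES = {"str": str, "int": int, "float": float}

# Matches lines such as "reg (str): ..." and "task (optional - str): ...".
ARG_LINE = re.compile(
    r"^\s*(?P<name>[\w-]+) \((?P<opt>optional - )?(?P<type>\w+)\): (?P<help>.+)$"
)


def cylc_ping(reg, task=None, comms_timeout=None):  # signature is assumed
    """
    Exit with success if the suite (or task) provided is currently running.

    Args:
        reg (str): The name of the suite to check.
        task (optional - str): The name of the task within the suite <reg> to check.
        comms-timeout (optional - int): The connection timeout.
    """


def parser_from_docstring(func):
    """Build an argparse parser from the "Args:" section of a docstring."""
    lines = func.__doc__.strip().splitlines()
    parser = argparse.ArgumentParser(
        prog=func.__name__.replace("_", " "), description=lines[0].strip()
    )
    in_args = False
    for line in lines[1:]:
        if line.strip() == "Args:":
            in_args = True
            continue
        match = ARG_LINE.match(line) if in_args else None
        if match:
            # Optional arguments become --flags; the rest are positional.
            flag = ("--" if match["opt"] else "") + match["name"]
            parser.add_argument(flag, type=TYPES[match["type"]], help=match["help"])
    return parser


args = parser_from_docstring(cylc_ping).parse_args(
    ["my.suite", "--comms-timeout", "5"]
)
```

Here `argparse` does the casting, so `args.comms_timeout` arrives as an integer rather than a string.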

The information in the interface should be sufficient to perform basic user-side verification (i.e. JavaScript form validation for the HTML API, error message for the command line API), but should also be sufficient to "cast" arguments as they arrive server side (i.e. no more nasty setting in [True, 'True'] statements).

Client side mechanics:

  • Verification:
    • Check provided args/opts against a spec (are all args provided, type checking, etc).

Suite side mechanics:

  • Validation:
    • Any more advanced checking required by the suite prior to inserting the command into the queue.
  • Type casting.

See #423, #2020 and #1980.

@oliver-sanders oliver-sanders added this to the soon milestone Jan 20, 2017
@oliver-sanders oliver-sanders self-assigned this Jan 20, 2017
@hjoliver
Member

hjoliver commented Apr 8, 2017

I just had a play with Ben's original branch from #2020 - it would be good to get onto this soon...

@oliver-sanders
Member Author

Following up on #2020 (comment)

If we're being really wild, there must be some kind of auto-generation of HTML forms based on input metadata for the parameters...

For suite commands the API now rests (pun intended) with the comms layer rather than the CLI. Consequently the CLI scripts for suite commands should now be nothing more than simple API wrappers doing the following:

  • Definition and parsing of arguments
  • Help
  • Argument type checking (e.g. int, float, taskid, ...)
  • Validation

This same functionality is also desired of the comms layer. To me it makes sense to move all of this functionality into the comms layer so that it is in one place where it can be re-used for both the CLI and HTTP(S) APIs.

If we write the method docstrings in a standard format [1, 2] then they could be parsed by a system like sphinx+napoleon [3]. From this information we should be able to:

  • Auto-generate CLI API scripts (with basic validation)
  • Auto-generate HTTP API forms (with basic real-time validation)
  • Auto-generate an API reference (trivial to build from docstrings with sphinx+napoleon) [4]
  • Automatically cast API parameters (at present this is done manually either in the comms layer or elsewhere in cylc).

In the long run auto-generated HTTP forms could be used by future web-based GUIs which should make progress faster.
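A minimal sketch of the form generation idea, assuming the parameter metadata has already been extracted from the docstrings. The `html_form` helper, the tuple layout, and the `INPUT_TYPES` mapping are hypothetical; the browser's built-in `required` and `type="number"` handling stands in for "basic real-time validation".

```python
from html import escape

# Hypothetical mapping from docstring type names to HTML input types.
INPUT_TYPES = {"str": "text", "int": "number", "float": "number"}


def html_form(command, params):
    """Render a form with one input per (name, type, required, help) tuple."""
    rows = []
    for name, kind, required, help_text in params:
        attrs = f'name="{escape(name)}" type="{INPUT_TYPES[kind]}"'
        if required:
            attrs += " required"  # lets the browser validate before submit
        rows.append(f"  <label>{escape(help_text)} <input {attrs}></label>")
    body = "\n".join(rows)
    return (f'<form method="post" action="/{escape(command)}">\n'
            f'{body}\n  <button type="submit">Submit</button>\n</form>')


form = html_form("ping", [
    ("reg", "str", True, "The name of the suite to check."),
    ("comms-timeout", "int", False, "The connection timeout."),
])
```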

Doing this would remove the risk of:

  • The CLI API and HTTP(S) API getting out of sync.
  • Hard-to-test-for TypeErrors occurring at runtime due to lack of casting or API changes resulting in the failure of downstream logic.
  • The (future) web GUI and the HTTP(S) API getting out of sync.

All three of these points have spawned bugs since cylc7.

I think most of this can be achieved using docstrings in the comms layer (to define arguments and types as implemented in #2020). @cylc/core your thoughts on this.

[1] http://google.github.io/styleguide/pyguide.html
[2] http://www.sphinx-doc.org/en/1.5.1/ext/example_google.html#example-google
[3] http://www.sphinx-doc.org/en/1.5.1/ext/napoleon.html?highlight=napoleon
[4] metomi/rose#2046

@hjoliver
Member

hjoliver commented May 2, 2017

@oliver-sanders - thanks for that, sounds brilliant.

@oliver-sanders oliver-sanders changed the title from "Self Documenting Comms Layer" to "API - On-The-Fly" Jun 14, 2017
@oliver-sanders oliver-sanders changed the title from "API - On-The-Fly" to "Suite API - On-The-Fly" Jun 14, 2017
@oliver-sanders
Member Author

This change could be made in combination with a shift to GraphQL, which provides a flexible, deprecation-friendly way to define a communication interface.

@kinow
Member

kinow commented Mar 15, 2019

Spent some time today reading this and the two related pull requests. Looks like the changes for the CLI will bring many benefits, such as a simpler interface, parameter type checking and casting, etc.

I am not too sure about the HTML and Web layer. For that layer, there are already a few tools that do that, the most famous being Swagger, now OpenAPI.

Not sure if applicable here, or if I haven't understood the scope of this proposal. But if we are going to define the data type used for communication, in a way that forms can be generated to query that, and also as a tool to document the API for clients, then I think it would be best to write the minimum code possible for that.

The Swagger/OpenAPI Petstore has become the de facto automatic HTML form generator for the specs. It is also used by JupyterHub, and JS developers should be familiar with it, I think.

I haven't looked at how JupyterHub devs do it, but I suspect they sync changes manually, which could lead to errors. zalando/connexion not only uses the OpenAPI spec, but also validates the exposed Flask endpoints against the specification. So if your server code gets out of sync with the API spec, the server fails to initialize.
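The "fail at start-up if code and spec disagree" idea can be illustrated in plain Python. This is not connexion's API, just a sketch of the principle; `SPEC`, `check_handlers`, and the two handlers are made up for the example.

```python
import inspect

# Hypothetical spec: operation name -> parameter names it must accept.
SPEC = {"ping": ["reg", "task"], "hold": ["reg"]}


def ping(reg, task=None):
    return f"pinged {reg}"


def hold(reg):
    return f"held {reg}"


def check_handlers(spec, handlers):
    """Raise at start-up if any handler's signature disagrees with the spec."""
    for name, params in spec.items():
        if name not in handlers:
            raise RuntimeError(f"no handler for {name}")
        actual = list(inspect.signature(handlers[name]).parameters)
        if actual != params:
            raise RuntimeError(f"{name}: spec says {params}, code has {actual}")


check_handlers(SPEC, {"ping": ping, "hold": hold})  # passes silently
```

If someone later adds a parameter to `hold` without updating the spec, the same call raises before the server accepts any request.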

I haven't used it with GraphQL, but looks like there was some work to support it already.

RPC appears to be an ongoing discussion since 2016, without an official solution, but with some people implementing their own version.

And looks like some people have already shared some ZMQ/Protobuf endpoints documented with OpenAPI, so that should be doable.

There's also an OpenAPI sphinx library, which could be used to generate the sphinx documentation (never tested that tho).

Sorry if what I said is out of context here.

Cheers
Bruno

@oliver-sanders
Member Author

GUI component completed, CLI component less urgent, bumping the remainder of this issue to 8.x or beyond.
