[DISCUSS] Service Composition #178

kevin-bates · 2020-01-30T22:52:54Z

In today's Server meeting, we discussed service composition or the ability to optionally specify which services should be exposed by a given instance of Jupyter Server. This issue is intended to summarize that discussion and, more importantly, stimulate discussion amongst the interested community.

TL;DR

The purpose of this issue is to discuss how we might adjust server internals to enable composability of services. This is more than just exposing a list-based trait consisting of which services should be exposed - but more in line with an intuitive grouping of related functionality. Regardless of the outcome of this discussion, the default behavior will be that all services are exposed so as to best preserve existing functionality.

Introduction

One of the goals of Jupyter Server is to enable the ability for users to configure subsets of services. For example, if a user wanted to configure the server to only serve kernels, they should be able to configure only the necessary services for exposing kernel functionality. Similarly, one should be able to configure a server as a “content server” where things like kernel functionality is not present.

Within the main server application, there is a general assumption that all services are enabled and available. Applications that wish to alter the set of “system-defined” services must subclass ServerApp and override the set of services. Still, there are general assumptions throughout the ServerApp class instance that assumes all services are present. This issue is meant to generate discussion of how we might want to refactor service-relative functionality so that the Server itself can be more easily configured into “units of functionality”. Please note, however, that regardless of where this discussion leads, the default behavior of Jupyter Server will be, generally speaking, that of today’s Notebook server in that, by default, all services (or “units of functionality”) will be present and enabled.

Terminology

Service: By service I mean the portion of the functionality that is exposed via a REST API. This does not include underlying “manager” instances. In many cases, we still want to use the “manager” corresponding to a given functional unit (e.g., FileContentsManager), yet not expose that functionality via a Service. That said, exposure via a service should imply the existence of an underlying “manager” instance (just not vice versa).
Extension vs. Subclass: It’s easy to view a subclass as an extension of the its superclass - that’s correct. But for the purposes of this discussion, we should treat the term extension as meaning the implementation of a server extension in the sense that the functionality it provides is in addition to the functionality provided by the server. Whereas subclassing implies that the existing functionality provided by the server is altered or transformed.

Default services

The set of default services can be found here. For simplicity, I’ve added them below:

    # A list of services whose handlers will be exposed.
    # Subclasses can override this list to
    # expose a subset of these handlers.
    default_services = (
        'api',
        'auth',
        'config',
        'contents',
        'edit',
        'files',
        'kernels',
        'kernelspecs',
        'nbconvert',
        'security',
        'sessions',
        'shutdown',
        'view'
    )

The idea, relative to this proposal, is that all these services would be exposed by default. However, from a composability perspective, we should discuss which of these services are required and which are optional and how composition would be expressed.

Required services

By definition, required services are those services that need to be present when any optional services are configured. I think we'd want to enforce behavior such that the server cannot be started with JUST required services configured. That said, we shouldn't preclude a base server (consisting of only required services), in conjunction with at least one server extension that introduces its own service, from being started. As a result, the complete picture should be considered prior to automatically shutting down an "invalid server" (i.e. one consisting of only required services).

Of the services listed above, I think only 'auth', 'security' and perhaps 'config' and 'api' would be considered "required". 'config' is in this group because anything pertaining to a "valid" server will have configuration. 'api' is in this group because there needs to be some description for how the REST APIs of the "valid" server are to be used. However, the handler for this service would likely build its response based on what other (optional) services are configured. This implies today's single yaml file would be broken down to the composable service boundaries, probably even with separate 'parameters', 'paths' and 'definitions' sections for each, so as to enable easier composition of the response.

Optional services and functional composition

Obviously, services not considered required services, would fill out the set of optional services. However, we probably shouldn't expose these services individually, but, instead, as units of logical functionality. For example, users wishing to run Jupyter Server as a Kernel Server would need 'kernels', 'kernelspecs' and (not listed) 'kernelspec_resources' (for serving resource files). One approach would be to group these into a kernel_services set. Likewise, users wishing to run Jupyter Server as a Content Server would need 'contents', 'edit', 'files', 'view', so these would be grouped into a 'content_services' set, etc.

Note that 'sessions' and 'nbconvert' weren't in either kernel_services or content_services, yet are required by today's Notebook front-ends. For these we could do nothing (since the default is everything) or create a notebook_services set that would be composed of kernel_services, content_services, 'sessions' and 'nbconvert' (we'd probably add 'terminal' to this as well, but that's already optional by virtue of the existence of terminado- although we should make that option more explicit).

One could then envision command-line options of jupyter server --kernel-services or jupyter server --content-services or jupyter server --notebook-services (which is the same as just jupyter server) where each of these "service-oriented" options are mutually exclusive. Or functionality-based sub-commands could be implemented enabling commands like jupyter kernel-server and jupyter content-server.

Other services

In looking through the code, there are a number of endpoints that are not addressed in the default_services set. Things like 'bundler', 'terminal', and 'metrics' (prometheus) to name a few should be considered for placement, perhaps even in the required set (metrics seems logical for example).

General refactoring

One of the areas in which this issue came up is during server startup and the messages it produces...

[I 12:34:58.097 ServerApp] Serving notebooks from local directory: /Users/kbates/repos/oss/jupyter/notebook/dist
[I 12:34:58.097 ServerApp] Jupyter Server 0.2.0.dev0 is running at:
[I 12:34:58.097 ServerApp] http://localhost:8888/?token=c617532e143f1dba866ecc316e79eb0ed9dc52b501003cfe
[I 12:34:58.097 ServerApp]  or http://127.0.0.1:8888/?token=c617532e143f1dba866ecc316e79eb0ed9dc52b501003cfe
[I 12:34:58.097 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

If we make services functionally composable, we will need to address the messaging produced during startup. For example, if I don't run with contents services enable, then I shouldn't see the message about where notebook files are being served from. Likewise, if I'm not using kernels, then messages regarding kernels should not be displayed, etc. As a result, if we were to adopt the sub-command approach (not sure that's the correct term) then we could subclass ServerApp (or some BaseServerApp class) with classes like KernelServer and ContentsServer where these "apps" simply provide the set of services to its superclass. Applications could then extend those subclasses, etc. For example, with this approach, each subclass would implement its own running_server_info() method since each knows its set of functionality.

Inter-service dependencies

Some services may have dependencies on others. However, it's not clear to me if those dependencies actually consist of other services or are confined to just the managers from those services. For example, 'sessions' uses managers from both Kernels and Contents, but doesn't hit the endpoints exposed by those services. As such, those kinds of dependencies should be fine. If there are instances of one service (or its manager) hitting endpoints of another service, then when we form the functional grouping for that service, it needs to include the other service(s).

There's probably more to say about all this, but this seems like a good place to stop.

Comments, suggestions, concerns are welcome (please).

The text was updated successfully, but these errors were encountered:

kevin-bates · 2020-02-04T00:09:06Z

I've created two branches with some basic prototyping in place. The existing tests pass when the 'default' invocation is used.

service-composition-w-classes: converts the existing ServerApp class to BaseServerApp, then creates three subclasses: ServerApp (for serving notebooks), KernelServerApp (for serving kernels), and ContentsServerApp (for server content). I believe this approach has some advantages in that it allows more explicit control over what each class can do. However, as noted in the commit description, it may break the ability to extend the Kernel or Contents servers. We could probably move ServerApp into its original slot in the hierarchy, replacing BaseServerApp and letting KernelServerApp and ContentsServerApp derive from ServerApp, but I'm not certain that will address the server-extension issue on those subclasses.

To launch a kernel server, you'd use: jupyter kernel-server ...
To launch a contents server, you'd use: jupyter contents-server ...
Note: this branch doesn't implement the ContentsServerApp class yet.

service-composition-w-options: introduces the option server-mode={'notebooks', 'kernels', 'contents'} (default = 'notebooks'). This retains the single-class implementation, thereby probably not impacting the ability to create server extensions. This option probably most closely resembles the discussion above.

To launch a kernel server, you'd use: jupyter server --server-mode=kernels ...
To launch a contents server, you'd use: jupyter server --server-mode=contents ...

Based on these (quick) experiments, I think I'd lean toward using the options (--server-mode) approach (should we decide to do this).

echarles · 2020-02-04T06:58:48Z

Thx @kevin-bates. As a end-user only looking at the CLI, I would favor the service-composition-w-options with the server-mode flag. I like the idea that I would always use jupyter server command.

Scanning the diff, it seems like the w-options has less impact than the w-classes.

Now to better understand your quote I believe this approach has some advantages in that it allows more explicit control over what each class can do., it would be great if you could give an example of a feature that would not be possible to implement in the w-options case but would be well possible in the w-classes case.

kevin-bates · 2020-02-04T17:11:35Z

Hi @echarles - thanks for the comment/question. Yeah, I think w-options is probably more straightforward as well and you make a great point about what an end-user would prefer.

My comment regarding w-classes is that a subclassed approach gives one greater flexibility to diverge implementations. For example, server modes of 'kernels' or 'contents' may be more useful in 'as a service' offerings where they are inherently multi-user. As a result, they may require different means of authentication, etc. If anything touches shared handlers, things could get more difficult in the absence of subclasses that could be used to intercept the shared calls via overrides, apply their different behaviors and continue - either to the superclass method or bypassing it altogether.

One example came up just today in EG. The user would like to filter the list of running kernels by user. (They refer to /api/kernelspecs, but I'm fairly certain they're really talking about /api/kernels.) This kind of thing is really more about an extension on top of "kernel mode", but it I think it paints the picture. If "kernel mode" was implemented via subclassing, the subclass would have free reign to implement the desired endpoint - one that you wouldn't want to necessarily support/expose when running as a notebooks server.

I suppose a couple things come to mind. Users could extend the w-options implementation using subclasses themselves. That is, they could create a subclass that hardcodes server-mode to 'kernels'. Users could also create a server extension, where the server extension only loads if the server is started with --server-mode=kernels (assuming the load method has access to such information).

So, at the end of the day 😄, I suppose I've just argued that either way doesn't make a huge difference.

Zsailer · 2020-02-04T19:21:58Z

This is awesome, @kevin-bates. Thank you so much for writing out this proposal in a clear and thorough way. I may steal some of this text for documentation in the Jupyter Server.

A lot of this harkens back to jupyter/enhancement-proposals#31. It's great to see how our conversation has evolved over time.

Okay, here is a summary of my take:

I suggest we go the subclass route; however, I might implement it differently than you have above.

Instead of extending the BaseServerApp in each "mode" class, I would make these classes mixins (i.e. KernelsMixin, ContentsMixin, etc.). Each mixin includes the traits, managers, and methods that are specific to that mode. That way, anyone can compose a combination of mixins with BaseServerApp to create a server that suits their needs. It's a subtle difference, but I think it makes it less confusing for anyone composing new "ServerApps" from multiple modes. The inheritance tree becomes less intuitive to a developer when multiple "mode" classes are combined that have similar parent classes. What are your thoughts?

Other thoughts:

adding a required_services attribute. I'm not sure we need this. In principle, no services are ever required. For example—yes, auth is important, but it's not unreasonable that someone might want to create their own ServerApp without auth. I don't think we should prevent them from doing this, even if we don't encourage it.
other services: the missing services, like nbconvert, terminals, etc. were an oversight. They should be included in the ServerApp's default_services. I think we should still create mode Mixin classes for these pieces as well.
Log-messaging: I totally agree that we need to refactor our messaging logic to handle these changes. I think we can make this much more composable and flexible. Let's do it! :)
Inter-service dependencies: I am not aware of services that call other services through the REST API. There are likely frontend applications that do this (i.e. Notebook, Jupyterlab, etc.), but they will depend on the complete ServerApp. I think we can move forward without trying to tackle this problem in the server. If we need to add dependency injection tools later, we can.

Overall, I think the clear composability of the subclass design is more "pythonic" than the single entry-point + flag-mode switching. Yes, someone can introduce some back-door approach to composing classes with the single entry-point design, but IMO that breaks many of goals in the "Zen of Python" :)...

kevin-bates · 2020-02-05T00:51:53Z

Thanks for the response @Zsailer. Before I respond to your Mixin idea, I wanted to say that I agree with all your points on "Other thoughts". If folks wanted to create a ServerApp with none of the services available, who are we to say that shouldn't be possible?

Regarding Mixins, I'm having trouble visualizing how this would work. I believe you're suggesting, based on the "other services" response, that every service listed in default_services would be associated with a Mixin class, is that correct?

I'm also not following how the methods within the various Mixins would work themselves out. Would the mixin instance essentially be a container that exposes traits, managers and service names relative to itself and the handler and manager class implementations remain where they are?

For example, if the main application wanted to get its list of services, it would call something like self.get_services(), and all the Mixin classes would implement something like...

class ApiMixin():

    def get_services():
        return ['api'].extend(super(ApiMixin, self).get_services())

with BaseServerApp.get_services() being...

    def get_services():
        return []

so that given the class definition of...

class KernelServerApp(ApiMixin, AuthMixin, ConfigMixin, KernelsMixin, KernelspecsMixin, KernelResourceMixin, SecurityMixin, BaseServerApp):

A call to self.get_services() results in ['api', 'auth', 'config'...]

Also, for the code in init_configurables(), would the idea be that each Mixin that has "configurables" implement init_configurables() so the KernelServerApp class would implement its init_configurables() as...

    def init_configurables():
        super(KernelServerApp, self).init_configurables()

but could then go about using self.kernel_manager since the KernelsMixin holds the kernel_manager attribute?

In cases where the current module just needs to know if a given mixin is "in play", would it just ask isinstance(self, KernelsMixin)?

If what I've asked is your idea, this sounds interesting. I like the flexibility it introduces. On the other hand, it might be difficult for users to get the compositions correct. For example, they would need to know that KernelspecsMixin is required when KernelsMixin is selected. Some the content-related services would require that ContentsMixin be present, etc. Inter-mixin dependencies. 😄 But I suppose we could come up with some validation logic for these (few) circumstances.

kevin-bates · 2020-02-08T01:32:08Z

I've created branch service-composition-w-mixins to attempt to demonstrate ServerApp (and a KernelServerApp) implementation using mixins to express the composition. There were a few hurdles to overcome with this and I didn't like having to keep the class-based traits in the BaseServerApp class. I suspect there are ways to address this aspect of things but my traitlets fu just isn't strong enough.

Also, I could see how the "goal" of encapsulation via mixins could get messy if there are ever circular dependencies that don't manifest otherwise. The issue there is that the order the mixin is listed in the class definition matters in some cases, and that might get confusing.

I'm also seeing issues with aliases that I didn't deal with - especially after discovering those same issues exist in the service-composition-w-classes branch as well.

echarles · 2020-02-08T08:25:48Z

Also, I could see how the "goal" of encapsulation via mixins could get messy if there are ever circular dependencies that don't manifest otherwise. The issue there is that the order the mixin is listed in the class definition matters in some cases, and that might get confusing.

Quick comment to get the ball rolling... options to overcome such dependency issues by order of complexity and gain:

Document the dependencies and order?
Add a method _ensure_deps that would check for currently already loaded mixins.
Dependency Injection framework (limited to those Mixin classes).

kevin-bates added the enhancement label Jan 30, 2020

kevin-bates mentioned this issue Apr 23, 2020

kernel_manager_class configurable by a server extension? #207

Closed

kevin-bates mentioned this issue Jul 7, 2020

Integrate jupyter-archive #252

Closed

kevin-bates mentioned this issue Sep 6, 2021

Towards a new generic and composable server jupyter-server/team-compass#11

Open

Zsailer mentioned this issue Sep 1, 2022

Meeting Notes 2022 jupyter-server/team-compass#15

Closed

kevin-bates mentioned this issue Nov 21, 2022

New configurable/overridable kernel ZMQ+Websocket connection API #1047

Merged

5 tasks

kevin-bates mentioned this issue Nov 29, 2022

Extract out kernels #1104

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DISCUSS] Service Composition #178

[DISCUSS] Service Composition #178

kevin-bates commented Jan 30, 2020

kevin-bates commented Feb 4, 2020

echarles commented Feb 4, 2020

kevin-bates commented Feb 4, 2020

Zsailer commented Feb 4, 2020

kevin-bates commented Feb 5, 2020

kevin-bates commented Feb 8, 2020

echarles commented Feb 8, 2020