Proposal cylc authorisation #125

datamel · 2021-04-26T12:56:37Z

Proposal for UI-Server authorisation.
Proof of concept PR : cylc/cylc-uiserver#204

datamel · 2021-04-27T12:37:32Z

Converting this to a draft as the config section in the proposal requires additional thought.

hjoliver

Looks really good so far @datamel, nice and clear. A few things to discuss at the next meeting, if not resolved by then.

…-admin into proposal-cylc-authorisation

hjoliver · 2021-05-05T01:38:47Z

I took another pass through this, still looks great 👍

The only thing I'm wondering about is the current read/write/execute model, where execute permission (scheduler start, stop) implies write permission (task mutations). Arguably that should be the other way around, or perhaps separate with no inheritance of one from the other. If you can start my workflow it will run exactly what I configured it to run. But if you can execute task mutations on my running workflow you can change what actually runs (potentially drastically).

dwsutherland · 2021-05-07T02:21:32Z

Yeah, need something like read/play/edit

Where play/operate would be stop/play/trigger ...etc, and write would be file-manipulation(viewing?)/trigger-edit/broadcast

UI Server back-end code should prevent users manipulating client-side code to bypass authorisation. For example, sending a GraphQL query for a different operation to that stated in the operation name parameter.

As mentioned elsewhere, we can achieve fine grained authorisation via middleware. For example:
https://docs.graphene-python.org/en/latest/execution/middleware/#example
where can write any function into this, and have all information needed here (even the cached authorisation config if so desired).. And it should be the first middleware that is executed (the last thing is always the resolver fetching/command-run). i.e.

        return self._create_handler(
            path,
            clazz,
            schema=schema,
            resolvers=self.resolvers,
            backend=CylcGraphQLBackend(),
            middleware=[AuthorisationMiddleware, IgnoreFieldMiddleware],
            **kwargs
        )

https://github.com/cylc/cylc-uiserver/blob/1c194fdf59983d009f9afe10a2e3091fce9fc746/cylc/uiserver/main.py#L364

The reason we should use Graphene's inbuilt middleware, is because request object methods (like get_graphql_params in your PoC) only give you the operation i.e. query, mutation, subscription. And you'd need to parse the graphql doc from there, because you can run something like this:

mutation {
  pause (workflows: ["fox/run1"]) {
    result
  }
  kill (workflows: ["fox/run1"], tasks: ["*"]) {
    result
  }
}

At the middle level you could allow pause and do nothing (or return unauthorised) on kill (or w/e combination).
However, I'm not entirely sure how the UI will get the information to grey-out functionality.. We may just have to use a UIS query or separate end-point, as GraphQL introspection probably won't send different content for different users (?).

I've tried a number of methods for fetching group membership for the authenticated browser users i.e. return the same groups as I see using groups on the Linux command line, which should be both local groups and remote via SSSD/LDAP.

I don't think we need to tether the grouping of our UIS authorisation to the system groupings (as the HPC/System admins may have their own grouping ideas), but allow for users to decide which other system users can do with their suites. At NIWA, I don't have write permission or are part of the same file system grouping as our operational role accounts (but have sudo access to assume their role). Obviously the config should be user write only (0644), and this should be checked when loading it in.

It will essentially allow you to give full permission of your user and it's capabilities (as with edit/execute permission, file/job edit and broadcast can be made to do anything you like), so to please HPC/systems/security folk, you'd need to have a max giveaway configurable at the site level (defaulted to read).

Not directly related to this proposal, although perhaps worth bearing in mind is access on the command line.

Once the CLI can go through the proxy => hub => UIS, we'll be covered 👍

Nice work!

datamel · 2021-05-07T08:46:38Z

Thanks so much for the feedback @dwsutherland. I had investigated middleware and moved away from it initially, due to other things needing authorisation (not just graphql queries/mutations). Thanks for your example, I shall have another look.
The POC is a little outdated (as the proposal has evolved), I will look at updating it and this proposal.

docs/proposal-multi-user-approach.md

datamel · 2021-05-25T09:21:09Z

So far, we have the following options.
It would be good to get some final thoughts from you all. Perhaps I need to extend my PoC for all 3?

Option A: Decorating handler actions

Adapt my current authorization PoC so that it parses the query body to get all operations in the graphql. I had originally identified this as something we would need to do (was aware that we could not rely on operation name alone):

# TODO Should also check the query e.g. it starts with 'mutation pause'
# For security in case an operation name is sent different to the query
# Hence fetching below

I still feel strongly that the responsibility of authentication and authorization should sit with the handler layer and prevent unauthorised access as early as possible. We expose more code, the further down we check authorization, which is not great if we’re also being a bit hacky with said code to make it do auth elsewhere.

Option B: Middleware

As @dwsutherland rightly suggests, we could place the auth in graphene middleware. Although we do not have immediate access to the authenticated user at this point.
So, to do this we’d push additional properties onto the middleware object in the handlers. I have tested this (obviously just hard-coded to the second element but we can find it via object type).
For the subscription handler via the subscription server field e.g.

           self.subscription_server.middleware[1].current_user = self.current_user
           self.subscription_server.middleware[1].user_auth_config = self.user_auth_config

For the graphql handler via the middleware list:

    self.middleware[1].current_user = self.current_user

This worked fine, although we would still need to parse the graphql in the middleware to determine the operations.
This would only cover authorisation for graphql and not the static files, the main UI handler or the user profiler handler. It would also mean having 2 independent authorisation solutions, both needing maintenance etc.

Option C: Decorating the resolver functions

As per @oliver-sanders thoughts that we should avoid additional parsing of the query, which we need to do to discover operations in a query.
My instincts when I first picked this work up was to work with the resolvers themselves but I was told that only the UI server code should be touched. So I have not had a chance to prove this.

I’m happy to give it a go for another PoC but there’s a few points for discussion first:
• Do we really want to pass authentication that far down from the handler? Feels wrong to raise a web.HTTPError(403) from cylc flow code, which shouldn’t have ‘knowledge’ of the website side of things.
• Might be hacky getting user details that far (not proved it yet).
• Wouldn’t we be doing additional processing over every resolver being hit (until an unauthorized operation is reached)? - surely this would tip the argument in favour of option A which, granted, has drawback of additional parsing of the query but all three options have some additional processing debt.
• As Option B: This would only cover authorization for graphql and not the static files, the main UI handler or the user profiler handler. It would also mean having 2 independent authorization solutions, both needing maintenance etc.

Also worth noting, I have looked into whether we can parse the request at handler and then continue passing this through, ready parsed, but the graphene layer parses the request regardless.
https://github.com/graphql-python/graphene-tornado/blob/2800ff581fd2edd665d2cf996e1601e7e892cb53/graphene_tornado/tornado_graphql_handler.py#L171.
I could put a pr into graphene to skip when it receives an already parsed request but I suspect it would not be accepted as that is major code change which could have a knock on effect for other users. So, I feel if we go with this option we are stuck parsing twice.

On a related point, if a user requests multiple mutations in one request, and they are authorised for e.g. 2 out of 3, do we want…
a) Two to be allowed and the third to be blocked and logged
b) The entire request to be blocked and logged, i.e. all 3 mutations denied?
I'm thinking most likely a) but thought I'd double check.

oliver-sanders · 2021-05-25T14:13:26Z

I can see the temptation to parse the GraphQL document up front but I'll chuck in a few comments to show how this could be vulnerable.

This approach depends a lot on our ability to replicate the internal workings of Graphene as closely as possible and to keep up with their internal implementation. If we lapse the approach might not be quite as iron-clad as we would like it to be.

In parsing the document upfront we are trying to guess what endpoints that the request would hit rather than waiting for the server to tell us what endpoints are being hit. If we guess wrong then we are in trouble, here are some conditions that could break our guesses:

We parse the document differently to the Graphene internals.
We don't interpret the results correctly.
The attacker finds a way to disguise a mutation or change its identifier in the request.
The Graphene internal logic changes in a future version (we don't have control over what version the user installs).
The GraphQL language evolves providing new representations of mutations.
A bug in graphene causes something to be called which wasn't supposed to be.

Then, without any protections further down the chain the entire authorisation layer could be sidestepped.

In a conventional HTTP(s) server (like the Cylc 7 suite server) the server routes each request based on the URL, any auth is performed at the endpoint. In a GraphQL server the routing is based on the GraphQL document, the endpoints are the resolvers.

I found an authorisation framework for GraphQL (2.8k⭐️) called graphql-shield, sadly it's Javascript (like all the GraphQL goodies) so we can't use it, however, worth noting they went down the middleware approach.

hjoliver · 2021-05-26T05:43:43Z

I agree with @oliver-sanders on this.

dwsutherland · 2021-06-09T00:25:41Z

I agree with @oliver-sanders too, we should not be parsing the GraphQL document/object ourselves.. For all the stated reasons.

However, we really need to be sure of what information we have and what we can do with GraphQL already, before ruling things out.. So just some clarifications on Option B:

Although we do not have immediate access to the authenticated user at this point.

Can't we just pass this in via info/context? You simply set fields on info.context.
It may even be simpler than that, because the whole request object is also available via info.context['request'] (or the like).

This worked fine, although we would still need to parse the graphql in the middleware to determine the operations.

No (sorry to be blunt). This information is all available (info.operation.operation, info.operation.name.value.. info.operation more generally), no parsing.
For example:

class IgnoreFieldMiddleware:
    """Set to null/None type undesired field values for stripping."""

    # Sometimes `next` is a Partial(coroutine) or Promise,
    # making inspection for know how to resolve it difficult.
    ASYNC_OPS = {'query', 'mutation'}

    def __init__(self):
        self.args_tree = {}
        self.tree_paths = set()
        self.field_sets = {}

    def resolve(self, next, root, info, **args):
        """Middleware resolver; handles field according to operation."""
        # GraphiQL introspection is 'query' but not async
        if getattr(info.operation.name, 'value', None) == 'IntrospectionQuery':
            return next(root, info, **args)

        if info.operation.operation in STRIP_OPS:
            path_string = f'{info.path}'
.
.
.

There's all the information we need here. i.e. info.field_name, info.path... etc

I think we need two levels:

Top level: Handlers .. etc.
GraphQL: Middleware

Also:

In a GraphQL server the routing is based on the GraphQL document, the endpoints are the resolvers.

Middleware is essentially a wrapper to all resolvers, including the default resolver (not defined in our code) which reads the data from root argument:
https://github.com/graphql-python/graphene/blob/master/graphene/types/resolver.py
So Middleware is like a decorator, without all the effort of decorating (and defining our own decorated default resolver)

…d ADMIN

oliver-sanders

All looks good to me.

Just one little curveball that you may have already spotted, and which we were aware before we started on this. Commenting it here for the record and also as a reminder that we should add a note of some form to the documentation when we write it.

These site-configured limits can be hacked and overwritten by users.

There are at least two Cylc mechanisms by which the limits could be hacked from user-space:

Setting CYLC_SITE_CONF_PATH, writing your own site config and injecting it:
- In the spawner args (user config).
- In shell profile files (unless Cylc is deployed in a venv of some form with a wrapper script).
- On the cli (i.e. starting the UIS manually).
Monkeypatching the UIS code from the config file.
- Because the config file is in Python and you can change anything in Python.
- We could switch to a JSON config file, but, Jupyter server supports Python config files and we can't turn that support off.

These sound like security bugs, but they are not. Preventing these mechanisms from being exploited would be pseudo security. There are plenty of non Cylc-specific mechanisms too:

Monkeypatch the UIS code from the PYTHONSTARTUP file.
- This applies to ALL Python programs!
Wrap the Cylc UIS code to create your own uis command...
- ...and get the spawner to run this in the user config.
- ...and patch it by setting PYTHONPATH.
I'm sure many OS level approaches.

On the surface this might sound bad, but it isn't because:

The UIS runs in user space, users have control over this, there is nothing Cylc can do to stop them.
Users can always give away full access to their accounts by sharing credentials (short of biometric security even 2FA can't prevent this).
The only way to truly achieve this would involve running the entire UIS under a site-controlled privileged account.

So in the same way that "not giving away your user credentials" or "not using infrastructure for malicious purposes" are enforced as human policies "not hacking Cylc to fiddle the site-defined limits" is a human factor too.

docs/proposal-multi-user-approach.md

hjoliver · 2021-06-09T23:10:20Z

docs/proposal-multi-user-approach.md

+
+We will need to consider the desired behaviour if a user appears twice with different defaults and limits set, this is probably most likely to occur when a user appears in either multiple groups or in a group and as a user.
+
+Proposal: all permissions are additive, if user appears elsewhere in config, the permission level should be taken as the greatest possible. Note, default in site config does not contribute to permissions, unless user does not appear in ui server owner's config.


That's sensible. If the higher level of permission appears in the config, then no one can rightfully complain about us granting it (even if lower levels are explicitly granted as well).

docs/proposal-multi-user-approach.md

hjoliver · 2021-06-09T23:20:01Z

docs/proposal-multi-user-approach.md

+
+Some of the below are not currently available to users but including them here for consideration. This is also a fairly substantial list, which makes the case for read and admin pre-set access groups which would be easier for users and sites to configure, rather than defining a long list of mutations per user/group.
+
+As a springboard for discussion, defaults could be assigned as follows:


Looks to me like you've nailed it, as-is.

docs/proposal-multi-user-approach.md

hjoliver · 2021-06-09T23:30:00Z

Excellent diagrams 🎉

docs/proposal-multi-user-approach.md

hjoliver

Nice proposal 👍 Just a handful of minor comments and suggestions.

docs/proposal-multi-user-approach.md

Update docs/proposal-multi-user-approach.md Co-authored-by: Hilary James Oliver <hilary.j.oliver@gmail.com> Update proposal with READ/CONTROL decision

oliver-sanders

👍

dpmatthews

Several minor changes needed now ALL is not the same as READ + CONTROL

docs/proposal-multi-user-approach.md

hjoliver · 2021-06-17T01:16:11Z

🎉

datamel added 3 commits April 26, 2021 12:55

UI Server Authorisation Proposal

f25f599

Add Auth Diagram

e174859

tidy

a54f988

datamel self-assigned this Apr 26, 2021

hjoliver self-requested a review April 27, 2021 05:40

update config load and enchancements

942ee6d

datamel marked this pull request as draft April 27, 2021 12:37

hjoliver reviewed Apr 28, 2021

View reviewed changes

datamel added 6 commits May 4, 2021 22:27

Add Auth Images

fed25b0

Add files via upload

ae2887d

Config proposal changes

fe29396

Merge branch 'proposal-cylc-authorisation' of github.com:datamel/cylc…

81a7071

…-admin into proposal-cylc-authorisation

Add diagrams

1f7f369

add use case for auth

5c63a53

datamel marked this pull request as ready for review May 5, 2021 01:35

datamel requested a review from dpmatthews May 5, 2021 01:35

Apply suggestions from DM

1bd7ffd

datamel commented May 11, 2021

View reviewed changes

docs/proposal-multi-user-approach.md Outdated Show resolved Hide resolved

dpmatthews reviewed May 12, 2021

View reviewed changes

docs/proposal-multi-user-approach.md Outdated Show resolved Hide resolved

dpmatthews reviewed May 12, 2021

View reviewed changes

docs/proposal-multi-user-approach.md Outdated Show resolved Hide resolved

Tidy example config

5f89f1e

datamel requested a review from oliver-sanders May 14, 2021 13:58

Add examples, change proposal wrt default inheritance between READ an…

0ab1272

…d ADMIN

oliver-sanders reviewed Jun 9, 2021

View reviewed changes

docs/proposal-multi-user-approach.md Outdated Show resolved Hide resolved

hjoliver reviewed Jun 9, 2021

View reviewed changes

docs/proposal-multi-user-approach.md Outdated Show resolved Hide resolved

hjoliver reviewed Jun 9, 2021

View reviewed changes

docs/proposal-multi-user-approach.md Outdated Show resolved Hide resolved

hjoliver reviewed Jun 9, 2021

View reviewed changes

docs/proposal-multi-user-approach.md Outdated Show resolved Hide resolved

hjoliver reviewed Jun 9, 2021

View reviewed changes

docs/proposal-multi-user-approach.md Outdated Show resolved Hide resolved

hjoliver reviewed Jun 9, 2021

View reviewed changes

docs/proposal-multi-user-approach.md Show resolved Hide resolved

datamel added 2 commits June 10, 2021 10:50

Update diagram

2a42d24

ADMIN to CONTROL and amendments from review

9a675c0

Update docs/proposal-multi-user-approach.md Co-authored-by: Hilary James Oliver <hilary.j.oliver@gmail.com> Update proposal with READ/CONTROL decision

datamel force-pushed the proposal-cylc-authorisation branch from 9d98014 to 9a675c0 Compare June 10, 2021 09:51

update access group table

f50e4a0

oliver-sanders approved these changes Jun 16, 2021

View reviewed changes

dpmatthews reviewed Jun 16, 2021

View reviewed changes

Review Feedback

09ea3f1

hjoliver approved these changes Jun 17, 2021

View reviewed changes

hjoliver merged commit 5a995cc into cylc:master Jun 17, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal cylc authorisation #125

Proposal cylc authorisation #125

datamel commented Apr 26, 2021

datamel commented Apr 27, 2021

hjoliver left a comment

hjoliver commented May 5, 2021

dwsutherland commented May 7, 2021 •

edited

Loading

datamel commented May 7, 2021

datamel commented May 25, 2021

oliver-sanders commented May 25, 2021

hjoliver commented May 26, 2021

dwsutherland commented Jun 9, 2021 •

edited

Loading

oliver-sanders left a comment

hjoliver Jun 9, 2021 •

edited

Loading

hjoliver Jun 9, 2021

hjoliver commented Jun 9, 2021

hjoliver left a comment

oliver-sanders left a comment

dpmatthews left a comment

hjoliver commented Jun 17, 2021


		We will need to consider the desired behaviour if a user appears twice with different defaults and limits set, this is probably most likely to occur when a user appears in either multiple groups or in a group and as a user.

		Proposal: all permissions are additive, if user appears elsewhere in config, the permission level should be taken as the greatest possible. Note, default in site config does not contribute to permissions, unless user does not appear in ui server owner's config.


		Some of the below are not currently available to users but including them here for consideration. This is also a fairly substantial list, which makes the case for read and admin pre-set access groups which would be easier for users and sites to configure, rather than defining a long list of mutations per user/group.

		As a springboard for discussion, defaults could be assigned as follows:

Proposal cylc authorisation #125

Proposal cylc authorisation #125

Conversation

datamel commented Apr 26, 2021

datamel commented Apr 27, 2021

hjoliver left a comment

Choose a reason for hiding this comment

hjoliver commented May 5, 2021

dwsutherland commented May 7, 2021 • edited Loading

datamel commented May 7, 2021

datamel commented May 25, 2021

Option A: Decorating handler actions

Option B: Middleware

Option C: Decorating the resolver functions

oliver-sanders commented May 25, 2021

hjoliver commented May 26, 2021

dwsutherland commented Jun 9, 2021 • edited Loading

oliver-sanders left a comment

Choose a reason for hiding this comment

hjoliver Jun 9, 2021 • edited Loading

Choose a reason for hiding this comment

hjoliver Jun 9, 2021

Choose a reason for hiding this comment

hjoliver commented Jun 9, 2021

hjoliver left a comment

Choose a reason for hiding this comment

oliver-sanders left a comment

Choose a reason for hiding this comment

dpmatthews left a comment

Choose a reason for hiding this comment

hjoliver commented Jun 17, 2021

dwsutherland commented May 7, 2021 •

edited

Loading

dwsutherland commented Jun 9, 2021 •

edited

Loading

hjoliver Jun 9, 2021 •

edited

Loading