Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial commit of new auth example, including todo + spec #3777

Merged
merged 18 commits into from
Oct 14, 2020

Conversation

JedWatson
Copy link
Member

There's a bit going on in this PR.

Primarily, I want to add an example project that specifically demonstrates and documents how to use the new auth package. This actually includes quite a bit of access control example as well; that makes sense for now because they work together, but in the future I expect we'll fork this into two projects (one demo'ing auth, with minimal access control, and a more comprehensive access control demo with a simpler auth setup)

Secondarily, I think I've finished spec'ing what "complete" looks like for both access control and field modes in the new interfaces. That includes using the new types landed in #3774 and adding some fantasy API demos.

Finally it includes quite a bit of TODO documentation, describing what we need to do to get all this working (side note: we really need to revisit the todo markdown file in the auth package, it's now quite out of date, but I'm calling that out of scope for this PR)

In particular, I've written up a lot of notes on how I think we should change the access control arguments. We have to make breaking changes here anyway because of the new session stuff, so it's probably a good time to review these and consolidate / clean up some stuff.

That might actually include a change to how access control is invoked for mutations that operate on multiple items, which I'd really like @timleslie's input on. As far as breaking changes to the main keystone package(s) go it's probably not a painful one for anyone using keystone, but I'm not sure and it needs discussion.

I'd really like @mitchellhamilton @molomby @Noviny and @timleslie to review this example (and the comments / docs / spec) and give feedback now that we can see how it's all coming together 🙏

@changeset-bot
Copy link

changeset-bot bot commented Sep 24, 2020

⚠️ No Changeset found

Latest commit: 301dbff

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@JedWatson
Copy link
Member Author

Oh and a note; this will fail listing because there are deliberate type errors. They're not going away until we either fix the types (and underlying implementation as I've described) or update the spec I've written up.

lists,
session: withItemData(
// Stateless sessions will store the listKey and itemId of the signed-in user in a cookie
statelessSessions({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean to be "stateless"? Are these JWTs? How are they expired?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming is hard 😛

Stateless -- yes, they're effectively JWTs. There are a lot of different approaches to expiry and we don't actually encode an opinion. Basically the data (in this case { itemId, listKey } is stored in an encrypted cookie that we decrypt to hydrate the initial session value. (An alternative being, which needs its own example, using the cookie to store the ID of a session in a data store -- we've added a redis adapter for this so far in the new interfaces)

In this actual implementation though, withItemData is taking that cookie and querying the database, to hydrate the item. It will clear the session (i.e read it as undefined, not actually clear the cookie) if the item can't be found or doesn't validate.

So this is basically working the same way Keystone does now, except each of these decisions is a composable piece of functionality. You could use the stateless sessions without the item hydration; or you could use a session database to store the itemId and listKey and use hydration; or any other combo.

We've tried to make how it currently works easy for people to use these high-level functions to compose as if it's config, but internally these are all independent things working together so it's much more flexible.

Copy link
Contributor

@jesstelford jesstelford Sep 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That all makes sense.

I worry though about the security implications of having a forever-alive token. As long as the itemID is valid, that token is forever-valid and can never be expired.

Actual JWTs at least have an exp key which forces the JWT processor to act as if the token is invalid after that expiry time.

That's not to mention the problem with JWTs in general with respect to state & expiration, which I outline with a strong sarcastic tone in this Twitter thread:

JWTs are a really great idea!
... Until they meet reality; then you realise they're useless.

JWTs are stateless! No more storing current sessions in a database! Hurray!
Oh, you want to logout? Well, you can't invalidate a JWT, so you'll have to blacklist its ID.
Where is that blacklist stored? Why, in a database of course!

JWTs can expire! Set a short TTL so even if one leaks, it wont by useful after 5 min.
Your users hate logging in every 5 mins? Well, you can provide a refreshToken! That'll get you a new JWT!
When does that refreshToken expire? Never, of course! Infinite JWTs for all!

The argument I've heard for this one is; The refreshToken is like a CSRF token; it's sent with every auth request and can only be used once.
How do you know if it's been used? You store it on the User table of course!
Oh, you want multi-device login... ¯\_(ツ)_/¯

Hint: The answer is "A session table".

For a different perspective with a more balanced view (and less sarcastic tone), see @molomby's comment here: #170 (comment)

At any rate, the change away from session based to token based is a large one that can have far reaching impacts on the security and auth model of Keystone apps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Stateless" is a terrible/completely wrong name but +1 that naming is hard in this context.

*
* Or maybe my understanding is wrong? This is what's implied by the documentation (see
* https://www.keystonejs.com/api/access-control#field-level-access-control) but then if it's not
* the case I don't see how `existingItem` works in mutations that operate on multiple items.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API here should be existingItems (note the plural), which then allows looking at the items to determine if access is allowed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm not sure (haven't checked yet) if that's a bug or the docs are wrong... @mitchellhamilton did say that he thought the field access control is resolved differently to list access control but we want to check with @timleslie about it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick search on GitHub turned up this: https://github.com/keystonejs/keystone/pull/3241/files /

async updateManyMutation(data, context, mutationState) {
const operation = 'update';
const gqlName = this.gqlNames.updateManyMutationName;
const ids = data.map(d => d.id);
const extraData = { gqlName, itemIds: ids };
const access = await this.checkListAccess(context, data, operation, extraData);
const existingItems = await this.getAccessControlledItems(ids, access);
const existingItemsById = arrayToObject(existingItems, 'id');
const itemsToUpdate = zipObj({
existingItem: ids.map(id => existingItemsById[id]),
id: ids, // itemId is taken from here in checkFieldAccess
data: data.map(d => d.data),
});
// FIXME: We should do all of these in parallel and return *all* the field access violations
await this.checkFieldAccess(operation, itemsToUpdate, context, extraData);
return Promise.all(
itemsToUpdate.map(({ existingItem, id, data }) =>
this._updateSingle(id, data, existingItem, context, mutationState)
)
);
}

I was wrong in my assertion that it's a plural. It's still singular, but gives an array of things to check by the looks of it.

*
* This seems like a real foot-gun, and I'd prefer that they're implemented consistently, in favor
* of aligning on FieldAccessInput and resolving the function once per item being updated in the
* case of mutations that operate on multiple items.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure the reasoning behind why ListAccess seems to be resolved once per mutation for mutations that operate on multiple items, while FieldAccess seems to be resolved once per item

Because there's only one list, but there are multiple items. No need to re-resolve access on the list level over and over again when the result will be the same every time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That "resolve the list level" thing makes sense for create, but this is more about "item" and "field" level access -- if you're looking at the item data at all for an update check, with multiple itemIds, how do you say "some of these but not others"? does the entire operation fail? how do we return which items you don't have access to? There may be answers but they're not in our docs.

And I would imagine it leads to more complex access control functions because you always need to code for either single or multiple itemId(s) being provided and handle each case correctly.

I think it's simpler and much safer to only write these functions to check a single item... and call it one time each for multi-item mutations. (and by safer, access control functions are really important for developers to get exactly right, so the bar for "it works how you think it works" is really high here)

Arguably this is less optimised when you don't need to access item data to resolve the access, but then again if you're checking anything on the item data, you have to resolve it for each item anyway, and I'd go for safety over optimisation here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we're talking about the same thing 👍

On a updateMany style mutation, it will check list-level access once, and will check field-level access once per item to be updated.

* I also have questions about `originalInput` in the case of nested mutations (e.g creating an
* item as the input to a relationship field). Is it the whole "original" input? Or just the nested
* input for the relationship? Seems like it would have to be the second, but it's not clear from
* the documentation. Is this consistent with how the `gqlName` argument works?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

originalInput is the entire input object as passed directly into the graphql mutation, regardless of what nested operation is running.

The point here is to give access to that input before it's modified by defaultValues.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you'd need to do an incredibly large amount of work to use that then.

If I'm checking input as part of resolving create access control for the user list, in this mutation:

mutation {
  createPost(data: {
    title: "Hello World"
    author: {
      create: {
        name: "Access Hack"
        isAdmin: true
      }
    }) {
    id
  }
}

If what I'm getting is the entire input, I need to check for all the ways users can be created.

Or maybe I'm misreading it and the input would be what's passed to author { create: { /* this input */ } } in which case that makes a heap more sense 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's the second (author { create: { /* this input */ }) 👍

Comment on lines 40 to 41
* tbh I'm inclined to remove the gqlName argument unless there's a compelling reason to have it;
* esp. given it's marked as optional.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gqlName argument is useful to know the difference between calling allUsers vs _allUsersMeta, for example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that actually deliver value? You'd need to (I imagine) handle every case for the top level gqlName if it was important in any way... ie you might resolve permissions for users through a relationship to content, in which case the top-level query name might be allPosts 🤷‍♂️

If distinguishing between querying items and meta is important, we could pass that in, but I also don't really get the use case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe I've misinterpreted the docs:

The name of the query or mutation which triggered the access check.

But still, then in the following query don't I need to account for the fields that trigger the query?

e.g if I'm writing an access check on users, and the query looks like:

query {
  allPosts {
    author {
      name
    }
  }
}

Is the gqlName allPosts / author or something else?

or in this case:

query {
  allUsers {
    _postsMeta {
      count
    }
  }
}

What's the gqlName for the Post list access check? allUsers / _postsMeta etc? in either case this seems super hard because you're encoding knowledge of all the relationships into access control.

@JedWatson JedWatson merged commit d9b2e59 into master Oct 14, 2020
@JedWatson JedWatson deleted the new-interfaces-auth-example branch October 14, 2020 05:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants