Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defending against query selector injections for v6.0 #10430

Merged
merged 6 commits into from
Jul 29, 2021
Merged

Defending against query selector injections for v6.0 #10430

merged 6 commits into from
Jul 29, 2021

Conversation

vkarpov15
Copy link
Collaborator

2 different approaches for defending against query selector injections in Mongoose 6:

  1. Add a sanitizeFilter option, like sanitizeProjection, that removes all query selectors from queries by default, unless you wrap the query selector in a querySelector() function call.

For example:

const User = mongoose.model('User', mongoose.Schema({ name: String, password: String }));

await User.findOne({ name: 'Val', password: { $ne: null } }).setOptions({ sanitizeFilter: true }); // Throws a CastError

await User.findOne({ name: 'Val', password: mongoose.querySelector({ $ne: null }) }).setOptions({ sanitizeFilter: true }); // Works!

The upside is that it is easy for us to add sanitizeFilter as a global option. But then any queries that use query selectors, even legitimate ones, will break.

  1. Add Query#eq(), and allow passing an object to Query#eq(), Query#lt(), etc.

For example:

const User = mongoose.model('User', mongoose.Schema({ name: String, password: String }));

await User.findOne().eq({ name: 'Val', password: { $ne: null } }); // Throws a CastError

await User.findOne({ password: { $ne: null } }).eq({ name: 'Val' }); // Works!

I was working on approach (1) for a while, but I recently came up with approach (2). The downside of approach (2) is that it requires query chaining and isn't easy to "turn on" with one global option. But I think it is a less obtrusive approach for handling mixed trusted and untrusted data, like:

await User.findOne({ isDeleted: { $ne: true } }).eq(req.query);

// versus:
await User.findOne({ mongoose.querySelector({ isDeleted: { $ne: true } }), ...req.query });

What do you think @IslandRhythms @AbdelrahmanHafez , (1) or (2), or both, or neither?

@vkarpov15 vkarpov15 self-assigned this Jul 7, 2021
Copy link
Collaborator

@IslandRhythms IslandRhythms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think option two would be better since option 1 has the potential to break people's code from a standard use.

@taxilian
Copy link
Contributor

taxilian commented Jul 8, 2021

I have a few thoughts:

  1. Since this will break code in unexpected ways, it probably can't be on by default, but it might make sense to have a warning on startup at least when in dev mode if it isn't either enabled or explicitly disabled -- otherwise people just won't know about it and may think it's on.

  2. Once it is on, the expectation should absolutely be that it will break any existing code that doesn't specifically account for it; thus I prefer option (1) but think we should do as much as possible to make it easy to update "trusted" code to work with the new method. Perhaps add a .trusted() chained method on Query (like .lean()) which if added indicates "this query is trusted and does not need to be sanitized" -- that makes it easy to go through and with a small change make everything work. You could have an alternative syntax where if .trusted() tells it that the whole query is trusted, but .trusted(query) tells it to trust just the part of the query which is passed in to that function, so you can have a portion of the query where you put user-generated input and another part where you put your "trusted" query.

  3. Whatever error messages are generated need to be as clear as possible and tell people where to find information on fixing the problem. Goes without saying, but I said it anyway =]

Examples of point 2:

// Everything in the query is marked as trusted
const allTeachersAndStudents = await UserModel
    .find({type: {$in: ['teacher', 'student']}})
    .trusted() // Mark full query trusted
    .exec();

// The first portion is sanitized, second portion trusted
const teacherOrStudentCur = await UserModel
    .find({username: req.param.username}) // sanitized portion
    .trusted({type: {$in: ['teacher', 'student']}}) // "trusted" portion
    .exec();

Copy link
Collaborator

@AbdelrahmanHafez AbdelrahmanHafez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think people will more likely use option 2 in their existing applications, but I think providing both options and letting people choose what they prefer is a good idea.

A third option would be to set the option on a schema path which will sanitize that path by default unless mentioned otherwise.

const userSchema = new Schema({
  username: { type: String, sanitizeFilter: true },
  password: { type: String, sanitizeFilter: true },
  age: Number
});

@AbdelrahmanHafez
Copy link
Collaborator

The following is a rant regarding how we protect ourselves from that attack:

In general, we categorize our APIs in two different categories:

  • APIs that take actions on documents (POST/PUT).
    For those APIs, we strictly validate the user input using Joi (I'm a big fan), the input is usually a documentId and other necessary data for processing the action, such as the customer address, preferred delivery time, etc.

We take the documentId, merge it with the userId from the already verified access token, and that's our filter, so users can't directly pass the filter in this case.

  • APIs that GET documents:
    We try to make our GET APIs as flexible as possible, so we allow users to pass any filter they want, and we just add the userId to verify that a user can only access documents they own, however, there's a risk with that approach that's explained with the following example:

We have an items collection, each item has a cost and price field, the cost is how much we bought it from a seller, and the price is how much we're selling it to the customer. We don't want to show the customer the cost so we .select('-cost');.

Someone with enough time can get the cost even though we're not selecting it by bruteforcing the filter:
unitCost: { $gt: 50, $lt: 100 }, and now we know that the items coming back from this API call have cost between 50 and 100, in this case we can add sanitizeFilter: true to the cost schema path as an option, which will make it more difficult but technically possible by just passing unitCost: 50, so even sanitizeFilter won't work in this case. My best option here would be to

delete query.cost
but then again someone smart enough can send the following query $and: [{ cost: 50 }] which won't get caught by delete query.cost.

@vkarpov15 vkarpov15 merged commit cebb0d1 into 6.0 Jul 29, 2021
@Uzlopak Uzlopak deleted the gh-3944-2 branch June 14, 2022 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants