Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accessible by strategy: Exists subquery #691

Conversation

kaspernj
Copy link
Contributor

The new subquery strategy is great, but doing a "WHERE id IN (...)" can be slow, because Postgres will resolve the ID's on the right first, so if you have a big table that is going to take a lot of juice.

"WHERE EXISTS (...)" should be a lot faster, because it will already know the related ID's and only sub query those instead of resolving them first through the given sub query.

There is a problem however. It's difficult to alias the table properly in ActiveRecord. I have gotten around that by doing two exist-queries just to make it possible to reference the original table in the exists-subquery. It's still much faster than doing a "WHERE id IN (...)" however and it solves my own performance issues.

When I try to run specs with the following command I get an error: uninitialized constant CanCan::ModelAdapters::ActiveRecord5Adapter

Do I need to do something special to get the specs up and running?

@ghiculescu
Copy link
Contributor

Do I need to do something special to get the specs up and running?

Rebase against master, now that Travis has been replaced by Github Actions #707

@kaspernj kaspernj marked this pull request as ready for review May 13, 2021 07:59
@kaspernj
Copy link
Contributor Author

Rebase against master, now that Travis has been replaced by Github Actions #707

I still get the same error when trying to run the specs locally. I have tried running bundle exec rake and bundle exec rspec but both result in the same error unfortunately :'-(

@kaspernj kaspernj force-pushed the accessible_by_strategy/double_exist_subquery branch 9 times, most recently from 5d11842 to 58c7204 Compare May 13, 2021 09:06
@kaspernj kaspernj force-pushed the accessible_by_strategy/double_exist_subquery branch from 58c7204 to bd4fb38 Compare May 13, 2021 09:08
end
@model_class.where(@model_class.primary_key => inner)

build_joins_relation_subquery(where_conditions)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This refactor was needed to deal with Rubocop complexity offences.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably, at some point, extract these strategies

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coorasse I have extracted all strategies as separate classes (also to solve Rubocop complexity issues). What do you think?

@model_class.where("EXISTS (#{double_exists_query_sql(where_conditions)})")
end

def double_exists_inner_query(where_conditions)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure the code has gotten less abstract having to use this many helper methods, but I couldn't otherwise find a way around the Rubocop complexity offences :'-(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly might get rid of rubocop and switch to standard soon :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I feel that Rubocop has forced me to make the code more abstract than it had to be here :-)

@coorasse
Copy link
Member

I really like this PR, and I really like the fact that we start working on different strategies (SQL) to fetch the records. cancancan cannot always know what's the best strategy, so the only way we have is to allow users to use custom strategies.

Being said that. I'd like to include this new strategy. I want to first pull the branch and see if I can add more tests for more cases

@coorasse coorasse self-assigned this Jun 18, 2021
@kaspernj
Copy link
Contributor Author

@coorasse That sounds great. We cannot live without this strategy in the project I am working on, which involves a lot of really big tables with tons of data :-)

@kaspernj kaspernj changed the title Accessible by strategy: Double exist subquery Accessible by strategy: Exist subquery Jun 24, 2021
@kaspernj kaspernj changed the title Accessible by strategy: Exist subquery Accessible by strategy: Exists subquery Jun 24, 2021
@kaspernj
Copy link
Contributor Author

kaspernj commented Jun 24, 2021

I have refactored to two different strategies:

Joined alias exists subquery

Joins the same table with an alias, which is then used to make an exists query, which is faster than "WHERE id IN (...)". You will need to use the full table name for columns, since the alias will have the same columns - but you should always do that in ActiveRecord anyway ;-)

Joined alias each rule as exists

Joins the same table with an alias, iterates through all relevant rules, and uses them as separate or-exists queries.

This can be very useful to avoid ActiveRecord messing up the join names, if the same tables are used in multiple other different rules, which is something that has plagued us a lot. It is sometimes faster than the other strategies, but it might be your only way out, if ActiveRecord decides to mess you up, which is currently the case for us in Rails 6.1 (but works in 6.0.4 but not in 6.0.3).

It only left joins if nil values are detected, which can also affect the query speed.

Here you must also use the table names for columns to not mess with the alias.

relation.left_joins(joins).distinct
end
def strategy_class
require_relative "strategies/#{CanCan.accessible_by_strategy}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just require them at the top of the file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would require code that the user in most cases would never use. I assume that most users are just going to set a single strategy and use that for everything :-)

@coorasse
Copy link
Member

Thanks! I went through the PR and looks solid. I'll probably extract some parts of it and merge them separately.

@coorasse
Copy link
Member

@kaspernj regarding the tests, you can run them using appraisal (see README). For example:

DB='sqlite' bundle exec appraisal rake

@coorasse
Copy link
Member

I am now looking at the first of the two stretegies, and I don't understand why you need to join the table with itaself....:🤔

https://github.com/CanCanCommunity/cancancan/pull/691/files#diff-cf63e8e34aa3f5976e07653a98befdb9ae3cabef3996551d81352afdaa9e0f71R9

@coorasse coorasse mentioned this pull request Nov 12, 2021
@kaspernj
Copy link
Contributor Author

@coorasse The short version is to not dynamically join tables directly in the query, which can cause issues when using other libraries that work with dynamic queries like Ransack (and others) :-)

Here is an example.

  1. Multiple different roles that has different types of access.
  2. Based on which role the current user has a number of tables are joined.
  3. Same table is joined by different queries.
  4. Because the same table is joined multiple times the aliases changes based on which role the user has.

To get around this a sub query is used where CanCan joins the table. Manually joined tables (or tables joined through other libraries like Ransack) are unaffected by which role the user has.

This can also result in bugs, because joining tables with aliases can mess up in ActiveRecord. I have had multiple issues about this in both Rails and Ransack already. By joining the tables from CanCan in a sub query, we can minimise those issues greatly, which directly fixed a number of issues in two projects I was working :-)

@kaspernj
Copy link
Contributor Author

@coorasse So I see the two new strategies as tools to get around those issues. Maybe you wouldn't use these strategies by default, but they would be options to get around the mentioned issues if you run into them like I did :-)

@kaspernj
Copy link
Contributor Author

I am now looking at the first of the two stretegies, and I don't understand why you need to join the table with itaself....:🤔

https://github.com/CanCanCommunity/cancancan/pull/691/files#diff-cf63e8e34aa3f5976e07653a98befdb9ae3cabef3996551d81352afdaa9e0f71R9

@coorasse In order to do a sub query, I cannot use the original table name, so I have to alias the same table to use its ID in the sub query. If you know a better way of doing this then please enlighten me? :-)

@coorasse
Copy link
Member

I tried a lot of different things, but I am not able to generate an alias 😞 . I think your solution is the only working one, although I still really don't like that it performs a join that is not necessary. Pity. I'll think this through a bit more today, but if I don't find an alternative, I'd rather have this merged than not. So expect to have this available in the next release of cancancan anyway.

@kaspernj
Copy link
Contributor Author

@coorasse I feel the exact same way about the alias, and I have also sunk a considerable amount of time into trying to get around it :'-(

@kaspernj
Copy link
Contributor Author

The performance gains of this has been huge however :-)

@coorasse coorasse merged commit 24ed063 into CanCanCommunity:develop Mar 23, 2022
@kaspernj kaspernj deleted the accessible_by_strategy/double_exist_subquery branch May 1, 2022 07:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants