Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize update_user_groups to a fixed number of DB queries. #220

Closed

Conversation

tim-schilling
Copy link
Member

By using Django's bulk_create and set operations we can reduce the
time complexity of this function from two sorts to none and from
3 + (2ish * (claim_groups - existing _group_names) OR (claim_groups - existing _group_names) ) DB queries to a constant 3.

The downside of this approach is that the bulk_create operation
will not cause the pre_save and post_save signals to be raised
when creating the Group instances.

By using Django's bulk_create and set operations we can reduce the
time complexity of this function from two sorts to None and from
3 + (2ish * (claim_groups - existing _group_names) OR (
claim_groups - existing _group_names) ) DB queries to a constant 3.

The downside of this approach is that the bulk_create operation
will not cause the pre_save and post_save signals to be raised
when creating the Group instances.
@sonarcloud
Copy link

sonarcloud bot commented Feb 8, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@codecov
Copy link

codecov bot commented Feb 8, 2022

Codecov Report

Merging #220 (0914b94) into master (a2e9b37) will decrease coverage by 0.2%.
The diff coverage is 70.0%.

@@           Coverage Diff            @@
##           master    #220     +/-   ##
========================================
- Coverage    85.6%   85.3%   -0.3%     
========================================
  Files          11      11             
  Lines         500     492      -8     
========================================
- Hits          428     420      -8     
  Misses         72      72             
Impacted Files Coverage Δ
django_auth_adfs/backend.py 84.3% <70.0%> (-0.8%) ⬇️

@JonasKs
Copy link
Member

JonasKs commented Feb 9, 2022

Hi @tim-schilling! I'll look into this later today. Thank you 🚀

As a first thought I don't mind those signals not being kicked off, but I think the discussion of a major version bump should be had. What do you think?

Personally I don't believe the group creation signal is mentioned anywhere and not really officially supported, but I'd like to hear opinions. Efficiency on these things is a far more important concern to me.

CC @sondrelg

@sondrelg
Copy link
Member

sondrelg commented Feb 9, 2022

Hmm this is a tough one.

I doubt there are many projects that rely on signals based off of these events. At the same time, if there are, the outcome could be pretty bad for them if they upgrade and this change flies under the radar. I think if we accept the change, we should probably bump major versions.

I'd normally advocate not having too high of a bar for major bumps - I think upgrading often is good generally, but maybe it's worth asking what the performance impact of the reduced time complexity is here. Have you benchmarked it @tim-schilling? If it's negligible in practice and we lose signals, maybe it wouldn't be a net benefit to users. I'd be happy to help try test it later this week if you want 🙂

And it's been a while since I dived into the signals API, but is it possible to add these back manually? This line from the Django project seems promising. If so, perhaps that would remove the downside.

@tim-schilling
Copy link
Member Author

As a first thought I don't mind those signals not being kicked off, but I think the discussion of a major version bump should be had. What do you think?

I think a major upgrade is reasonable. This change would be easy to miss and in the right circumstances have major impacts on an integration.

Have you benchmarked it @tim-schilling?

I have not. I have thought about it more and it does seem relatively limited. The groups claim from AD will at most have 200 elements. I'll try to put something together so we have a better idea of what it would mean.

And it's been a while since I dived into the signals API, but is it possible to add these back manually? This line from the Django project seems promising. If so, perhaps that would remove the downside.

Unfortunately I don't think that's going to be feasible. I suspect the django devs would have supported this if they could have. The problem is that we don't know which objects are being created before running bulk_create. In Django 3.2+, it returns the created instances, but even then we'd be past the point where pre_save is helpful.

I could create a configuration option that allows the signals to be raised when MIRROR_GROUPS is enabled to allow devs to choose moving forward. Or that could get added if someone finds that they wouldn't be able to upgrade because of that lack of functionality.

@JonasKs
Copy link
Member

JonasKs commented Feb 10, 2022

I think a major upgrade is reasonable. This change would be easy to miss and in the right circumstances have major impacts on an integration.

Alright, let's do that if you want to merge this, based on your benchmark findings. 👍

@tim-schilling
Copy link
Member Author

I'm closing this for now. I may revisit it in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants