Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrity error (#93) #95

Merged
merged 8 commits into from
Sep 13, 2020
Merged

Integrity error (#93) #95

merged 8 commits into from
Sep 13, 2020

Conversation

JonasKs
Copy link
Member

@JonasKs JonasKs commented Jan 20, 2020

Attempt to fix #93. All tests pass and I've confirmed that it works locally.

Please let me know if you have any concerns or comments.

@JonasKs JonasKs requested a review from jobec January 20, 2020 17:34
@JonasKs
Copy link
Member Author

JonasKs commented Jan 20, 2020

Ah, I forgot to do a flake8 check. I’ll fix tomorrow when I get back to work, didn’t bring my laptop.

@codecov
Copy link

codecov bot commented Jan 21, 2020

Codecov Report

❗ No coverage uploaded for pull request base (master@390a2f9). Click here to learn what that means.
The diff coverage is 79.16%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master      #95   +/-   ##
=========================================
  Coverage          ?   82.35%           
=========================================
  Files             ?        7           
  Lines             ?      408           
  Branches          ?        0           
=========================================
  Hits              ?      336           
  Misses            ?       72           
  Partials          ?        0
Impacted Files Coverage Δ
django_auth_adfs/rest_framework.py 70.83% <ø> (ø)
django_auth_adfs/config.py 84.68% <100%> (ø)
django_auth_adfs/backend.py 79.59% <68.75%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 390a2f9...3741e3e. Read the comment docs.

@JonasKs
Copy link
Member Author

JonasKs commented Jan 21, 2020

I have figured that the real reason this would happen so often to us is because of a group name in AD being changed from myGroup to MyGroup in PascalCase.

groups_to_remove = set(django_groups) - set(claim_groups)
groups_to_add = set(claim_groups) - set(django_groups)

it would result in the following:

DEBUG 2020-01-21 11:04:26,562 django_auth_adfs User removed from group 'myGroup'
DEBUG 2020-01-21 11:04:26,564 django_auth_adfs Created group 'MyGroup'
DEBUG 2020-01-21 11:04:26,575 django_auth_adfs User added to group 'MyGroup'

In other words, on every request to the site, it would do a full clean, making it get an IntegrityError very often.

I you'd like, I could rather raise a different PR that simply does this. This won't solve the issue for integrity errors when a user has a different group set, though.

How ever, I have made the checks case insensitive when looking for groups in this PR.

@jobec
Copy link
Collaborator

jobec commented Jan 21, 2020

It depends whether Django's group names are case sensitive or not. And whether there are situation where it could be case sensitive (e.g. when someone extends django's authentication).

@jobec
Copy link
Collaborator

jobec commented Jan 21, 2020

Looking at the logs you posted, it does look like Django's groups are case insensitive themselves:

https://github.com/jobec/django-auth-adfs/blob/1e930661cc1cb6d9c9321c512a2bb4faac4f7ded/django_auth_adfs/backend.py#L198

The code determines the group needs to be deleted, but when "creating" the group, it actually retrieves the one with the lower case my again and adds the user to that group.

Strange... That's quite an unexpected quirk from Django.

@JonasKs
Copy link
Member Author

JonasKs commented Jan 21, 2020

You are correct.

If I do Group.objects.get(name='mygroup') in all lower case, it will still fetch the correct group:

>>> Group.objects.get(name='mygroup')
<Group: MyGroup>

EDIT: Same with get_or_create.

>>> group, created = Group.objects.get_or_create(name='mygroup')
>>> created
False
>>> group
<Group: MyGroup>

@jobec
Copy link
Collaborator

jobec commented Jan 21, 2020

Let me see where this case insensitivity comes from. I always though you need to do it explicitly with iexact

@jobec
Copy link
Collaborator

jobec commented Jan 21, 2020

When I test, getting groups by name is case sensitive (or at least on the latest 2.2.x version). So it looks like something modified the behavior of you Django setup

>>> Group.objects.get_or_create(name="MyTestGroup")
(<Group: MyTestGroup>, True)

>>> Group.objects.get(name="MyTestGroup")
<Group: MyTestGroup>

>>> Group.objects.get(name="MyTestGROUP")
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/vagrant/.venv/lib/python3.7/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/home/vagrant/.venv/lib/python3.7/site-packages/django/db/models/query.py", line 408, in get
    self.model._meta.object_name
django.contrib.auth.models.Group.DoesNotExist: Group matching query does not exist.

>>> Group.objects.get_or_create(name="MyTestGROUP")
(<Group: MyTestGROUP>, True)

>>> Group.objects.get(name="MyTestGROUP")
<Group: MyTestGROUP>

What is your Django version? What other packages did you install? What other apps are activated in Django?

@JonasKs
Copy link
Member Author

JonasKs commented Jan 21, 2020

Wait, that's really interesting.

I tested on my django-guid project now(Django 3.0.2), except that I changed the database to be a MySQL database, as it is in my real environment:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'HOST': 'localhost',
        'PASSWORD': '',
        'NAME': 'django-adfs-test',
        'USER': 'test',
    },
}

Driver used here is mysqlclient 1.4.6, as recommended in the docs.

from django.contrib.auth.models import Group
Group.objects.get_or_create(name="MyTestGroup")
(<Group: MyTestGroup>, True)
Group.objects.get(name="MyTestGroup")
<Group: MyTestGroup>
Group.objects.get(name="MyTestGROUP")
<Group: MyTestGroup>

How ever, with a sqlit3 database:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
    }
}
from django.contrib.auth.models import Group
Group.objects.get_or_create(name="MyTestGroup")
(<Group: MyTestGroup>, True)
Group.objects.get(name="MyTestGroup")
<Group: MyTestGroup>
Group.objects.get(name="MyTestGROUP")
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/aa545@i04.local/Documents/projects/personal/django-guid/venv/lib/python3.8/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/home/aa545@i04.local/Documents/projects/personal/django-guid/venv/lib/python3.8/site-packages/django/db/models/query.py", line 415, in get
    raise self.model.DoesNotExist(
django.contrib.auth.models.Group.DoesNotExist: Group matching query does not exist.

EDIT: Seems like I need to fix my databases. Link
Would you like me to remove the case insensitive code from this PR?

EDIT2: Confirmed that all my databases uses collation utf8mb4_0900_ai_ci. Damn.

@jobec
Copy link
Collaborator

jobec commented Jan 21, 2020

Looks like it's a "feature" of MySQL:

https://dev.mysql.com/doc/refman/8.0/en/case-sensitivity.html

@JonasKs
Copy link
Member Author

JonasKs commented Jan 21, 2020

Yeah, in Django I can do this (when I run my tests):

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'HOST': 'localhost',
        'PASSWORD': '',
        'USER': 'test',
        'TEST': {
            'NAME': 'django-adfs-test',
            'CHARSET': 'utf8mb4',
            'COLLATION': 'utf8mb4_0900_as_cs',
        },
    },
}

Now, the question still stand though. Since multiple groups with different case sensitivity is not possible in AD(MyGroup and MyGROUP can not exist at the same time), should the check be case insensitive when matching groups? Or should one have it case sensitive and potentially have unused groups in the database?

@jobec
Copy link
Collaborator

jobec commented Jan 22, 2020

To recap:

  • You pre-created groups in django, according to it's name in AD.
  • ADFS changes the group names to pascal case
  • Because of this, you group names in django don't match in django-auth-adfs's code, but the undelaying database is case-insensitive, causing the group with the incorrect case to match instead of creating a new group

The missing link here, is "why" ADFS changes the group name. Are you sure the field ADFS uses as the group's name is the same as what you used to create the groups initially?

Ideally, the groups you create in Django would match with what ADFS sends back.
Maybe getting the groups with an _iexact and giving an error when we get 2 results, might be a workaround.

@JonasKs
Copy link
Member Author

JonasKs commented Jan 22, 2020

Recap:

  • User logged into my site, groups were created
  • A letter in a group name was changed in AD from upper case to lower case
  • User logged into the site again, group name now differs.
  • Because of my case insensitive database, this became an issue. For people with case sensitive databses, this wouldn't have been an issue.

I have no strong feelings about a solution here, I can fix the tables on my end and add a little note in your documentation about this for MySQL users.

@jobec
Copy link
Collaborator

jobec commented Jan 22, 2020

My preference goes to an entry in the FAQ to be honest. (If you get this error, then...)

Implementing case insensitivity is going to trigger other issues I'm afraid.

@JonasKs
Copy link
Member Author

JonasKs commented Jan 22, 2020

Perfect. I'll document it in the FAQ after work or tomorrow if you don't mind.

The rest of the code, any comments on that?

@jobec
Copy link
Collaborator

jobec commented Jan 22, 2020

Remove the lower case changes and I'll have another look.

@JonasKs
Copy link
Member Author

JonasKs commented Jan 22, 2020

Done. I'll write a little note about it in the docs tonight.

I'm just happy that I finally figured out why the API were slow for a selected few people.

@codecov
Copy link

codecov bot commented Jan 22, 2020

Codecov Report

Merging #95 into master will decrease coverage by 0.08%.
The diff coverage is 85.71%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #95      +/-   ##
==========================================
- Coverage   82.77%   82.69%   -0.09%     
==========================================
  Files           7        7              
  Lines         418      416       -2     
==========================================
- Hits          346      344       -2     
  Misses         72       72
Impacted Files Coverage Δ
django_auth_adfs/backend.py 80.51% <85.71%> (-0.25%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 41fa2b8...edf16a5. Read the comment docs.

@JonasKs
Copy link
Member Author

JonasKs commented Jan 23, 2020

Added docs. In case you don't want to pull to build them, here's how it looks:

image

Edit: looking back a few hours later I can spot a spelling error. I’ll fix that..

@JonasKs
Copy link
Member Author

JonasKs commented Jan 29, 2020

Any thoughts or comments?

@JonasKs
Copy link
Member Author

JonasKs commented Aug 4, 2020

This is still causing issues for us. Did you end up checking out the solution? :)

@sondrelg
Copy link
Member

We're experiencing a lot of these errors (same company but different projects).

Would really appreciate if this could be looked at soon, as we're currently forced to switch to Jonas' fork, just based on the number of errors we're seeing from this in production.

@jobec jobec merged commit caa7eec into snok:master Sep 13, 2020
mmcclelland1002 pushed a commit to mmcclelland1002/django-auth-adfs that referenced this pull request Nov 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Database Integrity Error when using multiple workers/threads with Gunicorn
3 participants