Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safe refresh #31

Merged
merged 21 commits into from
Jun 9, 2022
Merged

Safe refresh #31

merged 21 commits into from
Jun 9, 2022

Conversation

jwoehr
Copy link
Contributor

@jwoehr jwoehr commented Jun 3, 2022

Context:
It's possible to save bad refresh tokens by accidentally including non-printable characters when copying them.

Description of the Change:
Settings.save() does a sanity check and raises on unprintables in the refresh token.

Benefits:
Less surprise and mystery on a user error in performing a token refresh

Possible Drawbacks:
This works, but though I wrote a test, I have not been able to make the test work, could use some help on test_settings.py test test_save_unprintable

Related GitHub Issues:

Copy link
Collaborator

@Mandrenkov Mandrenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the contribution, @jwoehr! 🎉 It's always appreciated when someone takes the time to improve a project based on their own experience with the software! 👍

I left several comments (mostly related to formatting and style) but overall this is quite nice!

.github/CHANGELOG.md Outdated Show resolved Hide resolved
.github/CHANGELOG.md Outdated Show resolved Hide resolved
tests/test_settings.py Outdated Show resolved Hide resolved
tests/test_settings.py Outdated Show resolved Hide resolved
tests/test_settings.py Outdated Show resolved Hide resolved
tests/test_settings.py Outdated Show resolved Hide resolved
xcc/settings.py Outdated Show resolved Hide resolved
xcc/settings.py Outdated Show resolved Hide resolved
xcc/settings.py Outdated Show resolved Hide resolved
xcc/settings.py Outdated Show resolved Hide resolved
jwoehr and others added 8 commits June 3, 2022 08:28
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
@jwoehr
Copy link
Contributor Author

jwoehr commented Jun 3, 2022

@Mandrenkov I'm not sure the rest of the test test_save_unprintable() is correct.
The test is to make sure an exception is thrown on an unprintable character.
Does the test as modified really prove that this is so?

@Mandrenkov
Copy link
Collaborator

@Mandrenkov I'm not sure the rest of the test test_save_unprintable() is correct. The test is to make sure an exception is thrown on an unprintable character. Does the test as modified really prove that this is so?

Hi @jwoehr, I think it depends what the intention of the sanity check is. Consider:

settings = xcc.Settings(
            REFRESH_TOKEN="j.w.t\n", HOST="example.com", PORT=80, TLS=False
)

match = r"The REFRESH_TOKEN setting contains non-printable character(s)\."

# Check that a ValueError is thrown since "\n" is not printable.
with pytest.raises(ValueError, match=match):
    settings.save()

# Check that the .env file was not modified since there was a "\n" in the refresh token.
assert dotenv_values(env_file.name) == {
    "XANADU_CLOUD_REFRESH_TOKEN": "j.w.t",  # Note that the space at the end is deleted here.
    "XANADU_CLOUD_HOST": "example.com",
    "XANADU_CLOUD_PORT": "80",
    "XANADU_CLOUD_TLS": "False",
}

Here, we verify that the newline character is caught by the sanity check and the .env file is not modified. Of course, we can also expand the coverage of the test by parameterizing it:

@pytest.mark.parametrize("refresh_token", [chr(0x0a), chr(0x20)])

Note, however, that chr(0x0a) (i.e., a newline) is not printable while chr(0x20) (i.e., a space) is printable.

What do you think about increasing the strictness of the sanity check by matching the JWT format exactly?

@jwoehr
Copy link
Contributor Author

jwoehr commented Jun 3, 2022

What do you think about increasing the strictness of the sanity check by matching the JWT format exactly?

Yes, that sounds better than dealing with it piecemeal. Is there a quick reference on the format?
And are there other .env variables that should be sanity checked?
(And even so, should we leave this particular PR to focus just on sanity-checking REFRESH_TOKEN ?)

jwoehr and others added 2 commits June 3, 2022 15:28
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
@Mandrenkov
Copy link
Collaborator

Thanks for the reply, @jwoehr!

Yes, that sounds better than dealing with it piecemeal.

🙌

Is there a quick reference on the format?

Yes! I would highly recommend taking a look at https://jwt.io/introduction (namely What is the JSON Web Token structure?). In a nutshell though, a JWT is three Base64-encoded strings concatenated with two periods (.).

And are there other .env variables that should be sanity checked? (And even so, should we leave this particular PR to focus just on sanity-checking REFRESH_TOKEN ?)

I think most of the settings in the .env file could benefit from additional validation (e.g., checking that PORT lies between 0 and 65535); however, REFRESH_TOKEN is the only field I would expect users to modify (and therefore would benefit the most from a sanity check). So let's keep this PR focused on just that field!

@jwoehr
Copy link
Contributor Author

jwoehr commented Jun 4, 2022

@Mandrenkov since the REFRESH_TOKEN is only manipulated by the user and by the affected code from the user side in its Base64Url form, then all that we have to check for is a character that's not legal in Base64URL, correct?

@jwoehr
Copy link
Contributor Author

jwoehr commented Jun 4, 2022

Given all of the above, a Base64URL value can be defined using the following regular expression:
^[A-Za-z0-9_-]+$
Base64URL Guru

@jwoehr
Copy link
Contributor Author

jwoehr commented Jun 4, 2022

I had to add . (dot) to the set because it's used in JWT to separate sections.
But I think the code works now.

- do check before starting save because save dict not ordered
@jwoehr
Copy link
Contributor Author

jwoehr commented Jun 4, 2022

I can't make the post-check in the test_settings.py routine test_save_bad_Base64URL work so I've commented it out.
If it's in, I get this:

____________________________________________________________ TestSettings.test_save_bad_Base64URL ____________________________________________________________

self = <test_settings.TestSettings object at 0x7f70cda37250>, env_file = <tempfile._TemporaryFileWrapper object at 0x7f70cdad0910>

    def test_save_bad_Base64URL(self, env_file):
        """Tests that a ValueError will be raised
        if REFRESH_TOKEN contains a character outside the Base64URL with
        the addition that the '.' dot character is part of the token as a
        section separator.
        """
        settings = xcc.Settings(
            REFRESH_TOKEN="j.w.t\n", HOST="example.com", PORT=80, TLS=False
        )
        match = r"REFRESH_TOKEN contains non-Base64URL character\(s\)"
        with pytest.raises(ValueError, match=match):
            settings.save()
    
        # Check that the .env file was not modified since there was a "\n" in the refresh token.
>       assert dotenv_values(env_file.name) == {
            "XANADU_CLOUD_REFRESH_TOKEN": "j.w.t",
            "XANADU_CLOUD_HOST": "example.com",
            "XANADU_CLOUD_PORT": "80",
            "XANADU_CLOUD_TLS": "False",
        }
E       AssertionError: assert OrderedDict() == {'XANADU_CLOU...TLS': 'False'}
E         Right contains 4 more items:
E         {'XANADU_CLOUD_HOST': 'example.com',
E          'XANADU_CLOUD_PORT': '80',
E          'XANADU_CLOUD_REFRESH_TOKEN': 'j.w.t',
E          'XANADU_CLOUD_TLS': 'False'}
E         Use -v to get more diff

tests/test_settings.py:83: AssertionError
================================================================== short test summary info ===================================================================
FAILED tests/test_settings.py::TestSettings::test_save_bad_Base64URL - AssertionError: assert OrderedDict() == {'XANADU_CLOU...TLS': 'False'}

@Mandrenkov
Copy link
Collaborator

Mandrenkov commented Jun 6, 2022

@Mandrenkov since the REFRESH_TOKEN is only manipulated by the user and by the affected code from the user side in its Base64Url form, then all that we have to check for is a character that's not legal in Base64URL, correct?

Yeah, almost! We also need to account for the . characters which delimit each part of the JWT.

I had to add . (dot) to the set because it's used in JWT to separate sections.
But I think the code works now.

Exactly!

I can't make the post-check in the test_settings.py routine test_save_bad_Base64URL work so I've commented it out. If it's in, I get this:

____________________________________________________________ TestSettings.test_save_bad_Base64URL ____________________________________________________________

self = <test_settings.TestSettings object at 0x7f70cda37250>, env_file = <tempfile._TemporaryFileWrapper object at 0x7f70cdad0910>

    def test_save_bad_Base64URL(self, env_file):
        """Tests that a ValueError will be raised
        if REFRESH_TOKEN contains a character outside the Base64URL with
        the addition that the '.' dot character is part of the token as a
        section separator.
        """
        settings = xcc.Settings(
            REFRESH_TOKEN="j.w.t\n", HOST="example.com", PORT=80, TLS=False
        )
        match = r"REFRESH_TOKEN contains non-Base64URL character\(s\)"
        with pytest.raises(ValueError, match=match):
            settings.save()
    
        # Check that the .env file was not modified since there was a "\n" in the refresh token.
>       assert dotenv_values(env_file.name) == {
            "XANADU_CLOUD_REFRESH_TOKEN": "j.w.t",
            "XANADU_CLOUD_HOST": "example.com",
            "XANADU_CLOUD_PORT": "80",
            "XANADU_CLOUD_TLS": "False",
        }
E       AssertionError: assert OrderedDict() == {'XANADU_CLOU...TLS': 'False'}
E         Right contains 4 more items:
E         {'XANADU_CLOUD_HOST': 'example.com',
E          'XANADU_CLOUD_PORT': '80',
E          'XANADU_CLOUD_REFRESH_TOKEN': 'j.w.t',
E          'XANADU_CLOUD_TLS': 'False'}
E         Use -v to get more diff

tests/test_settings.py:83: AssertionError
================================================================== short test summary info ===================================================================
FAILED tests/test_settings.py::TestSettings::test_save_bad_Base64URL - AssertionError: assert OrderedDict() == {'XANADU_CLOU...TLS': 'False'}

Ah, so if you take a look at the implementation of env_file, you will see that the fixture just configures the Settings class to use an empty, temporary file as the .env file. This is why dotenv_values(env_file.name) yields an empty dictionary in your test. What you probably want to use instead is the settings fixture which applies some default settings and commits them to disk.

.github/CHANGELOG.md Outdated Show resolved Hide resolved
tests/test_settings.py Outdated Show resolved Hide resolved
xcc/settings.py Show resolved Hide resolved
xcc/settings.py Outdated Show resolved Hide resolved
xcc/settings.py Outdated Show resolved Hide resolved
xcc/settings.py Outdated Show resolved Hide resolved
jwoehr and others added 4 commits June 6, 2022 15:30
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
Co-authored-by: Mikhail Andrenkov <Mandrenkov@users.noreply.github.com>
@jwoehr
Copy link
Contributor Author

jwoehr commented Jun 6, 2022

Hmm now I get a different error in the test:


self = <test_settings.TestSettings object at 0x7f659d1e12a0>
settings = Settings(REFRESH_TOKEN='j.w.t\n', ACCESS_TOKEN=None, HOST='example.com', PORT=80, TLS=False)

    def test_save_bad_Base64URL(self, settings):
        """Tests that a ValueError will be raised
        if REFRESH_TOKEN contains a character outside the Base64URL with
        the addition that the '.' dot character is part of the token as a
        section separator.
        """
        settings.REFRESH_TOKEN = "j.w.t\n"
    
        match = r"REFRESH_TOKEN contains non-JWT character\(s\)"
        with pytest.raises(ValueError, match=match):
            settings.save()
    
        # Check that the .env file was not modified since there was a "\n" in the refresh token.
        assert dotenv_values(env_file.name) == {
            "XANADU_CLOUD_REFRESH_TOKEN": "j.w.t",
            "XANADU_CLOUD_HOST": "example.com",
            "XANADU_CLOUD_PORT": "80",
            "XANADU_CLOUD_TLS": "False",
>       }
E       AttributeError: 'function' object has no attribute 'name'

tests/test_settings.py:87: AttributeError

@Mandrenkov
Copy link
Collaborator

Hmm now I get a different error in the test:


self = <test_settings.TestSettings object at 0x7f659d1e12a0>
settings = Settings(REFRESH_TOKEN='j.w.t\n', ACCESS_TOKEN=None, HOST='example.com', PORT=80, TLS=False)

    def test_save_bad_Base64URL(self, settings):
        """Tests that a ValueError will be raised
        if REFRESH_TOKEN contains a character outside the Base64URL with
        the addition that the '.' dot character is part of the token as a
        section separator.
        """
        settings.REFRESH_TOKEN = "j.w.t\n"
    
        match = r"REFRESH_TOKEN contains non-JWT character\(s\)"
        with pytest.raises(ValueError, match=match):
            settings.save()
    
        # Check that the .env file was not modified since there was a "\n" in the refresh token.
        assert dotenv_values(env_file.name) == {
            "XANADU_CLOUD_REFRESH_TOKEN": "j.w.t",
            "XANADU_CLOUD_HOST": "example.com",
            "XANADU_CLOUD_PORT": "80",
            "XANADU_CLOUD_TLS": "False",
>       }
E       AttributeError: 'function' object has no attribute 'name'

tests/test_settings.py:87: AttributeError

Right! We're no longer using the env_file fixture and so we'll need to replace

assert dotenv_values(env_file.name) ...

with

assert dotenv_values(xcc.Settings.Config.env_file) ...

@jwoehr
Copy link
Contributor Author

jwoehr commented Jun 7, 2022

Excellent everything works now.

Copy link
Collaborator

@Mandrenkov Mandrenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you still need to run make format lint. 🤔

@jwoehr
Copy link
Contributor Author

jwoehr commented Jun 7, 2022

Ah, I did run black but apparently the linter has settings other than default.

@jwoehr
Copy link
Contributor Author

jwoehr commented Jun 7, 2022

I fixed lint's complaints.
Opened a new PR to add a pyproject.toml file so developers running black get the project-mandated format.

Copy link
Collaborator

@Mandrenkov Mandrenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @jwoehr! I think this PR is ready to merge. 🟢

@Mandrenkov Mandrenkov merged commit ae790a8 into XanaduAI:main Jun 9, 2022
@jwoehr jwoehr deleted the safe_refresh branch June 10, 2022 00:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants