Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add krb5.aname_to_localname() #19

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

pseudometric
Copy link
Contributor

  1. check for errors returned from krb5_init_context()
  2. add glue call to krb5_aname_to_localname()

Thanks for writing this; it looks like it will be quite useful for us!

– Richard Silverman res@arcesium.com

init_context() currently just returns a null pointer if the C routine
fails, which is unlike the norm in this module and I assume is an
oversight. Typically this will fail if there's a syntax error in the
libkrb5 configuration (/etc/krb5.conf, or whatever is read). A program
I wrote segfaulted in this situation, because I then passed the null
context to other routines. With this change, we get instead:

In [2]: ctx = krb5.init_context()
---------------------------------------------------------------------------
Krb5Error                                 Traceback (most recent call last)
...
Krb5Error: Included profile file could not be read -1429577697
This is part of the MIT Kerberos "localauth" interface:

  https://web.mit.edu/kerberos/krb5-devel/doc/plugindev/localauth.html

... so I used that as the "extension" grouping name. There is also a
krb5_kuserok() in this interface, that might be added.
@pseudometric
Copy link
Contributor Author

I ran your GitHub workflow myself on this; it passes, although there are some unrelated errors on older versions of MacOS that I believe must already be issues you know about.

Copy link
Owner

@jborean93 jborean93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work and for adding the docs and tests as well. I've added some comments to the changes.

As for CI, it seems like I've got some work to do to ensure it doesn't use the wheel's from PyPI over the locally built ones. I'm unsure as to why it's not favouring them there and causing the failures.

The original macOS ones have been fixed with pythongssapi/k5test#22.

import krb5


def test_aname_to_localname(realm: k5test.K5Realm) -> None:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a look at test_cc_cache_match which uses pytests' tmp_path and monkeypatch fixture to deal with temp dirs scoped to the test and monkeypatch.setenv to set env vars that only live for the life of the test. With these it would look something like

def test_aname_to_localname(realm: k5test.K5Realm, tmp_path: pathlib.Path, monkeypatch: pytest.MonkeyPatch) -> None:
    krb5_conf = tmp_path / "krb5.conf"
    with open(krb5_conf, mode="w") as fd:
        fd.write(...)

    monkeypatch.setenv("KRB5_CONFIG", str(krb5_conf))

    ... test code starts here

principal: principal name to translate

Returns:
str: translation
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if Google doc strings support this format. Usually I do

    Returns:
        Optional[str]: Details about what it returns and when it could
        be None.

krb5_error_code krb5_aname_to_localname(
krb5_context context,
krb5_const_principal aname,
int lnsize,
Copy link
Owner

@jborean93 jborean93 Nov 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at Heimdal, this is declared as size_t https://github.com/heimdal/heimdal/blob/fa92fe37e7ed1cfbe546fd617d712f47c55e35b9/lib/krb5/aname_to_localname.c#L299. I'm no expert at C or how the ABIs work but I would have thought this could potentially cause some problems.

It might require some trickery that we do in _keyblock.pyx to define a generic function but use compiler directives to cast as needed to size_t on Heimdal.

Considering the code passes and the compiler doesn't complain, I think it will be fine but it's something to keep in mind.

Comment on lines +37 to +41
# The definition of KRB5_LNAME_NOTRANS in the Heimdal header file
# included in the pykrb5 source appears to be out of date; this is
# the value that actually gets returned in the test -- so I'm just
# adding it here for now.
if err in [KRB5_LNAME_NOTRANS, -1765328227]:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like -1765328227 is KRB5_NO_LOCALNAME, it's not surprising that Heimdal is returning a different value than MIT. You can even see it in https://github.com/heimdal/heimdal/blob/fa92fe37e7ed1cfbe546fd617d712f47c55e35b9/lib/krb5/aname_to_localname.c#L330 that it is returning KRB5_NO_LOCALNAME. It's been in MIT krb5 and Heimdal for over 10 years so it's probably safe to for years so it's probably safe to do if err in [KRB5_LNAME_NOTRANS, KRB5_NO_LOCALNAME]: with a brief comment as to why we check both.

Just as an FYI the Heimdal headers in the repo are only used for macOS builds as they are not available on the host. If compiling against Heimdal on an actual Linux host it will use the headers from the source you are linking against.

free(localname)
raise Krb5Error(context, err)
else:
return localname.decode("utf-8")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I try to keep these APIs agnostic to encodings as there isn't really one set encoding that the krb5 uses. It might be utf-8 in most cases but that isn't a guarantee. Because of this, char based APIs return bytes and rely on the caller to decode it based on their host setup.

def aname_to_localname(
context: Context,
principal: Principal
) -> str:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments about bytes, but this should also be t.Optional[bytes]: as it can return None.

if err in [KRB5_LNAME_NOTRANS, -1765328227]:
return None
elif err != 0:
free(localname)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You free it here but not in the good case, we should always free any memory we allocate or any memory allocated by the krb5 library.

See kt_default_name which has a similar API to this call. It starts with a fixed buffer and continues to increase it until there is enough space. It also ensures the buffer is freed once it has been allocated using a try/finally and uses strlen to get the size.

I would see the current code looking like

cdef krb5_error_code err = KRB5_CONFIG_NOTENUFSPACE
buffer_size = 1024
buffer_length = buffer_size
cdef char *buffer = <char *>malloc(buffer_length)
if not buffer:
    raise MemoryError()

try:
    while err == KRB5_CONFIG_NOTENUFSPACE:
        err = krb5_aname_to_localname(context.raw, principal.raw, buffer_length, buffer)

        if err == KRB5_CONFIG_NOTENUFSPACE:
            buffer_length += buffer_size

            # Use a temp var to ensure buffer is always something valid to be freed
            new_buffer = <char *>realloc(buffer, buffer_length)
            if not new_buffer:
                raise MemoryError()
            buffer = new_buffer

        elif err in [KRB5_LNAME_NOTRANS, KRB5_NO_LOCALNAME]:
            return None
            
        elif err:
            raise Krb5Error(context, err)

        else:
            name_len = strlen(buffer)
            return <bytes>buffer[:name_len]

finally:
    free(buffer)

I saw in the MIT krb5 and Heimdal tests that they use a buffer size of 1024 and not 256 so I've used that as a base.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed review! Skimming it everything you say makes sense. I'll try to make the changes in the next few days; I won't get to it right away, I'm afraid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing:

You free it here but not in the good case, we should always free any memory we allocate or any memory allocated by the krb5 library.

I must be missing something: we can't free it in the good case, because we're returning it to the caller. I was assuming that once it's returned it's reified as a Python str object and will be managed (garbage-collected) as usual by Python. Is that not the case?

Ah, maybe it's this: an expression like localname.decode("utf-8") in my original code, your new <bytes>buffer[:name_len], actually creates a Python object from the C string and returns that -- so now it's OK to free the C buffer before returning. Is that it? Pardon my unfamiliarity with Cython.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually thought about the keep-mallocing-until-it's-big-enough strategy, but I didn't because a) I'm suspicious of things that can cause a program to just keep allocating memory endlessly until it crashes; I prefer setting some reasonable limit instead, where it makes sense, and b) it seems pretty unlikely in practice to have such long translations. It's a bit of a tension between different aspects of correctness and safety. But I'm happy to do it this way if you prefer.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must be missing something: we can't free it in the good case, because we're returning it to the caller. I was assuming that once it's returned it's reified as a Python str object and will be managed (garbage-collected) as usual by Python. Is that not the case?

In that case the use of localname.decode("utf-8") was creating a new object anyway. You still need to cleanup the buffer that was malloc'd to avoid memory leaks as there is nothing tracking the lifetime of it. So in both your original code and the one I'm proposing this buffer needs to be freed regardless as the object returned by the function is a brand new copy based on that original buffer.

I actually thought about the keep-mallocing-until-it's-big-enough strategy, but I didn't because a) I'm suspicious of things that can cause a program to just keep allocating memory endlessly until it crashes

That's a good point, we should probably have a check to see if the buffer_length > ..., maybe 8192? You can probably keep the original code if you want but I recommend increasing the buffer size to 1024 rather than 256 as that's what both Heimdal and MIT krb5 uses for it's tests.

@jborean93
Copy link
Owner

#20 should work around the macOS testing problem, just push a new commit and hopefully it will pick it up.

I'm moving this to its own PR; it's unrelated to the
krb5_aname_to_localname() addition.

This reverts commit cda1e05.
@pseudometric
Copy link
Contributor Author

I reverted the krb5.init_context() fix on this branch, and opened a separate PR for it. It's ready to go and shouldn't wait for me to get time to finish work on the krb5.aname_to_localname() addition.

@pseudometric pseudometric changed the title [pseudometric] add aname_to_localname(); fix bug in init_context() [pseudometric] add aname_to_localname() Nov 16, 2022
@pseudometric pseudometric changed the title [pseudometric] add aname_to_localname() add krb5.aname_to_localname() Nov 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants