Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NTAuthentication.MakeSignature produces different thing on macOS, Windows and Linux #65678

Closed
filipnavara opened this issue Feb 21, 2022 · 16 comments · Fixed by #89267
Closed
Assignees
Labels
area-System.Net.Security bug tracking-external-issue The issue is caused by external problem (e.g. OS) - nothing we can do to fix it directly
Milestone

Comments

@filipnavara
Copy link
Member

filipnavara commented Feb 21, 2022

I've tried to make a test for NTAuthentication.MakeSignature only to find out that it produces different outputs on each OS.

On Windows it produces an output in the form (signature, message).
On Linux it produces an output in form (length prefix, signature, message). The length prefix is generated in managed code.
On macOS it produces an output in form (length prefix, message, signature).

I cannot find any justification in the specifications (GSSAPI, GSSAPI SASL mechanism, or NTLM) for prepending the length prefix. It could have been erroneously copied from the Encrypt operation which also seems to do the length prefixing on Windows. However, in the case of Encrypt it seems that the justification actually comes from the high-level NegotiateStream specification. MakeSignature is not used in context of NegotiateStream though.

The reversed order of message and signature on macOS could be a bug in the provider default (https://github.com/apple-opensource/Heimdal/blob/9fed6a5818d85a439310b43727dd299366b397c2/lib/gssapi/ntlm/crypto.c#L699) but likely there's a way to enforce the proper order through different API call.

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Security untriaged New issue has not been triaged by the area owner labels Feb 21, 2022
@ghost
Copy link

ghost commented Feb 21, 2022

Tagging subscribers to this area: @dotnet/area-system-security, @vcsjones
See info in area-owners.md if you want to be subscribed.

Issue Details

I've tried to make a test for NTAuthentication.MakeSignature only to find out that it produces different outputs on each OS.

On Windows it produces an output in the form (signature, message).
On Linux it produces an output in form (length prefix, signature, message). The length prefix is generated in managed code.
On macOS it produces an output in form (length, prefix, message, signature).

I cannot find any justification in the specifications (GSSAPI, GSSAPI SASL mechanism, or NTLM) for prepending the length prefix. The reversed order of message and signature on macOS could be a bug in the provider default (https://github.com/apple-opensource/Heimdal/blob/9fed6a5818d85a439310b43727dd299366b397c2/lib/gssapi/ntlm/crypto.c#L699) but likely there's a way to enforce the proper order through different API call.

The Encrypt method looks to suffer from the same issues.

Author: filipnavara
Assignees: -
Labels:

area-System.Security, untriaged

Milestone: -

@filipnavara
Copy link
Member Author

filipnavara commented Feb 22, 2022

Turns out that the order of GSS_Wrap is quite underspecified but the Windows call is way more explicit about the layout.

On Windows the EncryptMessage API takes three buffers with each having a designated type (SECBUFFER_TOKEN, SECBUFFER_DATA, SECBUFFER_PADDING). The SECBUFFER_TOKEN buffer has a size that was previously queried from cbSecurityTrailer.

In GSSAPI there's a similar multiple-buffer API gss_wrap_iov. Unfortunately it's not always available for every provider. However, the specification of the gss_wrap_iov API explicitly says:

To generate gss_wrap() compatible packets, use: HEADER | DATA | PADDING | TRAILER

This leads me to the following conclusion:

  • Encrypt in NegotiateStreamPal.Windows constructs the NegotiateStream compatible packets in the form LENGTH PREFIX | TRAILER | DATA | PADDING
  • gss-ntlmssp on Linux generates incorrectly ordered output in gss_wrap (TRAILER | DATA instead of DATA | TRAILER)
  • Encrypt in NegotiateStreamPal.Unix constructs the packets as LENGTH PREFIX | gss_wrap. That happens to work on Linux due to the incorrect implementation in gss-ntlmssp but it doesn't work on Heimdal/Apple platforms that implement it correctly.

There doesn't seem to be an easy workaround for this. We can start using gss_wrap_iov if available and that will help on Apple platforms but not on Heimdal. Detecting the broken implementation is possible in some circumstances by calling gss_get_mic and comparing it with the prefix/trailer of the buffer returned by gss_wrap but there's always some risk of collision. Doing a dummy call with gss_wrap with small buffer would avoid the collision risk but it's not possible because it modifies the internal state (packet sequence number). It would be possible to use the GSS_C_INQ_SSPI_SESSION_KEY inquiry option (specific to the gss-ntlmssp provider) to get the session key and implement the encryption/decryption in managed code but that's already going a bit too far in the direction of the full managed implementation (#62264).

@filipnavara
Copy link
Member Author

filipnavara commented Feb 22, 2022

So, there is actually a way to detect the broken gss_wrap implementation.

On the first wrap call for NTLM call gss_export_sec_context to export the state of the context. Then run a dummy gss_wrap call with a single byte array. Run gss_get_mic for the same single byte buffer. Determine whether the gss_wrap produced the signature as header or trailer by comparing the MIC with the buffer returned from gss_wrap. The sequence number (0) and signature version (1) ensure that at least 8 bytes must match so there cannot be false positive on a one-byte buffer. Reset the context with gss_import_sec_context to the previously saved state (and sequence number). Proceed with gss_wrap as usual and transform it appropriately by switching the header/trailer in the buffer.

Update: ...and quite obviously gss_export_sec_context is not properly implemented on macOS, sigh. So that would need to be detected too.

@filipnavara
Copy link
Member Author

filipnavara commented Feb 22, 2022

Still looking into a workaround that would work universally... There seems be an easier way to reset the sequence number with gss-ntlmssp by using gss_set_sec_context_option (seems a dead end, works only for datagram contexts). Also, Apple seems to treat gss_wrap_iov (& related) as private APIs.

Meanwhile, I have verified against Exchange server that the on-wire NTLM signed message in the GSSAPI authentication is in format (NTLM SIGNATURE | MESSAGE) or, in other words, (TRAILER | MESSAGE).

@filipnavara
Copy link
Member Author

The macOS gss_wrap implementation turned out to be complete garbage.

Here's how it constructs the buffers:

    iov[0].type = GSS_IOV_BUFFER_TYPE_DATA;
    iov[0].buffer.length = input_message_buffer->length;
    iov[0].buffer.value = output_message_buffer->value;
    memcpy(iov[0].buffer.value, input_message_buffer->value,
           input_message_buffer->length);

    iov[1].type = GSS_IOV_BUFFER_TYPE_TRAILER;
    iov[1].buffer.length = 16;
    iov[1].buffer.value = (unsigned char *)output_message_buffer->value + 16;

The second buffer should have been iov[1].buffer.value = (unsigned char *)output_message_buffer->value + input_message_buffer->length; but it's not. So it produces the message signature right in the middle of the data buffer, or even corrupts memory.

@filipnavara
Copy link
Member Author

Submitted feedback to Apple. For reference:

The gss_wrap implementation in the NTLM provider uses incorrect buffer calculation: https://github.com/apple-oss-distributions/Heimdal/blob/339a41041d55a8e27edf1590087df3e5f4971da6/lib/gssapi/ntlm/crypto.c#L699-L701

The code:

    iov[1].type = GSS_IOV_BUFFER_TYPE_TRAILER;
    iov[1].buffer.length = 16;
    iov[1].buffer.value = (unsigned char *)output_message_buffer->value + 16;

is supposed to be

    iov[1].type = GSS_IOV_BUFFER_TYPE_TRAILER;
    iov[1].buffer.length = 16;
    iov[1].buffer.value = (unsigned char *)output_message_buffer->value + input_message_buffer->length;

The incorrect implementation could cause buffer overrun for wrapping messages shorter than 16 bytes. In all cases but 16-byte messages it will produce incorrect results because the two output buffers will either overlap or not be adjacent to each other.

Furthermore, the gss_wrap API produces opposite order of the DATA and TRAILER buffers than the implementation in gss-ntlmssp on Linux. Due to lack of context in the specifications either interpretation may be considered valid. They are just not mutually compatible. The interpretation with putting the signature in the trailer seems consistent with the “trailer” naming in the Windows API but it’s not present in the NTLM specification.

However, actual on-the-wire data for GSSAPI authentication scheme when connecting to Microsoft Exchange SMTP server uses the NTLM signature | message ordering. The GSSAPI SASL authentication scheme is described by the RFC 2222 specification and later updated as RFC 4752. The relevant part is in RFC 4752, section 3.1. at the end where it describes the last message exchange to be directly manipulated through gss_unwrap and gss_wrap calls. The implication of that is that for NTLM provider the order in both gss_wrap and gss_unwrap should be with the signature at the beginning of the wrapped message, not at the end.

With the interpretation above the correct implementation of _gss_ntlm_wrap would use the following buffer setup:

    iov[0].type = GSS_IOV_BUFFER_TYPE_DATA;
    iov[0].buffer.length = input_message_buffer->length;
    iov[0].buffer.value = output_message_buffer->value + 16;
    memcpy(iov[0].buffer.value, input_message_buffer->value,
           input_message_buffer->length);

    iov[1].type = GSS_IOV_BUFFER_TYPE_TRAILER;
    iov[1].buffer.length = 16;
    iov[1].buffer.value = (unsigned char *)output_message_buffer->value;

For _gss_ntlm_unwrap it would be:

    memcpy(output_message_buffer->value, input_message_buffer->value + 16,
           output_message_buffer->length);

    iov[0].type = GSS_IOV_BUFFER_TYPE_DATA;
    iov[0].buffer = *output_message_buffer;

    iov[1].type = GSS_IOV_BUFFER_TYPE_TRAILER;
    iov[1].buffer.value = (unsigned char *)input_message_buffer->value;
    iov[1].buffer.length = 16;

Given the existing bug in _gss_ntlm_wrap it’s unlikely that any software was depending on the wrapped messages as produced by this function.

@ghost
Copy link

ghost commented Feb 22, 2022

Tagging subscribers to this area: @dotnet/ncl, @vcsjones
See info in area-owners.md if you want to be subscribed.

Issue Details

I've tried to make a test for NTAuthentication.MakeSignature only to find out that it produces different outputs on each OS.

On Windows it produces an output in the form (signature, message).
On Linux it produces an output in form (length prefix, signature, message). The length prefix is generated in managed code.
On macOS it produces an output in form (length, prefix, message, signature).

I cannot find any justification in the specifications (GSSAPI, GSSAPI SASL mechanism, or NTLM) for prepending the length prefix. It could have been erroneously copied from the Encrypt operation which also seems to do the length prefixing on Windows. However, in the case of Encrypt it seems that the justification actually comes from the high-level NegotiateStream specification. MakeSignature is not used in context of NegotiateStream though.

The reversed order of message and signature on macOS could be a bug in the provider default (https://github.com/apple-opensource/Heimdal/blob/9fed6a5818d85a439310b43727dd299366b397c2/lib/gssapi/ntlm/crypto.c#L699) but likely there's a way to enforce the proper order through different API call.

Author: filipnavara
Assignees: -
Labels:

area-System.Net.Security, untriaged

Milestone: -

@wfurt
Copy link
Member

wfurt commented Mar 3, 2022

Is there work we should do for 7.0 @filipnavara ? We generally depend on GSSAPI and I'm not sure how much we should fiddle with it.

@filipnavara
Copy link
Member Author

We can easily fix this on Linux (draft PR is submitted but the unit tests are issue we need to discuss). On macOS I submitted feedback to Apple and haven't heard back yet.

@wfurt
Copy link
Member

wfurt commented Mar 3, 2022

On macOS I submitted feedback to Apple and haven't heard back yet.

No surprise there ;( Would this be still beneficial if we cannot fix all platforms and make it consistent? (and there are always old OS version...)

@filipnavara
Copy link
Member Author

Yep, fixing Linux is easy. Should I submit PR without the unit test part?

@karelz
Copy link
Member

karelz commented Mar 31, 2022

Triage:

  • Looks like low impact - moving to Future (though we would take PR even in 7.0 if it is contributed).
  • Note: Looks like MacOS can't be fixed - so it will just be slightly better, but not perfect even with the change

@karelz karelz added this to the 7.0.0 milestone Mar 31, 2022
@karelz karelz added bug and removed untriaged New issue has not been triaged by the area owner labels Mar 31, 2022
@am11
Copy link
Member

am11 commented May 16, 2022

@filipnavara, are we looking to eventually move all platforms to the managed implementation? I think getting rid of krb5 build+run time dependency (pal_gssapi.c and related glue) would be a huge win as it will lower the system requirements bar, and it will possibly help mitigating these differences across platforms (predictable behavior).

@filipnavara
Copy link
Member Author

filipnavara commented May 16, 2022

are we looking to eventually move all platforms to the managed implementation?

Probably not, at least not in short term. GSSAPI provides both Kerberos and NTLM authentication. While the managed NTLM implementation could easily be used on more platforms now, adding Kerberos would be a whole different goal. There is a managed implementation (https://github.com/dotnet/Kerberos.NET) but so far there was no effort or plan to integrate it or use it as a replacement for the system provided GSSAPI. I don't necessarily think it's a good idea to do so because of compatibility and size concerns.

I am looking into supporting GSSAPI for Kerberos only and handling NTLM/Negotiate in the managed code. No firm commitment to that though.

@am11
Copy link
Member

am11 commented May 16, 2022

Thanks. Yes, it wasn't meant for ".NET 7 milestone" but for a general goodness moving forward into the future. :)
Not sure if trading krb5-dev package requirement with a portable C# implementation in an existing (trimmed) assembly would negatively affect the overall size of a sandbox (I hope not significant that would raise to the level of concern).

I was thinking more about how SocketsHttpHandler has helped in the past (.NET Core 1.6 - 2.0 era) to gradually get rid of curl-dev dependency from .NET ecosystem in favor of fully managed implementation on all platforms; perhaps we can also move away from krb5-dev with similar plan in mind.

@wfurt
Copy link
Member

wfurt commented May 16, 2022

We may do some transition over time @am11 if this turns out to be working well. For example, NTLM does not work in Azure functions because the underlying OS is missing gss-ntlm package. My original big motivation was macOS for developers, but @filipnavara figured out that the problem was so we were able to proceed with without switching to managed.
Also Windows authentication team is trying to really retire NTLM and turn it off by default. So I see it mostly as legacy combat.

Kerberos is whole different animal. I would like to use fragments of Kerberos.NET for testing but I'm not sure if this is going to really happen in 7.0. If the experience is good, we may look into it for platforms that do not have krb5.

@wfurt wfurt added the tracking-external-issue The issue is caused by external problem (e.g. OS) - nothing we can do to fix it directly label May 19, 2022
@wfurt wfurt modified the milestones: 7.0.0, Future May 19, 2022
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jul 20, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Aug 15, 2023
@karelz karelz modified the milestones: Future, 9.0.0 Aug 29, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Sep 28, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Net.Security bug tracking-external-issue The issue is caused by external problem (e.g. OS) - nothing we can do to fix it directly
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants