Switch DirectoryControl to use AsnWriter, AsnDecoder #101512

Merged (24 commits) on Dec 5, 2024

edwardneal
Contributor

Relates to #97540.

This PR replaces all references to BerConverter in LDAP directory control generation/parsing to use AsnWriter and AsnDecoder. SortRequestControl didn't use BerConverter directly - it called the OpenLDAP and WLDAP ldap_create_sort_control APIs instead. This class was the only thing in S.DS.P which referenced the ldap_create_sort_control and SortKeyInterop struct, so I deleted them both.

SortRequestControl

The change to SortRequestControl's generation mechanism might also resolve #34679, since there shouldn't be any mechanism for the heap corruption to occur.

Most of the SortRequestControl's new ASN.1 encoding is pretty uncontroversial, but there was a bit of discussion in PR #65548 around the encoding of the sort key's attribute name, and this was marshalled (as part of SortKeyInterop) with different encodings between Windows and Linux. In the RFC, this is defined (indirectly) as an LdapString; this is described as ISO10646 characters, encoded as a UTF-8 string and represented as an OCTET STRING. I'm fairly sure that UTF8Encoding.GetBytes fulfils this, and running the associated test case against a real AD domain controller passes.
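As a rough illustration of the encoding described above, here is a minimal sketch of building a sort key list with AsnWriter. The field layout follows my reading of RFC 2891 (attribute name, optional matching rule tagged [0], reverse flag tagged [1]); the `SortKeyEncoder` class and its tuple shape are hypothetical names for this example and are not taken from the PR.

```csharp
using System;
using System.Formats.Asn1;
using System.Text;

// Hypothetical usage: one plain key, one key with a matching rule OID and reverse order.
byte[] value = SortKeyEncoder.Encode(("cn", null, false), ("sn", "2.5.13.3", true));
Console.WriteLine(Convert.ToHexString(value));

// Hypothetical helper, not the PR's implementation.
static class SortKeyEncoder
{
    public static byte[] Encode(params (string AttributeName, string? MatchingRule, bool Reverse)[] keys)
    {
        var writer = new AsnWriter(AsnEncodingRules.BER);

        // SortKeyList ::= SEQUENCE OF SEQUENCE { ... }
        using (writer.PushSequence())
        {
            foreach (var key in keys)
            {
                using (writer.PushSequence())
                {
                    // Attribute name: UTF-8 bytes written as a plain OCTET STRING (tag 0x04).
                    writer.WriteOctetString(Encoding.UTF8.GetBytes(key.AttributeName));

                    // Optional matching rule, context-specific tag [0].
                    if (key.MatchingRule is not null)
                    {
                        writer.WriteOctetString(
                            Encoding.UTF8.GetBytes(key.MatchingRule),
                            new Asn1Tag(TagClass.ContextSpecific, 0));
                    }

                    // Reverse order defaults to FALSE, so only write it when TRUE; tag [1].
                    if (key.Reverse)
                    {
                        writer.WriteBoolean(true, new Asn1Tag(TagClass.ContextSpecific, 1));
                    }
                }
            }
        }

        return writer.Encode();
    }
}
```

Because the same managed code runs everywhere, the bytes no longer depend on which native LDAP library the platform provides.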

Test changes

There are also test changes, but these are largely to change the special-casing of expected byte values between OpenLDAP and WLDAP - .NET now generates these values in a consistent format (the OpenLDAP format) regardless of platform. The .NET Framework tests continued to use the version of S.DS.P from the GAC in my environment, so I've special-cased by the framework version rather than by the platform.

Misc. optimizations

There were a handful of byte-by-byte array copies, which I've switched over to using span-based copies in hopes that they'll benefit slightly from vectorisation. TransformControls and GetValue have a related change: where they used to reference properties returning byte arrays (which took defensive copies) they now reference the property values directly. These should both reduce GC traffic slightly.
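To make the copy change concrete, this is a small sketch of the two styles. The buffer contents are made up, and the code is only meant to show the shape of the change, not the actual lines touched in the PR.

```csharp
using System;

byte[] source = { 0x30, 0x03, 0x02, 0x01, 0x05 };

// Before: element-by-element copy.
byte[] copiedByLoop = new byte[source.Length];
for (int i = 0; i < source.Length; i++)
{
    copiedByLoop[i] = source[i];
}

// After: a single span-based copy, which the runtime is free to vectorise.
byte[] copiedBySpan = new byte[source.Length];
source.AsSpan().CopyTo(copiedBySpan);

// Both approaches produce identical contents.
Console.WriteLine(copiedByLoop.AsSpan().SequenceEqual(copiedBySpan.AsSpan())); // True
```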

Replaced this with the managed AsnDecoder, removing PInvoke from a potential hot path.
Also removed the manual API calls to ldap_create_sort_control - this is now built in managed code.
This then has knock-on effects to eliminate the SortKeyInterop classes.
Most of the Control tests were hardcoded to the output of BerConverter, which uses four-byte lengths in all cases.
This behaviour is now different: the same output is returned across all platforms for .NET, and remains unchanged for .NET Framework.
This should also close issue 34679.
Reduce number of copies required in TransformControls, and enable these copies to take advantage of newer intrinsics where available.
Windows domain controllers may return a distinguished name starting with OU=, rather than ou=.
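On the four-byte lengths mentioned in the commit notes above, the sketch below contrasts a value whose length uses the long form (0x84 followed by four length bytes) with what AsnWriter emits for the same content. The sample bytes are invented for illustration; `ReadValue` is a hypothetical helper. BER permits both length forms, so a BER decoder should read either.

```csharp
using System;
using System.Formats.Asn1;

// SEQUENCE { INTEGER 5 } with a four-byte length, in the style attributed to BerConverter.
byte[] longForm = { 0x30, 0x84, 0x00, 0x00, 0x00, 0x03, 0x02, 0x01, 0x05 };

// The same structure written by AsnWriter, which uses the shortest definite length.
var writer = new AsnWriter(AsnEncodingRules.BER);
using (writer.PushSequence())
{
    writer.WriteInteger(5);
}
byte[] shortForm = writer.Encode();

Console.WriteLine(Convert.ToHexString(shortForm));
Console.WriteLine(ReadValue(longForm));
Console.WriteLine(ReadValue(shortForm));

// Hypothetical helper: reads the single INTEGER inside the outer SEQUENCE.
static int ReadValue(byte[] encoded)
{
    AsnDecoder.ReadSequence(encoded, AsnEncodingRules.BER, out int offset, out int length, out _);
    ReadOnlySpan<byte> content = encoded.AsSpan(offset, length);

    if (!AsnDecoder.TryReadInt32(content, AsnEncodingRules.BER, out int value, out _))
    {
        throw new InvalidOperationException("INTEGER does not fit in an Int32.");
    }

    return value;
}
```

This is why tests that hardcoded the old byte arrays needed updating even though both encodings carry the same value.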
@PaulusParssinen
Contributor

Out of curiosity, any benchmarks for perf. numbers before/after switching to AsnReader/AsnWriter?

@edwardneal
Contributor Author

I've not got benchmarks right now, but will write some in the next few days. In advance of these, I expect there'll be a modest reduction in managed and unmanaged memory usage, and that execution time will reduce (while remaining within the margin of error for the network request itself.)

Preallocating space for AsnWriter buffers to reduce memory usage.
Correctly handling attribute names in SortControls.
@edwardneal
Contributor Author

edwardneal commented Apr 27, 2024

Benchmarks are below. To summarize:

  • As expected, the performance improvements are working in the margins. I fully expect most of the execution time variations to be lost in the noise of network traffic.
  • 35% median reduction in memory usage. One notable improvement on this is in DirectoryControl.TransformControls, which is on the hot path for processing LDAP responses and reduces memory usage by about 75%.
  • Although the percentage reductions in memory usage are good, the absolute reductions are pretty small - the median reduction was 144 bytes.
  • 88.5% median reduction in execution time, although the absolute reductions are often small - AsqRequest is reduced from 1.869 microseconds to 185.1 nanoseconds.
  • DirectoryControl.TransformControls is another notable exception to this, reducing from 6.596us to 1.588us.
  • Most of the original code's memory allocations stuck around for Gen1 GCs. This GC pressure no longer exists.
  • I've got no data on unmanaged memory usage. This is particularly relevant for SortRequest, which moved from 400 bytes to 416 bytes managed memory usage. I'm assuming that this lack of data is the reason for the increase in memory usage - it's not actually increasing, it's just now trackable in the managed counters.
  • Code size is 15 bytes in most places. The disassembly puts this at the size of the benchmark itself - just enough to return DirectoryControl.GetValue. I think this is just noise from the JIT inlining.
Performance header
BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3296/23H2/2023Update/SunValley3)
Intel Core i7-8565U CPU 1.80GHz (Whiskey Lake), 1 CPU, 8 logical and 4 physical cores
.NET SDK 8.0.200
  [Host]     : .NET 8.0.4 (8.0.424.16909), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.4 (8.0.424.16909), X64 RyuJIT AVX2
AsqRequestControl.GetValue: -90% execution time, -35% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.869 μs | 0.0373 μs | 0.0485 μs | 127 B | 0.1030 | 0.0992 | 432 B |
| PR | 185.1 ns | 3.25 ns | 4.12 ns | 15 B | 0.0668 | - | 280 B |

CrossDomainMoveControl.GetValue: -54% execution time, -62% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 89.81 ns | 1.297 ns | 1.332 ns | 1,039 B | 0.0516 | - | 216 B |
| PR | 41.24 ns | 0.469 ns | 0.438 ns | 2,577 B | 0.0191 | - | 80 B |

DirSyncRequestControl.GetValue: -89% execution time, -35% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.587 μs | 0.0453 μs | 0.1329 μs | 191 B | 0.1221 | 0.1183 | 520 B |
| PR | 162.4 ns | 2.20 ns | 2.06 ns | 15 B | 0.0782 | - | 328 B |

ExtendedDNControl.GetValue: -89% execution time, -30% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.059 μs | 0.0213 μs | 0.0522 μs | 147 B | 0.0877 | 0.0858 | 368 B |
| PR | 115.9 ns | 1.72 ns | 1.44 ns | 15 B | 0.0610 | - | 256 B |

PageResultRequestControl.GetValue: -90% execution time, -40% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.412 μs | 0.0283 μs | 0.0432 μs | 160 B | 0.1030 | 0.1011 | 432 B |
| PR | 134.7 ns | 1.49 ns | 1.32 ns | 15 B | 0.0610 | - | 256 B |

QuotaControl.GetValue: -92% execution time, -28% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.607 μs | 0.0261 μs | 0.0232 μs | 127 B | 0.0916 | 0.0877 | 392 B |
| PR | 120.4 ns | 1.65 ns | 1.29 ns | 15 B | 0.0668 | - | 280 B |

SearchOptionsControl.GetValue: -90% execution time, -30% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.251 μs | 0.0223 μs | 0.0197 μs | 147 B | 0.0877 | 0.0858 | 368 B |
| PR | 114.7 ns | 1.56 ns | 1.39 ns | 15 B | 0.0610 | - | 256 B |

SecurityDescriptorFlagControl.GetValue: -88% execution time, -30% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.021 μs | 0.0188 μs | 0.0460 μs | 147 B | 0.0877 | 0.0858 | 368 B |
| PR | 115.8 ns | 1.47 ns | 1.37 ns | 15 B | 0.0610 | - | 256 B |

SortRequestControl.GetValue: -79% execution time, +4% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.893 μs | 0.0220 μs | 0.0195 μs | 15 B | 0.0954 | - | 400 B |
| PR | 390.5 ns | 5.37 ns | 5.03 ns | 15 B | 0.0992 | - | 416 B |

DirectoryControl.TransformControls: -75% execution time, -76% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 6.596 μs | 0.1310 μs | 0.3164 μs | 3,603 B | 0.8392 | 0.0153 | 3.43 KB |
| PR | 1.588 μs | 0.0208 μs | 0.0255 μs | 12,854 B | 0.1945 | - | 816 B |

VerifyNameControl.GetValue: -87% execution time, -46% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.575 μs | 0.0480 μs | 0.1240 μs | 1,526 B | 0.1488 | 0.1469 | 624 B |
| PR | 195.1 ns | 1.43 ns | 1.27 ns | 15 B | 0.0801 | - | 336 B |

VlvRequestControl.GetValue: -84% execution time, -69% Gen0 memory allocation

| Method | Mean | Error | StdDev | Code Size | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Original | 1.459 μs | 0.0276 μs | 0.0650 μs | 15 B | 0.2289 | 0.0038 | 968 B |
| PR | 227.7 ns | 3.21 ns | 3.00 ns | 15 B | 0.0706 | - | 296 B |

I've made a performance adjustment by specifying the initial size of AsnWriter, since this is trivial to calculate (or always static.) AsnWriter grows in 1KB increments, which is much larger than the size of a normal directory control and causes memory usage to balloon.
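For readers unfamiliar with the capacity overload, here is a minimal sketch of sizing the writer up front. It encodes a paged-results style value (an INTEGER page size followed by an OCTET STRING cookie); the sample values and the 18-byte overhead estimate are assumptions made for this example, not figures from the PR.

```csharp
using System;
using System.Formats.Asn1;

int pageSize = 512;
byte[] cookie = Array.Empty<byte>();

// Assumed upper bound: up to 6 bytes each for the sequence header, the INTEGER,
// and the octet string header, plus the cookie itself.
int initialCapacity = 18 + cookie.Length;

// The two-argument constructor avoids the writer's default growth step.
var writer = new AsnWriter(AsnEncodingRules.BER, initialCapacity);
using (writer.PushSequence())
{
    writer.WriteInteger(pageSize);
    writer.WriteOctetString(cookie);
}

byte[] value = writer.Encode();
Console.WriteLine(Convert.ToHexString(value));
```

An over-estimate is harmless here: the goal is only to keep the first allocation close to the final size of the control value.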

One inefficiency which I couldn't eliminate is that when writing strings as ASN.1 octet strings, I want to manually select the encoding to use and encode directly into the AsnWriter buffer. This isn't possible (probably to keep AsnWriter specification-compliant), so I have to reserve/allocate a byte array, encode into that and write that out as an octet string. An example of this behaviour is in VerifyNameControl.GetValue.

Edit: the updated build has completed and the test failures are unrelated, so I'm now happy that the benchmarks are valid @PaulusParssinen

* Handles the switch to AsnDecoder
* One data correction in SortResponseControlTests - one field has a tag of [0] which BerConverter was ignoring
* AsqResponseControl now verifies that there's no trailing data within the control value's decoded BER sequence.
* All response controls now verify that there's no trailing data after their encoded value in the control value.
Previously, a zero-length octet string interpreted via the "a" format string would have resulted in a null value in Windows 8.1, and an empty string in every other case. This now returns an empty string in all cases.
@edwardneal
Contributor Author

Following the merge of #107201, I've updated this PR with three commits; it's ready for review at leisure.

Commits 1 & 3

  • Updated the various response control tests, accounting for cases where previously "nonconformant but valid" samples are now marked as invalid, and eliminating the OS-specific behaviour in these tests. Commit 3 just removes the OS-specific behaviour from the conformant control samples - I'd forgotten that Windows 8.1 handled the "a" parameter differently.
  • Corrected a handful of conformant values in the sort response control tests - I had specified the tag for an OCTET STRING (0x04), when the RFC actually gives the structure an explicit tag of [0] (0x80.)
  • Clarified the differing behaviour of text decoding when the bytes to be decoded aren't a valid input to Encoding.GetString. .NET Framework/BerConverter throws a DecoderFallbackException, .NET 10 throws a BerConversionException.
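To illustrate the tag correction in the second bullet, here is a sketch of decoding a sort response value whose attribute name carries the context-specific tag [0] (0x80) rather than the universal OCTET STRING tag (0x04). The layout follows my reading of RFC 2891 and the sample bytes are invented; this is not the PR's parsing code.

```csharp
using System;
using System.Formats.Asn1;
using System.Text;

// SEQUENCE { ENUMERATED 0 (success), [0] "cn" }
byte[] value = { 0x30, 0x07, 0x0A, 0x01, 0x00, 0x80, 0x02, 0x63, 0x6E };

AsnDecoder.ReadSequence(value, AsnEncodingRules.BER, out int offset, out int length, out _);
ReadOnlySpan<byte> content = value.AsSpan(offset, length);

// The result code is an ENUMERATED; a single content byte is enough for this sample.
ReadOnlySpan<byte> resultBytes = AsnDecoder.ReadEnumeratedBytes(content, AsnEncodingRules.BER, out int consumed);
int resultCode = resultBytes[0];
content = content.Slice(consumed);

// The attribute name is optional. When present it must carry tag [0];
// a value tagged 0x04 here would make ReadOctetString throw AsnContentException.
string? attributeName = null;
if (!content.IsEmpty)
{
    byte[] nameBytes = AsnDecoder.ReadOctetString(
        content,
        AsnEncodingRules.BER,
        out consumed,
        new Asn1Tag(TagClass.ContextSpecific, 0));
    attributeName = Encoding.UTF8.GetString(nameBytes);
}

Console.WriteLine($"{resultCode} {attributeName}");
```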

Commit 2

  • Slight tightening of the validation when comparing the nonconformant but valid samples between response tests: added a check to ensure that there's no trailing data inside the ASN sequence when parsing an AsqResponseControl.
  • Also changed all five response control tests to make sure that there's no trailing data after the end of the RFC-compliant response control value. I think this tightens up the last piece of loose parsing of trailing data, and matches the RFCs/specs. I've tested this against an OpenLDAP server and can run the existing tests.
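The two trailing-data checks described above could look roughly like the following. `TryParse` is a hypothetical helper that parses a made-up control value of the form SEQUENCE { INTEGER }; it rejects extra bytes both after the sequence and inside it.

```csharp
using System;
using System.Formats.Asn1;

byte[] clean = { 0x30, 0x03, 0x02, 0x01, 0x05 };
byte[] trailingAfter = { 0x30, 0x03, 0x02, 0x01, 0x05, 0x00 };
byte[] trailingInside = { 0x30, 0x05, 0x02, 0x01, 0x05, 0x05, 0x00 };

Console.WriteLine(TryParse(clean, out _));
Console.WriteLine(TryParse(trailingAfter, out _));
Console.WriteLine(TryParse(trailingInside, out _));

// Hypothetical helper, not the PR's implementation.
static bool TryParse(byte[] controlValue, out int number)
{
    number = 0;
    try
    {
        AsnDecoder.ReadSequence(
            controlValue, AsnEncodingRules.BER,
            out int offset, out int length, out int consumed);

        // Check 1: nothing may follow the encoded sequence in the control value.
        if (consumed != controlValue.Length)
        {
            return false;
        }

        ReadOnlySpan<byte> content = controlValue.AsSpan(offset, length);
        if (!AsnDecoder.TryReadInt32(content, AsnEncodingRules.BER, out number, out int innerConsumed))
        {
            return false;
        }

        // Check 2: nothing may follow the last expected field inside the sequence.
        return innerConsumed == content.Length;
    }
    catch (AsnContentException)
    {
        // Malformed BER is treated as a parse failure.
        return false;
    }
}
```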

@ericstj requested review from a team and removed request for buyaa-n on November 4, 2024, 16:30
Member

@steveharter left a comment

@edwardneal I verified test coverage for the new code, and that looks good with the exception of the two cases I mentioned.

Reading a long attribute name would have failed due to an invalid expected ASN.1 tag. Corrected, and added a test.
Added test to validate that passing an invalid UTF8 string as the target parameter of a VlvRequestControl will now throw an EncoderFallbackException.
No longer null coalescing _directoryControlValue; replaced with a Debug.Assert that it's not null.
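As background for the EncoderFallbackException mentioned in the commit notes above, this sketch shows how a strict UTF-8 encoder reacts to a string that cannot be represented in UTF-8 (a lone surrogate). Whether the PR constructs its encoding exactly this way is an assumption; the sample string is invented.

```csharp
using System;
using System.Text;

// Assumption: a UTF8Encoding configured to throw rather than substitute '?'.
var strictUtf8 = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

// A lone high surrogate has no valid UTF-8 representation.
string invalidTarget = "abc\uD800";

bool rejected;
try
{
    strictUtf8.GetBytes(invalidTarget);
    rejected = false;
}
catch (EncoderFallbackException)
{
    rejected = true;
}

Console.WriteLine(rejected); // True
```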
Member

@steveharter left a comment

Marking approved; will wait for a bit for any additional feedback from @bartonjs or others before merging. I think moving this code forward is a good thing even though there is some risk. It also has the potential to fix issues including #34679.

{
internal static class AsnWriterExtensions
{
public static void WriteLdapString(this AsnWriter writer, string value, Encoding stringEncoding, bool mandatory = true, Asn1Tag? tag = null)
Member

I'm not sure this is a good name. There's no construct (that I see) called LdapString, and not all strings in LDAP are sent as "A Utf8String, except using tag 04 instead of 0C".

WriteUtf8OctetString, maybe?

The bool mandatory has no peer on AsnWriter methods. I recommend removing it here (making it always behave as true) and making the one "optional" caller bring that logic closer to home, so it looks like any other conditional write for an ASN OPTIONAL or DEFAULT value.

Member

I guess one caller passes Encoding.Unicode. So either two functions, or "WriteStringAsOctetString" might be a better name for the current shape.

Contributor Author

The name LdapString partially comes from RFC2251, as the backing type for AttributeDescription. Do you still want the name to change?

It was primarily used for writing the sort controls, and the other control logic piggybacks on the same method by explicitly specifying the encoding. I'll see if two methods would be clearer for this.

Contributor Author

I've tried a couple of different methods to see what the semantics look like, and agree - WriteStringAsOctetString it is. That's rolled up and done now.
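For context, an extension along the lines discussed in this thread might look like the sketch below: the caller picks the encoding, the string is encoded into a rented buffer, and the bytes are written as an OCTET STRING with an optional tag. This is a guess at the general shape based on the comments above, not the merged code; the pooling strategy in particular is an assumption.

```csharp
using System;
using System.Buffers;
using System.Formats.Asn1;
using System.Text;

var writer = new AsnWriter(AsnEncodingRules.BER);
writer.WriteStringAsOctetString("cn", Encoding.UTF8);
byte[] utf8Value = writer.Encode();

writer.Reset();
writer.WriteStringAsOctetString("cn", Encoding.Unicode);
byte[] utf16Value = writer.Encode();

Console.WriteLine(Convert.ToHexString(utf8Value));
Console.WriteLine(Convert.ToHexString(utf16Value));

// Sketch only: the real method lives in the PR and may differ.
internal static class AsnWriterExtensions
{
    public static void WriteStringAsOctetString(
        this AsnWriter writer, string value, Encoding stringEncoding, Asn1Tag? tag = null)
    {
        // AsnWriter cannot encode text straight into its own buffer with a
        // caller-chosen encoding, so encode into a temporary array first.
        int maxByteCount = stringEncoding.GetMaxByteCount(value.Length);
        byte[] rented = ArrayPool<byte>.Shared.Rent(maxByteCount);

        try
        {
            int written = stringEncoding.GetBytes(value, 0, value.Length, rented, 0);
            writer.WriteOctetString(rented.AsSpan(0, written), tag);
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

Dropping the `mandatory` flag, as suggested in the review, leaves any "write only if present" decision with the caller.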

[ThreadStatic]
private static AsnWriter? t_writer;

[MemberNotNull(nameof(t_writer))]
Member

I don't know how well MemberNotNull behaves with ThreadStatic. No one should be touching t_writer except this function, so why is the annotation needed/warranted at all?

Contributor Author

@edwardneal Nov 28, 2024

I thought it'd help bridge the gap between a nullable local variable and a non-nullable return value. I've removed it.


[MemberNotNull(nameof(t_writer))]
internal static AsnWriter GetWriter()
=> t_writer ??= new AsnWriter(AsnEncodingRules.BER);
Member

The correct behavior for every caller is to call Reset() on the writer when they get it, because they don't know if they have one that was abandoned due to an exception.

Maybe GetWriter should do that for them, instead?

Contributor Author

Thanks, I've shifted this around.
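A sketch of what the reviewer's suggestion amounts to: the cached per-thread writer is reset every time it is handed out, so a writer abandoned mid-write (for example after an exception) cannot leak state into the next caller. The class name `AsnWriterCache` is hypothetical; only the `t_writer` and `GetWriter` names come from the snippet under review.

```csharp
using System;
using System.Formats.Asn1;

// Simulate a caller that abandons the writer with an open sequence.
AsnWriter first = AsnWriterCache.GetWriter();
_ = first.PushSequence();
first.WriteInteger(1);

// The next caller on the same thread gets the same instance, already reset.
AsnWriter second = AsnWriterCache.GetWriter();
second.WriteInteger(5);
byte[] encoded = second.Encode();

Console.WriteLine(ReferenceEquals(first, second));
Console.WriteLine(Convert.ToHexString(encoded));

// Hypothetical container class for the cached writer.
internal static class AsnWriterCache
{
    [ThreadStatic]
    private static AsnWriter? t_writer;

    internal static AsnWriter GetWriter()
    {
        AsnWriter writer = t_writer ??= new AsnWriter(AsnEncodingRules.BER);

        // Reset unconditionally: it discards leftover data and any open scopes.
        writer.Reset();
        return writer;
    }
}
```

Without the reset, the `Encode` call in this example would be expected to fail because the first caller left a sequence open.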

@bartonjs
Member

any additional feedback from @bartonjs

Had a few small things. Looked at all the commits since my last review.

This change doesn't take effect on .NET Framework, so any test expecting an exception will fail.
Removed the unnecessary nullability annotations, and moved the Reset call into GetWriter.
Also adjusted method signature to better align to the rest of the AsnWriter API surface.
@steveharter
Member

@edwardneal do you have any further action items or planned changes? If not, I'll merge. Thanks.

@steveharter removed the NO-MERGE label ("The PR is not ready for merge yet (see discussion for detailed reasons)") on Dec 5, 2024
@edwardneal
Contributor Author

Thanks - I've responded to bartonjs' code review in-line, so don't have any further code changes planned.

An earlier comment asked for a breaking change doc to be created though, and with the work settled I'll do this today/tomorrow.

@steveharter merged commit b5ea456 into dotnet:main on Dec 5, 2024
80 of 83 checks passed
@edwardneal deleted the issue-97540 branch on December 6, 2024, 06:37
@edwardneal
Contributor Author

Thanks @steveharter and @bartonjs for your reviews. The breaking change doc is dotnet/docs#43885.

Labels
area-System.DirectoryServices
breaking-change: Issue or PR that represents a breaking API or functional change over a prerelease.
community-contribution: Indicates that the PR has been added by a community member
needs-breaking-change-doc-created: Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet
Development

Successfully merging this pull request may close these issues.

[mono] Test failed on windows: System.DirectoryServices.Protocols.Tests.SortRequestControlTests
6 participants