-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shorten UTD marker file #9387
Shorten UTD marker file #9387
Conversation
Do you have an experimental insertion for this fix? |
https://devdiv.visualstudio.com/DevDiv/_git/VS/pullrequest/509283 I'm looking into the checks failures. UPDATE: Just a single test is failing for insertion which is by design (as built packages are missing in the public feed) |
Closing in order to rekick the confused git pull |
CI blocked by ADO issue. IcM [internal]: https://portal.microsofticm.com/imp/v3/incidents/details/439851463/home |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the explanation :)
ReadOnlySpan<byte> span = MemoryMarshal.Cast<char, byte>(text.AsSpan()); | ||
foreach (byte b in span) | ||
{ | ||
hash = unchecked((hash ^ b) * fnvPrimeA64Bit); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does MSBuild support big-endian platforms? It looks like this would not be stable between big-endian and little-endian.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KalleOlaviNiemitalo - Thank you for bringing this up!
We're not declaring what can be the expectations of stableness of the hashing between different systems (we actually do not declare anything ... yet :-)) - but it'd certainly be nice to have it more universal.
@uweigand - would you be willing to chime in and share your thoughts on the following?
Basically - since the MemoryMarshal.Cast<char, byte>
changes how we fetch data from memory to processor - it'll behave differently between Endinan-ness representations. But how anout the version which manually splits each char to bytes:
for (int i = 0; i < text.Length; i++)
{
char ch = text[i];
byte b = (byte)ch;
hash ^= b;
hash *= fnvPrimeA64Bit;
b = (byte)(ch >> 8);
hash ^= b;
hash *= fnvPrimeA64Bit;
}
since all this happens in processor after the data are fetched from memory - it should fetch same results regardless of endinan-ness - is that right?
FYI @ladipro (and thanks for all the offline consultation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that would work the same in either endianness.
You could also check BitConverter.IsLittleEndian and use MemoryMarshal.Cast on little-endian platforms but fall back to the slower NET35-compatible implementation on big-endian platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…although I have not measured whether the NET35-compatible implementation actually is slower. In both implementations, each operation on hash
depends on the previous one, so it doesn't look like there is much that the CPU could do in parallel. And the strings probably are pretty short anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me the code above implements three methods that are (potentially) different. There's the NET35
method in ComputeHash64
(which operates on single bytes at a time, splitting the 2-byte char in an endian-independent way), the !NET35
method in ComputeHash64
(which also operates on single bytes at a time, but splits the 2-byte char in an endian-dependent way), and then there's ComputeHash64Fast
(which operates on 2-byte chars at a time, which appears to give a different result from either of the two other methods).
In general, I agree that it would be preferable to use a method that is stable across native endianness. Unless the NET35
method is really slower (and I also suspect it actually isn't), and performance matters at this place, it might be best to just use that method unconditionally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re ComputeHash64Fast, this comment seems to be false:
/// The hashing algorithm considers only the first 8 bits of each character. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all the detailed input! I'll remove the Spans version.
The Spans version is indeed not superior (it has slight to no advantage for large strings, but performs worse for a very short strings). It's nice that it doesn't need unchecked code and is more brief - but given the fact that it can perform worse and that the other version anyways need to be maintained - it's an easy choice.
I'll as well fix the description for the ComputeHash64Fast
…n.CurrentVersion.targets
…entVersion.targets (#9520) #9387 introduced improved hashing for [`[MSBuild]::StableStringHash`](https://learn.microsoft.com/en-us/visualstudio/msbuild/property-functions?view=vs-2022#msbuild-stablestringhash), that however broke internal functionality relying on the hash to be stable between versions (despite documented otherwise). ### Changes Made * Reverted all the hashing changes * Kept the UTD marker change (fixing #9346) ### Note Longer term fix: #9519
…mon.Curr…" This reverts commit 5af9301.
…ft.Common.Curr…" This reverts commit 5af9301.
* Revert "Revert "Shorten UTD marker file (#9387)" except Microsoft.Common.Curr…" This reverts commit 5af9301. * Make FNV hash compatible across endianness * Add StableStringHash intrinsic function overloads * Put StringTools functions references behind changewave * Prevent StableStringHash inlining * Move the changewave description to proper section
Fixes #9346, #7131
Context
With long project paths CopyUpToDate mareker file can get long and break build in VS as long file support is not on.
This is a proposal of short/mid term remedy by shortening the marker file path a bit. However it applies only to non-CPS cases and it applies to CopyUpToDate marker file only.
Considering those facts it's questionable if this should be part of the MSBuild repo.
Alternatively we can decide to fix it more widely by renaming the
MSBuildCopyMarkerName
toMSBuildProjectMarkerName
and use it in various other scenarios where marker/cache files are generated in a similar way - though that would have much higher breaking possibility.