Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Improve MemoryMarshal.Cast #16659

Merged
merged 1 commit into from
Mar 1, 2018
Merged

Improve MemoryMarshal.Cast #16659

merged 1 commit into from
Mar 1, 2018

Conversation

mikedn
Copy link

@mikedn mikedn commented Feb 28, 2018

Avoid unnecessary checked and signed arithmetic. Handle special cases such as cast between same size types and from byte sized types, the JIT is unable to optimize these cases currently.

Contributes to #15492

It would be nice to address some of the problems this code exposes in the JIT. But that will take a while so for now let's fix the source code...

@mikedn
Copy link
Author

mikedn commented Feb 28, 2018

This brings down the size of Marvin.ComputeHash from 445 to 310 bytes

Diff/before/after disassembly: https://gist.github.com/mikedn/026e692f8c5357b56f24dc2192d1abd9

Avoid unnecessary checked and signed arithmetic. Handle special cases such as cast between same size types and from byte sized types, the JIT is unable to optimize these cases currently.
@jkotas
Copy link
Member

jkotas commented Feb 28, 2018

cc @AndyAyersMS

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikedn Thanks for doing this!

@AndyAyersMS
Copy link
Member

Looks like this gives us maybe another 5-10% improvement in the longer hash computations. My machine currently shows shorter computations a bit slower but I can't do a completely clean perf test right now. Will rerun these later today.

@AndyAyersMS
Copy link
Member

Test failures (example) look unrelated, so am going to retry.

@dotnet-bot retest Ubuntu x64 Checked Innerloop Build and Test
@dotnet-bot retest CentOS7.1 x64 Checked Innerloop Build and Test
@dotnet-bot retest OSX10.12 x64 Checked Innerloop Build and Test
@dotnet-bot retest Tizen armel Cross Checked Innerloop Build and Tes

@AndyAyersMS
Copy link
Member

@dotnet-bot retest Tizen armel Cross Checked Innerloop Build and Test

@AndyAyersMS
Copy link
Member

AndyAyersMS commented Mar 1, 2018

Still measure a bit of a slowdown for some cases but when looking at diffs it doesn't make a lot of sense. So perhaps my baseline data (which came from the runs I did for #16654) is stale.

(edit: reran the baseline and now it all looks good; note new results are now the leftmost column)

image

Old data below

image

@mikedn
Copy link
Author

mikedn commented Mar 1, 2018

Cool, need to figure out how to run these tests myself :)

@AndyAyersMS
Copy link
Member

Just for the record, here's what I did -- only works if the bits you are changing live entirely in CoreCLR:

  • build release bits for diff and base (I usually use two repos). usually build checked too.
  • in base repo, selectively build one perf test from the test tree
  • from repo root, run tests\scripts\run-xunit-perf.py -testBinLoc <path to that test's dir> to setup the perf test sandbox (which will be bin\sandbox)
  • compile your new custom perf test (I copied the one from corefx).
  • copy the test binary into bin\sandbox
  • cd to the sandbox and corerun perfharness.dll myperftest.exe
  • switch to diff repo and repeat above to get diff data (or swap diff parts into base sandbox)
  • hoover the two csv that perfharness creates into excel and pivot to get comparative view (need to add a custom item to get ratio, it's not a built in option)

@AndyAyersMS AndyAyersMS merged commit 5e82446 into dotnet:master Mar 1, 2018
dotnet-bot pushed a commit to dotnet/corefx that referenced this pull request Mar 1, 2018
Avoid unnecessary checked and signed arithmetic. Handle special cases such as cast between same size types and from byte sized types, the JIT is unable to optimize these cases currently.

Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>
stephentoub pushed a commit to dotnet/corefx that referenced this pull request Mar 1, 2018
Avoid unnecessary checked and signed arithmetic. Handle special cases such as cast between same size types and from byte sized types, the JIT is unable to optimize these cases currently.

Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>
dotnet-bot pushed a commit to dotnet/corert that referenced this pull request Mar 1, 2018
Avoid unnecessary checked and signed arithmetic. Handle special cases such as cast between same size types and from byte sized types, the JIT is unable to optimize these cases currently.

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
stephentoub pushed a commit to dotnet/corert that referenced this pull request Mar 1, 2018
Avoid unnecessary checked and signed arithmetic. Handle special cases such as cast between same size types and from byte sized types, the JIT is unable to optimize these cases currently.

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
dotnet-bot pushed a commit to dotnet/corert that referenced this pull request Mar 1, 2018
Avoid unnecessary checked and signed arithmetic. Handle special cases such as cast between same size types and from byte sized types, the JIT is unable to optimize these cases currently.

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
@adamsitnik
Copy link
Member

Hi @mikedn

I am currently profiling XmlSerializer deserialization and I can see that your change has improved the perf for my particular scenario (https://github.com/dotnet/corefx/issues/27636) by 2,8%

I have marked the Cast method with black circles, on the left is after, on the right before your change . Sorry for unnatural order ;)

image

Nice improvement!

kbaladurin pushed a commit to kbaladurin/corert that referenced this pull request Mar 15, 2018
Avoid unnecessary checked and signed arithmetic. Handle special cases such as cast between same size types and from byte sized types, the JIT is unable to optimize these cases currently.

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
@mikedn mikedn deleted the memcast branch March 25, 2018 10:39
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants