-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose an eeIsJitIntrinsic method for quickly determining if a method handle represents an intrinsic method #51124
Conversation
72642d8
to
daa19d0
Compare
private bool isJitIntrinsic(CORINFO_METHOD_STRUCT_* ftn) | ||
{ | ||
MethodDesc method = HandleToObject(ftn); | ||
return method.IsIntrinsic; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the only particular thing of note is this (as per https://github.com/dotnet/runtime/compare/main...tannergooding:isJitIntrinsic?expand=1#diff-132a77bcd3f74cf0e0b04fbccda246c97c91e40562d78cb01fff61cf69403573L872-L873) isn't fine grained enough to determine just CORINFO_FLG_INTRINSIC
or CORINFO_FLG_JIT_INTRINSIC
Did you measure crossgen2 throughput with this? You could probably also use crossgen1, it might be simpler. |
Working on doing that now, I was just planning on running the same benchmark from #51006 |
@AndyAyersMS, @jkotas, on my own box, I see the below. Crossgen2Base:
Resolve tokens (+6.03% from Base):
Jan's change (+2.24% from Base, -3.58% from Resolve tokens):
This change (+1.94% from Base, -3.86% from Resolve tokens, -0.29% from Jan's change):
Notably, its not clear why this change, which is doing a lot less work than |
The "Time on Thread" has improved significantly with this change. Is it a sign that this change works well? |
@DrewScoggins might be able to give more insight into what each value is for. There is a good bit of noise here as well, and multiple independent runs can result in some pretty variable results. For example, in the change I just pushed which drops the
|
Console output indicates 1 warm up and 5 iterations, for all 4 (now 5) shared results. |
Given that we're making measurements where we don't have a good model for the underlying distribution, median may be a more useful thing to report than mean. Historically for TP measurements the jit team would use crossgen1 running under PIN (or some other instruction counting technique), and there the run to run repeatability was quite good. So you might try doing that instead. |
I can run some more benchmarks later this afternoon. I can collect some instruction counts using AMD uProf ( I'll need to figure the command line the benchmark is passing to crossgen as part of that. |
Here is the code that we use to generate the command line. Looks like we call crossgen2 with the path to the file we are crossgening, give an output file name and use the -O flag as well as -r for the different ref files. |
For the Native AOT compiler (which is pretty much crossgen2), turning off Tiered JIT stabilized throughput by a lot. We actually also had a compilation throughput improvement with that so it's just off by default there. |
I'm seeing an average of:
It isn't clear to me why this is the case initially, since this should be doing less work now. |
03c0999
to
d467326
Compare
Rebased to resolve merge conflict |
What does the retired instructions number mean? |
The number of instructions that were executed and correct (meaning this does not include speculatively executed instructions). In general, it isn't clear why calling |
We usually just look at the overall numbers for the whole process, not try and attribute them to particular methods. |
Its lost in the noise of the overall process. The average for all 4 variations, measured across 5 runs each, is about 92000 retired instructions with not enough variation between them to see anything discernable in terms of hardware metrics that ETL or uProf collect. |
This small variation in instructions retired is seen across all 4 variants? (base, regression, fix1, fix2...)? If so then something is off in the measurement process. I can try looking with PIN later and post what I see. |
Here's the full process pin icount data for crossgen SPC. Seems to track your crossgen2 data above fairly well. Looks like we're maybe 1.5% slower than we were before all this. I think we can live with that.
|
Thanks! I'll need to figure out why uProf wasn't reporting the right numbers on my end |
This should resolve #51006