-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split Prism::Loader#load_node in one lambda per node type #1710
Conversation
@eregon I half wonder how naming methods for each type with a send would work? Not just because it may be more expensive for us to run blocks but because it would make debugging the gem give a much nicer backtrace when there is a problem. |
@enebo That would be a megamorphic send (megamorphic in lookup since the method name is one of many, impossible to inline cache that lookup), which is a lot more expensive than a block call with a non-constant block "call target" (which has no lookup). |
@eregon yeah I am aware of that but block dispatch is quite a bit more expensive than method dispatch so I think that single hash lookup will still be cheaper. I can understand why you may not like it. |
Feel free to experiment, I'd like to merge this first in any case. |
@eregon yeah definitely land it as it makes a big difference for both our impls. |
I'm seeing about a 10% slow-down on CRuby for this. I'd rather not impose that penalty unless we have to. Could you check the platform that you're on when you're templating out the file? That would allow both implementations to still be quick. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Marking as request changes
That's more than I saw in my measurements in the description.
Unfortunately that cannot work since the same I can however emit both code paths and do I'm not sure it's worth it given @enebo One thing that is worth a try is to have an array of |
* Otherwise load_node is too big to compile and is forced to run in interpreter: oracle/truffleruby#3293 (comment) * For the benchmark at oracle/truffleruby#3293 (comment) TruffleRuby Native 23.1.0: Before: 10.574041 After: 5.592436 JRuby 9.4.3.0: Before: 7.037780 After: 3.995317 JRuby 9.4.3.0 -Xcompile.invokedynamic=true: Before: 7.047832 After: 2.269294
Rebased and now the exact same code as before is used on CRuby. |
I did not measure on CRuby since the FFI backend is not default there.
EDIT:
I measured on CRuby 3.2 too and it's a little bit slower but almost the same:
Without YJIT: Before: 5.284203 After: 5.357027
With YJIT: Before: 3.333703 After: 3.516235
I think that's fine, given that on CRuby it's using the C extension for YARP.parse and that's faster (2s, 1.75s with YJIT).
cc @enebo