Inline some IR methods and constructors #5030
These were originally all made non-inline to try to speed up the build. Not sure how much effect it has one way or the other.
What is the benefit of this change? For me it makes sense to have all the implementations in one place. Otherwise you end up jumping between files when reading up on a class.
@fruffy Looks like the mac runner image is broken again. But why do we need pkg-config there?
Looks like a new runner is deployed, there is already a fix here: #5028
@@ -530,7 +530,7 @@ int IrClass::generateConstructor(const ctor_args_t &arglist, const IrMethod *use
     }

     if (kind == NodeKind::Abstract) ctor->access = IrElement::Protected;
-    ctor->inImpl = true;
+    ctor->inImpl = false;
This will have a major effect on the size of the ir-generated.h header. Is this necessary?
Well, currently all the simple constructors are outlined. This case covers those that are not defined explicitly and just initialize the fields. As a result, lots of function calls are emitted, since the compiler obviously cannot inline them.
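To make the distinction concrete, here is a minimal sketch of an outlined versus an in-header field-initializing constructor. The `Node` and `Declaration` types and the `makeDecl` helper are hypothetical stand-ins, not the actual generated p4c classes, and the out-of-line definition is kept in the same block only so the example is self-contained.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for generated IR classes; not the real p4c types.
struct Node {
    int srcLine;
    explicit Node(int srcLine);  // declaration only; defined out of line below
};

// Out-of-line ("outlined") definition. In a real build this would sit in a
// .cpp file, so other translation units see only the declaration and have to
// emit a call (unless something like LTO is in use).
Node::Node(int srcLine) : srcLine(srcLine) {}

struct Declaration : Node {
    std::string name;

    // Defined in the class body, hence implicitly inline: the body is visible
    // to every file that includes the header, so the compiler may (but is not
    // required to) inline it at the call site.
    Declaration(int srcLine, std::string name)
        : Node(srcLine), name(std::move(name)) {}
};

// Hypothetical helper that exercises both constructors.
inline Declaration makeDecl() {
    Declaration d(42, "accept");
    assert(d.srcLine == 42);
    return d;
}
```

Both constructors only forward their arguments into fields; the difference is purely whether the body is visible to callers.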
We have some deeply nested inheritance trees. Wouldn't this lead to a blowup if every constructor can be inlined?
We are also adding ~4k LoC to the header in the worst case (with Tofino)
> We have some deeply nested inheritance trees. Wouldn't this lead to a blowup if every constructor can be inlined?
Not quite. Remember that the inlining decision is made by the compiler: not every function declared inline will actually be inlined. Also, explicitly defined constructors without inline will still be outlined, so there is a way to control that. The implicit ones are those that initialize the fields by just forwarding the arguments.
The main purpose of this is to ensure that simple methods are properly inlined. E.g. it does not make any sense to outline one-liners like
inline bool isBuiltin() const { return name == ParserState::accept || name == ParserState::reject; }
We just pay for an extra function call for no reason. Defining things inline ensures the compiler is able to optimize them further.
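As a rough illustration of the one-liner case, the sketch below places an in-class (implicitly inline) predicate next to an otherwise identical one that is only declared in the class. The `ParserState` struct here is a simplified, hypothetical version of the class mentioned above, and `isBuiltinOutlined` and `countBuiltins` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified version of the class under discussion.
struct ParserState {
    static const std::string accept;
    static const std::string reject;
    std::string name;

    // In-class definition: implicitly inline, body visible to all includers.
    bool isBuiltin() const { return name == accept || name == reject; }

    // Declaration only; the definition below stands in for a .cpp file.
    bool isBuiltinOutlined() const;
};

const std::string ParserState::accept = "accept";
const std::string ParserState::reject = "reject";

// Same logic, but callers in other translation units would see only the
// declaration and pay for a real function call.
bool ParserState::isBuiltinOutlined() const {
    return name == accept || name == reject;
}

// Hypothetical hot loop: with the inline variant the comparison can be folded
// into the loop body; whether that happens is still up to the compiler.
inline std::size_t countBuiltins(const std::vector<ParserState> &states) {
    std::size_t n = 0;
    for (const auto &s : states) {
        assert(s.isBuiltin() == s.isBuiltinOutlined());
        if (s.isBuiltin()) ++n;
    }
    return n;
}
```

The two predicates are behaviorally identical; only the visibility of the body to the optimizer differs.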
Is there a measurable effect on performance? We should be mindful not to trade off too much usability/readability and binary size for this.
Before we merge it would be nice to see some numbers on performance gain versus compile time increase. If compile time increases significantly and there is no measurable performance difference I do not think we should merge this.
If, on the other hand, compile time doesn't measurably change we can merge.
In case the compile-time impact is high, how does this change compare to enabling LTO in the build? From what I remember from the Tofino days, we had a rather significant speedup from enabling LTO, presumably exactly because it allows inlining more functions. The advantage is that LTO can be enabled only for release builds, so normal development build speeds are not affected. On a slight tangent, there was an idea of speeding up compilation by using PCH (precompiled headers in GCC/clang) for
I believe LTO contributes a lot to devirtualization and further optimization opportunities uncovered due to this.
This is a good idea, yes, that is worth trying. Modules are likely infeasible due to circular dependencies here and there. But PCH / PTH might give a decent speedup.
So, the performance results are interesting. Here are the build times for ir-generated.cpp:
main:
ir-inline:
These numbers are pretty stable, so for me it's ~38 seconds on average on
The compile time for "ordinary" C++ files looks similar to me. I'm showing individual files here, but averages from multiple runs are pretty much the same:
ir-inline:
And here is
ir-inline:
The overall compile time for the whole p4c is pretty noisy. I tried to replicate it several times, but overall the differences are below the noise level. The performance of p4c itself improved slightly, which is even surprising for a pretty flat profile dominated by memory allocation, copies and constant clones :) Sadly, I cannot run larger apps with GC off, and GC itself introduces too much variation to catch a ~1-2% difference.
@fruffy Please let me know if there are any other checks that you'd want me to run.
Strange results indeed. Thanks for the detailed analysis!
It looks like build time is not adversely affected but it is hard to measure.
to outline them as they are mostly one-liners and could be further simplified.
Signed-off-by: Anton Korobeynikov <anton@korobeynikov.info>
It does not really make sense to emit them in ir-generated.cpp, as they are mostly one-liners and could be further simplified.