
[Performance] Full build takes more time since 2024-09 (type inference) #3387

Merged

1 commit merged into eclipse-jdt:master on Dec 4, 2024

Conversation

stephan-herrmann
Contributor

optimize 3:

  • never visit the same super type more than once

Fixes #3327

(eclipse-jdt#3384)

@stephan-herrmann
Contributor Author

stephan-herrmann commented Dec 3, 2024

For a moment I was tricked by the discussion about O(n^2) vs O(n) without saying what n is :)

The requirement put forward by the JLS (and thus not subject to our choosing) is similar to O(n^2), in that all pairs of supertypes of types s and t must be inspected. The implementation actually cuts this down to two linear-ish searches: note that t never changes across the recursive invocations; that part of the equation is always handled via findSuperTypeOriginatingFrom(), which performs a limited search in that hierarchy (limited in that it stops when it finds a "matching" type).

What still looks like some combinatorial explosion is in fact the difference between the number of paths and the number of nodes in the inheritance hierarchy: the original implementation traversed all paths from t up to j.l.Object, whereas visiting each node in that graph is fully sufficient. This is less an issue of the depth of a hierarchy than of its "volume"; it must be deep and wide, with lots of joining edges in the graph, for the effect to show.

Limiting that traversal is what the cache achieves. In contrast to #3333 (comment), no cache of pairs is needed, since s is constant during a traversal :)

Additionally, by using prototype() I ensure that differently annotated variants of the same type are still recognized as the same.

I'm still a bit uneasy about using TypeBinding in a hashed collection, but looking at the comment in ReferenceBinding.hashCode() this could still be safe as long as no UnresolvedReferenceBindings are in the lot (because those intentionally break the uniqueness of hashCode()).
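For illustration only, here is a minimal sketch of the idea: a recursive walk over the supertypes of s that records each visited node, keyed on prototype(). The method name, the recursion shape and the superclass()/superInterfaces() calls are assumptions of this sketch, not a claim about the actual patch:

```java
// Hypothetical sketch of "never visit the same super type more than once".
// The visited set turns a walk over all *paths* up to j.l.Object into a walk
// over the *nodes* of the hierarchy graph; keying on prototype() lets
// differently annotated variants of a type count as the same node.
void collectSuperPairs(TypeBinding s, TypeBinding t,
        Set<TypeBinding> visited, List<Pair<TypeBinding>> result) {
    if (s == null || !visited.add(s.prototype()))
        return; // this node of the hierarchy graph was already handled
    TypeBinding tSuper = t.findSuperTypeOriginatingFrom(s); // t never changes
    if (tSuper != null && s.isParameterizedType() && tSuper.isParameterizedType())
        result.add(new Pair<>(s, tSuper));
    collectSuperPairs(s.superclass(), t, visited, result);
    if (s.superInterfaces() != null)
        for (TypeBinding superInterface : s.superInterfaces())
            collectSuperPairs(superInterface, t, visited, result);
}
```

With such a guard, each supertype contributes at most one findSuperTypeOriginatingFrom() call, no matter how many inheritance paths lead to it.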

@stephan-herrmann
Contributor Author

FWIW, using the reproducer, my poor man's measurements yield:

       min        max
real   0m4,050s   0m4,656s
user   0m8,669s   0m10,040s
sys    0m0,355s   0m0,423s

which is not such a significant gain over results from #3384:

       min        max
real   0m4,210s   0m6,360s
user   0m8,628s   0m9,131s
sys    0m0,316s   0m0,393s

@stephan-herrmann stephan-herrmann marked this pull request as draft December 3, 2024 18:21
@stephan-herrmann
Contributor Author

@fedejeanne I marked this PR as draft because I'd like to get feedback on results from #3384 in real life before stacking more speculative optimization on top of the previous one. I mean, we've seen the differences in counts, but not so much in compile time.

@stephan-herrmann
Contributor Author

> which is not such a significant gain over results from #3384:

The reason is: while we still visit a significant number of types, we don't perform any actual work. The list of pairs already has length zero thanks to #3384 :)

@fedejeanne
Contributor

@stephan-herrmann understood, I was about to check the performance gain anyway when I read your comments :-)

I am setting up my IDE right now and I will measure performance with #3384 right away.

@jukzi
Contributor

jukzi commented Dec 4, 2024

> I'm still a bit uneasy about using TypeBinding in a hashed collection, but looking at the comment in ReferenceBinding.hashCode() this could still be safe as long as no UnresolvedReferenceBindings are in the lot (because those intentionally break the uniqueness of hashCode()).

  1. This PR uses hashCode()/equals() only during the search, and the cache is not kept on the heap. The search itself should not (but might it? that would be bad) modify the hash code during the search.
  2. The hashCode() implementation looks horribly bad: equals() and hashCode() should always be in sync. I wonder where this hash code is already used and whether it would not be better fixed.
  3. As a workaround, this PR could use ReferenceBindingSetWrapper, which already tries to work around that problem.
  4. The implementation of org.eclipse.jdt.internal.compiler.lookup.ReferenceBinding.identityHashCode() is bad, as it relies on the fact that super does not override hashCode(). ReferenceBindingSetWrapper should use System.identityHashCode instead (see the sketch below).
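A minimal sketch of what an identity-based wrapper along the lines of point 4 could look like (the class name is hypothetical; the existing ReferenceBindingSetWrapper may differ):

```java
// Hypothetical wrapper that hashes and compares purely by object identity,
// so it neither depends on nor disturbs the semantics of
// ReferenceBinding.hashCode()/equals().
final class IdentityBindingWrapper {
    final ReferenceBinding binding;

    IdentityBindingWrapper(ReferenceBinding binding) {
        this.binding = binding;
    }

    @Override
    public int hashCode() {
        // stable even if a subclass overrides hashCode()
        return System.identityHashCode(this.binding);
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof IdentityBindingWrapper
                && ((IdentityBindingWrapper) other).binding == this.binding;
    }
}
```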

@fedejeanne
Contributor

FTR, even in its current state and with the bad implementations of hashCode() / equals(), I rebased this PR on master (so #3384 is also applied) and ran a full build: it took ~12min 👏

I just happened to have VisualVM still open, so I am also adding a screenshot of the Monitor view. FYI, the build ran from ~9:23 to ~9:35 (sorry I can't be more precise, I wasn't watching the clock).

[screenshot of the VisualVM Monitor view omitted]

This is the sample (sampled every 20ms): v20241203-1907 and 3387-Full build (clean workspace).zip

And one can see that the hotspots changed: allSuperPairsWithCommonGenericType and allSuperPairsWithCommonGenericTypeRecursive are way down in the hotspots view.

[screenshots of the hotspots view omitted]

@jukzi
Contributor

jukzi commented Dec 4, 2024

Good, that's what I expected.

BTW: you should not trust the Heap/Memory view while you are sampling. Sampling adds too much memory overhead.
You may also want to investigate in new issues why you have hotspots in indexOfChildren() and findSuperTypeOriginatingFrom() [the latter may be a problem similar to this one]. Never ever should you have any CPU hotspot during compile expect File IO, as compiling is not really a computationally intensive problem but should be mostly lookups.

@fedejeanne
Contributor

> BTW: you should not trust the Heap/Memory view while you are sampling. Sampling adds too much memory overhead.

I wasn't aware of that, thank you for the tip!

> You may also want to investigate in new issues why you have hotspots in indexOfChildren() and findSuperTypeOriginatingFrom() [the latter may be a problem similar to this one]. Never ever should you have any CPU hotspot during compile expect File IO, as compiling is not really a computationally intensive problem but should be mostly lookups.

Hm, I didn't know that either. Do you have any hints about what I should look for when debugging these methods? I honestly assumed that these hotspots were "normal" since they were already there in the sampling I did with the good old fast 2024-06 in the original issue (#3327 (comment))

[screenshot of the 2024-06 sampling from #3327 omitted]

Also, this PR brings the number of invocations of those 2 methods almost down to those of 2024-06, so I don't know how much more there is to improve there. I can certainly take a look at them after merging this PR, though (since this PR improves both of them, I mean). I am always happy to help track down and improve performance bottlenecks :-)

@jukzi
Contributor

jukzi commented Dec 4, 2024

> Do you have any hints about what I should look for when debugging these methods?

Same procedure: the first step is to gather data on why it is slow, typically a breakpoint where it takes the longest time (i.e. most elements in the loop) or statistics of the arguments used. Second: create a reproducer. At best you could even write a JUnit test that triggers the ill behavior. At that point others can easily jump in to find a solution.
The potential gain is a reduction of up to another 1.5 minutes (total CPU time):
[screenshot of the hotspot CPU times omitted]

@stephan-herrmann
Contributor Author

> I'm still a bit uneasy about using TypeBinding in a hashed collection, but looking at the comment in ReferenceBinding.hashCode() this could still be safe as long as no UnresolvedReferenceBindings are in the lot (because those intentionally break the uniqueness of hashCode()).

> 1. This PR uses hashCode()/equals() only during the search, and the cache is not kept on the heap. The search itself should not (but might it? that would be bad) modify the hash code during the search.

no modification in sight

> 2. The hashCode() implementation looks horribly bad: equals() and hashCode() should always be in sync. I wonder where this hash code is already used and whether it would not be better fixed.

Please let's be extremely careful about "fixing". The current implementation is necessary for semantic correctness and has served this purpose without any reported issues for 20 years.

In fact, scanning the compiler code, there are several locations using TypeBinding as keys in a HashMap. Again with no reported anomalies.

Given those precedents, I'll stop worrying at this particular location. If desired, the general topic can be picked up in a follow-up ticket.

@stephan-herrmann
Contributor Author

> You may also want to investigate in new issues why you have hotspots in indexOfChildren() and findSuperTypeOriginatingFrom() [the latter may be a problem similar to this one].

findSuperTypeOriginatingFrom() is part of the algorithm being optimized here, so no surprise if you ask me.

> Never ever should you have any CPU hotspot during compile expect File IO, as compiling is not really a computationally intensive problem but should be mostly lookups.

I'm not sure I'm following the reasoning here. Perhaps there's an unintended "not" involved? Or should it read except instead of expect? As you see I'm clueless :)

@stephan-herrmann
Contributor Author

> FTR, even in its current state and with the bad implementations of hashCode() / equals(), I rebased this PR on master (so #3384 is also applied) and ran a full build: it took ~12min 👏

So, in the real-world case there seems to be something that is not captured in the reproducer. With this gain I agree that there is value in the 3rd level optimization and will merge it, so it will get thorough field testing before release.

@stephan-herrmann stephan-herrmann marked this pull request as ready for review December 4, 2024 23:42
@stephan-herrmann stephan-herrmann added this to the 4.35 M1 milestone Dec 4, 2024
@stephan-herrmann stephan-herrmann merged commit 7ebcecc into eclipse-jdt:master Dec 4, 2024
10 checks passed
@stephan-herrmann stephan-herrmann deleted the issue3327_2 branch December 4, 2024 23:44
@fedejeanne
Contributor

> FTR, even in its current state and with the bad implementations of hashCode() / equals(), I rebased this PR on master (so #3384 is also applied) and ran a full build: it took ~12min 👏
>
> So, in the real-world case there seems to be something that is not captured in the reproducer. With this gain I agree that there is value in the 3rd level optimization and will merge it, so it will get thorough field testing before release.

Great, thank you!

I'm looking forward to tomorrow's I-BUILD and I will use it ASAP.

@@ -1277,11 +1283,11 @@ protected List<Pair<TypeBinding>> allSuperPairsWithCommonGenericType(TypeBinding
if (tSuper != null && s.isParameterizedType() && tSuper.isParameterizedType()) { // optimization #1 again

Assuming s.isParameterizedType() is a very cheap operation and findSuperTypeOriginatingFrom() is not, the lookup of tSuper can be guarded by s.isParameterizedType():

// cheap isParameterizedType() check first, expensive hierarchy lookup only if needed
TypeBinding tSuper = s.isParameterizedType() ? t.findSuperTypeOriginatingFrom(s) : null;
if (tSuper != null && tSuper.isParameterizedType()) {
  result.add(new Pair<>(s, tSuper));
}


I can try this tomorrow and report back. Thank you for the hint!


Incredible... I had to sample twice because I couldn't believe my eyes 😄

Thank you, @szarnekow! --> #3411
