-
Notifications
You must be signed in to change notification settings - Fork 133
Mitigate some of #495 via MRO member selection and analysis set optimization #517
Conversation
if (Comparer == other.Comparer) { | ||
if (lCount != rCount) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This optimization may not work. Code below gets hash codes and collapses them into hash sets, which means that two buckets arrays with different amount of items will be considered equal. I can't say if that is correct or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the comparers are the same, should there ever be a situation where two sets have different contents that are the same? To me that seems related to the issue of keys sometimes changing after being added (that Debug.Fail
), except that this specific check is costly to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It maybe possible as a result of some AnalysisSet
merges or if there is more than one default Bucket
instance with 0
hash set.
Btw, have you tested if there is any significant benefit of this optimization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'm not sure how merging matters for this. Wouldn't Buckets.Count
be updated when that occurs? Or does a union not actually do any unioning and instead concatenates?
I don't have any numbers, but it seemed like it shifted some CPU time out to other places when I was working on fastai. Namely, I started with the count checks, then saw a lot of CPU time used on accessing Count
(since it's put behind a lock), then moved the access above.
Most of the CPU time in my testing comes from set comparison and tuples getting too large, which I wasn't able to cleanly limit. The MRO lookup change curbs things in some cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or does a union not actually do any unioning and instead concatenates?
I've seen sets with empty buckets under debugger, but I have no idea how to reproduce it and I don't know if that is a buggy result or an expected one. This code is quite muddy. @MikhailArkhipov , what is your opinion?
Namely, I started with the count checks, then saw a lot of CPU time used on accessing Count (since it's put behind a lock), then moved the access above.
Just for the reference. Thin locks are really fast, but I don't know if we have thin or fat lock here. GetHashCode
is overridden, but it isn't the only thing that can promote thin lock to fat lock.
The MRO lookup change curbs things in some cases.
Yes, MRO fix is clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The locking I'm referring to is here: https://github.com/Microsoft/python-language-server/blob/cb1c3dfbad1c7a521857a5dae8eea7c94225a096/src/Analysis/Engine/Impl/AnalysisHashSet.cs#L95-L106
Now that I look more closely, most other functions grab Bucket
at the start. The original function here did (by grabbing lBuckets
and maybe grabbing rBuckets
). Perhaps I should have moved those upward and operated on them instead of introducing two extra possible locks.
I'm happy to remove some of the more suspect length checks if you think this may be an issue (but some of them should still be valid).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, the thin vs fat locking is past lock
and more to do with object layout, so me showing where the locking is doesn't matter. Oops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Different numbers of items (types) in buckets and yet equality seem odd to me. I am not overly concerned with keeping correct number of items in buckets since collecting types from various type of calls is generally not what we want to be doing. Ex, def f(a): return a
called with int
now has int
return. Called then with str
it is suddenly show as returning int, str
. But that's not true, it is not returning combo or a union. And then user adds code calling with bool and it now it returns triplet... Which has nothing to do with the nature of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New analysis has special types for call arguments and generics so the said function is considered returning type of its argument so no need to collect possible outcomes. f(1).
will show completion for int
members and f('a').
will yield str
members without need to hold unions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general LGTM. I'd build insiders and let people try it out.
…set optimization (microsoft#517) This isn't a complete fix, but does seem to improve microsoft#495 in some cases. Adds back the early MRO return removed in microsoft#277, so now large class hierarchies won't over-propagate types (where some of the trouble with fastai happens do to the Transform class). I also optimized AnalysisHashSet a little to do length checks when possible. There are still times when things get caught up comparing unions for a while, but it seems to be nondeterministic.
* Fix #470 (#478) * Fix various common exceptions, log on a failed directory load (#498) * rethrow exception after telemetry instead of disposing * log an error instead of crashing when hitting some exceptions loading files, fix nullref when shutting down without an idle tracker * fix short circuiting of question mark checks to prevent null from being returned * print name of exception instead of relying on ToString * don't use MaybeEnumerate * upgrade message to Show * Mitigate some of #495 via MRO member selection and analysis set optimization (#517) This isn't a complete fix, but does seem to improve #495 in some cases. Adds back the early MRO return removed in #277, so now large class hierarchies won't over-propagate types (where some of the trouble with fastai happens do to the Transform class). I also optimized AnalysisHashSet a little to do length checks when possible. There are still times when things get caught up comparing unions for a while, but it seems to be nondeterministic. * Two path resolution bug fixes - Fix #509: PLS doesn't flag error in a relative import if it is found in another root directory (#519) - Fix #510: PLS doesn't resolve relative paths correctly * Catch exceptions when importing from search paths (#525) * catch exceptions when importing from search paths * retry instead of not found
This isn't a complete fix, but does seem to improve #495 in some cases. Adds back the early MRO return removed in #277, so now large class hierarchies won't over-propagate types (where some of the trouble with fastai happens do to the Transform class). I also optimized AnalysisHashSet a little to do length checks when possible.
There are still times when things get caught up comparing unions for a while, but it seems to be nondeterministic.