Skip to content
This repository has been archived by the owner on Apr 14, 2022. It is now read-only.

Mitigate some of #495 via MRO member selection and analysis set optimization #517

Merged
merged 4 commits into from
Jan 2, 2019

Conversation

jakebailey
Copy link
Member

This isn't a complete fix, but does seem to improve #495 in some cases. Adds back the early MRO return removed in #277, so now large class hierarchies won't over-propagate types (where some of the trouble with fastai happens do to the Transform class). I also optimized AnalysisHashSet a little to do length checks when possible.

There are still times when things get caught up comparing unions for a while, but it seems to be nondeterministic.

if (Comparer == other.Comparer) {
if (lCount != rCount) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This optimization may not work. Code below gets hash codes and collapses them into hash sets, which means that two buckets arrays with different amount of items will be considered equal. I can't say if that is correct or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the comparers are the same, should there ever be a situation where two sets have different contents that are the same? To me that seems related to the issue of keys sometimes changing after being added (that Debug.Fail), except that this specific check is costly to do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It maybe possible as a result of some AnalysisSet merges or if there is more than one default Bucket instance with 0 hash set.

Btw, have you tested if there is any significant benefit of this optimization?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm not sure how merging matters for this. Wouldn't Buckets.Count be updated when that occurs? Or does a union not actually do any unioning and instead concatenates?

I don't have any numbers, but it seemed like it shifted some CPU time out to other places when I was working on fastai. Namely, I started with the count checks, then saw a lot of CPU time used on accessing Count (since it's put behind a lock), then moved the access above.

Most of the CPU time in my testing comes from set comparison and tuples getting too large, which I wasn't able to cleanly limit. The MRO lookup change curbs things in some cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or does a union not actually do any unioning and instead concatenates?

I've seen sets with empty buckets under debugger, but I have no idea how to reproduce it and I don't know if that is a buggy result or an expected one. This code is quite muddy. @MikhailArkhipov , what is your opinion?

Namely, I started with the count checks, then saw a lot of CPU time used on accessing Count (since it's put behind a lock), then moved the access above.

Just for the reference. Thin locks are really fast, but I don't know if we have thin or fat lock here. GetHashCode is overridden, but it isn't the only thing that can promote thin lock to fat lock.

The MRO lookup change curbs things in some cases.

Yes, MRO fix is clear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The locking I'm referring to is here: https://github.com/Microsoft/python-language-server/blob/cb1c3dfbad1c7a521857a5dae8eea7c94225a096/src/Analysis/Engine/Impl/AnalysisHashSet.cs#L95-L106

Now that I look more closely, most other functions grab Bucket at the start. The original function here did (by grabbing lBuckets and maybe grabbing rBuckets). Perhaps I should have moved those upward and operated on them instead of introducing two extra possible locks.

I'm happy to remove some of the more suspect length checks if you think this may be an issue (but some of them should still be valid).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, the thin vs fat locking is past lock and more to do with object layout, so me showing where the locking is doesn't matter. Oops.

Copy link

@MikhailArkhipov MikhailArkhipov Jan 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different numbers of items (types) in buckets and yet equality seem odd to me. I am not overly concerned with keeping correct number of items in buckets since collecting types from various type of calls is generally not what we want to be doing. Ex, def f(a): return a called with int now has int return. Called then with str it is suddenly show as returning int, str. But that's not true, it is not returning combo or a union. And then user adds code calling with bool and it now it returns triplet... Which has nothing to do with the nature of the code.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New analysis has special types for call arguments and generics so the said function is considered returning type of its argument so no need to collect possible outcomes. f(1). will show completion for int members and f('a'). will yield str members without need to hold unions.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM. I'd build insiders and let people try it out.

@jakebailey jakebailey merged commit 0893287 into microsoft:release Jan 2, 2019
@jakebailey jakebailey added this to the Dec 2018.3 milestone Jan 18, 2019
AlexanderSher pushed a commit to AlexanderSher/python-language-server that referenced this pull request Feb 4, 2019
…set optimization (microsoft#517)

This isn't a complete fix, but does seem to improve microsoft#495 in some cases. Adds back the early MRO return removed in microsoft#277, so now large class hierarchies won't over-propagate types (where some of the trouble with fastai happens do to the Transform class). I also optimized AnalysisHashSet a little to do length checks when possible.

There are still times when things get caught up comparing unions for a while, but it seems to be nondeterministic.
AlexanderSher added a commit that referenced this pull request Feb 4, 2019
* Fix #470 (#478)

* Fix various common exceptions, log on a failed directory load (#498)

* rethrow exception after telemetry instead of disposing

* log an error instead of crashing when hitting some exceptions loading files, fix nullref when shutting down without an idle tracker

* fix short circuiting of question mark checks to prevent null from being returned

* print name of exception instead of relying on ToString

* don't use MaybeEnumerate

* upgrade message to Show

* Mitigate some of #495 via MRO member selection and analysis set optimization (#517)

This isn't a complete fix, but does seem to improve #495 in some cases. Adds back the early MRO return removed in #277, so now large class hierarchies won't over-propagate types (where some of the trouble with fastai happens do to the Transform class). I also optimized AnalysisHashSet a little to do length checks when possible.

There are still times when things get caught up comparing unions for a while, but it seems to be nondeterministic.

* Two path resolution bug fixes

- Fix #509: PLS doesn't flag error in a relative import if it is found in another root directory (#519)
- Fix #510: PLS doesn't resolve relative paths correctly

* Catch exceptions when importing from search paths (#525)

* catch exceptions when importing from search paths

* retry instead of not found
@jakebailey jakebailey deleted the fix-cpu-bug branch February 14, 2019 20:21
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants