Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Astroid calls to ancestors are uncached and slow for templates and generics in ClassDef.ancestors #1115

Open
robin-wayve opened this issue Aug 2, 2021 · 12 comments

Comments

@robin-wayve
Copy link

robin-wayve commented Aug 2, 2021

tl;dr I have been trying to figure out why pylint is slow for us and profiling points to astroid. I'd appreciate some pointers on how I could get a clearer signal on what is going on.

Versions

astroid 2.6.5
pylint 2.9.6
python 3.7.9

Description

We run pylint against individual packages/modules in our codebase and the performance is highly variable. I've been trying to figure out why some runs are slow with yappi, similar to how it was done in #610.

One such module of ours has about 1000 lines of code and has become infeasibly slow on the versions listed above (I gave up after waiting 1 hour with the profiler on). I managed to get it to complete a profile in ~30 minutes by limiting the run to a single 200 line file. It does complete in under 5 minutes on our current versions: pylint 2.7.4 with astroid 2.5.2 -- I haven't captured a profile on these versions yet, but I could if that would be helpful.

I'm not able to share the code since it is proprietary, and the profile may contain sensitive information. I may attempt to scrub the profile and share the dot graph if that would be of interest. Dot graph shared below.

Profling result

The hottest node from the callgrind was this by quite some margin:

Name.infer astroid/node_classes.py:325 57.62% (16.12%) 82185683×

I believe that refers to this code: https://github.com/PyCQA/astroid/blob/v2.6.5/astroid/node_classes.py#L325-L379

This node has quite a lot of edges making it hard to see exactly what is going on but I hope this an interesting enough starting point for us to figure out what the cause of the bottleneck could be?

@robin-wayve
Copy link
Author

robin-wayve commented Aug 2, 2021

There's quite a few other details I'm not sure whether to include:

  • to get a useful profile, this was done with --jobs=1: it does run faster with --jobs=0 but it's still quite slow. I more or less copied this nickdrozd/pylint@dad6955 (thanks @nickdrozd!)
  • I believe our rcfile is relatively boring apart from disabling quite a lot of pylint's checkers
  • Other options we're using
--persistent=n --unsafe-load-any-extension=y

@Pierre-Sassoulas
Copy link
Member

Thank you for opening the issue. It makes sense that the inference would be what is costing a lot on astroid. We'd need to dig deeper and analyse what inferences exactly are taking more time than they should and probably in what pylint checker.

@Pierre-Sassoulas
Copy link
Member

Coule you try to single out deep recursion call, if possible ?

@robin-wayve
Copy link
Author

I tidied up the graph, hope this will help:
pylint296_scrubbed

@Pierre-Sassoulas
Copy link
Member

Looking at your graph it looks like ClassDef.ancestors is taking quite a lot of time. Is there a lot of inheritance in your code ? Maybe this is something what we needs to cache.

@robin-wayve
Copy link
Author

robin-wayve commented Aug 2, 2021

Looking at your graph it looks like ClassDef.ancestors is taking quite a lot of time. Is there a lot of inheritance in your code ? Maybe this is something what we needs to cache.

Yes, and the inherited classes are also using Templates / Generics which I think is another known slow thing?

@Pierre-Sassoulas Pierre-Sassoulas changed the title slow pylint performance with a hot spot in astroid Astroid calls to ancestors are uncached and slow for templates and generics in ClassDef.ancestors Aug 2, 2021
@Pierre-Sassoulas
Copy link
Member

I changed the name of the issue even if this might not be the only problem with the slowness of astroid on your code. This is the one issue that jump to the eye and we already had trouble with this before. Thank you for providing a graphs this is really helpful.

@nickdrozd
Copy link
Contributor

That red box for Name.infer is an opportunity for improvement.

@robin-wayve
Copy link
Author

robin-wayve commented Aug 6, 2021

I was also going to ask: is there a way I could figure out the minimum set of pylint rules to disable in this context to keep my runs fast in the meantime?

@Pierre-Sassoulas
Copy link
Member

There is no easy way right now. We can't profile each checker independently (and they are often intertwined) There is an issue opened in pylint to provide the tools for that: pylint-dev/pylint#4067.

@dwo
Copy link

dwo commented Aug 7, 2021

@Pierre-Sassoulas I started a rough sketch of caching ClassDef.ancestors in #1120 which still fails two tests, but is this roughly the right idea?

@Pierre-Sassoulas
Copy link
Member

Yeah, I think it's smoothly the right idea 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants