Running pylint twice on the same file while changing the file in between runs doesn't work #5888

Closed
agrendath opened this issue Mar 10, 2022 · 11 comments · Fixed by #6024

@agrendath

Question

We wanted to run pylint programmatically on the same file, test.py, multiple times, changing the file's contents between runs. We assumed that a new reporter and a new Run object for each call would be enough, but every run outputs the same report, based on the file's contents at the time of the first run.

Below is the code we wrote to investigate further. It uses two different reporters writing to two different kinds of output stream, to rule out the first reporter being reused in the second call.

from pylint import lint
from pylint.reporters.text import TextReporter, ColorizedTextReporter
from io import StringIO

class WritableObject(object):
    "dummy output stream for pylint"

    def __init__(self):
        self.content = ""

    def write(self, st):
        "dummy write"
        self.content += st

    def read(self):
        "dummy read"
        return self.content


pylint_output = WritableObject()
newreporter = TextReporter(pylint_output)
lint.Run(["test.py"], reporter=newreporter, exit=False)
print(pylint_output.read())

test = input("continue? (after changing test.py): ")

pylint_output2 = StringIO()
newreporter2 = ColorizedTextReporter(pylint_output2)
lint.Run(["test.py"], reporter=newreporter2, exit=False)
pylint_output2.seek(0)
print(pylint_output2.read())

The file test.py in our tests contained simply

i = 5
print(a)

and we just changed a to b between the Run calls (that is, while the test program was waiting for input).

As the attached screenshot shows, the report for the original contents of test.py is given twice, even though we changed and saved test.py between the calls. Our workaround for now is to keep creating new files (test1.py, test2.py, and so on), but that is obviously not a good solution.
Screenshot from 2022-03-10 10-48-45

Documentation for future user

Probably either in the "Running pylint" section or in the FAQ.

Additional context

No response

@agrendath agrendath added the Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling label Mar 10, 2022
@DanielNoord
Collaborator

I did a little digging and this is because of the following lines:
https://github.com/PyCQA/pylint/blob/1ebdc8c59f31962db0244a79eaab9b8d90a87baf/pylint/lint/pylinter.py#L1264-L1268

Because we re-use the constant MANAGER within the same process, the second run uses the cached module.

That said, I'm not sure if this is something we should change there, as the caching of modules is probably quite useful for performance reasons. What is your use-case?
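
To make the caching behaviour described above concrete, here is a minimal sketch of how a cache that is never invalidated returns a stale tree after the file changes. ToyManager, names_used and demo are hypothetical names invented for this illustration, the cache is keyed by file path for simplicity, and the standard-library ast module stands in for astroid; this is not astroid's actual implementation.

```python
import ast
import os
import tempfile


class ToyManager:
    """Hypothetical stand-in for a parser front end that caches by path."""

    def __init__(self):
        self.cache = {}

    def ast_from_file(self, path):
        # A cache hit returns the old tree without reading the file again.
        if path not in self.cache:
            with open(path, encoding="utf-8") as handle:
                self.cache[path] = ast.parse(handle.read())
        return self.cache[path]

    def clear_cache(self):
        self.cache.clear()


def names_used(tree):
    """Collect the identifiers that are read (not assigned) in the tree."""
    return sorted(
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    )


def demo():
    manager = ToyManager()
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "test.py")

        with open(path, "w", encoding="utf-8") as handle:
            handle.write("i = 5\nprint(a)\n")
        first = names_used(manager.ast_from_file(path))

        # Change a -> b on disk; the cached tree is still the old one.
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("i = 5\nprint(b)\n")
        stale = names_used(manager.ast_from_file(path))

        # Only after the cache is emptied is the file parsed again.
        manager.clear_cache()
        fresh = names_used(manager.ast_from_file(path))
    return first, stale, fresh
```

In this toy version the second lookup still reports a, which mirrors the symptom in the original report; the trade-off is the one mentioned above, since skipping the re-parse is what makes repeated lookups cheap.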

@DanielNoord DanielNoord added Discussion 🤔 and removed Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling labels Mar 10, 2022
@agrendath
Author

We're using pylint inside pyodide, so in a WebAssembly environment. It was hard to get working because of some dependencies, but we managed using lint.Run. The user enters code in the browser, we save it to a file called test.py and run pylint on it. For now we have to keep creating test1.py, test2.py, etc. for each submission, which is of course a terrible solution. Every other way we've tried to use pylint programmatically has caused problems, because it tries to create subprocesses, calls sys.exit, or writes to stdout, which is hard to access.

To be honest, I don't fully understand pylint's source code, but maybe it would be possible to add an optional parameter to disable caching, or something similar?

@DanielNoord
Collaborator

I'll be honest I don't fully understand pylint's source code but maybe it would be possible to add an optional parameter to disable caching or something?

That might be something we could explore, although I'm hesitant to change how the caching in astroid works. Even in your case, the caching due to using one Manager does speed up the process.

I have no experience with WebAssembly, but I guess you're unable to run python run_pylint_over_file.py twice?

@agrendath
Author

agrendath commented Mar 10, 2022

We're running the JavaScript code that calls pylint via pyodide.runPython. When pyodide starts, it has to load and install all the necessary packages before running any code. That takes far too long, so we cannot restart the process.

Maybe it would be possible to add a function to clear the cache? Or maybe that's already possible in which case I'm not quite sure of how you would go about doing that.

@DanielNoord
Collaborator

You could try and see if adding:

import astroid
from astroid.manager import AstroidManager

astroid.MANAGER = AstroidManager()

between the two pylint calls works?

MANAGER gets imported from:
https://github.com/PyCQA/astroid/blob/f2d0f4213f6f527a0d2309164ed43778f1e06184/astroid/astroid_manager.py

@agrendath
Author

Just tried it, doesn't fix the issue unfortunately.

This is the exact code I tried, again changing a -> b in test.py in between runs.
Screenshot from 2022-03-10 13-48-59
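
One plausible reading of why rebinding astroid.MANAGER had no effect, sketched with hypothetical names: a module that copied the reference at import time keeps pointing at the old object, and if the manager shares its state across instances (a Borg-style pattern, which is assumed here rather than taken from this thread), even a brand-new instance sees the old cache. BorgManager, LIBRARY_MANAGER and consumer_manager below are invented for the illustration and are not astroid or pylint names.

```python
class BorgManager:
    """Hypothetical manager whose instances all share one state dict."""

    _shared_state = {}

    def __init__(self):
        # Every instance points its attribute dict at the same class-level dict.
        self.__dict__ = BorgManager._shared_state
        if "cache" not in self.__dict__:
            self.cache = {}


# Plays the role of the library-level constant.
LIBRARY_MANAGER = BorgManager()

# Plays the role of a second module that copied the reference at import time.
consumer_manager = LIBRARY_MANAGER

consumer_manager.cache["test"] = "tree built from the first version"

# Rebinding the library-level name creates a new object...
LIBRARY_MANAGER = BorgManager()

# ...but the consumer still holds the old object,
still_old_object = consumer_manager is not LIBRARY_MANAGER
# and the new object sees the same cache anyway, because state is shared.
cache_survived = LIBRARY_MANAGER.cache == {
    "test": "tree built from the first version"
}

# Emptying the shared cache in place is visible through every reference.
consumer_manager.cache.clear()
cache_emptied = LIBRARY_MANAGER.cache == {}
```

Under these assumptions, resetting the cache itself is the only change that reaches the object the linter actually uses.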

@DanielNoord
Collaborator

Hm, I'm going to think about this some more. Perhaps others have better ideas.

@jacobtylerwalls
Member

Maybe it would be possible to add a function to clear the cache? Or maybe that's already possible in which case I'm not quite sure of how you would go about doing that.

This seems to work:

from pylint import lint
lint.pylinter.MANAGER.astroid_cache = {}

@agrendath
Author

That does indeed solve the issue, thank you so much for your help. Maybe it's still worth putting in the docs in the FAQ or something? Regardless, thanks for the help.

@Pierre-Sassoulas
Member

Related to pylint-dev/astroid#1242

@jacobtylerwalls
Member

jacobtylerwalls commented Mar 29, 2022

BTW, this is probably the better call:

from pylint.lint import pylinter
pylinter.MANAGER.clear_cache()
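
Putting the pieces from this thread together, a helper for the "same file, new contents" use-case might look roughly like the sketch below. lint_source is a hypothetical name, the default path is only an example, and the whole thing assumes a pylint version in which pylinter.MANAGER.clear_cache() is available; the Run, TextReporter and exit=False usage is taken from the reproduction script at the top of the issue.

```python
from io import StringIO


def lint_source(source, path="test.py"):
    """Write source to path, drop cached modules, and return pylint's report."""
    # Imported here so that defining the helper does not itself load pylint.
    from pylint.lint import Run, pylinter
    from pylint.reporters.text import TextReporter

    with open(path, "w", encoding="utf-8") as handle:
        handle.write(source)

    # Without this, a second call in the same process may reuse the module
    # that was parsed during the first call.
    pylinter.MANAGER.clear_cache()

    output = StringIO()
    Run([path], reporter=TextReporter(output), exit=False)
    return output.getvalue()
```

Calling lint_source("i = 5\nprint(a)\n") and then lint_source("i = 5\nprint(b)\n") in the same process should then produce one report per version of the file. Clearing the cache on every call gives up the speed-up discussed earlier in the thread, which seems an acceptable cost when each submission is a single small file.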
