Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add different configuration for different sub directories #618

Open
pylint-bot opened this issue Aug 10, 2015 · 45 comments · May be fixed by #9395
Open

Add different configuration for different sub directories #618

pylint-bot opened this issue Aug 10, 2015 · 45 comments · May be fixed by #9395
Labels
Configuration Related to configuration Enhancement ✨ Improvement to a component Hacktoberfest Help wanted 🙏 Outside help would be appreciated, good for new contributors High priority Issue with more than 10 reactions Per directory config New per directory config feature

Comments

@pylint-bot
Copy link

Originally reported by: Fergal Hainey (BitBucket: FerHai)


I'm using pytest and redefined-outer-name is a false negative in these cases as it's part of pytest's design. I'd like to run pylint over the whole project to flag things automatically before code reviews. I'd like to catch cases of redefined-outer-name outside the tests folder.

I could add comments to the modules but 1. that's a lot of comments and 2. changes will require changing as many comments.

What could be nice is looking up through basedirs of the file being linted for pylintrc files and taking config in closer pylintrcs as higher priority. Or maybe something in the project wide pylintrc to denote some config applies to a directory.

Would this be considered a good idea? I'm open to making a patch if it's considered worthwhile.


@pylint-bot
Copy link
Author

Original comment by Florian Bruhin (BitBucket: The-Compiler, GitHub: @The-Compiler?):


Personally, I'd gladly use this feature when #352 is fixed as well 😄

As I don't use __init__.py files in my tests, I currently need a custom script to run pylint over it, and set some custom options there:

files = []
for dirpath, _dirnames, filenames in os.walk('tests'):
    for fn in filenames:
        if os.path.splitext(fn)[1] == '.py':
            files.append(os.path.join(dirpath, fn))
disabled = [
    'attribute-defined-outside-init',
    'redefined-outer-name',
    'unused-argument',
    'missing-docstring',
    # #511/
    'undefined-variable',
]
no_docstring_rgx = ['^__.*__$', '^setup$']
args = (['--disable={}'.format(','.join(disabled)),
         '--no-docstring-rgx=({})'.format('|'.join(no_docstring_rgx))] +
        sys.argv[1:] + files)
subprocess.call(['pylint'] + args)

@pylint-bot
Copy link
Author

Original comment by Claudiu Popa (BitBucket: PCManticore, GitHub: @PCManticore):


Having specific configuration files per subdirectories would be nice. I'll probably use it myself,
for instance, for ignoring mock related failures in tests.

The problem is that it probably won't be trivial to do it, since the discovery of
the configuration is done before getting the files for analysis and it's baked
into the main pylint driver class through an additional dependency
(logilab.common): https://bitbucket.org/logilab/pylint/src/736b19e661d3f794a30797fd877bb49569736d57/pylint/lint.py?at=default#lint.py-457
Afterwards, each checker uses this configuration object.

Anyway, if you could spent some time on a patch and make this work, that would be really awesome.

@pylint-bot
Copy link
Author

Original comment by Claudiu Popa (BitBucket: PCManticore, GitHub: @PCManticore):


By the way, Florian, someone worked on issue #352 at the sprints, but a PR was never made. :(

@pylint-bot
Copy link
Author

Original comment by Fergal Hainey (BitBucket: FerHai):


Sounds like adding directories to the config and making the checkers aware of it would be easier than changing how config is loaded. Unless this is a feature the logilab.common dependency would enjoy too?

I'm not yet familiar with pylint and dependencies internals, but will take a look.

@pylint-bot
Copy link
Author

Original comment by Florian Bruhin (BitBucket: The-Compiler, GitHub: @The-Compiler?):


I started some work on ripping out logilab.common from pylint but got stuck on the configuration part. I still think it'd be sensible to rip out that overengineered... thing completely and replace it from scratch. Then this kind of thing could be a lot easier.

@pylint-bot
Copy link
Author

Original comment by Claudiu Popa (BitBucket: PCManticore, GitHub: @PCManticore):


Having this feature in logilab.common will not help pylint, since we want to move away from it.

@pylint-bot
Copy link
Author

Original comment by Claudiu Popa (BitBucket: PCManticore, GitHub: @PCManticore):


By the way, in the issue #28, Torsten said something about using forests of configuration. That sounds like what we're planning to do here, might be worth pinging him for more details.

@pylint-bot pylint-bot added Enhancement ✨ Improvement to a component Configuration Related to configuration labels Dec 9, 2015
@eddie-dunn
Copy link

I would like a solution where the project wide pylintrc supports ignoring some Pylint-warnings for certain folders and/or Python modules. Is anybody working on this issue? I could take a crack at it otherwise, but I have no idea where to start.

@PCManticore
Copy link
Contributor

No, no one is working currently on this issue, feel free to take a stab at it. But this issue is about introducing support for configurations per subdirectories, which might not necessarily mean support for disabling messages for particular Python modules (it would though mean that you could disable messages in a subdirectory config). Unfortunately, I have no idea right now how can the messages be disabled for particular modules, I have to think of it (probably we can't use disable for that, so we'd need to add a new option, something on the lines of scoped-disabled=a.py:no-member,b.py:unbalanced-tuple-unpacking or something of sorts).

@eddie-dunn
Copy link

You are right that this issue is not necessarily (but possibly) related to what I want to achieve. I created a separate issue for folder-specific disables here: #871

@AWhetter
Copy link
Contributor

I'm in need of this feature so I'm looking to implement it but I need to nail down a few details first. Mainly where should I look for configuration files and in what order? My current thinking is to look for and load the files in the opposite order that they are looked for at the moment. So load /etc/pylintrc first if it exists, then the PYLINTRC environment variable/home, then the current directory/the parent directory/it's parent/etc. So in this example we'd load /etc/pylintrc, then any settings that are in ~/.pylintrc would override those in /etc/pylintrc.

In terms of the command line I'm thinking we allow passing in --rcfile multiple times and it'll read the files in order. So settings in later rc files would override settings in the earlier ones.

In terms of implementation in code, my current plan is to allow a list of config file locations (or a single path) wherever we currently use config_file (eg config.OptionsManagerMixIn and config.find_pylintrc). I know that the config module is up for refactoring but I'm not sure if I'm the right person to do that. I could be convinced though as long as I was given enough guidance on what the end goal would be 😉

@PCManticore
Copy link
Contributor

PCManticore commented Apr 30, 2016

Hi @AWhetter

There are two major issues with implementing this feature, as far as I can tell:

  1. deciding how the config files are loaded, where they leave and the exact semantic (overriding, completely new files)
  2. how each configuration object gets into a checker or another and how the checker decides what configuration to use.

In the first case, I think that passing multiple --rcfile won't help that much, since each one of them would probably be tailored for one folder or another, while this feature could be left out to the part that discovers configuration files on disk.
Regarding the overriding / replacing semantics: I think that each subsequent file should be a complete pylintrc file altogether. The alternative would be for config files from lower levels to modify the state of the configuration files from upper level, but this means it won't be obvious what is changed, now the user would have to know the same logic that pylint does and apply it in order to figure out what is happening. I think it's cleaner to just use completely new configuration files, that have modified one thing or two instead of "patching" things up. Let me know if you find a convincing case about the alternative, if you care about this behaviour.
The lookup of the config files would probably need to be separated a bit after having the files for analysis. This would mean a workflow as this:

  • if one of the location we need to analyze is a folder
    • try to get the pylintrc at the folder's level, if any
    • descend into the folder and for each inner location, store a corresponding pylintrc if found or the global one
      In the end, it will result a tree of config files, as in {'location': 'cfg', 'location.inner': 'cfg1', 'location.inner.inner...': 'cfgX'}

Now this gets into the second part of the issue, which is deciding how to pass each inner object into a checker. So what we know so far is that a checker has a underlying config object and it's not necessarily the duty of the checker to know which config object it should use. What, I think, would be useful to have is this:
* when entering into a new checker, we need a new dispatch method, in the same vein as open/close, for picking up the config file. open/close can't be reused, because they are called in different parts of lint scaffolding, so it has to be something new, called before processing the token and the AST checkers (somewhere in lint.py/check_astroid_module). It should be called with the node we're intending to process and in that method, we should decide, by the .root().file of the node, which config file to use.
* the new config object, picked at the last step, is set and the checkers can use it without caring how it got there.

As far as I can think of it, this seems to be it. There are some rough edges in this proposal, but hopefully we could flesh them out. Let me know what is your opinion about it.

@AWhetter
Copy link
Contributor

I agree that multiple --rcfile flags isn't a great option so I'm happy to drop that.

I don't like the idea of always needing complete rcfiles. My main motivation for implementing this issue was that say you have a team that lints multiple projects, then the team could have a global config file for linting against the team's standard. But then there may be projects that already have their own conventions and the team would want to be able to override certain aspects of the global configuration for that project.
If pylint is in the situation where each project (and potentially subdirectories in those projects) has it's own pylintrc file then a change in the team's standards (say a reduction in line length from 80 to 79 characters) would mean changing many rcfiles across many projects.
With per-project configuration files that override the global configuration, a change in the team's linting standards would mean a change in only the global rcfile.
I do agree however that changing state of the global configuration on every directory level would get confusing. So I'd propose only looking for the first pylintrc file in a directory path instead of on every level. So each parent directory in the path would only be searched for an rcfile if it's child does not contain an rcfile. That way a global overrides apply only to the current directory and subdirectories below.

As for lookups the tree idea would work for complete pylintrc files but seems less desirable for override configs. We'd have to parse the global config every time we come across an override file, and then do the same again once we leave the overridden directory.
I need to actually look at whether or not this is plausible in the pylint code, but I wonder whether the config manager itself could handle passing overrides. It could store a stack of configuration options and push/pop from it as directories are entered and left.
For example say a directory x overrides max-line-length to 79 characters and single-line-if-stmt to yes. When we enter the directory we'd do push max-line-length=100, single-line-if-stmt=yes, and the stack would store the current values of these configuration variables. Upon leaving the directory these values would be popped back off of the stack and restored.
If a directory is entered with no overrides then a null entry is added to the stack. When popped the null entry does nothing.

@PCManticore
Copy link
Contributor

So you're saying that we could support the following protocol for a configuration:

  1. global configuration somewhere
  2. per project configuration that can patch the settings of the global configuration.
  3. per subdirectory configuration that can patch the settings of the project configuration (which is basically a version of 2.)

So, if I'll call pylint project to the following structure, this will happen:

    /project/.pylintrc applies to
              /a.py
              /b.py

            /subfolder/.pylintrc applies to
                      /c.py
                      /d/
                        /x.py
                        another_subfolder/.pylintrc applies to nothing
                                               /e.py

Inner subdirectories, other than this layout, could potentially have a pylintrc of their own that is included from the project's .pylintrc through an include directive, but that is a separate feature that we can address.

I could live with this functionality, as long as we limiting to the closest subdirectories for the root of the project, otherwise, the understanding of what's happening could get fuzzy and I don't want to make this more complicate than it is. Also, I'm thinking that currently this might introduce some unwanted behaviour for users that are having a .pylintrc in their project and, for some reason, they also have a global .pylintrc, which means the per-project .pylintrc of theirs will suddenly be a patch of the global one. We should enforce this behaviour through a flag, that users can disable if they want to have the older feedback of not patching anything, just overriding the configuration.

Regarding the implementation, I think it would be probably cleaner to have multiple config objects, corresponding to each directory's .pylintrc. One issue with your stack-based approach is that the configuration is processed before calling the checkers, which means that the processing will get entangled in the checking process, which would make things more fuzzy as well (plus you don't know exactly which directory you're processing until you're not passing a new Module node to the checkers, ideally, the configurations should be processed so far).

Let me know if there is anything you'd like to discuss, otherwise we can already start working on this and by it, I mean you. ;-)

@PCManticore PCManticore added the Blocker 🙅 Blocks the next release label May 4, 2016
@PCManticore PCManticore added this to the 1.7.0 milestone May 9, 2016
@PCManticore PCManticore removed the Blocker 🙅 Blocks the next release label May 9, 2016
@AWhetter
Copy link
Contributor

In your example how will we know what is a project level override and what is a directory level override?

I'm actually drifting more towards looking for pylintrc overrides on every level. Yes it could get complicated to figure out which rules in which pylintrc are applied to a file, but I think that this is a decision that can be left up to the user. It gives the user more flexibilty, but we should make it clear in documentation that this comes at the cost of complexity.

@PCManticore
Copy link
Contributor

Okay, I don't mind having it like that in this case. Also, we should take a look to see what other tools are doing, for instance Rubocop for Ruby and ESLint for javascript. At least in the former's case, I know they have some similar as the plan we're trying to flesh out here, would be interesting to see how they approached this problem.

@AWhetter
Copy link
Contributor

So Rubocop has quite an elegant solution. It will find the first configuration file in the directory chain, but within the config file you can choose to include a configuration from elsewhere via either a relative path, absolute path, or even a remote file. It doesn't seem to have a concept of a global config other than the built in default options. See https://github.com/bbatsov/rubocop#inheriting-from-another-configuration-file-in-the-project.

ESLint searches for configuration files on every level, similar to what I was suggesting here. See https://github.com/bbatsov/rubocop#inheriting-from-another-configuration-file-in-the-project. I now prefer Rubocop's solution though.

I think we can achieve the functionality we are looking for with an "include" option in pylint files (or a similarly named option). So we would continue to search for the nearest pylintrc, and a project subdirectory pylintrc file will have to explicitly include a project level pylintrc file to indicate that it wants to overwrite the settings in that file. This should be much easier for a user to use rather than using the pylintrc in every parent directory because the user is more in control of what overrides what. We can then use the "tree of config files" idea that you suggested here (#618 (comment)) to build config objects for each subdirectory and apply each one appropriately.

I don't think that it's a good idea for us to implement remote pylintrc files like Rubocop has because we'd be automatically downloading a python file and evaluating it. So we should stick with absolute and relative paths.

The only downside with the includes solution is that, in the case that I described where you might want a team pylintrc file to apply to a range of projects, each project pylintrc file would need to include the path to the team pylintrc file. If the path to the team file changes then many pylintrc files need to be updated. This is maybe a bit of an unusual scenario, but I think it could be fixed with something like a --global-pylintrc file option that describes the global pylintrc, and pylint will still search for near pylintrc files to override that with still. I think this something to consider in another iteration, and the "includes" solution can be implemented on its own.

@PCManticore
Copy link
Contributor

I think this approach sounds very good, so I'm +1 for doing it as is.

@kernc
Copy link

kernc commented May 26, 2016

Regarding the overriding / replacing semantics: I think that each subsequent file should be a complete pylintrc file altogether. ... Let me know if you find a convincing case about the alternative, if you care about this behaviour.

We have a project-wide lint config we like. But for tests in the tests dir, we would like to disable line-too-long and missing-docstring.


Ftr, introducing an include=/path/to/other/pylintrc directive is in line with #175.

@AWhetter
Copy link
Contributor

AWhetter commented Jun 10, 2016

I started the implementation of this but I felt like I was really shoehorning the code into what already there. So instead I need to do some refactoring in the lead up resolving the actual issue:

  • Move the parallel execution code into Runner (See Move the parallel execution code into Runner #936. This is that it will be easier to pull config code of PyLinter).
  • Pull command line parsing code either into the Runner or preferably into something that creates the runner (See Pull command line parsing into Runner #972).
  • Pull config code out of PyLinter so that it can take a config object instead. (Creating a PyLinter needs to be cheap as it's going to need to be done per file at first. There may need to be some work done to figure out how extensions will register their options and defaults onto the config. It may be that a PyLinter looks at a checker when loading it and manipulates it's config object accordingly. Or it could be that the config object loads the checkers and does the manipulation on itself.)
  • Pull config code out of checkers. (Creating a checker should also be cheap. It'll be done every time we make a pylinter.).
  • Make the Runner find and distribute config objects to pylinters. (This should be done via some config store object that maps paths to config objects via an internal cache. The runner should cache linters in a similar fashion so that we aren't constantly creating linters for directories that use a previously loaded config. Config files should still be complete pylintrc files at this point.). See Move expand_files outside of PyLinter #973 for part of the solution for this task.
  • Add "Include" directive to config files.

It's worth noting that we can skip steps 2, 3, and 4, but we'll see a slight performance hit because of it. So I'll do that and then post the other steps as issues afterwards.

@PCManticore
Copy link
Contributor

Sounds good overall. One small comment regarding Or it could be that the config object loads the checkers and does the manipulation on itself. This doesn't sound a good idea, since the configuration shouldn't have this capability of loading the checkers, for having complete separation of concerns. Apart of this comment, I like this plan. Please create new issues when implementing them, so it would be easier to track than a checkbox.

@dineshtrivedi
Copy link

Hello guys, I just want to check if anyone has found an alternative way to do this at the moment?

Thank you

@Pierre-Sassoulas
Copy link
Member

Pierre-Sassoulas commented Nov 19, 2021

This issue is tracked in project as 'per directory configuration', there's a bunch of issues that are step towards this goal and that everyone can contribute too.

@dineshtrivedi
Copy link

Thank you for the response @Pierre-Sassoulas
I will check it out and see if I am able to contribute

@DanielNoord
Copy link
Collaborator

DanielNoord commented Jun 4, 2022

I'm working on a new approach in:
#6789

Instead of doing all the refactoring that the discussion here has led to, I think there is a much simpler approach possible which I have outlined in that PR.

  1. Create a basic Namespace
  2. Iterate over all files_or_modules and see if we can create additional Namespaces for subdirectories
  3. Look up the correct Namespace before checking a file

With #6789 we already check which namespace we need before checking a file.

We only need a good system to iterate over all directories and create Namespaces for them if we encounter a new configuration file. That should be relatively easy now that the parsing of configuration is in one straightforward method. I'd welcome any PRs that tries to implement step 2 😄

@simon-liebehenschel
Copy link

simon-liebehenschel commented Jul 31, 2022

I did not find if this already discussed, but if you will take a look at how Mypy allows to configure the same, you will see an easy (for the end user) solution:

[mypy-my_directory.*,my_another_module.*,something_else.*]
rule_name_that_I_want_to_set_for_the_mentioned_dirs = false
another_rule_only_per_these_dirs = true
  • No multiple configs. No config overlapping, duplicate rules, etc.
  • Easy settings management.
  • All in one file.

Can not be it done for pylint in the same way?

Demo pylint config:

# Global rules
disable = something, something_else


# Specify rules per specific directories
[pylint-tests,deploy.tests]
disable = 
    # Do not complain about Pytest fixtures
    redefined-outer-name,

@christopherpickering
Copy link
Contributor

@AIGeneratedUsername I did it through per-file-ignores in a custom plugin. #3767 (comment)

@nickurak
Copy link

nickurak commented May 8, 2024

A complication I just thought of: if the configuration files can load plugins, what does it mean to have a plugin enabled for one subdirectory and not another? If plugins can't be unloaded at runtime (or perhaps disabled temporarily) it would be hard for that concept to work with different plugin configurations in different contexts with a single invocation of pylint.

Edited to add: That could be valuable though. I've been using the pylint-pytest plugin for example, and it would be nice to be super-confident that it's not going to have any impact outside of my test subdirectory.

@DanielNoord
Copy link
Collaborator

Yes, this is definitely a concern that we should take into consideration.

@imomaliev
Copy link
Contributor

@nickurak I think at least for the initial version of this feature, we could ignore per folder plugins functionality.

@DanielNoord
Copy link
Collaborator

I think @nickurak makes a good point. We can't really ignore this as subdirectories could trigger the loading of plugins which would affect the rest of the run. We would need to explicitly disable plugins if we want to avoid this.

@imomaliev
Copy link
Contributor

We would need to explicitly disable plugins if we want to avoid this.

Yeah, this is what I meant. Also, I am not sure how dangerous executing per folder plugins could be, I feel like this could be a possible attack vector.

@nkalpakis21
Copy link

i see this has been open for awhile. is there trouble getting a dev to fix this issue? what is blocking this for so long

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Configuration Related to configuration Enhancement ✨ Improvement to a component Hacktoberfest Help wanted 🙏 Outside help would be appreciated, good for new contributors High priority Issue with more than 10 reactions Per directory config New per directory config feature
Projects
None yet