Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redesign import handling #8584

Closed
JukkaL opened this issue Mar 25, 2020 · 10 comments
Closed

Redesign import handling #8584

JukkaL opened this issue Mar 25, 2020 · 10 comments
Labels
meta Issues tracking a broad area of work needs discussion priority-0-high

Comments

@JukkaL
Copy link
Collaborator

JukkaL commented Mar 25, 2020

The way mypy deals with imports is quite confusing. I'm proposing that we reconsider how various import-related features work. There are existing issues that cover some of these, but I want to discuss the changes in a central location here, since many of these are related.

I'd love to hear what everybody thinks about this proposal! My idea is that all (or most) of the changes below would be implemented as part of a big overhaul, though likely not in a single mypy release. I think that the current situation is sufficiently problematic that small incremental changes aren't enough.

1. Enable namespace packages by default

Mypy would support namespace packages by default. Currently --namespace-packages is needed, which is confusing for new users, in particular.

2. Generalize specifying namespace packages on the command line

Allow passing namespace packages using all the supported ways of specifying which files to type check on the command line. (See also #5759.)

3. Require package roots to be explicitly specified

To support namespace packages properly, the way mypy currently infers module name from a source file path by looking for __init__.py files can't be used, since namespace packages don't contain __init__.py files. Instead, all package roots would need to be explicitly defined using the mypy module search path. Some package roots will be enabled by default, such as the current directory and typeshed stub directories.

We may want to tweak how the module search path is specified, since many additional users might need to use it.

4. Use modular typeshed, and don't ship (most) third party package stubs with mypy

This is unrelated to namespace packages, but will be essential to allow typeshed to grow significantly. As third-party stubs would now be installed as packages, this would also make PEP-561 and non-PEP-561 packages behave closer to each other (for both you'll have to install some package first). Currently PEP 561 packages require a very different workflow, which is confusing.

This assumes that we'll actually switch to a modular typeshed. I'm hoping that we'll make progress in the next few months.

5. Allow selectively processing arbitrary installed packages

Make it possible to opt-in to type checking individual installed packages, even if they don't support PEP 561. This was proposed in #8545. This would be make it easier to get basic support for many third party packages that don't have stubs and don't support PEP 561.

This isn't enabled by default since some packages won't work properly or at all (C extensions, for example). One option would be disallow this option by default for popular packages with known issues.

6. Rethink import related error messages

Typical import related error messages should give explicit instructions about how to solve the issue. Ignoring missing imports should rarely be needed, and not recommended in the documentation. Examples of what might be suggested:

  • Install stub package <package> (this will be particularly important once we move to modular typeshed)
  • Install package <package> (in case it's known to support PEP 561)
  • Give the command line options that allow using an installed package (in case stubs don't exist, but the package is importable)
  • Update the module search path (if it doesn't look like a third-party library module)

7. Add flag to use the old module name inference behavior

Provide a backward compatibility option to use the old behavior of inferring module names, as migrating from the old behavior can be non-trivial for large projects. Still, the key of this proposal is that the new, backward incompatible behavior will be enabled by default, and should be used by most users.

@warsaw
Copy link
Member

warsaw commented Mar 30, 2020

This may have gotten buried in the original thread, but see this comment about my PEP 420 change not working for nested namespace packages, which needs to be fixed.

@ashb
Copy link

ashb commented Sep 9, 2020

This proposal gets a 👍 from me as a dev working on on Apache Airflow

@jnsnow
Copy link

jnsnow commented Oct 21, 2020

PEP420 support by default would be quite nice: it's unintuitive that something that works perfectly fine in pylint and setuptools and at runtime doesn't seem to work in mypy.

It's also quite confusing how, if you were in a git tree root, these two commands are almost, but not quite, the same:

> mypy namespace/
> mypy -p namespace

Checking "as files" vs "as a package" has implications for the fully qualified module names that mypy uses to write configuration against. Both seem to work, but they diverge at a point. It would be nice, as in #5759, to unify this check to eliminate divergent behaviour that is in all likelihood being incurred only accidentally.

I have some questions about what requiring a package root to be specified looks like, but as long as it assumed your CWD as a root, that would probably be a good default. I am not sure what should happen if your CWD is inside a subpackage, though. Anecdotally, one of the wonkiest aspects of pylint is how it tries to "guess" your module structure in order to enable it to be run from CWD inside of a subpackage -- but it does do this guessing incorrectly at times, making the feature unreliable. I have generally settled from only ever running python tools from my package/namespace root and hoping for the best.

One thing I have struggled with as a newcomer to Python in general (I started writing seriously under Python 3.6) is the pipeline between "single python script with no entry point" to "python package with setup.py and one or more modules" all the way to "python package with multiple hierarchical subpackages". It isn't actually always clear in which contexts tools like mypy are expected to run and work properly. Which directory should I be in, how should my imports be written, which invocation of mypy should I use? Unfortunately, there's no consensus; setuptools, pylint, mypy, and other tools all have different behaviors.

At some point, I wonder if it's worth attempting to ratify a "good tool behavior" PEP that explains how various third party utilities should attempt to understand and process file structures, in a bid to try and standardize disparate behavior between popular tools. Maybe something like that would only work after a lot of the Pyproject/setuptools/toml tumult is finally over and done with.

Cheers from the QEMU upstream,
--js

@timstaley
Copy link

It's also quite confusing how, if you were in a git tree root, these two commands are almost, but not quite, the same:

> mypy namespace/
> mypy -p namespace

Just a +1 to add that this has bitten me repeatedly, depending on precise combination of .pyi, __init__.py and the two invocations above can all give different scanning behaviour potentially resulting in (mistakenly, silently) unchecked files.

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Oct 29, 2020

@timstaley
Bad experiences with mypy's import handling are very much confounded by the presence of two bugs that were recently fixed on master (currently unreleased):

  1. When you pass files/directories, mypy silently used to not recurse further if an __init__.py was missing in a subdirectory of a package, fixed in find_sources: find build sources recursively #9614
  2. When you passed in packages and --namespace-packages, mypy wouldn't really work beyond a level in the absence of __init__.py, fixed in modulefinder: make -p handle namespace packages correctly #9616

Together, both these fixes should hopefully resolve most cases of mistakenly, silently, unchecked files.

@jnsnow
We're trying to figure out what exactly specifying explicit package roots looks like in #9632. Note that if passing in files/directories was exactly the same as passing in packages/modules, we could just drop one of those input formats. But eg, a lot of people want mypy . to just work. mypy does have some improvable behaviour when e.g. passed a file inside a subpackage.

@jnsnow
Copy link

jnsnow commented Oct 30, 2020

We're trying to figure out what exactly specifying explicit package roots looks like in #9632. Note that if passing in files/directories was exactly the same as passing in packages/modules, we could just drop one of those input formats. But eg, a lot of people want mypy . to just work. mypy does have some improvable behaviour when e.g. passed a file inside a subpackage.

I get you, it's hard to know what the right behavior is. Finding a way to unify the treatment of loose files and packages might be a great thing, though.

I had been typing mypy namespace/ instead, which has the weird side effect of working and checking all of my files, but secretly was chopping off the namespace for how it modeled the tree internally. This made writing mypy.ini quite hard, because I hadn't realized mypy was not capturing the folder name as part of the fully qualified import path. If it had assumed the folder was a valid import root, it might have just worked.

With regards to the "silently didn't realize it was doing the wrong thing" problem, would it be possible to ask for a short report mode where it prints line by line what it checked?

As I recall the "0 errors in 13 files" message is also fairly modern (and fantastic), but maybe it could be augmented:

mypy found 0 errors in 4 files

foo.bar.__init__.py ...... 0 errors
foo.bar.baz.py ........... 0 errors
tests.simpletest.py ...... 0 errors
setup.py ................. 0 errors

It's a bit extreme, but such output allows you to run a diff-style test in CI to prove that your mypy CI is doing exactly what you thought it was, no more, no less. Showing the fully qualified paths might have clued me in to the fact that mypy was perceiving my layout structure wrong, too.

I'll subscribe to traffic on your PR, thank you for taking the usability of mypy configuration seriously! mypy has been a lifesaver for me, so thanks!

hauntsaninja pushed a commit to hauntsaninja/mypy that referenced this issue Oct 31, 2020
With the changes I've been making to mypy's import handling, I think
this would be a useful thing to add preemptively.

Note that I would have found this very useful at points, and I think
others would too, eg python#7672 and python#8584

The existing logging ignores source_modules and source_text and won't
help with determining what mypy things the module name for a given file
is, which is useful for namespace package issues as in the complaint in python#8584.
@hauntsaninja
Copy link
Collaborator

@jnsnow mypy -v can already do this a little. This is something I'd find useful too, so if accepted #9672 will add more information about what files mypy plans on checking to mypy -v. It's not quite as pretty as what you're suggesting (feel free to open a feature request), but it should serve well for debugging purposes.

hauntsaninja added a commit that referenced this issue Oct 31, 2020
With the changes I've been making to mypy's import handling, I think
this would be a useful thing to add preemptively.

Note that I would have found this very useful at points, and I think
others would too, eg #7672 and #8584

The existing logging ignores source_modules and source_text and won't
help with determining what mypy things the module name for a given file
is, which is useful for namespace package issues as in the complaint in #8584.

Co-authored-by: hauntsaninja <>
bonzini pushed a commit to qemu/qemu that referenced this issue May 30, 2021
0.730 appears to be about the oldest version that works with the
features we want, including nice human readable output (to make sure
iotest 297 passes), and type-parameterized Popen generics.

0.770, however, supports adding 'strict' to the config file, so require
at least 0.770.

Now that we are checking a namespace package, we need to tell mypy to
allow PEP420 namespaces, so modify the mypy config as part of the move.

mypy can now be run from the python root by typing 'mypy -p qemu'.

A note on mypy invocation: Running it as "mypy qemu/" changes the import
path detection mechanisms in mypy slightly, and it will fail. See
python/mypy#8584 for a decent entry point with
more breadcrumbs on the various behaviors that contribute to this subtle
difference.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Cleber Rosa <crosa@redhat.com>
Message-id: 20210527211715.394144-23-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
bonzini pushed a commit to qemu/qemu that referenced this issue Jun 2, 2021
0.730 appears to be about the oldest version that works with the
features we want, including nice human readable output (to make sure
iotest 297 passes), and type-parameterized Popen generics.

0.770, however, supports adding 'strict' to the config file, so require
at least 0.770.

Now that we are checking a namespace package, we need to tell mypy to
allow PEP420 namespaces, so modify the mypy config as part of the move.

mypy can now be run from the python root by typing 'mypy -p qemu'.

A note on mypy invocation: Running it as "mypy qemu/" changes the import
path detection mechanisms in mypy slightly, and it will fail. See
python/mypy#8584 for a decent entry point with
more breadcrumbs on the various behaviors that contribute to this subtle
difference.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Cleber Rosa <crosa@redhat.com>
Message-id: 20210527211715.394144-23-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Sep 26, 2021

Here's a list of potential things remaining. Note that of Jukka's original list, 2, 3, 4 should be completed. 6, 7 are mostly completed or mostly true. I've added an 8, 9, 10.

.1. Enable namespace packages by default (#9636)
.5. Allow selectively processing arbitrary installed packages (#8545)
.8. Separate out --ignore-missing-imports functionality, e.g. --ignore-untyped-imports or --ignore-missing-type-hints (#9789 / #10660 / #10456 etc), or do something like #10044.
.9. Improve stub suggestions (#10619). If ^ introduces new flags, could be a good time to introduce backward incompatible behaviour here if needed.
.10. Improve --exclude. Should also allow it to be specified multiple times (#10310). The name was probably a little too general, since users often expect it to also do the equivalent of follow_imports=skip (e.g. #10820) — maybe it should do that?

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Aug 10, 2023

Update: now, 1 and 8 should be completed, 10 is half completed, leaving 5, 9 and half of 10.

If we want an 11, we could do #10600 (comment)

@hauntsaninja
Copy link
Collaborator

Closing since basically everything in the original issue is done! I included the items I mentioned as undone in #16472 (and of course they have their own issues, as linked above)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
meta Issues tracking a broad area of work needs discussion priority-0-high
Projects
None yet
Development

No branches or pull requests

8 participants