Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add namespace support for develop #789

Closed
wants to merge 2 commits into from
Closed

Conversation

theacodes
Copy link
Member

@theacodes theacodes commented Sep 15, 2016

See pypa/packaging-problems#12 for detailed context, but essentially:

python setup.py develop does not work with namespace packages

This adds support for namespace packages by adding functionality to the develop command to create {distribution}-develop-nspkg.pth files in site-packages instead of .egg-links.

This work is mostly based on install_egg_info and this gist.

There are some outstanding questions I have about this, and of course I need to write tests before merging. I just want to sanity check before I go further.

Questions:

  1. install_egg_info and develop share a lot of common code around generating the nspkg.pth files. Should I refactor that into a common module? If so, what should the module be? (There doesn't seem to be any 'private' modules in setuptools).
  2. Right now this code path is only activated with you specify the flag and when there are namespace packages around. However, it's very possible that I can generate pth files that work the same as the .egg-info files today for non-namespace packages. Is that desired?

/cc @ncoghlan


if self.dry_run:
# always generate the lines, even in dry run
list(lines)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since lines is already a list comprehension, what does this achieve?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing. This was just left over from where I copied this code from install_egg_info. In that function it uses map to generate the lines instead of a list comprehension.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Fixed)

@jaraco
Copy link
Member

jaraco commented Sep 16, 2016

How does this feature play with setup.py develop --uninstall?

@theacodes
Copy link
Member Author

How does this feature play with setup.py develop --uninstall

It'll work as it does today, it'll just remove the -develop.nspkg.pth file instead of the .egg-link (I haven't coded this part yet because of outstanding questions on the approach).

@jaraco
Copy link
Member

jaraco commented Sep 18, 2016

Should I refactor that into a common module? If so, what should the module be? (There doesn't seem to be any 'private' modules in setuptools).

Yes please. Feel free to create a new private module for this purpose.

I can generate pth files that work the same as the .egg-info files today.

That does seem less complex and thus preferable.

My main reservation with this implementation is it adds additional dependence on -nspkg.pth files, where these files are presumably unnecessary on Python 3.3+ (pypa/pip#1924) and possibly could be obviated on older Pythons (#406). I'd be tempted to restrict this functionality to environments where PEP-420 namespace support isn't available (i.e. python 2.6-3.2 where importlib2.hook.installed is not True).

I know that complicates the implementation (and deployment) but it has the added benefit of not masking the problem where a better solution (presumably) exists. What do you think?

@theacodes
Copy link
Member Author

Yes please. Feel free to create a new private module for this purpose.

Will do.

I know that complicates the implementation (and deployment) but it has the added benefit of not masking the problem where a better solution (presumably) exists. What do you think?

What better solution for editable install exists in those versions? As far as I can tell, you can omit -nspkg.pth files in those interpreters for installed packages, but this is explicitly for editable packages which need to somehow map a package to a folder outside of the normal site-packages tree.

@jaraco
Copy link
Member

jaraco commented Sep 19, 2016

What better solution for editable install exists in those versions?

Well, in Python, when you start an interpreter, it adds '.' to sys.path (either via '' or os.path.abspath('') in sys.path). Therefore, any packages (namespace or otherwise) being edited in . will be importable (and preferred). If there are other packages that one wants to be editing, simply add those to PYTHONPATH:

/dev/project_a$ PYTHONPATH=/dev/project_b python
>>> import stuff_from_a
>>> import stuff_from_b

Neither project_a nor project_b need to be installed (and if they are, the editable versions in /dev will be imported first). That's of course the simplest use case, but I imagine you have a more complicated use case in mind. Can you expand on how the above wouldn't work?

@theacodes
Copy link
Member Author

Hmm, the PYTHONPATH way would be easier, I suppose. However, afaik PYTHONPATH isn't treated as a sitedir and I don't think that namespace packages will work properly. There's also the case where package_dir is specified in setup.py - mapping a package to a different direction e.g. src/google -> google.

Thoughts?

@jaraco
Copy link
Member

jaraco commented Sep 19, 2016

With PEP 420, any directory in PYTHONPATH can have a namespace package:

$ mkdir -p /tmp/foo/pkg
$ touch /tmp/foo/pkg/a.py 
$ mkdir -p /tmp/bar/pkg
$ touch /tmp/bar/pkg/b.py
$ PYTHONPATH=/tmp/bar:/tmp/foo python
Python 3.6.0b1 (v3.6.0b1:5b0ca4ed5e2f, Sep 12 2016, 09:24:46) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkg.a
>>> import pkg.b
>>> pkg.__path__
_NamespacePath(['/tmp/bar/pkg', '/tmp/foo/pkg'])

As for 'src', why not simply PYTHONPATH=src?

Actually, now that I think about it, I'm not opposed to the use of .pth files, but the use of .pth files to mess with the import mechanism, breaking PEP 420 packages. Here's a .pth file linking one of the two namespace packages:

$ cat > /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pkg.a.pth
/tmp/foo
$ PYTHONPATH=/tmp/bar python
Python 3.6.0b1 (v3.6.0b1:5b0ca4ed5e2f, Sep 12 2016, 09:24:46) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkg.a
>>> import pkg.b
>>> pkg.__path__
_NamespacePath(['/tmp/bar/pkg', '/tmp/foo/pkg'])

setup.py develop could do something similar, creating .pth files (even for non-namespace packages) with only the link to the target for Python 3.3+.

For Python < 3.3, it might still be necessary to fold in the extra behavior.

@theacodes
Copy link
Member Author

Actually, now that I think about it, I'm not opposed to the use of .pth files, but the use of .pth files to mess with the import mechanism, breaking PEP 420 packages.

Cool, so I think we might have a way forward here. Please let me know if any of this is incorrect:

develop will always create a {package}-develop.pth file for every declared top-level package in setup.py.

For Python 2 and < 3.3 (or more correctly when importlib2.hook.installed == False) the contents of the .pth file depend on the presence of namespace packages:

  • For non-namespace packages the .pth file will just contain the full filesystem path to the package directory.
  • For namespace packages the .pth file will resemble pip's -nspkg.pth files with import machinery to inject the full filesystem path to the package directory into the namespace package's __path__.

For > 3.3 (or more correctly when importlib2.hook.installed == True) the contents of the .pth file will always be the full filesystem path to the package directory.

@ncoghlan
Copy link
Member

What @jonparrott suggests above is roughly what I had in mind when I suggested using *.pth files in pypa/packaging-problems#12, I just completely forgot about the complicating factor of also needing to keep pre-PEP-420 namespace packages working.

@theacodes
Copy link
Member Author

I'm going to be out of office for the rest of this week, but will pick this back up next week. Thanks for helping me work out exactly what needs to be done here @jaraco and @ncoghlan.

@jaraco
Copy link
Member

jaraco commented Oct 1, 2016

@jonparrott: Interestingly, I have an example where I expected to run into this issue but I'm not. In this build on Travis, under tox, I'm running tests on the jaraco.mongodb package that depends on other jaraco.* packages. Tox creates a virtualenv for the target Python version and installs the package under test using the usedevelop feature. I've found it has to do this to run the doctests without running up against tox-dev/tox#373.

What's interesting is that the tests pass - the package under develop seems to be able to import the installed packages from the same namespace... and the tests pass on many Python versions including 2.7.

Right now, I can't explain why it is that I'm not hitting this issue, but perhaps this finding reveals a technique that's already viable. Perhaps I'll find to look into this more later, but I'd welcome your review of this as well to see if you can determine what I'm not seeing.

@theacodes
Copy link
Member Author

@jaraco that is... interesting. Hmm.

@jaraco
Copy link
Member

jaraco commented Oct 9, 2016

And although I use the same technique in another project, I find its tests are failing apparently due to namespace package issues, and here only under Python 3. I wonder if there's some sensitivity to the order in which the package under develop is imported.

@ncoghlan
Copy link
Member

A point potentially worth noting: *.pth files are processed by the site module, so python's -S option will always break them.

The other common source of conflicts is that the current directory is added to the front of sys.path when using the -m switch, but the directory containing the script itself is added when using direct execution.

@jaraco
Copy link
Member

jaraco commented Oct 19, 2016

I think there's a more insidious problem here - with pip-installed packages (or any other installer that omits __init__.py modules for namespace packages), a project that shares a namespace with its dependencies will run into conflicts under Python 3.3+ because the package under development will contain a __init__.py file in the namespace package, but the dependencies will not.

For example, consider something like zope.component, which depends on zope.interface. Since the introduction of #805, if one were to install zope.component using setup.py develop, zope would be loaded by the NamespaceLoader for zope.interface (assuming it was pip installed), but by SourceFileLoader for zope.component (because there's no opportunity for zope/__init__.py to be omitted). For the project under development to be installed in a way that's consistent with its dependencies, the zope/__init__.py would have to be removed from the filesystem.

Perhaps the project could be reconfigured not to include that file at all, but that would eliminate the ability to use find_packages (at least until #97 can be solved). I'm also unsure if this causes conflicts with dependencies installed as eggs.

@jaraco
Copy link
Member

jaraco commented Oct 19, 2016

I should also mention that this PR is designed to address pypa/pip#3 and #250.

@theacodes
Copy link
Member Author

Man, this is complex. :|

I haven't had time to really dig into this as I've been busy re-writing an auth library, but I do want to come back to this eventually. However, If someone wants to take the reigns in the meantime, please feel free to work from what I've done and cc me on the new PR.

@ncoghlan
Copy link
Member

@jaraco While I agree that situation isn't ideal, I believe it should still work as long as the __init__.py does the necessary dance to set its own __path__ correctly using pkg_resources or pkgutil.

Folks using something like tox to do full install-and-test in a venv would be able to test the post-install configuration that way, even if setup.py develop & pip install -e gave them something slightly different.

@jaraco
Copy link
Member

jaraco commented Oct 23, 2016

I noticed there was a lot of code in common with install_egg_info and develop, so I endeavored in the namespace-module branch to create a new module and mix-in to provide the namespace functionality in one place.

@jaraco jaraco mentioned this pull request Oct 28, 2016
@jaraco
Copy link
Member

jaraco commented Nov 6, 2016

I believe it should still work as long as the init.py does the necessary dance to set its own path correctly using pkg_resources or pkgutil.

In my testing, I was unable to get the system to work because the nspkg.pth files disable their behavior when a init.py exists. Although now it occurs to me that perhaps we could disable that behavior (the ie in the nspkg.pth line).

I'm going to continue this discussion in #250.

@theacodes
Copy link
Member Author

@jaraco sg, thanks for staying on top of this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants