bdist_wheel makes absolute data_files relative to site-packages #92

Closed
agronholm opened this issue Dec 7, 2013 · 81 comments

Comments

@agronholm
Contributor

Originally reported by: Marcus Smith (Bitbucket: qwcode, GitHub: qwcode)


bdist_wheel doesn't handle absolute paths in the "data_files" keyword like standard setuptools installs do, which honor it as absolute (which seems to match the examples in the distutils docs)

when using absolute paths, the data ends up in the packaged wheel at the top level, and gets installed relative to site-packages (along with the project packages)

so, bdist_wheel is interpreting distutils' "data_files" differently. maybe better for wheel to fail to build projects with absolute data_files, than to just reinterpret it.


@agronholm
Contributor Author

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


I.e. it's either that the wheel spec has to grow to cover absolute data_files (I don't see how it could handle them now; putting them into {distribution}-{version}.data doesn't help because that's relative to sys.prefix), or bdist_wheel just needs to fail to build in that case.

@agronholm
Contributor Author

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


btw, relative "data_files" paths are handled as expected and end up in the "*.data" dir in the packaged wheel.

@agronholm
Contributor Author

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


I don't think we should allow absolute paths.

@agronholm
Contributor Author

Original comment by Daniel Holth (Bitbucket: dholth, GitHub: dholth):


Absolute paths need to be allowed but it may be acceptable to restrict to absolute paths within the sdist.

There's a place in setuptools where certain kinds of paths cause errors and I run into it from time to time. I don't remember the details atm, only that it would be much easier to use if it did allow absolute paths.

@agronholm
Contributor Author

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


Why does it have to be allowed? If bdist_wheel and sdist were consistent, that would be one thing, but they're not and can't be at the current time, so it seems wrong for wheels to build absolute paths and then place them into site-packages

@agronholm
Contributor Author

Original comment by Daniel Holth (Bitbucket: dholth, GitHub: dholth):


I could be thinking about setuptools' /other/ bug ;-)

@agronholm
Contributor Author

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


I don't see any reason why absolute paths have to be allowed. I think they are a bad design in general, everything should be rooted in sys.prefix. It's not a very good thing for a Wheel to be able to override /etc/hosts for instance.

@agronholm
Contributor Author

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


btw, there's a metadata issue open for whether wheel would grow the ability to handle platform-specific paths (including absolute I guess) https://bitbucket.org/pypa/pypi-metadata-formats/issue/13/add-a-new-subdirectory-to-allow-wheels-to

for me, this issue isn't about that discussion.

it's about the oddity of placing absolute paths into site-packages

since wheel has no ability to properly place absolute files currently, it shouldn't build projects that declare them

@agronholm
Contributor Author

Original comment by Daniel Holth (Bitbucket: dholth, GitHub: dholth):


packagename-1.0.data/data/ is currently a way to place absolute files. This is an accidental feature but I don't have any particular beef with it.

They are absolute relative to the root of the virtualenv :-) Or if no virtualenv is in use, probably /

@agronholm
Contributor Author

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


That's not what absolute means, that's a relative path. An absolute file is one that will install to /this/exact/path/even/in/a/virtualenv

@agronholm
Contributor Author

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


so take this setup.py which defines an absolute data file at "/opt/data_file": https://gist.github.com/qwcode/9144129
(and assuming there is a "data_file" relative to it)

build an sdist and wheel and then install each, and see where "data_file" goes.

  • for the sdist: /opt/data_file
  • for the wheel: ../site-packages/opt/data_file

on the other hand, relative data files get packaged into *.data/data and get installed relative to sys.prefix
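
To make the three outcomes above concrete, here is a small model of where one data_files entry ends up. It only mirrors the behaviour reported in this thread; `installed_path` and its parameters are made up for illustration and are not part of wheel, pip, or setuptools.

```python
import posixpath


def installed_path(data_dir, filename, *, from_wheel, prefix, site_packages):
    """Model the install location of one data_files entry (hypothetical helper)."""
    if posixpath.isabs(data_dir):
        if from_wheel:
            # Reported wheel behaviour: the absolute directory is packed at the
            # top level of the wheel, so it unpacks next to the packages.
            return posixpath.join(site_packages, data_dir.lstrip("/"), filename)
        # Reported sdist behaviour: the absolute path is honored as-is.
        return posixpath.join(data_dir, filename)
    # Relative entries go through *.data/data and land under sys.prefix.
    return posixpath.join(prefix, data_dir, filename)
```

For example, with `prefix="/venv"` and `site_packages="/venv/lib/python3/site-packages"`, the "/opt" entry from a wheel maps to a path under site-packages, while the same entry from an sdist maps to /opt/data_file.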

@agronholm
Contributor Author

Original comment by Michael Hoglan (Bitbucket: mhoglan, GitHub: mhoglan):


Graphite does a similar thing, not specifically their data files, but the lib files are specified in an absolute location (/opt/graphite/webapp) in the setup.cfg, and it results in the files being under site-packages/opt/graphite/... when you build a wheel and install it in a virtualenv.

When building from source, I would specify --install-option to change those locations to be relative to the virtualenv, but it does not seem possible to pass those options to pip wheel.

Removing the prefix / lib configurations in the setup.cfg causes the wheel and source installs to behave the same (the files end up in site-packages). Altering the wheel and getting rid of the /opt/graphite/webapp at the top level achieves the same thing (since it would have assumed a prefix of . at install).

btw, I would say it's not good practice for a module to be specifying absolute paths... not virtualenv friendly, and I would hope projects would fix that. I see this as more of having to work with projects that are not defined cleanly. And probably allowing there to be consistency between a src install and a wheel install.

@agronholm
Contributor Author

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


btw, I would say it's not good practice for a module to be specifying absolute paths... not virtualenv friendly, and I would hope projects would fix that.

Agreed!

I see this as more of having to work with projects that are not defined cleanly.

Well, actually the current problem is to work with package installers and virtualenvs that are defined cleanly!

Problem is that you may be able to put a data file somewhere using setup(data_files=xx) -- but can you determine where it went from your application instance!?

That's the main problem I'm facing with setuptools right now... when using setuptools, all paths for the data_files kwarg are relative to sys.prefix, but when installing in a virtualenv, they're not..

@agronholm
Contributor Author

Original comment by Keerthan Jaic (Bitbucket: jck2, GitHub: jck2):


Is there a uniform way to find (relative) package data which works irrespective of whether the package is installed globally or in a venv?

@agronholm
Contributor Author

Original comment by Joo Tsao (Bitbucket: nuwa, GitHub: nuwa):


We need support for setup(data_files=/opt/xxx)

@agronholm
Contributor Author

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


For reference, this bug is essentially the same as #120
And since pip 7.0.0 all packages are now wheeled before install, meaning that this bug and #120 are getting prime exposure in several packages.
See pypa/pip#2874

@agronholm
Contributor Author

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


@jck2 the simplest way for me is to only use package data effectively stored in a package directory side by side with the python code that needs them and never use data files.
Once you have this, dirname and __file__ will let you navigate to these data file locations relative to your python code location. Since the data is always in the same place relative to the calling code, whether you are installed globally, in a venv, or anywhere else does not matter anymore.

As a simple example of this approach:
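
The linked example did not survive here. A minimal sketch of the idea, assuming a hypothetical "data" directory that sits next to the module (the helper name `data_path` is made up for illustration):

```python
import os


def data_path(module_file, name):
    """Return the path of a file in the "data" directory next to module_file."""
    package_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(package_dir, "data", name)


# Inside a module of your package you would call, for example:
#     config = data_path(__file__, "defaults.json")
```

Because the path is derived from the module's own location, the result is the same for a global install and for a venv, as long as the files are shipped as package data.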

@agronholm
Contributor Author

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


My work-around for pip 7.0 (because pip automatically creates wheels from sdists) is to include this in setup.py:

import sys

if 'bdist_wheel' in sys.argv:
    raise RuntimeError("This setup.py does not support wheels")

Pip will automatically skip the .whl packaging and run the normal sdist installation.

Why on earth the decision was made to have an unfinished packaging system deploy, by default, things that weren't intended for it is beyond me :( People who've made sdist installations, released them, and tested them, can create their .whl files themselves... this new bdist_wheel call prolongs the installation process and creates new unexpected behavior.

@agronholm
Contributor Author

Original comment by Benjamin Reedlunn (Bitbucket: breedlun, GitHub: breedlun):


I just released my first python package, and it is affected by this issue. I would like to avoid absolute paths, as suggested, but I do not know the proper way. Can someone give me a hand?

Here is a link to my stack overflow question that goes into more detail.

@agronholm
Contributor Author

Original comment by joe_code (Bitbucket: joe_code, GitHub: Unknown):


This is a real problem for me as well.

My setup.py script works as expected with regard to data_files that use an absolute path: they are honored when I do 'python setup.py install'. However, when I do 'python setup.py bdist_wheel' and then pip install my wheel, those same data_files ARE NOT installed correctly and wind up relative to site-packages, i.e. site-packages/usr/lib/blah/blah

If I want to install a file outside of site-packages (say to an arbitrary place on the filesystem) I should be able to do that. The behaviour is inconsistent. I'd really like to see this fixed because right now I can't use wheels and that's exactly what I want to use.

@agronholm
Contributor Author

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


@joe_code - I can recommend finding a workaround, not using setup.py's setup(). Ultimately, that's what we did, and to be honest and despite my previous harsh rhetoric in this thread, it's nice to get rid of data_files and have a Python project that works inside virtual environments again and can be distributed with Wheel :)

@agronholm
Contributor Author

Original comment by joe_code (Bitbucket: joe_code, GitHub: Unknown):


Hey Benjamin, thanks for your reply. Could you elaborate a little bit on your solution please?

@agronholm
Contributor Author

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


In our case, we could factor out most of the files in /usr/share and turn them into "package data". The remaining files are now handled by OS installers (for instance debian packages, pkg for Mac, setup.exe for Windows, etc).

In case you don't want to create OS installers, you can take a "run first" approach in your application: on startup, check if not os.path.exists for the files and create them when they are missing, possibly adding a file with your project's version in it. The disadvantage is uninstallation.
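
A rough sketch of such a "run first" step, under the assumption that the files ship inside the package and get copied to a target directory on first start. The function name `ensure_data_installed` and the marker file name are hypothetical, and `dirs_exist_ok` needs Python 3.8 or newer.

```python
import os
import shutil


def ensure_data_installed(source_dir, target_dir, version):
    """Copy source_dir to target_dir unless this version is already there.

    Returns True when files were copied, False when nothing had to be done.
    """
    marker = os.path.join(target_dir, ".installed-version")
    if os.path.exists(marker):
        with open(marker, encoding="utf-8") as f:
            if f.read().strip() == version:
                return False
    # First run, or a different version was installed before: (re)copy.
    shutil.copytree(source_dir, target_dir, dirs_exist_ok=True)
    with open(marker, "w", encoding="utf-8") as f:
        f.write(version)
    return True
```

Nothing here removes the copied files again, which is the uninstallation disadvantage mentioned above.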

@agronholm
Contributor Author

Original comment by Erik Bray (Bitbucket: embray, GitHub: embray):


I think it's a real problem that this decision has effectively broken a use case that many packages have relied on--in some cases for bad reasons, but in other cases for good reasons.

Although I personally feel like the reasoning behind the breakage has some merit, breaking things without offering some kind of guidance on how best to handle outside-Python resource files has created yet another sore point against Python packaging that has been raised by some of my colleagues, and it's a valid complaint.

I think the argument "well we shouldn't just allow installing files to arbitrary system locations" is well meaning but ultimately spurious. It's true that, depending on what install_data gets set to, the paths which can be installed to are somewhat limited, making it hard, say, to overwrite /etc/hosts. Yet pip will also happily overwrite executables in /usr/bin, for example, which I think is awful and it shouldn't. So really you're making a security-related argument that falls apart because there's actually no promise of security when installing a wheel system-wide (outside a virtualenv). Meanwhile it's possible to hand-craft wheels with files in the .data directory that can be installed almost anywhere within /usr at the very least.

I think a better approach would be to not make arbitrary decisions for software developers who know what they're doing, and where necessary protect users (and developers who don't know what they're doing) by not allowing pip to overwrite files that already exist on their system (especially for "data files").

@agronholm
Contributor Author

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


Allowing absolute paths breaks the isolation of virtual environments.

@agronholm
Contributor Author

Original comment by Erik Bray (Bitbucket: embray, GitHub: embray):


So treat absolute paths as relative to the root of a virtualenv when installing in a virtualenv, and don't break their semantics on system installs.
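
As an illustration of this proposal only (no installer is claimed to work this way), the rule could be sketched as below. `resolve_target` and `venv_root` are hypothetical names.

```python
import posixpath


def resolve_target(path, venv_root=None):
    """Re-root an absolute path inside a virtualenv; leave it alone otherwise."""
    if venv_root is not None and posixpath.isabs(path):
        # Inside a virtualenv: "/etc/app.conf" becomes "<venv>/etc/app.conf".
        return posixpath.join(venv_root, path.lstrip("/"))
    # System install (or a relative path): semantics are unchanged.
    return path
```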

@agronholm
Contributor Author

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


@embray but is pip aware of being in a virtualenv at all?

@agronholm
Contributor Author

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


Actually it is: https://github.com/pypa/pip/blob/d86d1713647f791979b9267ffc5773479d0ef469/pip/locations.py#L39
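
For readers who do not want to follow the link, a check along these lines is commonly used to detect a virtual environment. This is a sketch and not a copy of pip's code; the function name `running_in_virtualenv` is an assumption.

```python
import sys


def running_in_virtualenv():
    """Best-effort check for an active virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv (and newer virtualenv) make sys.prefix differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)
```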

@agronholm
Contributor Author

Original comment by Erik Bray (Bitbucket: embray, GitHub: embray):


Yes, it has to be--especially to be able to deal with the nuances between virtualenvs with and without "global site-packages".

@agronholm
Contributor Author

Original comment by Benno Fünfstück (Bitbucket: bennofs, GitHub: bennofs):


Is there a way right now to install some file into site-packages that works with both setuptools and bdist_wheel? For example, if I want to install a native library that is later loaded by my application. Or should I not use site-packages for that?

@dholth
Member

dholth commented Sep 13, 2018 via email

@njsmith
Member

njsmith commented Sep 13, 2018

If you have data you want to access at runtime, then we already have a standard and well-supported solution (it even works for packages installed in zips!): https://docs.python.org/3.7/library/importlib.html#module-importlib.resources
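
A minimal sketch of reading a packaged file this way, assuming Python 3.9 or newer for `importlib.resources.files` (on 3.7, `importlib.resources.read_text(package, resource)` covers the same case). The helper name `read_packaged_text` is hypothetical.

```python
from importlib import resources


def read_packaged_text(package, resource):
    """Read a text file that is shipped inside an importable package."""
    return resources.files(package).joinpath(resource).read_text(encoding="utf-8")


# Example call, with made-up names:
#     template = read_packaged_text("mypackage", "template.txt")
```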

@dholth
Member

dholth commented Sep 13, 2018 via email

@pfmoore
Member

pfmoore commented Sep 13, 2018

I've suggested that if a wheel contained a package-1.0.data/docs/ directory, that the installer could place those files into e.g. $virtualenv/share/docs/$packagename-$packageversion by default. Imagine that plus a few more categories.

Indeed. If someone wanted to flesh out that proposal, put it into the form of a PEP/standard and get it approved and then implemented in the various tools, then that would probably cover a lot of the use cases I've seen mentioned in the past. Of course, no-one has yet volunteered to champion the suggestion. It really needs someone with an actual stake in the issue to step up, or it's going to forever sit behind other priorities.
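
To give an idea of what such a proposal would have to pin down, here is a sketch of the quoted "docs" mapping. No standard defines this; `CATEGORY_TEMPLATES` and `category_target` are hypothetical names, and only the "docs" entry comes from the suggestion itself.

```python
import posixpath

# Category name inside "<name>-<version>.data/" -> location under the prefix.
CATEGORY_TEMPLATES = {
    "docs": "share/docs/{name}-{version}",
}


def category_target(prefix, category, name, version, relpath):
    """Map a file from a .data/<category>/ directory to its install path."""
    subdir = CATEGORY_TEMPLATES[category].format(name=name, version=version)
    return posixpath.join(prefix, subdir, relpath)
```

Further categories would be added as extra entries in the table; picking those categories and their default locations is the part that would need a PEP.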

@dholth
Member

dholth commented Sep 13, 2018 via email

@jdemeyer

jdemeyer commented Sep 14, 2018

So IIUC, the data directory in wheels has never worked in a useful way,

Wrong! The data directory (and data_files in setup.py) is useful in several ways. For example, it can be used to install Jupyter files such as Jupyter kernel specs or Jupyter notebook extensions (example). And I see nothing wrong with installing man pages or documentation using data.

@jdemeyer

What about relative location, but not inside of site-package?

That's exactly the use case that data_files solves.

@jdemeyer

jdemeyer commented Sep 14, 2018

it wouldn't be any more useful than putting the data files into the package directory. Also, data-in-package-dir is what pkg_resources and similar tools have standardized on.

You are confusing two different use cases for package_data and data_files.

package_data is useful for data files used by the package itself (or possibly other Python tools looking there).

data_files on the other hand is useful for data files used by other software (which may not even be written in Python).
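
The two keywords side by side, as a sketch with made-up project and file names. The dictionary would be passed to `setuptools.setup()` in a real setup.py; the relative `share/jupyter/kernels/...` directory is where Jupyter looks for kernel specs under sys.prefix.

```python
# Keyword arguments for setuptools.setup(); all names are hypothetical.
SETUP_KWARGS = dict(
    name="example-kernel",
    version="1.0",
    packages=["example_kernel"],
    # package_data: files read by the package itself, installed inside it.
    package_data={"example_kernel": ["templates/*.html"]},
    # data_files: files for other software, installed relative to sys.prefix.
    data_files=[("share/jupyter/kernels/example", ["kernel.json"])],
)

# In setup.py:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```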

@njsmith
Member

njsmith commented Sep 15, 2018

what's this other software, that has nothing to do with Python, but it understands about Python environment layouts, including the data directory that even most Python software doesn't understand, but that doesn't know how to find package_data?

@jdemeyer

understands about Python environment layouts

"environments" are not specific to Python at all. Most open source software packages have a concept of installation prefix, analogous to sys.prefix. Conda for example installs everything (Python packages but also other packages) in a common prefix.

@jdemeyer

what's this other software

Jupyter packages are a good example. While many Jupyter kernels are written using Python, that is not a requirement: it is possible to implement the Jupyter protocol without Python. So they decided to use data_files for that, which makes it work the same way for Python packages and non-Python packages.

@jdemeyer

And the man pages example is also a good one (even though I personally don't know any Python package which installs a man page).

@agronholm
Contributor Author

The consensus (?) seems to be that this needs a new standard and that wheel itself is currently not doing anything wrong. If someone wants this to be reopened, be specific about what changes are required for the wheel project. Otherwise a new issue could be opened when a new standard emerges that requires implementation here.

openstack-mirroring pushed a commit to openstack/kolla that referenced this issue Jul 20, 2020
This change modifies the ironic base container
to copy rootwarp filters from the virtual
env rather than the source code directory. This
is need because some required filters have
been moved to ironic-lib and are not present in
the /ironic dir. The rootwrap filters are not
automitaclly installed in /etc/... due to kolla
use of virtual envs and pypa/wheel#92

Closes-Bug: #1886663
Change-Id: Idb0a675d92bab8b9a0cf5209f0a06e996e96033c
openstack-mirroring pushed a commit to openstack/openstack that referenced this issue Jul 20, 2020
* Update kolla from branch 'master'
  - Merge "copy rootwarp files form venv in ironic base"
  - copy rootwarp files form venv in ironic base
    
    This change modifies the ironic base container
    to copy rootwarp filters from the virtual
    env rather than the source code directory. This
    is need because some required filters have
    been moved to ironic-lib and are not present in
    the /ironic dir. The rootwrap filters are not
    automitaclly installed in /etc/... due to kolla
    use of virtual envs and pypa/wheel#92
    
    Closes-Bug: #1886663
    Change-Id: Idb0a675d92bab8b9a0cf5209f0a06e996e96033c
openstack-mirroring pushed a commit to openstack/kolla that referenced this issue Jul 21, 2020
This change modifies the ironic base container
to copy rootwarp filters from the virtual
env rather than the source code directory. This
is need because some required filters have
been moved to ironic-lib and are not present in
the /ironic dir. The rootwrap filters are not
automitaclly installed in /etc/... due to kolla
use of virtual envs and pypa/wheel#92

Closes-Bug: #1886663
Change-Id: Idb0a675d92bab8b9a0cf5209f0a06e996e96033c
(cherry picked from commit b6c7110)
bushbecky added a commit to bushbecky/fusesoc that referenced this issue May 9, 2023
This adds the script created by Stefan in PR #57 as a starting point for
further improvements on this really useful tool.

Automatically installing this file with pip is something between
impossible and annoying for multiple reasons.

- There's no way to select between Windows and Linux, with and without
  bash, etc.
- The file could be installed into /etc/bash_completion.d by setuptools,
  but not by wheels. (Not a big problem right now, since fusesoc doesn't
  contain binary components.)
  Reference: pypa/wheel#92
- Installing the file next to the package is tricky, however could be
  done using data_files & friends in setup.py. However that is rather
  tricky and provides no clear benefit over getting the file as needed
  with curl.

So I'd go with just placing this file in the source tree and documenting
its use with
$ sudo curl https://raw.githubusercontent.com/olofk/fusesoc/master/extras/bash-completion -o /etc/bash_completion.d
or (without root privileges)
$ curl https://raw.githubusercontent.com/olofk/fusesoc/master/extras/bash-completion >> ~/.bash_completion