Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] easy-install.pth when updated during installation is overwritten without the recently added changes #3126

Closed
gst opened this issue Feb 20, 2022 · 25 comments · Fixed by #3128
Labels
enhancement Needs Discussion Issues where the implementation still needs to be discussed. Needs Repro Issues that need a reproducible example.

Comments

@gst
Copy link

gst commented Feb 20, 2022

setuptools version

setuptools==60.9.3

Python version

CPython 3.8

OS

Ubuntu 20.04

Additional environment information

Used in combination/on top of a CMake project using sub-projects via cmake FetchContent module : https://cmake.org/cmake/help/git-stage/module/FetchContent.html#fetchcontent

Description

Basically : my setup.py is calling cmake to build a cmake project with pybind11 extensions.
the main project includes others cmake (sub-)projects using FetchContent. I control everything in the picture.
I want to be able to install in editable mode the main project and I want all the sub-projects to be also installed along the way as dependencies/automatically (given they all contains pybind11 extensions), also in editable mode/the same way the main/top one. I so tried to put a pip install -e or (python setup.py develop) of the sub-project(s) as a cmake dedicated custom target(s) on the main project. it's well correctly called .. BUT : actually it's a pip install or setup.py develop which is called during/inside another one..

and the net effect is all is done correctly BUT the easy-install.pth file gets overwritten by the main pip install (after the inner one finish).. basically the directory added to it by the inner installation is lost after the outer one writes its resulting one..

does that ring a bell to someone ?

Expected behavior

I would have expected that the outer/main pip install/setup.py develop read again the content of the easy-install.pth file (so to get a fresh content) before writing it.. and merging what's need to as necessary/correctly.

How to Reproduce

Call pip install / setup.py develop inside another one.

Output

can produce on demand.

@gst gst added bug Needs Triage Issues that need to be evaluated for severity and status. labels Feb 20, 2022
@gst
Copy link
Author

gst commented Feb 20, 2022

could make a fix that's doing what I was eventually expecting :

CUTTED

@gst
Copy link
Author

gst commented Feb 20, 2022

should I make/propose an MR ?

@abravalheri
Copy link
Contributor

abravalheri commented Feb 22, 2022

Hi @gst, thank you very much for bringing this discussion up.

As I pointed out in the PR, I probably lack the knowledge to assess this issue, so I will just ask for general information that would help the other maintainers.

I have some questions (sorry if they are already in your original text and I missed them):

  • Did this used to work in old versions of setuptools?
  • If that is the case, do you know in which version it stopped working?
  • Could you share with us just a general idea on how your is project organised? (e.g. how the folders are nested, where the setup.py/pyproject.toml/setup.cfg files are located? It does not have to be your real project just some stub that reproduces what is happening in your project.

@gst
Copy link
Author

gst commented Feb 22, 2022

Hi @abravalheri ,

Did this used to work in old versions of setuptools?

I don't know / that is a good question I guess. maybe, maybe not.

Could you share with us just a general idea on how your is project organised?

yes. even though it does not change the picture ; it's basically this:

main-project/
    include/  # one place for all headers files
    src/   # one place for all sources files. included python ones (pybind11 wrappring c++ code and pure python one).
    setup.py
    pyproject.toml
    CMakeLists.txt

Now my setup.py is thus calling my CMakeLists.txt (via/using .. well cmake calls).
In that main/top level CMakeLists.txt I include other cmake projects (having basically the same structure, which does not matter). Using cmake FetchContent helper/tool module.

But with/in my CMakeLists.txt I would like to also call pip install -e/python setup.py develop under the hood (via the cmake add custom target directive) the fetched sub-projects (they are downloaded/fetched inside the build folder itself).

Everything works well but as I said the code didn't reload/refresh the file before writing to it. That is my MR fix.

@gst
Copy link
Author

gst commented Feb 24, 2022

@pjeby @webknjaz @agronholm @jaraco @RonnyPfannschmidt

please have a look at this issue and the related proposed fix/MR.

Thank you.

@agronholm
Copy link
Contributor

Why did you ping so many people? I for one am not working on setuptools.

@gst
Copy link
Author

gst commented Feb 25, 2022

Why did you ping so many people? I for one am not working on setuptools.

I only ping the ones I can find with git blame related to this repository and this specific change/file.

20:01 $ git blame setuptools/command/easy_install.py | grep -c  "Alex Grönholm"
204

Regards.

@jaraco
Copy link
Member

jaraco commented Feb 26, 2022

Do I understand correctly that the issue is that multiple PthDistributions objects are being created at the same time and overwriting the changes of the other? Can you explain a bit more about what conditions cause two PthDistribution objects to be interleaved?

@gst
Copy link
Author

gst commented Feb 26, 2022

@jaraco you understand correctly.

I tried to explain it in the following post : #3126 (comment)

Can you explain a bit more about what conditions cause two PthDistribution objects to be interleaved?

it is a pip install -e which happens/is called during the build phase/command of another/a parent pip install -e.

I cannot explain it differently given it is exactly what is happening ;)

@RonnyPfannschmidt
Copy link
Contributor

I'd say it's absolutely OK to break that way when people run hacked up nested editable installs instead of running the installs in order

@gst
Copy link
Author

gst commented Feb 26, 2022

@RonnyPfannschmidt

it's not really hacked. it's just/only a cmake build .. which includes/builds other sub-projects (as part of its own build). Ok this might be considered advanced usage I guess ;)

from my point of view : it doesnt work previously :)

because the save() method should check the pth file has not been modified since it was loaded and in the case it has, I think merging new content is what's the user is expecting ?

As I said: everything is working perfectly fine.. but the entry(ies) added to the file during the inner installation(s) of the sub-project(s) leads to them being just lost/overwritten totally by the outer installation.

The proposed MR fix that.

@jaraco
Copy link
Member

jaraco commented Feb 26, 2022

I tried to explain it in the following post : #3126 (comment)

I meant to ask for more detail regarding how Setuptools loses control of the file mid-operation. You've described the use of cmake, but that's not an option that Setuptools invokes, so I'd like more detail about how one would replicate the behavior at the project level. I understand you've traced the symptom and you're addressing the symptom at the level of the PthDistribution behavior, and that's also what the unit test does, but I'd like to understand how to replicate the behavior more generally, and preferably without an external tool like cmake. Can you create a pair of projects with a hook similar to what your cmake hook is doing that causes interleaved installs?

I'm leaning toward agreeing with Ronny that this project may not want to support this use-case, and instead declare that an install operation should be allowed to run to completion prior to running another, and that if you want to install multiple, dependent projects, there may be a better way to do it that doesn't stumble on these implementation details in Setuptools.

@gst
Copy link
Author

gst commented Feb 26, 2022

I meant to ask for more detail regarding how Setuptools loses control of the file mid-operation.

well : I guess the exact thing happening is this :

  1. I do pip install -e . of the main/top-level project. that setup.py is building python pybind11 extension. but the related project, includes others cmake sub-projects (that I maintain), theses sub-projects also contain pybind11 extension(and have their own setup.py each).
  2. the top level setup.py is thus calling cmake on the main/top level project.
  3. in the main CMakeLists.txt I so include others sub-projects. when it's included : I check if they contain a setup.py. if yes I add a cmake custom target : which execute the command : ${PYTHON_EXECUTABLE} -m pip install -v -e . (of the sub-project). where PYTHON_EXECUTABLE is the main/top level one used.
  4. Thus : during the build phase of the first/top level pip install -e ., occurs a build of another (sub-)project (which execute another pip install -e (on it)). and there it is.

at that point I can only guess that the PthDistribution class has been instantiated by the first/top level pip install BEFORE the second/inner level one completed.

I hope this makes the picture clearer.

@gst
Copy link
Author

gst commented Feb 26, 2022

personally I don't see why it could not be an accepted use case though.. even if new and/or unexpected.. it's happening :)

@jaraco jaraco added enhancement Needs Repro Issues that need a reproducible example. and removed bug Needs Triage Issues that need to be evaluated for severity and status. labels Feb 26, 2022
@jaraco
Copy link
Member

jaraco commented Feb 26, 2022

I don't see why it could not be an accepted use case though.

That's fair and deserves an explanation. I'm disinclined to add complexity (new boolean flags that touch the code in multiple places) to support an idiosyncratic use-case that's needed by only one person, especially when it's supporting a usage that I would discourage and especially if the intended outcome can be achieved by a different approach.

I appreciate the fact that the proposed PR is limited to only the PthDistributions class. Unfortunately, the presence of that class touches many other parts of the code, so it's not even obvious to me where these changes have an effect in practice.

  1. the top level setup.py is thus calling cmake on the main/top level project.

Can you elaborate on how this happens?

@gst
Copy link
Author

gst commented Feb 26, 2022

the top level setup.py is thus calling cmake on the main/top level project.

Can you elaborate on how this happens?

sure. I followed same principle than here : https://github.com/pybind/cmake_example/blob/master/setup.py

@gst
Copy link
Author

gst commented Feb 27, 2022

@jaraco you added "Needs repro" tag.. but I don't know what I could give more than the test and all the description/explanation I already gave here..

what kind of repro ?

@jaraco
Copy link
Member

jaraco commented Feb 27, 2022

sure. I followed same principle than here : https://github.com/pybind/cmake_example/blob/master/setup.py

Thanks for that link. From there, I can see that the setup.py is implementing a custom build_ext, so now it's more clear that the reason the operations are interleaved is because the technique is invoking cmake from within a setuptools build_ext operation, and cmake is invoking additional setuptools operations.

I'm tempted to say that the right thing to do here is to document that build_ext should build the project but shouldn't get involved in operations outside building the project at hand.

what kind of repro ?

Ideally, we'd have a reproducer that I could run to test the reported use case. What we have instead is a description of how one might be able to reproduce it if they wish to spend the time devising dependent projects and learning cmake and orchestrating the customized backends.

But I don't want to put too much emphasis on a reproducer. Now that I understand the problem, I'm still disinclined to support the use-case, considering that Setuptools would like to be out of the business of supporting and maintaining dependency management concerns, including issues like this that involve interleaved Setuptools installs to the same environment.

I'd much rather that someone who has this issue would find another way to avoid this interleaved behavior by instead orchestrating the build/install steps of each dependent project outside of Setuptools. I understand that it may be convenient to recursively install dependent projects inside of installing the depending project, but that approach just isn't supported and I'm reluctant to add more complexity to the implementation, especially complexity from an integrated (entangled) implementation as proposed.

Have you explored a different process that would orchestrate the installation of a project and its dependencies outside of a project's install? Are there reasons you couldn't instead build and install each project separately?

@jaraco jaraco added the Needs Discussion Issues where the implementation still needs to be discussed. label Feb 27, 2022
@gst
Copy link
Author

gst commented Feb 27, 2022

I understand that it may be convenient to recursively install dependent projects inside of installing the depending project, but that approach just isn't supported

That's what my proposed MR is proposing.. to support this approach / use-case :)

but that approach just isn't supported and I'm reluctant to add more complexity to the implementation, especially complexity from an integrated (entangled) implementation as proposed.

Can you please clarify what complexity you talk about ? I only refresh the file content before saving it basically.. I'm not sure how complex that is...

@jaraco
Copy link
Member

jaraco commented Feb 28, 2022

The PthDistribution class and its usage are already overly-complex. Even though PthDistribution is constructed just once, it's passed values from EasyInstall.all_site_dirs, which is derived from various sources. PthDistribution extends Environment, but it's not clear how much of Environment is relevant to the usage here. And as much as I try, the existing save/load functionality is difficult to simplify. I have plans to remove the dependence on pkg_resources, and I worry that adding more complexity to this implementation is only going to make it harder.

@gst
Copy link
Author

gst commented Feb 28, 2022

The PthDistribution class and its usage are already overly-complex.

ok but I was more asking what complexity I added.. ?

For the fact that PthDistributions extends/subclass Environment .. yes and ? I don't see how it's relevant to what I changed.. which actually only concerns the PthDistributions save() behavior basically.

And as much as I try, the existing save/load functionality is difficult to simplify

well the load is done in the __init__() method and the save in the save() one.. I don't see how that's not simple or complicated too here.

I have plans to remove the dependence on pkg_resources, and I worry that adding more complexity to this implementation is only going to make it harder.

again I don't see where it's that extra/more complex to basically refresh a file before overwritting it in order to play nicely/cooperatively with .. well with another instance of setuptools itself.. :/

@gst
Copy link
Author

gst commented Feb 28, 2022

Ideally, we'd have a reproducer that I could run to test the reported use case. What we have instead is a description of how one might be able to reproduce it if they wish to spend the time devising dependent projects and learning cmake and orchestrating the customized backends.

I could build a full reproducer and publish it. I cannot release/publish the projects on which I'm working 'cause it's for my employer obviously ;)

But that would take me quite some time. And it would involve/include cmake (I don't need to introduce pybind11 in the picture hopefully I think).

Basically that would be :

top_project:
    CMakeLists.txt
    pyproject.toml
    setup.py
    src/
    include/

child_project:
    CMakeLists.txt
    pyproject.toml
    setup.py
    src/
    include/

where the top_project/CMakeLists.txt thus includes the child project (actually via git clone in the build directory). AND also call pip install -e on the setup.py (of the child) when it's requested to basically (when building the python extension in editable mode).

if I would setup/build that repro would that help to make this requested change accepted eventually ?

@gst
Copy link
Author

gst commented Mar 1, 2022

otherwise I was thinking :

maybe we can make this "refresh & merge" in the save() method optional then.. and default it to disabled and so only opt-in for it explicitly ?

but from my pov it seems the right thing to do by default (tm).

@gst
Copy link
Author

gst commented May 18, 2023

ping

@jaraco
Copy link
Member

jaraco commented May 19, 2023

Releasing as v67.8.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Needs Discussion Issues where the implementation still needs to be discussed. Needs Repro Issues that need a reproducible example.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants