Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Packages shadowed by other modules are no longer importable under bzlmod #1174

Open
aschleck opened this issue Apr 19, 2023 · 9 comments
Open
Labels
type: bzlmod bzlmod work

Comments

@aschleck
Copy link

🐞 bug report

Affected Rule

The issue is caused by the rule: `py_binary`/`py_library` under bzlmod

Is this a regression?

Yes, this works under the WORKSPACE model. It is only bzlmod that is broken.

Description

When two Bazel workspaces (with one depending on the other) both define packages with the same name (common, core, helpers, private, util, ...) the outer one will shadow the inner one. Without bzlmod this has always been solvable by using absolute imports with the workspace name as the prefix. However, under bzlmod, these absolute imports no longer work: when a module is used as an outer repository the prefix is _main but when a module is used as a dependency repository it is <module name>. And even if that was fine, there are further problems caused by local_path_override because it suffixes ~override onto the folders in the runfiles directory.

From my perspective the best solution would be to make the generated py_binary wrapper script read the generated _repo_mapping file and set up the path such that absolute imports work the same way they've always worked. One way to make that happen would be to have the wrapper script create a temporary directory, create symlinks with the correct names pointing to the real folders, and then add the temporary directory to the PYTHONPATH. I could imagine there is probably some cleaner way I can't think of.

🔬 Minimal Reproduction

I put an example here: https://github.com/aschleck/bzlmod-shadow-python-problem (with another writeup in README.md.)

This repository illustrates three situations:

  • shadowed_with_workspace shows a workspace depending on another workspace where both define a package named core. In this case, the outer workspace imports dep absolutely but dep's import of itself fails due to being shadowed.
  • with_workspace illustrates the obvious fix: dep can always import itself with its workspace name as a prefix (like import dep.core.<whatever>. Another solution would be to use relative imports, but those get unwieldy and annoying in large repos.
  • with_bzlmod fails due to rules_python not respecting the _repo_mapping file for imports. The absolute import of dep from outer fails because the folder in the runfiles is actually dep~override since local_path_override was used. Even if it was named dep, the code in dep itself could no longer assume that import dep refers to itself (because when its rules are run directly it is _main, and so it would need to do import _main.core not import dep.core.)

🔥 Exception or Error

The error is what you'd expect: the import fails.


~/bzlmod-shadow-python-problem/with_bzlmod/outer $ bazel run //core
INFO: Analyzed target //core:core (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //core:core up-to-date:
  bazel-bin/core/core
INFO: Elapsed time: 0.212s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/core/core
imported outer.core.outer
Traceback (most recent call last):
  File "/home/april/.cache/bazel/_bazel_april/1d28658223745047b82756f6e8e88a66/execroot/_main/bazel-out/k8-fastbuild/bin/core/core.runfiles/_main/core/outer.py", line 3, in 
    import dep.core.interface
ModuleNotFoundError: No module named 'dep'

🌍 Your Environment

Operating System:

  
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.2 LTS"
  

Output of bazel version:

  
Bazelisk version: v1.14.0
Build label: 6.1.1
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Wed Mar 15 15:44:56 2023 (1678895096)
Build timestamp: 1678895096
Build timestamp as int: 1678895096
  

Rules_python version:

  
0.20.0
  

Anything else relevant?

Thank you for maintaining these rules! It's been amazing to see how well this all works on the whole.

@chrislovecnm
Copy link
Collaborator

create symlinks with the correct names pointing to the real folders

I have not tried, but I have had many problems with symlinks and Windows. So we may not want to go down this path.

@fmeum
Copy link
Contributor

fmeum commented Apr 27, 2023

Directory symlinks aka junctions have been available on Windows for a long time and are relied upon by Bazel quite heavily. They could work here.

A simple "solution" that is recommended for Bazel-aware Python packages is to move everything into a top-level directory with the desired name. Since repo roots are on the Python path, this should allow existing absolute imports to continue to work.

@chrislovecnm chrislovecnm added the type: bzlmod bzlmod work label May 17, 2023
@Wyverald
Copy link
Member

@rickeylev in case you're not aware of this. We discussed this problem around a year ago I think, and I think your recommendation was to set up a PyPI-esque import dir merging all needed repos (and forgoing repo names altogether). Could that be adapted to help in @aschleck's case? Is this process documented?

cc @meteorcloudy

@rickeylev
Copy link
Contributor

Not having the repo name in the import name is ideal, yes.

That said, requiring that to switch to bzlmod is a rather big hurdle.

Having the repo mapping available during analysis would open a few more options.

Hmmm. There might be another way out, too. Not 100% sure on this.

Basically, a user creates a directory with the desired import name. In that dir, they create an init.py file. They set the imports attribute of the target to .. (they can nest an additional directory to avoid additional pollution). Within the init, there's a few options. It could add to __path__, which will make imports of submodules search those added paths. Another option is to manually load and exec the original higher level file and replace itself in sys.modules. This has the advantage of avoiding a temp dir with symlinks. It is also something a library can do, so doesn't require a binary to do anything special.

Symlinks would also work. They'll require additional changes to the startup script, plus cleanup logic, though a user's main could also be written to do the equiv before running the real main module code.

@rickeylev rickeylev added this to the Bzlmod GA milestone Aug 1, 2023
Copy link

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

@github-actions github-actions bot added the Can Close? Will close in 30 days if there is no new activity label Jan 28, 2024
@aschleck
Copy link
Author

I no longer work at the company where this matters but it still is a thing

@github-actions github-actions bot removed the Can Close? Will close in 30 days if there is no new activity label Jan 29, 2024
Copy link

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

@github-actions github-actions bot added the Can Close? Will close in 30 days if there is no new activity label Jul 28, 2024
@aschleck
Copy link
Author

Still a thing as far as I know

@github-actions github-actions bot removed the Can Close? Will close in 30 days if there is no new activity label Jul 29, 2024
@rickeylev
Copy link
Contributor

Another idea: add something to the end of sys.path as a last-chance way to handle these. By adding to the end, we can minimize interfering with other import dirs. Could generate a dir with shim files (using __path__), or an import hook.

Also, in case it wasn't clear: the main thing preventing progress on this issue is available time. There's several ideas, but someone has to try something to see how well it works or doesn't work.

@aignas aignas removed this from the Bzlmod GA milestone Oct 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bzlmod bzlmod work
Projects
None yet
Development

No branches or pull requests

6 participants