Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix for conda install of selenium-manager #12536

Merged

Conversation

stevetracvc
Copy link
Contributor

Description

conda doesn't seem to properly package selenium-manager, so it needs to be install as a separate package (via conda). But this puts it in the environment's bin folder.

This commit checks the path for the selenium-manager executable if it isn't installed in the package's webdriver/common// folder.

Also, I tried running the existing tests but all of the remote tests failed, and two of the browser tests failed for trunk (only one failed for my branch).

Motivation and Context

fixes #11234 and #12084

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • I have read the contributing document.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@CLAassistant
Copy link

CLAassistant commented Aug 11, 2023

CLA assistant check
All committers have signed the CLA.

@titusfortner
Copy link
Member

I think looking everywhere on PATH for a selenium-manager binary is sub-optimal.
I don't know how conda is managing this. We're looking at caching the binary in this directory and using what is there if it matches the minor version.

f"{Path.home()}/.cache/selenium/manager/0.4.{minor_version}/"

Would it be possible to move the binary to that directory from where conda is putting it?

@stevetracvc
Copy link
Contributor Author

I don't really know anything about how conda handles stuff in the background. And I guess there's an issue if conda activate <env> isn't actually used (since that's what adds stuff to the path). But at least this is out there for other people if they need a temporary fix. I could even add to the exception a message about trying to install selenium-manager.

But I think forcing people to manually move something is just as bad, or possibly even worse, than searching the path. How would you even alert people that they need to do this? I think a better route would be to remove the newer packages from conda instead, since this is such a mess. It took me hours to finally find something useful about why my code was no longer working (in a new conda environment, which I guess grabbed a newer version of selenium than my other environments). If conda only has the older versions, then this won't be an issue.

Also, why do you dislike searching the path? Does it really matter whether selenium-manager was installed via conda or pip or manually?

@titusfortner
Copy link
Member

Agree about the manual moving not being a valid fix, I guess I was thinking there might be a way for the package to automate that, but I'm sure that's not how it works. As for PATH, thought is if someone downloads an old version and puts it somewhere on PATH it could cause a problem. If we did that, I'd want to be checking the version just to be sure, which seems a pain ☹️
Is there a way to know that conda is being used? Maybe we can look in the specific directory it puts it in?

@codecov-commenter
Copy link

codecov-commenter commented Aug 11, 2023

Codecov Report

Patch coverage: 80.00% and project coverage change: +0.03% 🎉

Comparison is base (a4beba5) 57.26% compared to head (54485db) 57.29%.
Report is 15 commits behind head on trunk.

❗ Current head 54485db differs from pull request most recent head 83c2eb7. Consider uploading reports for the commit 83c2eb7 to get more accurate results

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Additional details and impacted files
@@            Coverage Diff             @@
##            trunk   #12536      +/-   ##
==========================================
+ Coverage   57.26%   57.29%   +0.03%     
==========================================
  Files          86       86              
  Lines        5335     5339       +4     
  Branches      198      198              
==========================================
+ Hits         3055     3059       +4     
  Misses       2082     2082              
  Partials      198      198              
Files Changed Coverage Δ
py/selenium/webdriver/ie/options.py 47.50% <0.00%> (ø)
py/selenium/webdriver/common/selenium_manager.py 60.00% <100.00%> (+2.25%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@stevetracvc
Copy link
Contributor Author

stevetracvc commented Aug 11, 2023

os.environ['CONDA_PREFIX'] should provide the environment folder, so then

os.path.join(os.environ['CONDA_PREFIX'], 'bin', file)

SHOULD work, for now, but it seems like a bit of a hack.

I can change it to that if you like it better.

As for having conda change how it's installed, I don't know who makes those scripts or how to get them changed.

@titusfortner
Copy link
Member

@symonk any preferences for this?

@titusfortner
Copy link
Member

Yes, I want us to log that we've detected a Conda installation and look in a specific place.
I think trying to debug a problem where selenium manager is anywhere on PATH is going to be a pain. If we need to do something more, we can easily change later.

@titusfortner
Copy link
Member

@stevetracvc can you update this PR for that? I'd appreciate it. Want to try to get out a release with bug fixes for all the SM stuff "soon."

@stevetracvc
Copy link
Contributor Author

sorry, I tried to rebase but it didn't work for some reason so I tried from github and that merged upstream instead

conda doesn't seem to properly package selenium-manager, so it needs
to be install as a separate package (via conda). But this puts it in
the environment's bin folder.

This commit checks the path for the selenium-manager executable if
it isn't installed in the package's webdriver/common/<platform>/
folder.

fixes SeleniumHQ#11234 and SeleniumHQ#12084
@stevetracvc stevetracvc force-pushed the fix/selenium-manager_conda_install branch from 28c7687 to 7de1fba Compare August 14, 2023 21:08
@stevetracvc
Copy link
Contributor Author

Sorry, I'm not a developer so I don't rebase much. I think I fixed it 😆

@titusfortner
Copy link
Member

Don't worry about the rebasing, Github can manage it; if there's a conflict you need to fix, we can let you know. :)

@isaulv
Copy link
Contributor

isaulv commented Aug 15, 2023

This is a useful thing to support Conda users. Many conda users use Selenium to scrape data from the web or automate other tasks.

@titusfortner titusfortner merged commit d4285d1 into SeleniumHQ:trunk Aug 15, 2023
@titusfortner
Copy link
Member

@stevetracvc thank you for this!

@stevetracvc
Copy link
Contributor Author

No prob @titusfortner

Though one thing I thought of this morning, 'selenium-manager' package should either be listed as a requirement on conda, or there needs to be additional info on https://anaconda.org/conda-forge/selenium stating that selenium-manager needs to be installed separately.

I'm submitting an issue over at https://github.com/conda-forge/selenium-feedstock/ as I think that's where they do the builds

@bollwyvl
Copy link

Hey folks! 👋 git blame would reveal I'm the nerd that made the selenium-manager package for conda-forge and put it in bin on behalf of @conda-forge/selenium, the maintainers of the "feedstocks" that build both packages for conda-forge.

While a former Anaconda, Inc. employee, I have no particular sway on what happens on anaconda.com (the defaults channel) or the anaconda.org/anaconda channel, but as there is a semi-Fedora/RedHat relationship between the two (e.g. they pay for anaconda.org/conda-forge's hefty CDN bills) often package names will eventually line up between the two channels, though their compatibility is not ensured in any meaningful way, as conda-forge is "self-hosted" in the sense that it only builds packages with packages from conda-forge. Any concerns about Anaconda, Inc., the Anaconda Distribution, etc. should be directed to their support channels.

Back to conda-forge: the motivation for selenium-manager as a separate package: we typically avoid redistributing pre-built binaries, as they often don't:

  • work for all of the platforms supported by conda-forge (e.g. aarch64, ppc64le)
  • include all of the licenses for the packages statically compiled into a binary
    • which is entirely unavoidable with rust and go

With the above said, we also try to minimize the number of builds/tests of, specifically, python packages, and the amount of "special" things done at installation time:

  • selenium-manager-feedstock is a non-python package, so we "only" have to build once for each of the os x arch tuples
    • "only" in the sense that it's not os x arch x python
    • the resulting binary has to go somewhere
      • the standard $PATH (and %PATH%) locations seemed like a reasonable place for a standalone binary
  • selenium-feedstock has historically been a noarch: python and gets built exactly once from the PyPI sdist (when possible)
    • unlike a .whl, it is packaged without .pyc files and is installed to the expected location based on the installed python e.g. $PREFIX/lib/python-3.11

The expected location in $PREFIX/lib/python-3.{py-major}/site-packages/.../selenium-manager(.exe) is therefore problematic for us, so we patch to look in $PATH, as of 4.11.2_*_1, but we haven't pulled any older packages, as workarounds are presently still possible.

We're certainly open to other approaches that meet the above pattern, and can mark the _1 build as broken, which we don't like doing, when possible.

Some concrete suggestions:

  • setuptools' data_files
    • deliver selenium-manager to a "well-known" location, e.g. $PREFIX/opt/selenium-manager/{os}/selenium-manager(.exe)
    • we would still patch that out in favor of a dependency on selenium-manager, but then there would be no discrepancy between a .whl-installed package vs a .conda-installed package.
  • do it The Rust Way, with e.g. maturin or setuptools-rust
    • of course, this would make us (and probably you) sad due to the above-mentioned cartesian explosion, but at least would be consistent
  • respect a "well-known" environment variable, e.g. SE_MANAGER_EXE
    • while we don't like to do it, we can push environment variables onto a stack during the activation of a conda environment, and clean up on deactivation
      • this approach gets used for e.g. openjdk, etc. to ensure a java-based package finds the right JAVA_* environment variables

@stevetracvc
Copy link
Contributor Author

@bollwyvl thanks for coming over here. I probably should have reached out to conda-forge earlier, but I honestly didn't know about conda feedstock until about an hour ago.

In regards to your using the PATH, we had a discussion and they're concerned about out-of-sync versions between selenium and selenium-manager. Since conda-forge selenium-manager installs to the environment's bin folder, is there a reason you prefer searching the path instead of using os.environ["CONDA_PREFIX"]?

@titusfortner
Copy link
Member

Thanks @bollwyvl

Caveat: I'm not a python dev and don't really understand conda.

note that ring won't compile presently

There aren't drivers for those architectures, so selenium itself won't work without manual effort by the user to compile their own things. Can you make inclusion conditional on supported architectures?

include all of the licenses

Not sure how this works. Does it help if the license is identical to the Selenium package itself?

The expected location in ... is therefore problematic for us

Would it be easier or harder for conda to use $HOME/.cache/selenium/manager/<version>/selenium-manager?
We're planning to look there first in a future version:
#11359

Further, the location of $HOME/.cache will be adjustable by env SE_CACHE_PATH

we haven't pulled any older packages

fwiw, we still consider this a beta feature and do not guarantee backwards compatible API, so on our end we're not worried about how older versions are supported. /shrug

I'm hoping that what we have with this PR is a good short term solution and that we can coordinate a better long term solution when we implement the caching.

@bollwyvl
Copy link

understand conda.

It is more akin to a Linux distribution package manager: basically, a (linux) machine needs a compatible glibc (gotta stop somewhere) and conda provides everything else in isolated (little "i"), reproducible (pretty little "r") environments which can make use of common files via hard links into a package cache. This approach works pretty well on the above-mentioned architectures for linux, osx, and windows, which makes it attractive for organizations that span a number of environments, and is used fairly widely in certain communities.

Can you make inclusion conditional on supported architectures?

Yep, we can ship a variant (but only two total builds) that does not require selenium-manager which would always be "trumped" by one that does, if it can be installed.

include all of the licenses

Not sure how this works.

After a cargo install downloads half the internet, we use cargo-bundle-licenses to get the full text of all of the licenses of the packages statically compiled into the binary. These, along with the first-party license of the binary-being-packaged, are distributed along with the binary in a predictable (and auditable) location. While the terms of every license does not demand this, doing it for all of them saves a huge amount of manual labor of checking which ones that do.

use $HOME

When at all possible, conda-forge packages will never change anything outside of their installed $PREFIX and the package cache. Some very small exceptions are made for e.g. desktop shortcuts, which cannot be done any other way.

So if cache dirs became the only way to do this, we'd probably patch it to $PREFIX/var/cache/selenium..., and attempt to fulfill (and test) the cache contract with an install-time dependency, rather than a runtime download.

At any rate, if looking at doing this, I highly recommend using something like platformdirs which implements the XDG spec, rather than home-rolling some new scheme.

implement the caching.

In my ideal world, the browser drivers, would, like geckodriver, maintain sufficient compatibility windows that allowed a package manager to handle this dance, up to an entire, offline-installable-and-runnable environment including browsers. Indeed, I've done workshops using precisely this approach, where I got a room full of folks on laptops of the big three operating systems (and many subflavors and versions) with various levels of locked-downed-ness and software-updatedness up to interactively running their first selenium tests in under 20 minutes after a ~500mb download/USB stick swap.

Practically, of course, this is not the case, especially for the spawn of Chromium. But generally, I really don't like runtime downloading, and will go out of my way to patch around it when it's remotely (har) possible.

@titusfortner
Copy link
Member

After a cargo install downloads half the internet

🤣

recommend using something like platformdirs

Our solution needs to be implemented in 5 different languages, and we're trying to keep things as simple as possible across the implementations.

rather than home-rolling some new scheme

We're using this scheme because it is the XDG specification for user-specific data:

There is a single base directory relative to which user-specific non-essential (cached) data should be written. This directory is defined by the environment variable $XDG_CACHE_HOME.
....
If $XDG_CACHE_HOME is either not set or empty, a default equal to $HOME/.cache should be used.

We can't assume that a Selenium user has elevated permissions on the platform.

I really don't like runtime downloading

There's a reason we resisted it for a long time. But our "competitors" don't require extra management when starting to use the project, and Intro to Selenium workshops were a nightmare to get started. Somehow we have to provide an equivalent out of the box experience and this is the best approach we've come up with.

especially for the spawn of Chromium

I'm assuming you've seen the new Chrome for Testing binaries that make managing the versions much more straightforward?

@bollwyvl
Copy link

5 different languages

Oh indeed: on conda-forge we ship.... a lot more than five language runtimes (and their build/test chains) and the sundry packages (and their build/test chains) people would want to use. This is why we try to rely on near-ubiquitous things like $PATH (or other well-known paths and environment variables) when stuck with cross-language-runtime binary discovery.

as simple as possible across the implementations.

it is the XDG specification

We can't assume that a Selenium user has elevated permissions on the platform.

Right, conda doesn't either, and in fact discourages folks from using a privileged account for any conda-related commands. But honoring the various env vars means it can be handled by a non-privileged install, and using an existing implementation, in a given language target, keeps the complexity managed, at the expense of a novel dependency.

Intro to Selenium workshops were a nightmare to get started

Right, see above for being able to build reproducible, non-privileged installers.

Chrome for Testing

Yep, broke a bunch of pipelines, and requires yet another language runtime to be used in the documented way, and downloads a bunch of random binaries at runtime. This is large reason I personally have only taken the time to help maintain geckodriver and the selenium* stack for conda-forge.

@titusfortner
Copy link
Member

I feel like we're speaking past each other on some of these things. Are we agreed that we can figure out a more sustainable solution for selenium manager in conda if we cache the binary?

You seemed to have suggested that $HOME/.cache is a "home rolled" solution instead of what I understand the spec to recommend as the place to put things. Or is there a problem with the subdirectory scheme, or are there environment variables we need to be respecting? I can't tell what your actual recommendation is for us here.

You recommended platformdirs, but that's a python library, and I don't understand how we would use it to write the code for Ruby / .NET / Java.

see above for being able to build reproducible, non-privileged installers

Sure, except that requires users to adapt to your solution rather than adapting a solution to the users. We're open to exploring alternatives here, but I think our requirements/limitations are different, and I can't tell what exactly we might want to change that works better for both of us.

@titusfortner
Copy link
Member

respect a "well-known" environment variable, e.g. SE_MANAGER_EXE

@bollwyvl to follow up, we ended up deciding that your third suggestion was the right way for us to go. Selenium 4.13+ will first look for a binary at SE_MANAGER_PATH. Thank you.

@cosing
Copy link

cosing commented Oct 19, 2023

I have some doubts about the 'bin' here, and I'm not sure what I changed, but when I downloaded it, selenium-manager was in another path.

(math) C:\Users\cosing>where selenium-manager
D:\program\conda\envs\math\Scripts\selenium-manager.exe"

Simply changing 'bin' to 'Scripts' makes it work fine on my term.

mweinelt added a commit to mweinelt/selenium that referenced this pull request Oct 30, 2023
Accessing CONDA_PREFIX in os.environ fails with a KeyError when it is
not set. The condition should instead check for the existence of the
key, before accessing it in the following codepath

Fixes: d4285d1 ("fix for conda install of selenium-manager (SeleniumHQ#12536)")
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[🐛 Bug]: Selenium manager not found via Anaconda distribution
8 participants