Improve Python support to adhere to Python versions #3671

Closed
fviernau opened this issue Feb 23, 2021 · 18 comments
Labels
analyzer (About the analyzer tool), enhancement (Issues that are considered to be enhancements)

Comments

@fviernau
Member

For Python projects it is hard (impossible) to determine to which Python version they apply.
ORT has some mechanism which tries (1) Python 2.x and (2) Python 3. In particular, for the latter Python 3.6 is used.

When installing dependencies via pip install -r requirements.txt from pypi.org, pip considers only the dependencies which are compatible with the used Python version, 3.6.

For example, in a Python 3.7 project this can lead to the following two issues:

  1. A dependency cannot be resolved by ORT although it is actually resolvable using the appropriate Python version. This
    happens if none of the versions allowed by the constraints is compatible with Python 3.6.
  2. Even worse, the wrong dependency version is resolved. This happens e.g. if the newest Python 3.6 compatible version allowed by
    the constraints does not equal the newest Python 3.7 compatible version. In the worst case, the project uses the
    latest version of a dependency while ORT resolves a really old (Python 3.6 compatible) version of that dependency.
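
Both issues can be reproduced with a small model of how a resolver filters releases by interpreter version. The following is only a sketch: the release table and the helper `newest_compatible` are hypothetical illustrations, not pip's or ORT's actual code.

```python
# Hypothetical release table: (package version, minimum supported Python version).
RELEASES = [
    ((1, 19, 5), (3, 6)),
    ((1, 20, 3), (3, 7)),
    ((1, 21, 0), (3, 7)),
]


def newest_compatible(releases, python_version, allowed=lambda version: True):
    """Return the newest release usable on python_version, or None if there is none."""
    candidates = [
        version
        for version, min_python in releases
        if python_version >= min_python and allowed(version)
    ]
    return max(candidates, default=None)


# Issue 2: the same unpinned requirement resolves differently per interpreter.
print(newest_compatible(RELEASES, (3, 6)))  # (1, 19, 5)
print(newest_compatible(RELEASES, (3, 7)))  # (1, 21, 0)

# Issue 1: a constraint like ">=1.20" cannot be satisfied at all on Python 3.6.
print(newest_compatible(RELEASES, (3, 6), lambda version: version >= (1, 20)))  # None
```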

sschuberth added the analyzer and enhancement labels Feb 23, 2021
@sschuberth
Member

At least for setup.py-based projects, there might be hints which Python version to use. For example, the classifiers may contain something like

classifiers=[
    "Programming Language :: Python",
    "Programming Language :: Python :: 3.6",
    "Programming Language :: Python :: 3.7",
    "Programming Language :: Python :: 3.8",
],

This at least tells you the minimum version to use. Similarly, a line like

python_requires=">=3.6",

might be contained. However, both types of entries are a bit inconvenient to parse.
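
As a rough illustration of what extracting these hints involves once the strings are available, the sketch below pulls version numbers out of classifiers and a lower bound out of `python_requires`. The helper names are hypothetical, and the `python_requires` handling deliberately covers only simple `>=X.Y` clauses, not the full specifier syntax.

```python
import re

CLASSIFIER_RE = re.compile(r"^Programming Language :: Python :: (\d+)\.(\d+)$")
LOWER_BOUND_RE = re.compile(r">=\s*(\d+)\.(\d+)")


def python_versions_from_classifiers(classifiers):
    """Collect the (major, minor) versions named by trove classifiers."""
    versions = []
    for classifier in classifiers:
        match = CLASSIFIER_RE.match(classifier.strip())
        if match:
            versions.append((int(match.group(1)), int(match.group(2))))
    return sorted(versions)


def lower_bound_from_python_requires(spec):
    """Return the first `>=X.Y` bound in spec, or None (other operators are ignored)."""
    match = LOWER_BOUND_RE.search(spec)
    return (int(match.group(1)), int(match.group(2))) if match else None


classifiers = [
    "Programming Language :: Python",
    "Programming Language :: Python :: 3.6",
    "Programming Language :: Python :: 3.7",
    "Programming Language :: Python :: 3.8",
]
print(python_versions_from_classifiers(classifiers)[0])  # (3, 6)
print(lower_bound_from_python_requires(">=3.6"))         # (3, 6)
```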

But even if the Python version to use were known, another problem is that you actually need that version to be installed. At least for our Dockerfile it's impractical to install all sorts of Python versions beforehand, just in case they might be needed. Maybe bootstrapping the right Python version on the fly would be an option, but doing that right for all supported platforms is a lot of effort.

A completely different idea that I have in mind is to generally switch over to using GraalVM for ORT, and leverage its Python support to run pip programmatically, and somehow "pretending" to be the right Python version.

@sschuberth
Member

Ping @pombredanne, being the Python guy here, do you have any good idea how to solve this problem?

@pombredanne
Contributor

@sschuberth a way is to rely on python_requires=">=3.6" in modern setup.cfg or setup.py. Classifiers are weak information otherwise.

If you want to go the dynamic pip route, pip would in any case resolve based on its current interpreter version and not some other version of Python. Note that pip now uses this bundled resolver https://github.com/sarugaku/resolvelib which you could also invoke directly and be in control of which OS / Python version / Python interpreter / architecture combo you are emulating.

With that said there is no absolute right answer as each of these combos may resolve to different deps versions and different packages, unless you work from a lockfile, as each dep may be tagged like here

So to recap: each combo of

  • Python interpreter (CPython, Pypy)
  • Python version (3.6 to 3.9 at least today)
  • OS (Windows, Linux, Mac, etc)
  • Architecture (x86, x86_64, ARM, etc)
    ... may yield a different set of:
  • built/bundled dependencies (included in a built "wheel")
  • external dependencies (resolved and fetched from PyPI)
    ... which furthermore can change over time, unless pinned/locked down.

That said, this is nothing really specific to Python AFAIK, just made more visible here.
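
To make the four dimensions of such a combo concrete, here is a minimal sketch that reports them for the interpreter it runs on, using only the standard library. The function name and dictionary keys are hypothetical; a resolver emulating another target would substitute different values.

```python
import platform
import sys


def current_environment():
    """Describe the interpreter/version/OS/architecture combo of this process."""
    return {
        "implementation": platform.python_implementation(),  # e.g. CPython, PyPy
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        "os": platform.system(),                              # e.g. Linux, Windows
        "architecture": platform.machine(),                   # e.g. x86_64
    }


print(current_environment())
```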

However, both types of entries are a bit inconvenient to parse.

ScanCode knows how to parse these alright AFAIK.

@sschuberth
Member

Note that pip now uses this bundled resolver https://github.com/sarugaku/resolvelib

Does pip really use that library as-is? I thought I read somewhere that it's rather the other way around, with resolvelib being a stand-alone reimplementation of pip's (new) resolver mechanism.

However, both types of entries are a bit inconvenient to parse.

ScanCode knows how to parse these alright AFAIK.

Sure, because ScanCode is written in Python itself 😋

@pombredanne
Contributor

Does pip really use that library as-is?

See https://github.com/pypa/pip/tree/0ffff034f376feef189cf32cfba56ddd3a472c70/src/pip/_vendor/resolvelib

The resolvelib resolver library is "vendored" as-is in pip from what I can see (and I reckon I did study the code ;) ). FWIW, the vendoring is because pip cannot have external dependencies itself and needs things bundled to be able to bootstrap standalone.

Sure, because ScanCode is written in Python itself

Actually that's not really an advantage when we do static analysis. There is of course a clear and obvious advantage to use Python overall ;) but in the case of Python manifests proper, setup.cfg is an .ini format, and except for setup.py that needs some code lexing and finicking, (as does Ruby and as would Groovy and Kotlin need it) most everything else is either RFC822-, Toml- or JSON-formatted, so reasonably easy to get to.
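
For the setup.cfg case, the .ini structure means the standard library's `configparser` is enough to read the relevant fields. This is a sketch under the assumption that the file follows the usual setuptools layout (`python_requires` under `[options]`, `classifiers` under `[metadata]`); the sample content and the helper name are made up for illustration.

```python
import configparser

SETUP_CFG = """\
[metadata]
name = example
classifiers =
    Programming Language :: Python :: 3.6
    Programming Language :: Python :: 3.7

[options]
python_requires = >=3.6
"""


def read_python_hints(text):
    """Return (python_requires, classifiers) from setup.cfg content."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    python_requires = parser.get("options", "python_requires", fallback=None)
    raw_classifiers = parser.get("metadata", "classifiers", fallback="")
    # Multi-line values arrive as one string; split and drop empty lines.
    classifiers = [line.strip() for line in raw_classifiers.splitlines() if line.strip()]
    return python_requires, classifiers


print(read_python_hints(SETUP_CFG))
```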

@sschuberth
Member

sschuberth commented Feb 25, 2021

Actually that's not really an advantage when we do static analysis.

Except that you can more easily use the same Python libraries / functions as pip itself when parsing setup.py.

There is of course a clear and obvious advantage to use Python overall ;)

You came to the wrong place for that statement 😄 When we meet in person again, I'll explain to you why I basically like Python's syntax, but its ecosystem (and esp. dependency management) just sucks. I mean, there obviously are not even means to just query the transitive dependency tree without actually downloading all the binaries (that you don't need nor are interested in) and pretending to do a build (which is not what you want to do). But let's continue that discussion elsewhere 😉

@deeplook

Any progress on this?

@deeplook

deeplook commented May 10, 2021

Why is this "inconvenient to parse"?

classifiers=[
    "Programming Language :: Python",
    "Programming Language :: Python :: 3.6",
    "Programming Language :: Python :: 3.7",
    "Programming Language :: Python :: 3.8",
]

There is always the nice ast package you can use without executing the parsed code...
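
A minimal sketch of that idea: parse setup.py with `ast`, find the `setup(...)` call, and evaluate only the keyword arguments that are plain literals. The sample source and the helper name are hypothetical, and values computed at runtime (variables, function calls) are simply skipped, which is the main limitation of this static approach.

```python
import ast

SETUP_PY = '''
from setuptools import setup

setup(
    name="example",
    python_requires=">=3.6",
    classifiers=[
        "Programming Language :: Python",
        "Programming Language :: Python :: 3.6",
    ],
    version=get_version(),
)
'''


def literal_setup_kwargs(source):
    """Return the literal keyword arguments of the first setup(...) call found."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Matches both `setup(...)` and `setuptools.setup(...)`.
        name = getattr(node.func, "id", None) or getattr(node.func, "attr", None)
        if name != "setup":
            continue
        kwargs = {}
        for keyword in node.keywords:
            if keyword.arg is None:  # a **kwargs expansion
                continue
            try:
                kwargs[keyword.arg] = ast.literal_eval(keyword.value)
            except (ValueError, TypeError):
                pass  # not a literal; cannot be recovered without executing code
        return kwargs
    return {}


hints = literal_setup_kwargs(SETUP_PY)
print(hints["python_requires"])  # >=3.6
```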

sschuberth added a commit that referenced this issue Jun 10, 2021
Also see issues #3671 and #3873.

Signed-off-by: Sebastian Schuberth <sebastian.schuberth@bosch.io>
@nicorikken
Member

I ran into this issue with the analyzer which was running python 3.6, but the project was developed for a newer python version. It had requirements like numpy~=1.20.3 which at least requires python 3.7:

07:54:22.146 [DefaultDispatcher-worker-3] INFO  org.ossreviewtoolkit.utils.ProcessCapture - Running 'virtualenv --version' in '/'...
07:54:22.202 [DefaultDispatcher-worker-3] INFO  org.ossreviewtoolkit.analyzer.managers.Pip - Resolving PIP dependencies for '/project/requirements.txt'...
07:54:22.203 [DefaultDispatcher-worker-3] INFO  org.ossreviewtoolkit.analyzer.managers.Pip - Creating a virtualenv for the 'project' project directory...
07:54:22.206 [DefaultDispatcher-worker-3] INFO  org.ossreviewtoolkit.utils.ProcessCapture - Running 'python3 /tmp/python_compatibility9712707363750178.py -d /project' in '/'...
07:54:29.849 [DefaultDispatcher-worker-3] INFO  org.ossreviewtoolkit.analyzer.managers.Pip - Trying to install dependencies using Python 3...
07:54:29.852 [DefaultDispatcher-worker-3] INFO  org.ossreviewtoolkit.utils.ProcessCapture - Running 'virtualenv /tmp/ort-project-virtualenv122357601095579656 -p /usr/bin/python3' in '/project'...
07:54:39.758 [DefaultDispatcher-worker-3] INFO  org.ossreviewtoolkit.utils.ProcessCapture - Running '/tmp/ort-project-virtualenv122357601095579656/bin/pip --trusted-host pypi.org --trusted-host pypi.python.org install pip==18.0' in '/project'...
07:54:41.992 [DefaultDispatcher-worker-3] INFO  org.ossreviewtoolkit.utils.ProcessCapture - Running '/tmp/ort-project-virtualenv122357601095579656/bin/pip --trusted-host pypi.org --trusted-host pypi.python.org install pipdeptree==0.13.2' in '/project'...
07:54:42.769 [DefaultDispatcher-worker-3] INFO  org.ossreviewtoolkit.utils.ProcessCapture - Running '/tmp/ort-project-virtualenv122357601095579656/bin/pip --trusted-host pypi.org --trusted-host pypi.python.org install --no-warn-conflicts --prefer-binary -r requirements.txt' in '/project'...
07:54:45.662 [DefaultDispatcher-worker-3] ERROR org.ossreviewtoolkit.utils.ProcessCapture - Running '/tmp/ort-project-virtualenv122357601095579656/bin/pip --trusted-host pypi.org --trusted-host pypi.python.org install --no-warn-conflicts --prefer-binary -r requirements.txt' in '/project' failed with exit code 1:
  Could not find a version that satisfies the requirement numpy~=1.20.3 (from -r requirements.txt (line 13)) (from versions: 1.3.0, 1.4.1, 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.6.2, 1.7.0, 1.7.1, 1.7.2, 1.8.0, 1.8.1, 1.8.2, 1.9.0, 1.9.1, 1.9.2, 1.9.3, 1.10.0.post2, 1.10.1, 1.10.2, 1.10.4, 1.11.0, 1.11.1, 1.11.2, 1.11.3, 1.12.0, 1.12.1, 1.13.0rc1, 1.13.0rc2, 1.13.0, 1.13.1, 1.13.3, 1.14.0rc1, 1.14.0, 1.14.1, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6, 1.15.0rc1, 1.15.0rc2, 1.15.0, 1.15.1, 1.15.2, 1.15.3, 1.15.4, 1.16.0rc1, 1.16.0rc2, 1.16.0, 1.16.1, 1.16.2, 1.16.3, 1.16.4, 1.16.5, 1.16.6, 1.17.0rc1, 1.17.0rc2, 1.17.0, 1.17.1, 1.17.2, 1.17.3, 1.17.4, 1.17.5, 1.18.0rc1, 1.18.0, 1.18.1, 1.18.2, 1.18.3, 1.18.4, 1.18.5, 1.19.0rc1, 1.19.0rc2, 1.19.0, 1.19.1, 1.19.2, 1.19.3, 1.19.4, 1.19.5)
No matching distribution found for numpy~=1.20.3 (from -r requirements.txt (line 13))
You are using pip version 18.0, however version 21.1.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

07:54:45.663 [DefaultDispatcher-worker-3] INFO  org.ossreviewtoolkit.analyzer.managers.Pip - Falling back to trying to install dependencies using Python 2...
07:54:45.665 [DefaultDispatcher-worker-3] INFO  org.ossreviewtoolkit.utils.ProcessCapture - Running 'virtualenv /tmp/ort-project-virtualenv1856848556700345242 -p /usr/bin/python2' in '/project'...
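
The pip error in the log comes down to the compatible-release operator: `~=1.20.3` means "at least 1.20.3, and still within 1.20.*", while the newest version offered to the Python 3.6 interpreter was 1.19.5. The sketch below implements that check for plain numeric versions only (no pre-releases or post-releases); the helper names are hypothetical.

```python
def parse(version):
    """Turn a plain numeric version like '1.20.3' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def matches_compatible_release(candidate, base):
    """Check `candidate ~= base`: candidate >= base with the same prefix minus the last part."""
    c, b = parse(candidate), parse(base)
    return c >= b and c[: len(b) - 1] == b[:-1]


offered = ["1.19.3", "1.19.4", "1.19.5"]  # tail of the version list in the log
print([v for v in offered if matches_compatible_release(v, "1.20.3")])  # []
print(matches_compatible_release("1.20.5", "1.20.3"))  # True
print(matches_compatible_release("1.21.0", "1.20.3"))  # False
```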

Considering that the analyzer is creating a virtual environment, it would be nice if it could select a Python version.
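
One way this could look, sketched under the assumption that the desired interpreter is already installed and on the PATH as `pythonX.Y`: build the same `virtualenv <dir> -p <interpreter>` command seen in the log, but with a chosen version. The function name and its `which` parameter are hypothetical and not part of ORT.

```python
import shutil


def virtualenv_command(env_dir, python_version, which=shutil.which):
    """Build a virtualenv command for a specific interpreter, if it is installed."""
    interpreter = which("python{}".format(python_version))
    if interpreter is None:
        raise LookupError("no python{} found on PATH".format(python_version))
    return ["virtualenv", env_dir, "-p", interpreter]


# A fake lookup keeps the example independent of what is installed locally.
print(virtualenv_command("/tmp/env", "3.7", which=lambda name: "/usr/bin/" + name))
# ['virtualenv', '/tmp/env', '-p', '/usr/bin/python3.7']
```

The resulting list could be passed to `subprocess.run`; the command is only constructed here, not executed.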

For my case I'll try running the analyzer with a newer Python version and running the Scanner with Python 3.6 as is required by Scancode toolkit. This will make it difficult to have an all-in-one Docker image though.

@pombredanne
Contributor

For my case I'll try running the analyzer with a newer Python version and running the Scanner with Python 3.6 as is required by Scancode toolkit. This will make it difficult to have an all-in-one Docker image though.

The current version of ScanCode TK supports all Python versions from 3.6 to 3.9, so I am not sure what the issue is. We release the app bundle with support for Python 3.6 only for now; maybe that's what is used here?

@nicorikken
Member

To help the discussion, I looked into the available Python versions in Ubuntu, as that is currently used in the Dockerfile:
[Image: available Python versions per Ubuntu release]

For example the information on the default Python3 package in Focal can be found on the package page

Considering that not all Python3.6+ versions are available in Bionic, perhaps we can create multiple Docker image variants for the analyzer with different tags?

But considering the discussion above, it isn't necessary to have a complete Python installation, as long as the packages can be resolved for a specified Python version. That would make it far more flexible.

@nicorikken
Member

@pombredanne my bad, I assumed this was still also an issue with ScanCode TK as there was another issue on that. Sorry for speaking bad about it then. I'll try switching to Focal to see if the entire setup works. If ScanCode is not holding us back, we could create multiple Docker image versions for specific Python versions for now.

@pombredanne
Contributor

@nicorikken re:

Sorry for speaking bad about it then

You did not! :P

we could create multiple Docker image versions for specific Python versions for now.

I chatted with @sschuberth about this a few weeks ago and there was an alternative... but I forgot which one!

@nicorikken
Member

I started reworking the Docker image to Ubuntu 20.04, but now I see @heliocastro has already made this effort. And has also worked on multiple Python versions: #3613 #3902 I'll look into that work to see if I can remove my blockade. But on this issue, if we can avoid relying on system-wide Python versions for resolving Python dependencies, that would be great.

@edulix

edulix commented Jul 16, 2021

Great work on ORT, it's an awesome project! I'm struggling with this issue. Wouldn't it be easier to be able to configure the project's Python version explicitly, for example with an ENV variable as @heliocastro proposes in #3902, or with some other specific scanner YAML configuration?

@heliocastro
Contributor

I just updated the new Docker image with the final solution using pyenv, so please jump to #3902 to take a look.
With pyenv we can now install all the versions you want; I just set one, 3.8.11, as the default.
The README in the commit explains how to install extra versions, after which pyenv can be used as usual.

@pombredanne
Contributor

pombredanne commented Aug 2, 2022

Some updates that are likely relevant here: https://github.com/nexB/python-inspector is now out and has been designed specifically to be integrated in ort. See also aboutcode-org#1 that we are refining there first before submitting to ort proper here.

With https://github.com/nexB/python-inspector you can point it to requirements and pass it a target Python version and a target OS/architecture, and it will resolve the dependency tree as pip does, but without installing any of the packages. The target Python version and OS can differ from the runtime Python and OS. Internally it uses the same resolution library as pip (resolvelib) and a pip requirements parser that we extracted from pip and that is now in use in several other tools, including CycloneDX.

TG1999 added a commit to aboutcode-org/ort that referenced this issue Aug 12, 2022
…ss-review-toolkit#3671

This PR replaces pipdeptree with python-inspector to resolve
Python packages dependencies found in requirement files.
python-inspector can resolve dependencies for any target
Python version and OS (and not only the one running the tool).
In this integration in ORT, it replaces pipdeptree pretty much
in place as python-inspector implements a similar output data
structure by design to ease the integration.

Reference: https://github.com/nexB/python-inspector
Reference: oss-review-toolkit#4637
Reference: oss-review-toolkit#3671
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
mnonnenmacher pushed a commit that referenced this issue Aug 23, 2022
This PR replaces pipdeptree with python-inspector to resolve
Python packages dependencies found in requirement files.
python-inspector can resolve dependencies for any target
Python version and OS (and not only the one running the tool).
In this integration in ORT, it replaces pipdeptree pretty much
in place as python-inspector implements a similar output data
structure by design to ease the integration.

Reference: https://github.com/nexB/python-inspector
Reference: #4637
Reference: #3671
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
@sschuberth
Member

@mnonnenmacher @fviernau can we already close this as 1208225 made the Python version configurable?

tsteenbe added a commit that referenced this issue Dec 12, 2022
Ticket linked with limitations[1] has been closed as Python
version is now configurable.

[1]: #3671

Signed-off-by: Thomas Steenbergen <thomas_steenbergen@epam.com>
8 participants