Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fails to parse data-requires-python with a * char #445

Closed
j0ack opened this issue Dec 22, 2022 · 18 comments · Fixed by #447
Closed

Fails to parse data-requires-python with a * char #445

j0ack opened this issue Dec 22, 2022 · 18 comments · Fixed by #447
Assignees
Labels
bug Something isn't working component:dep-sources Dependency sources

Comments

@j0ack
Copy link

j0ack commented Dec 22, 2022

Bug description

pip-audit raises an InvalidSpecifier error when trying to parse a data-requires-python key containing a * from pypi.
Seems related to this old issue : #138

Reproduction steps

I found the nltk package wich contains this link in its pypi page raising the mentioned error:

<a 
  href="https://files.pythonhosted.org/packages/98/06/de681159e6750d0a215c2126e784504e177896afddf5a68cba42ebe42355/nltk-3.6-py3-none-any.whl#sha256=718de6908f538db19a77f96b9e6f5f586b0892d7de5eea32e71f2a2535ed8657" 
  data-requires-python="&gt;=3.5.*" 
>
  nltk-3.6-py3-none-any.whl
</a><br />

To reproduce, we only need to launch a pip-audit with this dependency.

echo "nltk" > requirements.txt
pip-audit -v --requirement requirements.txt                       

Screenshots and logs

DEBUG:pip_audit._cli:parsed arguments: Namespace(cache_dir=None, desc=<VulnerabilityDescriptionChoice.Auto: 'auto'>, dry_run=False, extra_index_urls=[], fix=False, format=<OutputFormatChoice.Columns: 'columns'>, ignore_vulns=[], index_url='https://pypi.org/simple/', local=False, no_deps=False, output=PosixPath('stdout'), paths=[], progress_spinner=<ProgressSpinnerChoice.On: 'on'>, project_path=None, require_hashes=False, requirements=[<_io.TextIOWrapper name='test_require.txt' mode='r' encoding='UTF-8'>], skip_editable=False, strict=False, timeout=15, verbose=True, vulnerability_service=<VulnerabilityServiceChoice.Pypi: 'pypi'>)
DEBUG:cachecontrol.controller:Looking up "https://pypi.org/simple/nltk/" in the cache
DEBUG:cachecontrol.controller:Current age based on date: 1186
DEBUG:cachecontrol.controller:Freshness lifetime from max-age: 600
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pypi.org:443
DEBUG:urllib3.connectionpool:https://pypi.org:443 "GET /simple/nltk/ HTTP/1.1" 304 0
Traceback (most recent call last):
  File "/usr/local/bin/pip-audit", line 8, in <module>
    sys.exit(audit())
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_cli.py", line 448, in audit
    for (spec, vulns) in auditor.audit(source):
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_audit.py", line 67, in audit
    for dep, vulns in self._service.query_all(specs):
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_service/interface.py", line 155, in query_all
    for spec in specs:
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/requirement.py", line 116, in collect
    for _, dep in self._collect_cached_deps(filename, reqs):
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/requirement.py", line 328, in _collect_cached_deps
    for req, resolved_deps in self._resolver.resolve_all(iter(req_values)):
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/interface.py", line 88, in resolve_all
    yield (req, self.resolve(req))
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/resolvelib/resolvelib.py", line 77, in resolve
    result = self.resolver.resolve([req])
  File "/usr/local/lib/python3.8/site-packages/resolvelib/resolvers.py", line 521, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "/usr/local/lib/python3.8/site-packages/resolvelib/resolvers.py", line 372, in resolve
    self._add_to_criteria(self.state.criteria, r, parent=None)
  File "/usr/local/lib/python3.8/site-packages/resolvelib/resolvers.py", line 168, in _add_to_criteria
    candidates=build_iter_view(matches),
  File "/usr/local/lib/python3.8/site-packages/resolvelib/structs.py", line 169, in build_iter_view
    matches = list(matches)
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 362, in find_matches
    [
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 362, in <listcomp>
    [
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 203, in get_project_from_indexes
    yield from get_project_from_index(index_url, session, project, extras, timeout, state)
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 249, in get_project_from_index
    spec = SpecifierSet(py_req)
  File "/usr/local/lib/python3.8/site-packages/packaging/specifiers.py", line 700, in __init__
    parsed.add(Specifier(specifier))
  File "/usr/local/lib/python3.8/site-packages/packaging/specifiers.py", line 234, in __init__
    raise InvalidSpecifier(f"Invalid specifier: '{spec}'")
packaging.specifiers.InvalidSpecifier: Invalid specifier: '>=3.5.*'

Platform information

  • Linux
  • pip-audit 2.4.9
  • Python 3.8.2
  • pip 22.3.1
@j0ack j0ack added the bug-candidate Might be a bug. label Dec 22, 2022
@woodruffw
Copy link
Member

Thanks for the report!

This is an interesting one: the error here is coming from packaging, which says that the specifier itself isn't well formed. I've never seen an asterisk in a specifier in that position, so I'll have to check what the docs say.

@woodruffw
Copy link
Member

(At the very least, however, this shouldn't cause a hard failure -- we should produce a warning and possibly just ignore the specifier instead.)

@woodruffw woodruffw added bug Something isn't working component:dep-sources Dependency sources and removed bug-candidate Might be a bug. labels Dec 22, 2022
@woodruffw
Copy link
Member

Yep, definitely localized to packaging:

>>> from packaging.specifiers import Specifier
>>> Specifier(">=3.5.*")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/william/devel/pip-audit/env/lib/python3.7/site-packages/packaging/specifiers.py", line 234, in __init__
    raise InvalidSpecifier(f"Invalid specifier: '{spec}'")
packaging.specifiers.InvalidSpecifier: Invalid specifier: '>=3.5.*'
>>> import packaging
>>> packaging.__version__
'22.0'

@woodruffw
Copy link
Member

PEP 440 doesn't say it directly, but the language implies that the .* is only valid on exact comparison operators (e.g. == and !=), nor ordered comparisons (e.g. >=).

In particular, a comparison like >=3.5.* is redundant: it has the exact same meaning as >=3.5.

I'll raise an upstream issue with packaging to get some clarity on the expected behavior here. But in the meantime, we can also "fix" this on our end by handling the exception that occurs here.

@woodruffw
Copy link
Member

Looks like this is indeed an invalid specifier, and the most recent version of packaging began correctly rejecting it: pypa/packaging#645

@woodruffw woodruffw self-assigned this Dec 22, 2022
@di
Copy link
Member

di commented Dec 22, 2022

pip is still able to install this though, so we should figure out what it's doing when it encounters this specifier and do the same:

$ python -m pip install nltk==3.6
Collecting nltk==3.6
  Using cached nltk-3.6-py3-none-any.whl (1.5 MB)
Collecting regex
  Using cached regex-2022.10.31-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (770 kB)
Collecting tqdm
  Using cached tqdm-4.64.1-py2.py3-none-any.whl (78 kB)
Collecting joblib
  Using cached joblib-1.2.0-py3-none-any.whl (297 kB)
Collecting click
  Using cached click-8.1.3-py3-none-any.whl (96 kB)
Installing collected packages: tqdm, regex, joblib, click, nltk
Successfully installed click-8.1.3 joblib-1.2.0 nltk-3.6 regex-2022.10.31 tqdm-4.64.1

$ pip --version
pip 22.3.1 from /home/di/env/lib/python3.10/site-packages/pip (python 3.10)

@woodruffw
Copy link
Member

Based on pypa/packaging#645, I think it's interpreting it as a LegacyVersion, or possibly just silently ignoring it. But I'll confirm that 🙂

@woodruffw
Copy link
Member

Yep, here's what they do:

https://github.com/pypa/pip/blob/c4566c6c828fa7b41f5656d30c6375a494d73ded/src/pip/_internal/index/package_finder.py#L51-L95

TL;DR they ignore the requires-python specifier if they can't make sense of it, meaning that they include the candidate in the candidate selection list.

@pradyunsg
Copy link
Member

pradyunsg commented Dec 22, 2022

pip is still on packaging 21.3; which is why it still accepts that. I expect we'll start rejecting/skipping those in 3-6 months from now (or at least, being really noisy in warning about them).

@woodruffw
Copy link
Member

Makes sense, thanks for confirming! It looks like the current behavior is to ignore invalid specifiers (even if it isn't currently triggered for these ones yet), so we'll mimic that for now 🙂

woodruffw added a commit that referenced this issue Dec 22, 2022
We follow `pip`'s lead here and ignore these entirely, treating
them as if they don't exist (while also warning the user).

Fixes #445.

See: pypa/packaging#645

Signed-off-by: William Woodruff <william@trailofbits.com>
woodruffw added a commit that referenced this issue Dec 22, 2022
* pip_audit, test: handle invalid requires-python specifiers

We follow `pip`'s lead here and ignore these entirely, treating
them as if they don't exist (while also warning the user).

Fixes #445.

See: pypa/packaging#645

Signed-off-by: William Woodruff <william@trailofbits.com>

* CHANGELOG: record changes

Signed-off-by: William Woodruff <william@trailofbits.com>

Signed-off-by: William Woodruff <william@trailofbits.com>
@woodruffw
Copy link
Member

Thanks again for reporting @j0ack! Let us know if this is currently blocking your use of pip-audit; if so, we can do a patch release to get it out.

@j0ack
Copy link
Author

j0ack commented Dec 22, 2022

You're very welcome @woodruffw, I did not think the issue would be resolved only few hours after posting it. I think it will work with the code you merged in #447 Thank you

@krishnasism
Copy link

krishnasism commented Dec 28, 2022

@woodruffw is there an expected date as to when the fix for this might be released?

@tetsuo-cpp
Copy link
Contributor

tetsuo-cpp commented Dec 28, 2022

@woodruffw is there an expected date as to when the fix for this might be released?

@krishnasism If this is blocking your use of pip-audit, I can look into cutting a patch release.

@krishnasism
Copy link

It is definitely blocking my usage.
That would be really helpful, thanks!

@woodruffw
Copy link
Member

@krishnasism we've cut 2.4.11 with the fix; it should be available on PyPI momentarily. Thanks for letting us know!

@krishnasism
Copy link

@woodruffw @tetsuo-cpp thanks a lot. It works like a charm now! 😄

@woodruffw
Copy link
Member

Excellent, glad to hear it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working component:dep-sources Dependency sources
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants