[DEPRECATION] Moving away from html5lib to html.parser #10825

Closed
pradyunsg opened this issue Jan 24, 2022 · 131 comments · Fixed by #11259
Labels
project: vendored dependency Related to a vendored dependency type: deprecation Related to deprecation / removal.

Comments

@pradyunsg
Member

pradyunsg commented Jan 24, 2022

Starting with pip 22.0, the HTML parsing is done using html.parser instead of html5lib by default. Along with this, there's an additional check to ensure that a valid HTML 5 doctype declaration is present in the document.

If you're here from a warning/error from pip's output:

  • Please reach out to the provider of the package index you're using and ask them to change the index pages to be valid HTML 5 documents (declaring doctype, having the correct structure etc).
  • You may pass --use-deprecated=html5lib until pip 22.2 (i.e. start of Q3 2022), when this flag will be dropped. This will suppress the warning for now, however you will no longer be able to pass this flag once pip 22.2 is released (and will need to fix the index pages to suppress the warning).

This behaviour change is motivated by two major factors:

  • html5lib is the reason that pip pulls in various other libraries, as part of html5lib's own dependency graph. Dropping html5lib and its dependencies from pip reduces the maintenance workload on pip's maintainers and helps reduce the size of pip's distributions.
  • The Python standard library's html.parser is more than sufficient for parsing the pages that pip needs to parse (see https://pypi.org/simple/pip/ for example).

The original plan was to remove the flag to use html5lib in 22.1, barring major surprises. There were surprises:

  • The initial implementation of the html.parser-based parsing enforced that the page contains a doctype, throwing an error if it did not. Turns out, many third-party package indexes did not include a <!doctype html> in their index pages.
  • With pip 22.0.1, certain bugs in the fallback logic were fixed, for pages that did not include the doctype.
  • With pip 22.0.2, a fallback to the legacy html5lib logic was introduced, for pages that don't start with <!doctype html> (case-insensitive) with a warning presented to the user.
  • With pip 22.0.3, the fallback to the legacy html5lib logic has been removed and the strict error in the html.parser logic has been relaxed to be a warning.
  • With pip 22.0.4, the warning has been removed. Users will no longer get a warning on an invalid or missing doctype. However, this should still be fixed since a future version of pip may start rejecting such pages (after a deprecation period of ~3-6 months).
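
The relaxed behaviour described in the list above (parse with html.parser, and warn rather than fail when the doctype is missing) can be sketched roughly as follows. This is a minimal illustration, not pip's actual implementation: the names IndexPageParser and parse_index_page are hypothetical, and the warning text is made up for the example.

```python
import warnings
from html.parser import HTMLParser


class IndexPageParser(HTMLParser):
    """Collect anchor hrefs and note whether an HTML5 doctype came first.

    Hypothetical illustration only; this is not pip's parser.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []
        self.doctype_ok = False
        self._seen_tag = False

    def handle_decl(self, decl):
        # html.parser passes the declaration without "<!" and ">",
        # e.g. "doctype html" or "DOCTYPE html"; compare case-insensitively.
        if not self._seen_tag and decl.strip().lower() == "doctype html":
            self.doctype_ok = True

    def handle_starttag(self, tag, attrs):
        self._seen_tag = True
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def parse_index_page(html_text):
    """Return the hrefs found on the page, warning if the doctype is missing."""
    parser = IndexPageParser()
    parser.feed(html_text)
    parser.close()
    if not parser.doctype_ok:
        # Warn instead of raising, so a non-conforming index still works.
        warnings.warn("index page is missing a leading <!doctype html>")
    return parser.links
```

Raising ValueError in place of the warnings.warn call would give the strict behaviour that the initial 22.0 release had.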
@DiddiLeija DiddiLeija added type: deprecation Related to deprecation / removal. project: vendored dependency Related to a vendored dependency labels Jan 25, 2022
@astrojuanlu
Contributor

Got an error while trying to use https://www.piwheels.org/simple/.

Before:

$ pip install -U tox -i https://www.piwheels.org/simple/
Looking in indexes: https://www.piwheels.org/simple/
Collecting tox
  Downloading https://www.piwheels.org/simple/tox/tox-3.24.5-py2.py3-none-any.whl (85 kB)
     |████████████████████████████████| 85 kB 981 kB/s             
^CERROR: Operation cancelled by user

After:

$ pip install -U tox -i https://www.piwheels.org/simple/
Looking in indexes: https://www.piwheels.org/simple/
ERROR: Exception:
Traceback (most recent call last):
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 165, in exc_logging_wrapper
    status = run_func(*args)
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/cli/req_command.py", line 205, in wrapper
    return func(self, options, args)
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 340, in run
    reqs, check_supported_wheels=not options.target_dir
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 95, in resolve
    collected.requirements, max_rounds=try_to_avoid_resolution_too_deep
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_vendor/resolvelib/resolvers.py", line 481, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_vendor/resolvelib/resolvers.py", line 348, in resolve
    self._add_to_criteria(self.state.criteria, r, parent=None)
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_vendor/resolvelib/resolvers.py", line 172, in _add_to_criteria
    if not criterion.candidates:
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_vendor/resolvelib/structs.py", line 151, in __bool__
    return bool(self._sequence)
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 155, in __bool__
    return any(self)
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 143, in <genexpr>
    return (c for c in iterator if id(c) not in self._incompatible_ids)
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 44, in _iter_built
    for version, func in infos:
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 297, in iter_index_candidate_infos
    hashes=hashes,
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/index/package_finder.py", line 868, in find_best_candidate
    candidates = self.find_all_candidates(project_name)
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/index/package_finder.py", line 809, in find_all_candidates
    page_candidates = list(page_candidates_it)
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/index/sources.py", line 134, in page_candidates
    yield from self._candidates_from_page(self._link)
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/index/package_finder.py", line 773, in process_project_url
    page_links = list(parse_links(html_page, self._use_deprecated_html5lib))
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/index/collector.py", line 310, in wrapper_wrapper
    return list(fn(page, use_deprecated_html5lib))
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/index/collector.py", line 350, in parse_links
    parser.feed(page.content.decode(encoding))
  File "/usr/lib/python3.7/html/parser.py", line 111, in feed
    self.goahead(0)
  File "/usr/lib/python3.7/html/parser.py", line 179, in goahead
    k = self.parse_html_declaration(i)
  File "/usr/lib/python3.7/html/parser.py", line 270, in parse_html_declaration
    self.handle_decl(rawdata[i+2:gtpos])
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/index/collector.py", line 405, in handle_decl
    self._raise_error()
  File "/home/juanlu/Projects/Other/tutor-53-bot/.venv/lib/python3.7/site-packages/pip/_internal/index/collector.py", line 427, in _raise_error
    "HTML doctype missing or incorrect. Expected <!DOCTYPE html>.\n\n"
ValueError: HTML doctype missing or incorrect. Expected <!DOCTYPE html>.

If you believe this error to be incorrect, try passing the command line option --use-deprecated=html5lib and please leave a comment on the pip issue at https://github.com/pypa/pip/issues/10825.

This is what the index page looks like:

$ curl -r 0-100 https://www.piwheels.org/simple/
<!doctype html>
<html>
<head>
<meta name="api-version" value="2" />
<title>piwheels - Simple index

@pradyunsg
Member Author

pradyunsg commented Jan 30, 2022

AHAHAAH. From https://www.w3resource.com/html5/doctype.php:

'DOCTYPE' keyword is not case sensitive. So, <!doctype html> or <!DOCTYPE html>, both will do.

I've filed #10844 for this. /cc @bennuttall so that he's aware of the bug on our end which affects piwheels users.
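
For index maintainers who want to check their own pages, the case-insensitivity point above can be expressed as a small predicate over the first bytes of a page (for example, the output of the curl command earlier in this thread). This is a sketch under assumptions: starts_with_html5_doctype is a hypothetical helper, not code from pip or from #10844.

```python
import re

# Matches an HTML5 doctype at the very start of a page, ignoring case
# and leading whitespace. Hypothetical helper; not pip's code.
_DOCTYPE_RE = re.compile(r"\s*<!doctype\s+html\s*>", re.IGNORECASE)


def starts_with_html5_doctype(page_head: str) -> bool:
    """Return True if the text begins with <!doctype html> in any letter case."""
    return _DOCTYPE_RE.match(page_head) is not None


if __name__ == "__main__":
    for sample in ("<!doctype html>\n<html>", "<!DOCTYPE html>", "<html><head>"):
        print(repr(sample), starts_with_html5_doctype(sample))
```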

@astrojuanlu Can you confirm that the workaround noted above, passing --use-deprecated=html5lib, works?

@astrojuanlu
Contributor

Yep, I confirm it fixes the issue @pradyunsg!

@matthew-s-walker

We're hitting this with pip caches served from Artifactory; it doesn't include a DOCTYPE in the response at all. The --use-deprecated=html5lib flag does work though.

@pradyunsg
Member Author

@matthew-s-walker Could you go ahead and reach out to the Artifactory (JFrog?) folks about that? The fix for that needs to happen on their end.

@suman4ds

suman4ds commented Jan 30, 2022

Got the error while upgrading pip from 19.2.3 to the latest version (22.0) on Docker. I tried --use-deprecated=html5lib, but it did not work.

Getting an error for this:
pip3 install --no-cache --upgrade pip setuptools wheel

Specifying 21.3.1 solved my issue:
pip3 install --no-cache --upgrade pip==21.3.1 setuptools wheel

[06:28:07 PM]  Installing collected packages: pip, setuptools, wheel
[06:28:07 PM]    Found existing installation: pip 19.2.3
[06:28:07 PM]      Uninstalling pip-19.2.3:
[06:28:07 PM]        Successfully uninstalled pip-19.2.3
[06:28:09 PM]    Found existing installation: setuptools 41.2.0
[06:28:09 PM]      Uninstalling setuptools-41.2.0:
[06:28:09 PM]        Successfully uninstalled setuptools-41.2.0
[06:28:09 PM]  Successfully installed pip-22.0 setuptools-60.5.0 wheel-0.37.1
[06:28:09 PM]  Looking in indexes: https://mbartft:****@artifactory.corp.adobe.com/artifactory/api/pypi/pypi-asr-python-release-local/simple, https://pypi.python.org/simple
[06:28:09 PM]  ERROR: Exception:
[06:28:09 PM]  Traceback (most recent call last):
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/cli/base_command.py", line 165, in exc_logging_wrapper
[06:28:09 PM]      status = run_func(*args)
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/cli/req_command.py", line 205, in wrapper
[06:28:09 PM]      return func(self, options, args)
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/commands/install.py", line 339, in run
[06:28:09 PM]      requirement_set = resolver.resolve(
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 94, in resolve
[06:28:09 PM]      result = self._result = resolver.resolve(
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_vendor/resolvelib/resolvers.py", line 481, in resolve
[06:28:09 PM]      state = resolution.resolve(requirements, max_rounds=max_rounds)
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_vendor/resolvelib/resolvers.py", line 348, in resolve
[06:28:09 PM]      self._add_to_criteria(self.state.criteria, r, parent=None)
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_vendor/resolvelib/resolvers.py", line 172, in _add_to_criteria
[06:28:09 PM]      if not criterion.candidates:
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_vendor/resolvelib/structs.py", line 151, in __bool__
[06:28:09 PM]      return bool(self._sequence)
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 155, in __bool__
[06:28:09 PM]      return any(self)
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 143, in <genexpr>
[06:28:09 PM]      return (c for c in iterator if id(c) not in self._incompatible_ids)
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 44, in _iter_built
[06:28:09 PM]      for version, func in infos:
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 294, in iter_index_candidate_infos
[06:28:09 PM]      result = self._finder.find_best_candidate(
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/index/package_finder.py", line 868, in find_best_candidate
[06:28:09 PM]      candidates = self.find_all_candidates(project_name)
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/index/package_finder.py", line 809, in find_all_candidates
[06:28:09 PM]      page_candidates = list(page_candidates_it)
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/index/sources.py", line 134, in page_candidates
[06:28:09 PM]      yield from self._candidates_from_page(self._link)
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/index/package_finder.py", line 773, in process_project_url
[06:28:09 PM]      page_links = list(parse_links(html_page, self._use_deprecated_html5lib))
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/index/collector.py", line 310, in wrapper_wrapper
[06:28:09 PM]      return list(fn(page, use_deprecated_html5lib))
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/index/collector.py", line 350, in parse_links
[06:28:09 PM]      parser.feed(page.content.decode(encoding))
[06:28:09 PM]    File "/usr/lib/python3.8/html/parser.py", line 111, in feed
[06:28:09 PM]      self.goahead(0)
[06:28:09 PM]    File "/usr/lib/python3.8/html/parser.py", line 171, in goahead
[06:28:09 PM]      k = self.parse_starttag(i)
[06:28:09 PM]    File "/usr/lib/python3.8/html/parser.py", line 345, in parse_starttag
[06:28:09 PM]      self.handle_starttag(tag, attrs)
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/index/collector.py", line 410, in handle_starttag
[06:28:09 PM]      self._raise_error()
[06:28:09 PM]    File "/usr/src/venv/lib/python3.8/site-packages/pip/_internal/index/collector.py", line 426, in _raise_error
[06:28:09 PM]      raise ValueError(
[06:28:09 PM]  ValueError: HTML doctype missing or incorrect. Expected <!DOCTYPE html>.
[06:28:09 PM]  
[06:28:09 PM]  If you believe this error to be incorrect, try passing the command line option --use-deprecated=html5lib and please leave a comment on the pip issue at https://github.com/pypa/pip/issues/10825.

@matthew-s-walker

@pradyunsg I've raised a ticket in their support portal :)

@Arksine

Arksine commented Jan 30, 2022

FWIW, passing --use-deprecated=html5lib does not work for my project. When attempting to install a pinned version of tornado from a requirements file I get the following response:

ERROR: Could not find a version that satisfies the requirement tornado==6.1.0 (from versions: none)
ERROR: No matching distribution found for tornado==6.1.0

@nadav-yo

Hi, Nadav from JFrog.
We're on it. https://www.jfrog.com/jira/browse/RTFACT-26750

@VarIr

VarIr commented Jan 30, 2022

Installing PyTorch CPU-only packages (described here) seems to be affected as well:

pip3 install torch==1.10.1+cpu  -f https://download.pytorch.org/whl/cpu/torch_stable.html

The --use-deprecated=html5lib work-around results in the same error that @Arksine reported:

ERROR: Could not find a version that satisfies the requirement torch==1.10.1+cpu (from versions: none)
ERROR: No matching distribution found for torch==1.10.1+cpu

@pradyunsg
Member Author

pradyunsg commented Jan 30, 2022

If you're getting ERROR: No matching distribution found -- you are hitting a separate issue related to the package and its compatibility with your system, which is unrelated to this. You're likely going to need to go to the relevant project's documentation/issue tracker to find guidance. :)

Update: this was a bug. See #10846.

@wiggin15

This is affecting entire companies. Is it possible to pull pip version 22.0 until the issues in JFrog and pip are resolved?

@Necropaw

Can't even downgrade pip successfully:

pip3 install --no-cache --upgrade  --use-deprecated=html5lib  pip==21.3.1
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
ERROR: Could not find a version that satisfies the requirement pip==21.3.1 (from versions: none)
ERROR: No matching distribution found for pip==21.3.1

@notatallshaw
Member

@pradyunsg I see you are already making the check case-insensitive in #10844, but could this check be removed entirely? Is it an issue if the index returns an HTML fragment rather than a complete HTML document?

I'm going to chase JFrog via my corporate channels, but given the life cycle of them updating and the corporate rollout, if this check is left in place I'd appreciate it if --use-deprecated=html5lib is kept for at least the next year.

@gjermund66

gjermund66 commented Jan 30, 2022

Azure DevOps, running Linux/Python 3.8.12:

Successfully installed pip-22.0 setuptools-60.5.0 wheel-0.37.1
...

Collecting mysql-connector-python==8.0.28
  Using cached mysql_connector_python-8.0.28-cp38-cp38-manylinux1_x86_64.whl (37.6 MB)
ERROR: Exception:
Traceback (most recent call last):
...

  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/pip/_internal/index/collector.py", line 426, in _raise_error
    raise ValueError(
ValueError: HTML doctype missing or incorrect. Expected <!DOCTYPE html>.
...

Collecting numexpr==2.8.1
  Using cached numexpr-2.8.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (381 kB)
ERROR: Exception:
Traceback (most recent call last):
...

  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/pip/_internal/index/collector.py", line 426, in _raise_error
    raise ValueError(
ValueError: HTML doctype missing or incorrect. Expected <!DOCTYPE html>.



@notatallshaw
Member

notatallshaw commented Jan 30, 2022

Hi all, I have created a separate issue for this ERROR: No matching distribution found: #10845. Please add info there if you have more details (please don't comment if you have no further info; upvote it instead).

If you're getting ERROR: No matching distribution found -- you are hitting a separate issue related to the package and its compatibility with your system, which is unrelated to this. You're likely going to need to go to the relevant project's documentation/issue tracker to find guidance. :)

@pradyunsg I am able to reproduce on JFrog using Pip 22.0 with --use-deprecated=html5lib trying to install requests and it works fine on Pip 21.3, it's definitely not a package/platform compatibility issue.

@nadav-yo

@Necropaw You can downgrade using
pip3 install --no-cache --upgrade pip==21.3.1 -i https://pypi.org/simple/

@notatallshaw
Member

notatallshaw commented Jan 30, 2022

@Necropaw You can downgrade using pip3 install --no-cache --upgrade pip==21.3.1 -i https://pypi.org/simple/

FYI for those trying this inside a corporate network, access to https://pypi.org/simple/ may be blocked. You likely are better doing a fresh install of Python: #10825 (comment) (@pradyunsg has a less destructive solution).

And as someone who maintains a Python installer in a large company please run your projects in a virtual env (or conda env) so you can destroy and recreate them without any hassle in the future.

@pradyunsg
Member Author

You likely are better doing a fresh install of Python.

Uhh... No? Use python -m pip uninstall pip && python -m ensurepip to get a version of pip that was bundled with your Python version.

@pradyunsg
Member Author

I am able to reproduce on JFrog using Pip 22.0 with --use-deprecated=html5lib trying to install requests and it works fine on Pip 21.3, it's definitely not a package/platform compatibility issue.

Ah, curious!

https://download.pytorch.org/whl/cpu/torch_stable.html

@VarIr Could you file an issue against pytorch to flag this on their end?

@n1vgabay

Hi guys,

We also have the same issue in our GitHub Actions CI

image

image

Can you please assist me solve this issue with the commands above of pip install ?

@pradyunsg
Member Author

pradyunsg commented Jan 30, 2022

Is it an issue if the index returns an HTML fragment rather than a complete HTML document?

Well... yes. The relevant standards clearly state that these pages need to be valid HTML5 documents. From PEP 503:

This URL must respond with a valid HTML5 page with a single anchor element per file for the project.

So far, pip has been really relaxed in accepting invalid documents (similar to how browsers parse things). As discussed in #10291, switching to being stricter about what pip accepts is necessary to ensure that alternative clients for Python package interaction don't need to implement all the same HTML relaxations as browsers do (and pip does via html5lib).

Is it possible to pull pip version 22.0 until the issues in JFrog and pip are resolved?

It's certainly possible, but I don't think this is widespread enough to justify that. If you can prevent pip 22.0 from being used internally, by blocking pip 22.0 on your Artifactory instance, please feel free to do so. Worst case, we'll cut a 22.0.1 sometime next week that drops some of these validation checks.


Can you please assist me solve this issue with the commands above of pip install ?

I don't think we have the capacity to provide 1:1 support here. :)

Consider reaching out to GitHub, if you're using GitHub Packages. If not, it's unclear to me what alternative index you're using, and I believe that is implicated in the failure.

Also, posting screenshots of error messages is a bad idea in general. It makes it difficult for people to read the errors (since it'll ignore their browser's font size configuration, color preferences etc) and also makes it impossible to copy-paste from the output (at least, without running some sort of OCR, which no one's going to do).


I am able to reproduce on JFrog using Pip 22.0 with --use-deprecated=html5lib trying to install requests and it works fine on Pip 21.3, it's definitely not a package/platform compatibility issue.

Cool, let's chat about this in #10845. /cc @DonMyrmi


@gjermund66 Can you please share the full output? It's unclear to me what index is being used since you've effectively trimmed out all the useful parts of the output. Consider reaching out to Azure's support channels, so that they're aware and make the requisite changes.

@srittau

srittau commented Feb 23, 2022

Since NGinx doesn't include a doctype in its auto index pages, we now auto-generate a super simple index when uploading a new package. We use these scripts, maybe they are useful to some:

# release.sh

set -e -o pipefail

SFTP_HOST=foo@example.com
SFTP_PATH=/srv/foo

# Build and upload the wheel
# ...

sftp -qb - "$SFTP_HOST" >package-list <<EOD
@cd "$SFTP_PATH"
@ls -1
EOD
grep -v index.html package-list | python3 create-package-index.py >index.html
scp index.html "$SFTP_HOST:$SFTP_PATH"
rm package-list index.html

#!/usr/bin/env python3
# create-package-index.py

import sys
from html import escape

PREAMBLE = """<!DOCTYPE html>
<html>
<head><title>Package Index</title></head>
<body>
<h1>Package Index</h1>
<ul>
"""

POSTAMBLE = "</ul></body></html>"

print(PREAMBLE)
for line in sys.stdin:
    filename = escape(line.strip())
    print(f'<li><a href="{filename}">{filename}</a></li>')
print(POSTAMBLE)

@jrobbins-LiveData

@pradyunsg AWS got back to me and they have updated CodeArtifact to include the required declaration.

@lambacck

lambacck commented Mar 1, 2022

Given the merging of #10903 can the issue description at the top of this page be modified to more clearly call out the removal of the doctype checking please?

@pradyunsg
Member Author

No, because that's not in a release yet. I'll update that as and when there's a user facing update to the status quo.

@pradyunsg
Member Author

pradyunsg commented Mar 7, 2022

Alright, 22.0.4 removes the doctype warning; in line with what we've said earlier.

I consider it an oversight in pip's code, that it did not have strict=True in our html5lib.parse call, but that's water under the bridge now. The "reject invalid HTML pages" is something we might do in the future, albeit with a 3-6 month deprecation period -- platform vendors and users should still update their code to serve valid HTML 5 pages.

inmantaci pushed a commit to inmanta/inmanta-core that referenced this issue Jul 21, 2022
Bumps [pip](https://github.com/pypa/pip) from 22.1.2 to 22.2.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/pypa/pip/blob/main/NEWS.rst">pip's changelog</a>.</em></p>
<blockquote>
<h1>22.2 (2022-07-21)</h1>
<h2>Deprecations and Removals</h2>
<ul>
<li>Remove the <code>html5lib</code> deprecated feature flag. (<code>[#10825](pypa/pip#10825) &lt;https://github.com/pypa/pip/issues/10825&gt;</code>_)</li>
<li>Remove <code>--use-deprecated=backtrack-on-build-failures</code>. (<code>[#11241](pypa/pip#11241) &lt;https://github.com/pypa/pip/issues/11241&gt;</code>_)</li>
</ul>
<h2>Features</h2>
<ul>
<li>
<p>Add support to use <code>truststore &lt;https://pypi.org/project/truststore/&gt;</code>_ as an
alternative SSL certificate verification backend. The backend can be enabled on Python
3.10 and later by installing <code>truststore</code> into the environment, and adding the
<code>--use-feature=truststore</code> flag to various pip commands.</p>
<p><code>truststore</code> differs from the current default verification backend (provided by
<code>certifi</code>) in it uses the operating system’s trust store, which can be better
controlled and augmented to better support non-standard certificates. Depending on
feedback, pip may switch to this as the default certificate verification backend in
the future. (<code>[#11082](pypa/pip#11082) &lt;https://github.com/pypa/pip/issues/11082&gt;</code>_)</p>
</li>
<li>
<p>Add <code>--dry-run</code> option to <code>pip install</code>, to let it print what it would install but
not actually change anything in the target environment. (<code>[#11096](pypa/pip#11096) &lt;https://github.com/pypa/pip/issues/11096&gt;</code>_)</p>
</li>
<li>
<p>Record in wheel cache entries the URL of the original artifact that was downloaded
to build the cached wheels. The record is named <code>origin.json</code> and uses the PEP 610
Direct URL format. (<code>[#11137](pypa/pip#11137) &lt;https://github.com/pypa/pip/issues/11137&gt;</code>_)</p>
</li>
<li>
<p>Support <code>PEP 691 &lt;https://peps.python.org/pep-0691/&gt;</code><em>. (<code>[#11158](pypa/pip#11158) &lt;https://github.com/pypa/pip/issues/11158&gt;</code></em>)</p>
</li>
<li>
<p>pip's deprecation warnings now subclass the built-in <code>DeprecationWarning</code>, and
can be suppressed by running the Python interpreter with
<code>-W ignore::DeprecationWarning</code>. (<code>[#11225](pypa/pip#11225) &lt;https://github.com/pypa/pip/issues/11225&gt;</code>_)</p>
</li>
<li>
<p>Add <code>pip inspect</code> command to obtain the list of installed distributions and other
information about the Python environment, in JSON format. (<code>[#11245](pypa/pip#11245) &lt;https://github.com/pypa/pip/issues/11245&gt;</code>_)</p>
</li>
<li>
<p>Significantly speed up isolated environment creation, by using the same
sources for pip instead of creating a standalone installation for each
environment. (<code>[#11257](pypa/pip#11257) &lt;https://github.com/pypa/pip/issues/11257&gt;</code>_)</p>
</li>
<li>
<p>Add an experimental <code>--report</code> option to the install command to generate a JSON report
of what was installed. In combination with <code>--dry-run</code> and <code>--ignore-installed</code> it
can be used to resolve the requirements. (<code>[#53](pypa/pip#53) &lt;https://github.com/pypa/pip/issues/53&gt;</code>_)</p>
</li>
</ul>
<h2>Bug Fixes</h2>
<ul>
<li>Fix <code>pip install --pre</code> for packages with pre-release build dependencies defined
both in <code>pyproject.toml</code>'s <code>build-system.requires</code> and <code>setup.py</code>'s
<code>setup_requires</code>. (<code>[#10222](pypa/pip#10222) &lt;https://github.com/pypa/pip/issues/10222&gt;</code>_)</li>
<li>When pip rewrites the shebang line in a script during wheel installation,
update the hash and size in the corresponding <code>RECORD</code> file entry. (<code>[#10744](pypa/pip#10744) &lt;https://github.com/pypa/pip/issues/10744&gt;</code>_)</li>
<li>Do not consider a <code>.dist-info</code> directory found inside a wheel-like zip file
as metadata for an installed distribution. A package in a wheel is (by</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/pypa/pip/commit/8e7e76e60f4e115ea1201bee2f176377a718fce1"><code>8e7e76e</code></a> Bump for release</li>
<li><a href="https://github.com/pypa/pip/commit/b6f6a94e36f10a4535ea5bbdc6b351f62003eede"><code>b6f6a94</code></a> Update AUTHORS.txt</li>
<li><a href="https://github.com/pypa/pip/commit/790725aca3f60c745e33827a6079d9600da373d8"><code>790725a</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/11274">#11274</a> from sbidoul/install-report-note-sbi</li>
<li><a href="https://github.com/pypa/pip/commit/d4b9e187aa7cc5ab14b2339f6171f7f2ea6504e9"><code>d4b9e18</code></a> Add clarifications to the installation report documentation</li>
<li><a href="https://github.com/pypa/pip/commit/b1a01ef762a78af1194958a1c874015eaf81fd04"><code>b1a01ef</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/11265">#11265</a> from finnagin/main</li>
<li><a href="https://github.com/pypa/pip/commit/48bcb0a4ccd30a9d00e58fe58827772e307a7e39"><code>48bcb0a</code></a> reformat to pass pre-commit check</li>
<li><a href="https://github.com/pypa/pip/commit/a7c1fe3bff5655393018c53b448b669b3525515b"><code>a7c1fe3</code></a> Remove utc fixture from tests</li>
<li><a href="https://github.com/pypa/pip/commit/0c574f72905185d62bcca741c813df9bae1d9282"><code>0c574f7</code></a> Remove time import</li>
<li><a href="https://github.com/pypa/pip/commit/246fef19149eea893f1cf3efd53f9b17c94c952f"><code>246fef1</code></a> Remove utc fixture</li>
<li><a href="https://github.com/pypa/pip/commit/c9cb7f4629bdd8c61b792feff6dacb1d2e848d57"><code>c9cb7f4</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/11270">#11270</a> from uranusjr/upgrade-pre-commit-hooks</li>
<li>Additional commits viewable in <a href="https://github.com/pypa/pip/compare/22.1.2...22.2">compare view</a></li>
</ul>
</details>
<br />

inmantaci pushed a commit to inmanta/inmanta-core that referenced this issue Jul 21, 2022
Bumps [pip](https://github.com/pypa/pip) from 22.1.2 to 22.2.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/pypa/pip/blob/main/NEWS.rst">pip's changelog</a>.</em></p>
<blockquote>
<h1>22.2 (2022-07-21)</h1>
<h2>Deprecations and Removals</h2>
<ul>
<li>Remove the <code>html5lib</code> deprecated feature flag. (<a href="https://github.com/pypa/pip/issues/10825">#10825</a>)</li>
<li>Remove <code>--use-deprecated=backtrack-on-build-failures</code>. (<a href="https://github.com/pypa/pip/issues/11241">#11241</a>)</li>
</ul>
<h2>Features</h2>
<ul>
<li>
<p>Add support to use <a href="https://pypi.org/project/truststore/">truststore</a> as an
alternative SSL certificate verification backend. The backend can be enabled on Python
3.10 and later by installing <code>truststore</code> into the environment, and adding the
<code>--use-feature=truststore</code> flag to various pip commands.</p>
<p><code>truststore</code> differs from the current default verification backend (provided by
<code>certifi</code>) in that it uses the operating system’s trust store, which can be better
controlled and augmented to better support non-standard certificates. Depending on
feedback, pip may switch to this as the default certificate verification backend in
the future. (<a href="https://github.com/pypa/pip/issues/11082">#11082</a>)</p>
</li>
<li>
<p>Add <code>--dry-run</code> option to <code>pip install</code>, to let it print what it would install but
not actually change anything in the target environment. (<a href="https://github.com/pypa/pip/issues/11096">#11096</a>)</p>
</li>
<li>
<p>Record in wheel cache entries the URL of the original artifact that was downloaded
to build the cached wheels. The record is named <code>origin.json</code> and uses the PEP 610
Direct URL format. (<a href="https://github.com/pypa/pip/issues/11137">#11137</a>)</p>
</li>
<li>
<p>Support <a href="https://peps.python.org/pep-0691/">PEP 691</a>. (<a href="https://github.com/pypa/pip/issues/11158">#11158</a>)</p>
</li>
<li>
<p>pip's deprecation warnings now subclass the built-in <code>DeprecationWarning</code>, and
can be suppressed by running the Python interpreter with
<code>-W ignore::DeprecationWarning</code>. (<a href="https://github.com/pypa/pip/issues/11225">#11225</a>)</p>
</li>
<li>
<p>Add <code>pip inspect</code> command to obtain the list of installed distributions and other
information about the Python environment, in JSON format. (<a href="https://github.com/pypa/pip/issues/11245">#11245</a>)</p>
</li>
<li>
<p>Significantly speed up isolated environment creation, by using the same
sources for pip instead of creating a standalone installation for each
environment. (<a href="https://github.com/pypa/pip/issues/11257">#11257</a>)</p>
</li>
<li>
<p>Add an experimental <code>--report</code> option to the install command to generate a JSON report
of what was installed. In combination with <code>--dry-run</code> and <code>--ignore-installed</code> it
can be used to resolve the requirements. (<a href="https://github.com/pypa/pip/issues/53">#53</a>)</p>
</li>
</ul>
<h2>Bug Fixes</h2>
<ul>
<li>Fix <code>pip install --pre</code> for packages with pre-release build dependencies defined
both in <code>pyproject.toml</code>'s <code>build-system.requires</code> and <code>setup.py</code>'s
<code>setup_requires</code>. (<a href="https://github.com/pypa/pip/issues/10222">#10222</a>)</li>
<li>When pip rewrites the shebang line in a script during wheel installation,
update the hash and size in the corresponding <code>RECORD</code> file entry. (<a href="https://github.com/pypa/pip/issues/10744">#10744</a>)</li>
<li>Do not consider a <code>.dist-info</code> directory found inside a wheel-like zip file
as metadata for an installed distribution. A package in a wheel is (by</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
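
The changelog entry for #11225 says pip's deprecation warnings now subclass the built-in `DeprecationWarning`, so the standard warning filters apply to them. The snippet below is a minimal sketch of why that matters, not pip's actual code: `HypotheticalPipDeprecation` and `emit_and_capture` are made-up names for illustration, and `warnings.simplefilter("ignore", DeprecationWarning)` is used as a rough in-process stand-in for running the interpreter with `-W ignore::DeprecationWarning`.

```python
import warnings


class HypotheticalPipDeprecation(DeprecationWarning):
    """Stand-in for a pip deprecation warning (the name is made up)."""


def emit_and_capture(ignore_deprecations: bool) -> list[str]:
    """Emit one warning and return the messages that got past the filters."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # start from "show everything"
        if ignore_deprecations:
            # Rough programmatic equivalent of `-W ignore::DeprecationWarning`.
            # Filters match subclasses, so the custom class is covered too.
            warnings.simplefilter("ignore", DeprecationWarning)
        warnings.warn("old flag will be removed", HypotheticalPipDeprecation)
    return [str(w.message) for w in caught]


print(emit_and_capture(ignore_deprecations=False))  # warning is recorded
print(emit_and_capture(ignore_deprecations=True))   # warning is filtered out
```

Because the filter matches on the base class, any warning class derived from `DeprecationWarning` is silenced without the filter having to know its name.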
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 16, 2022