Describe the bug
Running make linkcheck on a document that contains an external link to a website may report the link is broken when a web browser may successfully open the link. Specifically, if the website closes its connection when receiving the HTTP HEAD request method, then linkcheck.py will receive a ConnectionError exception, which bypasses the logic that would otherwise have it make an HTTP GET request.
$ sphinx-quickstart # accept all the default options
$ echo '\n\nThis is `a link to the US Patent Website <https://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&co1=AND&d=PTXT&s1=7840660&OS=7840660&RS=7840660>`_.\n' >> index.rst
$ make html linkcheck
Observe linkcheck reporting a broken link:
Running Sphinx v4.0.2
loading pickled environment... done
building [mo]: targets for 0 po files that are out of date
building [linkcheck]: targets for 1 source files that are out of date
updating environment: 0 added, 0 changed, 0 removed
looking for now-outdated files... none found
preparing documents... done
writing output... [100%] index
( index: line 22) broken https://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&co1=AND&d=PTXT&s1=7840660&OS=7840660&RS=7840660 - ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
build finished with problems.
make: *** [linkcheck] Error 1
Open _build/html/index.html in your browser
Click "a link to the US Patent Website"
Observe the link opening and rendering normally
Expected behavior
If a link is valid, and the website is returning valid content for an HTTP GET request, make linkcheck should not report the link as broken.
Internally, in sphinx/builders/linkcheck.py, if a call to requests.head() raises a requests.exceptions.ConnectionError exception, it should attempt a requests.get(), just as it does for HTTPError and TooManyRedirects.
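A minimal sketch of the proposed fallback, assuming the `requests` library. The function name `check_url` and its parameters are hypothetical and only illustrate the idea; this is not the actual code in sphinx/builders/linkcheck.py.

```python
import requests


def check_url(url, session=None, timeout=10):
    """Try HEAD first; fall back to GET if the server rejects or drops HEAD.

    Hypothetical helper for illustration only.
    """
    if session is None:
        session = requests.Session()
    try:
        # Cheap check first: HEAD avoids downloading the body.
        response = session.head(url, allow_redirects=True, timeout=timeout)
        response.raise_for_status()
    except (
        requests.exceptions.ConnectionError,   # server closed the connection
        requests.exceptions.HTTPError,         # e.g. 405 Method Not Allowed
        requests.exceptions.TooManyRedirects,  # redirect loop on HEAD
    ):
        # Retry with GET; stream=True defers downloading the body.
        response = session.get(url, stream=True, timeout=timeout)
        response.raise_for_status()
    return response
```

With this shape, a server that drops HEAD requests but answers GET is reported as working, while a server that fails both still raises an exception to the caller.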
A specific example of a website exhibiting this behaviour is the US Patent and Trademark Office.
Your project
sphinx-bug-linkcheck-reports-broken-link.zip
Environment info
OS: macOS 11.3.1 (but this does not appear to be OS-dependent)
Python version: 3.8.10 and 3.9.5
Sphinx version: v3.5.4 and v4.0.2
Additional context
Using curl, we can see that this particular website closes connections when receiving HTTP HEAD.
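The behaviour can also be reproduced without the real website. The sketch below uses only the Python standard library and a local stand-in server (the `DropHeadHandler`, `probe`, and `demo` names are hypothetical) that closes the connection on HEAD but answers GET normally. It is an assumption about how such a server behaves, not a capture of the USPTO site.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DropHeadHandler(BaseHTTPRequestHandler):
    """Local stand-in: drops HEAD requests, serves GET requests."""

    def do_HEAD(self):
        # Send nothing and let the server close the socket.
        self.close_connection = True

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep the demo output quiet.
        pass


def probe(host, port, method, path="/"):
    """Return the status code, or a short string if the peer disconnected."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request(method, path)
        return conn.getresponse().status
    except ConnectionError as exc:
        # http.client.RemoteDisconnected is a ConnectionError subclass.
        return f"disconnected: {exc}"
    finally:
        conn.close()


def demo():
    """Start the stand-in server and probe it with HEAD, then GET."""
    server = HTTPServer(("127.0.0.1", 0), DropHeadHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        host, port = server.server_address
        return probe(host, port, "HEAD"), probe(host, port, "GET")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The HEAD probe fails with a disconnect while the GET probe succeeds, which is the same split that makes linkcheck report a link as broken even though a browser can open it.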