linkcheck in CI is broken on PyPI URLs with anchors #1744

Closed
webknjaz opened this issue Dec 11, 2024 · 1 comment · Fixed by #1767
Labels
good first issue help wanted type: bug A confirmed bug or unintended behavior type: task Something that needs to be done that is not a bug or feature

Comments

@webknjaz
Member

> The linkcheck reports on `https://pypi.org/project/pip/23.3.1/#files` (Non-existing anchor), but that's surprising: yesterday's cron job reported it as a success and the link works (opens PyPI on the files tab). Also, it's not related to any changes made for this PR.

You're right. Over the past few days I've noticed a brief Fastly loading screen showing up on PyPI, which then redirects to the page I was originally heading to.
So I probed it with cURL just now and verified that this is what's happening: the HTML in the first HTTP response no longer contains that anchor (the bypass is probably cookie-based):

$ curl -v 'https://pypi.org/project/pip/23.3.1/#files'
* Host pypi.org:443 was resolved.
* IPv6: 2a04:4e42:600::223, 2a04:4e42:200::223, 2a04:4e42:400::223, 2a04:4e42::223
* IPv4: 151.101.0.223, 151.101.192.223, 151.101.64.223, 151.101.128.223
*   Trying [2a04:4e42:600::223]:443...
* Immediate connect fail for 2a04:4e42:600::223: Network is unreachable
*   Trying [2a04:4e42:200::223]:443...
* Immediate connect fail for 2a04:4e42:200::223: Network is unreachable
*   Trying [2a04:4e42:400::223]:443...
* Immediate connect fail for 2a04:4e42:400::223: Network is unreachable
*   Trying [2a04:4e42::223]:443...
* Immediate connect fail for 2a04:4e42::223: Network is unreachable
*   Trying 151.101.0.223:443...
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / x25519 / RSASSA-PSS
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=pypi.org
*  start date: Apr 23 04:22:05 2024 GMT
*  expire date: May 25 04:22:04 2025 GMT
*  subjectAltName: host "pypi.org" matched cert's "pypi.org"
*  issuer: C=BE; O=GlobalSign nv-sa; CN=GlobalSign Atlas R3 DV TLS CA 2024 Q2
*  SSL certificate verify ok.
*   Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 2: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connected to pypi.org (151.101.0.223) port 443
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://pypi.org/project/pip/23.3.1/#files
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: pypi.org]
* [HTTP/2] [1] [:path: /project/pip/23.3.1/]
* [HTTP/2] [1] [user-agent: curl/8.10.1]
* [HTTP/2] [1] [accept: */*]
> GET /project/pip/23.3.1/ HTTP/2
> Host: pypi.org
> User-Agent: curl/8.10.1
> Accept: */*
> 
* Request completely sent off
< HTTP/2 200 
< set-cookie: _fs_ch_st_FSBmUei20MqUiJb9=ARwOUcLntEKxNCnL5W0on4gbZZuJgKNFAuJTwV5kqlwObPx5zOjadDLJ8iZ2jXY2v-kRpx0J1npexkvu_R75uguNU_5S13wmbTRuQr1zm4AghacYsZb2dTQG9sPxmJahlzJLe16uBKWgCnaeE4pXhqsMs77NogoTpKoqJhS6nkwgjtK2hJA3s4d8d4JnXTvMtJRqm3vtuDFWp5s6OqiT-u3N-QTbB58=; Max-Age=10; HttpOnly; Path=/
< content-type: text/html; charset=utf-8
< cache-control: no-store
< accept-ranges: bytes
< date: Wed, 11 Dec 2024 23:00:42 GMT
< x-served-by: cache-iad-kcgs7200169-IAD, cache-iad-kjyo7100141-IAD, cache-fra-eddf8230065-FRA
< x-cache: MISS, MISS
< x-cache-hits: 0, 0
< x-timer: S1733958043.869433,VS0,VE107
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-frame-options: deny
< x-xss-protection: 1; mode=block
< x-content-type-options: nosniff
< x-permitted-cross-domain-policies: none
< permissions-policy: publickey-credentials-create=(self),publickey-credentials-get=(self),accelerometer=(),ambient-light-sensor=(),autoplay=(),battery=(),camera=(),display-capture=(),document-domain=(),encrypted-media=(),execution-while-not-rendered=(),execution-while-out-of-viewport=(),fullscreen=(),gamepad=(),geolocation=(),gyroscope=(),hid=(),identity-credentials-get=(),idle-detection=(),local-fonts=(),magnetometer=(),microphone=(),midi=(),otp-credentials=(),payment=(),picture-in-picture=(),screen-wake-lock=(),serial=(),speaker-selection=(),storage-access=(),usb=(),web-share=(),xr-spatial-tracking=()
< 
<!DOCTYPE html>
<html>
  <head>
    <meta
      http-equiv="Content-Security-Policy"
      content="default-src 'self'; img-src 'self' data:; media-src 'self' data:; object-src 'none'; style-src 'self' 'sha256-o4vzfmmUENEg4chMjjRP9EuW9ucGnGIGVdbl8d0SHQQ='; script-src 'self' 'sha256-a9bHdQGvRzDwDVzx8m+Rzw+0FHZad8L0zjtBwkxOIz4=';"
    />
    <link
      href="/_fs-ch-1T1wmsGaOgGaSxcX/assets/inter-var.woff2"
      rel="preload"
      as="font"
      type="font/woff2"
      crossorigin
    />
    <link href="/_fs-ch-1T1wmsGaOgGaSxcX/assets/styles.css" rel="stylesheet" />
    <meta
      name="viewport"
      content="width=device-width, initial-scale=1, maximum-scale=1"
    />
    <style>
      #loading-error {
        font-size: 16px;
        font-family: 'Inter', sans-serif;
        margin-top: 10px;
        margin-left: 10px;
        display: none;
      }
    </style>
  </head>
  <body>
    <noscript>
      <div class="noscript-container">
        <div class="noscript-content">
          <img
            src="/_fs-ch-1T1wmsGaOgGaSxcX/assets/errorIcon.svg"
            alt="Error Icon"
            class="error-icon"
          />
          <span class="noscript-span"
            >JavaScript is disabled in your browser.</span
          >
          Please enable JavaScript to proceed.
        </div>
      </div>
    </noscript>
    <div id="loading-error">
      A required part of this site couldn’t load. This may be due to a browser
      extension, network issues, or browser settings. Please check your
      connection, disable any ad blockers, or try using a different browser.
    </div>
    <script>
      function loadScript(src) {
        return new Promise((resolve, reject) => {
          const script = document.createElement('script');
          script.onload = resolve;
          script.onerror = (event) => {
            console.error('Script load error event:', event);
            document.getElementById('loading-error').style.display = 'block';
            reject(
              new Error(
                `Failed to load script: ${src}, Please contact the service administrator.`
              )
            );
          };
          script.src = src;
          document.body.appendChild(script);
        });
      }

      loadScript('/_fs-ch-1T1wmsGaOgGaSxcX/errors.js')
        .then(() => {
          const script = document.createElement('script');
          script.src = '/_fs-ch-1T1wmsGaOgGaSxcX/script.js?reload=true';
          script.onerror = (event) => {
            console.error('Script load error event:', event);
            const errorMsg = new Error(
              `Failed to load script: ${script.src}. Please contact the service administrator.`
            );
            console.error(errorMsg);
            handleScriptError();
          };
          document.body.appendChild(script);
        })
        .catch((error) => {
          console.error(error);
        });
    </script>
  </body>
</html>
* Connection #0 to host pypi.org left intact
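
To illustrate why an anchor check fails against a response like the one above, here is a minimal sketch of an id lookup over static HTML. It is not Sphinx's actual implementation; `AnchorFinder`, `has_anchor`, and both sample pages are hypothetical, and the `id="files"` element in `real_page` is an assumption about what a checker would need to find, not PyPI's real markup.

```python
from html.parser import HTMLParser


class AnchorFinder(HTMLParser):
    """Record whether any start tag carries the wanted id or name attribute."""

    def __init__(self, anchor: str) -> None:
        super().__init__()
        self.anchor = anchor
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        for key, value in attrs:
            if key in ("id", "name") and value == self.anchor:
                self.found = True


def has_anchor(html: str, anchor: str) -> bool:
    """Return True if the static HTML contains an element for #anchor."""
    finder = AnchorFinder(anchor)
    finder.feed(html)
    return finder.found


# Hypothetical stand-in for the interstitial page: only #loading-error exists.
challenge_page = '<html><body><div id="loading-error"></div></body></html>'
# Hypothetical stand-in for a page that does carry the target element.
real_page = '<html><body><div id="files"></div></body></html>'

print(has_anchor(challenge_page, "files"))
print(has_anchor(real_page, "files"))
```

A checker that does not execute JavaScript only ever sees the first kind of page, so it cannot find the fragment target even though the link works in a browser.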

Nevertheless, this blocks PR merges, so we have to address it, possibly by adding the URL to nitpick_ignore or by adjusting the anchor checks somehow. It's best to do this in a separate PR.

Originally posted by @webknjaz in #1662 (comment)

@webknjaz
Member Author

The reason was posted @ Warehouse:

Please see this thread that explains what changed and why. https://discuss.python.org/t/fastly-interfering-with-pypi-search/73597/6

Originally posted by @miketheman in #17285


So the action item here is to configure Sphinx to skip the URL fragment (anchor) checks for pypi.org URLs.
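
One way this could look in `conf.py` is sketched below. It assumes Sphinx 7.1 or newer, where the `linkcheck_anchors_ignore_for_url` option is available, and the regular expression is only a guess at a reasonable scope; this is not necessarily what the eventual fix did. Note that `nitpick_ignore` covers unresolved cross-references, whereas the linkcheck builder has its own `linkcheck_*` settings.

```python
import re

# Hypothetical conf.py excerpt (assumes Sphinx >= 7.1).
# URLs matching these patterns are still fetched by linkcheck,
# but their #fragment is not looked up in the returned HTML.
linkcheck_anchors_ignore_for_url = [
    r"https://pypi\.org/.*",
]

# Quick sanity check of the pattern against the URL from this issue
# (with the fragment stripped).
sample_url = "https://pypi.org/project/pip/23.3.1/"
matches = any(re.match(p, sample_url) for p in linkcheck_anchors_ignore_for_url)
print(matches)
```

The link itself would still be verified, so a genuinely dead PyPI URL should continue to be reported; only the anchor lookup is dropped.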

@webknjaz webknjaz moved this to 🧐 @webknjaz's review queue 📋 in 📅 Procrastinating in public Dec 18, 2024
@webknjaz webknjaz moved this from 🧐 @webknjaz's review queue 📋 to 🦉 Inclusion ⚖️ in 📅 Procrastinating in public Dec 18, 2024
ncoghlan added a commit to ncoghlan/packaging.python.org that referenced this issue Dec 23, 2024
@github-project-automation github-project-automation bot moved this from 🦉 Inclusion ⚖️ to 🌈 Done 🦄 in 📅 Procrastinating in public Dec 23, 2024