Debug Circle CI failures #562

Closed
wants to merge 3 commits into from

joerick
Contributor

@joerick joerick commented Jan 25, 2021

It might be related to the use of Python 3.6 on Circle... or something else. I'm adding more logging.

@joerick
Contributor Author

joerick commented Jan 25, 2021

Something mega weird is going on at CircleCI. Check this out.

static:~ distiller$ curl -s 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
8b19748473609241e60aa3618bbaf3ed
static:~ distiller$ curl -s 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
6787224326c386c83b8577caa2c6a208
static:~ distiller$ curl -s 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
0b54c89e6862d3b93bb471b847006bb7
static:~ distiller$ curl -s 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
ad477dc2d090bc137f5a8d1741f1d127
static:~ distiller$ curl -s 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
05335b79f29ee06eaf4c1f766b7e42c9
static:~ distiller$ curl -s 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
8b19748473609241e60aa3618bbaf3ed
static:~ distiller$ curl -s 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
8b19748473609241e60aa3618bbaf3ed
static:~ distiller$ curl -s 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
8b19748473609241e60aa3618bbaf3ed

^ yes, same URL every time, but if I keep doing it, sometimes I get an error, other times, a corrupt file. The correct hash is 8b19748473609241e60aa3618bbaf3ed.
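
For anyone who wants to repeat the experiment above without pasting the command by hand, here is a minimal Python sketch. The helper names `fetch` and `tally_digests` are hypothetical (they are not part of cibuildwheel), and the fetcher is injectable so the loop can be exercised without network access.

```python
import hashlib
from collections import Counter
from typing import Callable
from urllib.request import urlopen

URL = "https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg"


def fetch(url: str) -> bytes:
    """Download the whole response body into memory."""
    with urlopen(url) as response:
        return response.read()


def tally_digests(
    url: str, attempts: int, fetcher: Callable[[str], bytes] = fetch
) -> Counter:
    """Fetch `url` repeatedly and count how often each MD5 digest appears."""
    counts: Counter = Counter()
    for _ in range(attempts):
        try:
            body = fetcher(url)
        except OSError as exc:  # covers URLError and socket-level failures
            counts[f"error: {type(exc).__name__}"] += 1
            continue
        counts[hashlib.md5(body).hexdigest()] += 1
    return counts
```

Calling `tally_digests(URL, attempts=8)` on an affected machine should give more than one key if the corruption reproduces; a healthy connection should give a single digest.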

@joerick
Contributor Author

joerick commented Jan 25, 2021

curl happens to be using HTTP/2, which I'm guessing urlopen isn't, but the same happens with HTTP/1.1:

static:~ distiller$ curl -s --http1.1 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
8b19748473609241e60aa3618bbaf3ed
static:~ distiller$ curl -s --http1.1 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
8b19748473609241e60aa3618bbaf3ed
static:~ distiller$ curl -s --http1.1 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
8b19748473609241e60aa3618bbaf3ed
static:~ distiller$ curl -s --http1.1 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
8b19748473609241e60aa3618bbaf3ed
static:~ distiller$ curl -s --http1.1 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
0894f56c96b518cb00cbb1251000df08
static:~ distiller$ curl -s --http1.1 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' | md5
8b19748473609241e60aa3618bbaf3ed

@henryiii
Contributor

Can we add and check the hash? I was going to suggest that anyway, as it would catch this sooner instead of trying to install a corrupted file, which is a little worrisome.
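
A rough sketch of what such a check could look like, assuming the expected digest is pinned somewhere alongside the download URL. `CorruptDownloadError` and `verify_md5` are hypothetical names, not existing cibuildwheel code. MD5 is used only because it is the hash quoted in this thread; it catches accidental corruption, and SHA-256 would be the stronger choice if tampering is a concern.

```python
import hashlib


class CorruptDownloadError(Exception):
    """Raised when downloaded bytes do not match the pinned digest."""


def verify_md5(data: bytes, expected_md5: str, source: str) -> None:
    """Raise a descriptive error if `data` does not hash to `expected_md5`."""
    actual = hashlib.md5(data).hexdigest()
    if actual != expected_md5.lower():
        raise CorruptDownloadError(
            f"{source}: expected md5 {expected_md5}, got {actual} "
            f"({len(data)} bytes); refusing to install"
        )
```

Running this before the installer is invoked would turn a confusing install failure into an immediate "invalid download" error that names the URL.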

@joerick
Contributor Author

joerick commented Jan 25, 2021

I grabbed one of the corrupt ones. Bytes-wise, they seem to start the same, but at some point diverge...

image

image
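
To pin down where a good and a corrupt copy diverge, something like the following could be used. This is only a sketch: `first_difference` is a hypothetical helper, and a byte-by-byte Python loop is slow but adequate for a one-off comparison of two installers of this size.

```python
from typing import Optional


def first_difference(a: bytes, b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One input is a strict prefix of the other.
        return min(len(a), len(b))
    return None
```

Knowing the offset makes it easier to tell whether the divergence falls on a suspicious boundary, such as a multiple of a block or chunk size.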

@henryiii
Contributor

henryiii commented Jan 25, 2021

Did this start happening on Saturday? Seems like maybe it was around a bit more than that? They had a database maintenance job then. https://status.circleci.com

@joerick
Contributor Author

joerick commented Jan 25, 2021

Good request:

static:~ distiller$ curl -vv --http1.1 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' -o file.dat && md5 file.dat
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 151.101.184.223...
* TCP_NODELAY set
* Connected to www.python.org (151.101.184.223) port 443 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [108 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [3279 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [333 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: businessCategory=Private Organization; 1.3.6.1.4.1.311.60.2.1.3=US; 1.3.6.1.4.1.311.60.2.1.2=Delaware; serialNumber=3359300; C=US; ST=Oregon; L=Beaverton; O=Python Software Foundation; CN=www.python.org
*  start date: Sep 29 00:00:00 2020 GMT
*  expire date: Oct 31 00:00:00 2021 GMT
*  subjectAltName: host "www.python.org" matched cert's "www.python.org"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 Extended Validation Server CA
*  SSL certificate verify ok.
> GET /ftp/python/3.9.1/python-3.9.1-macos11.0.pkg HTTP/1.1
> Host: www.python.org
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Connection: keep-alive
< Content-Length: 37451735
< Server: nginx
< Content-Type: application/octet-stream
< Last-Modified: Mon, 07 Dec 2020 18:03:11 GMT
< ETag: "5fce6e5f-23b77d7"
< X-Clacks-Overhead: GNU Terry Pratchett
< Via: 1.1 varnish, 1.1 varnish
< Accept-Ranges: bytes
< Age: 394508
< Date: Mon, 25 Jan 2021 19:50:34 GMT
< X-Served-By: cache-lga21966-LGA, cache-mdw17331-MDW
< X-Cache: HIT, HIT
< X-Cache-Hits: 0, 0
< X-Timer: S1611604235.796561,VS0,VE1
< Strict-Transport-Security: max-age=63072000; includeSubDomains
< 
{ [1371 bytes data]
100 35.7M  100 35.7M    0     0  20.8M      0  0:00:01  0:00:01 --:--:-- 20.8M
* Connection #0 to host www.python.org left intact
MD5 (file.dat) = 8b19748473609241e60aa3618bbaf3ed

Corrupt request:

static:~ distiller$ curl -vv --http1.1 'https://www.python.org/ftp/python/3.9.1/python-3.9.1-macos11.0.pkg' -o file.dat && md5 file.dat
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 199.232.96.223...
* TCP_NODELAY set
* Connected to www.python.org (199.232.96.223) port 443 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [108 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [3279 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [333 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: businessCategory=Private Organization; 1.3.6.1.4.1.311.60.2.1.3=US; 1.3.6.1.4.1.311.60.2.1.2=Delaware; serialNumber=3359300; C=US; ST=Oregon; L=Beaverton; O=Python Software Foundation; CN=www.python.org
*  start date: Sep 29 00:00:00 2020 GMT
*  expire date: Oct 31 00:00:00 2021 GMT
*  subjectAltName: host "www.python.org" matched cert's "www.python.org"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 Extended Validation Server CA
*  SSL certificate verify ok.
> GET /ftp/python/3.9.1/python-3.9.1-macos11.0.pkg HTTP/1.1
> Host: www.python.org
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Connection: keep-alive
< Content-Length: 37451735
< Server: nginx
< Content-Type: application/octet-stream
< Last-Modified: Mon, 07 Dec 2020 18:03:11 GMT
< ETag: "5fce6e5f-23b77d7"
< X-Clacks-Overhead: GNU Terry Pratchett
< Via: 1.1 varnish, 1.1 varnish
< Accept-Ranges: bytes
< Age: 394511
< Date: Mon, 25 Jan 2021 19:50:37 GMT
< X-Served-By: cache-lga21951-LGA, cache-lck10927-LCK
< X-Cache: HIT, HIT
< X-Cache-Hits: 0, 0
< X-Timer: S1611604238.653765,VS0,VE0
< Strict-Transport-Security: max-age=63072000; includeSubDomains
< 
{ [1371 bytes data]
100 35.7M  100 35.7M    0     0  22.4M      0  0:00:01  0:00:01 --:--:-- 22.4M
* Connection #0 to host www.python.org left intact
MD5 (file.dat) = 81be867a150def9911b27bdb76245df0

@joerick
Contributor Author

joerick commented Jan 25, 2021

Thing is, these are HTTPS connections. So I don't see how Circle could be MITM'ing it. So I'm thinking it's a Fastly issue. Dodgy CDN node perhaps... the problems only seem to be happening when curl connects to 199.232.96.223.
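
One way to back up the "single bad node" hunch is to record the connected IP next to each digest and tally corruption per address. The sketch below uses only the two verbose requests shown earlier in this thread as sample data; `corrupt_counts_by_ip` is a hypothetical helper, and two samples are of course far too few to prove anything.

```python
from collections import defaultdict

GOOD_MD5 = "8b19748473609241e60aa3618bbaf3ed"


def corrupt_counts_by_ip(observations):
    """Map each edge IP to a (corrupt, total) pair of download counts."""
    stats = defaultdict(lambda: [0, 0])
    for ip, digest in observations:
        stats[ip][1] += 1
        if digest != GOOD_MD5:
            stats[ip][0] += 1
    return {ip: tuple(pair) for ip, pair in stats.items()}


# (connected IP, MD5 of the downloaded file) from the two curl -vv runs above
observations = [
    ("151.101.184.223", "8b19748473609241e60aa3618bbaf3ed"),
    ("199.232.96.223", "81be867a150def9911b27bdb76245df0"),
]
print(corrupt_counts_by_ip(observations))
```

With more samples, a corrupt count that stays at zero for every address except one would support the dodgy-node theory.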

@joerick
Contributor Author

joerick commented Jan 25, 2021

> Can we add and check the hash? I was going to suggest that anyway, as it would catch this sooner instead of trying to install a corrupted file, which is a little worrisome.

I suppose we could, yes. But there is a built-in assumption of cibuildwheel that we trust HTTPS... after all, it's downloaded from pip over HTTPS in the first place.

@joerick
Contributor Author

joerick commented Jan 25, 2021

I've filed a support request with Fastly... we'll see...

@henryiii
Contributor

@joerick
Contributor Author

joerick commented Jan 25, 2021

> You can always add hashes to your requirements, like manylinux does: https://github.com/pypa/manylinux/blob/master/docker/build_scripts/requirements.txt

I know, I wanted to add them to our constraints in #256, but it wasn't possible at the time. See #256 (comment) for the gory details.

@henryiii
Contributor

The "everything pinned" would be an issue that might eventually be solvable, but the nice thing with hashes for the Python versions would be that it would have shown the "correct" error (invalid Python.pkg) immediately for something like this, reducing debugging. It should be pretty easy to add to our collection update script, I'd think, now that we have that.

@joerick
Contributor Author

joerick commented Jan 29, 2021

Fastly think this might be fixed... let's see.

@joerick joerick closed this Jan 29, 2021
@joerick joerick reopened this Jan 29, 2021
@henryiii
Contributor

Nope.

@joerick
Contributor Author

joerick commented Jan 29, 2021

I think the close/reopen trick doesn't work on CircleCI; that run is from 4 days ago.

@henryiii
Contributor

Wow, that usually works; I think you are right. How about a git commit --amend followed by a force push? No need to change anything I think, it will update the date of the commit.

@joerick
Contributor Author

joerick commented Jan 29, 2021

I've added a line to print the md5, which might be useful if it doesn't work!
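
For reference, logging the digest of a downloaded file can be done along these lines. This is a sketch and not the actual line added in this PR; `file_md5` is a hypothetical helper that reads in chunks so a large installer is never held fully in memory.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Printing `file_md5(...)` right after the download makes a corrupt file visible in the CI log, where it can be compared against the known-good value 8b19748473609241e60aa3618bbaf3ed.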

@henryiii
Contributor

😂

I can look at adding an MD5 check to the Python download; that would make the error clearer in the future. Though I think we should get #484 in first, I've been holding off waiting for that to land. ;)

@joerick
Contributor Author

joerick commented Jan 30, 2021

Passed, let's try again...

@joerick
Contributor Author

joerick commented Jan 30, 2021

That's 3 passes in a row, I think it's good :)

@joerick joerick closed this Jan 30, 2021
@joerick joerick deleted the debug-circleci branch February 5, 2021 21:19