sanity check for TensorFlow-2.6.0-foss-2021a.eb fails (download problem when running TensorFlow-2.x_mnist-test.py test case) #14058
Comments
You need the ca-certificates package installed |
You mean system-wise? If so, that is out of my hands. |
ca-certificates is a static RPM providing the trusted CAs. It cannot crash; at worst it does not provide the right certificates. |
The reason for the crash above is that it can't find the CA cert when doing a secure download. The only reason for that I can think of is that the RPM is not installed. |
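One way to see where a given Python looks for CA certificates is to ask its `ssl` module. This is a minimal sketch, not something posted in the thread; the helper name `describe_ca_lookup` is made up for illustration. It only reports the lookup paths of whichever OpenSSL that Python is linked against.

```python
import ssl


def describe_ca_lookup():
    """Return where this Python's OpenSSL expects to find CA certificates.

    Hypothetical helper for illustration only.
    """
    paths = ssl.get_default_verify_paths()
    return {
        "openssl_version": ssl.OPENSSL_VERSION,
        # Compiled-in defaults of the OpenSSL library Python is linked against.
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
        # Resolved locations; None means the default does not exist on disk.
        "cafile": paths.cafile,
        "capath": paths.capath,
    }


if __name__ == "__main__":
    for key, value in describe_ca_lookup().items():
        print(f"{key}: {value}")
```

If both `cafile` and `capath` come back as `None`, the OpenSSL in use has no usable default CA location, regardless of whether the system RPM is installed.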
The RPM is there and working on all nodes. Just confirmed with the sysadmin. |
@avapirev Let's try and narrow this down a bit... Can you provide some more information about the system on which you're seeing this? Which OS, etc.? Does If that works, please try running this Python code, after loading the
from urllib.request import urlopen
res = urlopen('https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz')
print(res.status)
Output should be |
This works:
|
+1, I have the same issue. My system info:
This remark from @boegel
Triggered something in me: at first, we didn't have OpenSSL's development headers installed in our system package OpenSSL. Thus, EasyBuild would build its own OpenSSL. I think this was the OpenSSL our Python was built against. Then, I ran into the issue that TensorFlow threw this error during the build:
The build logs showed that OpenSSL was in the Maybe it's because now, my Python was built against EasyBuild's OpenSSL, but now that we have a system one, that messes things up? @avapirev any chance that you went through something similar? If so, that would be a good indication that this might be the cause. I'd love to try and build a completely new stack now that we have the OS OpenSSL headers in place, but since this is on our new system, that's a bit of a challenge (filesystems are not really stable yet). If I do at some point manage to rebuild the full stack, I'll let you know if it helped. |
First, loading
Then, I reinstalled
So, I don't think this issue is TensorFlow related at all, it's more related to the EasyBuild installation of OpenSSL-1.1. I haven't tried reinstalling TF yet, but my bet is that this is the solution:
|
Small update: I can confirm that the three steps I described above fixed the issue for me. |
@avapirev So, it's not a system-wide error, but an issue specific to the |
FYI: This works fine for me with no need to recompile OpenSSL
|
@avapirev Does just the import work, or also downloading a file over HTTPS via |
Maybe I forgot to reply. No, the _ssl import does not work: python
|
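To separate "the `_ssl` extension cannot be loaded" from "HTTPS downloads fail", the import can be tested on its own. The sketch below is illustrative and not taken from the thread; `check_ssl_extension` is a hypothetical helper name.

```python
import importlib


def check_ssl_extension():
    """Try to import the _ssl C extension; return (ok, detail).

    Hypothetical helper for illustration only.
    """
    try:
        module = importlib.import_module("_ssl")
    except ImportError as err:
        # Often caused by a missing or mismatched libssl/libcrypto shared library.
        return False, str(err)
    # On success, report which OpenSSL the extension was built against.
    return True, getattr(module, "OPENSSL_VERSION", "unknown")


if __name__ == "__main__":
    ok, detail = check_ssl_extension()
    print("import _ssl:", "OK" if ok else "FAILED", "-", detail)
```

A failed import points at the Python/OpenSSL linkage, whereas a successful import combined with a failing download points at certificate lookup instead.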
Also ran into this and can confirm @casparvl's steps are working. We are still on CentOS7.9 and I had to install |
I've just bumped against this, and second the question from @seb45tian |
I'm seeing this same issue now in the CentOS 7.9 container I'm starting to use for regression tests (since all our systems are now RHEL8)... So I'll try to take a look at what's going on here. |
Problem reproduced in CentOS 7.9 container:
Issue is not the
I only have OpenSSL 1.0.2 installed as OS package, OpenSSL 1.1 is not there (currently) in the container:
The fix in easybuilders/easybuild-easyblocks#2575 looks like it could be related (TensorFlow + OpenSSL), so I'll try that first... |
OK, spent quite a bit of time debugging this today, and it seems like I've got it figured out... It boils down to a bug in the from-source OpenSSL installation that is provided by Just symlinking The
Symlinking First, without this change:
(stopping the Then, adding the missing symlink:
recheck:
and checking again with Python:
So, we should:
cc @lexming edit:
|
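Whether a certificate location is a symlink, and whether that symlink resolves to something that exists, can be checked from Python. The exact paths discussed in the comment above are not reproduced here; this sketch uses the compiled-in OpenSSL defaults as a stand-in, and `inspect_path` is a hypothetical helper.

```python
import os
import ssl


def inspect_path(path):
    """Describe a path: does it exist, is it a symlink, where does it resolve?

    Hypothetical helper for illustration only.
    """
    return {
        "path": path,
        "exists": os.path.exists(path),      # False for a dangling symlink
        "is_symlink": os.path.islink(path),
        "resolved": os.path.realpath(path),
    }


if __name__ == "__main__":
    # Assumption: the compiled-in defaults are the locations of interest.
    defaults = ssl.get_default_verify_paths()
    for candidate in (defaults.openssl_cafile, defaults.openssl_capath):
        print(inspect_path(candidate))
```

Running it before and after adding a symlink shows whether the location OpenSSL expects has become resolvable.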
The problem with OpenSSL is fixed with the updated easyblock in https://github.com/easybuilders/easybuild-easyblocks, which is included in EasyBuild v4.5.5. So, regardless of whether |
The error is past the sanity check during the install phase:
EDIT:
Simply pasting the above mnist.npz link in a browser downloads the file.