Read-only S3 driver (ros3) doesn't seem to work #125

Closed
djhoese opened this issue Aug 10, 2020 · 12 comments · Fixed by #134

Comments

@djhoese
Contributor

djhoese commented Aug 10, 2020

I created #122 and #124 so I could play with the S3 reading ability of HDF5. I tried using a public S3 bucket to inspect a NetCDF4 file with no luck. I then started looking at one of the files here: https://www.hdfgroup.org/solutions/enterprise-support/cloud-amazon-s3-storage-hdf5-connector/ but got the same errors. Here's a command from the example video that should work:

h5dump -pBH --filedriver=ros3 "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/snp500.h5"  

But it just produces:

h5dump error: unable to open file "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/snp500.h5"

I know this is probably an upstream issue, but I wanted to open this so that anyone else who tries this feature knows it isn't working. Also, is there a way to install the built package from a PR, so I could try enabling some debug flags to debug this further?


Environment (conda list):
$ conda list
# packages in environment at /home/davidh/miniconda3/envs/hdf5_test:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
c-ares                    1.16.1               h516909a_0    conda-forge
ca-certificates           2020.6.20            hecda079_0    conda-forge
hdf5                      1.12.0          nompi_h54c07f9_101    conda-forge
krb5                      1.17.1               hfafb76e_2    conda-forge
libcurl                   7.71.1               hcdd3856_4    conda-forge
libedit                   3.1.20191231         h46ee950_1    conda-forge
libev                     4.33                 h516909a_0    conda-forge
libgcc-ng                 9.3.0               h24d8f2e_14    conda-forge
libgfortran-ng            7.5.0               hdf63c60_14    conda-forge
libgomp                   9.3.0               h24d8f2e_14    conda-forge
libnghttp2                1.41.0               hab1572f_1    conda-forge
libssh2                   1.9.0                hab1572f_5    conda-forge
libstdcxx-ng              9.3.0               hdf63c60_14    conda-forge
ncurses                   6.2                  he1b5a44_1    conda-forge
openssl                   1.1.1g               h516909a_1    conda-forge
tk                        8.6.10               hed695b0_0    conda-forge
zlib                      1.2.11            h516909a_1007    conda-forge

Details about conda and system ( conda info ):
$ conda info

     active environment : hdf5_test
    active env location : /home/davidh/miniconda3/envs/hdf5_test
            shell level : 2
       user config file : /home/davidh/.condarc
 populated config files : /home/davidh/.condarc
          conda version : 4.8.3
    conda-build version : not installed
         python version : 3.7.3.final.0
       virtual packages : __cuda=10.2
                          __glibc=2.31
       base environment : /home/davidh/miniconda3  (writable)
           channel URLs : https://conda.anaconda.org/conda-forge/linux-64
                          https://conda.anaconda.org/conda-forge/noarch
                          https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/davidh/miniconda3/pkgs
                          /home/davidh/.conda/pkgs
       envs directories : /home/davidh/miniconda3/envs
                          /home/davidh/.conda/envs
               platform : linux-64
             user-agent : conda/4.8.3 requests/2.23.0 CPython/3.7.3 Linux/5.4.0-7634-generic pop/20.04 glibc/2.31
                UID:GID : 53807:1000
             netrc file : None
           offline mode : False
@djhoese
Contributor Author

djhoese commented Sep 18, 2020

I've been emailing with the HDF5 group and after talking with some of their developers the support person said:

He said that there are several debugging-print defines in the source that
you could enable and rebuild, to help troubleshoot what's going on. One of
the features has curl spit back its error codes and messages.

In src/H5FDs3comms.c:
#define S3COMMS_DEBUG 1 (was 0)
#define S3COMMS_CURL_VERBOSITY 2 (was 0) 

Does anyone know the easiest way for me to enable these (a new PR?) for this repository and test them locally without affecting the master branch?
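For a local build, the two defines quoted above could be flipped in an unpacked source tree with something like the sketch below. It assumes each define sits on its own line and ends in a bare 0, as quoted in the email; enable_s3comms_debug is a hypothetical helper name, not part of HDF5 or the feedstock.

```shell
# Hypothetical helper: switch on the S3 comms debug defines in place.
# Assumes the lines look like "#define S3COMMS_DEBUG 0" (value last on the line).
enable_s3comms_debug() {
    src="$1"            # path to src/H5FDs3comms.c in the unpacked source
    tmp="$src.tmp"
    # Only the two matching #define lines are rewritten; all others pass through.
    sed -E \
        -e '/^#define[[:space:]]+S3COMMS_DEBUG[[:space:]]/ s/[[:space:]]0[[:space:]]*$/ 1/' \
        -e '/^#define[[:space:]]+S3COMMS_CURL_VERBOSITY[[:space:]]/ s/[[:space:]]0[[:space:]]*$/ 2/' \
        "$src" > "$tmp" && mv "$tmp" "$src"
}
```

Called as enable_s3comms_debug src/H5FDs3comms.c before rebuilding, this leaves the rest of the file untouched, so the change is easy to review with a plain diff.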

@djhoese
Contributor Author

djhoese commented Oct 19, 2020

I was able to follow the conda-forge instructions for building a package locally and give the HDF5 group the information they needed. We still can't figure it out, but they've now made an issue and we're assuming it is the super new version of curl that conda-forge is using:

This is now a bug report on their JIRA instance: https://jira.hdfgroup.org/browse/HDFFV-11156 (use hdf5group.org free login)

@jakirkham
Member

cc @conda-forge/ros-core

@yarikoptic

FWIW, this does indeed look like something "specific" to conda. I also made a custom build on Debian (and filed a wishlist bug report against Debian), and that one works. Here are strace -tt -e trace=network traces of the Debian and conda builds:

Debian
107921 14:49:52.909027 socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP) = 3
107921 14:49:52.909649 socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0
107921 14:49:52.910571 socketpair(AF_UNIX, SOCK_STREAM, 0, [5, 6]) = 0
107922 14:49:52.911698 socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 7
107922 14:49:52.911788 connect(7, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
107922 14:49:52.911881 socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 7
107922 14:49:52.911906 connect(7, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
107922 14:49:52.913284 socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 7
107922 14:49:52.913331 setsockopt(7, SOL_IP, IP_RECVERR, [1], 4) = 0
107922 14:49:52.913392 connect(7, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, 16) = 0
107922 14:49:52.913455 sendmmsg(7, [{msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="c\216\1\0\0\1\0\0\0\0\0\0\2s3\tamazonaws\3com\0\0\1"..., iov_len=34}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=34}, {msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base=":\214\1\0\0\1\0\0\0\0\0\0\2s3\tamazonaws\3com\0\0\34"..., iov_len=34}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=34}], 2, MSG_NOSIGNAL) = 2
107922 14:49:52.915755 recvfrom(7, ":\214\201\200\0\1\0\0\0\0\0\0\2s3\tamazonaws\3com\0\0\34"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [28->16]) = 34
107922 14:49:52.944396 recvfrom(7, "c\216\201\200\0\1\0\1\0\0\0\0\2s3\tamazonaws\3com\0\0\1"..., 65536, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [28->16]) = 50
107922 14:49:52.944567 sendto(6, "\1", 1, MSG_NOSIGNAL, NULL, 0) = 1
107922 14:49:52.944946 +++ exited with 0 +++
107921 14:49:52.945062 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 5
107921 14:49:52.945188 setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0
107921 14:49:52.945417 connect(5, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("52.217.89.246")}, 16) = -1 EINPROGRESS (Operation now in progress)
107921 14:49:52.979683 getsockopt(5, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
107921 14:49:52.979904 getpeername(5, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("52.217.89.246")}, [128->16]) = 0
107921 14:49:52.979952 getsockname(5, {sa_family=AF_INET, sin_port=htons(44804), sin_addr=inet_addr("192.168.1.34")}, [128->16]) = 0
107921 14:49:52.980083 sendto(5, "HEAD /hdfgroup/data/hdf5demo/snp"..., 88, MSG_NOSIGNAL, NULL, 0) = 88
107921 14:49:53.062339 recvfrom(5, "HTTP/1.1 200 OK\r\nx-amz-id-2: P0L"..., 16384, 0, NULL, NULL) = 456
107921 14:49:53.064492 sendto(5, "GET /hdfgroup/data/hdf5demo/snp5"..., 105, MSG_NOSIGNAL, NULL, 0) = 105
107921 14:49:53.124857 recvfrom(5, "HTTP/1.1 206 Partial Content\r\nx-"..., 16384, 0, NULL, NULL) = 505
107921 14:49:53.125894 sendto(5, "GET /hdfgroup/data/hdf5demo/snp5"..., 106, MSG_NOSIGNAL, NULL, 0) = 106
107921 14:49:53.178225 recvfrom(5, "HTTP/1.1 206 Partial Content\r\nx-"..., 16384, 0, NULL, NULL) = 515
107921 14:49:53.178920 sendto(5, "GET /hdfgroup/data/hdf5demo/snp5"..., 107, MSG_NOSIGNAL, NULL, 0) = 107
107921 14:49:53.235142 recvfrom(5, "HTTP/1.1 206 Partial Content\r\nx-"..., 16384, 0, NULL, NULL) = 580
107921 14:49:53.236532 sendto(5, "GET /hdfgroup/data/hdf5demo/snp5"..., 108, MSG_NOSIGNAL, NULL, 0) = 108
107921 14:49:53.294508 recvfrom(5, "HTTP/1.1 206 Partial Content\r\nx-"..., 16384, 0, NULL, NULL) = 1014
107921 14:49:53.295519 sendto(5, "GET /hdfgroup/data/hdf5demo/snp5"..., 110, MSG_NOSIGNAL, NULL, 0) = 110
107921 14:49:53.343443 recvfrom(5, "HTTP/1.1 206 Partial Content\r\nx-"..., 16384, 0, NULL, NULL) = 504
107921 14:49:53.344163 recvfrom(5, "\21\0\20\0\0\0\0\0\210\0\0\0\0\0\0\0\250\2\0\0\0\0\0\0\f\0H\1\0\0\0\0"..., 360, 0, NULL, NULL) = 360
107921 14:49:53.345203 sendto(5, "GET /hdfgroup/data/hdf5demo/snp5"..., 110, MSG_NOSIGNAL, NULL, 0) = 110
107921 14:49:53.394022 recvfrom(5, "HTTP/1.1 206 Partial Content\r\nx-"..., 16384, 0, NULL, NULL) = 504
107921 14:49:53.394269 recvfrom(5, "HEAP\0\0\0\0X\0\0\0\0\0\0\0\20\0\0\0\0\0\0\0\310\2\0\0\0\0\0\0"..., 512, 0, NULL, NULL) = 512
107921 14:49:53.394513 sendto(5, "GET /hdfgroup/data/hdf5demo/snp5"..., 109, MSG_NOSIGNAL, NULL, 0) = 109
107921 14:49:53.439866 recvfrom(5, "HTTP/1.1 206 Partial Content\r\nx-"..., 16384, 0, NULL, NULL) = 1047
107921 14:49:53.441137 sendto(5, "GET /hdfgroup/data/hdf5demo/snp5"..., 111, MSG_NOSIGNAL, NULL, 0) = 111
107921 14:49:53.507471 recvfrom(5, "HTTP/1.1 206 Partial Content\r\nx-"..., 16384, 0, NULL, NULL) = 505
107921 14:49:53.507936 recvfrom(5, "SNOD\1\0\1\0\10\0\0\0\0\0\0\0\210\4\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 328, 0, NULL, NULL) = 328
107921 14:49:53.508421 sendto(5, "GET /hdfgroup/data/hdf5demo/snp5"..., 111, MSG_NOSIGNAL, NULL, 0) = 111
107921 14:49:53.560910 recvfrom(5, "HTTP/1.1 206 Partial Content\r\nx-"..., 16384, 0, NULL, NULL) = 505
107921 14:49:53.561567 recvfrom(5, "\1\0\5\0\1\0\0\0000\2\0\0\0\0\0\0\1\0\30\0\0\0\0\0\1\1\1\0\0\0\0\0"..., 512, 0, NULL, NULL) = 512
107921 14:49:53.562562 sendto(5, "GET /hdfgroup/data/hdf5demo/snp5"..., 111, MSG_NOSIGNAL, NULL, 0) = 111
107921 14:49:53.609535 recvfrom(5, "HTTP/1.1 206 Partial Content\r\nx-"..., 16384, 0, NULL, NULL) = 504
107921 14:49:53.610212 recvfrom(5, "\5\0\10\0\1\0\0\0\2\3\0\1\0\0\0\0\10\0\30\0\0\0\0\0\3\2\2\20\10\0\0\0"..., 64, 0, NULL, NULL) = 64
107921 14:49:53.612553 sendto(5, "GET /hdfgroup/data/hdf5demo/snp5"..., 111, MSG_NOSIGNAL, NULL, 0) = 111
107921 14:49:53.721729 recvfrom(5, "HTTP/1.1 206 Partial Content\r\nx-"..., 16384, 0, NULL, NULL) = 506
107921 14:49:53.722377 recvfrom(5, "TREE\1\0006\0\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377 \v \0\0\0\0\0"..., 2096, 0, NULL, NULL) = 2096
107921 14:49:53.730662 +++ exited with 0 +++
conda
107180 14:47:25.570310 socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP) = 3
107180 14:47:25.570479 socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0
107180 14:47:25.570703 socketpair(AF_UNIX, SOCK_STREAM, 0, [5, 6]) = 0
107181 14:47:25.571085 socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 7
107181 14:47:25.571117 connect(7, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
107181 14:47:25.571197 socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 7
107181 14:47:25.571220 connect(7, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
107181 14:47:25.572627 socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 7
107181 14:47:25.572695 setsockopt(7, SOL_IP, IP_RECVERR, [1], 4) = 0
107181 14:47:25.572727 connect(7, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, 16) = 0
107181 14:47:25.572785 sendmmsg(7, [{msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="8\375\1\0\0\1\0\0\0\0\0\0\2s3\tamazonaws\3com\0\0\1"..., iov_len=34}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=34}, {msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\333\377\1\0\0\1\0\0\0\0\0\0\2s3\tamazonaws\3com\0\0\34"..., iov_len=34}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=34}], 2, MSG_NOSIGNAL) = 2
107181 14:47:25.575079 recvfrom(7, "\333\377\201\200\0\1\0\0\0\0\0\0\2s3\tamazonaws\3com\0\0\34"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [28->16]) = 34
107181 14:47:25.609065 recvfrom(7, "8\375\201\200\0\1\0\1\0\0\0\0\2s3\tamazonaws\3com\0\0\1"..., 65536, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [28->16]) = 50
107181 14:47:25.609559 sendto(6, "\1", 1, MSG_NOSIGNAL, NULL, 0) = 1
107181 14:47:25.609898 +++ exited with 0 +++
107180 14:47:25.610010 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 5
107180 14:47:25.610065 setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0
107180 14:47:25.610136 connect(5, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("52.216.12.14")}, 16) = -1 EINPROGRESS (Operation now in progress)
107180 14:47:25.642832 getsockopt(5, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
107180 14:47:25.643160 getpeername(5, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("52.216.12.14")}, [128->16]) = 0
107180 14:47:25.643234 getsockname(5, {sa_family=AF_INET, sin_port=htons(50490), sin_addr=inet_addr("192.168.1.34")}, [128->16]) = 0
107180 14:47:25.643334 sendto(5, "HEAD /hdfgroup/data/hdf5demo/snp"..., 88, MSG_NOSIGNAL, NULL, 0) = 88
107180 14:47:25.702486 recvfrom(5, "HTTP/1.1 200 OK\r\nx-amz-id-2: +JV"..., 16384, 0, NULL, NULL) = 456
107180 14:47:25.706180 sendto(5, "HEAD /hdfgroup/data/hdf5demo/snp"..., 106, MSG_NOSIGNAL, NULL, 0) = 106
107180 14:47:25.749093 recvfrom(5, "HTTP/1.1 206 Partial Content\r\nx-"..., 16384, 0, NULL, NULL) = 497
107180 14:47:48.749377 recvfrom(5, "", 8, 0, NULL, NULL) = 0
107180 14:47:48.757490 +++ exited with 1 +++
So it seems to get something from S3, but then the next query and its response are different, and then it waits (for more?) and never receives anything. (I guess the timestamp is for when the call "returned".)
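The difference is easier to spot when each trace is reduced to its sequence of HTTP methods. In the traces above, the Debian build's second request is a GET (105 bytes), while the conda build sends a second HEAD (106 bytes) and then stalls. A small sketch for extracting that sequence from a saved strace log follows; http_methods is a hypothetical helper name, and it only recognizes requests whose method appears at the start of a sendto() payload.

```shell
# Hypothetical helper: print the HTTP method of every request in a strace log,
# one per line, in the order the requests were sent.
http_methods() {
    # Match sendto(<fd>, "<METHOD> and keep only the method name.
    grep -oE 'sendto\([0-9]+, "(HEAD|GET|PUT|POST) ' "$1" |
        awk '{ gsub(/"/, "", $2); print $2 }'
}
```

Piping the result through uniq -c (for example, http_methods conda.strace | uniq -c, where conda.strace is a placeholder file name) collapses repeated methods into counts, which makes the two builds easy to compare side by side.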

@djhoese
Contributor Author

djhoese commented Nov 2, 2020

@yarikoptic Any idea what versions of curl you used in those two builds (debian and conda)?

Edit: curl not conda

@yarikoptic

debian: 7.72.0-1
$> grep curl debian-1.12.strace
105475 14:36:30.048612 openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libcurl.so.4", O_RDONLY|O_CLOEXEC) = 3
1 35561.....................................:Mon 02 Nov 2020 05:55:32 PM EST:.
lena:/tmp
$> dlocate /lib/x86_64-linux-gnu/libcurl.so.4
git-annex-standalone: /usr/lib/git-annex.linux/lib/x86_64-linux-gnu/libcurl.so.4
libcurl4:amd64: /usr/lib/x86_64-linux-gnu/libcurl.so.4.6.0
libcurl4:amd64: /usr/lib/x86_64-linux-gnu/libcurl.so.4
1 35562.....................................:Mon 02 Nov 2020 05:55:42 PM EST:.
lena:/tmp
$> apt-cache policy libcurl4
libcurl4:
  Installed: 7.72.0-1
  Candidate: 7.72.0-1
  Version table:
 *** 7.72.0-1 900
        900 http://deb.debian.org/debian bullseye/main amd64 Packages
        600 http://http.debian.net/debian sid/main amd64 Packages
        100 /var/lib/dpkg/status
     7.64.0-4+deb10u1 100
        100 http://deb.debian.org/debian buster/main amd64 Packages
        100 http://security.debian.org/debian-security buster/updates/main amd64 Packages
conda: 7.71.1
$> grep curl conda-1.12.strace
105462 14:36:26.175841 openat(AT_FDCWD, "/home/yoh/anaconda-5.2.0-2.7/envs/hdf5-ros3/bin/../lib/./libcurl.so.4", O_RDONLY|O_CLOEXEC) = 3

$> conda list | grep curl
libcurl                   7.71.1               hcdd3856_8    conda-forge

@dopplershift
Member

@djhoese Not sure if this is relevant or not, but I had problems with the same kind of thing with netcdf-c and curl >7.69; Unidata/netcdf-c#1798 was a one-line fix in that case.

@djhoese
Contributor Author

djhoese commented Nov 17, 2020

Thanks. I'll try to look at this later.

@djhoese
Contributor Author

djhoese commented Nov 17, 2020

Not sure I understand the NOBODY option, but I see this is already in HDF5:

https://github.com/HDFGroup/hdf5/blob/a50d211755cb272b2e468144e7d892a4c90813c4/src/H5FDs3comms.c#L903-L904

    if (CURLE_OK != curl_easy_setopt(curlh, CURLOPT_NOBODY, 1L))
        HGOTO_ERROR(H5E_ARGS, H5E_BADVALUE, FAIL, "error while setting CURL option (CURLOPT_NOBODY).");

It is later set to NULL to unset it so this might not be the issue.

@yarikoptic

yarikoptic commented Nov 17, 2020

FWIW, if I LD_PRELOAD libcurl from debian, conda build of hdf5 tools works:
$> LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libcurl.so.4.6.0 h5dump -pBH --filedriver=ros3 "http://s3.amazonaws.com/hdfgroup/data/hdf5demo/snp500.h5" | head
HDF5 "http://s3.amazonaws.com/hdfgroup/data/hdf5demo/snp500.h5" {
SUPER_BLOCK {
   SUPERBLOCK_VERSION 0
   FREELIST_VERSION 0
   SYMBOLTABLE_VERSION 0
   OBJECTHEADER_VERSION 0
   OFFSET_SIZE 8
   LENGTH_SIZE 8
   BTREE_RANK 16
   BTREE_LEAF 4
..

So it likely means there is nothing to be solved at the HDF5 library level; the problem is rather in curl or some other library it relies upon.

Here is the list of non-conda libraries which are picked up by such PRELOAD:
$> LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libcurl.so.4.6.0 ldd `which h5dump` | grep -v conda
	linux-vdso.so.1 (0x00007ffd0dc8c000)
	/usr/lib/x86_64-linux-gnu/libcurl.so.4.6.0 (0x00007fde9fa99000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fde9f63e000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fde9f479000)
	libidn2.so.0 => /lib/x86_64-linux-gnu/libidn2.so.0 (0x00007fde9f430000)
	librtmp.so.1 => /lib/x86_64-linux-gnu/librtmp.so.1 (0x00007fde9f40f000)
	libpsl.so.5 => /lib/x86_64-linux-gnu/libpsl.so.5 (0x00007fde9f3b8000)
	libldap_r-2.4.so.2 => /lib/x86_64-linux-gnu/libldap_r-2.4.so.2 (0x00007fde9efb5000)
	liblber-2.4.so.2 => /lib/x86_64-linux-gnu/liblber-2.4.so.2 (0x00007fde9efa4000)
	libbrotlidec.so.1 => /lib/x86_64-linux-gnu/libbrotlidec.so.1 (0x00007fde9ef96000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fde9ef71000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fde9ef69000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fde9ee25000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fde9fb6c000)
	libunistring.so.2 => /lib/x86_64-linux-gnu/libunistring.so.2 (0x00007fde9eca3000)
	libgnutls.so.30 => /lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007fde9ead9000)
	libhogweed.so.6 => /lib/x86_64-linux-gnu/libhogweed.so.6 (0x00007fde9ea90000)
	libnettle.so.8 => /lib/x86_64-linux-gnu/libnettle.so.8 (0x00007fde9ea50000)
	libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007fde9e9cb000)
	libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007fde9e8a2000)
	libsasl2.so.2 => /lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007fde9e886000)
	libbrotlicommon.so.1 => /lib/x86_64-linux-gnu/libbrotlicommon.so.1 (0x00007fde9e863000)
	libp11-kit.so.0 => /lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007fde9e72f000)
	libtasn1.so.6 => /lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007fde9e719000)
	libffi.so.7 => /lib/x86_64-linux-gnu/libffi.so.7 (0x00007fde9e70b000)

@dopplershift
Member

@djhoese That actually is suspect then: the fix was to replace CURLOPT_NOBODY with CURLOPT_HTTPGET.

Essentially (based on this and this), what's going on (in the netcdf bug, and I think here too) is that setting NOBODY to true tells curl not to download the body. Unfortunately, setting NOBODY back to false does NOT tell it to go back to downloading the body; you need to set CURLOPT_HTTPGET to true to do that. So a quick fix would be to find where it sets it to NULL and replace that with:

curl_easy_setopt(state->curl, CURLOPT_HTTPGET, 1L)

@djhoese
Contributor Author

djhoese commented Nov 17, 2020

Holy crap I think it worked! I'll make a PR later tonight for you to review hopefully. I'll have to send something to the HDF5 group about this too. For the record:

$ cat recipe/patches/fix_nobody.patch
--- hdf5-1.12.0.orig/src/H5FDs3comms.c  2020-07-18 17:38:20.152820841 -0500
+++ hdf5-1.12.0/src/H5FDs3comms.c       2020-11-17 14:50:10.752751431 -0600
@@ -1082,9 +1082,7 @@
      **********************/

     if ( CURLE_OK !=
-        curl_easy_setopt(curlh,
-                         CURLOPT_NOBODY,
-                         NULL) )
+        curl_easy_setopt(curlh, CURLOPT_HTTPGET, 1L) )
     {
         HGOTO_ERROR(H5E_ARGS, H5E_BADVALUE, FAIL,
                     "error while setting CURL option (CURLOPT_NOBODY). "
