Context
On Dec 2, 2021, multiple fetch-and-ingest runs for GISAID failed. The failure pattern was that the download would proceed for a while and then the transfer would be closed before it completed. Subsequent fetch attempts would hit a 503 error. We manually triggered fetch-and-ingest two more times and saw the same failure pattern.
Possible solution
The scheduled run today had no issues, so this may have just been unfortunate timing of our runs being interrupted by GISAID's reboots. We can revisit the following solutions in anticipation of similar future issues:
Manual downloads from the same API endpoint completed successfully when done without streaming decompression. We can update fetch-from-gisaid to skip decompression during streaming, which would shorten the time the connection stays open. However, decompressing in a separate step would increase the total time to run fetch-and-ingest.
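A minimal sketch of that approach, assuming a Python fetch step; the endpoint URL and file names below are placeholders, not the actual fetch-from-gisaid code:

```python
import bz2
import shutil

import requests

# Placeholder endpoint; the real GISAID URL and credentials are not shown here.
DUMP_URL = "https://example.com/gisaid/provision.json.bz2"


def fetch_compressed(url: str, path: str) -> None:
    """Stream the raw compressed bytes straight to disk. No decompression happens
    while the connection is open, so the transfer finishes as fast as the network
    allows and the connection closes sooner."""
    with requests.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()
        with open(path, "wb") as out:
            for chunk in response.iter_content(chunk_size=1 << 20):
                out.write(chunk)


def decompress(src: str, dest: str) -> None:
    """Separate decompression step, run only after the download has completed."""
    with bz2.open(src, "rb") as compressed, open(dest, "wb") as out:
        shutil.copyfileobj(compressed, out)


fetch_compressed(DUMP_URL, "provision.json.bz2")
decompress("provision.json.bz2", "provision.json")
```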
Switch to an endpoint with xz, which has a better compression ratio and faster decompression than bzip2. Regardless of the errors, this would be a huge improvement for us and would dramatically decrease fetch-and-ingest runtime.
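For comparison, a sketch of what the streaming fetch could look like against a hypothetical xz endpoint: only the decompressor changes relative to the bzip2 path (Python's lzma module instead of bz2), while xz decompresses considerably faster. The URL is again a placeholder.

```python
import lzma

import requests

# Placeholder for a hypothetical xz-compressed endpoint.
XZ_DUMP_URL = "https://example.com/gisaid/provision.json.xz"


def fetch_and_decompress_xz(url: str, dest: str) -> None:
    """Download and decompress in one pass, like the current bzip2 path,
    but with the faster xz/LZMA decompressor (assumes a single xz stream)."""
    decompressor = lzma.LZMADecompressor()
    with requests.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()
        with open(dest, "wb") as out:
            for chunk in response.iter_content(chunk_size=1 << 20):
                out.write(decompressor.decompress(chunk))


fetch_and_decompress_xz(XZ_DUMP_URL, "provision.json")
```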
I don't know if it's any faster, but why not?
The results are correct in my local testing.
Locally, it does use multiple threads, but not too many. We might be bound by download speed rather than decompression, though.
Related: #242