Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore stucks when it cannot pull file from azure #34

Open
aensidhe opened this issue Jun 1, 2020 · 5 comments
Open

Restore stucks when it cannot pull file from azure #34

aensidhe opened this issue Jun 1, 2020 · 5 comments

Comments

@aensidhe
Copy link

aensidhe commented Jun 1, 2020

I use ~9GB sample to test. It fails on different files, but symptoms are the same: no retries and being stuck forever. How can I help with this issue?

root@asgard:~# /home/aensidhe/go/bin/zfsbackup-go receive \
     --encryptTo test-cloud@aensidhe.ru \
     --signFrom test-cloud@aensidhe.ru \
     --publicKeyRingPath /home/aensidhe/public.pgp \
     --secretKeyRingPath /home/aensidhe/private.pgp \
     --auto \
     -d mimir/test \
     azure://test \
     mimir \
     --logLevel debug \
     --maxBackoffTime 5m \
     --maxRetryTime 10m
2020/06/01 06:13:13 Setting number of cores to: 2
2020/06/01 06:13:13 Loaded private key ring /home/aensidhe/private.pgp
2020/06/01 06:13:13 Loaded public key ring /home/aensidhe/public.pgp
2020/06/01 06:13:13 Setting working directory to /root/.zfsbackup
2020/06/01 06:13:13 PGP Debug Info:
Loaded Private Keys:
        EEA5B2C07D008AFD
        map[Anatoly Popov <test-cloud@aensidhe.ru>:0xc000304680]

Loaded Public Keys:
        EEA5B2C07D008AFD
        map[Anatoly Popov <test-cloud@aensidhe.ru>:0xc000304780]
2020/06/01 06:13:13 Limiting the number of active files to 5
2020/06/01 06:13:13 Initializing Backend azure://test
2020/06/01 06:13:14 Trying to determine latest snapshot for volume mimir/test.
2020/06/01 06:13:14 Restoring to snapshot test2.
2020/06/01 06:13:14 Calculating how to restore to test2.
2020/06/01 06:13:14 Getting ZFS Snapshots with command "zfs list -H -d 1 -p -t snapshot,bookmark -r -o name,creation,type -S creation mimir/test"
2020/06/01 06:13:14 Adding backup job for test2 to the restore list.
2020/06/01 06:13:14 Need to restore 1 snapshots.
2020/06/01 06:13:14 Restoring snapshot test2 (1/1)
2020/06/01 06:13:14 Initializing Backend azure://test
2020/06/01 06:13:14 Enabling the full path (-d) flag on the receive.
2020/06/01 06:13:14 Starting zfs receive command: zfs receive -d mimir
2020/06/01 06:13:14 Downloading volume mimir/test|test2.zstream.gz.pgp.vol1.
2020/06/01 06:13:14 Downloading volume mimir/test|test2.zstream.gz.pgp.vol2.
2020/06/01 06:13:14 Downloading volume mimir/test|test2.zstream.gz.pgp.vol3.
2020/06/01 06:13:14 Downloading volume mimir/test|test2.zstream.gz.pgp.vol4.
2020/06/01 06:13:14 Downloading volume mimir/test|test2.zstream.gz.pgp.vol5.
2020/06/01 06:13:52 Downloaded mimir/test|test2.zstream.gz.pgp.vol5.
2020/06/01 06:13:56 Downloaded mimir/test|test2.zstream.gz.pgp.vol1.
2020/06/01 06:13:56 Processing mimir/test|test2.zstream.gz.pgp.vol1.
2020/06/01 06:14:00 Downloaded mimir/test|test2.zstream.gz.pgp.vol2.
2020/06/01 06:14:02 Processed mimir/test|test2.zstream.gz.pgp.vol1.
2020/06/01 06:14:02 Processing mimir/test|test2.zstream.gz.pgp.vol2.
2020/06/01 06:14:02 Downloading volume mimir/test|test2.zstream.gz.pgp.vol6.
2020/06/01 06:14:02 Downloaded mimir/test|test2.zstream.gz.pgp.vol4.
2020/06/01 06:14:02 Downloaded mimir/test|test2.zstream.gz.pgp.vol3.
2020/06/01 06:14:07 Processed mimir/test|test2.zstream.gz.pgp.vol2.
2020/06/01 06:14:07 Processing mimir/test|test2.zstream.gz.pgp.vol3.
2020/06/01 06:14:07 Downloading volume mimir/test|test2.zstream.gz.pgp.vol7.
2020/06/01 06:14:12 Processed mimir/test|test2.zstream.gz.pgp.vol3.
2020/06/01 06:14:12 Processing mimir/test|test2.zstream.gz.pgp.vol4.
2020/06/01 06:14:12 Downloading volume mimir/test|test2.zstream.gz.pgp.vol8.
2020/06/01 06:14:17 Processed mimir/test|test2.zstream.gz.pgp.vol4.
2020/06/01 06:14:17 Processing mimir/test|test2.zstream.gz.pgp.vol5.
2020/06/01 06:14:17 Downloading volume mimir/test|test2.zstream.gz.pgp.vol9.
2020/06/01 06:14:23 Processed mimir/test|test2.zstream.gz.pgp.vol5.
2020/06/01 06:14:23 Downloading volume mimir/test|test2.zstream.gz.pgp.vol10.
2020/06/01 06:14:41 Downloaded mimir/test|test2.zstream.gz.pgp.vol7.
2020/06/01 06:14:44 Downloaded mimir/test|test2.zstream.gz.pgp.vol6.
2020/06/01 06:14:44 Processing mimir/test|test2.zstream.gz.pgp.vol6.
2020/06/01 06:14:49 Processed mimir/test|test2.zstream.gz.pgp.vol6.
2020/06/01 06:14:49 Processing mimir/test|test2.zstream.gz.pgp.vol7.
2020/06/01 06:14:49 Downloading volume mimir/test|test2.zstream.gz.pgp.vol11.
2020/06/01 06:14:58 Downloaded mimir/test|test2.zstream.gz.pgp.vol10.
2020/06/01 06:15:00 Downloaded mimir/test|test2.zstream.gz.pgp.vol9.
2020/06/01 06:15:00 Downloaded mimir/test|test2.zstream.gz.pgp.vol8.
2020/06/01 06:15:24 Processed mimir/test|test2.zstream.gz.pgp.vol7.
2020/06/01 06:15:24 Processing mimir/test|test2.zstream.gz.pgp.vol8.
2020/06/01 06:15:24 Downloading volume mimir/test|test2.zstream.gz.pgp.vol12.
2020/06/01 06:15:27 Downloaded mimir/test|test2.zstream.gz.pgp.vol11.
2020/06/01 06:15:29 Processed mimir/test|test2.zstream.gz.pgp.vol8.
2020/06/01 06:15:29 Processing mimir/test|test2.zstream.gz.pgp.vol9.
2020/06/01 06:15:29 Downloading volume mimir/test|test2.zstream.gz.pgp.vol13.
2020/06/01 06:15:34 Processed mimir/test|test2.zstream.gz.pgp.vol9.
2020/06/01 06:15:34 Processing mimir/test|test2.zstream.gz.pgp.vol10.
2020/06/01 06:15:34 Downloading volume mimir/test|test2.zstream.gz.pgp.vol14.
2020/06/01 06:15:39 Processed mimir/test|test2.zstream.gz.pgp.vol10.
2020/06/01 06:15:39 Processing mimir/test|test2.zstream.gz.pgp.vol11.
2020/06/01 06:15:39 Downloading volume mimir/test|test2.zstream.gz.pgp.vol15.
2020/06/01 06:15:45 Processed mimir/test|test2.zstream.gz.pgp.vol11.
2020/06/01 06:15:45 Downloading volume mimir/test|test2.zstream.gz.pgp.vol16.
2020/06/01 06:16:12 Downloaded mimir/test|test2.zstream.gz.pgp.vol14.
2020/06/01 06:16:13 Downloaded mimir/test|test2.zstream.gz.pgp.vol13.
2020/06/01 06:16:16 Downloaded mimir/test|test2.zstream.gz.pgp.vol12.
2020/06/01 06:16:16 Processing mimir/test|test2.zstream.gz.pgp.vol12.
2020/06/01 06:16:17 Downloaded mimir/test|test2.zstream.gz.pgp.vol15.
2020/06/01 06:16:19 Downloaded mimir/test|test2.zstream.gz.pgp.vol16.
2020/06/01 06:16:21 Processed mimir/test|test2.zstream.gz.pgp.vol12.
2020/06/01 06:16:21 Processing mimir/test|test2.zstream.gz.pgp.vol13.
2020/06/01 06:16:21 Downloading volume mimir/test|test2.zstream.gz.pgp.vol17.
2020/06/01 06:16:26 Processed mimir/test|test2.zstream.gz.pgp.vol13.
2020/06/01 06:16:26 Processing mimir/test|test2.zstream.gz.pgp.vol14.
2020/06/01 06:16:26 Downloading volume mimir/test|test2.zstream.gz.pgp.vol18.
2020/06/01 06:16:48 Downloaded mimir/test|test2.zstream.gz.pgp.vol17.
2020/06/01 06:16:51 Processed mimir/test|test2.zstream.gz.pgp.vol14.
2020/06/01 06:16:51 Processing mimir/test|test2.zstream.gz.pgp.vol15.
2020/06/01 06:16:51 Downloading volume mimir/test|test2.zstream.gz.pgp.vol19.
2020/06/01 06:16:56 Processed mimir/test|test2.zstream.gz.pgp.vol15.
2020/06/01 06:16:56 Processing mimir/test|test2.zstream.gz.pgp.vol16.
2020/06/01 06:16:56 Downloading volume mimir/test|test2.zstream.gz.pgp.vol20.
2020/06/01 06:17:02 Processed mimir/test|test2.zstream.gz.pgp.vol16.
2020/06/01 06:17:02 Processing mimir/test|test2.zstream.gz.pgp.vol17.
2020/06/01 06:17:02 Downloading volume mimir/test|test2.zstream.gz.pgp.vol21.
2020/06/01 06:17:09 Processed mimir/test|test2.zstream.gz.pgp.vol17.
2020/06/01 06:17:09 Downloading volume mimir/test|test2.zstream.gz.pgp.vol22.
2020/06/01 06:17:15 Downloaded mimir/test|test2.zstream.gz.pgp.vol18.
2020/06/01 06:17:15 Processing mimir/test|test2.zstream.gz.pgp.vol18.
2020/06/01 06:17:20 Processed mimir/test|test2.zstream.gz.pgp.vol18.
2020/06/01 06:17:20 Downloading volume mimir/test|test2.zstream.gz.pgp.vol23.
2020/06/01 06:17:20 Downloaded mimir/test|test2.zstream.gz.pgp.vol19.
2020/06/01 06:17:20 Processing mimir/test|test2.zstream.gz.pgp.vol19.
2020/06/01 06:17:25 Processed mimir/test|test2.zstream.gz.pgp.vol19.
2020/06/01 06:17:25 Downloading volume mimir/test|test2.zstream.gz.pgp.vol24.
2020/06/01 06:17:53 Downloaded mimir/test|test2.zstream.gz.pgp.vol20.
2020/06/01 06:17:53 Processing mimir/test|test2.zstream.gz.pgp.vol20.
2020/06/01 06:17:54 Downloaded mimir/test|test2.zstream.gz.pgp.vol21.
2020/06/01 06:17:56 Downloaded mimir/test|test2.zstream.gz.pgp.vol23.
2020/06/01 06:18:06 Downloaded mimir/test|test2.zstream.gz.pgp.vol22.
2020/06/01 06:18:14 Downloaded mimir/test|test2.zstream.gz.pgp.vol24.
2020/06/01 06:18:18 Processed mimir/test|test2.zstream.gz.pgp.vol20.
2020/06/01 06:18:18 Processing mimir/test|test2.zstream.gz.pgp.vol21.
2020/06/01 06:18:18 Downloading volume mimir/test|test2.zstream.gz.pgp.vol25.
2020/06/01 06:18:23 Processed mimir/test|test2.zstream.gz.pgp.vol21.
2020/06/01 06:18:23 Processing mimir/test|test2.zstream.gz.pgp.vol22.
2020/06/01 06:18:23 Downloading volume mimir/test|test2.zstream.gz.pgp.vol26.
2020/06/01 06:18:28 Processed mimir/test|test2.zstream.gz.pgp.vol22.
2020/06/01 06:18:28 Processing mimir/test|test2.zstream.gz.pgp.vol23.
2020/06/01 06:18:28 Downloading volume mimir/test|test2.zstream.gz.pgp.vol27.
2020/06/01 06:18:34 Processed mimir/test|test2.zstream.gz.pgp.vol23.
2020/06/01 06:18:34 Processing mimir/test|test2.zstream.gz.pgp.vol24.
2020/06/01 06:18:34 Downloading volume mimir/test|test2.zstream.gz.pgp.vol28.
2020/06/01 06:18:39 Processed mimir/test|test2.zstream.gz.pgp.vol24.
2020/06/01 06:18:39 Downloading volume mimir/test|test2.zstream.gz.pgp.vol29.
2020/06/01 06:18:49 Downloaded mimir/test|test2.zstream.gz.pgp.vol25.
2020/06/01 06:18:49 Processing mimir/test|test2.zstream.gz.pgp.vol25.
2020/06/01 06:19:02 Downloaded mimir/test|test2.zstream.gz.pgp.vol27.
2020/06/01 06:19:15 Downloaded mimir/test|test2.zstream.gz.pgp.vol28.
2020/06/01 06:19:22 Downloaded mimir/test|test2.zstream.gz.pgp.vol29.
2020/06/01 06:19:23 Could not download file mimir/test|test2.zstream.gz.pgp.vol26 to the local cache dir due to error - context deadline exceeded.
2020/06/01 06:19:28 Processed mimir/test|test2.zstream.gz.pgp.vol25.
2020/06/01 06:19:28 Downloading volume mimir/test|test2.zstream.gz.pgp.vol30.
2020/06/01 06:20:04 Downloaded mimir/test|test2.zstream.gz.pgp.vol30.
@someone1
Copy link
Owner

someone1 commented Jun 1, 2020

Thanks - I've seen some issue like this before but it was hard to track. I think I need to get more serious about testing the retry logic as I believe I may have subverted it when introducing contexts in the back end code.

Please allow me some time to look into this and get back to you.

Just to confirm, was this with the latest 0.4-rc1 release?

@aensidhe
Copy link
Author

aensidhe commented Jun 24, 2020

Sorry, missed your reply. I used master branch on date of opening issue.

@aensidhe
Copy link
Author

So, now it happens with custom s3 backend. I'm using 0.4-rc1 release.

root@asgard:~# /home/aensidhe/zfsbackup-go receive     --encryptTo test-cloud@aensidhe.ru     --signFrom test-cloud@aensidhe.ru     --publicKeyRingPath /home/aensidhe/public.pgp     --secretKeyRingPath /home/aensidhe/private.pgp     --auto --logLevel info     mimir/private     s3://bucket-20200710-0136     mimir/restore-test
2020/07/15 20:30:32 Setting number of cores to: 2
2020/07/15 20:30:32 Loaded private key ring /home/aensidhe/private.pgp
2020/07/15 20:30:32 Loaded public key ring /home/aensidhe/public.pgp
2020/07/15 20:30:32 Setting working directory to /root/.zfsbackup
2020/07/15 20:30:32 Limiting the number of active files to 5
2020/07/15 20:30:33 Trying to determine latest snapshot for volume mimir/private.
2020/07/15 20:30:33 Restoring to snapshot autosnap_2020-07-09_22:45:01_frequently.
2020/07/15 20:30:33 Calculating how to restore to autosnap_2020-07-09_22:45:01_frequently.
2020/07/15 20:30:33 Adding backup job for autosnap_2020-07-09_22:45:01_frequently to the restore list.
2020/07/15 20:30:33 Need to restore 1 snapshots.
2020/07/15 20:30:33 Restoring snapshot autosnap_2020-07-09_22:45:01_frequently (1/1)
2020/07/15 20:35:26 Starting zfs receive command: zfs receive mimir/restore-test
2020/07/16 06:15:18 Could not download file mimir/private|autosnap_2020-07-09_22:45:01_frequently.zstream.gz.pgp.vol2895 to the local cache dir due to error - unexpected EOF.

What should I do? Restart the process?

@someone1
Copy link
Owner

someone1 commented Aug 3, 2020

Hmm - that seems like a zfs recv error - can you please use the debug log level? Does your local have enough disk space to support the restore?

The debug logs may tell us more.

@aensidhe
Copy link
Author

aensidhe commented Aug 7, 2020

Ok, I will try this weekend. After several attempts, it was successful, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants