Failed to restore recent snapshot with strange error #1160
Comments
Unarchiving the tar.gz archive and counting the immutables every second, I see this strange pattern:
e.g. files "disappear" |
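For anyone wanting to reproduce the observation above, here is a minimal sketch of how the immutable files could be counted while the archive is being unpacked. It assumes the unpacked database has an `immutable` sub-directory holding `.chunk`, `.primary` and `.secondary` files; the function names `count_immutable_files` and `find_drops` are hypothetical and not part of any Mithril tooling.

```python
import os


def count_immutable_files(db_dir):
    """Count the files under <db_dir>/immutable (assumed layout)."""
    immutable_dir = os.path.join(db_dir, "immutable")
    if not os.path.isdir(immutable_dir):
        return 0
    suffixes = (".chunk", ".primary", ".secondary")
    return sum(1 for name in os.listdir(immutable_dir) if name.endswith(suffixes))


def find_drops(counts):
    """Return the sample indices where the count went down.

    During a plain extraction the count should only grow, so any index
    returned here marks a moment where files "disappeared".
    """
    return [i for i in range(1, len(counts)) if counts[i] < counts[i - 1]]
```

Calling `count_immutable_files("./db")` in a loop with `time.sleep(1)` and passing the collected samples to `find_drops` would flag the seconds at which the count decreased.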
Thanks for the report! Sadly, this is not the first issue about snapshot unpacking; see issue #1140. This is probably related to the fact that we pack the snapshot without stopping the node. |
That's unfortunate :( We should probably stop the node before snapshotting, or perhaps use some journaled FS? |
Stopping the node would be ideal, but it would totally change the relationship between it and the aggregator. Instead of being an addition running alongside, the aggregator would control the node (what about the environment variables or parameters that may be needed for that?). We did not think about a journaled FS, so I have no idea how this would work or what the related costs would be, but it is an interesting idea. We mainly thought about mitigation: what we have in mind is copying the files that change (last immutable trio, ledger and volatiles) before making the snapshot (see #1140), but IMO this may just move the problem, since the copy would still happen while the node is running. |
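The mitigation described in the comment above could look roughly like the sketch below. It is written in Python for brevity and is not the aggregator's actual code. The directory names `immutable`, `ledger` and `volatile`, and the function name `stage_mutable_files`, are assumptions made for illustration. As the comment already points out, the copy itself still runs while the node is writing, so this narrows the window rather than closing it.

```python
import os
import shutil


def stage_mutable_files(db_dir, staging_dir):
    """Copy the files most likely to change into a staging directory.

    Copies the last immutable trio (highest-numbered chunk/primary/secondary
    files) plus the whole ledger and volatile directories. Returns the sorted
    list of relative paths that were copied.
    """
    copied = []
    immutable_dir = os.path.join(db_dir, "immutable")
    if os.path.isdir(immutable_dir):
        # Immutable files are assumed to be named <number>.<extension>.
        numbered = [
            name for name in os.listdir(immutable_dir)
            if os.path.splitext(name)[0].isdigit()
        ]
        if numbered:
            last = max(int(os.path.splitext(name)[0]) for name in numbered)
            target = os.path.join(staging_dir, "immutable")
            os.makedirs(target, exist_ok=True)
            for name in numbered:
                if int(os.path.splitext(name)[0]) == last:
                    shutil.copy2(os.path.join(immutable_dir, name), target)
                    copied.append("immutable/" + name)
    for sub in ("ledger", "volatile"):
        source = os.path.join(db_dir, sub)
        if os.path.isdir(source):
            shutil.copytree(source, os.path.join(staging_dir, sub), dirs_exist_ok=True)
            copied.append(sub)
    return sorted(copied)
```

The archive would then be built from the older immutable files in place plus the staged copies, instead of reading the live files directly.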
Yes, the aggregator is special so there's no reason to restrain ourselves in what it can do. I do think it makes total sense for it to control the node, even to fork one as part of its startup process. BTW, this could be a testbed for offering a package providing mithril+cardano-node ;) Renaming (eg. |
This is weird, because I had no problem doing the same operation on my computer with mithril-client snapshot download fdd609c5affa627c9b19dfd32c5a370a9e6ba0f930ec50281b5f470fe3c955de:
./mithril-client snapshot download fdd609c5affa627c9b19dfd32c5a370a9e6ba0f930ec50281b5f470fe3c955de
1/7 - Checking local disk info…
2/7 - Fetching the certificate's information…
3/7 - Verifying the certificate chain…
4/7 - Downloading the snapshot…
5/7 - Unpacking the snapshot…
6/7 - Computing the snapshot digest…
7/7 - Verifying the snapshot signature…
Snapshot 'fdd609c5affa627c9b19dfd32c5a370a9e6ba0f930ec50281b5f470fe3c955de' has been unpacked and successfully checked against Mithril multi-signature contained in the certificate.
Files in the directory './db' can be used to run a Cardano node.
If you are using Cardano Docker image, you can restore a Cardano Node with:
docker run -v cardano-node-ipc:/ipc -v cardano-node-data:/data --mount type=bind,source="./db",target=/data/db/ -e NETWORK=mainnet inputoutput/cardano-node:8.1.2
Maybe the aggregator is not responsible for that behavior. However, we have strengthened the archive verification: we make sure that the archive can be fully unpacked before publishing it (as in #1179). Can you provide more information about the environment where you encountered this problem? |
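To illustrate the kind of check mentioned above (making sure an archive can be fully unpacked), here is a minimal Python sketch. It is not the verification added in #1179, whose details are not shown in this thread; the function name `archive_is_fully_readable` is hypothetical. The idea is simply to walk every member of the tar.gz and read its content to the end, so that a truncated or corrupted archive is detected before it is used.

```python
import tarfile
import zlib


def archive_is_fully_readable(archive_path):
    """Return True if every member of a .tar.gz archive can be read to the end."""
    try:
        with tarfile.open(archive_path, "r:gz") as archive:
            for member in archive:
                if not member.isfile():
                    continue
                content = archive.extractfile(member)
                # Read in 1 MiB blocks so large chunk files are not held in memory.
                while content.read(1024 * 1024):
                    pass
        return True
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        # Truncated or corrupted data surfaces as one of these errors.
        return False
```

The same function can be run on the client side against a downloaded archive, which would help tell a corrupted download apart from a problem in the snapshot itself.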
I could try again. I had some other issues last week before the workshop, but I suspect this was more a file corruption caused by an interrupted network download than an issue in the snapshot. |
It looks like the warning stating that the disk space available is not displayed when the |
OK, let's close this then. It's certainly a spurious problem; let's keep our eyes open in case it reproduces. |
Context & versions
Trying to restore a recent snapshot:
Got the following error at unpacking stage:
Steps to reproduce