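The client is not able to unpack some archives on the mainnet (previous ones are restored without problem) and returns an error when running the command ./mithril-client snapshot download with the following digests: a7a82ca3734fcfc8f233af20e6a1b78eddc09909ced8f70d293a4a26f832b812 and 74a70bc7c6346e2dad6d1ee16d402e843dfadbf2f45edf558cd1006a33ada7a1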
5/7 - Unpacking the snapshot…
Error: "An error occured: Could not unpack './snapshot-a7a82ca3734fcfc8f233af20e6a1b78eddc09909ced8f70d293a4a26f832b812.tar.gz' in directory './db'. Error: « failed to iterate over archive »."
Expanded error:
Error: "An error occured: Could not unpack './snapshot-74a70bc7c6346e2dad6d1ee16d402e843dfadbf2f45edf558cd1006a33ada7a1.tar.gz' in directory './db'. Error: « Custom { kind: Other, error: TarError { desc: \"failed to iterate over archive\", io: Custom { kind: Other, error: \"numeric field did not have utf-8 text: ,{��\\n)�3 when getting cksum for \\t�h@�\\u{15}��1�R\\u{6}L��XP\\u{13}/�Ĉ\\u{11}oF�\\nn\\u{f}�����k\\u{16}�`���^��z8��z�\\t�JȚ�%V�-m\\u{3}_�E)�p��\\u{14}�-\\u{4}�\\u{7f}>�<[��2�\\\\���\\u{6}V0�]\\u{b}\" } } } »."
To do
Identify why the produced archive is corrupted
Add a verification step in the aggregator after creation of the archive, and before upload
Assess the time needed to run this verification on a mainnet archive and its impact on snapshot production (~20 to 25 min of extra computation vs the ~2h30min already needed to compute the archive)
Analysis
The created archive is corrupted, and it appears that the checksum error we observe is caused by some files changing while the archive is being created
A verification step can consist of computing the entries list of the archive: this will fail if the archive is corrupted
This will prevent users from discovering that the archive is invalid only after downloading it
In our tests on the target aggregator VM, verifying an archive takes up to 20 to 25 min, on top of the ~2h30min needed to produce it (roughly 13 to 17% extra)
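The verification idea above (list every entry of the archive and treat any failure as corruption) can be sketched as follows. This is only an illustration in Python using the standard `tarfile` module, not the aggregator's actual code; the function name `verify_archive` is hypothetical.

```python
import tarfile
import zlib


def verify_archive(path):
    """Walk every entry of a .tar.gz archive and return the entry count.

    Raises ValueError if the archive cannot be read to the end.
    Hypothetical helper, for illustration only.
    """
    count = 0
    try:
        with tarfile.open(path, mode="r:gz") as archive:
            # Iterating forces each header to be parsed and the gzip
            # stream to be decompressed up to the next header.
            for _member in archive:
                count += 1
    except (tarfile.TarError, EOFError, OSError, zlib.error) as error:
        raise ValueError(f"archive {path} is corrupted: {error}") from error
    return count
```

A caller would run this after the archive is created and before it is uploaded, and retry the snapshot if it raises. Python's `tarfile` can be more lenient than other tar readers about damaged headers in the middle of an archive, so comparing the returned count with the number of files that were archived is a sensible extra check.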
In order to avoid creating unwanted delays for snapshotting new immutable files, we have reduced the number of retries to produce an archive (from 3 to 2)
In the long run, we can limit the creation of corrupted archives by slightly modifying our snapshotting algorithm: the volatile, ledger state and latest immutable files, which can change during archiving, should first be copied to a temp folder so that the snapshot is built from a stable copy, and the copy deleted once the snapshot is completed
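A minimal sketch of that copy-then-archive approach, assuming the entries that may still change can be named as top-level entries of the database directory. The function `snapshot_with_staging` and its `mutable_names` parameter are hypothetical, and the real algorithm would need finer granularity (for example, only the latest immutable files rather than a whole directory).

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def snapshot_with_staging(db_dir, archive_path, mutable_names):
    """Archive db_dir, taking entries that may change from a frozen copy.

    mutable_names: top-level entry names (an assumption of this sketch)
    that the node may still be writing to while the archive is built.
    """
    db_dir = Path(db_dir)
    mutable = set(mutable_names)
    with tempfile.TemporaryDirectory() as staging_dir:
        staging = Path(staging_dir)
        # Freeze the changing entries by copying them to the temp folder.
        for name in sorted(mutable):
            source = db_dir / name
            if source.is_dir():
                shutil.copytree(source, staging / name)
            elif source.is_file():
                shutil.copy2(source, staging / name)
        with tarfile.open(archive_path, mode="w:gz") as archive:
            # Stable entries are archived directly from the database.
            for entry in sorted(db_dir.iterdir()):
                if entry.name not in mutable:
                    archive.add(entry, arcname=entry.name)
            # Changing entries are archived from their frozen copy.
            for entry in sorted(staging.iterdir()):
                archive.add(entry, arcname=entry.name)
    # The temp folder is deleted here, once the snapshot is completed.
```

The copy itself is not atomic, so this narrows the window in which files can change rather than closing it completely; it is best combined with a verification step on the resulting archive.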