Migration error from 0.7.1 to 0.8.0 #3760

Closed
stridentbean opened this issue Nov 25, 2019 · 3 comments


stridentbean commented Nov 25, 2019

Background

Casa Nodes updated from 0.7.1 to 0.8.0, and about half a dozen didn't make it. We have been receiving reports that lnd doesn't start back up. I reviewed two sets of logs, and both show the error message listed below. This leads me to believe the wallet was corrupted during the migration process.

Unfortunately, I just jumped onto a customer node and found that the SCB (static channel backup) file was missing. Somehow, during the migration, the backup file must have been deleted. I suppose this user could have been migrating from pre-0.6.0 (I didn't ask), but it seems unlikely.

Is there any way to extract an SCB file from the wallet.db file? I have access to wallet.db.

Your environment

  • version of lnd
    0.8.0
  • which operating system (uname -a on *Nix)
    Linux v1-casa-node-develop 4.14.70-v7+ #2 SMP Wed Sep 19 07:49:26 UTC 2018 armv7l GNU/Linux
  • version of btcd, bitcoind, or other backend
    bitcoin ~0.18.0
  • any other relevant environment details
    Running everything in docker containers on a Casa Node.

Steps to reproduce

Lnd is constantly restarting with similar errors listed below.

I think this relates to #3550

2019-11-11T02:01:34Z lnd UNKNOWN[15727] panic: freepages: failed to get all reachable pages (page 1501: multiple references)
2019-11-11T02:01:34Z lnd UNKNOWN[15727] 
2019-11-11T02:01:34Z lnd UNKNOWN[15727] goroutine 43 [running]:
2019-11-11T02:01:34Z lnd UNKNOWN[15727] github.com/coreos/bbolt.(*DB).freepages.func2(0x32444c0)
2019-11-11T02:01:34Z lnd UNKNOWN[15727] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.3/db.go:1003 +0xcc
2019-11-11T02:01:34Z lnd UNKNOWN[15727] created by github.com/coreos/bbolt.(*DB).freepages
2019-11-11T02:01:34Z lnd UNKNOWN[15727] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.3/db.go:1001 +0x108
2019-11-11T02:01:37Z lnd UNKNOWN[15727] panic: freepages: failed to get all reachable pages (page 1501: multiple references)
2019-11-11T02:01:37Z lnd UNKNOWN[15727] 
2019-11-11T02:01:37Z lnd UNKNOWN[15727] goroutine 43 [running]:
2019-11-11T02:01:37Z lnd UNKNOWN[15727] github.com/coreos/bbolt.(*DB).freepages.func2(0x32444c0)
2019-11-11T02:01:37Z lnd UNKNOWN[15727] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.3/db.go:1003 +0xcc
2019-11-11T02:01:37Z lnd UNKNOWN[15727] created by github.com/coreos/bbolt.(*DB).freepages
2019-11-11T02:01:37Z lnd UNKNOWN[15727] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.3/db.go:1001 +0x108

Expected behaviour

LND should start up properly.

Actual behaviour

Lnd is crashing at boot up.

@Roasbeef
Member

How is the node shut down before y'all's update process? Based on that fragment, it looks like the freelist was corrupted. In 0.8, we stopped syncing the free list to disk by default. This was done to take advantage of performance improvements, and also because it seemed that many of the disk corruption instances we saw were related to the free list. The latest version no longer writes the free list to disk, but instead reconstructs it from scratch.

If you boot with sync-freelist does the node start?

If the SCB file wasn't present, then that may indicate partial data loss.
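
For readers unfamiliar with the error, here is a rough sketch of what "reconstruct it from scratch" implies and why a page with "multiple references" makes that impossible. This is a simplified toy model, not bbolt's actual code: the `pageID` type, the `children` map, and the `rebuildFreelist` function are all hypothetical names invented for illustration. The idea is that every page reachable from the root is in use, every other page is free, and a page that is reached twice means the tree itself is damaged.

```go
package main

import "fmt"

// pageID identifies a page in this toy model of a page-based database file.
// (Hypothetical type; bbolt's real structures are more involved.)
type pageID int

// rebuildFreelist walks every page reachable from root and returns the pages
// in [firstData, highWater) that were never visited, i.e. the free pages.
// It returns an error if any page is reached more than once, because a
// well-formed tree references each page exactly once.
func rebuildFreelist(children map[pageID][]pageID, root, firstData, highWater pageID) ([]pageID, error) {
	reachable := make(map[pageID]bool)
	stack := []pageID{root}
	for len(stack) > 0 {
		// Pop the next page to visit.
		id := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if reachable[id] {
			return nil, fmt.Errorf("page %d: multiple references", id)
		}
		reachable[id] = true
		stack = append(stack, children[id]...)
	}

	// Anything below the high-water mark that was not visited is free.
	var free []pageID
	for id := firstData; id < highWater; id++ {
		if !reachable[id] {
			free = append(free, id)
		}
	}
	return free, nil
}

func main() {
	// Healthy tree: page 2 is the root and points at pages 3 and 4.
	healthy := map[pageID][]pageID{2: {3, 4}}
	fmt.Println(rebuildFreelist(healthy, 2, 2, 6))

	// Damaged tree: page 4 is referenced by both page 2 and page 3.
	damaged := map[pageID][]pageID{2: {3, 4}, 3: {4}}
	fmt.Println(rebuildFreelist(damaged, 2, 2, 6))
}
```

In this model the failure comes from the page tree, not from a stored free list, which would be consistent with the panic persisting regardless of whether the free list is synced to disk. Whether that matches what happened on the affected nodes is an assumption.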

@Roasbeef
Member

Closing due to inactivity.

@stridentbean
Author

Finally got back around to this. I was able to upgrade lnd to 0.8.2 and added the sync-freelist flag. Unfortunately, I still get the error below.

panic: freepages: failed to get all reachable pages (page 1501: multiple references)

goroutine 42 [running]:
github.com/coreos/bbolt.(*DB).freepages.func2(0xb432800)
	/go/pkg/mod/github.com/coreos/bbolt@v1.3.3/db.go:1003 +0xdb
created by github.com/coreos/bbolt.(*DB).freepages
	/go/pkg/mod/github.com/coreos/bbolt@v1.3.3/db.go:1001 +0x157
