panic: page 11014 already freed #4217

Closed
jsullivan3 opened this issue Jun 11, 2018 · 1 comment
Labels
needs-investigation: The issue described is detailed and complex.

Comments

@jsullivan3

Overview of the Issue

Consul cluster with three servers and eleven clients. All nodes are running 1.1.0 as downloaded from the Consul downloads page.
Raft appears to have been reaping a snapshot when boltdb encountered a panic, seemingly the same issue as described in boltdb/bolt#731 and therefore etcd-io/bbolt#72.
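
For context, the "page N already freed" panic is boltdb's freelist refusing to release the same page twice, which means the database's page bookkeeping was already inconsistent by the time the compaction ran. A minimal, self-contained sketch paraphrasing the check in boltdb's freelist.go (types simplified, not the exact vendored source), with a toy main that reproduces the same message:

```go
package main

import "fmt"

type pgid uint64
type txid uint64

type page struct {
	id       pgid
	overflow uint32
}

// freelist tracks pages pending release per transaction, plus a cache of
// every free/pending page id for fast double-free detection.
type freelist struct {
	pending map[txid][]pgid
	cache   map[pgid]bool
}

// free releases a page (and its overflow pages) under the given transaction.
// Paraphrased from boltdb's freelist.go; the trace below panics on the cache check.
func (f *freelist) free(tid txid, p *page) {
	if p.id <= 1 {
		panic(fmt.Sprintf("cannot free page 0 or 1: %d", p.id))
	}
	ids := f.pending[tid]
	for id := p.id; id <= p.id+pgid(p.overflow); id++ {
		// A page must never be freed twice; bolt panics here rather than
		// corrupt the file any further.
		if f.cache[id] {
			panic(fmt.Sprintf("page %d already freed", id))
		}
		ids = append(ids, id)
		f.cache[id] = true
	}
	f.pending[tid] = ids
}

func main() {
	f := &freelist{pending: map[txid][]pgid{}, cache: map[pgid]bool{}}
	p := &page{id: 11014}
	f.free(1, p)
	f.free(2, p) // second free of the same page: panic: page 11014 already freed
}
```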

Reproduction Steps

Unsure. The cluster had recently reaped a client that had been removed; otherwise it was idle from a user-interaction perspective.
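
Not reproducible on demand so far. If it helps the investigation, one way to exercise the snapshot/compaction path much more often is to lower the server's snapshot threshold; a hedged example, assuming the documented raft_snapshot_threshold agent config option (value chosen arbitrarily for stress testing, not for production use):

```json
{
  "raft_snapshot_threshold": 128
}
```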

Consul info for both Client and Server

Client info: No client crashed during this incident.
Server info: The following information was collected after the consul data directory was emptied and the server was restarted.
agent:
        check_monitors = 1
        check_ttls = 0
        checks = 1
        services = 1
build:  
        prerelease =
        revision = 5174058f
        version = 1.1.0
consul: 
        bootstrap = false
        known_datacenters = 1
        leader = false
        leader_addr = 192.168.1.28:8300
        server = true
raft:
        applied_index = 8399147
        commit_index = 8399148
        fsm_pending = 128
        last_contact = 502.552µs
        last_log_index = 8399148
        last_log_term = 18032
        last_snapshot_index = 8393388
        last_snapshot_term = 18032
        latest_configuration = [{Suffrage:Voter ID:11ff40ab-ef3d-2b11-9ec2-927fae1aef67 Address:192.168.1.28:8300} {Suffrage:Voter ID:8f19f316-e477-5885-c549-bef153a15150 Address:192.168.1.2:8300} {Suffrage:Voter ID:e08704d9-6631-89bc-7883-f9d3ef79f997 Address:192.168.1.16:8300}]
        latest_configuration_index = 8356317
        num_peers = 0
        protocol_version = 3
        protocol_version_max = 3  
        protocol_version_min = 0  
        snapshot_version_max = 1  
        snapshot_version_min = 0  
        state = Follower
        term = 18032
runtime:
        arch = arm
        cpu_count = 4
        goroutines = 87
        max_procs = 4
        os = linux
        version = go1.10.1
serf_lan:
        coordinate_resets = 0
        encrypted = false
        event_queue = 0
        event_time = 48
        failed = 0
        health_score = 0
        intent_queue = 0
        left = 0
        member_time = 2031
        members = 14
        query_queue = 0
        query_time = 1
serf_wan:
        coordinate_resets = 0
        encrypted = false
        event_queue = 0
        event_time = 1
        failed = 0
        health_score = 0
        intent_queue = 0
        left = 0
        member_time = 490
        members = 3
        query_queue = 0
        query_time = 1

Operating system and Environment details

Fedora 28, Raspberry Pi 2 (armv7hl), 1GB RAM

Log Fragments

Jun 11 13:43:14 naspberry consul[704]:     2018/06/11 13:43:14 [INFO] consul.fsm: snapshot created in 4.063643ms
Jun 11 13:43:14 naspberry consul[704]:     2018/06/11 13:43:14 [INFO] raft: Starting snapshot up to 8398461
Jun 11 13:43:14 naspberry consul[704]:     2018/06/11 13:43:14 [INFO] snapshot: Creating new snapshot at /var/lib/consul/data/raft/snapshots/18032-8398461-1528738994213.tmp
Jun 11 13:43:14 naspberry consul[704]:     2018/06/11 13:43:14 [INFO] snapshot: reaping snapshot /var/lib/consul/data/raft/snapshots/18029-8365684-1528587809443
Jun 11 13:43:14 naspberry consul[704]:     2018/06/11 13:43:14 [INFO] raft: Compacting logs from 7306929202844180645 to 8388221
Jun 11 13:43:14 naspberry consul[704]:     2018/06/11 13:43:14 [INFO] raft: Snapshot to 8398461 complete
Jun 11 15:25:50 naspberry consul[704]:     2018/06/11 15:25:50 [INFO] serf: EventMemberReap: osmc-basement
Jun 11 16:16:00 naspberry consul[704]:     2018/06/11 16:16:00 [INFO] consul.fsm: snapshot created in 237.708µs
Jun 11 16:16:00 naspberry consul[704]:     2018/06/11 16:16:00 [INFO] raft: Starting snapshot up to 8400438
Jun 11 16:16:00 naspberry consul[704]:     2018/06/11 16:16:00 [INFO] snapshot: Creating new snapshot at /var/lib/consul/data/raft/snapshots/18032-8400438-1528748160206.tmp
Jun 11 16:16:00 naspberry consul[704]:     2018/06/11 16:16:00 [INFO] snapshot: reaping snapshot /var/lib/consul/data/raft/snapshots/18032-8382072-1528662548315
Jun 11 16:16:00 naspberry consul[704]:     2018/06/11 16:16:00 [INFO] raft: Compacting logs from 8378128 to 8390198
Jun 11 16:16:03 naspberry consul[704]: panic: page 11014 already freed
Jun 11 16:16:03 naspberry consul[704]: goroutine 12 [running]:
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*freelist).free(0x136672e0, 0x16717, 0x0, 0xb58a5000)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/freelist.go:121 +0x2f4
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*node).spill(0x136be180, 0x13814040, 0xf0e9a8)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/node.go:363 +0x1e0
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*node).spill(0x136be040, 0x13814020, 0xf0e9a8)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/node.go:350 +0x90
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*node).spill(0x13862700, 0x139d1e80, 0x1385ba14)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/node.go:350 +0x90
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Bucket).spill(0x139d1e60, 0x139d1e00, 0x1385bb84)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/bucket.go:570 +0x37c
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Bucket).spill(0x133b488c, 0xbebfd400, 0x1ad8249)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/bucket.go:537 +0x2d4
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Tx).Commit(0x133b4880, 0xb593b2f2, 0x8)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/tx.go:163 +0xb8
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb.(*BoltStore).DeleteRange(0x136f36d0, 0x7fd710, 0x0, 0x800636, 0x0, 0x0, 0x0)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb/bolt_store.go:183 +0x264
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/hashicorp/raft.(*LogCache).DeleteRange(0x1339e810, 0x7fd710, 0x0, 0x800636, 0x0, 0x2, 0xbebfd400)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft/log_cache.go:78 +0xb8
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/hashicorp/raft.(*Raft).compactLogs(0x135461e0, 0x802e36, 0x0, 0x0, 0x0)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft/snapshot.go:235 +0x2b0
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/hashicorp/raft.(*Raft).takeSnapshot(0x135461e0, 0x0, 0x0, 0x0, 0x0)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft/snapshot.go:201 +0x8cc
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/hashicorp/raft.(*Raft).runSnapshots(0x135461e0)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft/snapshot.go:79 +0x30c
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/hashicorp/raft.(*Raft).(github.com/hashicorp/consul/vendor/github.com/hashicorp/raft.runSnapshots)-fm()
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft/api.go:507 +0x1c
Jun 11 16:16:03 naspberry consul[704]: github.com/hashicorp/consul/vendor/github.com/hashicorp/raft.(*raftState).goFunc.func1(0x135461e0, 0x134773b8)
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft/state.go:146 +0x48
Jun 11 16:16:03 naspberry consul[704]: created by github.com/hashicorp/consul/vendor/github.com/hashicorp/raft.(*raftState).goFunc
Jun 11 16:16:03 naspberry consul[704]:         /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft/state.go:144 +0x50
Jun 11 16:16:03 naspberry systemd[1]: consul.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Jun 11 16:16:03 naspberry systemd[1]: consul.service: Failed with result 'exit-code'.
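
Reading the trace bottom-up: the panic happens on raft's snapshot goroutine, where runSnapshots → takeSnapshot → compactLogs calls raft-boltdb's BoltStore.DeleteRange, and bolt blows up while committing that write transaction (Tx.Commit spills dirty pages and frees the replaced ones). For reference, a sketch paraphrasing DeleteRange from raft-boltdb's bolt_store.go; helper names and details may differ from the vendored version:

```go
package raftboltsketch

import (
	"encoding/binary"

	"github.com/boltdb/bolt" // upstream of the vendored copy in the trace
)

var dbLogs = []byte("logs")

// BoltStore wraps the bolt database Consul uses as its Raft log store.
type BoltStore struct {
	conn *bolt.DB
}

// DeleteRange removes Raft log entries with indexes in [min, max].
// Paraphrased from hashicorp/raft-boltdb; the "page already freed" panic
// fires inside tx.Commit() while bolt rebalances and spills dirty pages.
func (b *BoltStore) DeleteRange(min, max uint64) error {
	minKey := uint64ToBytes(min)

	tx, err := b.conn.Begin(true)
	if err != nil {
		return err
	}
	defer tx.Rollback()

	curs := tx.Bucket(dbLogs).Cursor()
	for k, _ := curs.Seek(minKey); k != nil; k, _ = curs.Next() {
		if binary.BigEndian.Uint64(k) > max {
			break // past the compaction range
		}
		if err := curs.Delete(); err != nil {
			return err
		}
	}

	// Commit is frame tx.go:163 in the trace; it walks the dirty pages
	// and releases old ones via freelist.free.
	return tx.Commit()
}

func uint64ToBytes(u uint64) []byte {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, u)
	return buf
}
```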
@pearkes pearkes added the needs-investigation label Jul 26, 2018
@pearkes
Contributor

pearkes commented Oct 26, 2018

Thanks for reporting. I believe this is a duplicate of #3771.

@pearkes pearkes closed this as completed Oct 26, 2018