Split brain resolver intermittently fails to recreate shard #3455
Comments
Thanks! We'll look into it.
I think the bug here is just in Cluster.Sharding itself, so I'll see if we can recreate this locally.
Version: Akka.Cluster.Sharding 1.3.8-beta66
We are using persistence with BigTable (https://github.com/hafslundnett/hn-akka-persistence-bigtable). I think we got a similar error:
We are running 8 nodes (akka-seed-0 to akka-seed-7), where akka-seed-0 and akka-seed-1 are seed nodes. The error seems to have occurred after akka-seed-0 and akka-seed-4 were restarted at about the same time (04:00:07 and 04:00:10). It looks like the nodes that went down are not able to register with the coordinator again.
akka-seed-0.txt
Looks like the same issue as #3204.
Closed as part of Akka.NET v1.3.12.
Version: Akka.Cluster.Sharding 1.3.6-beta62
I've got a three node cluster with a very simple sharded entity. I'm testing the static-quorum split brain resolver strategy and have hit a bug.
The quorum size is set to 2 so when I bring down one node I expect the shard to migrate to one of the UP nodes. I'm also using SQL Server persistence. 'Auto down' is not enabled.
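For reference, a setup like the one described above would typically be expressed in HOCON roughly as follows. This is only a sketch based on the documented Akka.NET 1.3.x split brain resolver settings, not the configuration actually used in this report: the `stable-after` value is an assumption, and the SQL Server persistence settings are left out.

```hocon
akka.cluster {
  # Enable the split brain resolver as the downing provider (Akka.NET 1.3.x type name).
  downing-provider-class = "Akka.Cluster.SplitBrainResolver, Akka.Cluster"

  # 'Auto down' stays disabled: auto-down-unreachable-after is left at its default (off).

  split-brain-resolver {
    active-strategy = static-quorum

    # Assumed value; not stated in the report.
    stable-after = 20s

    static-quorum {
      # With three nodes, a quorum of 2 lets the surviving pair stay up
      # and down the unreachable node.
      quorum-size = 2
    }
  }
}
```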
Sometimes this works; however, around 50% of the time I get the following exception:
[ERROR][17/05/2018 13:37:53][Thread 0020][[akka://akka-cluster-server/system/sharding/my-actorCoordinator/singleton/coordinator#1726064772]] Exception in ReceiveRecover when replaying event type [Akka.Cluster.Sharding.PersistentShardCoordinator+ShardHomeAllocated] with sequence number [9] for persistenceId [/system/sharding/my-actorCoordinator/singleton/coordinator]
Cause: System.ArgumentException: Shard 1 is already allocated
Parameter name: e
   at Akka.Cluster.Sharding.PersistentShardCoordinator.State.Updated(IDomainEvent e)
   at Akka.Cluster.Sharding.PersistentShardCoordinator.ReceiveRecover(Object message)
   at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
   at Akka.Persistence.Eventsourced.<>c__DisplayClass92_0.<Recovering>b__0(Receive receive, Object message)