Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ClusterStressSpec and Cluster Failure Detector Cleanup #4940

Merged

Conversation

Aaronontheweb
Copy link
Member

close #4849

Fixes Akka.Cluster failure detector removal for new nodes that appear in the gossip regardless of what their current membership status is - useful for scenarios where:

  1. Node A heartbeats Node B;
  2. Network split occurs - Node B gets downed, Node A stays up and maintains heartbeat record for Node A;
  3. Node A restarts - rejoins unstable cluster and quickly gets promoted to WeaklyUp;
  4. Network split heals - Node A has massive PhiAccrual value for Node B and immediately marks it as UNREACHABLE due to old data that occurred in previous split.

This resolves that issue by making it so when Node A appears as a new member in Node B's gossip (due to its different UID) the failure detector records get reset anyway, even if Node A appears as WeaklyUp initially and not Joining.

@Aaronontheweb
Copy link
Member Author

This is a clean rebase of #4899

Copy link
Member Author

@Aaronontheweb Aaronontheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self-reviewed this PR

@@ -175,7 +175,7 @@ protected override void AfterTermination()

//TODO: ExpectedTestDuration?

void MuteLog(ActorSystem sys = null)
public virtual void MuteLog(ActorSystem sys = null)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed this for changing MuteLog behavior in individual tests

@@ -348,6 +348,7 @@ Target "MultiNodeTests" (fun _ ->

Target "MultiNodeTestsNetCore" (fun _ ->
if not skipBuild.Value then
setEnvironVar "akka.cluster.assert" "on" // needed to enable assert invariants for Akka.Cluster
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Causes MNTR runs to blow up if the Gossip class does something it shouldn't

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this be picked up by Akka, or do we need to change it to AKKA_CLUSTER_ASSERT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't realize it was picked up as is inside the code, but maybe we should follow our own environment variable naming standard by making this AKKA_CLUSTER_ASSERT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we should - I'll update on that on a subsequent PR. I need to tweak some of the timing settings for StressSpec since the build server is way underpowered compared to my development machine. I'll address that with this too.

{
bool GetAssertInvariants()
{
var isOn = Environment.GetEnvironmentVariable("akka.cluster.assert")?.ToLowerInvariant();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert invariants only runs when the akka.cluster.assert environment variable is set to on

@@ -1964,8 +1964,11 @@ public ReceiveGossipType ReceiveGossip(GossipEnvelope envelope)
// for all new joining nodes we remove them from the failure detector
foreach (var node in _latestGossip.Members)
{
if (node.Status == MemberStatus.Joining && !localGossip.Members.Contains(node))
if (!localGossip.Members.Contains(node))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bugfix for #4849

@@ -649,11 +651,12 @@ public ImmutableHashSet<UniqueAddress> Receivers(UniqueAddress sender)
{
if (iter.MoveNext() == false || n == 0)
{
iter.Dispose(); // dispose enumerator
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never disposed the IEnumerator here before, which was a bug.

/// </summary>
internal class Reachability //TODO: ISerializable?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No real code changes here - just reformatted the class.

@@ -502,7 +502,7 @@ public bool Equals(UniqueAddress other)
}

/// <inheritdoc cref="object.Equals(object)"/>
public override bool Equals(object obj) => obj is UniqueAddress && Equals((UniqueAddress)obj);
public override bool Equals(object obj) => obj is UniqueAddress address && Equals(address);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just ReSharper'd

@@ -508,7 +508,6 @@ public int InitialParticipants
/// </summary>
public void RunOn(Action thunk, params RoleName[] nodes)
{
if (nodes.Length == 0) throw new ArgumentException("No node given to run on.");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this to allow a RunOn statement to execute for zero nodes if needed, which we do in some cases inside the StressSpec, since that spec is dynamic and runs for a configurable number of nodes at each stage.

@@ -30,11 +31,11 @@ public DefaultFailureDetectorRegistry(Func<FailureDetector> factory)

private readonly Func<FailureDetector> _factory;

private AtomicReference<Dictionary<T, FailureDetector>> _resourceToFailureDetector = new AtomicReference<Dictionary<T, FailureDetector>>(new Dictionary<T, FailureDetector>());
private AtomicReference<ImmutableDictionary<T, FailureDetector>> _resourceToFailureDetector = new AtomicReference<ImmutableDictionary<T, FailureDetector>>(ImmutableDictionary<T, FailureDetector>.Empty);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed a potential mutability issue here.

@@ -154,7 +154,7 @@ public State(HeartbeatHistory history, long? timeStamp)

private AtomicReference<State> _state;

private State state
internal State state
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made internal for testing purposes only.

Copy link
Contributor

@Arkatufus Arkatufus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@Aaronontheweb
Copy link
Member Author

Looks like this blew up on the build server because the TestKit logger timed out in StressSpec:

2021-04-16T14:21:03.9719991Z [Node2:node-2][4/16/2021 2:21:03 PM][FAIL-EXCEPTION] Type: Akka.Configuration.ConfigurationException
2021-04-16T14:21:03.9732620Z [Node2:node-2][4/16/2021 2:21:03 PM][FAIL-EXCEPTION] Type: System.AggregateException
2021-04-16T14:21:03.9742849Z [Node2:node-2][4/16/2021 2:21:03 PM][FAIL-EXCEPTION] Type: Akka.Actor.AskTimeoutException
2021-04-16T14:21:03.9762734Z --> [Node2:node-2][4/16/2021 2:21:03 PM][FAIL-EXCEPTION] Message: Logger [Akka.TestKit.TestEventListener, Akka.TestKit] specifie
2021-04-16T14:21:03.9806036Z Unknown message: d in config cannot be loaded: System.AggregateException: One or more errors occurred. ---> Akka.Actor.AskTimeoutException: Timeout after 00:00:05 seconds
2021-04-16T14:21:03.9853083Z    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
2021-04-16T14:21:03.9870018Z    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
2021-04-16T14:21:04.1378753Z    at Akka.Actor.Futures.<Ask>d__8.MoveNext()
2021-04-16T14:21:04.1435493Z --- End of stack trace from previous location where exception was thrown ---
2021-04-16T14:21:04.1628432Z    at System.Runtime.ExceptionServices.ExceptionDispatchInfo
2021-04-16T14:21:04.1644076Z Unknown message: .Throw()
2021-04-16T14:21:04.1654609Z    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
2021-04-16T14:21:04.1666075Z    at Akka.Actor.Futures.<Ask>d__6`1.MoveNext()
2021-04-16T14:21:04.1677599Z    --- End of inner exception stack trace ---
2021-04-16T14:21:04.1691606Z    at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
2021-04-16T14:21:04.1823157Z    at Akka.Event.LoggingBus.AddLogger(ActorSystemImpl system, Type loggerType, LogLevel logLevel, String loggingBusName, TimeSpan timeout)
2021-04-16T14:21:04.1843422Z    at Akka.Event.LoggingBus.StartDefaultLoggers(ActorSystemImpl system)
2021-04-16T14:21:04.1893458Z --->
2021-04-16T14:21:04.2188796Z Unknown message:  (Inner Exception #0) Akka.Actor.AskTimeoutException: Timeout after 00:00:05 seconds
2021-04-16T14:21:04.3073642Z    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
2021-04-16T14:21:06.6969980Z    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
2021-04-16T14:21:06.6971119Z    at Akka.Actor.Futures.<Ask>d__8.MoveNext()
2021-04-16T14:21:06.6971809Z --- End of stack trace from previous location where exception was thrown ---
2021-04-16T14:21:06.6972498Z    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
2021-04-16T14:21:06.6973097Z    at System.Runtime.CompilerServices.TaskAwaiter.HandleNon
2021-04-16T14:21:06.6973836Z Unknown message: SuccessAndDebuggerNotification(Task task)
2021-04-16T14:21:06.6974851Z    at Akka.Actor.Futures.<Ask>d__6`1.MoveNext()<---

I'll see what I can do to harden that for this spec but it may not be much...

@Aaronontheweb Aaronontheweb enabled auto-merge (squash) April 16, 2021 17:50
@Aaronontheweb Aaronontheweb merged commit 3ac3ee8 into akkadotnet:dev Apr 16, 2021
@Aaronontheweb Aaronontheweb deleted the fix/4849-rebase-clusterStressSpec branch April 16, 2021 22:20
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this pull request Apr 17, 2021
Per some of the suggestions on akkadotnet#4940 PR review
Aaronontheweb added a commit that referenced this pull request Apr 17, 2021
Per some of the suggestions on #4940 PR review
This was referenced Apr 28, 2021
Aaronontheweb added a commit that referenced this pull request Apr 28, 2021
* Added v1.4.19 placeholder

* close #4860 - use local deploy for TcpManager child actors. (#4862)

* close #4860 - use local deploy for TcpManager child actors.

* Use local deploy for TcpIncomingConnection.

* Use local deploy for Udp actors.

Co-authored-by: Erik Folstad <erikmafo@gmail.com>
Co-authored-by: Aaron Stannard <aaron@petabridge.com>

* Merge pull request #4875 from akkadotnet/dependabot/nuget/Hyperion-0.9.17

Bump Hyperion from 0.9.16 to 0.9.17

* Bump Newtonsoft.Json from 12.0.3 to 13.0.1 (#4866)

Bumps [Newtonsoft.Json](https://github.com/JamesNK/Newtonsoft.Json) from 12.0.3 to 13.0.1.
- [Release notes](https://github.com/JamesNK/Newtonsoft.Json/releases)
- [Commits](JamesNK/Newtonsoft.Json@12.0.3...13.0.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

* Fix ClusterMetricsExtensionSpec racy spec

* Clean up Akka.Stream file stream (#4874)

* Make sure that FileSubscriber shuts down cleanly when it dies

* Make sure that file all sink spec release the file handle if it fails

* Supress ActorSelectionMessage with DeadLetterSuppression (migrated from akka/akka#28341) (#4889)

* for example the Cluster InitJoin message is marked with DeadLetterSuppression
  but was anyway logged because sent with actorSelection
* for other WrappedMessage than ActorSelectionMessage we shouldn't unwrap and publish
  the inner in SuppressedDeadLetter because that might loose some information
* therefore those are silenced in the DeadLetterListener instead

Better deadLetter logging of wrapped messages (migrated from akka/akka#28253)

Logging of UnhandledMessage (migrated from akka/akka#28414)
* make use of the existing logging of dead letter
  also for UnhandledMessage

Add Dropped to Akka.Actor (migrated partially from akka/akka#27160)
Log Dropped from DeadLetterListener

* add CultureInfo for Turkish OS (#4880)

* add CultureInfo for Turkish OS

added English CultureInfo to fix ToUpper function causes error on Turkish OS. "warning"->"WARNİNG"

* fix LogLevel TR char error

Co-authored-by: Aaron Stannard <aaron@petabridge.com>

* Harden FileSink unit tests by using AwaitAssert to wait for file operations to complete (#4891)

* Harden FileSink unit tests by using AwaitAssert to wait for file operations to complete

* Use AwaitResult to improve readability

Co-authored-by: Aaron Stannard <aaron@petabridge.com>

* Handle CoordinatedShutdown exiting-completed when not joined (#4893)

* assertion failed: Nodes not part of cluster have marked the Gossip as seen
* trying to mark the Gossip as seen before it has joined, which may happen
  if CoordinatedShutdown is running before the node has joined

migrated from akka/akka#26835

* Persistence fixes (#4892)

* snapshot RecoveryTick ignored, part of akka/akka#20753

* lastSequenceNr should reflect the snapshot sequence and not start with 0 when journal is empty. Migrated from akka/akka#27496

* Enforce valid seqnr for deletes, migrated from akka/akka#25488

* api approval

* Added DoNotInherit annotation (#4896)

* Bump Microsoft.NET.Test.Sdk from 16.9.1 to 16.9.4 (#4894)

* Add Setup class for NewtonSoftJsonSerializer (#4890)

* Add Setup class for NewtonSoftJsonSerializer

* Use Setup as a settings modifier instead of a settings factory

* Update spec

* Update API Approval list

* Unit test can inject null ActorSystem into the serializer causing the Setup system to throw a NRE

* Add documentation

* fixed up copyright headers (#4898)

* Bump Google.Protobuf from 3.15.6 to 3.15.7 (#4900)

Bumps [Google.Protobuf](https://github.com/protocolbuffers/protobuf) from 3.15.6 to 3.15.7.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/master/generate_changelog.py)
- [Commits](protocolbuffers/protobuf@v3.15.6...v3.15.7)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

* Added PhiAccrualFailureDetector warning logging for slow heartbeats (#4897)

Ported from akka/akka#17389 and akka/akka#24701

* replace reflection magic in MNTR with reading of `MultiNodeConfig` properties (#4902)

* close #4901 - replace reflection magic in MNTR with reading of MultiNodeConfig properties

* fixed outdated DiscoverySpec

* fixed SBR logging error that blew up StandardOutLogger (#4909)

This format error would cause the StandardOutLogger to throw a `FormatException` internally

* added timestamp to node failures in MNTR (#4911)

* cleaned up the SpecPass / SpecFail messages (#4912)

* reduce allocations inside PhiAccrualFailureDetector (#4913)

made `HeartbeatHistory` into a `readonly struct` and cleaned up some other old LINQ calls inside the data structure

* Bump Microsoft.Data.SQLite from 5.0.4 to 5.0.5 (#4914)

* [MNTR] Add include and exclude test filter feature (#4916)

* Add -Dmultinode.include and -Dmultinode.exclude filter feature

* Add documentation

* Fix typos and makes sentences more readable

* Make the sample command line wrap instead of running off the screen

* Change include and exclude filtering by method name instead (requested)

* cleaned up RemoteWatcher (#4917)

* Fixed System.ArgumentNullException in Interspase operation on empty stream finish. (#4918)

* Rewrite the AkkaDiFixture so that it does not need to start a HostBuilder (#4920)

* Fix case where PersistenceMessageSerializer.FromBinary got a null for its type parameter (#4923)

* Bump Google.Protobuf from 3.15.7 to 3.15.8 (#4927)

Bumps [Google.Protobuf](https://github.com/protocolbuffers/protobuf) from 3.15.7 to 3.15.8.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/master/generate_changelog.py)
- [Commits](protocolbuffers/protobuf@v3.15.7...v3.15.8)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

* close #4096 - documented how to terminate remembered entities (#4928)

Updated the Akka.Cluster.Sharding documentation to explain how to terminate remembered-entities.

* Add CLI switches to show help and version number (#4925)

* cleaned up protobuf CLI and definitions (#4930)

* cleaned up protobuf CLI and definitions

- Remove all `optional` fields (not allowed in Protobuf3, as all fields are optional by default unless specifically defined as `required`)
- Removed `--experimental_allow_proto3_optional` call from `protoc` compiler as it's no longer supported / needed
- Doesn't have any impact on existing wire formats, especially for `ClusterMessages.proto` which is where I removed all of the `optional` commands

* fixed compilation error caused by change in generated `AppVersion` output

* Fix MNTK specs for DData: DurablePruningSpec (#4933)

* Powershell splits CLI arguments on "." before passing them into applications (#4924)

* porting Cluster heartbeat timings, hardened Akka.Cluster serialization (#4934)

* porting Cluster heartbeat timings, hardened Akka.Cluster serialization

port akka/akka#27281
port akka/akka#25183
port akka/akka#24625

* increased ClusterLogSpec join timespan

Increased the `TimeSpan` here to 10 seconds in order to prevent this spec from failing racily, since even an Akka.Cluster self-join can take more than the default 3 seconds due to some of the timings involved in node startup et al.

* Bump Hyperion from 0.9.17 to 0.10.0 (#4935)

Bumps [Hyperion](https://github.com/akkadotnet/Hyperion) from 0.9.17 to 0.10.0.
- [Release notes](https://github.com/akkadotnet/Hyperion/releases)
- [Changelog](https://github.com/akkadotnet/Hyperion/blob/dev/RELEASE_NOTES.md)
- [Commits](akkadotnet/Hyperion@0.9.17...0.10.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

* Add spec for handling delegates in DI (#4922)

* Add spec for handling delegates in DI

* Make sure spec exits cleanly by terminating the actor system.

* Add spec where singleton delegate is called from another actor

* Fix racy test

Co-authored-by: Aaron Stannard <aaron@petabridge.com>

* Bump FsCheck from 2.15.1 to 2.15.2 (#4939)

* ClusterStressSpec and Cluster Failure Detector Cleanup (#4940)

* implementation of Akka.Cluster.Tests.MultiNode.StressSpec

* made MuteLog overrideable in Akka.Cluster.Testkit

* if Roles is empty, then don't run the thunk on any nodes

Changed this to make it consistent with the JVM

* made it possible to actually enable Cluster.AssertInvariants via environment variable

* added assert invariants to build script

cleaned up gossip class to assert more invariants

* ReSharper'd Reachability.cs

* cleaned up immutability and CAS issues inside DefaultFailureDetectorRegistry

added bugfix from akka/akka#23595

* Bump FsCheck.Xunit from 2.15.1 to 2.15.2 (#4938)

Bumps [FsCheck.Xunit](https://github.com/fsharp/FsCheck) from 2.15.1 to 2.15.2.
- [Release notes](https://github.com/fsharp/FsCheck/releases)
- [Changelog](https://github.com/fscheck/FsCheck/blob/master/FsCheck%20Release%20Notes.md)
- [Commits](fscheck/FsCheck@2.15.1...2.15.2)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

* cleanup `AKKA_CLUSTER_ASSERT` environment variable (#4942)

Per some of the suggestions on #4940 PR review

* harden Akka.DependencyInjection.Tests (#4945)

Added some `AwaitAssert` calls to check for disposed dependencies - these calls can be racy due to background actor thread calling `Dipose` after foreground test thread checks the `Disposed` property.

* HeartbeatNodeRing performance (#4943)

* added benchmark for HeartbeatNodeRing performance

* switched to local function

No perf change

* approve Akka.Benchmarks friend assembly for Akka.Cluster

* remove HeartbeatNodeRing.NodeRing() allocation and make field immutable

* made it so Akka.Util.Internal.ArrayExtensions.From no longer allocates (much)

* added some descriptive comments on HeartbeatNodeRing.Receivers

* Replaced `Lazy<T>` with `Option<T>` and a similar lazy initialization check

 Improved throughput by ~10% on larger collections and further reduced memory allocation.

* changed return types to `IImmutableSet`

Did this in order to reduce allocations from constantly converting back and forth from `ImmutableSortedSet<T>` and `ImmutableHashSet<T>` - that way we can just use whatever the underlying collection type is.

* added ReachabilityBenchmarks

* modified PingPong / RemotePingPong benchmarks to display threadcount (#4947)

Using this to gauge the impact certain dispatcher changes have on the total number of active threads per-process

* Configure duration for applying `MemberStatus.WeaklyUp`  to joining nodes (#4946)

* Configure duration for applying `MemberStatus.WeaklyUp`  to joining nodes

port of akka/akka#29665

* fixed validation check for TimeSpan duration passed in via HOCON

* harden ClusterLogSpecs

* restored Akka.Cluster model-based FsCheck specs (#4949)

* added `VectorClock` benchmark (#4950)

* added VectorClock benchmark

* fixed broken benchmark comparisons

* Performance optimize `VectorClock` (#4952)

* Performance optimize `VectorClock`

* don't cache MD5, but dispose of it

* guarantee disposal of iterators during VectorClock.Compare

* switch to local function for `VectorClock.CompareNext`

* fixed a comparison bug in how versions where compared

* minor cleanup

* replace `KeyValuePair<TKey,TValue>` with `ValueTuple<TKey,TValue>`

Reduced allocations by 90%, decreased execution time from 100ms to ~40ms

* harden RestartFirstSeedNodeSpec (#4954)

* harden RestartFirstSeedNodeSpec

* validate that we have complete seed node list prior to test

* Turned `HeatbeatNodeRing` into `struct` (#4944)

* added benchmark for HeartbeatNodeRing performance

* switched to local function

No perf change

* approve Akka.Benchmarks friend assembly for Akka.Cluster

* remove HeartbeatNodeRing.NodeRing() allocation and make field immutable

* made it so Akka.Util.Internal.ArrayExtensions.From no longer allocates (much)

* added some descriptive comments on HeartbeatNodeRing.Receivers

* Replaced `Lazy<T>` with `Option<T>` and a similar lazy initialization check

 Improved throughput by ~10% on larger collections and further reduced memory allocation.

* changed return types to `IImmutableSet`

Did this in order to reduce allocations from constantly converting back and forth from `ImmutableSortedSet<T>` and `ImmutableHashSet<T>` - that way we can just use whatever the underlying collection type is.

* converted `HeartbeatNodeRing` into a `struct`

improved performance some, but I don't want to lump it in with other changes just in case

* Add generalized crossplatform support for Hyperion serializer. (#4878)

* Add the groundwork for generalized crossplatform support.

* Update Hyperion to 0.10.0

* Convert adapter class to lambda object

* Add HyperionSerializerSetup setup class

* Add unit test spec

* Improve specs, add comments

* Add documentation

* Add copyright header.

* Change readonly fields to readonly properties.

* cleaned up `ReceiveActor` documentation (#4958)

* removed confusing and conflicting examples in the `ReceiveActor` documentation
* Embedded reference to "how actors restart" YouTube video in supervision docs

* updated website footer to read 2021 (#4959)

* added indicator for `ClusterResultsAggregator` in `StressSpec` logs (#4960)

Did this to make it easier to search for output logs produced during each phase of the `StressSpec`

* Bump Hyperion from 0.10.0 to 0.10.1 (#4957)

* Bump Hyperion from 0.10.0 to 0.10.1

Bumps [Hyperion](https://github.com/akkadotnet/Hyperion) from 0.10.0 to 0.10.1.
- [Release notes](https://github.com/akkadotnet/Hyperion/releases)
- [Changelog](https://github.com/akkadotnet/Hyperion/blob/dev/RELEASE_NOTES.md)
- [Commits](akkadotnet/Hyperion@0.10.0...0.10.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

* Fix dependabot Hyperion issue (#4961)

* Update Akka.Remote.Tests.csproj to use common.props

* Update HyperionSerializer to reflect recent hyperion changes

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
Co-authored-by: Gregorius Soedharmo <arkatufus@yahoo.com>

* Perf optimize `ActorSelection` (#4962)

* added memory metrics to `ActorSelection` benchmarks

* added ActorSelection benchmark

* ramped up the iteration counts

* validate that double wildcard can't be used outside of leaf node

* improve allocations on create

* minor cleanup

* create emptyRef only when needed via local function

* made `Iterator` into `struct`

* approved public API changes

* `Reachability` performance optimziation (#4955)

* reduced iteration count to speed up benchmarks

* optimize some System.Collections.Immutable invocations to allocate less

* cleanup dictionary construction

* fixed multiple enumeration bug in `Reachability`

* Fix `SpawnActor` benchmark (#4966)

* cleaned up SpawnActorBenchmarks

* cleaned up SpawnActor benchmarks

* fixed N-1 error inside `Mailbox` (#4964)

This error has no impact on extremely busy actors, but for actors who have to process small bursts of messages this can make the difference between getting everything done in one dispatch vs. doing it in two.

* Clean up bad outbound ACKs in Akka.Remote (#4963)

port of akka/akka#20093

Might be responsible for some quarantines in Akka.Cluster / Akka.Remote when nodes are restarting on identical addresses.

* UnfoldResourceSource closing twice on failure (#4969)

* Added test cases where close would be called twice

* Bugfix UnfoldResource closed resource twice on failure

* Add retry pattern with delay calculation support (#4895)

Co-authored-by: Aaron Stannard <aaron@petabridge.com>

* simplified the environment variable name for StressSpec (#4972)

* Refactored `Gossip` into `MembershipState` (#4968)

* refactor Gossip class into `MembershipState`

port of akka/akka#23291

* completed `MembershipState` port

* fixed some downed observers calls

* forgot to copy gossip upon `Welcome` from Leader

* forgot to copy `MembershipState` while calling `UpdateLatestGossip`

* refactored all DOWN-ing logic to live inside `Gossip` class

* added some additional methods onto `MembershipState`

* fixed ValidNodeForGossip bug

* fixed equality check for Reachability

 should be quality by reference, not by value

Co-authored-by: Gregorius Soedharmo <arkatufus@yahoo.com>

* Fix serialization verification problem with Akka.IO messages (#4974)

* Fix serialization verification problem with Akka.IO messages

* Wrap naked SocketAsyncEventArgs in a struct that inherits INoSerializationVerificationNeeded

* Make the wrapper struct readonly

* Expand exception message with their actor types

* Update API Approver list

* ORDictionary with POCO value missing items, add MultiNode spec (#4910)

* Update DData reference HOCON config to follow JVM

* Clean up ReplicatorSettings, add sensible default values.

* Add DurableData spec that uses ORDictionary with POCO values

* Add a special case for Replicator to suppress messages during load

* Slight change in public API, behaviour is identical.

* Change Replicator class to UntypedActor

* Code cleanup - Change fields to properties

* Update API Approver list

* clean up seed node process (#4975)

* fixed racy ActorModelSpec (#4976)

fixed `A_dispatcher_must_handle_queuing_from_multiple_threads` - we were using the wrong message type the entire time, and the previous instance caused `Thread.Sleep` to be called repeatedly.

* Update PluginSpec so that it can accept ActorSystem and ActorSystemSetup in its constructor (#4978)

* Removed inaccurate warning from Cluster Singleton docs (#4980)

Cluster singletons won't create duplicates in a cluster split scenario, and it's much safer to run them _with_ a split brain resolver on than without. This documentation was just out of date.

* Introduce `ChannelExecutor` (#4882)

* added `ChannelExecutor` dispatcher - uses `FixedConcurrencyTaskScheduler` internally - have `ActorSystem.Create` call `ThreadPool.SetMinThreads(0,0)` to improve performance across the board.

* fixed documentation errors

* Upgrade to GitHub-native Dependabot (#4984)

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

* added v1.4.19 release notes (#4985)

* added v1.4.19 release notes

Co-authored-by: Erik Følstad <32196030+ef-computas@users.noreply.github.com>
Co-authored-by: Erik Folstad <erikmafo@gmail.com>
Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
Co-authored-by: Igor <igor@fedchenko.pro>
Co-authored-by: Gregorius Soedharmo <arkatufus@yahoo.com>
Co-authored-by: zbynek001 <zbynek001@gmail.com>
Co-authored-by: Cagatay YILDIZOGLU <yildizoglu@gmail.com>
Co-authored-by: Ismael Hamed <1279846+ismaelhamed@users.noreply.github.com>
Co-authored-by: Anton V. Ilyin <iliyn.anton.v@gmail.com>
Co-authored-by: Arjen Smits <Deathraven163@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Akka.Cluster: quarantining / reachability changes appear to be extremely sensitive
2 participants