Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Akka.Cluster.Sharding: automatically handle ShardingEnvelope and StartEntity without forcing user to define handlers #6717

Closed
Aaronontheweb opened this issue May 3, 2023 · 6 comments · Fixed by #6863

Comments

@Aaronontheweb
Copy link
Member

Is your feature request related to a problem? Please describe.

We have two built-in message types inside Akka.Cluster.Sharding that are designed to be routed automatically to entity actors:

ShardEnvelope - used to wrap normal messages with EntityId data so they can be routed correctly to the appropriate entity actor.

/// <summary>
/// <para>Default envelope type that may be used with Cluster Sharding.</para>
/// <para>
/// The alternative way of routing messages through sharding is to not use envelopes,
/// and have the message types themselves carry identifiers.
/// </para>
/// </summary>
public sealed class ShardingEnvelope: IWrappedMessage
{
public string EntityId { get; }
public object Message { get; }
public ShardingEnvelope(string entityId, object message)
{
EntityId = entityId;
Message = message;
}
}

This message is going to be used heavily in our reliable delivery system going forward

ShardRegion.StartEntity - used to power remember-entities=on and currently we tell users they have to support this in their message extractors.

public sealed class StartEntity : IClusterShardingSerializable, IEquatable<StartEntity>
{
/// <summary>
/// An identifier of an entity to be started. Unique in scope of a given shard.
/// </summary>
public readonly EntityId EntityId;
/// <summary>
/// Creates a new instance of a <see cref="StartEntity"/> class, used for requesting
/// to start an entity with provided <paramref name="entityId"/>.
/// </summary>
/// <param name="entityId">An identifier of an entity to be started on a given shard.</param>
public StartEntity(EntityId entityId)
{
EntityId = entityId;
}
#region Equals
public override bool Equals(object obj)
{
return Equals(obj as StartEntity);
}
public bool Equals(StartEntity other)
{
if (ReferenceEquals(other, null)) return false;
if (ReferenceEquals(other, this)) return true;
return EntityId.Equals(other.EntityId);
}
public override int GetHashCode()
{
return EntityId.GetHashCode();
}
public override string ToString() => $"StartEntity({EntityId})";
#endregion
}

Describe the solution you'd like

The ShardRegion should just handle both of these messages automatically, without requiring users to manually handle them inside their own message extractors. We know what these messages are for and we have all the data we need to route them, so why not?

Additional context

It is possible that there can be a performance impact to doing this, as we now have to filter the messages twice instead of once - but erroring on the side of having better DX here seems like a no-brainer.

@Aaronontheweb
Copy link
Member Author

So this is a bit of a problem:

/// <summary>
/// Interface of the function used by the <see cref="ShardRegion"/> to
/// extract the shard id from an incoming message.
/// Only messages that passed the <see cref="ExtractEntityId"/> will be used
/// as input to this function.
/// </summary>
public delegate ShardId ExtractShardId(Msg message);
/// <summary>
/// Interface of the partial function used by the <see cref="ShardRegion"/> to
/// extract the entity id and the message to send to the entity from an
/// incoming message. The implementation is application specific.
/// If the partial function does not match the message will be
/// `unhandled`, i.e. posted as `Unhandled` messages on the event stream.
/// Note that the extracted message does not have to be the same as the incoming
/// message to support wrapping in message envelope that is unwrapped before
/// sending to the entity actor.
/// </summary>
public delegate Option<(EntityId, Msg)> ExtractEntityId(Msg message);

In order for this system to work properly, and we have to solve the same issue we ran into when implementing the GetEntityLocation query:

#6101 (comment)

One thing that is awkward about this query is that the ShardExtractor has to support the ShardRegion.StartEntity type in order for this to work - the reason being is that even though we already know what the entity id is, the extractors that resolve the ShardId are looking for an input message to generate the ShardId, not the EntityId. Therefore I have to smuggle the EnttityId back through a ShardRegion.StartEntity message. That makes this a bit hacky but it's the best compromise I could make.

Really, the ExtractShardId and ExtractEntityId should have been designed to be monadic:

public delegate Option<(EntityId, Msg)> ExtractEntityId(Msg message);

 public delegate ShardId ExtractShardId(EntityId entityId);

The benefits of doing it that way:

  1. Can generically route messages and queries that aren't explicitly defined by the user's custom method;
  2. More performant - can operate off of the EntityId directly without needing to extract it from the message for a second time (reduces per-message routing costs); and
  3. Better separation of concerns TBH - extracting the EntityId from the message is the user's job. Extracting the ShardId from the EntityId is mostly handled by Akka.NET infrastructure anyway (although it can and should still be customizable.)

But since they're not, that leave us in a pickle. We can either:

  1. Add yet another set of overloads to the sharding creation methods to use "properly designed" methods that separate concerns more cleanly - obsolete the older methods;
  2. Break the APIs in v1.6 and replace them;
  3. Do a combination of both; or
  4. Do nothing.

We should make a decision, but my guess is that the size and scope of this decision puts things somewhat out of reach for now.

@Aaronontheweb Aaronontheweb modified the milestones: 1.5.5, 1.5.6, 1.5.7 May 4, 2023
@ismaelhamed
Copy link
Member

ShardRegion.StartEntity - used to power remember-entities=on and currently we tell users they have to support this in their message extractors.

Wow! Since when do we need to handle StartEntity in the EntityId() method? AFAIK, that's only a concern when extracting the ShardId.

Really, the ExtractShardId and ExtractEntityId should have been designed to be monadic:

IMO, any improvements to the internals of the project that deviate from how the JVM works, it'd be wiser to bring them up with the original Team/Community so that they can also benefit from them.

@Aaronontheweb
Copy link
Member Author

@ismaelhamed it's always been that way - we do handle it automatically here for a specific type of query:

var entityId = getEntityLocation.EntityId;
var shardId = _extractShardId(new StartEntity(getEntityLocation.EntityId));
if (string.IsNullOrEmpty(shardId))
{
// unsupported entityId - could only happen in highly customized extractors
sender.Tell(new EntityLocation(getEntityLocation.EntityId, shardId, Address.AllSystems,
Option<IActorRef>.None));
return;
}

Technically it's true - the message only needs to be handled by the ShardExtractor, but the routing system doesn't do anything special for it - typically it passes the input into the ExtractEntityId method and then passes the extracted EntityId back into the ShardId. My proposal in this case is to automatically handle these messages but not needing to pass the message content itself in as input to the ShardId method anymore.

IMO, any improvements to the internals of the project that deviate from how the JVM works, it'd be wiser to bring them up with the original Team/Community so that they can also benefit from them.

So looking at how Akka 2.6.19 does things:

https://github.com/akka/akka/blob/3843e0f34d8a5c4e2e6a9b9112839fd483c45907/akka-cluster-sharding-typed/src/main/scala/akka/cluster/sharding/typed/internal/ClusterShardingImpl.scala#L52-L62

@InternalApi private[akka] class ExtractorAdapter[E, M](delegate: ShardingMessageExtractor[E, M])
    extends ShardingMessageExtractor[Any, M] {
  override def entityId(message: Any): String = {
    message match {
      case ShardingEnvelope(entityId, _) => entityId null
      case ClassicStartEntity(entityId)  => entityId
      case msg                           => delegate.entityId(msg.asInstanceOf[E])
    }
  }

  override def shardId(entityId: String): String = delegate.shardId(entityId)

They also do what I'm proposing here - and they redesigned their extractors to work as I described above: only the entityId is needed as input.

@ismaelhamed
Copy link
Member

The ExtractorAdapter seems like a good compromise (no need to touch how things currently work in Akka Classic, for better or worse). And we've already brought Akka Typed structures in the past (i.e., the ShardingEnvelope wasn't a thing until we ported the ShardedDaemonProcess)

@Aaronontheweb
Copy link
Member Author

In order for that adapter structure to work, we have to deprecate the old ShardExtractor API and make it work off of an EntityId only - which I'm all for.

@Aaronontheweb
Copy link
Member Author

On #6863 I've been able to implement the basic structures needed to support this without any breaking changes - and it looks like there is automatic handling for StartEntity in two different places (both the base HashCodeMessageExtractor and inside the ShardRegion itself.) Going to clean that up next.

@Aaronontheweb Aaronontheweb modified the milestones: 1.5.12, 1.5.13 Aug 2, 2023
@Aaronontheweb Aaronontheweb modified the milestones: 1.5.13, 1.5.14 Sep 20, 2023
@Aaronontheweb Aaronontheweb modified the milestones: 1.5.14, 1.5.15 Nov 29, 2023
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this issue Jan 4, 2024
Aaronontheweb added a commit that referenced this issue Jan 4, 2024
…artEntity` and `ShardEnvelope` handling (#6863)

* added new API to help address #6717

* integrating into `ShardRegion` and `Shard`

* finished all API changes

* API approvals

* updated `ShardRegion` to use new APIs

* optimized `Shard`

* updated API approvals

* aggressively inline hashcode message extractor calls

* fixed `ShardedDaemon` message extractor

* added `ExtractorAdapter` to automatically handle messages

* fixed spec

* perf

* Revert "perf"

This reverts commit 7c0a7f4.

* fixed bugs with `StartEntity`

* made `MessageExtractor` and `ShardIdExtractor` delegates obsolete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants