A question we get asked about frequently with Akka.NET is “how can I guarantee that my messages will eventually be received and processed by another actor?” Most of the time it’s developers who work on finance applications who have this question, because each missed message means that someone’s accounting is off - but developers in lots of other domains have this issue too.

Akka.NET’s Default Messaging Guarantees

By default Akka.NET does not guarantee delivery of messages - instead, Akka.NET uses “at most once” message delivery.

The reasons why this design decision was made are outlined well here, but I’ll give you the summarized version: guaranteed message delivery is slow, expensive, and adds a large number of moving parts to any application using Akka.NET.

And most of the time, you don’t need it.

All in-memory message passing within one application, for instance, is going to be guaranteed unless you hit an OutOfMemoryException or any of the other standard CLR failures - those failure cases are no different than what would happen if you were invoking methods instead of passing messages.

But let’s talk about the cases where you do need guaranteed message delivery: when you’re passing messages over the network.

Guaranteed Message Delivery over the Network with Akka.Persistence

When it comes to mission-critical messages that contain important data, you can’t afford to be at the mercy of imperfect networks. So you have to take active steps to ensure delivery of those messages - enter the AtLeastOnceDeliveryActor base class in Akka.NET’s Akka.Persistence model.

In case you missed it, read our introduction to event sourcing with Akka.Persistence here.

As we mention in the Akka.NET documentation, delivery of a single message is guaranteed only if we can definitively answer “yes” to all of the following questions:

  1. The message is sent out on the network?
  2. The message is received by the other host?
  3. The message is put into the target actor’s mailbox?
  4. The message is starting to be processed by the target actor?
  5. The message is processed successfully by the target actor?

The only realistic way to implement this is to use an acknowledgement-based protocol, where the actor sending a message receives some form of acknowledgment from the recipient once the message has been successfully processed. This is in essence how TCP and other reliable network protocols work.

The AtLeastOnceDeliveryActor base class exposes the components needed for us to satisfy all five guarantees - here’s, in essence, how it works.

Akka.NET AtLeastOnceDeliveryActor success case

The first thing to bear in mind is that all AtLeastOnceDeliveryActor implementations are also PersistentActors, so they have a distinct PersistenceId and write the messages they are guaranteeing for delivery to a persistent journal.

For instance, let’s take the following AtLeastOnceDeliveryReceiveActor:

using System;
using Akka.Actor;
using Akka.Persistence;

public class MyAtLeastOnceDeliveryActor : AtLeastOnceDeliveryReceiveActor
{
    // Going to use our name for persistence purposes
    public override string PersistenceId => Context.Self.Path.Name;
    private int _counter = 0;

    private ICancelable _recurringMessageSend;
    private readonly IActorRef _targetActor;

    private class DoSend { }

    const string Characters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    public MyAtLeastOnceDeliveryActor(IActorRef targetActor)
    {
        _targetActor = targetActor;

        // recover the most recent at least once delivery state
        Recover<SnapshotOffer>(
		offer => offer.Snapshot is Akka.Persistence.AtLeastOnceDeliverySnapshot, 
		offer =>
        {
            var snapshot = offer.Snapshot as Akka.Persistence.AtLeastOnceDeliverySnapshot;
            SetDeliverySnapshot(snapshot);
        });

        Command<DoSend>(send =>
        {
            Self.Tell(new Write("Message " + Characters[_counter++ % Characters.Length]));
        });

        Command<Write>(write =>
        {
            Deliver(_targetActor.Path, 
			messageId => new ReliableDeliveryEnvelope<Write>(write, messageId));

            // save the full state of the at least once delivery actor
            // so we don't lose any messages upon crash
            SaveSnapshot(GetDeliverySnapshot());
        });

        Command<ReliableDeliveryAck>(ack =>
        {
            ConfirmDelivery(ack.MessageId);
        });

        Command<SaveSnapshotSuccess>(saved =>
        {
            var seqNo = saved.Metadata.SequenceNr;

			// delete all but the most current snapshot
            DeleteSnapshots(new SnapshotSelectionCriteria(seqNo, 
			saved.Metadata.Timestamp.AddMilliseconds(-1))); 
        });

        Command<SaveSnapshotFailure>(failure =>
        {
            // log or do something else
        });
    }

    protected override void PreStart()
    {
        _recurringMessageSend = Context.System.Scheduler
        .ScheduleTellRepeatedlyCancelable(
				TimeSpan.FromSeconds(1),
            	TimeSpan.FromSeconds(10), Self, new DoSend(), Self);

        base.PreStart();
    }

    protected override void PostStop()
    {
        _recurringMessageSend?.Cancel();

        base.PostStop();
    }
}

I left out some auxiliary pieces of code in this sample just to make it easier to read, but you can see the full source for this AtLeastOnceDelivery actor and the rest of this sample here.

How AtLeastOnceDeliveryActors Work

The trick to making at least once message delivery work lies in this function:

Deliver(_targetActor.Path, 
	messageId => new ReliableDeliveryEnvelope<Write>(write, messageId));

The AtLeastOnceDeliveryActor.Deliver method is what makes the entire at least once delivery process work; it immediately does the four following things:

  1. Records the exact message we are trying to send (a user-defined Write message in this case) and the ActorPath of the actor we’re trying to send it to;
  2. Assigns a monotonically increasing message ID to that message, and passes it back in the end-user in the Func<long, object> lambda defined as the second parameter of the message - more on that in a moment;
  3. Schedules a redelivery attempt to occur after the interval configured at akka.persistence.at-least-once-delivery.redeliver-interval, which is 5 seconds by default; and
  4. Executes the first delivery attempt to the actor at the given path.

The most important thing to understand with this process is that we have to include the message ID assigned by the AtLeastOnceDeliveryActor in the final message that is going to be delivered to the recipient actor, because the recipient actor, which can be any user-defined actor, must be able to reply back to the AtLeastOnceDeliveryActor with a confirmation that it has processed the message that corresponds to that ID.

In this case, we pack the original Write message into a generic wrapper “envelope” message class called ReliableDeliveryEnvelope<T>:

public class ReliableDeliveryEnvelope<TMessage>
{
    public ReliableDeliveryEnvelope(TMessage message, long messageId)
    {
        Message = message;
        MessageId = messageId;
    }

    public TMessage Message { get; private set; }

    public long MessageId { get; private set; }
}

This allows us to add at least once delivery functionality wherever we need it in our application without having to reinvent our message classes - we can simply wrap that class with and outer envelope that we need to include the at least once delivery metadata.

Now, on the other side of the interaction here we have to make sure that our actor receiving the messages knows what to do.

public class MyRecipientActor : ReceiveActor
{
    public MyRecipientActor()
    {
        Receive<ReliableDeliveryEnvelope<Write>>(write =>
        {
            Console.WriteLine("Received message {0} [id: {1}] from {2} - accept?",
			 write.Message.Content, write.MessageId, Sender);
            var response = Console.ReadLine()?.ToLowerInvariant();
            if (!string.IsNullOrEmpty(response) 
				&& (response.Equals("yes") 
				|| response.Equals("y")))
            {
                // confirm delivery only if the user explicitly agrees
                Sender.Tell(new ReliableDeliveryAck(write.MessageId));
                Console.WriteLine("Confirmed delivery of message ID {0}", 
				write.MessageId);
            }
            else
            {
                Console.WriteLine("Did not confirm delivery of message ID {0}", 
				write.MessageId);
            }
        });
    }
}

If you’re receiving messages from an AtLeastOnceDeliveryActor, you have to be aware that those messages must be responded to with an explicit acknowledgement otherwise they will be redelivered forever.

That’s why our MyRecipientActor will extract the MessageId from the ReliableDeliveryEnvelope<T> and send a ReliableDeliveryAck(long messageId) back to the original sender.

When the original MyAtLeastOnceDeliveryActor receives the ReliableDeliveryAck, we call another built-in method on the AtLeastOnceDeliveryActor base class that will cancel any additional attempts to deliver the message, because it’s already been successfully processed.

Command<ReliableDeliveryAck>(ack =>
{
    ConfirmDelivery(ack.MessageId);
});

The AtLeastOnceDeliveryActor.ConfirmDelivery method marks the message as delivered and removes it from the AtLeastOnceDeliveryActor’s internal state, so we won’t attempt to deliver it again going forward.

And that’s the basics for how at least once message delivery is implemented, on the happy path.

When Deliveries Fail

The unhappy path occurs when one of the following three scenarios happen: 1. The recipient actor never receives the original message, because it never arrived - often due to a network failure; 2. The recipient actor received the original message, but died before it was able to process it; or 3. The recpient actor received and processed the original message, but the ACK message back to the sender was lost or delayed in transit.

Scenario 3 is special and we’ll get to that at the end, but let’s focus on the first two scenarios for now: failing to process a message.

Akka.NET AtLeastOnceDeliveryActor failure case

In this scenario, the recipient actor is never able to process the message - whether the actor never received the message or received it and failed to process it is irrelevant to the sender. The outcome is the same.

The sender, having not received an acknowledgment or a confirmation message back from the recipient, will automatically retry redelivering a message every 5 seconds (or whatever interval you configure with the akka.persistence.at-least-once-delivery.redeliver-interval HOCON setting.)

The at least once delivery actor will continue attempting to redeliver the message at that interval, indefinitely, until it is finally acknowledged. Here’s a brief demonstration of that behavior in-action using the AtLeastOnceDelivery sample

Live demonstration of AtLeastOnceDeliveryActor attempting to redeliver a message

In the recording above you can see that I chose not to acknowledge message ID 2, and that it was redelivered to the recipient actor again before message ID 3 arrived. You can try playing around with this yourself if you follow the link above.

Duplicate Messages

One pitfall of at least once message delivery is the possibility of duplicates, which is exactly what happens in scenario 3 above: when the recipient actor successfully processes the message but doesn’t confirm with the AtLeastOnceDeliveryActor before the redelivery deadline expires.

Akka.NET AtLeastOnceDeliveryActor duplicate message case

So, what does this mean? It means that in an at least once delivery scenario, the recipient actor needs to be programmed to handle duplicates. We’re going to cover how to do that in our next post!

When the AtLeastOnceDeliveryActor Crashes

One more important case needs to be covered in our exploration of the AtLeastOnceDeliveryActor - what happens if the AtLeastOnceDeliveryActor itself crashes or if the machine its on is killed? What happens then?

The AtLeastOnceDeliveryActor inherits from the PersistentActor base class, thus it has all of the same state durability and recovery capabilities that a normal PersistentActor does.

In the MyAtLeastOnceDeliveryActor we do a couple of important things:

// recover the most recent at least once delivery state
Recover<SnapshotOffer>(
	offer => offer.Snapshot is Akka.Persistence.AtLeastOnceDeliverySnapshot, 
	offer =>
{
    var snapshot = offer.Snapshot as Akka.Persistence.AtLeastOnceDeliverySnapshot;
    SetDeliverySnapshot(snapshot);
});

First, when the actor initially starts it will recover its entire delivery state from a SnapshotOffer and it will then use the built-in AtLeastOnceDeliveryActor.SetDeliverySnapshot method to re-hydrate its state from whatever was previously stored in the database.

Command<Write>(write =>
{
    Deliver(_targetActor.Path, messageId => 
        new ReliableDeliveryEnvelope<Write>(write, messageId));

    // save the full state of the at least once delivery actor
    // so we don't lose any messages upon crash
    SaveSnapshot(GetDeliverySnapshot());
});

Next, whenever there’s a command to deliver a new message we create a new snapshot of this state and save that, using the AtLeastOnceDeliveryActor.GetDeliverySnapshot method to grab a serializable copy of our current delivery state.

In the full AtLeastOnceDelivery sample we also have some code that causes the MyAtLeastOnceDeliveryActor to also periodically save new snapshots so it can cleanup confirmed messages, but we omitted that for brevity here.

So there you have it - in a nutshell we’ve covered the nuts and bolts of how to use AtLeastOnceDeliveryActor to setup guaranteed delivery of messages inside your Akka.NET applications. Post a comment if you have questions!

If you liked learning about at least once message delivery, considering taking our Akka.Remote Virtual Training - where we cover how to use it in combination with designing distributed systems built on top of Akka.NET.

UPDATE: We just published the sequel to this post, “Why You Should Try to Avoid Exactly Once Message Delivery.” Give that a read!

If you liked this post, you can share it with your followers or follow us on Twitter!
Written by Aaron Stannard on March 11, 2016

 

Upcoming Petabridge Live Akka.NET Webinar Trainings

Get up to speed on the bleeding edge of large-scale .NET development with the Petabridge team. Each training is done remotely via webinar, lasts four hours, and will save you weeks of trial and error.

Get to the cutting edge with Akka.NET

Learn production best practices, operations and deployment approaches for using Akka.NET.