Technical Overview of Akka.Cluster.Sharding in Akka.NET

How Akka.Cluster.Sharding Allocates Shards, Rebalances, and More

We have an updated guide to Akka.Cluster.Sharding that incorporates newer APIs and practices. Please see “Distributing State Reliably with Akka.Cluster.Sharding” instead.

In our previous post about using Akka.Cluster.Sharding we looked at the module from an end-user’s perspective. Today we’ll provide a little more insight into how this plugin works internally.

Cluster sharding depends on several types of actors:

Coordinator - one per entity type for an entire cluster.
Shard region - one per entity type on each cluster node where sharding is enabled.
Shard - there can be many on each shard region, and they can move between shard regions located on different nodes.
Entity - actors defined by the end-user. There can be many on each shard, but they are always bound to a specific shard.

All of them can be visualized using the diagram below:

Conceptual image of cluster shard internal actors located across the cluster nodes

If you look at the actor paths of the entities you’ve created, you’ll see that they reflect the structure of that hierarchy. They follow the pattern /user/sharding/<typeName>/<shardId>/<entityId>. Given that, it’s easy to infer that /user/sharding/<typeName> is the path to a shard region, while the subsequent path segments identify the shard actor and the entity actor.
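To make the path pattern concrete, here is a small language-neutral sketch in Python. It is not Akka.NET code: the names EntityPath, parse_entity_path, and shard_region_path are hypothetical, and it only handles the local /user/sharding/... portion of a path exactly as written above, ignoring any address prefix or encoding a real actor path may carry.

```python
from typing import NamedTuple


class EntityPath(NamedTuple):
    """The three variable segments of /user/sharding/<typeName>/<shardId>/<entityId>."""
    type_name: str
    shard_id: str
    entity_id: str


def parse_entity_path(path: str) -> EntityPath:
    """Split an entity path into its type name, shard id, and entity id."""
    parts = path.strip("/").split("/")
    if len(parts) != 5 or parts[0] != "user" or parts[1] != "sharding":
        raise ValueError(f"not a sharded entity path: {path!r}")
    return EntityPath(parts[2], parts[3], parts[4])


def shard_region_path(path: str) -> str:
    """Return the shard region path that an entity path sits under."""
    parsed = parse_entity_path(path)
    return f"/user/sharding/{parsed.type_name}"
```

For example, a made-up path such as /user/sharding/customer/7/customer-42 would split into the entity type "customer", the shard "7", and the entity "customer-42", with /user/sharding/customer as its shard region.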

Distributing State Evenly and Automatically with Self-Managing Actors

We have an updated guide to Akka.Cluster.Sharding that incorporates newer APIs and practices. Please see “Distributing State Reliably with Akka.Cluster.Sharding” instead.

In this post, we’ll discuss one of the Akka.NET plugins, Akka.Cluster.Sharding, and how it gives us easier, higher-level abstractions for working with actors (in this case also referred to as entities). Cluster.Sharding gives us the following:

By using logical keys (in the form of ShardId/EntityId pairs), it’s able to route our messages to the right entity without the need to explicitly determine which node in the cluster it lives on.
It automatically manages the process of actor creation, so you don’t have to do it explicitly. When a message is addressed to an entity that doesn’t exist yet, an actor for that entity will be created automatically.
It’s able to rebalance actors across the cluster as it grows or shrinks, to ensure the cluster is used optimally.
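The first two traits can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not the Akka.NET API: Envelope, extract_shard_id, ToyShardRegion, and the NUMBER_OF_SHARDS value are all hypothetical, the shard id is derived by hashing the entity id (one common choice, assumed here), and a plain list stands in for an entity actor's state.

```python
import zlib
from dataclasses import dataclass

NUMBER_OF_SHARDS = 10  # hypothetical value chosen for this sketch


@dataclass(frozen=True)
class Envelope:
    """A message addressed by logical key rather than by node or actor reference."""
    entity_id: str
    payload: str


def extract_shard_id(entity_id: str) -> str:
    """Derive a shard id from the entity id with a stable hash (assumed scheme)."""
    return str(zlib.crc32(entity_id.encode("utf-8")) % NUMBER_OF_SHARDS)


class ToyShardRegion:
    """Routes envelopes by (shard id, entity id) and creates entities on demand."""

    def __init__(self) -> None:
        # (shard_id, entity_id) -> received payloads; a stand-in for an entity actor
        self.entities: dict[tuple[str, str], list[str]] = {}

    def tell(self, envelope: Envelope) -> tuple[str, str]:
        key = (extract_shard_id(envelope.entity_id), envelope.entity_id)
        # The sender never creates the entity; the first message does.
        mailbox = self.entities.setdefault(key, [])
        mailbox.append(envelope.payload)
        return key
```

The sender only ever supplies an entity id. Two messages carrying the same id land on the same entity, and the entity comes into existence when the first of them arrives.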

Given all of those traits, let’s see how to utilize cluster sharding in a standard Akka.NET application.

Decomposing Complex Domains into Understandable Actor Hierarchies

In the first post in this series, we discussed how the correct place to begin thinking about an Akka.NET application is actually with your data flows and organizing those into reusable “protocols”. Once that’s done, then it’s time to start slotting actors into some of the interaction points inside the protocol.

But what happens when certain types of interactions are complicated and can’t easily be expressed inside one actor? Or what happens if you need to accumulate state for each individual instance of the protocol?

Think of a protocol like a class. A protocol is a logical unit of encapsulation that expresses some defined behaviors, inputs, and outputs. And just like classes, protocols can be composed - one class can have members that are of another type of class. Small protocols can be combined with other protocols to build large, system-defining behaviors. This is generally how stream processing architectures are actually designed at scale.

An instance of a class is called an object in OOP. In protocol-driven design a “protocol instance” is an instance of a protocol, just like how an object is an instance of a class.

Actor Hierarchies and Protocols

A protocol consists of multiple different interactions and can have totally different flows depending on the state of each particular protocol instance. Your Akka.NET actor system can theoretically run hundreds of thousands of concurrent protocol instances at the same time. It’s the goal of a well-designed actor hierarchy to make it performant and easy to manage.

So the first rule of thumb when it comes to designing an actor hierarchy is once again: separate your concerns.

Brute actor hierarchy

It’s pointless to attempt to design an entire end-to-end actor system before you’ve written any code, because you don’t know what you don’t know yet....

The Business Case and Power Behind Akka.NET and Akka.Cluster

On Monday, September 26th at 8:00am PDT Petabridge will be hosting a free webinar entitled “Introduction to Distributed Systems with Akka.NET with Akka.Cluster” - the goal of which is to help educate developers, architects, and technology executives on how these technologies can be used to build highly available, distributed systems.

We’ve never hosted a live webinar on the subject before and it’s a regularly requested topic - and if this one is popular we’ll definitely do another in the future on Akka.Cluster and other areas such as Akka.Persistence, Akka.NET DevOps, and so forth.

What We’ll Cover

This presentation is 90 minutes long and will be focused on the architectural concepts and possibilities that Akka.Cluster creates. Specifically:

Why businesses are adopting Akka.Cluster and why this technology is desperately needed;
How distributed systems are designed using Akka.NET actors and message-based systems;
What Akka.Cluster does to make this easy; and
How Akka.Cluster synergizes with modern deployment environments, such as Windows Azure Resource Manager, Azure Service Fabric, Kubernetes, Apache Mesos, and so forth.

Seating is limited. We still have some room available, so please register now.

We’ll be taking questions from the audience at the end of the presentation too!

N.B. the material we’re presenting in this webinar does have some overlap with our Akka.Cluster training, but is mostly new material. If you’ve attended our Akka.Cluster training in the past you will still get value out of this talk.

In Case You Can’t Attend

If you aren’t able to attend due to timezones or conflicts, go ahead and register anyway - we’ll send you a recording later the same day.

If you have any questions or comments, please leave them...

Large-Scale, Real-time Complex Event Processing with Akka.NET

Our last customer case study “Akka.NET Goes to Wall Street” remains one of the most popular articles on Petabridge, and today I’m pleased to share with you a new case study written by Kim-Lisa Gad and the DigiOutsource team from sunny Cape Town, South Africa.

The DigiOutsource team, led by Jean-Pierré Vermeulen, developed an extremely high-speed complex event processing and real-time customer analysis system on top of Akka.NET from conception to production within 5 months, and that system has gone on to increase overall revenue by 35%.

What follows is their story!

Writing Code That Ages Well

Since releasing Akka.NET 1.1 I’ve been spending more time sharpening the saw here at Petabridge: combing over the ways we spend our time and money and quantifying the returns they provide to us and to our customers. As it turns out, quantifying this is rather difficult for reasons that are all too common in the business world: data silos.

Our “business output” is measured and recorded in a number of disparate, disconnected, off-the-shelf systems. For example: We record our sales through Stripe and Quickbooks Online, but we never correlate them with the end-user interactions with Akka.NET Bootcamp, our YouTube videos, or our blog.

We want a complete picture of what really led to a sale or to a successful deployment of Akka.NET, because that helps tell us which investments of our resources were good ones. So in order to do this I started designing a business intelligence application called “Brute” that perpetually streams information from all of these sources into a consolidated view. The first version of it is extremely simple, but we have plans to expand what it does and the number of systems it can connect to.

Designing an Akka.NET Application

I decided that “Brute” presented a good opportunity for Petabridge to dogfood Akka.NET, especially some of the new modules such as Akka.Streams and Akka.Cluster.Sharding. Thus I’ve spent the past few weeks in the design process writing specifications, models, and documentation.

Protocol-Driven Design

Here’s the catch with designing an Akka.NET application, or really, any actor-based application: your actors aren’t the correct place to begin the design.

Instead, you always want to start the design of any Akka.NET application with the flow of events and information that go through it.

Petabridge customer event flow

Petabridge Now Offers Certified Builds, Developer Support, and Production Support for Akka.NET

More and more companies are choosing to use Akka.NET each day to fulfill mission-critical workloads in tons of different business domains: finance, health care, fleet and vehicle tracking, energy, and more.

Beginning today Petabridge is pleased to offer these companies Akka.NET certified builds, developer support, and production support to give them additional tools for getting the job done with Akka.NET in production.

Certified Akka.NET Build

Akka.NET certified builds and support plans

The idea behind the certified build of Akka.NET is to make it easier to support; each Petabridge support subscription uses a certified build of Akka.NET which is Authenticode signed and (optionally) strong named.

Our certified build is also subjected to additional testing and certification beyond the open source implementation before it’s released (specifically, it’s integrated and deployed into supported production environments and run under continuous load prior to being certified).

Right now the certified build is limited to just the core Akka.NET modules but we will be adding support for specific plug-ins as-needed.

Developer Support

The first flavor of support Petabridge offers is “developer support” - this support plan is designed to assist Petabridge customers with design and developer activities to help head off issues before you go into production.

Each support customer can expect to have their design and development questions answered and issues resolved by an Akka.NET expert promptly.

Production Support

For customers who are already live in production, Petabridge offers “production support” to troubleshoot issues that occur in production in real-time. We guarantee fast turn-around times and the ability to get a live human being on the phone promptly (depending on your Service Level Agreement) if an issue occurs.

All of these services should help give any organization looking to leverage Akka.NET in production...

Consistent Hash Distributions and Clustered Routers

This blog post is out of date and the practices here are no longer recommended for state distribution with Akka.NET. Please see “Distributing State Reliably with Akka.Cluster.Sharding” instead.

One of the most frequently asked questions about Akka.Cluster I receive is “how do I reliably distribute state in an Akka.NET Cluster?”

Akka.Cluster is primarily for building highly available, soft real-time applications in server-side environments - and one of the concepts that is essential to delivering “soft real-time” is statefulness.

Stateful applications are becoming more and more common because they execute types of work that are infeasible and impractical with stateless CRUD applications, but they also introduce new types of challenges such as:

How do I evenly distribute state across my cluster? No “hotspots.”
How do I find the state I need within my cluster?
What happens to my state if a stateful node goes down?
How do I guarantee that my state is modified safely? Consistent histories for state, in other words.

We can address, at a high level, all of these questions using tools built into Akka.Cluster: consistent hash routers and Cluster.Sharding.
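To show the idea behind consistent hashing, here is a minimal hash ring in Python. It is an illustrative sketch only, not how Akka.NET's consistent hash routers are implemented: the HashRing class, the MD5-based hash, and the default of 100 virtual points per node are assumptions made for this example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit hash of a string (MD5 chosen only for determinism)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Maps keys to nodes so that removing a node only moves that node's keys."""

    def __init__(self, nodes, virtual_nodes: int = 100) -> None:
        self.virtual_nodes = virtual_nodes
        self._ring = []  # sorted list of (point_hash, node)
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Several points per node smooth out the distribution and reduce hotspots.
        for i in range(self.virtual_nodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, key: str) -> str:
        if not self._ring:
            raise LookupError("the ring has no nodes")
        # First point at or after the key's hash, wrapping around at the end.
        idx = bisect.bisect_left(self._ring, (_hash(key),)) % len(self._ring)
        return self._ring[idx][1]
```

Any process holding the same node list computes the same owner for a key, which covers finding state. When a node leaves, only the keys it owned are remapped and every other key stays where it was. The ring decides where state should live; it does nothing to recover state that was held in memory on a node that went down.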

High Availability for Enterprise .NET

As of a few moments ago, we released Akka.NET 1.1 - the biggest feature release we’ve done for Akka.NET since releasing 1.0 last year! We have other major Akka.NET feature releases planned for later this year, but this is the first and one of the most long-awaited ones.

Akka.Cluster is released from beta

The biggest change in this release is the first stable release of Akka.Cluster, which has been in various beta stages since August 2014. During that time it’s been used as a beta package by hundreds of users who have given the Akka.NET project a large number of bug reports and a great deal of feedback.

Since that time we’ve covered Akka.Cluster in a huge range of multi-node tests, designed to ensure the cluster behaves correctly under a variety of network conditions (including some rather hostile ones) and it’s performed well.

We’ve also replaced the underlying Helios 1.4.1 transport with a brand new Helios 2.1.1 transport, which delivers up to 5x the throughput with a tremendously lower memory footprint than the previous versions.

Release of the Multi-node TestKit, Multi-node Test Runner

One tool that has proven utterly indispensable in the development of Akka.Cluster, Akka.Remote, Akka.Cluster.Tools, and Akka.Cluster.Sharding is the Akka.Remote.TestKit package - commonly referred to as the “multi-node testkit.”

This library is an extension of the Akka.TestKit that developers use for unit testing simple actors, and it offers capabilities designed for facilitating distributed unit tests for Akka.NET ActorSystems that are using Akka.Remote, Akka.Cluster, or any of the other high availability (HA) modules.

The Akka.Remote.TestKit is designed to allow you to do the following:

Run a unit test across multiple processes simultaneously, simulating how servers or virtual machines would behave in the real-world;
Offers a dead-simple Domain Specific Language (DSL) that allows...

Feature Releases for Akka.NET in 2016

It’s been a while since we’ve published an official roadmap update for Akka.NET. We are still on track to achieve the goals of the previous roadmap, but with a few minor changes that I will explain here.

Akka.NET 1.1 - Akka.Cluster Release to Market; Akka.Streams Beta

We’ve publicly committed to releasing Akka.NET 1.1 on June 14th, 2016:

Coming soon to @AkkaDotNET - Akka.Cluster stable release. June 14 target ship date, along with the to-do list :p pic.twitter.com/f4UKOASowR
— Aaron Stannard (@Aaronontheweb) May 23, 2016

This release has the following goals:

Officially releasing Akka.Cluster to market, signifying “it is ready for full-blown production use;”
Deploying the Helios 2.0 transport to production, which is significantly faster, more memory efficient, and more reliable than the current Helios 1.4.1-based transport;
Releasing the MultinodeTestRunner and the Akka.Remote.TestKit, used for testing distributed systems built with Akka.NET; and
Releasing the very first beta of the new Akka.Streams module, which you can read more about here.

Akka.Cluster has been available as a beta package for nearly two years and has had thousands of users. It is currently serving production workloads both on Linux and on Windows for a variety of different types of customers. During this period we have collected lots of bug reports, feedback, and data that has been used to help improve its reliability and performance.

This will be a tremendous opportunity for Akka.NET users to build high availability systems of all shapes and sizes on any cloud they wish.

Akka.NET 1.5 - TLS, New Serializer, Faster Transports

The next major release we have planned following Akka.NET 1.1 is Akka.NET 1.5. This release will introduce some breaking changes at the dependency level.

We are making the following two important changes:

...
