One of the most powerful ways to use Akka.NET in production is to create clustered applications that can scale on demand. You can literally deploy code somewhere other than the immediate, local process making the call.
This is accomplished with the Akka.Cluster module, which adds clustering capabilities to our Akka.NET applications.
When Do I Need Clustering?
Clustering is something you need to use in high availability scenarios, or when you need elastic scalability in your systems.
Here are some examples of high availability scenarios that come up often in the real world. These are projects that often use clustering:
- Marketing Automation
- Multiplayer Games
- Device Tracking / Internet of Things
- Alerting & Monitoring Systems
- Recommendation Engines
- Dynamic Pricing
- …and many more!
As you can see, clustering has a wide range of use cases, and it’s also the way to create a scalable microservices architecture in Akka.NET. To put it bluntly, you should use clustering in any scenario where you have:
- A sizable load of traffic;
- With non-trivial work that has to be performed;
- And an expectation of fast response times;
- And frequent mutation in state.
How Do I Form a Cluster of Services?
After you’ve enabled Akka.Cluster inside your Akka.NET application, it’s easy to make your locally developed application cluster over the network.
Check out the video and slides I’ve added below, which give you a thorough introduction to what clustering is, how to enable it, and how to use it in your applications. It’s 21:33 long, but it explains the core concepts of Akka.Cluster and actually shows you how to put it to work!
The code used in the video examples can be found in two places:
- Samples.Cluster.Simple example within the Akka.NET project repo.
- Cluster.WebCrawler example in the Petabridge official Akka.NET samples repo.
Master Akka.NET and Akka.Cluster
If you think this is cool and want to know how to do this at production scale (build scalable microservices, do message/cluster versioning, and upgrade your cluster in place), then you should take one of our Akka.NET trainings:
- Beginning Akka.NET: Akka.NET Bootcamp - our popular free, self-directed Akka.NET training that over 1,500 developers have done! This is a prerequisite for all of our advanced courses.
- Advanced Akka.Cluster: Petabridge Akka.Cluster Virtual Training - we offer live virtual training to help you truly master Akka.Cluster and put its power to use in production. (We also offer courses to master Akka.NET Design Patterns and Akka.Remote.)
Hello and welcome to the short video introduction to the Akka.Cluster module. My name is Andrew Skotzko and I’m the co-founder of a company called Petabridge. And I’m going to be taking you through the Akka.Cluster introduction. What is the module? What does it do for your applications? And how can it make your life better as a developer? There’s not a whole lot of documentation on this module as of right now (that is, June 2015), but we’re gonna be contributing a lot more docs back over the next few months, and we want to start off that effort by giving you a brief video
[0:30] tutorial on the module and how it works. So, let’s get into it. Akka is a framework for building powerful concurrent and distributed applications easily and more quickly; frankly, it just makes your life as a distributed systems developer a lot better. In this video we’re gonna be focusing specifically on the distributed part of Akka.NET: that is, how do you use Akka.Cluster to build applications that can span many nodes across the network? How can you build microservices? How can you create an application
[1:00] that can elastically scale up or down as needed, on demand? I’m the co-founder of Petabridge. We’re the company that supports the Akka.NET project and has the privilege of working with the global contributor community. Petabridge also provides support, training, consulting, and tooling for Akka.NET and for distributed systems development in general in .NET. So without any further ado, let’s get into it; let’s talk about Akka.Cluster. We’re gonna be starting with some slides to illustrate the concepts and help you get a mental framework for what is happening,
[1:30] and then I’m actually gonna be mixing in some code samples and demos towards the end of this, so you can actually see things in action. So, here’s what we’re going to cover in this video. We’re going to start off with answering the very simple question of, what is a cluster? Then we’re gonna move on to ask, what does clustering give you? Why should you even care? And then, how do you enable it? How do you turn this on? And then finally, how do you actually use clustering in a real application? So let’s get started. What is a cluster? We’re gonna start off by visualizing what a cluster
[2:00] looks like, and that would be this. A cluster is a group of computers working together to accomplish a task. Clustering is also the way you do a microservices architecture in Akka.NET. We’re gonna be getting into that much more toward the end of the video, but this is a very simple way of illustrating what a cluster is. So now, let’s talk about some of the features of a cluster. A cluster, first of all, is fault-tolerant: it can recover from failures gracefully. Second of all, it is elastic. This is probably the most commonly thought-of aspect of
[2:30] a cluster: that is, … elastically add new nodes to the cluster to grow in size and take on more work, or it can do the opposite and shrink in size when it doesn’t need all of the machines that it has, to save you money. Also, clusters are decentralized; that is, there is no one machine in the cluster, there are many machines. There are no client or server semantics here. There are just servers talking to other servers over open sockets. Everything is done through persistent socket connections. So if you pull all of these
[3:00] things together, what you really get is that there’s no single point of failure in the cluster, and similarly, if you’re doing a lot of processing, there is no single bottleneck in the cluster. Akka.Cluster is a module that enables these types of features to exist in your application, and all it takes is installing it and architecting your app in a certain way, which we’re gonna cover here. So let’s talk briefly about what some of the benefits of clustering … We sort of touched on them already, but let’s just recap those really quickly. First and foremost I would say
[3:30] it’s elasticity, the idea that you can have very, very scalable systems using clustering. Secondly, they recover from failure really well; these are hard systems to break. And finally, clustering is a very easy way to get all the benefits of a microservices architecture. It’s much, much easier to create microservices with clustering than starting with some monolithic app and then trying to cut it up. It’s easy to develop. It’s easy to develop it locally and then you can deploy your application out across a cluster,
[4:00] whether that’s on the cloud or on your own hardware. And the best part: no code changes. Yes, really. The way that happens is through your configuration, through HOCON, and this is a bit of what it looks like. One quick note, just while we’re on the topic of configuration, is that all your nodes in the Akka.Cluster have to have the same ActorSystem name. You can cross ActorSystem name boundaries with Akka.Remote, but that module is a bit beyond scope for this video. Now that
[4:30] we know what clustering is, when do we use it? Here are some example use cases that fit clustering really nicely. By no means is this all the use cases for clustering, but these are some of the common ones I have been seeing in production. Whether you want to use it to build scalable analytics systems or marketing automation systems, if you want to build streaming multiplayer games, or if you want to do device or Internet of Things tracking, you’re gonna need clustering. Also if you want scalable monitoring systems, or recommendation
[5:00] and machine learning engines, or something like, perhaps, dynamic pricing for concert tickets. These are all really great use cases for clustering. Now I’m going to walk you through how a cluster is actually formed. Everything you’re about to see is powered by what is called gossip. I don’t really have time in this video to go deep on gossip, but basically gossip is a lot of back-and-forth communication between all the nodes in your cluster, and it’s used by the cluster to figure out who all the nodes in the cluster are and also to constantly monitor the status
[5:30] of those members of the cluster. This is the initial state of our cluster: we have five nodes, and right now they all exist independently and they don’t know about each other. We have A and B, and then C, D, and E. As you can see, there’s a little circle around A and B and they’re labeled seed nodes. What is that? A seed node is an initial, well-known contact point that other nodes can use to join the cluster. You can configure seed nodes yourself, or we have something to help you with this, and that is called Lighthouse.
[6:00] Lighthouse is a free, dedicated seed node tool for Akka.NET clustering. It only has to be upgraded when Akka.Cluster itself is upgraded, and it’s not actually deployed as part of your app, so it should never have to be redeployed when you make code changes, but it will need to be upgraded as Akka.Cluster gets upgraded. Lighthouse is a free tool developed by Petabridge; click the link on your screen to get the code on GitHub. Okay, back to how a cluster forms. As we said, we have five nodes: A, B, C, D, and E.
[6:30] A and B, being seed nodes, already know about each other, so they start the cluster themselves: A contacts B, B contacts A, and between them they very quickly form a cluster. Now we have three other nodes, C, D, and E, who are not yet part of the cluster, so A and B know about each other, and nobody else knows about them and they don’t know about anybody else. So what happens is that C and D, on the right, only know about node A as their seed node, and node E only knows about B as a seed node.
[7:00] So each of these nodes will get in touch with the seed nodes they know about. So everybody attempts to join the cluster through the seed nodes: C and D try to join the cluster by contacting A, and node E tries to join the cluster by contacting node B. Now C, D, and E have contacted their seed nodes and started to form a cluster with them. A is elected leader of the cluster, and A, through its gossip with B, knows about a cluster of A, B, C, and D. It doesn’t yet know about E;
[7:30] this is when the quote-unquote ring begins to form. As the gossip spreads through the cluster, it informs the other members of the newly formed cluster about each other: E is told about C and D, and A learns about E, and so on and so forth. Communication is also established between non-seed nodes. So now node C and node D and node E all begin to talk to each other directly, and then they can actually communicate even if
[8:00] a seed node dies. So once your cluster is up and running, if the seed nodes die, that actually will not stop the processing and the work that takes place within your cluster. The other nodes in the cluster already know about each other and they no longer need the seed nodes to work with each other. The seed nodes just serve the purpose of establishing the cluster in the first place. We now have a fully formed cluster up and running. All five nodes know about everybody else; they even know about nodes they did not have any knowledge of before and
[8:30] who joined the cluster through a totally different seed node than they did. So for example, here C now knows about E even though it did not know about E before, and C and E have different seed nodes. Okay, now we have a cluster. Great, at least conceptually, but how do you actually enable clustering in your app? Let’s go into that now. Here are the six steps you need to take to turn on Akka.Cluster inside your app. First of all, you need to install the NuGet
[9:00] package, then you need to configure the ClusterActorRefProvider. Actor ref providers are like actor factories in Akka.NET, and you need to turn on clustering capabilities so that the actors you get back are cluster capable. Steps 3 and 4: you need to enable and configure at least one network transport within Akka.Remote. If you’re not familiar with it, Akka.Remote is another advanced module, and it’s actually the module that powers all remoting and all network communications within Akka.NET, so it is a hard dependency
[9:30] of clustering. It’s also very well worth your time to learn if this is an area that you’re actually interested in. Okay, step 5: we need to enable at least one of the seed nodes like we’ve discussed, and finally we have to start up our ActorSystem. Okay, I’m now gonna walk you through those steps one by one within the context of the Samples.Cluster.Simple sample, which is a simple clustering demo built into the Akka.NET project itself. The demo also illustrates for you what cluster gossip messages look like. Let’s go to the code.
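As a rough sketch of where steps 2 through 5 end up, the HOCON below collects the settings named in this walkthrough: the ClusterActorRefProvider, a Helios TCP transport bound to 127.0.0.1 on port 0, and a list of seed nodes. The ActorSystem name `MyClusterSystem` and the second seed port are placeholders assumed for illustration, not values taken from the sample, and later Akka.NET releases replaced the Helios transport with a different one, so treat the key names as tied to the version shown in the video.

```hocon
akka {
  actor {
    # Step 2: hand out cluster-capable actor references.
    provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
  }

  remote {
    # Log remoting lifecycle events so cluster formation is visible.
    log-remote-lifecycle-events = DEBUG

    # Steps 3 and 4: enable and configure one Akka.Remote transport.
    helios.tcp {
      hostname = "127.0.0.1"
      # Port 0 asks the operating system for any available port.
      port = 0
    }
  }

  cluster {
    # Step 5: well-known contact points used to join the cluster.
    # "MyClusterSystem" is a hypothetical ActorSystem name, and the
    # second port is an assumption made for this sketch.
    seed-nodes = [
      "akka.tcp://MyClusterSystem@127.0.0.1:2551",
      "akka.tcp://MyClusterSystem@127.0.0.1:2552"
    ]
  }
}
```

Step 6 then amounts to starting an ActorSystem with this configuration, using the same ActorSystem name that appears in the seed node addresses. A node that is itself a seed node would bind to its fixed seed port rather than port 0.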
Okay, now I’m actually gonna walk you through how to enable Akka.Cluster inside one of your projects, and to do this, I’m going to use a sample built into the Akka.NET project. It’s called Samples.Cluster.Simple and it lives over in the examples folder. The first step is to install the NuGet package, so we’re gonna type
Install-Package Akka.Cluster -pre
[10:30] Our next step is to configure the ClusterActorRefProvider, so we’re gonna add an actor section to our Akka … and then we’re gonna type in provider = Akka.Cluster.ClusterActorRefProvider, Akka.Cluster. Our next step is to enable at least one Akka.Remote transport and configure the address for the transport, so we add a remote section to our HOCON config and we’re gonna add
[11:00] log-remote-lifecycle-events = DEBUG too, so we see all the events. Then we need to configure Helios, which is the actual network transport that powers everything in Akka.Remote and therefore Akka.Cluster. We’re gonna configure the hostname to be 127.0.0.1 and then we’re gonna configure the port to be 0. If you’re not familiar with what port 0 means, it’s basically a special flag that asks the operating system to assign us a randomly available port, so we don’t actually know the final port number we’re gonna end up with, but that’s okay because of the seed nodes,
[11:30] which we’re going to talk about now. So the final step is to configure the seed nodes, so we add a cluster section to our HOCON config and then we add the seed nodes, which is just an array of strings. So we’re gonna add the seed nodes, which are going to be the ActorSystems we start at akka.tcp://[email protected]:2551, and port 2252. Okay, now that we have installed the package
[12:00] and configured our ClusterActorRefProvider, enabled our Akka.Remote TCP transport, and added the seed nodes, we’re ready to go, so we can start up the system. Okay, we can clearly see something happening here, so I’m gonna take a minute and just walk you through what these messages are. So up at the top here we can see that remoting has started and that the cluster is starting to form. All the warning messages, don’t worry about them; this is one of the ActorSystems trying
[12:30] to contact the other seed node, which is not yet up and running, but it’s gonna back off and retry. So then we can see that there are joining messages, our 2 seed nodes are up, and then our dynamic node is up, getting a welcome message from the leader, and then from here on out we actually have a fully formed cluster. It doesn’t look like much, but that’s what this is. The rest of these messages are actually just gossip messages, if you’ve been wondering what gossip looks like. This is it… it’s messages just bouncing around the nodes in our cluster, so it’s up and running. This is great, but
[13:00] what makes this all possible? The answer is actually very simple: it’s location transparency. The idea of location transparency is actually really well illustrated by a cellphone. If I wanna call you on your cellphone, I don’t actually need to know where you are. I just have to know your phone number and then I can call you, and whether you’re in Los Angeles or New York or London or somewhere else, it’s not my concern. It’s the job of the cellphone network to route the voice packets from me to you and back
[13:30] regardless of where either of us is. That’s exactly what location transparency means. This is the same thing that is happening within the cluster itself: the location of the actor within the cluster doesn’t matter. You don’t care if the actor is actually on node A or node E or F; it doesn’t matter. This property of location transparency is essential because it’s what allows you to write your apps locally on one machine and scale them out across hundreds or thousands without having to worry about
[14:00] it. Just let the cluster do it for you. Without location transparency you could not do that. Now, how do you actually use clustering in an application? We’re gonna go into our favorite use case for Akka.Cluster, and that’s microservices. Microservices, if you’re not familiar with it, is the idea of having multiple independent services, all on separate, physically isolated machines, collaborating as a unit. What happens is that we run multiple Akka.NET applications within a cluster.
[14:30] Each application is on a node and has a job to fulfill, and they all collaborate within the cluster. This is how we create scalable microservices within Akka.NET, and what powers all this is roles and cluster-aware routers. I’m not going to go into routers in this video, so you should check the docs, but I’m gonna go into roles now. What is a role? A role is nothing more than the capability of a node; you can think of it as the job that node has to fulfill,
[15:00] and a given node in your cluster can have more than one role. A role is declared in configuration and it’s literally nothing more than a string; it’s a name, a named responsibility. What you can do with it is you can have cluster-aware routers, and they can target other nodes in the cluster based on their role. What that could mean is you have a router deploy actors onto another node in the cluster based on its role, so you could say I want to remotely deploy my actors
[15:30] onto nodes that are in the role tracker, for example, or if you had a group router, your router can say I will only send my messages to actors at the path /user/foo and only on nodes that are in this particular role, again tracker for example. I’m going to introduce you now to one of the Petabridge sample applications that shows off clustering roles, cluster-aware routers, and a lot of other important concepts that we don’t have time to cover in this video. This is our
[16:00] web crawler sample. The web crawler is exactly that: it’s an application we built, it’s a … web crawler built using microservices in Akka.NET. Basically you get to put in a URL and then the cluster will go crawl the URL. You can submit many, many URLs at once until you run out of CPU or network bandwidth, and you can just keep doing this, and you can add more crawlers on demand to the crawl job in order to scale up. So there’s a few roles at play here: those are web,
[16:30] tracker, and crawler. First we have the web role in the top left. This is the front-end user interface where you input the domain you want to crawl, so you can type in HTTP CNN.com. That submits to the tracker role in the bottom left. The job of the tracker role is to keep track of what crawl jobs are in progress and to make sure that we don’t have duplicate crawl jobs going for the same domain, so we wouldn’t want to be having two entirely separate crawl jobs at CNN going at the same time; that would be a waste of resources.
[17:00] So then we have the actual crawler roles in the bottom right. These are the scalable services that actually do the crawling, the downloading, the parsing of whatever websites you put in. So the tracker roles are stateful, they’re tracking the state of the various crawl jobs; the crawler roles are stateless, they’re just doing work, they’re command processors. Now these are linked up within a cluster, and they form the cluster using Lighthouse, which is the dedicated seed node tool that Petabridge has open sourced.
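To make roles a little more concrete, here is a hedged HOCON sketch for a node in the web role that routes messages to actors on tracker nodes. The `roles` list and the `use-role` setting on a cluster-aware group router are how role targeting is usually expressed in configuration; the router path `/tracker-router`, the round-robin strategy, and the routee path `/user/foo` are illustrative assumptions rather than the WebCrawler sample’s actual settings.

```hocon
akka {
  actor {
    deployment {
      # Hypothetical router actor created at /user/tracker-router.
      /tracker-router {
        router = round-robin-group
        # Only actors at this path receive messages (path is illustrative).
        routees.paths = ["/user/foo"]
        cluster {
          enabled = on
          # Only consider nodes that declare the "tracker" role.
          use-role = "tracker"
          # This node is not a tracker, so skip local routees.
          allow-local-routees = off
        }
      }
    }
  }

  cluster {
    # A role is just a string; a node may declare more than one.
    roles = ["web"]
  }
}
```

A tracker node would carry the mirror-image setting, `roles = ["tracker"]`, and host an actor at the routee path so the router has something to deliver to.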
[17:30] They form a cluster by all talking to Lighthouse initially, then they form a cluster with each other and assign work using their roles. I encourage you to go through the sample in depth. We have spent a lot of time thoroughly documenting and explaining exactly what’s going on, but it goes way beyond the amount of time we have in this video, so I will link to it and you should go check it out. But now, just to get you excited about it, I’m gonna show it to you and show you how the web crawler works and talk you through it a little bit. Okay, I’m gonna show you just really quickly a demo
[18:00] of what the web crawler sample looks like. So here I’m inside the WebCrawler solution, which you can get on GitHub. So then you can see I have Lighthouse already running here. So again, Lighthouse is a dedicated seed node; it’s living at localhost, port 4053. It’s just gonna be the dedicated seed node for the rest of the web crawler. If we go into App.config, down here to cluster, there’s my seed node,
[18:30] that’s Lighthouse right there, so everybody is gonna be looking to Lighthouse on this port at this address to form a cluster, and I’m gonna show that to you now and fire it up so you can get an idea of what this thing looks like when it’s up and running. … So here we’ve got the tracking service and the crawl service, and then I have Lighthouse down here, so as you can see
[19:00] with Lighthouse, Lighthouse started up and everything is good, and then there was a brief bit of time where things were sorting themselves out, and then you can see that the leader, which is Lighthouse in this case, was taking them off but then was moving them here to Up, and you can see tracker and crawler right here. So then over here on tracking service and crawl service you can see them being welcomed from akka.tcp … web crawler at 4053, which is
[19:30] actually Lighthouse, so crawl service and tracking service are now in the cluster; they know about each other, and they were welcomed and formed a cluster with Lighthouse, so these three are now in a cluster. So let’s go ahead and … we’re gonna crawl CNN.com, so that should kick off in just a second. So okay, so now the crawl is being fanned out, the work is being reported back to the tracking service and to the front end, and what we’re gonna do
[20:00] is we’re actually gonna launch another crawler, which is going to join the cluster and should start picking up work as expected. Okay good, so now we have two crawlers running and the tracking service is there doing its thing, tracking. So these crawlers are working away; they are now …. Okay, so at this point we have a fully up and running cluster that is crawling away on
[20:30] CNN.com, and we were able to elastically scale the cluster, albeit in a small way, and we were able to add … to the cluster that were not there in the beginning, and they were able to join and dynamically expand the cluster and start picking up work. Great, now you’ve seen the web crawler for yourself, and I encourage you to go in depth on it and really read the code and read the explanations we put there, because it covers everything we talked about in this video and more, in depth. You can get the entire set of code for yourself from our code sample … you go ahead and click here on this link
[21:00] and that’ll take you to it. Finally, thank you so much for watching and sticking with me through this video. I really, really hope that this has helped you get an understanding of what Akka.Cluster is, what it does for you, and how to set one up and get it going. Please subscribe to our YouTube channel for more tutorials, and please also check out our advanced Akka.Cluster training, which goes way in depth and gets into how to operationalize Akka.Cluster, combine it with … to zero-downtime cluster upgrades, message and cluster versioning, and much, much more. Again, thanks so much for your time, and I hope this has really been helpful to you.