One of the topics I covered during my presentation about using Akka.NET in production at MarkedUp was the concept of stateful applications built with actors - the idea that state can reside within your application rather than outside of it by default, and how it was this idea that made our marketing automation product possible to build.
This tweet from an attendee sums up the realization well:
"@Aaronontheweb A key take away for me is Actors are stateful and I do not have to get/post constantly from Redis. Wow that is liberating." - akshay123 (@akshay123), August 12, 2015
Exactly right! Actors can be stateful - and that means we no longer have to factor round-trip times to SQL Server, Redis, Cassandra, or whatever into the design of our applications. The application already has the state it needs to do its job by the time a request arrives!
The Limitations of Stateless Design
The traditional way of developing web applications is stateless - and that’s a natural consequence of HTTP, itself an inherently stateless protocol.
But unless you’re just serving up static content, nearly all of the web applications we are responsible for as developers are stateful!
For instance, all of the following commonly developed types of applications have mission-critical behavior that is inherently stateful:
| Application Type | Stateful Entities |
|---|---|
| eCommerce | inventory, pricing, availability |
| Social Apps | updates, graph |
| CRM | activity / deadlines |
| Analytics | reports / indicators |
| Marketing Automation | activity / rules |
| HR | resource status, workflow state |
| Chat / Messaging | messages |
I could spend all day coming up with different examples for this, but the point’s clear: a large part of our jobs as engineers already consists of managing, mutating, and retrieving state.
Why Stateless Design Worked
Throughout most of the history of web programming, however, we’ve largely delegated these responsibilities to the database. I would argue that most web applications, even today in 2015, follow this CRUD design:
In this design, your “web application” is really just a stateless command processor - all of the information about a user’s session, such as their identity and authorization claims, is stored in and retrieved from the database. All of the content you need to serve, whether it’s product listings for an eCommerce system or any of the other examples I gave above, fundamentally lives in the database.
What makes this design great is that it cleanly separates your application from your data - you can easily throw more web servers under a load balancer if your traffic increases. And you can easily update your stateless web applications dozens of times a day without worrying about the availability of your data.
There’s a lot to like about this design - that’s why it’s ubiquitous in modern web programming.
Where Stateless Falls Short
But increasingly, this stateless application approach is showing its age. There are two primary areas where this design falls apart:
- Can’t support increasingly popular categories of applications demanded by users - any application that needs to perform real-time work could never be built using stateless CRUD models, because you need state locality in order to achieve those response times. This is why you’ll never see a multiplayer video game or chat application implemented using a stateless application model.
- Can’t support increasingly large workloads - as I point out in my talk, the Internet is a much larger place now than it was a few years ago and we developers are expected to retain and use as much data as possible in order to satisfy the increasingly high standards of our customers. As a result of that, we’re throwing a lot more information at our data tier than we used to - like multiple orders of magnitude more data. And those databases can’t scale to support arbitrarily large workloads, especially write-heavy ones. Thus we have to rethink the way we store and retrieve data in order to support the larger units of work while still maintaining fast response times.
It’s because of both of these trends that many architects are being forced to reconsider the decades-old stateless “get the data into SQL Server as fast as possible” approach to web application design.
One solution many architects are considering for the “data layer scalability problem” is NoSQL. This approach certainly worked for me and our first product at MarkedUp (we used Cassandra and Hadoop), but it made no difference at all when it came to building our second product, which imposed a soft real-time requirement that was essential to the success of the product.
So I would like to propose a new possibility… Developing stateful web applications.
The idea behind stateful applications is elegantly simple: your application is the single source of truth for the “state” of things, and your database is just cold storage / backup.
For instance, if we were to architect a typical web application to be “stateful,” here’s one way that might look:
We retain the stateless Web UI / presentation layer, because we want to keep the benefit of being able to deploy changes to our UI with zero risk to the integrity of our state. This design achieves that. But we’ve added a new, “middle” application tier that sits between the Web UI and the database. This is the novel part of the design.
The stateful middle service is responsible for doing the following:
- Accumulating state, in-memory, and asynchronously flushing updates in critical state to a durable store;
- Re-hydrating its state, on-demand, from the durable store;
- Fulfilling requests from other services using its state; and most importantly
- Reacting to changes in state as they occur in real-time.
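To make those four responsibilities concrete, here is a minimal Python sketch. It is not Akka.NET code and not how MarkedUp implemented it; `DurableStore`, `StatefulService`, and every method name are hypothetical, the "database" is a plain dictionary, and the flush is an explicit call where a real system would run it asynchronously in the background.

```python
from typing import Callable, Dict, List, Set


class DurableStore:
    """Hypothetical stand-in for a real database; a dict keeps the sketch runnable."""

    def __init__(self) -> None:
        self.rows: Dict[str, List[str]] = {}

    def save(self, key: str, events: List[str]) -> None:
        self.rows[key] = list(events)

    def load(self, key: str) -> List[str]:
        return list(self.rows.get(key, []))


class StatefulService:
    """Keeps state in memory; the store is only cold storage / backup."""

    def __init__(self, store: DurableStore,
                 on_change: Callable[[str, List[str]], None]) -> None:
        self.store = store
        self.on_change = on_change
        self.state: Dict[str, List[str]] = {}
        self.dirty: Set[str] = set()

    def _entity(self, key: str) -> List[str]:
        # Re-hydrate on demand: touch the store only the first time a key is needed.
        if key not in self.state:
            self.state[key] = self.store.load(key)
        return self.state[key]

    def record(self, key: str, event: str) -> None:
        # Accumulate state in memory and remember that it needs flushing.
        events = self._entity(key)
        events.append(event)
        self.dirty.add(key)
        # React to the change immediately, without polling anything.
        self.on_change(key, list(events))

    def query(self, key: str) -> List[str]:
        # Fulfil requests from other services straight from memory.
        return list(self._entity(key))

    def flush(self) -> int:
        # A real service would do this asynchronously; here it is an explicit call.
        flushed = len(self.dirty)
        for key in self.dirty:
            self.store.save(key, self.state[key])
        self.dirty.clear()
        return flushed
```

Calling `record` twice and then `flush` leaves the store with both events, and a fresh `StatefulService` pointed at the same store answers `query` with the re-hydrated list.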
It’s the fourth trait that makes this design approach so powerful, and it’s something no off-the-shelf NoSQL product could ever do for you. The ability to react in an application-specific way to changes in state over time is an immensely powerful tool - this is what allowed MarkedUp In-App Marketing to function quickly and at considerable scale.
Instead of polling the database for answers continuously, which increases contention, I/O, response times, and error rates, we can simply react to changes we’re interested in as they occur. This change can improve the throughput, responsiveness, resiliency, and scalability of the system by multiple orders of magnitude.
By treating the database like merely a durable store for looking things up and not the beating heart of our application, many of the problems we run into with the traditional CRUD model are gone.
For instance, suppose we were building a marketing automation system and our customer wanted to send one of their users a message after that user had triggered events 0-3, often in rapid succession.
This approach would fail miserably using the CRUD model - observe:
In theory, this is how it would work if we trusted the database to handle everything for us. Our stateless web application would write the event currently being observed to the database, and then read that event plus all of the other accumulated events out of the database. Eventually, when the last event was observed, we’d send the message to the end-user because the database would have returned all four events to us.
In reality, this idea failed miserably - even in the most idealized test environment.
The CRUD design simply can’t account for two things:
- Consistency of the data store at the time of query, even under ACID;
- Concurrency of the events and requests happening at the level above the database.
As a result, the values you retrieve under this Read-after-Write model are chaotic and random. Whether you use SQL Server or Cassandra is irrelevant. You’d have this problem with any database.
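Here is a toy Python model of one way this can go wrong. It is only an illustration under an assumption I’m making for the sketch: writes are not visible to readers until some later point (think uncommitted transactions or replica lag), which the hypothetical `LaggyStore.sync()` stands in for. It is not a model of any particular database.

```python
REQUIRED = {0, 1, 2, 3}


class LaggyStore:
    """Hypothetical store: writes become visible to readers only after sync()."""

    def __init__(self):
        self.visible = set()
        self.pending = []

    def write(self, user, event):
        self.pending.append((user, event))

    def read(self, user):
        return {e for (u, e) in self.visible if u == user}

    def sync(self):
        self.visible.update(self.pending)
        self.pending.clear()


def handle_event(store, user, event):
    """Stateless handler: write the event, then read everything back."""
    store.write(user, event)
    # True means "all four events observed, send the message".
    return store.read(user) >= REQUIRED


def simulate():
    store = LaggyStore()
    decisions = []
    # Events 0 and 1 arrive in rapid succession, before either write is visible.
    for event in (0, 1):
        decisions.append(handle_event(store, "user-1", event))
    store.sync()
    # Events 2 and 3 arrive; only 0 and 1 are visible when they read back.
    for event in (2, 3):
        decisions.append(handle_event(store, "user-1", event))
    store.sync()
    return decisions, store.read("user-1")


decisions, final_state = simulate()
print(decisions)                # [False, False, False, False]
print(final_state == REQUIRED)  # True
```

The store ends up holding all four events, yet no individual request ever saw them together, so the message is never sent.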
The real failure of this system is that we’re trusting the database to be responsible for the ordering of the messages and to have an immediately consistent view of that data whenever we please, ignoring any other traffic or network conditions that might be affecting the application as a whole or the database specifically. This is madness.
So to resolve this problem, we take matters into our own hands.
Under the stateful programming approach, we design our system such that all events for an individual user are always routed to the exact same machine and the exact same actor in our network; in Akka.NET a Consistent Hashing Router is one such tool we could use to accomplish this.
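The routing idea can be sketched in a few lines of Python. This is a generic hash ring, not the Akka.NET router’s implementation or API; `HashRing`, `node_for`, the node names, and the choice of MD5 with 100 virtual points per node are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # A stable hash, so every web server computes the same route for a key.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring mapping a user id to one node."""

    def __init__(self, nodes, replicas=100):
        # Each node owns several points on the ring to spread keys evenly.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first point at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
# Any web server holding the same node list picks the same node for this user.
assert ring.node_for("user-42") == ring.node_for("user-42")
```

Because the route depends only on the key and the node list, it doesn’t matter which web server receives an event. When a node is added, the only keys that change owner are the ones that move to the new node.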
These events can arrive in any arbitrary order, interleaved with other events potentially, and multiple web servers operating under the load-balancer can each forward one or more of these events to the stateful application server responsible for accumulating them for this user. But once all four events are observed, this fact is both consistent and well-known to the application immediately without needing to talk to a database.
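As a rough Python sketch of that accumulation (again, not Akka.NET code), imagine one small object per user that handles one event at a time, the way an actor processes its mailbox. `UserActor`, `ActorRegistry`, and `tell` are hypothetical names chosen for this example.

```python
class UserActor:
    """Hypothetical per-user actor: owns that user's state, one event at a time."""

    REQUIRED = frozenset({0, 1, 2, 3})

    def __init__(self, user_id, send_message):
        self.user_id = user_id
        self._send = send_message
        self._seen = set()
        self._sent = False

    def receive(self, event_id):
        # Order and duplicates don't matter; we only track which events we've seen.
        self._seen.add(event_id)
        if not self._sent and self._seen >= self.REQUIRED:
            self._sent = True
            self._send(self.user_id)


class ActorRegistry:
    """Delivers every event for a user to that user's single actor."""

    def __init__(self, send_message):
        self._send = send_message
        self._actors = {}

    def tell(self, user_id, event_id):
        if user_id not in self._actors:
            self._actors[user_id] = UserActor(user_id, self._send)
        self._actors[user_id].receive(event_id)


sent = []
registry = ActorRegistry(sent.append)
# Events for two users arrive interleaved, out of order, with a duplicate.
for user, event in [("a", 2), ("b", 0), ("a", 0), ("a", 3), ("b", 1), ("a", 1), ("a", 1)]:
    registry.tell(user, event)
print(sent)  # ['a']
```

User `a` gets exactly one message the moment the fourth distinct event arrives, and user `b` gets none, all without a single database read.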
The result is a predictable product with low overhead, great scalability, and extremely fast response times.
The tradeoff is that deploying updates to the stateful application requires more care and caution, and developers must learn how to manage their state in a concurrent and distributed fashion. This used to be quite hard, which is why this type of application design was rare until very recently, and it’s a big part of why Akka.NET and the actor model have become so popular.
Why You Should Learn Stateful Application Design… Today!
By now I’ve made it clear that the CRUD model of application design limits our success and self-expression as developers, and therefore we must change our assumptions and design techniques in order to meet the standards of today’s users and information economy. Nothing is more crucial than taking control of your state back from the database.
If message ordering and timeliness are important to you, deal with them at the application level.
If dynamic state-driven behavior is something your application needs, a database won’t do this for you.
And on and on.
Not every application will have requirements that call for a stateful service, but those applications will become the exception rather than the rule in the future. Becoming comfortable with state and distributing it across a networked application is something web developers will be expected to do.
So I’d like to give you access to a possibility: what if you could be the person who transformed your organization to use stateful programming to your advantage? What if we embraced this coming change as an opportunity to revolutionize our software and make the experience of its users radically better than could ever be accomplished using CRUD alone? You could take that on. Try it!
If you want to learn more about how to put stateful programming to work for you, I suggest checking out the example below and signing up for our Akka.Cluster training course, where we dive deep on the subject. If you’ve never done Akka.NET or stateful programming with actors before, give Akka.NET Bootcamp a try - that’ll show you the ropes.
A Real-World Example
In the video below, I explain how and why our attempts to build MarkedUp In-app Marketing with stateless design approaches failed, and ultimately how stateful application programming techniques are what made this product possible to build.
It’s about 30 minutes in length, although we do an hour of Q&A afterwards. Definitely recommend watching it to see if any of the painful mistakes I made are ones you can learn from for free.