Solving Major Database Contention Problems with Throttling and Akka.NET Streams

Alleviate strain on production systems with in-process Akka.NET streams.

When troubleshooting performance problems in distributed systems or in locally run, high-throughput software, I tell our users: “your most severe performance problems are almost always going to be caused by flow control issues.”

My preferred batting order for troubleshooting performance issues is:

  1. Improve or resolve flow control issues;
  2. Eliminate wasteful I/O and round-trips; and
  3. Make technical improvements - use mechanical sympathy to perform work more efficiently.

This list is ranked in the order of “most likely to have largest real-world performance impact.”

In this post we’re going to address how you can use Akka.NET actors and Akka.Streams to easily resolve one of the most painful flow control issues: database contention and bottlenecking.
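To make the idea of flow control concrete, here is a minimal sketch of throttling by capping the number of in-flight database calls. It is written in Python with `asyncio` and is not Akka.NET or Akka.Streams code; `run_throttled`, `fake_query`, and `demo` are hypothetical names invented for this illustration, and the sleep stands in for a real database round-trip.

```python
import asyncio


async def run_throttled(jobs, max_in_flight):
    """Run zero-argument coroutine factories, at most `max_in_flight` at a time.

    Returns the results in submission order plus the peak concurrency observed.
    """
    gate = asyncio.Semaphore(max_in_flight)  # the throttle
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with gate:  # callers wait here once the limit is reached
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_query(i):
    """Hypothetical stand-in for a database query."""
    await asyncio.sleep(0.01)
    return i * 2


def demo(n=20, limit=4):
    # Bind `i` as a default argument so each job keeps its own value.
    jobs = [lambda i=i: fake_query(i) for i in range(n)]
    return asyncio.run(run_throttled(jobs, limit))
```

Calling `demo(20, 4)` submits twenty queries but never lets more than four run at once, so the database sees a bounded load no matter how much work is queued behind the gate.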

10 Years of Building Akka.NET

Lessons learned from working on Akka.NET over the years.

It seems like just yesterday that I wrote “Akka.NET - One Year Later”; as of November 21st, 2023, Akka.NET is now ten years old.

I don’t have my original prototype Akka code anymore; all I have is Helios - the socket library I originally created to power Akka.NET’s remoting and clustering systems, which it did from 2014 until 2017. My first commit on that project dates back to Nov 21 2013. That’s the date I use to mark the start of this phase of my career: the Akka.NET years.

Celebrating 10 years of Akka.NET

In this post I wanted to share some lessons learned from developing and maintaining one of the most ambitious, professional-grade, and independent open source projects in the .NET ecosystem for over 10 years. Some of these lessons are technical; some are more business-oriented; and some are just kind of funny. In any case, I hope you find them helpful.

Chinese version.

Don't Build Your Own Bespoke Company Frameworks on Top of Akka.NET

Akka.NET Application Management Best Practices

Last Thursday, September 7th, we held our “Akka.NET Application Management Best Practices” webinar, and I’ve made the recording available on YouTube for everyone to watch.

The code sample I created for this webinar can also be found here: https://github.com/Aaronontheweb/akkadotnet-app-management-presentation.

However, I wanted to expand upon the webinar’s key points and emphasize some strategic / product management points that might not get noticed if you don’t pay close attention to the video.

If you’re interested in having any of these concepts taught to your company or team, please contact us about training options.

I’ve reviewed 100+ Akka.NET code bases at this point in my career, and I’ve reviewed stand-alone ASP.NET applications without any Akka.NET whatsoever. What many of these code bases have in common is that someone gets the bright idea to abstract over Akka.NET / ASP.NET / Entity Framework with an in-house framework that automates infrastructure decisions and enforces a one-size-fits-all design on how all domains are implemented.

This is a tremendously expensive design mistake that destroys optionality and creates more problems than it solves.

Bespoke Company Frameworks

I introduced the term “Bespoke Company Framework” (BCF) in a post on my personal blog earlier this year: “DRY Gone Bad: Bespoke Company Frameworks.”

As I mention in the presentation - BCFs attempt to standardize the way work is done inside a company’s specific domain. BCFs are actually quite helpful at solving infrastructure problems.

For instance, if you’re trying to automate manufacturing processes that have to be configured uniquely for multiple lines of products and all lines are built using the same universe of vendor drivers / components, writing a BCF that abstracts...

How We Made Phobos 2.4's OpenTelemetry Usage 62% Faster

OpenTelemetry Performance Optimization Practices

Phobos is our observability + monitoring library for Akka.NET and last year we launched Phobos 2.0 which moved our entire implementation onto OpenTelemetry.

Phobos 2.0 Instruments Akka.NET with OpenTelemetry

One of the issues we’ve had with Phobos: many of our customers build low-latency, real-time applications and adding observability to a software system comes with a noticeable latency + throughput penalty. We were asked by a customer at the beginning of July “is there any way you can make this faster?”

Challenge accepted - and completed.

Earlier this week we shipped Phobos 2.4.0, which is a staggering 62% faster than all of our previous Phobos 2.x implementations. Phobos 2.4 is actually even faster than that for real-world applications, and we’ll get to that in a moment.

This blog post is really about optimizing hot-paths for maximum OpenTelemetry tracing and metrics performance and the techniques we used in the course of developing Phobos 2.4.

Let’s dig in.

Akka.NET v1.5: No Hocon, No Lighthouse, No Problem

Exploring Akka.Hosting, Akka.HealthCheck, and Akka.Management

In our previous post we covered the Akka.NET v1.5 release and in particular, we focused on the changes made to the core Akka.NET modules.

In this blog post we’re going to cover the three new libraries we’ve added to Akka.NET as part of the v1.5 development effort:

Akka.NET v1.5 is Now Available

Akka.Hosting, Akka.Management, Akka.HealthCheck, .NET 6 Dual Targeting, Akka.Cluster.Sharding Overhaul, and Many More Improvements.

As of today, Akka.NET v1.5 is now available as a stable release package on NuGet for both .NET Standard 2.0 and .NET 6.0. This is a big release aimed at addressing pain points for current Akka.NET users.

We’ve published a detailed article on the Akka.NET website that describes what’s new in Akka.NET v1.5, but we wanted to capture some of the highlights here.

Scaling Akka.Persistence.Query to 100k+ Concurrent Queries for Large-Scale CQRS

How we solved an acute event-driven scaling problem for users in Akka.NET v1.5.

One of our major engineering milestones for Akka.NET v1.5 (ships on February 28th, 2023):

Make CQRS a priority in Akka.Persistence

This blog post is about an interesting engineering challenge we had to solve to accomplish this for Akka.NET v1.5: supporting hundreds of thousands or even millions of concurrent Akka.Persistence.Query projection queries all targeting a single database.

Fundamentally - there are many different facets of Akka.Persistence and Akka.Persistence.Query scalability that needed solving, but this post fixates on an acute problem users have in production with Akka.Persistence.Query today: large numbers of concurrent queries absolutely melting down production-grade database deployments.

Users reported APQ knocking their databases out with as few as 3-4 thousand streaming queries running at rates of once every 1-3 seconds - with several nodes all running similar workloads (so in aggregate, perhaps closer to 10k+ queries per second).

Beginning in Akka.NET v1.5, we have eliminated this issue and have explicitly tested it up to 100,000 queries per second targeting a dinky SQL Server 2019 database running inside a Docker container. We suspect our design can scale up to support tens of millions of concurrent queries (although for reasons I get into later, end-users should use a different approach).

Here’s what we did.

Author’s Note: Petabridge turned 8 years old this month and over the years I’ve only written a couple of articles about our internal processes for developing software and running an open source software business. In late 2021 we began using OKRs - “Objectives and Key Results” - as a general management system for setting quarterly goals and allocating accountabilities across the organization.

OKRs took some adjusting to, but have worked out really well for us overall - in combination with using Notion to record daily plans, document critical procedures, write technical specifications, and track progress against our key results each week.

Tracking these OKRs and our daily work plans made it quite easy for me to summarize everything our team accomplished in each area of our business in 2022 - and I wanted to take advantage of that and show everyone the impact their work had in the previous year.

So to kick off 2023 I showed a summary of all of our key results in each area of our business to our team members and we spent a day going through them - what follows below are the sections about Akka.NET, what we accomplished last year, and what’s next in store for Akka.NET.

.NET 7.0's Performance Improvements with Dynamic PGO are Incredible

Akka.Remote is 33% faster, Akka.NET v1.5 is 75% faster in-memory.

.NET 7.0 was released to market last week and includes hundreds of major improvements across the board.

I ran Akka.NET’s RemotePingPong benchmark on .NET 7.0 shortly after installing the .NET 7.0 SDK - I’ll take every free lunch I can from the CoreCLR team.

Here’s how the numbers compare between .NET 6.0 and .NET 7.0 for RemotePingPong:

Lightbend's Akka License Change and Akka.NET

Akka's License Change Does Not Impact Akka.NET

N.B. For the purpose of clarity: “Akka” refers to Lightbend’s Scala / Java library and “Akka.NET” refers to the .NET Foundation library maintained by Petabridge. I have been very careful in my writing to ensure there is as little confusion as possible in this post.

TL;DR; Akka.NET is Not Impacted

Imagine my surprise this week: out of the blue one of the core committers to Akka.NET forwards me a link to a Lightbend blog post entitled “Why We Are Changing the License for Akka.”

Lightbend’s license change for the original Akka library has no impact on Akka.NET. All of Akka.NET’s source is still Apache 2.0, and anything we’ve ported from the original Akka library was also done under Apache 2.0.