Why Akka.Streams.Kafka is the Best Kafka Client for .NET

Stop writing hundreds of lines of error handling code - there's a better way.

If you’re using Kafka in .NET, you’re probably writing hundreds of lines of code just to handle “what happens when my consumer crashes?” or “how do I retry failed messages?” or “what happens when I’m consuming messages too fast?”

What if I told you there was a way to handle all of that in just 5-10 lines of code?

That’s exactly what Akka.Streams.Kafka brings to the table - and it’s one of the most underrated parts of the entire Akka.NET ecosystem.

The Pain of Kafka Development in .NET

Let me be clear upfront: Confluent.Kafka is a great library. It’s a solid, first-party driver that does exactly what it’s supposed to do - provide a low-level interface to Kafka’s binary protocol through the battle-tested librdkafka C library.

But here’s the thing: building production-ready Kafka applications requires solving a lot of problems that aren’t directly related to Kafka itself:

  • Error handling with retry logic - What happens when your message processor throws an exception?
  • Backpressure management - How do you prevent fast producers from overwhelming slow consumers?
  • Partition rebalancing - How do you gracefully handle consumer group changes?
  • Dead letter queues - Where do poison messages go?
  • Graceful shutdown - How do you ensure in-flight messages are processed before termination?

With raw Confluent.Kafka, you’re on your own for all of these concerns. That means writing a lot of infrastructure code that has nothing to do with your actual business logic.

The Complexity Tax

To understand why building production Kafka applications is so complex, it helps to see how these libraries relate to each other:

graph TB
    subgraph "Focus on WHAT"
        A["Akka.Streams.Kafka<br/>(Stream Processing)"]
    end
    subgraph "Manage HOW"
        B["Confluent.Kafka<br/>(.NET Client Library)"]
    end
    subgraph "Protocol Details"
        C["librdkafka<br/>(C Library)"]
    end
    subgraph "Raw Communication"
        D["Kafka Protocol<br/>(TCP/Binary)"]
    end
    
    A --> B
    B --> C
    C --> D
    
    style A fill:#2E7D32,stroke:#1B5E20,color:#fff,stroke-width:2px
    style B fill:#1976D2,stroke:#0D47A1,color:#fff,stroke-width:2px
    style C fill:#5E35B1,stroke:#311B92,color:#fff,stroke-width:2px
    style D fill:#424242,stroke:#212121,color:#fff,stroke-width:2px

Akka.Streams.Kafka isn’t replacing Confluent.Kafka - it’s a higher-level abstraction that sits on top of it, handling all the distributed systems complexity so you can focus on your business logic.

To illustrate this difference, I’ve built a side-by-side comparison demo that implements identical order processing functionality using both approaches.

The Confluent.Kafka implementation: ~350 lines of complex error handling, retry logic, thread management, and partition rebalancing callbacks.

The Akka.Streams.Kafka implementation: ~50 lines total.

Let me show you what I mean.

Raw Confluent.Kafka - The Manual Approach

Here’s a condensed version of what the raw Confluent.Kafka consumer looks like:

public class ManualErrorHandlingConsumer
{
    private readonly ConcurrentDictionary<string, RetryInfo> _retryTracker = new();
    private readonly ConcurrentQueue<(OrderEvent order, int attempts, Exception error)> _deadLetterQueue = new();
    private readonly SemaphoreSlim _processingThrottle = new(10);
    private readonly object _offsetLock = new();
    private readonly Dictionary<TopicPartition, Offset> _pendingOffsets = new();
    private readonly HashSet<TopicPartition> _assignedPartitions = new();
    
    public async Task StartAsync(CancellationToken cancellationToken)
    {
        var config = new ConsumerConfig
        {
            BootstrapServers = _bootstrapServers,
            GroupId = "manual-consumer-group",
            AutoOffsetReset = AutoOffsetReset.Earliest,
            EnableAutoCommit = false // Manual offset management
        };
        
        using var consumer = new ConsumerBuilder<string, string>(config)
            .SetPartitionsAssignedHandler((c, partitions) =>
            {
                // Complex state management during rebalancing
                lock (_assignedPartitions)
                {
                    _assignedPartitions.Clear();
                    foreach (var tp in partitions)
                        _assignedPartitions.Add(new TopicPartition(tp.Topic, tp.Partition));
                }
                return partitions.Select(tp => new TopicPartitionOffset(tp, Offset.Unset));
            })
            .SetPartitionsRevokedHandler((c, partitions) =>
            {
                // Try to commit any pending offsets before revocation
                try { c.Commit(); }
                catch (KafkaException ex) { /* handle commit failures */ }
            })
            .Build();
            
        consumer.Subscribe(Topics.Orders);
        
        while (!cancellationToken.IsCancellationRequested)
        {
            var consumeResult = consumer.Consume(TimeSpan.FromMilliseconds(100));
            if (consumeResult?.Message != null)
            {
                // Deserialize before dispatching to the processing pool
                var order = JsonSerializer.Deserialize<OrderEvent>(consumeResult.Message.Value);
                await _processingThrottle.WaitAsync(cancellationToken);
                _ = Task.Run(async () =>
                {
                    try
                    {
                        await ProcessWithRetryAsync(order, consumeResult, consumer);
                    }
                    finally
                    {
                        _processingThrottle.Release();
                    }
                }, cancellationToken);
            }
        }
    }
    
    private async Task ProcessWithRetryAsync(OrderEvent order, ConsumeResult<string, string> result, IConsumer<string, string> consumer)
    {
        for (int attempt = 1; attempt <= 3; attempt++)
        {
            try
            {
                if (attempt > 1)
                {
                    var delay = TimeSpan.FromSeconds(Math.Pow(2, attempt - 1));
                    await Task.Delay(delay);
                }
                
                await _processor.ProcessOrderAsync(order);
                
                // Manual offset management with locking
                lock (_offsetLock)
                {
                    var tp = new TopicPartition(result.Topic, result.Partition);
                    _pendingOffsets[tp] = result.Offset + 1;
                    consumer.Commit(_pendingOffsets.Select(kv => 
                        new TopicPartitionOffset(kv.Key, kv.Value)));
                }
                return;
            }
            catch (PoisonMessageException ex)
            {
                _deadLetterQueue.Enqueue((order, attempt, ex));
                return;
            }
            catch (ProcessingException ex) when (ex.IsTransient && attempt < 3)
            {
                // Continue retry loop with exponential backoff
            }
        }
    }
}

And this doesn’t even include the separate dead letter queue processor, the complex partition assignment tracking, or the numerous edge cases around commit failures and rebalancing race conditions.

Akka.Streams.Kafka - The Elegant Solution

Now let’s look at the identical functionality implemented with Akka.Streams.Kafka:

// Configure consumer and producer settings
var consumerSettings = ConsumerSettings<string, string>
    .Create(system, Deserializers.Utf8, Deserializers.Utf8)
    .WithBootstrapServers(bootstrapServers)
    .WithGroupId("akka-consumer-group")
    .WithProperty("auto.offset.reset", "earliest");

var committerSettings = CommitterSettings.Create(system);
var producerSettings = ProducerSettings<string, string>
    .Create(system, Serializers.Utf8, Serializers.Utf8)
    .WithBootstrapServers(bootstrapServers);

// The entire consumer pipeline in ~20 lines
var (control, completion) = KafkaConsumer
    .CommittableSource(consumerSettings, Subscriptions.Topics("orders"))
    .SelectAsync(10, async msg =>
    {
        OrderEvent? order = null; // declared outside try so the catch blocks can see it
        try
        {
            order = JsonSerializer.Deserialize<OrderEvent>(msg.Record.Message.Value);
            await processor.ProcessOrderAsync(order);
            return (Success: true, Message: msg, Order: order, Error: null as Exception);
        }
        catch (PoisonMessageException ex)
        {
            return (Success: false, Message: msg, Order: order, Error: ex as Exception);
        }
        catch (ProcessingException ex) when (ex.IsTransient)
        {
            throw; // Let supervision strategy handle retry
        }
    })
    // Automatic retry with supervision
    .WithAttributes(ActorAttributes.CreateSupervisionStrategy(ex =>
    {
        if (ex is ProcessingException { IsTransient: true })
            return Directive.Restart; // Automatically restart the stage and retry
        return Directive.Stop;
    }))
    // Handle dead letter queue
    .SelectAsync(1, async result =>
    {
        if (result is { Success: false, Order: not null })
        {
            var dlqMessage = new ProducerRecord<string, string>(
                "dead-letters", result.Order.OrderId,
                JsonSerializer.Serialize(new { result.Order, result.Error?.Message }));
            
            await Source.Single(dlqMessage)
                .RunWith(KafkaProducer.PlainSink(producerSettings), materializer);
        }
        return (ICommittable)result.Message.CommitableOffset;
    })
    // Automatic offset commits
    .ToMaterialized(Committer.Sink(committerSettings), Keep.Both)
    .Run(materializer);

// Graceful shutdown
lifetime.ApplicationStopping.Register(() =>
{
    _ = control.DrainAndShutdown(completion);
});

That’s it. Same functionality, dramatically less code, and all the complexity is handled for you.

What Makes Akka.Streams.Kafka Special

1. Built-in Backpressure

One of the biggest complaints about Confluent.Kafka is the lack of backpressure support. Once you start polling for messages, you’re expected to handle whatever throughput Kafka throws at you.

Akka.Streams.Kafka automatically handles this through its reactive streams implementation. Here’s how it works:

graph LR
    K[Kafka] -->|Fast Messages| C[Consumer]
    C --> S[Stream Processing]
    S -->|Signal: Slow Down| C
    S --> P1[Order Processing]
    P1 --> DB[(Database)]
    DB -->|Slow Response| P1
    P1 -.->|Backpressure| S
    S -.->|Pause Polling| C
    
    style S fill:#2E7D32,color:#fff
    style DB fill:#D32F2F,color:#fff
    style C fill:#1976D2,color:#fff

If your downstream processing (like database writes) can’t keep up, the stream automatically pauses polling from Kafka until the backlog clears. No manual semaphores or thread pool management required.
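
If you want to see this for yourself, here’s a minimal sketch (reusing the consumerSettings and materializer from earlier, with an artificial delay standing in for a slow database):

// SelectAsync(1, ...) only signals demand upstream once the previous
// element completes, so the Kafka source pauses its own polling
// whenever this stage falls behind - no semaphores required.
KafkaConsumer.PlainSource(consumerSettings, Subscriptions.Topics("orders"))
    .SelectAsync(1, async record =>
    {
        await Task.Delay(500); // simulate a slow database write
        return record;
    })
    .RunWith(Sink.Ignore<ConsumeResult<string, string>>(), materializer);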

2. Supervision Strategies for Error Handling

Instead of writing try-catch blocks everywhere with manual retry logic, you can use Akka.NET’s supervision system. Define once how errors should be handled (retry, skip, or stop), and the system takes care of the rest:

.WithAttributes(ActorAttributes.CreateSupervisionStrategy(ex =>
{
    if (ex is ProcessingException { IsTransient: true })
        return Directive.Restart; // Automatic retry
    return Directive.Stop; // Give up
}))
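
Directive.Restart restarts the failed stage in place. If you’d rather rebuild the entire source with explicit exponential backoff between attempts, Akka.Streams also ships RestartSource; here’s a sketch with placeholder backoff values:

// Sketch: recreate the source with exponential backoff on failure
// (the backoff values below are placeholders, not recommendations)
var restartingSource = RestartSource.OnFailuresWithBackoff(
    () => KafkaConsumer.CommittableSource(consumerSettings, Subscriptions.Topics("orders")),
    RestartSettings.Create(
        minBackoff: TimeSpan.FromSeconds(1),
        maxBackoff: TimeSpan.FromSeconds(30),
        randomFactor: 0.2));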

3. Transparent Partition Rebalancing

When consumer instances join or leave the group, Akka.Streams.Kafka automatically handles the complex coordination required to maintain message ordering and prevent processing duplicate messages from revoked partitions. The manual approach requires dozens of lines of complex callback handling.

4. Compositional Design

Because it’s built on Akka.Streams, you can easily compose complex processing pipelines:

KafkaConsumer.CommittableSource(settings, subscription)
    .SelectAsync(10, ProcessOrderAsync)           // Parallel processing
    .GroupedWithin(100, TimeSpan.FromSeconds(5))  // Batch processing
    .SelectAsync(1, WriteBatchToDatabase)         // Database writes
    .ToMaterialized(Committer.Sink(committerSettings), Keep.Both)
    .Run(materializer);

Each stage can have its own error handling, parallelism settings, and backpressure characteristics.
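
For example, you can scope a supervision strategy to a single segment by building that segment as its own Flow. The parsing flow below is illustrative rather than taken from the demo:

// Sketch: give just the parsing step a Resume strategy so a bad payload
// is dropped instead of failing the whole stream, while other stages
// keep their own error-handling behavior
var parseFlow = Flow.Create<CommittableMessage<string, string>>()
    .Select(msg => JsonSerializer.Deserialize<OrderEvent>(msg.Record.Message.Value))
    .WithAttributes(ActorAttributes.CreateSupervisionStrategy(Deciders.ResumingDecider));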

Real-World Performance

I’ve tested both approaches with identical workloads processing 100 orders with various failure scenarios:

| Scenario              | Confluent.Kafka (Manual) | Akka.Streams.Kafka     |
|-----------------------|--------------------------|------------------------|
| Normal Processing     | ✓ Works                  | ✓ Works                |
| Transient Failures    | Manual retry logic       | Automatic supervision  |
| Poison Messages       | Manual DLQ handling      | Integrated DLQ flow    |
| Partition Rebalancing | Complex callbacks        | Transparent            |
| Backpressure          | Manual semaphores        | Automatic              |
| Graceful Shutdown     | Manual coordination      | Built-in draining      |
| Lines of Code         | ~350                     | ~50                    |

Both approaches handle the same workload identically, but the Akka.Streams.Kafka version requires 85% less code and is far more maintainable.

Scaling and Operations

Here’s where Akka.Streams.Kafka really shines. Both approaches handle partition rebalancing, but the devil is in the details.

The Confluent.Kafka Reality

With raw Confluent.Kafka, partition rebalancing requires complex callback handling:

.SetPartitionsRevokedHandler((c, partitions) =>
{
    // 1. You must manually commit any pending offsets
    try { c.Commit(); }
    catch (KafkaException ex) { /* handle commit failures */ }
    
    // 2. You must track and invalidate any in-flight messages
    // 3. You must handle race conditions between processing and revocation
    // 4. You must coordinate this across your thread pool
    // 5. You must deal with potential message duplication or loss
})

The callback-based approach means you’re responsible for:

  • Message invalidation: Preventing processing of messages from revoked partitions
  • Offset coordination: Committing offsets before partitions are revoked
  • Race condition handling: Managing threads that might still be processing revoked messages
  • Error recovery: Handling cases where commits fail during rebalancing

Get any of this wrong and you risk message duplication, message loss, or processing messages from partitions you no longer own.

The Akka.Streams.Kafka Advantage

With Akka.Streams.Kafka, all of this complexity is handled transparently. Want to see it for yourself? Spin up several instances of the demo consumer and watch the consumer group rebalance on its own:

# Terminal 1
dotnet run 1

# Terminal 2  
dotnet run 2

# Terminal 3
dotnet run 3

Behind the scenes, Akka.Streams.Kafka automatically:

  1. Invalidates in-flight messages from revoked partitions (as long as they haven’t been emitted to your processing code yet)
  2. Commits outstanding offsets from revoked partitions immediately during rebalancing
  3. Coordinates with the stream backpressure system to ensure clean handovers
  4. Prevents race conditions between message processing and partition revocation

You don’t write a single line of rebalancing code, yet you get more sophisticated behavior than most manual implementations provide.
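
And if you do want visibility into rebalance events - say, for logging or metrics - you can attach a listener actor to the subscription. Here’s a minimal sketch; the logging actor is hypothetical, while the listener hook and message types come from Akka.Streams.Kafka’s subscription API:

// A hypothetical actor that just logs rebalance notifications
public class RebalanceLogger : ReceiveActor
{
    public RebalanceLogger()
    {
        Receive<TopicPartitionsAssigned>(msg => Console.WriteLine($"Rebalance: {msg}"));
        Receive<TopicPartitionsRevoked>(msg => Console.WriteLine($"Rebalance: {msg}"));
    }
}

// Attach it to the subscription - the rest of the pipeline is unchanged
var listener = system.ActorOf(Props.Create(() => new RebalanceLogger()));
var subscription = Subscriptions.Topics("orders").WithRebalanceListener(listener);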

Why Akka.Streams.Kafka is Just Better

Here’s the thing: writing Kafka applications should be fun. You should be able to focus on solving interesting business problems, not debugging race conditions in partition rebalancing callbacks at 2 AM.

Akka.Streams.Kafka gives you:

  • Instant productivity - Build production-ready consumers in minutes, not weeks
  • Composable building blocks - Mix and match sources, flows, and sinks like LEGO pieces
  • Built-in resilience - Error handling, retries, and backpressure that just work
  • Zero infrastructure code - No manual thread management, offset tracking, or rebalancing logic
  • Sleep-friendly operations - Fewer 3 AM alerts because the framework handles edge cases

The best part? It’s built on battle-tested foundations (Confluent.Kafka + librdkafka) so you get enterprise-grade reliability with startup-friendly developer experience.

Getting Started

Akka.Streams.Kafka is available as a NuGet package and is fully supported as part of Petabridge’s Akka.NET support plans.

<PackageReference Include="Akka.Streams.Kafka" Version="1.5.*" />
<PackageReference Include="Akka.Hosting" Version="1.5.*" />

The full demo code includes Docker Compose setup for Kafka and runnable examples of both approaches.

Want to see more Akka.Streams content? Check out our “Introduction to Akka.Streams” and “Why Akka.Streams?” blog posts.

The Bottom Line

You shouldn’t have to be a distributed systems expert to use Kafka effectively. Akka.Streams.Kafka provides a higher-level abstraction that handles the complex infrastructure concerns for you, letting you focus on what matters: your business logic.

Stop writing hundreds of lines of error handling code. There’s a better way.

Written by Aaron Stannard on August 14, 2025