Skip to main content
data systems from the ground up

Event Streaming vs Message Queues

6 min read Chapter 22 of 36

Event Streaming vs Message Queues

The logistics platform has two data movement requirements that look similar but are mechanically different.

Requirement 1: When a package is scanned, notify the tracking dashboard, the analytics pipeline, and the customer notification service. Each consumer processes the event independently. If a new consumer is added next month (a fraud detection service), it should be able to replay historical events.

Requirement 2: When a delivery route is computed, dispatch it to exactly one available driver. If the driver’s app crashes before acknowledging, the route must be re-dispatched to another driver. No two drivers should receive the same route.

Requirement 1 is event streaming. Requirement 2 is message queuing. Kafka solves the first. RabbitMQ solves the second. They are not interchangeable.

Kafka: The Append-Only Log as a Message System

Kafka’s storage is the append-only log from Chapter 1, partitioned and replicated. A Kafka topic is a logical stream. Each topic is split into partitions, and each partition is an independent append-only log.

The Producer Path

A producer sends a message to a topic. Kafka determines which partition receives the message:

  • If the message has a key (e.g., package_id), Kafka hashes the key to select a partition. All messages with the same key go to the same partition. This guarantees ordering per key.
  • If the message has no key, Kafka round-robins across partitions.
// Concept: Kafka producer with key-based partitioning
// All events for the same package go to the same partition,
// guaranteeing ordering per package.

Properties props = new Properties();
props.put("bootstrap.servers", "kafka-1:9092,kafka-2:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("acks", "all");           // Wait for all ISR replicas
props.put("compression.type", "lz4"); // Compress batches

KafkaProducer<String, String> producer = new KafkaProducer<>(props);

// Key = packageId ensures all events for PKG-40291 land on the same partition
producer.send(new ProducerRecord<>(
    "package-events",       // topic
    "PKG-40291",           // key (determines partition)
    "{\"status\":\"SCANNED\",\"warehouse\":\"WH-042\"}"  // value
));

The Consumer Group

A consumer group is a set of consumers that collectively read from a topic. Each partition is assigned to exactly one consumer in the group. If the group has 3 consumers and the topic has 6 partitions, each consumer reads from 2 partitions.

The critical difference from a message queue: the message is not deleted after consumption. It remains in the log until the retention period expires. Multiple consumer groups can independently read the same messages.

# Concept: consumer group partition assignment
kafka-consumer-groups.sh --describe --group tracking-dashboard

# GROUP              TOPIC           PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
# tracking-dashboard package-events  0          4521890         4521895         5
# tracking-dashboard package-events  1          3892100         3892100         0
# tracking-dashboard package-events  2          5102340         5102380         40

# Consumer group "tracking-dashboard" has 3 consumers, one per partition.
# Partition 0 has 5 messages of lag (consumer is 5 messages behind).
# Partition 2 has 40 messages of lag (consumer is falling behind).

# A separate consumer group "analytics-pipeline" can read the same partitions
# at its own pace without affecting "tracking-dashboard".

Consumer Lag: The Metric That Matters

Consumer lag is the difference between the log-end offset (latest message produced) and the current offset (latest message consumed). Lag is measured per partition.

When lag grows, the consumer is processing messages slower than the producer is writing them. Causes: slow deserialization, slow downstream writes, garbage collection pauses, rebalancing overhead.

Kafka consumer group partition assignment showing consumers, partitions, and offset tracking

The diagram shows a Kafka topic with 6 partitions assigned across 3 consumers in a group. Each consumer reads from exactly 2 partitions. The offset pointer per partition tracks where each consumer has read up to. When consumer C2 fails, its 2 partitions are reassigned to C1 and C3 during rebalancing. During rebalancing, no consumer in the group processes messages, creating a processing gap.

RabbitMQ: The Queue That Delivers and Forgets

RabbitMQ is a message broker. A producer sends a message to an exchange. The exchange routes the message to one or more queues based on routing rules. A consumer receives the message from the queue and acknowledges it. After acknowledgment, the message is deleted from the queue.

The Dispatch Pattern

The logistics platform’s route dispatch sends a computed route to exactly one driver. RabbitMQ’s work queue pattern handles this:

// Concept: RabbitMQ work queue for single-consumer dispatch
// Multiple driver apps consume from the same queue.
// RabbitMQ delivers each message to exactly one consumer.

ConnectionFactory factory = new ConnectionFactory();
factory.setHost("rabbitmq-1");

Connection connection = factory.newConnection();
Channel channel = connection.createChannel();

// Declare a durable queue
channel.queueDeclare("route-dispatch", true, false, false, null);

// Set prefetch to 1: each consumer receives one message at a time.
// The next message is not delivered until the current one is acknowledged.
// This prevents a fast consumer from starving slow ones.
channel.basicQos(1);

// Consume with manual acknowledgment
channel.basicConsume("route-dispatch", false, (tag, delivery) -> {
    String routeJson = new String(delivery.getBody(), StandardCharsets.UTF_8);
    try {
        assignRouteToDriver(routeJson);
        channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
    } catch (Exception e) {
        // Reject and requeue: message goes back to the queue for another consumer
        channel.basicNack(delivery.getEnvelope().getDeliveryTag(), false, true);
    }
}, tag -> {});

If the consumer crashes before acknowledging, RabbitMQ redelivers the message to another consumer. This is at-least-once delivery: the message is guaranteed to be processed, but may be processed more than once if the consumer crashes after processing but before acknowledging.

The Key Differences

PropertyKafkaRabbitMQ
Storage modelAppend-only log (Chapter 1)Queue (FIFO, message deleted on ack)
Message retentionRetained until retention expiresDeleted after consumer ack
Multiple consumersIndependent consumer groupsCompeting consumers on same queue
OrderingPer-partition orderingNo ordering across competing consumers
ReplayConsumers can reset offset and replayNot possible after ack
DeliveryPull (consumer polls)Push (broker delivers)
BackpressureConsumer controls read ratePrefetch count limits in-flight messages

The Decision Rule

Use Kafka when multiple independent consumers need the same data, when replay is required, or when ordering per key matters. Event streaming, change data capture, audit trails. The logistics platform’s package events are published to Kafka because the tracking dashboard, the analytics pipeline, the notification service, and future consumers all need the same events independently.

Use RabbitMQ when exactly one consumer should process each message, when the message is no longer needed after processing, and when at-most-once or at-least-once delivery per consumer is the requirement. Task dispatch, route assignment, work queues. The logistics platform’s route dispatch uses RabbitMQ because each route should go to exactly one driver, and a route that fails should be requeued.

Do not use Kafka as a work queue. While Kafka can approximate competing consumers within a single consumer group, the partition-based assignment is coarser than RabbitMQ’s per-message delivery, and rebalancing during consumer failures causes processing pauses.

Do not use RabbitMQ as an event log. Messages are deleted after acknowledgment. Replay is impossible. Adding a new consumer that needs historical data requires re-publishing from the source, which defeats the purpose.