How Kafka Behaves When You Add More Partitions

Arpit Bhayani

curious, tinkerer, and explorer


Partitions sit right in the middle of how Kafka works. They define ordering, parallelism, and how far it can scale. But what actually happens when you need more of them? How does Kafka grow that number, what happens to the data you already have, and which guarantees stay intact after the change?

This essay takes you through the complete picture of partition expansion in Kafka. We will cover

  • the mechanics of adding partitions,
  • why existing data does not move,
  • the implications for key-based ordering, and
  • the consumer rebalancing protocols.

Why Increase Partitions

Before diving into the mechanics, let us understand the scenarios that drive partition expansion. Kafka partitions determine several critical aspects of your system.

Throughput scales with partitions. Producers can write to different partitions in parallel, and consumers in a group can read from different partitions simultaneously. If you have three partitions and three consumers, each consumer handles one partition. Adding more partitions allows you to add more consumers and increase parallelism.

Storage distribution improves with more partitions. Each partition is hosted on a broker (node), so more partitions means better distribution of data across your cluster. This prevents any single broker from becoming a storage bottleneck.

Consumer concurrency is bounded by partition count. You cannot have more active consumers in a consumer group than partitions. If your topic has four partitions, the fifth consumer will sit idle. Scaling consumer processing power requires scaling partition count.

Common scenarios for partition expansion include:

  • Traffic growth exceeding current throughput capacity
  • Adding consumers to process backlogs faster
  • Proactive scaling based on projected growth
  • Rebalancing load across a cluster after adding new brokers

Mechanics of Adding Partitions

Kafka provides two primary methods for adding partitions to an existing topic. The first is the CLI approach using the kafka-topics.sh script.

bin/kafka-topics.sh --bootstrap-server localhost:9092 \
    --topic my-topic \
    --alter \
    --partitions 6

The number you specify is the total partition count, not the number to add. If your topic currently has 3 partitions and you want to add 3 more, you specify 6.
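The second method is programmatic, through the AdminClient API in the Java client. A minimal sketch (the broker address and topic name are placeholders; this needs a running broker to actually execute):

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewPartitions;

import java.util.Map;
import java.util.Properties;

public class AddPartitions {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // NewPartitions.increaseTo takes the new TOTAL count,
            // matching the CLI's --partitions semantics.
            admin.createPartitions(Map.of("my-topic", NewPartitions.increaseTo(6)))
                 .all()
                 .get();
        }
    }
}
```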

When the partition count is increased, Kafka performs several operations.

  1. The controller updates the topic metadata in the cluster to reflect the new partition count.
  2. New partition directories are created on the brokers (nodes) assigned to host them.
  3. The partition metadata is propagated to all brokers.
  4. Producers and consumers receive updated metadata on their next refresh cycle.

Importantly, this operation is additive only. Kafka does not support reducing the number of partitions. Once you add partitions, they are permanent unless you delete and recreate the topic.

What Happens to Existing Data

When you add partitions to a Kafka topic, existing data does not move. The messages already written to partitions 0, 1, and 2 stay exactly where they are. The new partitions 3, 4, and 5 start empty.

This choice has real consequences. Right after you expand a topic, the old partitions still hold all the historical messages while the new ones start out empty. Over time, as new messages arrive, this imbalance gradually corrects itself.

Consider a topic with 3 partitions that has been running for a month with 100 million messages. After expanding to 6 partitions, you have 100 million messages spread across partitions 0 to 2, and zero messages in partitions 3 to 5. If your consumers care about total lag or throughput parity, this imbalance matters.

So why doesn’t Kafka shuffle the old data into the new partitions? It comes down to how Kafka is built.

Kafka is built for append-only, immutable logs. Moving data between partitions would require rewriting message offsets, breaking consumer position tracking, and creating complex consistency challenges. The engineering complexity and operational risk of automatic redistribution far outweigh the benefits.

If you need to redistribute existing data across more partitions, you must do it yourself. The standard approach is to create a new topic with the desired partition count, stream all data from the old topic to the new topic, and switch your producers and consumers to the new topic.
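A minimal sketch of that re-streaming step (hypothetical topic and group names, no error handling, and a loop-exit condition elided; exactly-once delivery would additionally require transactions):

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class TopicMigrator {
    public static void main(String[] args) {
        Properties cp = new Properties();
        cp.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        cp.put(ConsumerConfig.GROUP_ID_CONFIG, "topic-migrator");
        cp.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        cp.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
               "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        cp.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
               "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        Properties pp = new Properties();
        pp.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        pp.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
               "org.apache.kafka.common.serialization.ByteArraySerializer");
        pp.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
               "org.apache.kafka.common.serialization.ByteArraySerializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(cp);
             KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(pp)) {
            consumer.subscribe(List.of("my-topic"));
            while (true) { // run until the old topic is drained
                ConsumerRecords<byte[], byte[]> records =
                        consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<byte[], byte[]> r : records) {
                    // Re-keying through the default partitioner spreads the
                    // data across all partitions of the new topic.
                    producer.send(new ProducerRecord<>("my-topic-v2", r.key(), r.value()));
                }
                consumer.commitSync();
            }
        }
    }
}
```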

The Key Ordering Problem

One of the trickiest parts of adding partitions is what it does to key ordering.

When a producer sends a message with a key, Kafka uses a deterministic hash function to select the partition. The default partitioner uses the Murmur2 hashing algorithm.

int partition = Utils.toPositive(murmur2(keyBytes)) % numPartitions;
// toPositive masks the sign bit (hash & 0x7fffffff); Math.abs would
// misbehave for Integer.MIN_VALUE

The critical part is the modulus operation against the partition count. When you change the number of partitions, the modulus changes, and keys that previously went to one partition may now go to a different partition.

Here’s a concrete example. Suppose you have a topic with 4 partitions and a message key “user-123”. The Murmur2 hash of “user-123” might be 7654329. With 4 partitions, this maps to partition 1 (7654329 % 4 = 1).

Now you expand to 6 partitions. The same key “user-123” with the same hash 7654329 now maps to partition 3 (7654329 % 6 = 3).
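Seen as plain arithmetic, the remapping is just the modulus changing underneath a fixed hash. A self-contained sketch using a hypothetical hash value (the real partitioner masks the sign bit the same way):

```java
public class PartitionRemap {
    // Mirrors the shape of Kafka's default partitioning: mask the sign
    // bit of the hash, then take the modulus of the partition count.
    static int partitionFor(int hash, int numPartitions) {
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int hash = 7654329; // hypothetical Murmur2 hash of "user-123"
        System.out.println(partitionFor(hash, 4)); // partition 1 before expansion
        System.out.println(partitionFor(hash, 6)); // partition 3 after expansion
    }
}
```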

Here is what this means for your app.

Before partition expansion, all messages for “user-123” went to partition 1 and were processed by a single consumer. The consumer saw all messages for this user in order.

After partition expansion, new messages for “user-123” go to partition 3 while old messages remain in partition 1. If a consumer is processing both partitions, messages for “user-123” are no longer guaranteed to be processed in order.

Even worse, different consumers might process the old and new messages for the same key. Consumer A might be assigned partition 1 (old messages) while Consumer B handles partition 3 (new messages). You now have concurrent processing of what should be sequential updates.

This can mess up patterns like event sourcing, where the order of events matters for rebuilding state. It breaks stream processing aggregations, where you accumulate state per key. It breaks any business logic that assumes all events for an entity arrive at the same processor.

Maintaining Ordering Guarantees

With ordering on the line, the question becomes: how do you expand partitions without breaking things for keyed workloads?

The first strategy is to accept the temporary inconsistency. If your old data has been fully consumed and processed, the ordering disruption only affects keys with messages straddling the expansion. For append-only analytics or logging use cases, this is often acceptable.

The second strategy is the dual-write approach during a transition period. Before expanding partitions, start writing to both the old topic and a new topic with more partitions. Continue reading from the old topic until it drains. Then switch consumers to the new topic.

The third strategy is to use application-level ordering. Instead of relying on Kafka’s partition-based ordering, include a sequence number or timestamp in your messages. Your consumer logic reorders messages regardless of arrival sequence.
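One way to sketch that consumer-side logic, assuming producers attach a monotonically increasing per-key sequence number starting at 0 (the class and method names here are hypothetical):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Buffers out-of-order messages per key and releases them strictly
// in sequence-number order, regardless of which partition they came from.
public class KeySequencer {
    private final Map<String, Long> nextSeq = new HashMap<>();
    private final Map<String, TreeMap<Long, String>> pending = new HashMap<>();

    // Returns the messages that are now safe to process, in order.
    public List<String> accept(String key, long seq, String value) {
        TreeMap<Long, String> buf =
                pending.computeIfAbsent(key, k -> new TreeMap<>());
        buf.put(seq, value);

        List<String> released = new ArrayList<>();
        long expected = nextSeq.getOrDefault(key, 0L);
        // Drain the buffer as long as the next expected sequence is present.
        while (!buf.isEmpty() && buf.firstKey() == expected) {
            released.add(buf.pollFirstEntry().getValue());
            expected++;
        }
        nextSeq.put(key, expected);
        return released;
    }
}
```

A message that arrives early is held back until its predecessors show up, at the cost of unbounded buffering if a sequence number never arrives.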

For Kafka Streams applications, the situation is more complex.

Streams maintains state stores backed by changelog topics, and the state is partitioned. Adding partitions to input topics can cause keys to be redistributed, but the associated state does not follow them.

Hence, the recommended approach for stateful Kafka Streams applications is to create new topics rather than expanding existing ones.

Consumer Group Rebalancing

When the partition count changes, consumers have to reshuffle their work. That’s handled by Kafka’s rebalancing protocol.

The original rebalancing protocol used an eager, stop-the-world approach. When rebalancing triggered, every consumer in the group stopped processing. The group coordinator recalculated partition assignments. All consumers received new assignments and resumed. During this window, throughput dropped to zero.

This was problematic for large consumer groups. A group with 100 consumers and 200 partitions might spend 30 seconds or more in rebalancing, during which no messages are processed.

Kafka 2.4 introduced the incremental cooperative rebalancing protocol through KIP-429. Instead of stopping all consumers, this protocol works in phases. Consumers who need to give up partitions release them first. In a subsequent rebalance, those partitions are assigned to other consumers. Consumers who keep their partitions continue processing throughout.

This significantly reduces the impact of rebalancing. Instead of 30 seconds of zero throughput, you might see individual partitions pause for a few seconds while overall throughput remains high.

Kafka 4.0 brought the next generation consumer rebalance protocol through KIP-848. This moves the assignment logic from the consumer group leader to the broker-side group coordinator. Rebalancing becomes fully asynchronous, meaning consumers not affected by the partition changes experience no interruption at all.

Partition Assignment Strategies

Kafka provides several partition assignment strategies, each with different behaviors during rebalancing.

The RangeAssignor assigns partitions on a per-topic basis. It sorts the consumers and each topic’s partitions, then hands each consumer a contiguous range. When the partition count does not divide evenly, the first consumers in sort order receive the extra partition for every topic, so the imbalance compounds across topics.

Topic A: 3 partitions, Topic B: 3 partitions, 2 consumers
Consumer 1: A-0, A-1, B-0, B-1
Consumer 2: A-2, B-2
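The range logic can be simulated in a few lines (a simplified sketch for intuition, not Kafka’s actual RangeAssignor implementation):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RangeDemo {
    // Per-topic range assignment: consumers sorted, each topic's partitions
    // split into contiguous ranges, earlier consumers get the remainders.
    static Map<String, List<String>> assign(List<String> consumers,
                                            Map<String, Integer> topicPartitions) {
        Map<String, List<String>> out = new TreeMap<>();
        List<String> sorted = new ArrayList<>(consumers);
        Collections.sort(sorted);
        sorted.forEach(c -> out.put(c, new ArrayList<>()));

        for (var e : new TreeMap<>(topicPartitions).entrySet()) {
            int n = e.getValue();
            int per = n / sorted.size();
            int extra = n % sorted.size();
            int p = 0;
            for (int i = 0; i < sorted.size(); i++) {
                int count = per + (i < extra ? 1 : 0);
                for (int j = 0; j < count; j++) {
                    out.get(sorted.get(i)).add(e.getKey() + "-" + (p++));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(assign(List.of("c1", "c2"), Map.of("A", 3, "B", 3)));
    }
}
```

Running it with two consumers over two 3-partition topics reproduces the 4-versus-2 split.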

The RoundRobinAssignor distributes partitions across consumers in a round-robin fashion across all subscribed topics. This achieves a better balance but does not attempt to preserve assignments during rebalancing.

Topic A: 3 partitions, Topic B: 3 partitions, 2 consumers
Consumer 1: A-0, A-2, B-1
Consumer 2: A-1, B-0, B-2

The StickyAssignor aims for balanced distribution like RoundRobin but also minimizes partition movements during rebalancing. If Consumer 2 leaves, partitions are redistributed with minimal changes to Consumer 1’s assignments.

The CooperativeStickyAssignor adds cooperative protocol support to the StickyAssignor. This is the recommended choice for most production deployments.
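Enabling it is a consumer configuration change; a sketch in the same style as the doc’s other snippets:

```java
Properties props = new Properties();
// Opt the consumer group into cooperative (incremental) rebalancing
props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
          CooperativeStickyAssignor.class.getName());
```

Note that moving an existing group from an eager assignor to the cooperative one requires the rolling upgrade procedure described in the Kafka upgrade notes; flipping the property in a single bounce can cause assignment conflicts.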

When you add partitions, the assignment strategy determines how quickly and smoothly work is redistributed. With sticky assignors, existing assignments remain stable while new partitions are assigned to consumers with capacity.

Exactly-once Semantics and Partition Expansion

Exactly-once in Kafka is built on two mechanisms. Idempotent producers ensure that retried messages are not duplicated within a partition. Transactions ensure atomic writes across multiple partitions.

Idempotent producers work per-partition. Each producer is assigned a Producer ID (PID), and each message includes a sequence number. The broker deduplicates based on PID, partition, and sequence number.

props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);

When you add partitions, idempotent production continues to work for the original partitions. New partitions start fresh with their own sequence tracking. There is no semantic issue here.

Transactions are more nuanced. A transaction can span multiple partitions, and Kafka ensures that either all writes succeed or none do.

producer.initTransactions();
try {
    producer.beginTransaction();
    producer.send(new ProducerRecord<>("topic-a", "key", "value"));
    producer.send(new ProducerRecord<>("topic-b", "key", "value"));
    producer.commitTransaction();
} catch (KafkaException e) {
    producer.abortTransaction(); // roll back both writes on failure
    throw e;
}

Adding partitions during an active transaction does not affect the transaction. The new partitions are simply available for future transactions. However, if your transaction logic assumes certain partition counts or key-to-partition mappings, expanding partitions mid-operation could violate those assumptions.

For stream processing applications using exactly-once semantics, the transactional ID should encode partition information to ensure correct fencing after failures.

String transactionalId = "my-app-" + inputTopic + "-" + partition;

This pattern ensures that each input partition has a dedicated transactional producer, maintaining exactly-once guarantees across restarts and rebalances.

Best Practices

  1. Plan partition counts for future growth. Over-partition initially if you expect significant traffic increases. Having idle partitions costs little, while under-partitioning forces painful migrations.
  2. Verify that your consumers handle rebalancing correctly and that your application tolerates the temporary ordering disruption for keyed messages.
  3. Use sticky assignors to minimize partition movement.
  4. Monitor during and after expansion. Watch consumer lag, rebalance metrics, and application error rates for at least an hour after expanding partitions.
  5. For stateful applications, prefer creating new topics over expanding existing ones. The migration effort is worth the consistency guarantees.

Conclusion

Growing the number of partitions in Kafka is easy to do, but the ripple effects can get complicated. The kafka-topics.sh command finishes in seconds, but it pays to understand what happens afterward.

The gist is that the existing data stays where it is, key-to-partition mappings change and break ordering assumptions, consumer groups rebalance with configurable levels of disruption, and stateful stream processing applications need special care.

