Why are Hash Tables always doubled?



865 views Hash Table Internals



To maintain consistent performance, the hash table has to be resized - be it growing or shrinking. The trigger to resize depends on the load factor, which is defined as the ratio of the number of keys that the hash table holds to the total number of slots.

When should we resize?

The Hash Table is resized when the load factor hits a certain threshold. If we get too aggressive or too lenient, we would not be able to get the optimal efficiency. Hence, we have to find a sweet spot.

We typically grow the hash table when the load factor hits 0.5 and shrink when we hit 0.125.

Why do we always double?

We have heard and seen so many times, that when a Hash Table is required to grow, we always double the underlying array; but why? Can we not just increase it by 1 every time we are trying to insert?

Resizing by 1 every time

Let’s take an example: say, we grow the array by 1 every time we insert an element in the Hash Table. Let’s compute the time it requires to fill n elements.

Inserting the 1st element is: allocating an array of size 1, and inserting 1 element; so in all O(1) operations.

Inserting the 2nd element is: allocating an array of size 2, copying 1 element from the old array, and then inserting the 2nd element; so in all O(2) operations.

Hence, inserting the nth element is: allocating an array of size n, copying n-1 elements from the old array, and then inserting the nth element; so in all O(n) operations.

Total operations to insert n elements = 1 + 2 + … + n = (n(n-1))/2 which is O(n^2).

Doubling every time

If we double every time, inserting n elements requires O(n) time, as it is spacing out an expensive resize operation. We would be inserting n/2 elements before resizing the array to 2n.

Note: For a detailed amortized analysis, please refer to the video attached here, where I have explained the reasoning in depth.

Why is a hash table array always a power of 2?

For a power of 2, the MOD and the bitwise AND spit out the same result and given that the bitwise AND is a magnitude faster than the MOD, we get the best performance out of our Hash Tables when we use AND

(i % 2^k) == (i & (2^k) - 1)

This is why the length of the underlying array is always a power of 2, making our most-frequent operation efficient.

Shrinking the Hash Table

To ensure we are not wasting space, we trigger the shrink when we do not utilize the underlying array enough. While triggering a shrink, we also need to ensure that we are not aggressive enough that we have to grow immediately after the shrink.

Hence, we shrink the hash table when the load factor hits ⅛ i.e. in a table of length 64 if we are only holding 8 keys, we trigger a shrink and that reduces the length to 32.

Note: To understand why we do it at load factor = ⅛, please refer to the video.


Arpit Bhayani

Arpit's Newsletter

CS newsletter for the curious engineers

❤️ by 17000+ readers

If you like what you read subscribe you can always subscribe to my newsletter and get the post delivered straight to your inbox. I write essays on various engineering topics and share it through my weekly newsletter.




Other essays that you might like


Implementing Hash Maps with Hash Tables

544 views 24 likes 2022-08-03

Maps and Dictionaries are amazing, but how are they implemented? In this 11th video of this Hash Table Internals series...

Implementing Hash Sets with Hash Tables

320 views 8 likes 2022-08-01

Sets are amazing, but how are they implemented? In this 10th video of this Hash Table Internals series, we take an in-d...

Implementing Resize of a Hash Table

322 views 11 likes 2022-07-29

So, the Hash Table needs to be resized in order to maintain consistent performance, but how exactly? In this 9th Video ...

Why are Hash Tables always doubled?

865 views 28 likes 2022-07-27

Why are the underlying arrays of the hash tables always a power of 2? When we trigger a resize why are Hash Tables alway...


Be a better engineer

A set of courses designed to make you a better engineer and excel at your career; no-fluff, pure engineering.


System Design Masterclass

A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems.

800+ learners

Details →

Designing Microservices

A free playlist to help you understand Microservices and their high-level patterns in depth.

17+ learners

Details →

GitHub Outage Dissections

A free playlist to help you learn core engineering from outages that happened at GitHub.

67+ learners

Details →

Hash Table Internals

A free playlist to help you understand the internal workings and construction of Hash Tables.

25+ learners

Details →

BitTorrent Internals

A free playlist to help you understand the algorithms and strategies that power P2P networks and BitTorrent.

42+ learners

Details →

Topics I talk about

Being a passionate engineer, I love to talk about a wide range of topics, but these are my personal favourites.




Arpit's Newsletter read by 17000+ engineers

🔥 Thrice a week, in your inbox, an essay about system design, distributed systems, microservices, programming languages internals, or a deep dive on some super-clever algorithm, or just a few tips on building highly scalable distributed systems.



  • v12.7.8
  • © Arpit Bhayani, 2022

Powered by this tech stack.