Conflict Resolution in Hash Tables with Open Addressing



500 views Hash Table Internals



Open Addressing is a common yet interesting way of handling collisions in Hash Tables and instead of using an auxiliary data structure, it leverages empty slots of the hash table to store the collided keys, making the entire approach space efficient.

While inserting a key in the hash table, we pass it through the hash function to get the slot, and if the slot is already occupied, we find the next available slot through probing.

Probing

Probing is the process through which we deterministically find the next available slot in the hash table. The approach and algorithm could vary, but formally it is a function of the key to be placed and the attempt, that spits out a slot.

j = p(k, i)

The probing function uses the key k and an attempt i to spit out an index j. The attempt i and the index j are in the range [0, m-1], where m is the size of the hash table.

Good probing function

A good probing function would generate a deterministic permutation of numbers [0, m - 1] specific to the key k, so that we eventually check and cover the entire hash table for an available slot.

For example: for some key k1, on a hash table of size 8, the probing function might generate a sequence: 5, 7, 2, 1, 0, 6, 4, and 3. This means we first try to put the key in index 5, if that is occupied we check 7, then 2, and so on.

Hash Table Operations

Adding a key

Until we find a free slot in the hash table, we keep invoking the probing function with attempts 0, 1, 2,, etc. Once we find an empty slot, we stop and place the key in that slot.

Key Lookups

For looking up a key, we invoke the probing function with attempt 0 and check the slot. If the slot holds the key we need, we stop and return the value. If not, we continue to probe with attempts 1, 2, etc., and continue to hunt.

We stop the iteration when

  • we find the key,
  • we stumble upon an empty slot during iteration
  • we have checked the complete hash table

Deleting a key

With open addressing, deleting a key from the table has to be a soft delete, because we need a way to differentiate between an empty slot and a deleted key.

If during deletion we empty the slot, then we would be unable to look and reach for a key that had a collision and was placed after the key we deleted.

Limitation of Open addressing

Since we are not having any auxiliary data structure, a major limitation of Open Addressing is that the maximum number of keys we can hold are the same as the number of slots in the hash table.


Arpit Bhayani

Arpit's Newsletter

CS newsletter for the curious engineers

❤️ by 17000+ readers

If you like what you read subscribe you can always subscribe to my newsletter and get the post delivered straight to your inbox. I write essays on various engineering topics and share it through my weekly newsletter.




Other essays that you might like


Implementing Hash Maps with Hash Tables

425 views 15 likes 2022-08-03

Maps and Dictionaries are amazing, but how are they implemented? In this 11th video of this Hash Table Internals series...

Implementing Hash Sets with Hash Tables

259 views 6 likes 2022-08-01

Sets are amazing, but how are they implemented? In this 10th video of this Hash Table Internals series, we take an in-d...

Implementing Resize of a Hash Table

257 views 9 likes 2022-07-29

So, the Hash Table needs to be resized in order to maintain consistent performance, but how exactly? In this 9th Video ...

Why are Hash Tables always doubled?

770 views 19 likes 2022-07-27

Why are the underlying arrays of the hash tables always a power of 2? When we trigger a resize why are Hash Tables alway...


Be a better engineer

A set of courses designed to make you a better engineer and excel at your career; no-fluff, pure engineering.


System Design Masterclass

A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems.

Enrolled by 700+ learners

Details →

Designing Microservices

A free course to help you understand Microservices and their high-level patterns in depth.

Enrolled by 17+ learners

Details →

GitHub Outage Dissections

A free course to help you learn core engineering from outages that happened at GitHub.

Enrolled by 67+ learners

Details →

Hash Table Internals

A free course to help you learn core engineering from outages that happened at GitHub.

Enrolled by 25+ learners

Details →

BitTorrent Internals

A free course to help you understand the algorithms and strategies that power P2P networks and BitTorrent.

Enrolled by 42+ learners

Details →

Topics I talk about

Being a passionate engineer, I love to talk about a wide range of topics, but these are my personal favourites.




Arpit's Newsletter read by 17000+ engineers

🔥 Thrice a week, in your inbox, an essay about system design, distributed systems, microservices, programming languages internals, or a deep dive on some super-clever algorithm, or just a few tips on building highly scalable distributed systems.



  • v12.4.4
  • © Arpit Bhayani, 2022

Powered by this tech stack.