Hash Table

How it Works

A hash table stores key-value pairs in an array of buckets. When you insert a key, a hash function converts the key into an integer index that determines which bucket the value goes into. To retrieve a value, the same hash function is applied to the key to find the correct bucket instantly.

The hash function is the core of the data structure. A good hash function distributes keys uniformly across all buckets, minimizing the chance of two different keys mapping to the same index. In this visualization we use a simple modular hash: hash(key) = key % numBuckets.

Collision resolution is necessary because multiple keys can hash to the same bucket index. The two most common strategies are:

Chaining (separate chaining) — Each bucket holds a linked list (or chain) of all entries that hash to that index. On collision, the new entry is simply appended to the chain. This is the approach shown in the visualization below.
Open addressing — When a collision occurs, the algorithm probes other bucket positions (linear probing, quadratic probing, or double hashing) until an empty slot is found.

Key characteristics of hash tables:

Average O(1) operations — Insert, search, and delete are all constant time on average, assuming a good hash function and a reasonable load factor.
Worst case O(n) — If many keys collide into the same bucket, scanning the chain degrades to linear time.
Load factor — The ratio of stored entries to the number of buckets. When the load factor exceeds a threshold (commonly 0.75), the table is resized (rehashed) to maintain performance.
Unordered — Hash tables do not maintain insertion order or sorted order of keys. Iteration order is determined by hash values, not logical order.
Dynamic resizing — To keep operations efficient, hash tables double in size and rehash all entries when the load factor becomes too high.

Hash tables are one of the most widely used data structures in practice. They power dictionaries in Python, objects in JavaScript, HashMaps in Java, and are used internally in databases, caches, symbol tables, and countless other systems.

Interactive Visualization

Experiment with the hash table below. Enter an integer value and click an operation button to see how hashing, insertion, search, and deletion work step by step. Collisions are resolved using chaining (linked lists per bucket).

Hash table with 8 buckets. Initial values: [15, 42, 8, 23, 7, 31, 50, 19]. Hash function: key % 8. Use the operation buttons below.

table = {}
# Hash function
idx = hash(key) % size
# Insert
table[key] = value
# Search
value = table.get(key)

Speed No steps

Code Implementation

Below is a hash table implementation using separate chaining for collision resolution, with insert, search, delete, and display operations in three languages.

class HashTable:
    def __init__(self, size=8):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _hash(self, key):
        """Compute bucket index from key."""
        return key % self.size

    def insert(self, key):
        """Insert a key into the hash table. O(1) avg"""
        index = self._hash(key)
        bucket = self.buckets[index]
        # Avoid duplicates
        for item in bucket:
            if item == key:
                return  # Already exists
        bucket.append(key)

    def search(self, key):
        """Search for a key. Returns True/False. O(1) avg"""
        index = self._hash(key)
        bucket = self.buckets[index]
        for item in bucket:
            if item == key:
                return True
        return False

    def delete(self, key):
        """Delete a key from the hash table. O(1) avg"""
        index = self._hash(key)
        bucket = self.buckets[index]
        for i, item in enumerate(bucket):
            if item == key:
                bucket.pop(i)
                return True
        return False

    def display(self):
        """Print the hash table."""
        for i, bucket in enumerate(self.buckets):
            chain = " -> ".join(str(v) for v in bucket)
            print(f"Bucket {i}: {chain if chain else 'empty'}")


# Example usage
ht = HashTable(8)
for val in [15, 42, 8, 23, 7, 31, 50, 19]:
    ht.insert(val)

ht.display()
# Bucket 0: 8
# Bucket 1: 42 -> 50
# Bucket 2: ...
print(ht.search(23))   # True
print(ht.search(99))   # False
ht.delete(42)
ht.display()

Complexity Analysis

The table below summarizes the time and space complexities for hash table operations with separate chaining.

Operation	Average Case	Worst Case	Notes
Insert	O(1)	O(n)	Constant time if hash distributes evenly; worst case when all keys collide into one bucket
Search	O(1)	O(n)	Hash to the bucket in O(1), then scan the chain; worst case all n keys in one chain
Delete	O(1)	O(n)	Same as search: find the element, then remove it from the chain
Space	O(n + m)		n entries stored plus m buckets in the underlying array
Resize / Rehash	O(n)		All entries must be rehashed into the new array; amortized O(1) per insert

Real-World Applications

🐍

Python Dicts {} is a hash table; O(1) average lookup powers everyday Python

🔍

Database Indexes Hash indexes on columns enable O(1) row lookup by key

🌐

DNS Resolution Domain names → IP addresses via a distributed hash table

💾

Caches Redis, Memcached, and CPU caches use hash tables for O(1) get/set

Quiz

Q1. What is the average time complexity of searching for a key in a hash table?

O(n)

O(1)

O(log n)

O(n log n)

On average, the hash function computes the bucket index in O(1) time, and with a good distribution, the chain in each bucket is very short (close to 1 element), giving O(1) lookup.

Q2. What happens when two different keys hash to the same bucket index?

The second key is discarded

The hash table throws an error

A collision occurs and must be resolved (e.g., chaining or open addressing)

The first key is overwritten by the second

When two keys hash to the same index, it is called a collision. The hash table uses a collision resolution strategy (such as chaining or open addressing) to store both entries. Neither is discarded or overwritten.

Q3. In separate chaining, what data structure is used at each bucket to handle collisions?

A linked list (or similar collection)

A binary search tree

A stack

Another hash table

In separate chaining, each bucket stores a linked list (or sometimes a dynamic array) of all entries that hash to that index. When a collision occurs, the new entry is appended to the list at that bucket.

Q4. What is the worst-case time complexity of operations on a hash table?

O(1)

O(log n)

O(n log n)

O(n)

In the worst case, all n keys hash to the same bucket, forming a single chain of length n. Searching, inserting, or deleting then requires scanning the entire chain, which takes O(n) time.

Q5. What is the load factor of a hash table?

The number of buckets divided by the number of stored entries

The number of stored entries divided by the number of buckets

The maximum number of collisions in any single bucket

The percentage of buckets that are empty