π Today I Learned Hash Tables β Collisions, Resizing, and a Duplicate Finder in Python
π What are Hash Tables?
A hash table (also known as a dictionary or map) stores data in key-value pairs. It's like a super-fast lookup table.
Example:
student_scores = {
"Alice": 90,
"Bob": 85,
"Charlie": 78
}
Keys (
"Alice") are hashed to a memory index.Values (
90) are stored at that index.This makes lookup, insert, and delete operations nearly O(1) in time.
π₯ What is a Collision?
A collision happens when two keys are hashed to the same memory location.
Example: If "cat" and "tac" both land on index 5, the hash table has to handle that.
π§ Collision handling methods:
Chaining: Use a list at each index to store multiple values.
Open addressing: Find another empty slot (linear probing, quadratic, etc.)
π¦ What is Resizing?
Hash tables grow dynamically.
When the table gets too full (usually past 70% load), it resizes itself.
This involves:
Creating a new bigger array (usually double the size).
Rehashing all existing keys into this new array.
Resizing helps maintain fast access time even as more elements are added.
π§ͺ Task: Finding Duplicates Using a Hash Table
I practiced by solving a real problem: finding all duplicate numbers in a list.
π§ Python Code:
array = [10, 3, 6, 9, 10, 10, 3, 5, 8, 60, 30, 4, 3, 5]
count_dict = {}
duplicates = []
# Step 1: Count each element
for num in array:
if num in count_dict:
count_dict[num] += 1
else:
count_dict[num] = 1
# Step 2: Collect duplicates
for key in count_dict:
if count_dict[key] > 1:
duplicates.append(key)
# Step 3: Print results
print("duplicates:", duplicates)
for n in duplicates:
print(f"{n} appeared {count_dict[n]} times")
β Output:
duplicates: [10, 3, 5]
10 appeared 3 times
3 appeared 3 times
5 appeared 2 times
π§ What I Learned
Hash tables are ideal for quick lookup tasks.
Collisions are normal, and we need strategies to handle them.
Resizing keeps the hash table efficient as it grows.
Real-world use: Hash tables make duplicate detection fast and simple.