Introduction To Algorithms: 6.046J/18.401J/SMA5503
Lecture 8
Prof. Charles E. Leiserson
A weakness of hashing
Problem: For any hash function h, a set
of keys exists that can cause the average
access time of a hash table to skyrocket.
• An adversary can pick all keys from
{k ∈ U : h(k) = i} for some slot i.
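The attack can be sketched in a few lines. The fixed hash function h(k) = k mod m and the helper names below are hypothetical, chosen only for illustration; the point is that once h is known, keys that all land in one slot are easy to generate.

```python
def h(k, m):
    # Hypothetical fixed hash function, known to the adversary.
    return k % m

def adversarial_keys(m, slot, count):
    # Every key of the form slot + j*m satisfies h(k) = slot.
    return [slot + j * m for j in range(count)]

m = 8
keys = adversarial_keys(m, slot=3, count=5)

# Insert into a table that resolves collisions by chaining.
table = [[] for _ in range(m)]
for k in keys:
    table[h(k, m)].append(k)

longest = max(len(chain) for chain in table)
print(keys)     # [3, 11, 19, 27, 35]
print(longest)  # 5: every key sits in the same chain
```

With all n keys in one chain, a search degrades to a linear scan of that chain.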
IDEA: Choose the hash function at random,
independently of the keys.
• Even if an adversary can see your code,
he or she cannot find a bad set of keys,
since he or she doesn’t know exactly
which hash function will be chosen.
© 2001 by Charles E. Leiserson Introduction to Algorithms Day 12 L8.2
Universal hashing
Definition. Let U be a universe of keys, and
let H be a finite collection of hash functions,
each mapping U to {0, 1, …, m–1}. We say
H is universal if for all x, y ∈ U, where x ≠ y,
we have |{h ∈ H : h(x) = h(y)}| = |H|/m.
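For a very small universe the definition can be checked by brute force. The sketch below assumes each hash function is stored as a tuple of outputs (a representation chosen here for convenience, not taken from the lecture) and tests the family of all functions from U to {0, 1, …, m–1}, which satisfies the definition.

```python
from itertools import combinations, product

def is_universal(family, universe, m):
    """Brute-force check of the definition: for every pair x != y,
    exactly |H|/m functions in the family map x and y to the same slot."""
    for x, y in combinations(universe, 2):
        collisions = sum(1 for h in family if h[x] == h[y])
        if collisions * m != len(family):
            return False
    return True

universe = range(3)
m = 2
# Every function U -> {0, ..., m-1}; h[x] is the slot assigned to key x.
all_functions = list(product(range(m), repeat=len(universe)))

print(is_universal(all_functions, universe, m))  # True
# A family containing one fixed function fails the definition.
print(is_universal([(0, 0, 1)], universe, m))    # False
```

The second call fails because a single fixed function either always or never collides on a given pair, which is the situation an adversary exploits.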
= (n − 1)/m.   • Algebra.
Constructing a set of
universal hash functions
Let m be prime. Decompose key k into r + 1
digits, each with value in the set {0, 1, …, m–1}.
That is, let k = 〈k_0, k_1, …, k_r〉, where 0 ≤ k_i < m.
Randomized strategy:
Pick a = 〈a_0, a_1, …, a_r〉, where each a_i is chosen
uniformly at random from {0, 1, …, m–1}.
Define h_a(k) = (∑_{i=0}^{r} a_i k_i) mod m.   (Dot product, modulo m.)
How big is H = {h_a}? |H| = m^{r+1}. REMEMBER THIS!
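A minimal sketch of this construction follows. The slide does not say how a key is decomposed into digits; the base-m expansion used here, and the names to_digits and make_hash, are assumptions made for illustration.

```python
import random

def to_digits(k, m, r):
    """Decompose k into r + 1 base-m digits <k_0, ..., k_r>,
    least significant digit first (an assumed decomposition)."""
    digits = []
    for _ in range(r + 1):
        digits.append(k % m)
        k //= m
    if k != 0:
        raise ValueError("key does not fit in r + 1 base-m digits")
    return digits

def make_hash(m, r, rng=random):
    """Pick a = <a_0, ..., a_r> at random and return (a, h_a)."""
    a = [rng.randrange(m) for _ in range(r + 1)]

    def h_a(k):
        # Dot product of a with the digits of k, modulo m.
        return sum(ai * ki for ai, ki in zip(a, to_digits(k, m, r))) % m

    return a, h_a

rng = random.Random(0)  # seeded only so the sketch is repeatable
a, h_a = make_hash(m=7, r=2, rng=rng)
slot = h_a(100)         # some slot in {0, ..., 6}
print(to_digits(100, 7, 2))  # [2, 0, 2], since 100 = 2 + 0*7 + 2*49
```

Each of the r + 1 coefficients has m possible values, which is where the count of m^{r+1} functions comes from.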
Universality of dot-product
hash functions
Theorem. The set H = {h_a} is universal.
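The theorem can be checked exhaustively for small parameters. This is a sanity check rather than a proof; the function names are hypothetical, and keys are represented directly as digit vectors. For every pair of distinct keys it counts how many coefficient vectors a make the two keys collide.

```python
from itertools import combinations, product

def dot_hash(a, digits, m):
    # Dot product of coefficient vector a with a key's digits, modulo m.
    return sum(ai * ki for ai, ki in zip(a, digits)) % m

def collision_counts(m, r):
    """Set of collision counts over all pairs of distinct keys."""
    # The same list serves as all keys and as all coefficient vectors a.
    vectors = list(product(range(m), repeat=r + 1))
    counts = set()
    for x, y in combinations(vectors, 2):
        counts.add(sum(1 for a in vectors
                       if dot_hash(a, x, m) == dot_hash(a, y, m)))
    return counts

print(collision_counts(3, 1))  # {3}, which is |H|/m = 9/3
```

Running the same check with a non-prime modulus such as m = 4 produces pairs that collide under more than |H|/m functions, which shows why the construction asks for m to be prime.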
Example: m = 7.
z        1  2  3  4  5  6
z^{-1}   1  4  5  2  3  6
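The table can be reproduced by searching, for each nonzero z, for the value w with z · w ≡ 1 (mod 7). The brute-force search below is a sketch chosen for clarity, not a method taken from the lecture.

```python
m = 7

def inverse_mod(z, m):
    # Try each nonzero w until z * w leaves remainder 1 modulo m.
    for w in range(1, m):
        if (z * w) % m == 1:
            return w
    raise ValueError("no inverse exists")

inverses = {z: inverse_mod(z, m) for z in range(1, m)}
print(inverses)  # {1: 1, 2: 4, 3: 5, 4: 2, 5: 3, 6: 6}
```

Every nonzero z finds an inverse because 7 is prime; with a composite modulus the search would raise for some z.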