Algorithm Design Slides
Analysis of Algorithms

- An algorithm accepts an Input, performs a computation, and produces an Output
- The running time of an algorithm typically grows with the input size
- We distinguish best case, average case, and worst case running time
  (figure: running time vs. input size 1000-4000, with best-case, average-case,
   and worst-case curves)
- We focus on the worst case: it is easier to analyze, and it is crucial to
  applications such as games, finance and robotics

Experimental Studies
- Write a program implementing the algorithm
- Run the program with inputs of varying size and composition
- Use a method like System.currentTimeMillis() to get an accurate measure
  of the actual running time
- Plot the results
  (figure: time in ms, 0-9000, vs. input size, 0-100)

Limitations of Experiments
- It is necessary to implement the algorithm, which may be difficult
- Results may not be indicative of the running time on other inputs not
  included in the experiment
- In order to compare two algorithms, the same hardware and software
  environments must be used
Theoretical Analysis
- Uses a high-level description of the algorithm instead of an implementation
- Characterizes running time as a function of the input size, n
- Takes into account all possible inputs
- Allows us to evaluate the speed of an algorithm independent of the
  hardware/software environment

Pseudocode (1.1)
- High-level description of an algorithm
- More structured than English prose
- Less detailed than a program
- Preferred notation for describing algorithms
- Hides program design issues

Example: find the max element of an array

Algorithm arrayMax(A, n)
    Input array A of n integers
    Output maximum element of A
    currentMax ← A[0]
    for i ← 1 to n − 1 do
        if A[i] > currentMax then
            currentMax ← A[i]
    return currentMax
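The arrayMax pseudocode translates almost line for line into Java; a minimal sketch (the class name is illustrative):

```java
// Direct Java translation of the arrayMax pseudocode.
public class ArrayMax {
    public static int arrayMax(int[] a) {
        int currentMax = a[0];
        for (int i = 1; i < a.length; i++) {
            if (a[i] > currentMax) {
                currentMax = a[i];   // found a larger element
            }
        }
        return currentMax;
    }
}
```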
Pseudocode Details
- Control flow
    if … then … [else …]
    while … do …
    repeat … until …
    for … do …
    Indentation replaces braces
- Method declaration
    Algorithm method(arg [, arg…])
        Input …
        Output …
- Method call
    var.method(arg [, arg…])
- Return value
    return expression
- Expressions
    ← assignment (like = in Java)
    = equality testing (like == in Java)
    n^2  superscripts and other mathematical formatting allowed

The Random Access Machine (RAM) Model
- A CPU
- A potentially unbounded bank of memory cells, each of which can hold an
  arbitrary number or character
- Memory cells are numbered, and accessing any cell takes unit time
Primitive Operations
- Basic computations performed by an algorithm
- Identifiable in pseudocode
- Largely independent from the programming language
- Exact definition not important (we will see why later)
- Assumed to take a constant amount of time in the RAM model
- Examples:
    - Evaluating an expression
    - Assigning a value to a variable
    - Indexing into an array
    - Calling a method
    - Returning from a method
Counting Primitive Operations (1.1)
- By inspecting the pseudocode, we can determine the maximum number of
  primitive operations executed by an algorithm, as a function of the
  input size

Algorithm arrayMax(A, n)                      # operations
    currentMax ← A[0]                         2
    for i ← 1 to n − 1 do                     2 + n
        if A[i] > currentMax then             2(n − 1)
            currentMax ← A[i]                 2(n − 1)
        { increment counter i }               2(n − 1)
    return currentMax                         1
                                      Total   7n − 1
Growth Rates
- Growth rates of functions:
    - Linear ≈ n
    - Quadratic ≈ n^2
    - Cubic ≈ n^3
- In a log-log chart, the slope of the line corresponds to the growth rate
  of the function
  (figure: log-log chart of T(n), 1E+0 to 1E+30, showing cubic, quadratic,
   and linear growth)

Constant Factors
- The growth rate is not affected by constant factors or lower-order terms
- Examples:
    - 10^2 n + 10^5 is a linear function
    - 10^5 n^2 + 10^8 n is a quadratic function
  (figure: log-log chart showing each example tracking its leading term)
Big-Oh Notation (1.2)
- Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are
  positive constants c and n0 such that f(n) ≤ c g(n) for n ≥ n0
- Example: 2n + 10 is O(n)
    2n + 10 ≤ cn
    (c − 2) n ≥ 10
    n ≥ 10/(c − 2)
    Pick c = 3 and n0 = 10
- Example: the function n^2 is not O(n)
    n^2 ≤ cn
    n ≤ c
    The above inequality cannot be satisfied since c must be a constant
  (figures: log-log plots of 2n + 10 against 3n and n, and of n^2 against
   100n, 10n, and n)
More Big-Oh Examples
- 7n − 2 is O(n)
    need c > 0 and n0 ≥ 1 such that 7n − 2 ≤ cn for n ≥ n0
    this is true for c = 7 and n0 = 1
- 3n^3 + 20n^2 + 5 is O(n^3)
    need c > 0 and n0 ≥ 1 such that 3n^3 + 20n^2 + 5 ≤ cn^3 for n ≥ n0
    this is true for c = 4 and n0 = 21
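The constants in the examples above can be sanity-checked numerically; a small sketch (a check over a finite range, not a proof):

```java
// Numeric check of the big-Oh witnesses chosen above:
// 7n - 2 <= 7n for n >= 1, and 3n^3 + 20n^2 + 5 <= 4n^3 for n >= 21.
public class BigOhCheck {
    public static boolean bound1(long n) {
        return 7 * n - 2 <= 7 * n;                       // c = 7, n0 = 1
    }
    public static boolean bound2(long n) {
        return 3 * n * n * n + 20 * n * n + 5 <= 4 * n * n * n; // c = 4, n0 = 21
    }
}
```

Note that bound2 genuinely fails just below the threshold: at n = 20 the left side is 32005 while 4n^3 = 32000.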
Big-Oh and Growth Rate
- The big-Oh notation gives an upper bound on the growth rate of a function
- "f(n) is O(g(n))" means that the growth rate of f(n) is no more than the
  growth rate of g(n)
- We can use the big-Oh notation to rank functions according to their
  growth rate:

                          f(n) is O(g(n))    g(n) is O(f(n))
    g(n) grows more       Yes                No
    f(n) grows more       No                 Yes
    Same growth           Yes                Yes

Big-Oh Rules
- If f(n) is a polynomial of degree d, then f(n) is O(n^d), i.e.,
    1. Drop lower-order terms
    2. Drop constant factors
- Use the smallest possible class of functions
    Say "2n is O(n)" instead of "2n is O(n^2)"
- Use the simplest expression of the class
    Say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)"
Computing Prefix Averages
- We further illustrate asymptotic analysis with two algorithms for prefix
  averages
- The i-th prefix average of an array X is the average of the first (i + 1)
  elements of X: A[i] = (X[0] + X[1] + … + X[i]) / (i + 1)
- Prefix averages have applications to financial analysis
  (figure: bar chart of X and its prefix averages A, values 0-35)

Prefix Averages (Quadratic)
The following algorithm computes prefix averages in quadratic time by
applying the definition:

Algorithm prefixAverages1(X, n)
    Input array X of n integers
    Output array A of prefix averages of X      # operations
    A ← new array of n integers                 n
    for i ← 0 to n − 1 do                       n
        s ← X[0]                                n
        for j ← 1 to i do                       1 + 2 + … + (n − 1)
            s ← s + X[j]                        1 + 2 + … + (n − 1)
        A[i] ← s / (i + 1)                      n
    return A                                    1

Arithmetic Progression
- The running time of prefixAverages1 is O(1 + 2 + … + n)
- The sum of the first n integers is n(n + 1) / 2
- Thus, algorithm prefixAverages1 runs in O(n^2) time
  (figure: bar chart, heights 0-6, illustrating the sum 1 + 2 + … + n)
Prefix Averages (Linear)
The following algorithm computes prefix averages in linear time by keeping
a running sum:

Algorithm prefixAverages2(X, n)
    Input array X of n integers
    Output array A of prefix averages of X      # operations
    A ← new array of n integers                 n
    s ← 0                                       1
    for i ← 0 to n − 1 do                       n
        s ← s + X[i]                            n
        A[i] ← s / (i + 1)                      n
    return A                                    1

Algorithm prefixAverages2 runs in O(n) time
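Both prefix-average algorithms can be sketched in Java; double division is used so the averages are not truncated (the pseudocode is type-agnostic):

```java
// Both prefix-average algorithms; the class name is illustrative.
public class PrefixAverages {
    // Quadratic: recomputes each prefix sum from scratch, O(n^2) time.
    public static double[] prefixAverages1(int[] x) {
        double[] a = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            double s = 0;
            for (int j = 0; j <= i; j++) {
                s += x[j];
            }
            a[i] = s / (i + 1);
        }
        return a;
    }

    // Linear: maintains a running sum, O(n) time.
    public static double[] prefixAverages2(int[] x) {
        double[] a = new double[x.length];
        double s = 0;
        for (int i = 0; i < x.length; i++) {
            s += x[i];
            a[i] = s / (i + 1);
        }
        return a;
    }
}
```

Both versions return identical results; only the running time differs.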
Relatives of Big-Oh
- big-Omega: f(n) is Ω(g(n)) if there is a constant c > 0 and an integer
  constant n0 ≥ 1 such that f(n) ≥ c g(n) for n ≥ n0
- big-Theta: f(n) is Θ(g(n)) if there are constants c′ > 0 and c″ > 0 and
  an integer constant n0 ≥ 1 such that c′ g(n) ≤ f(n) ≤ c″ g(n) for n ≥ n0
- Example: 5n^2 is Ω(n^2)
    need c > 0 and n0 ≥ 1 such that 5n^2 ≥ c n^2 for n ≥ n0;
    let c = 5 and n0 = 1

Math You Need to Review
- Properties of logarithms:
    log_b(xy) = log_b x + log_b y
    log_b(x/y) = log_b x − log_b y
    log_b x^a = a log_b x
    log_b a = log_x a / log_x b
Elementary Data Structures

The Stack ADT (2.1.1)
- The Stack ADT stores arbitrary objects; insertions and deletions follow
  the last-in first-out (LIFO) scheme
- Main stack operations:
    - push(object): inserts an element
    - object pop(): removes and returns the last inserted element
- Auxiliary stack operations: top(), size(), isEmpty()

Applications of Stacks
- Direct applications: page-visited history in a Web browser, undo sequence
  in a text editor, chain of method calls in the Java Virtual Machine
- Indirect applications: auxiliary data structure for algorithms, component
  of other data structures

Array-based Stack
A simple way of implementing the Stack ADT uses an array. We add elements
from left to right, and a variable t keeps track of the index of the top
element (the size is t + 1).

Algorithm push(o)
    if t = S.length − 1 then
        throw FullStackException
    else
        t ← t + 1
        S[t] ← o

Algorithm pop()
    if isEmpty() then
        throw EmptyStackException
    else
        t ← t − 1
        return S[t + 1]

Growable Array-based Stack (1.5)
- In a push operation, when the array is full, instead of throwing an
  exception, we can replace the array with a larger one
- How large should the new array be?
    - incremental strategy: increase the size by a constant c
    - doubling strategy: double the size

Algorithm push(o)
    if t = S.length − 1 then
        A ← new array of larger size
        for i ← 0 to t do
            A[i] ← S[i]
        S ← A
    t ← t + 1
    S[t] ← o

Comparison of the Strategies
- We compare the two strategies by analyzing the total time T(n) needed to
  perform a series of n push operations, starting with an empty stack
- The amortized time of a push operation is the average time taken by a
  push over the series of operations, i.e., T(n)/n

Analysis of the Incremental Strategy
- We replace the array k = n/c times
- The total time T(n) of a series of n push operations is proportional to
    n + c + 2c + 3c + 4c + … + kc =
    n + c(1 + 2 + 3 + … + k) =
    n + ck(k + 1)/2
- Since c is a constant, T(n) is O(n + k^2), i.e., O(n^2)
- The amortized time of a push operation is O(n)

Direct Analysis of the Doubling Strategy
- We replace the array k = log2 n times
- The total time T(n) of a series of n push operations is proportional to
    n + 1 + 2 + 4 + 8 + … + 2^k = n + 2^(k+1) − 1 = 3n − 1
  (a geometric series)
- T(n) is O(n), so the amortized time of a push operation is O(1)
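The doubling strategy above can be sketched as a small Java class (class and exception names are illustrative, not from a specific library):

```java
// Minimal growable array-based stack using the doubling strategy.
public class GrowableArrayStack {
    private Object[] S = new Object[1];
    private int t = -1; // index of the top element; size is t + 1

    public int size() { return t + 1; }
    public boolean isEmpty() { return t < 0; }

    public void push(Object o) {
        if (t == S.length - 1) {             // array full: double its size
            Object[] A = new Object[2 * S.length];
            for (int i = 0; i <= t; i++) A[i] = S[i];
            S = A;
        }
        S[++t] = o;
    }

    public Object pop() {
        if (isEmpty()) throw new RuntimeException("EmptyStackException");
        return S[t--];
    }
}
```

A series of n pushes triggers only O(log n) array copies, giving the O(1) amortized bound derived above.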
The Queue ADT (2.1.2)
- The Queue ADT stores arbitrary objects; insertions and deletions follow
  the first-in first-out (FIFO) scheme
- Main queue operations:
    - enqueue(object): inserts an element at the end of the queue
    - object dequeue(): removes and returns the element at the front of
      the queue
- Exceptions: attempting the execution of dequeue or front on an empty
  queue throws an EmptyQueueException
  (figure: array cells 0-15 with occupied cells marked)

Applications of Queues
- Direct applications:
    - Waiting lines
    - Access to shared resources (e.g., printer)
    - Multiprogramming
- Indirect applications: auxiliary data structure for algorithms, component
  of other data structures

Accounting Method (note on the doubling strategy)
- At the end of a phase we must have saved enough to pay for the array
  growth at the beginning of the next phase
Singly Linked List
- A singly linked list is a concrete data structure consisting of a
  sequence of nodes
- Each node stores an element and a link to the next node
  (figure: chain of nodes with elem and next fields, pointing to elements)

List ADT (2.2.2)
- The List ADT models a sequence of positions storing arbitrary objects
- It allows for insertion and removal in the "middle"
- Query methods: isFirst(p), isLast(p)
- Accessor methods: first(), last(), before(p), after(p)
  (figure: doubly linked list with header and trailer sentinel nodes)
Doubly Linked List
- A node of a doubly linked list stores: an element, a link to the
  previous node (prev), and a link to the next node (next)
- Special header and trailer sentinel nodes bound the list
  (figure: doubly linked list with header, trailer, and element nodes)
- Update methods of the List ADT:
    replaceElement(p, o), swapElements(p, q),
    insertBefore(p, o), insertAfter(p, o),
    insertFirst(o), insertLast(o), remove(p)

Trees (2.3)
- In computer science, a tree is an abstract model of a hierarchical
  structure
- A tree consists of nodes with a parent-child relation
- Applications:
    - Organization charts
    - File systems
    - Programming environments
  (example org chart: ComputersRUs with children Sales, Manufacturing, R&D;
   Sales has children US and International; International has children
   Europe, Asia, Canada; Manufacturing has children Laptops, Desktops)
Tree ADT (2.3.1)
- We use positions to abstract nodes
- Generic methods:
    integer size(), boolean isEmpty(),
    objectIterator elements(), positionIterator positions()
- Accessor methods:
    position root(), position parent(p), positionIterator children(p)
- Query methods:
    boolean isInternal(p), boolean isExternal(p), boolean isRoot(p)
- Update methods:
    swapElements(p, q), object replaceElement(p, o)
- Additional update methods may be defined by data structures implementing
  the Tree ADT
Preorder Traversal
- A traversal visits the nodes of a tree in a systematic manner
- In a preorder traversal, a node is visited before its descendants
- Application: print a structured document

Algorithm preOrder(v)
    visit(v)
    for each child w of v
        preOrder(w)

  (example document tree: 1. Motivations (1.1 Greed, 1.2 Avidity),
   2. Methods (2.1 Stock Fraud, 2.2 Ponzi Scheme, 2.3 Bank Robbery),
   References)

Postorder Traversal
- In a postorder traversal, a node is visited after its descendants
- Application: compute space used by files in a directory and its
  subdirectories

Algorithm postOrder(v)
    for each child w of v
        postOrder(w)
    visit(v)

  (example file system: cs16/ containing homeworks/ (h1c.doc 3K,
   h1nc.doc 2K), programs/ (DDR.java 10K, Stocks.java 25K, Robot.java 20K),
   and todo.txt 1K)

Binary Tree
- A binary tree is a tree with the following properties:
    - Each internal node has two children
    - The children of a node are an ordered pair (left child, right child)
- Applications:
    - arithmetic expressions
    - decision processes
    - searching
- Arithmetic expression tree example: the expression (2 × (a − 1) + (3 × b))
  is represented by a tree whose internal nodes are operators and whose
  external nodes are operands

Decision Tree
- A binary tree associated with a decision process:
    - internal nodes: questions with yes/no answers
    - external nodes: decisions
- Example: dining decision, with questions such as "On expense account?"
  leading to Starbucks, In N Out, Antoine's, or Dennys
Inorder Traversal
- In an inorder traversal, a node is visited after its left subtree and
  before its right subtree
- Application: draw a binary tree

Algorithm inOrder(v)
    if isInternal(v)
        inOrder(leftChild(v))
    visit(v)
    if isInternal(v)
        inOrder(rightChild(v))

Properties of Binary Trees
- Notation: n number of nodes, e number of external nodes, i number of
  internal nodes, h height
- Properties:
    e = i + 1
    n = 2e − 1
    h ≤ i
    h ≤ (n − 1)/2
    e ≤ 2^h
    h ≥ log2 e
    h ≥ log2 (n + 1) − 1
Linked Structure for Trees
- A node is represented by an object storing: Element, Parent node,
  Sequence of children nodes

Linked Structure for Binary Trees
- A node is represented by an object storing: Element, Parent node,
  Left child node, Right child node
  (figure: linked nodes for a tree with elements B, F, E, C, D)

Print Arithmetic Expressions
- Specialization of an inorder traversal: print the operand or operator
  when visiting a node, print "(" before traversing the left subtree, and
  print ")" after traversing the right subtree

Algorithm printExpression(v)
    if isInternal(v)
        print("(")
        inOrder(leftChild(v))
    print(v.element())
    if isInternal(v)
        inOrder(rightChild(v))
        print(")")
Array-Based Representation of Binary Trees
- Nodes are stored in an array
- rank(root) = 1
- if node is the left child of parent(node),
    rank(node) = 2 · rank(parent(node))
- if node is the right child of parent(node),
    rank(node) = 2 · rank(parent(node)) + 1
  (figure: example tree with nodes A, B, E, C, G, H stored at ranks
   1, 3, 4, 5, 7, 10, 11)
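The three traversals can be sketched over a simple linked binary tree in Java, each collecting its visit order into a string (the classes here are illustrative; null links play the role of external nodes):

```java
// Linked binary tree with the three traversals from the slides.
public class BinTree {
    int key;
    BinTree left, right;   // null links play the role of external nodes

    BinTree(int key, BinTree left, BinTree right) {
        this.key = key; this.left = left; this.right = right;
    }

    static String preOrder(BinTree v) {        // visit before descendants
        if (v == null) return "";
        return v.key + " " + preOrder(v.left) + preOrder(v.right);
    }

    static String inOrder(BinTree v) {         // left subtree, node, right
        if (v == null) return "";
        return inOrder(v.left) + v.key + " " + inOrder(v.right);
    }

    static String postOrder(BinTree v) {       // visit after descendants
        if (v == null) return "";
        return postOrder(v.left) + postOrder(v.right) + v.key + " ";
    }
}
```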
Stacks

Abstract Data Types (ADTs)
- An abstract data type (ADT) is an abstraction of a data structure
- An ADT specifies: data stored, operations on the data, and error
  conditions associated with operations
- Example: an ADT modeling a simple stock trading system
    - The data stored are buy/sell orders
    - Error conditions include: buy/sell a nonexistent stock

The Stack ADT
- The Stack ADT stores arbitrary objects; insertions and deletions follow
  the last-in first-out (LIFO) scheme
- Main stack operations:
    - push(object): inserts an element
    - object pop(): removes and returns the last inserted element
- Auxiliary stack operations: top(), size(), isEmpty()

Exceptions
- Attempting the execution of an operation of an ADT may sometimes cause
  an error condition, called an exception
- Exceptions are said to be thrown by an operation that cannot be executed

Applications of Stacks
- Direct applications: page-visited history in a Web browser, undo
  sequence in a text editor, chain of method calls in the Java Virtual
  Machine (JVM)
- Indirect applications: auxiliary data structure for algorithms,
  component of other data structures
- Method stack in the JVM: the JVM keeps a stack of active frames, e.g.,
  main (PC = 2, i = 5) calls foo (PC = 3, j = 5, k = 6), which calls
  bar (PC = 1, m = 6)
Array-based Stack
- A simple way of implementing the Stack ADT uses an array
- We add elements from left to right
- A variable t keeps track of the index of the top element

Algorithm size()
    return t + 1

Algorithm pop()
    if isEmpty() then
        throw EmptyStackException
    else
        t ← t − 1
        return S[t + 1]

Algorithm push(o)
    if t = S.length − 1 then
        throw FullStackException
    else
        t ← t + 1
        S[t] ← o

Performance and Limitations
- Performance: the space used is O(n) and each operation runs in O(1) time
- Limitation of the array-based implementation: the maximum size of the
  stack must be defined a priori; trying to push a new element into a
  full stack causes an exception

Computing Spans
- We show how to use a stack as an auxiliary data structure in an algorithm
- Given an array X, the span S[i] of X[i] is the maximum number of
  consecutive elements X[j] immediately preceding X[i] and such that
  X[j] ≤ X[i]
- Spans have applications to financial analysis
Quadratic Algorithm
(example: X = 6, 3, 4, 5, 2 has spans S = 1, 1, 2, 3, 1)

Algorithm spans1(X, n)                          # operations
    Input array X of n integers
    Output array S of spans of X
    S ← new array of n integers                 n
    for i ← 0 to n − 1 do                       n
        s ← 1                                   n
        while s ≤ i ∧ X[i − s] ≤ X[i]           1 + 2 + … + (n − 1)
            s ← s + 1                           1 + 2 + … + (n − 1)
        S[i] ← s                                n
    return S                                    1

Algorithm spans1 runs in O(n^2) time
Linear Algorithm
- Each index of the array is pushed into the stack exactly once and popped
  from the stack at most once
- The statements in the while-loop are executed at most n times
- Algorithm spans2 runs in O(n) time

Algorithm spans2(X, n)                          # operations
    S ← new array of n integers                 n
    A ← new empty stack                         1
    for i ← 0 to n − 1 do                       n
        while (¬A.isEmpty() ∧ X[A.top()] ≤ X[i]) do
            A.pop()                             n
        if A.isEmpty() then                     n
            S[i] ← i + 1                        n
        else
            S[i] ← i − A.top()                  n
        A.push(i)                               n
    return S                                    1
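Both span algorithms can be sketched in Java; a `java.util.ArrayDeque` plays the role of the auxiliary stack in the linear version:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class Spans {
    // Quadratic: scans backwards from each index, O(n^2) worst case.
    public static int[] spans1(int[] x) {
        int[] s = new int[x.length];
        for (int i = 0; i < x.length; i++) {
            int span = 1;
            while (span <= i && x[i - span] <= x[i]) span++;
            s[i] = span;
        }
        return s;
    }

    // Linear: each index is pushed once and popped at most once, O(n).
    public static int[] spans2(int[] x) {
        int[] s = new int[x.length];
        Deque<Integer> a = new ArrayDeque<>(); // stack of indices
        for (int i = 0; i < x.length; i++) {
            while (!a.isEmpty() && x[a.peek()] <= x[i]) a.pop();
            s[i] = a.isEmpty() ? i + 1 : i - a.peek();
            a.push(i);
        }
        return s;
    }
}
```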
Array-based Stack in Java (excerpt)

// constructor
public ArrayStack(int capacity) {
    S = new Object[capacity];
}
Vectors
6/8/2002 2:14 PM

The Vector ADT
- The Vector ADT extends the notion of array by storing a sequence of
  arbitrary objects
- An element can be accessed, inserted or removed by specifying its rank
  (number of elements preceding it)
- Main vector operations:
    - object elemAtRank(integer r): returns the element at rank r without
      removing it
    - object replaceAtRank(integer r, object o): replaces the element at
      rank r with o and returns the old element
    - insertAtRank(integer r, object o): inserts a new element o to have
      rank r
    - object removeAtRank(integer r): removes and returns the element at
      rank r
- Additional operations: size() and isEmpty()

Applications of Vectors
- Direct applications: sorted collection of objects (elementary database)
- Indirect applications: auxiliary data structure for algorithms,
  component of other data structures

Array-based Vector
- Use an array V of size N; a variable n keeps track of the size of the
  vector (number of elements stored)
- Operation elemAtRank(r) is implemented in O(1) time by returning V[r]
Insertion
- In operation insertAtRank(r, o), we need to make room for the new
  element by shifting forward the n − r elements V[r], …, V[n − 1]
- In the worst case (r = 0), this takes O(n) time
  (figure: array V before the shift, after the shift, and after storing o
   at rank r)

Deletion
- In operation removeAtRank(r), we need to fill the hole left by the
  removed element by shifting backward the n − r − 1 elements
  V[r + 1], …, V[n − 1]
- In the worst case (r = 0), this takes O(n) time
  (figure: array V before and after the shift)

Performance
- In the array-based implementation of a Vector:
    - the space used by the data structure is O(n)
    - size, isEmpty, elemAtRank and replaceAtRank run in O(1) time
    - insertAtRank and removeAtRank run in O(n) time
- In an insertAtRank operation, when the array is full, instead of
  throwing an exception, we can replace the array with a larger one
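The O(n) shifting in insertAtRank and removeAtRank can be sketched directly (a fixed-capacity toy, class name illustrative):

```java
// Array-based vector showing the shifting described above.
public class ArrayVector {
    private Object[] V = new Object[16];
    private int n = 0; // number of elements stored

    public int size() { return n; }
    public Object elemAtRank(int r) { return V[r]; }

    public void insertAtRank(int r, Object o) {
        for (int i = n - 1; i >= r; i--) {
            V[i + 1] = V[i];          // shift forward V[r..n-1]
        }
        V[r] = o;
        n++;
    }

    public Object removeAtRank(int r) {
        Object o = V[r];
        for (int i = r; i < n - 1; i++) {
            V[i] = V[i + 1];          // shift backward V[r+1..n-1]
        }
        n--;
        return o;
    }
}
```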
Queues
6/8/2002 2:16 PM

The Queue ADT
- The Queue ADT stores arbitrary objects; insertions and deletions follow
  the first-in first-out (FIFO) scheme
- Main queue operations:
    - enqueue(object): inserts an element at the end of the queue
    - object dequeue(): removes and returns the element at the front of
      the queue
- Auxiliary queue operations:
    - object front(): returns the element at the front without removing it
    - integer size(): returns the number of elements stored
    - boolean isEmpty(): indicates whether no elements are stored
- Exceptions: attempting the execution of dequeue or front on an empty
  queue throws an EmptyQueueException

Applications of Queues
- Direct applications: waiting lines, access to shared resources (e.g.,
  printer), multiprogramming
- Indirect applications: auxiliary data structure for algorithms,
  component of other data structures
Array-based Queue
- Use an array of size N in a circular fashion
- Two variables keep track of the front and rear: f is the index of the
  front element, r is the index immediately past the rear element
- Array location r is kept empty
  (figure: normal configuration with Q[f..r−1] occupied; wrapped-around
   configuration)

Queue Operations
- We use the modulo operator (remainder of division)

Algorithm size()
    return (N − f + r) mod N

Algorithm isEmpty()
    return (f = r)

Algorithm enqueue(o)
    if size() = N − 1 then
        throw FullQueueException
    else
        Q[r] ← o
        r ← (r + 1) mod N

- Operation enqueue throws an exception if the array is full; this
  exception is implementation-dependent

Algorithm dequeue()
    if isEmpty() then
        throw EmptyQueueException
    else
        o ← Q[f]
        f ← (f + 1) mod N
        return o

- Operation dequeue throws an exception if the queue is empty; this
  exception is specified in the queue ADT

Queue Interface in Java
- Java interface corresponding to our Queue ADT
- Requires the definition of class EmptyQueueException
- No corresponding built-in Java class
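The circular-array queue above can be sketched as a small Java class (exception classes are simulated with RuntimeException for brevity):

```java
// Circular array-based queue: f is the front index, r the index past
// the rear; cell r is kept empty, so capacity N holds N - 1 elements.
public class ArrayQueue {
    private final Object[] Q;
    private final int N;
    private int f = 0, r = 0;

    public ArrayQueue(int capacity) { N = capacity; Q = new Object[N]; }

    public int size() { return (N - f + r) % N; }
    public boolean isEmpty() { return f == r; }

    public void enqueue(Object o) {
        if (size() == N - 1) throw new RuntimeException("FullQueueException");
        Q[r] = o;
        r = (r + 1) % N;          // wrap around the array
    }

    public Object dequeue() {
        if (isEmpty()) throw new RuntimeException("EmptyQueueException");
        Object o = Q[f];
        f = (f + 1) % N;
        return o;
    }
}
```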
Sequences
6/8/2002 2:15 PM

Singly Linked List
- A singly linked list is a concrete data structure consisting of a
  sequence of nodes
- Each node stores an element and a link to the next node
  (figure: chain of nodes with elem and next fields)

Position ADT
- The Position ADT models the notion of place within a data structure
  where a single object is stored
- It gives a unified view of diverse ways of storing data, such as
    - a cell of an array
    - a node of a linked list
List ADT
- The List ADT models a sequence of positions storing arbitrary objects
- Generic methods: size(), isEmpty()
- Query methods: isFirst(p), isLast(p)
- Accessor methods: first(), last(), before(p), after(p)
- Update methods:
    replaceElement(p, o), swapElements(p, q),
    insertBefore(p, o), insertAfter(p, o),
    insertFirst(o), insertLast(o), remove(p)

Doubly Linked List
- A node stores: an element, a link to the previous node (prev), and a
  link to the next node (next)
- Special header and trailer sentinel nodes bound the list
  (figure: doubly linked list with header, trailer, and element nodes)
Insertion
- We visualize operation insertAfter(p, X), which inserts a new node q
  holding X after the node p
  (figure: list A B C before the insertion; A B X C after)

Deletion
- We visualize operation remove(p), where p = last()
  (figure: list A B C D before the removal; A B C after)

Performance
- In the implementation of the List ADT by means of a doubly linked list:
    - the space used by a list with n elements is O(n)
    - the space used by each position of the list is O(1)
    - all the operations of the List ADT run in O(1) time
Sequence ADT
- The Sequence ADT is the union of the Vector and List ADTs; elements are
  accessed by rank or position
- Generic methods: size(), isEmpty()
- Vector-based methods: elemAtRank(r), replaceAtRank(r, o),
  insertAtRank(r, o), removeAtRank(r)
- List-based methods: first(), last(), before(p), after(p),
  replaceElement(p, o), swapElements(p, q), insertBefore(p, o),
  insertAfter(p, o), insertFirst(o), insertLast(o), remove(p)
- Bridge methods: atRank(r), rankOf(p)
Applications of Sequences
- The Sequence ADT is a basic, general-purpose data structure for storing
  an ordered collection of elements
- Direct applications: generic replacement for stack, queue, vector, or
  list
- Indirect applications: building block of more complex data structures

Array-based Implementation
- We use a circular array storing positions
- A position object stores: Element, Rank
- Indices f and l keep track of first and last positions
  (figure: circular array S of positions with indices f and l)
Sequence Implementations

Operation                           Array   List
size, isEmpty                       1       1
atRank, rankOf, elemAtRank          1       n
first, last, before, after          1       1
replaceElement, swapElements        1       1
replaceAtRank                       1       n
insertAtRank, removeAtRank          n       n
insertFirst, insertLast             1       1
insertAfter, insertBefore           n       1
remove                              n       1

Iterators
- An iterator abstracts the process of scanning through a collection of
  elements
- Methods of the ObjectIterator ADT:
    object object()
    boolean hasNext()
    object nextObject()
    reset()
- An iterator is typically associated with another data structure
- We can augment the Stack, Queue, Vector, List and Sequence ADTs with
  method: ObjectIterator elements()
Trees
6/8/2002 2:15 PM

What is a Tree
- In computer science, a tree is an abstract model of a hierarchical
  structure
- A tree consists of nodes with a parent-child relation
- Applications: organization charts, file systems, programming
  environments
  (example document tree: Make Money Fast! with children Stock Fraud,
   Ponzi Scheme, Bank Robbery)

Tree Terminology
- Root: node without parent
- Internal node: node with at least one child
- External node (leaf): node without children
- Other terms: ancestors, descendants, depth, height, subtree
  (example org chart: ComputersRUs with children Sales, Manufacturing,
   R&D; Sales has US and International; International has Europe, Asia,
   Canada; Manufacturing has Laptops, Desktops)

Tree ADT
- We use positions to abstract nodes
- Generic methods: integer size(), boolean isEmpty(),
  objectIterator elements(), positionIterator positions()
- Accessor methods: position root(), position parent(p),
  positionIterator children(p)
- Query methods: boolean isInternal(p), boolean isExternal(p),
  boolean isRoot(p)
- Update methods: swapElements(p, q), object replaceElement(p, o)

Preorder Traversal
- In a preorder traversal, a node is visited before its descendants
- Application: print a structured document

Algorithm preOrder(v)
    visit(v)
    for each child w of v
        preOrder(w)

  (example: the document tree 1. Motivations (1.1 Greed, 1.2 Avidity),
   2. Methods (2.1 Stock Fraud, 2.2 Ponzi Scheme, 2.3 Bank Robbery),
   References is visited top-down, left to right)
Postorder Traversal
- In a postorder traversal, a node is visited after its descendants
- Application: compute space used by files in a directory and its
  subdirectories

Algorithm postOrder(v)
    for each child w of v
        postOrder(w)
    visit(v)

  (example file system: cs16/ containing homeworks/ (h1c.doc 3K,
   h1nc.doc 2K), programs/ (DDR.java 10K, Stocks.java 25K,
   Robot.java 20K), and todo.txt 1K)
Binary Tree
- A binary tree is a tree with the following properties:
    - Each internal node has two children
    - The children of a node are an ordered pair
- Applications: arithmetic expressions, decision processes, searching

Decision Tree
- A binary tree associated with a decision process:
    - internal nodes: questions with yes/no answers
    - external nodes: decisions
- Example: dining decision, with questions such as "On expense account?"
  leading to Starbucks, Spikes, Al Forno, or Café Paragon

Properties of Binary Trees
- Notation: n number of nodes, e number of external nodes, i number of
  internal nodes, h height
- Properties:
    e = i + 1
    n = 2e − 1
    h ≤ i
    h ≤ (n − 1)/2
    e ≤ 2^h
    h ≥ log2 e
    h ≥ log2 (n + 1) − 1

BinaryTree ADT
- The BinaryTree ADT extends the Tree ADT, i.e., it inherits all the
  methods of the Tree ADT
- Additional methods: position leftChild(p), position rightChild(p),
  position sibling(p)
- Update methods may be defined by data structures implementing the
  BinaryTree ADT
Inorder Traversal
- In an inorder traversal, a node is visited after its left subtree and
  before its right subtree
- Application: draw a binary tree

Algorithm inOrder(v)
    if isInternal(v)
        inOrder(leftChild(v))
    visit(v)
    if isInternal(v)
        inOrder(rightChild(v))
Print Arithmetic Expressions
- Specialization of an inorder traversal: print the operand or operator
  when visiting a node, print "(" before traversing the left subtree, and
  print ")" after traversing the right subtree

Algorithm printExpression(v)
    if isInternal(v)
        print("(")
        inOrder(leftChild(v))
    print(v.element())
    if isInternal(v)
        inOrder(rightChild(v))
        print(")")

Evaluate Arithmetic Expressions
- Specialization of a postorder traversal: a recursive method returning
  the value of a subtree; when visiting an internal node, combine the
  values of the subtrees

Algorithm evalExpr(v)
    if isExternal(v)
        return v.element()
    else
        x ← evalExpr(leftChild(v))
        y ← evalExpr(rightChild(v))
        ◊ ← operator stored at v
        return x ◊ y
Specializations of EulerTour
- We show how to specialize class EulerTour to evaluate an arithmetic
  expression
- Assumptions: external nodes store Integer objects; internal nodes store
  Operator objects supporting a binary operation on Integers
Linked Structure for Trees
- A node is represented by an object storing: Element, Parent node,
  Sequence of children nodes

Linked Structure for Binary Trees
- A node is represented by an object storing: Element, Parent node,
  Left child node, Right child node
  (figure: linked nodes for a tree with elements B, D, C)

Java Implementation
- Tree interface
- BinaryTree interface extending Tree
- Classes implementing Tree and BinaryTree and providing
    - Constructors
    - Update methods, e.g., expandExternal(v), removeAboveExternal(w)
    - Print methods

Trees in JDSL
- Tree interfaces in JDSL: InspectableTree, Tree, InspectableBinaryTree,
  BinaryTree; implementing classes NodeTree and NodeBinaryTree
- The inspectable interfaces do not include update methods
Heaps
4/5/2002 14:4

Priority Queue ADT (2.4.1)
- A priority queue stores a collection of items
- An item is a pair (key, element)
- Main methods of the Priority Queue ADT:
    - insertItem(k, o): inserts an item with key k and element o
    - removeMin(): removes the item with smallest key and returns its
      element
- Additional methods:
    - minKey(): returns, but does not remove, the smallest key of an item
    - minElement(): returns, but does not remove, the element of an item
      with smallest key
    - size(), isEmpty()
- Applications: standby flyers, auctions, stock market

Total Order Relation
- Keys in a priority queue can be arbitrary objects on which a total
  order is defined:
    - Reflexive property: x ≤ x
    - Antisymmetric property: x ≤ y ∧ y ≤ x ⇒ x = y
    - Transitive property: x ≤ y ∧ y ≤ z ⇒ x ≤ z

Comparator ADT
- A comparator encapsulates the action of comparing two objects according
  to a given total order relation
- Methods, all with Boolean return type: isLessThan(x, y),
  isLessThanOrEqualTo(x, y), isEqualTo(x, y), isGreaterThan(x, y),
  isGreaterThanOrEqualTo(x, y), isComparable(x)

Sorting with a Priority Queue (2.4.2)
- We can use a priority queue to sort a set of comparable elements:
  insert the elements one by one with a series of insertItem(e, e)
  operations, then remove the elements in sorted order with a series of
  removeMin() operations

Algorithm PQ-Sort(S, C)
    Input sequence S, comparator C for the elements of S
    Output sequence S sorted in increasing order according to C
    P ← priority queue with comparator C
    while ¬S.isEmpty()
        e ← S.remove(S.first())
        P.insertItem(e, e)
    while ¬P.isEmpty()
        e ← P.removeMin()
        S.insertLast(e)

Sequence-based Priority Queue
- Implementation with an unsorted list
    - Performance: insertItem takes O(1) time since we can insert the
      item at the beginning or end of the sequence; removeMin and minKey
      take O(n) time since we have to traverse the entire sequence to
      find the smallest key
- Implementation with a sorted list
    - Performance: insertItem takes O(n) time since we have to find the
      place where to insert the item; removeMin and minKey take O(1) time
      since the smallest key is at the beginning
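PQ-Sort can be sketched in Java with the built-in `java.util.PriorityQueue`; since that class is heap-based, this instance of PQ-sort behaves like heap-sort rather than selection- or insertion-sort:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class PQSort {
    public static List<Integer> pqSort(List<Integer> s) {
        PriorityQueue<Integer> p = new PriorityQueue<>();
        while (!s.isEmpty()) {          // phase 1: move elements into P
            p.add(s.remove(0));
        }
        List<Integer> sorted = new ArrayList<>();
        while (!p.isEmpty()) {          // phase 2: removeMin in order
            sorted.add(p.poll());
        }
        return sorted;
    }
}
```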
Selection-Sort
- Selection-sort is the variation of PQ-sort where the priority queue is
  implemented with an unsorted sequence
- Inserting the elements with n insertItem operations takes O(n) time;
  removing them in sorted order with n removeMin operations takes time
  proportional to 1 + 2 + … + n
- Selection-sort runs in O(n^2) time

Insertion-Sort
- Insertion-sort is the variation of PQ-sort where the priority queue is
  implemented with a sorted sequence
- Inserting the elements with n insertItem operations takes time
  proportional to 1 + 2 + … + n; removing them in sorted order with n
  removeMin operations takes O(n) time
- Insertion-sort runs in O(n^2) time

What is a Heap (2.4.3)
- A heap is a binary tree storing keys at its internal nodes and
  satisfying the following properties:
    - Heap-order: for every internal node v other than the root,
      key(v) ≥ key(parent(v))
    - Complete binary tree: let h be the height of the heap; for
      i = 0, …, h − 1 there are 2^i nodes of depth i, and at depth h − 1
      the internal nodes are to the left of the external nodes
- The last node of a heap is the rightmost internal node of depth h − 1

Height of a Heap
- Theorem: a heap storing n keys has height O(log n)
- Proof sketch: there are 2^i keys at each depth i = 0, …, h − 2 and at
  least one key at depth h − 1, so n ≥ 1 + 2 + 4 + … + 2^(h−2) + 1, i.e.,
  n ≥ 2^(h−1); thus h ≤ log2 n + 1

Insertion into a Heap (2.4.3)
- Method insertItem of the priority queue ADT corresponds to the
  insertion of a key k into the heap
- The insertion algorithm consists of three steps:
    - Find the insertion node z (the new last node)
    - Store k at z and expand z into an internal node
    - Restore the heap-order property (discussed next)
  (example: inserting key 1 into a heap with keys 2, 5, 6, 7, 9)
Upheap
- After the insertion of a new key k, the heap-order property may be
  violated
- Algorithm upheap restores the heap-order property by swapping k along
  an upward path from the insertion node
- Upheap terminates when the key k reaches the root or a node whose
  parent has a key smaller than or equal to k
- Since a heap has height O(log n), upheap runs in O(log n) time

Removal from a Heap (2.4.3)
- Method removeMin of the priority queue ADT corresponds to the removal
  of the root key from the heap
- The removal algorithm consists of three steps:
    - Replace the root key with the key of the last node w
    - Compress w and its children into a leaf
    - Restore the heap-order property (downheap)

Downheap
- After replacing the root key with the key k of the last node, the
  heap-order property may be violated
- Algorithm downheap restores the heap-order property by swapping key k
  along a downward path from the root
- Downheap terminates when key k reaches a leaf or a node whose children
  have keys greater than or equal to k
- Since a heap has height O(log n), downheap runs in O(log n) time
Vector-based Heap Implementation (2.4.3)
- We can represent a heap with n keys by means of a vector of length
  n + 1
- For the node at rank i: the left child is at rank 2i and the right
  child is at rank 2i + 1
- Links between nodes are not explicitly stored; the cell at rank 0 is
  not used
- Operation insertItem corresponds to inserting at rank n + 1
- Operation removeMin corresponds to removing at rank n
- Yields in-place heap-sort

Heap-Sort (2.4.4)
- Consider a priority queue with n items implemented by means of a heap:
  the space used is O(n), methods insertItem and removeMin take O(log n)
  time, and methods size, isEmpty, minKey, and minElement take O(1) time
- Using a heap-based priority queue, we can sort a sequence of n elements
  in O(n log n) time; the resulting algorithm is called heap-sort
- Heap-sort is much faster than quadratic sorting algorithms, such as
  insertion-sort and selection-sort
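A vector-based min-heap with upheap and downheap can be sketched as follows, 1-indexed as in the slides (the class holds bare int keys rather than (key, element) items, for brevity):

```java
// Vector-based min-heap: children of rank i live at ranks 2i and 2i + 1.
public class VectorHeap {
    private int[] h = new int[2]; // cell 0 unused
    private int n = 0;            // number of keys

    public int size() { return n; }

    public void insertItem(int k) {        // add at rank n + 1, then upheap
        if (n + 1 == h.length) h = java.util.Arrays.copyOf(h, 2 * h.length);
        h[++n] = k;
        int i = n;
        while (i > 1 && h[i / 2] > h[i]) { // upheap: swap with parent
            int tmp = h[i / 2]; h[i / 2] = h[i]; h[i] = tmp;
            i /= 2;
        }
    }

    public int removeMin() {               // move last key to root, downheap
        int min = h[1];
        h[1] = h[n--];
        int i = 1;
        while (2 * i <= n) {
            int c = 2 * i;                 // pick the smaller child
            if (c + 1 <= n && h[c + 1] < h[c]) c++;
            if (h[i] <= h[c]) break;       // heap-order restored
            int tmp = h[i]; h[i] = h[c]; h[c] = tmp;
            i = c;
        }
        return min;
    }
}
```

Repeatedly calling removeMin after inserting n keys yields the keys in sorted order, which is exactly heap-sort.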
Bottom-up Heap Construction (2.4.3)
- We can construct a heap storing n given keys in O(n) time using a
  bottom-up construction with log n phases
- In phase i, pairs of heaps with 2^i − 1 keys are merged into heaps with
  2^(i+1) − 1 keys, by adding a new key on top and performing a downheap
  (example figures: successive merging phases of a bottom-up construction
   on 15 keys, ending with a single heap)
Analysis
We visualize the worst-case time of a downheap with a proxy path
that goes first right and then repeatedly goes left until the bottom
of the heap (this path may differ from the actual downheap path)
Since each node is traversed by at most two proxy paths, the total
number of nodes of the proxy paths is O(n)
Thus, bottom-up heap construction runs in O(n) time
Bottom-up heap construction is faster than n successive insertions
and speeds up the first phase of heap-sort
Priority Queues
6/8/2002 2:00 PM

- Example application: a stock market order book keyed by price, e.g.,
  Sell 100 IBM at $122; Sell 300 IBM at $120; Buy 500 IBM at $119;
  Buy 400 IBM at $118
- Keys in a priority queue can be arbitrary objects on which a total
  order is defined; two distinct items in a priority queue can have the
  same key
- The comparator is external to the keys being compared; when the
  priority queue needs to compare two keys, it uses its comparator

In-place Insertion-sort
- Instead of using an external data structure, we can implement
  selection-sort and insertion-sort in-place
- A portion of the input sequence itself serves as the priority queue
- For in-place insertion-sort:
    - we keep sorted the initial portion of the sequence
    - we can use swapElements instead of modifying the sequence
Dictionaries and Hash Tables
4/5/2002 15:1

Dictionary ADT
- The dictionary ADT models a searchable collection of key-element items
- Applications: address book, credit card authorization, mapping host
  names (e.g., cs16.net) to internet addresses (e.g., 128.148.34.101)

Log File
- A log file is a dictionary implemented by means of an unsorted sequence
- Performance:
    - insertItem takes O(1) time since we can insert the new item at the
      beginning or at the end of the sequence
    - findElement and removeElement take O(n) time since in the worst
      case (the item is not found) we traverse the entire sequence to
      look for an item with the given key
- The log file is effective only for dictionaries of small size or for
  dictionaries on which insertions are the most common operations, while
  searches and removals are rarely performed (e.g., historical record of
  logins to a workstation)

Hash Functions and Hash Tables
- A hash function h maps keys of a given type to integers in a fixed
  interval [0, N − 1]
- Example: h(x) = x mod N is a hash function for integer keys
- The integer h(x) is called the hash value of key x
- A hash table for a given key type consists of a hash function h and an
  array (called table) of size N; the goal is to store item (k, o) at
  index i = h(k)
  (example: a table of size 10,000 storing items with phone-number keys
   such as 025-612-0001, 981-101-0002, 451-229-0004, 200-751-9998)
Hash Code Maps (2.5.3)
- Integer cast: reinterpret the bits of the key as an integer
- Component sum: partition the bits of the key into components of fixed
  length and sum the components
- Polynomial accumulation: partition the bits of the key into a sequence
  of components of fixed length a0, a1, …, an−1 and evaluate the
  polynomial p(z) = a0 + a1 z + a2 z^2 + … + an−1 z^(n−1) at a fixed
  value z, ignoring overflows
- The polynomial can be evaluated in O(n) time using Horner's rule: the
  following polynomials are successively computed, each from the previous
  one in O(1) time
    p0(z) = an−1
    pi(z) = an−i−1 + z pi−1(z)   (i = 1, 2, …, n − 1)
  We have p(z) = pn−1(z)

Compression Maps (2.5.4)
- Division: h2(y) = y mod N
    - The size N of the hash table is usually chosen to be a prime
    - The reason has to do with number theory and is beyond the scope of
      this course
- Multiply, Add and Divide (MAD): h2(y) = (ay + b) mod N, with a and b
  nonnegative integers such that a mod N ≠ 0

Collision Handling (2.5.5)
- Collisions occur when different elements are mapped to the same cell
- Chaining: let each cell in the table point to a linked list of elements
  that map there
- Chaining is simple, but requires additional memory outside the table
  (example: keys 025-612-0001, 451-229-0004, 981-101-0004 chained in one
   cell)
Linear Probing
- Open addressing: the colliding item is placed in a different cell of
  the table
- Linear probing handles collisions by placing the colliding item in the
  next (circularly) available table cell; each inspected cell is a
  "probe"
- Example: h(x) = x mod 13; insert keys 18, 41, 22, 44, 59, 32, 31, 73,
  in this order
  (final table, cells 0-12: 41 in cell 2; 18, 44, 59, 32, 22, 31, 73 in
   cells 5-11)

Search with Linear Probing
- Operation findElement(k) starts at cell h(k) and probes consecutive
  cells until one of the following occurs: an item with key k is found,
  an empty cell is found, or N cells have been unsuccessfully probed

Algorithm findElement(k)
    i ← h(k)
    p ← 0
    repeat
        c ← A[i]
        if c = ∅
            return NO_SUCH_KEY
        else if c.key() = k
            return c.element()
        else
            i ← (i + 1) mod N
            p ← p + 1
    until p = N
    return NO_SUCH_KEY
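The probing loop can be sketched over a plain Integer array, with null marking an empty cell (the class and the NO_SUCH_KEY sentinel are illustrative; real tables store (key, element) items):

```java
// findElement with linear probing, h(k) = k mod N.
public class LinearProbing {
    static final int NO_SUCH_KEY = Integer.MIN_VALUE;
    private final Integer[] A;

    public LinearProbing(int n) { A = new Integer[n]; }

    private int h(int k) { return k % A.length; }

    public void insertItem(int k) {
        int i = h(k);
        while (A[i] != null) i = (i + 1) % A.length; // probe next cell
        A[i] = k;
    }

    public int findElement(int k) {
        int i = h(k);
        for (int p = 0; p < A.length; p++) {
            if (A[i] == null) return NO_SUCH_KEY;    // empty cell: absent
            if (A[i] == k) return A[i];              // found
            i = (i + 1) % A.length;                  // keep probing
        }
        return NO_SUCH_KEY;                          // probed all N cells
    }
}
```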
Updates with Linear Probing
- To handle insertions and deletions, we introduce a special object,
  called AVAILABLE, which replaces deleted elements
- removeElement(k): we search for an item with key k; if it is found, we
  replace it with AVAILABLE and return its element, else we return
  NO_SUCH_KEY
- insertItem(k, o): we throw an exception if the table is full; we start
  at cell h(k) and probe consecutive cells until a cell i is found that
  is empty or stores AVAILABLE, and we store item (k, o) in cell i

Double Hashing
- Double hashing uses a secondary hash function d(k) and handles
  collisions by placing an item in the first available cell of the
  series (i + j d(k)) mod N for j = 0, 1, …, N − 1
- The secondary hash function d(k) cannot have zero values
- The table size N must be a prime to allow probing of all the cells
- Common choice of compression map for the secondary hash function:
  d2(k) = q − k mod q, where q < N and q is a prime
- The possible values for d2(k) are 1, 2, …, q

Example of Double Hashing
- Consider a hash table storing integer keys that handles collisions with
  double hashing: N = 13, h(k) = k mod 13, d(k) = 7 − k mod 7
- Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order:

    k    h(k)   d(k)   Probes
    18   5      3      5
    41   2      1      2
    22   9      6      9
    44   5      5      5, 10
    59   7      4      7
    32   6      3      6
    31   5      4      5, 9, 0
    73   8      4      8

  (final table, cells 0-12: 31 in cell 0, 41 in cell 2, 18 in cell 5,
   32 in cell 6, 59 in cell 7, 73 in cell 8, 22 in cell 9, 44 in cell 10)
Performance of Hashing
- In the worst case, searches, insertions and removals on a hash table
  take O(n) time; the worst case occurs when all the keys inserted into
  the dictionary collide
- The load factor α = n/N affects the performance of a hash table;
  assuming that the hash values are like random numbers, it can be shown
  that the expected number of probes for an insertion with open
  addressing is 1/(1 − α)
- In practice, hashing is very fast provided the load factor is not close
  to 100%
- Applications of hash tables: small databases, compilers, browser caches

Universal Hashing (2.5.6)
- A family of hash functions is universal if, for any two keys j ≠ k, the
  probability that h(j) = h(k) is at most 1/N when h is chosen at random
  from the family
- Example: the set of functions h(k) = ((ak + b) mod p) mod N, where p is
  a prime larger than every key and 0 < a < p, 0 ≤ b < p, is universal
- Proof idea for distinctness before the mod N step: if aj + b ≡ ak + b
  (mod p) for keys j and k, then a(j − k) is a multiple of p; but a and
  j − k are both nonzero with absolute value less than p, so a(j − k) = 0,
  i.e., j = k (contradiction)
Dictionaries
6/8/2002 2:01 PM

Dictionary ADT
- The dictionary ADT models a searchable collection of key-element items;
  multiple items with the same key are allowed
- Dictionary ADT methods:
    - findElement(k): if the dictionary has an item with key k, returns
      its element, else returns the special element NO_SUCH_KEY
    - insertItem(k, o): inserts item (k, o) into the dictionary
    - removeElement(k): if the dictionary has an item with key k, removes
      it from the dictionary and returns its element, else returns the
      special element NO_SUCH_KEY
    - size(), isEmpty()
    - keys(), elements()
- Applications: address book, credit card authorization, mapping host
  names (e.g., cs16.net) to internet addresses (e.g., 128.148.34.101)

Log File
- A log file is a dictionary implemented by means of an unsorted sequence
- Performance:
    - insertItem takes O(1) time since we can insert the new item at the
      beginning or at the end of the sequence
    - findElement and removeElement take O(n) time since in the worst
      case (the item is not found) we traverse the entire sequence to
      look for an item with the given key
- The log file is effective only for dictionaries of small size or for
  dictionaries on which insertions are the most common operations, while
  searches and removals are rarely performed (e.g., historical record of
  logins to a workstation)

Lookup Table
- A lookup table is a dictionary implemented by means of a sorted
  sequence; findElement takes O(log n) time using binary search, while
  insertItem and removeElement take O(n) time in the worst case because
  of shifting

Binary Search
- Binary search performs operation findElement(k) on a dictionary
  implemented by means of an array-based sequence, sorted by key
- At each step, the number of candidate items is halved: compare the key
  at the middle position m with k, and recur on the low or high half
- It terminates after O(log n) steps, when the low, middle, and high
  indices coincide (l = m = h)
  (example figure: findElement(7) on a sorted sequence containing the
   keys 11, 14, 16, 18, 19 among others, with l, m, h converging)
Binary Search Trees (3.1.2)
- A binary search tree is a binary tree storing keys at its internal
  nodes, such that for any node v, keys in the left subtree of v are
  ≤ key(v) and keys in the right subtree of v are ≥ key(v)
- An inorder traversal of a binary search tree visits the keys in
  increasing order

Search (3.1.3)
- To search for a key k, we trace a downward path starting at the root
- The next node visited depends on the outcome of the comparison of k
  with the key of the current node
- If we reach a leaf, the key is not found and we return NO_SUCH_KEY

Algorithm findElement(k, v)
    if T.isExternal(v)
        return NO_SUCH_KEY
    if k < key(v)
        return findElement(k, T.leftChild(v))
    else if k = key(v)
        return element(v)
    else { k > key(v) }
        return findElement(k, T.rightChild(v))

  (example: findElement(4) in a tree with root 6 follows the path
   6 → 2 → 4)

Insertion (3.1.4)
- To perform operation insertItem(k, o), we search for key k
- Assume k is not already in the tree, and let w be the leaf reached by
  the search
- We insert k at node w and expand w into an internal node
- Example: insert 5

Deletion (3.1.5)
- To perform operation removeElement(k), we search for key k
- Assume key k is in the tree, and let v be the node storing k
- If node v has a leaf child w, we remove v and w from the tree with
  operation removeAboveExternal(w)
- Example: remove 4

Deletion (cont.)
- We consider the case where the key k to be removed is stored at a node
  v whose children are both internal:
    - we find the internal node w that follows v in an inorder traversal
    - we copy key(w) into node v
    - we remove node w and its left child z (which must be a leaf) by
      means of operation removeAboveExternal(z)
- Example: remove 3

Performance (3.1.6)
- Consider a dictionary with n items implemented by means of a binary
  search tree of height h:
    - the space used is O(n)
    - methods findElement, insertItem and removeElement take O(h) time
- The height h is O(n) in the worst case and O(log n) in the best case
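The search and insertion operations can be sketched over a simple node structure; null links stand in for the external nodes, and NO_SUCH_KEY is an illustrative sentinel:

```java
// Binary search tree with findElement and insertItem.
public class BST {
    static final int NO_SUCH_KEY = Integer.MIN_VALUE;

    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    Node root;

    public int findElement(int k) {
        Node v = root;
        while (v != null) {              // null plays the external node
            if (k < v.key) v = v.left;
            else if (k == v.key) return v.key;
            else v = v.right;
        }
        return NO_SUCH_KEY;
    }

    public void insertItem(int k) {      // insert at the leaf the search reaches
        if (root == null) { root = new Node(k); return; }
        Node v = root;
        while (true) {
            if (k < v.key) {
                if (v.left == null) { v.left = new Node(k); return; }
                v = v.left;
            } else {
                if (v.right == null) { v.right = new Node(k); return; }
                v = v.right;
            }
        }
    }
}
```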
Dictionaries
4/5/2002 15:1

Ordered Dictionaries
- Keys are assumed to come from a total order
- New operations: closestKeyBefore(k), closestElemBefore(k),
  closestKeyAfter(k), closestElemAfter(k)
- Binary search over a sorted array-based sequence, and binary search
  trees with their search (3.1.3), insertion (3.1.4), deletion (3.1.5),
  and performance (3.1.6) analyses, support the ordered dictionary
  operations
Red-Black Trees
4/10/2002 11:1

AVL Trees
An AVL tree is a binary search tree that is height-balanced: for every internal node v, the heights of the children of v can differ by at most 1.
[Figure: an AVL tree with keys 17, 32, 44, 48, 50, 62, 78, 88, with subtree heights shown]

Height of an AVL Tree
Fact: the height of an AVL tree storing n keys is O(log n).
Proof sketch: let n(h) be the minimum number of internal nodes of an AVL tree of height h. Then n(1) = 1, n(2) = 2, and for h > 2, n(h) = 1 + n(h-1) + n(h-2) > 2n(h-2). Hence
    n(h) > 2n(h-2), n(h) > 4n(h-4), n(h) > 8n(h-6), ..., and (by induction) n(h) > 2^i n(h-2i)
Solving for i shows h < 2 log n(h) + 2, so the height of an AVL tree is O(log n).

Insertion
Insertion begins as in a binary search tree, so the new node always takes the place of an external node; the tree may then become unbalanced at an ancestor of the inserted node.
[Figure: inserting 54 into the example tree; before insertion the tree is balanced, after insertion it is unbalanced at the node labeled z]

Trinode Restructuring
Let z be the unbalanced node, y its taller child, and x the taller child of y. Let (a, b, c) be an inorder listing of x, y, z, and let T0, T1, T2, T3 be the four subtrees hanging off them in left-to-right order. Restructuring performs the rotations needed to make b the topmost of the three nodes, with a and c as its children and T0, T1, T2, T3 reattached in order; the subtree is then balanced.
[Figure: the restructuring cases with nodes labeled a=z, b=y, c=x (and the mirror labelings), each yielding the same balanced subtree]
Restructuring (as Single Rotations)
Single rotations: when b = y, the trinode restructuring amounts to a single rotation about y (a left rotation when a = z, b = y, c = x; a right rotation in the mirror case).
[Figure: the two single-rotation cases with subtrees T0, T1, T2, T3]

Restructuring (as Double Rotations)
Double rotations: when b = x, the trinode restructuring amounts to a double rotation, first about x and its parent, then about x and its new parent.
[Figure: the two double-rotation cases with subtrees T0, T1, T2, T3]

Removal
Removal begins as in a binary search tree, so the removed node leaves an external node in its place; the parent w of that node may then be the start of an imbalance.
[Figure: the example tree before deletion of 32 and after deletion]

Rebalancing after a Removal
Let z be the first unbalanced node encountered while travelling up the tree from w. Also, let y be the child of z with the larger height, and let x be the child of y with the larger height. We perform restructure(x) to restore balance at z. As this restructuring may upset the balance of another node higher in the tree, we must continue checking for balance until the root of T is reached.

Running Times for AVL Trees
A single restructure is O(1), using a linked-structure binary tree.
find is O(log n): height of tree is O(log n), no restructures needed.
insert is O(log n): the initial find is O(log n), and restructuring up the tree, maintaining heights, is O(log n).
remove is O(log n): the initial find is O(log n), and restructuring up the tree, maintaining heights, is O(log n).
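To make the height-balance condition concrete, here is a small runnable sketch (the node layout is the same illustrative assumption as before) that computes heights and checks the AVL property at every node:

```python
# Minimal sketch: verify the AVL height-balance property.
# The Node layout is an illustrative assumption, not the book's API.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(v):
    """Height of the subtree rooted at v (0 for an external node)."""
    if v is None:
        return 0
    return 1 + max(height(v.left), height(v.right))

def is_avl(v):
    """True if every node's children differ in height by at most 1."""
    if v is None:
        return True
    if abs(height(v.left) - height(v.right)) > 1:
        return False
    return is_avl(v.left) and is_avl(v.right)

balanced = Node(44, Node(17, None, Node(32)),
                Node(78, Node(50, Node(48), Node(62)), Node(88)))
skewed = Node(1, None, Node(2, None, Node(3)))
```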
(2,4) Trees
6/8/2002 2:08 PM

Outline: Definition; Search; Insertion; Deletion

Multi-Way Searching
Searching in a multi-way search tree is similar to searching in a binary search tree: at each internal node we compare the search key with the node's keys, and either stop (key found) or continue the search in the appropriate child.
[Figure: multi-way search tree with nodes such as 10 14, 2 6 8, 15, 24, 27, 30, 32]

(2,4) Tree
A (2,4) tree is a multi-way search tree in which every internal node has 2, 3, or 4 children and all the external nodes have the same depth.
[Figure: a (2,4) tree with root 10 15 24 and children such as 2 8, 12, 18, 27 32 30]
Insertion
We insert a new item (k, o) at the parent v of the leaf reached by searching for k. We preserve the depth property, but we may cause an overflow (i.e., node v may become a 5-node).
[Figure: inserting 30 into a (2,4) tree]

Overflow and Split
We handle an overflow at a 5-node v with a split operation: let v1 ... v5 be the children of v and k1 ... k4 the keys of v. Node v is replaced by nodes v' and v'': v' is a 3-node with keys k1 k2 and children v1 v2 v3, and v'' is a 2-node with key k4 and children v4 v5. Key k3 is inserted into the parent u of v (a new root may be created). The overflow may propagate to the parent node u.
[Figure: splitting the 5-node 27 30 32 35 into v' = 27 30 and v'' = 35, with 32 moving up to the parent 10 15 24]

Algorithm insertItem(k, o)
    1. We search for key k to locate the insertion node v
    2. We add the new item (k, o) at node v
    3. while overflow(v)
           if isRoot(v)
               create a new empty root above v
           v ← split(v)

Analysis of Insertion
Let T be a (2,4) tree with n items; the height of T is O(log n). Step 1 takes O(log n) time because we visit O(log n) nodes; step 2 takes O(1) time; step 3 takes O(log n) time because each split takes O(1) time and we perform O(log n) splits. Thus, an insertion in a (2,4) tree takes O(log n) time.
Deletion
We reduce the deletion of an item to the case where the item is at a node with leaf children: otherwise, we replace the item with its inorder successor and delete the latter.
[Figure: deleting an item from the (2,4) tree with nodes such as 10 14, 2 5 7, 9]

Underflow and Transfer
Deleting an item from a node v may cause an underflow, where node v becomes a 1-node with one child and no keys. To handle an underflow at node v with parent u, when an adjacent sibling w of v is a 3-node or a 4-node we perform a transfer operation:
    1. we move a child of w to v
    2. we move an item from u to v
    3. we move an item from w to u
After a transfer, no underflow occurs.
[Figure: transfer example with parent u = 4 9 and siblings 2 and 6 8, yielding parent 4 8]

Analysis of Deletion
In a deletion operation, we visit O(log n) nodes to locate the node holding the item, and each transfer or fusion used to repair an underflow takes O(1) time; thus, deleting an item from a (2,4) tree takes O(log n) time.
Implementing a Dictionary
Comparison of efficient dictionary implementations:

                Search              Insert              Delete              Notes
Hash Table      1 expected          1 expected          1 expected          no ordered dictionary methods; simple to implement
Skip List       log n high prob.    log n high prob.    log n high prob.    randomized insertion; simple to implement
(2,4) Tree      log n worst-case    log n worst-case    log n worst-case    complex to implement

6/8/2002 2:08 PM
(2,4) Trees
Red-Black Trees
6/8/2002 2:20 PM

Outline: Definition; Height; Insertion (restructuring, recoloring); Deletion (restructuring, recoloring, adjustment)

From (2,4) Trees to Red-Black Trees
A red-black tree is a representation of a (2,4) tree by means of a binary search tree whose nodes are colored red or black. In comparison with its associated (2,4) tree, a red-black tree has the same logarithmic time performance and a simpler implementation with a single node type.
[Figure: the 4-node 2 6 7 becomes a black node 6 with red children 2 and 7; a 3-node has two equivalent representations (OR)]

Definition
A red-black tree can also be defined as a binary search tree that satisfies the following properties:
    Root Property: the root is black
    External Property: every leaf is black
    Internal Property: the children of a red node are black
    Depth Property: all the leaves have the same black depth
[Figure: an example red-black tree with keys such as 3, 4, 5, 6, 7, 12, 15, 21]
Insertion
To perform operation insertItem(k, o), we execute the insertion algorithm for binary search trees and color red the newly inserted node z, unless it is the root. We preserve the root, external and depth properties. If the parent v of z is black, we also preserve the internal property and we are done; else (v is red) we have a double red (a violation of the internal property), which requires a reorganization of the tree.
[Figure: example where the insertion of 4 causes a double red]

Restructuring
A restructuring remedies a child-parent double red when the parent red node has a black sibling. It is equivalent to restoring the correct representation of a 4-node. The internal property is restored and the other properties are preserved.
[Figure: restructuring of the double red at 4, 6, 7]

Restructuring (cont.)
There are four restructuring configurations, depending on whether the double-red nodes are left or right children; all four produce the same balanced subtree (e.g., the keys 2, 4, 6, 7 always end with 4 and 7 as children of 6).
[Figure: the four configurations 4 6 7, 4 6 7, 2 4 6 7, ...]
Recoloring
A recoloring remedies a child-parent double red when the parent red node has a red sibling. The parent v and its sibling w become black and the grandparent u becomes red, unless it is the root. It is equivalent to performing a split on a 5-node. The double red violation may propagate to the grandparent u.
[Figure: recoloring example with keys 2, 4, 6, 7]
Analysis of Insertion
Algorithm insertItem(k, o)
    1. We search for key k to locate the insertion node z
    2. We add the new item (k, o) at node z and color z red
    3. while doubleRed(z)
           if isBlack(sibling(parent(z)))
               z ← restructure(z)
               return
           else { sibling(parent(z)) is red }
               z ← recolor(z)

Recall that a red-black tree has O(log n) height. Step 1 takes O(log n) time because we visit O(log n) nodes; step 2 takes O(1) time; step 3 takes O(log n) time because we perform O(log n) recolorings, each taking O(1) time, and at most one restructuring, taking O(1) time. Thus, an insertion in a red-black tree takes O(log n) time.
Deletion
Deletion from a red-black tree proceeds as in a binary search tree; removing a black node may violate the depth property, which is remedied by restructuring, recoloring, and adjustment operations (the deletion cases depend on the colors of the sibling w of the removed node's parent v and of w's children, e.g. Case 3: y is red).
[Figure: deletion cases on nodes v, w, y]

Red-Black Trees versus (2,4) Trees

Insertion:
    restructuring   result: change of 4-node representation
    recoloring      result: split
Deletion:
    restructuring   result: transfer
    recoloring      result: fusion
    adjustment      result: change of 3-node representation (restructuring or recoloring follows)
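A small runnable sketch (node layout and color encoding are illustrative assumptions) that checks the root, internal, and depth properties on a hand-built tree:

```python
# Minimal sketch: check red-black properties on an explicit tree.
# The Node layout is an illustrative assumption; None plays the role
# of a (black) external node.
RED, BLACK = "red", "black"

class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color, self.left, self.right = key, color, left, right

def internal_ok(v):
    """Internal property: the children of a red node are black."""
    if v is None:
        return True
    if v.color == RED:
        for c in (v.left, v.right):
            if c is not None and c.color == RED:
                return False
    return internal_ok(v.left) and internal_ok(v.right)

def black_depths(v, d=0):
    """Set of black depths of all external (None) positions."""
    if v is None:
        return {d + 1}                  # external nodes count as black
    d += 1 if v.color == BLACK else 0
    return black_depths(v.left, d) | black_depths(v.right, d)

def is_red_black(root):
    return (root.color == BLACK and internal_ok(root)
            and len(black_depths(root)) == 1)

good = Node(6, BLACK, Node(4, RED, Node(2, BLACK), Node(5, BLACK)),
            Node(7, BLACK))
bad = Node(6, BLACK, Node(4, RED, Node(2, RED), None), Node(7, BLACK))
```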
Skip Lists
4/5/2002 15:3

Outline: Search (3.5.1); Insertion (3.5.2); Deletion (3.5.2); Implementation; Analysis (3.5.3): space usage, search and update times

What is a Skip List
A skip list for a set S of distinct (key, element) items is a series of lists S0, S1, ..., Sh such that each list Si contains the special keys -∞ and +∞, list S0 contains the keys of S in nondecreasing order, each list is a subsequence of the previous one, and list Sh contains only the two special keys.
[Figure: skip list with levels S0-S3 over keys such as 10, 15, 23, 36]

Search
We search for a key x as follows: we start at the first position of the top list; at the current position p, we compare x with y ← key(after(p)). If x = y, we return element(after(p)); if x > y, we scan forward; if x < y, we drop down. If we try to drop down past the bottom list, we return NO_SUCH_KEY.
[Figure: search path in a skip list with keys 12, 23, 26, 31, 34, 44, 56, 64, 78]
Randomized Algorithms
A randomized algorithm performs coin tosses (i.e., uses random bits) to control its execution; for example:
    b ← random()
    if b = 0
        do A
    else { b = 1 }
        do B
Its running time depends on the outcomes of the coin tosses. We analyze the expected running time of a randomized algorithm under the assumption that the coin tosses are fair. The worst-case running time of a randomized algorithm is often large but has very low probability (e.g., it occurs when all the coin tosses give heads). We use a randomized algorithm to insert items into a skip list.

Insertion
To insert an item (x, o) into a skip list, we use a randomized algorithm: we repeatedly toss a coin until we get tails, and denote with i the number of times the coin came up heads; if i ≥ h, we add new lists to the skip list, each containing only the two special keys; we search for x and find the positions p0, p1, ..., pi of the items with the largest key less than x in each list S0, S1, ..., Si; for j ← 0, ..., i, we insert item (x, o) into list Sj after position pj.
[Figure: insertion of key 15, which ends up in lists S0, S1, S2]
Deletion
To remove an item with key x from a skip list, we proceed as follows: we search for x in the skip list and find the positions p0, p1, ..., pi of the items with key x, where position pj is in list Sj; we remove positions p0, p1, ..., pi from the lists S0, S1, ..., Si; we remove all but one list containing only the two special keys.
[Figure: removal of the item with key 34]

Implementation
We can implement a skip list with quad-nodes. A quad-node stores: the item, and links to the nodes before, after, below, and above.

Space Usage
The space used by a skip list depends on the random bits used by each invocation of the insertion algorithm. Consider a skip list with n items: the expected size of list Si is n/2^i, so the expected total number of nodes is
    Σ_{i=0}^{h} n/2^i = n Σ_{i=0}^{h} 1/2^i < 2n
Thus, the expected space usage of a skip list with n items is O(n).
Height
The running time of the search and insertion algorithms is affected by the height h of the skip list. We show that with high probability, a skip list with n items has height O(log n). We use the following additional probabilistic fact: Fact 3: if each of n events has probability p, the probability that at least one event occurs is at most np. By Fact 3, the probability that list Si has at least one item is at most n/2^i; picking i = 3 log n, that probability is at most n/2^(3 log n) = 1/n^2. Thus a skip list with n items has height at most 3 log n with probability at least 1 - 1/n^2.

Search and Update Times
The search time is proportional to the number of drop-down steps plus the number of scan-forward steps. The drop-down steps are bounded by the height of the skip list, hence O(log n) with high probability. A scan-forward step is associated with a former coin toss that gave tails, so the expected number of scan-forward steps per level is constant. Thus the expected search time is O(log n), and insertions and deletions have the same expected bounds.

Summary
A skip list is a data structure for dictionaries that uses a randomized insertion algorithm. In a skip list with n items, the expected space used is O(n) and the expected search, insertion and deletion times are O(log n). Using a more complex probabilistic analysis, one can show that these performance bounds also hold with high probability. Skip lists are fast and simple to implement in practice.
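As a minimal sketch of the search rule, representing each level Si as a plain sorted Python list rather than quad-nodes (an illustrative shortcut, not the book's implementation):

```python
import bisect

# Minimal sketch: skip-list style search over explicit levels.
# Each level is a sorted list; level 0 holds all keys.
levels = [
    [12, 23, 26, 31, 34, 44, 56, 64, 78],  # S0
    [23, 31, 34, 64],                      # S1
    [31, 64],                              # S2
]

def skip_search(levels, x):
    """Scan forward on each level, then drop down, top list first."""
    for level in reversed(levels):          # start at the top list
        i = bisect.bisect_right(level, x)   # scan-forward steps
        if i and level[i - 1] == x:
            return True                     # would return element here
    return False
```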
Red-Black Trees
4/5/2002 15:2

Splay Trees
A splay tree is a binary search tree in which a node is splayed (moved to the root through rotations) after it is accessed. BST Rules: items are stored only at internal nodes; keys stored in the left subtree of a node v are less than or equal to the key at v; keys stored in the right subtree of v are greater than or equal to the key at v; an inorder traversal visits the items in key order.
[Figure: example splay tree with items such as (10,A), (5,H), (14,J), (7,T), (21,O), (35,R), (40,X)]

Splaying
Which restructuring step we apply depends on the position of x (with p its parent and g its grandparent):
    is x the root? yes: stop
    is x a child of the root? yes: zig. If x is the left child of the root, right-rotate about the root; if x is the right child, left-rotate about the root
    is x a left-left grandchild? zig-zig: right-rotate about g, then right-rotate about p
    is x a right-right grandchild? zig-zig: left-rotate about g, then left-rotate about p
    is x a right-left grandchild? zig-zag: left-rotate about p, then right-rotate about g
    is x a left-right grandchild? zig-zag: right-rotate about p, then left-rotate about g
We repeat until x is the root.
[Figure: flowchart of the splaying cases, with subtrees T1, T2, T3]
Visualizing the Splaying Cases
[Figure: the zig, zig-zig, and zig-zag cases drawn on nodes x, y, z with subtrees T1, T2, T3, T4]

Splaying Example
Let x = (8,N). x is the right child of its parent, which is the left child of the grandparent, so we apply zig-zag (before applying the rotation; after the second rotation). x is not yet the root, so we splay again; after the second splay, x reaches the root.
[Figure: the tree before rotating, after the first zig-zag, and after the second splay]

Example Result of Splaying
[Figure: tree of (key, value) items before and after splaying, e.g. (10,A), (20,Z), (37,P), (40,X)]

Which Nodes are Splayed?
After each operation, we splay a node:
    method          splay node
    findElement     if key found, use that node; if key not found, use the parent of the ending external node
    insertElement   use the new node containing the item inserted
    removeElement   use the parent of the internal node that was actually removed from the tree (the parent of the node that the removed item was swapped with)
Amortized Analysis of Splay Trees
Running time of each operation is proportional to the time for splaying. Define rank(v) as the logarithm (base 2) of the number of nodes in the subtree rooted at v. Costs: zig = $1, zig-zig = $2, zig-zag = $2. Thus, the cost for splaying a node at depth d is $d. Imagine that we store rank(v) cyber-dollars at each node v of the splay tree (just for the sake of analysis).

Cost per zig, zig-zig, and zig-zag
Doing a zig-zig or zig-zag at a node x costs at most 3(rank'(x) - rank(x)) - 2, where rank'(x) is the rank of x after the substep. Proof: See Theorem 3.9, Page 192.
[Figure: the zig, zig-zig, and zig-zag substeps on x, y, z with subtrees T1-T4]

Cost of Splaying
Cost of splaying a node x at depth d of a tree rooted at r: splaying takes d/2 splaying substeps, so
    cost ≤ Σ_{i=1}^{d/2} cost_i ≤ 3(rank(r) - rank(x)) - d + 2

Performance of Splay Trees
Recall: the rank of a node is the logarithm of its size. Thus, the amortized cost of any splay operation is O(log n).
Locators
5/15/2002 11:36 AM

Outline: Locators (2.4.4); Locator-based methods (2.4.4); Implementation; Positions vs. Locators

Locators
A locator identifies and tracks a (key, element) item within a data structure; the intuitive notions are a claim check or a reservation number. Application example: orders to buy and sell a stock, where the key is the price and the element is the number of shares; a locator lets us track an order even as it moves within the underlying structure.
[Figure: items with keys 1, 3, 4, 8, 9 and elements g, a, e, c, b]

Locator-based Methods
Locator-based dictionary methods parallel the key-based methods, but accept or return locators instead of keys and elements.

Implementation
The locator is an object storing: the key; the element; the position (or rank) of the item in the underlying structure.

Positions vs. Locators
Position: represents a place in a data structure; related to other positions in the data structure (e.g., previous/next or parent/child); implemented as a node or an array cell. A locator, in contrast, sticks with the item itself.
Merge Sort
4/9/2002 10:0

Merge-Sort
Outline: the merge-sort algorithm; merging two sorted sequences; the merge-sort tree; execution example; analysis.
Merge-sort is a sorting algorithm based on the divide-and-conquer paradigm. Like heap-sort, it uses a comparator and has O(n log n) running time. Unlike heap-sort, it does not use an auxiliary priority queue and it accesses data in a sequential manner.
[Figure: merge-sort tree for 7 2 9 4, producing 2 4 7 9]

Divide-and-Conquer
Divide-and-conquer is a general algorithm design paradigm: divide the input data S in two disjoint subsets S1 and S2; recur, solving the subproblems associated with S1 and S2; conquer, combining the solutions for S1 and S2 into a solution for S. Merge-sort on an input sequence S with n elements consists of three steps: divide (partition S into two sequences S1 and S2 of about n/2 elements each), recur (recursively sort S1 and S2), and conquer (merge S1 and S2 into a sorted sequence).

Algorithm mergeSort(S, C)
    Input: sequence S with n elements, comparator C
    Output: sequence S sorted according to C
    if S.size() > 1
        (S1, S2) ← partition(S, n/2)
        mergeSort(S1, C)
        mergeSort(S2, C)
        S ← merge(S1, S2)

Merging Two Sorted Sequences
The conquer step of merge-sort consists of merging two sorted sequences A and B into a sorted sequence S containing the union of the elements of A and B. Merging two sorted sequences, each with n/2 elements and implemented by means of a doubly linked list, takes O(n) time.

Algorithm merge(A, B)
    Input: sequences A and B with n/2 elements each
    Output: sorted sequence of A ∪ B
    S ← empty sequence
    while ¬A.isEmpty() ∧ ¬B.isEmpty()
        if A.first().element() < B.first().element()
            S.insertLast(A.remove(A.first()))
        else
            S.insertLast(B.remove(B.first()))
    while ¬A.isEmpty()
        S.insertLast(A.remove(A.first()))
    while ¬B.isEmpty()
        S.insertLast(B.remove(B.first()))
    return S
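A runnable sketch of mergeSort and merge above, using plain Python lists in place of the book's sequence ADT (an illustrative assumption):

```python
# Runnable sketch of merge-sort; Python lists stand in for the
# sequence ADT used in the pseudocode.
def merge(A, B):
    """Merge two sorted lists into one sorted list, as in merge(A, B)."""
    S = []
    i = j = 0
    while i < len(A) and j < len(B):
        if A[i] < B[j]:
            S.append(A[i]); i += 1
        else:
            S.append(B[j]); j += 1
    S.extend(A[i:])        # drain whichever input remains
    S.extend(B[j:])
    return S

def merge_sort(S):
    """Divide, recur, conquer."""
    if len(S) <= 1:
        return S
    mid = len(S) // 2
    return merge(merge_sort(S[:mid]), merge_sort(S[mid:]))
```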
Merge-Sort Tree
An execution of merge-sort is depicted by a binary tree: each node represents a recursive call of merge-sort and stores the unsorted sequence before the execution and its partition, and the sorted sequence at the end of the execution; the root is the initial call, and the leaves are calls on subsequences of size 0 or 1.

Execution Example
[Figure: step-by-step execution of merge-sort on 7 2 9 4 3 8 6 1, showing the partition into 7 2 9 4 and 3 8 6 1, the recursive calls down to single elements, and the merges producing 2 4 7 9, 1 3 6 8, and finally 1 2 3 4 6 7 8 9. At depth i there are 2^i calls on sequences of size n/2^i.]
Analysis of Merge-Sort
The height of the merge-sort tree is O(log n): at each recursive call we divide the sequence in half. The overall amount of work done at the nodes of depth i is O(n): we partition and merge 2^i sequences of size n/2^i. Thus, the total running time of merge-sort is O(n log n).

Summary of Sorting Algorithms
Algorithm        Time          Notes
selection-sort   O(n^2)        slow; in-place; for small data sets (< 1K)
insertion-sort   O(n^2)        slow; in-place; for small data sets (< 1K)
heap-sort        O(n log n)    fast; in-place; for large data sets (1K - 1M)
merge-sort       O(n log n)    fast; sequential data access; for huge data sets (> 1M)
Quick-Sort
4/9/2002 10:1

Outline: the quick-sort algorithm; partition step; quick-sort tree; execution example
[Figure: quick-sort tree for 7 4 9 6 2, producing 2 4 6 7 9]

Quick-Sort
Quick-sort is a randomized sorting algorithm based on the divide-and-conquer paradigm: divide, by picking a random element x (called the pivot) and partitioning S into L (elements less than x), E (elements equal to x), and G (elements greater than x); recur, sorting L and G; conquer, joining L, E and G.

Partition
We partition an input sequence as follows: we remove, in turn, each element y from S and insert y into L, E or G, depending on the result of the comparison with the pivot x. Each insertion and removal is at the beginning or at the end of a sequence, and hence takes O(1) time; thus, the partition step of quick-sort takes O(n) time.

Algorithm partition(S, p)
    Input: sequence S, position p of pivot
    Output: subsequences L, E, G of the elements of S less than, equal to, or greater than the pivot, resp.
    L, E, G ← empty sequences
    x ← S.remove(p)
    while ¬S.isEmpty()
        y ← S.remove(S.first())
        if y < x
            L.insertLast(y)
        else if y = x
            E.insertLast(y)
        else { y > x }
            G.insertLast(y)
    return L, E, G

Quick-Sort Tree
Each node represents a recursive call of quick-sort and stores the unsorted sequence before the execution and its pivot, and the sorted sequence at the end of the execution; the root is the initial call, and the leaves are calls on subsequences of size 0 or 1.
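A runnable sketch of quick-sort with the three-way partition (L, E, G) described above; Python lists stand in for the sequence ADT (an illustrative assumption):

```python
import random

# Runnable sketch of quick-sort with the L / E / G partition.
def partition(S, x):
    """Split S into elements <, =, > the pivot x."""
    L = [y for y in S if y < x]
    E = [y for y in S if y == x]
    G = [y for y in S if y > x]
    return L, E, G

def quick_sort(S):
    if len(S) <= 1:
        return S
    x = random.choice(S)              # random pivot selection
    L, E, G = partition(S, x)
    return quick_sort(L) + E + quick_sort(G)
```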
Execution Example
[Figure: step-by-step execution of quick-sort on 7 2 9 4 3 7 6 1, showing pivot selection, the recursive calls on L and G (e.g. 2 4 3 1 and 7 9 7), and the joins producing the sorted sequence 1 2 3 4 6 7 7 9]

Quick-Sort
4/9/2002 10:1
Worst-case Running Time
The worst case for quick-sort occurs when the pivot is the unique minimum or maximum element. One of L and G has size n - 1 and the other has size 0, so the running time is proportional to the sum
    n + (n - 1) + ... + 2 + 1
Thus, the worst-case running time of quick-sort is O(n^2).

Expected Running Time
Consider a recursive call of quick-sort on a sequence of size s. Good call: the sizes of L and G are each less than 3s/4. Bad call: one of L and G has size greater than 3s/4. Half of the possible pivots are good pivots, so a call is good with probability 1/2; with high probability, the quick-sort tree has expected height O(log n), and the amount of work at each depth is O(n), so the expected running time of quick-sort is O(n log n).
[Figure: good vs. bad calls for the sequence 7 2 9 4 3 7 6 1]

Summary of Sorting Algorithms
Algorithm        Time                    Notes
selection-sort   O(n^2)                  in-place; slow (good for small inputs)
insertion-sort   O(n^2)                  in-place; slow (good for small inputs)
quick-sort       O(n log n) expected     in-place, randomized; fastest (good for large inputs)
heap-sort        O(n log n)              in-place; fast (good for large inputs)
merge-sort       O(n log n)              sequential data access; fast (good for huge inputs)
In-Place Quick-Sort
Quick-sort can be implemented to run in-place: in the partition step, we rearrange the elements of the input sequence such that the elements less than the pivot have rank less than h, the elements equal to the pivot have rank between h and k, and the elements greater than the pivot have rank greater than k. The recursive calls consider elements with rank less than h and elements with rank greater than k.

Algorithm inPlaceQuickSort(S, l, r)
    Input: sequence S, ranks l and r
    Output: sequence S with the elements of rank between l and r rearranged in increasing order
    if l ≥ r
        return
    i ← a random integer between l and r
    x ← S.elemAtRank(i)
    (h, k) ← inPlacePartition(x)
    inPlaceQuickSort(S, l, h - 1)
    inPlaceQuickSort(S, k + 1, r)
Comparison-Based Sorting (4.4)
Many sorting algorithms are comparison based: they sort by asking questions of the form "Is xi < xj?". We derive a lower bound on the running time of any algorithm that uses comparisons to sort n elements.

Counting Comparisons
Each possible run of the algorithm corresponds to a root-to-leaf path in a decision tree whose internal nodes are comparisons "xa < xb ?", "xc < xd ?", and so on. The height of this decision tree is a lower bound on the running time. Every possible input permutation must lead to a separate leaf output; if not, some input ...4...5... would have the same output ordering as ...5...4..., which would be wrong. Since there are n! = 1·2·...·n leaves, the height is at least log(n!).

The Lower Bound
Any comparison-based sorting algorithm takes at least log(n!) time, and
    log(n!) ≥ log((n/2)^(n/2)) = (n/2) log(n/2)
That is, any comparison-based sorting algorithm must run in Ω(n log n) time.
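A quick numeric sanity check of the bound log(n!) ≥ (n/2)·log(n/2), using base-2 logarithms:

```python
import math

# Sanity-check the lower bound log2(n!) >= (n/2) * log2(n/2)
# for a range of n.
def log2_factorial(n):
    return sum(math.log2(k) for k in range(2, n + 1))

def lower_bound(n):
    return (n / 2) * math.log2(n / 2)

checks = [(n, log2_factorial(n), lower_bound(n)) for n in range(2, 64)]
```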
Sequences
4/15/2002 11:5

Set Operations
We represent a set by the sorted sequence of its elements. By specializing the auxiliary methods of a generic merge algorithm, we can perform set operations (union, intersection, subtraction) on two sets with n elements in O(n) time, using the canonical ordering of the set elements.

Generic Merging
A generalized merge of two sorted lists A and B is a template method with auxiliary methods aIsLess, bIsLess, and bothAreEqual:

Algorithm genericMerge(A, B)
    S ← empty sequence
    while ¬A.isEmpty() ∧ ¬B.isEmpty()
        a ← A.first().element(); b ← B.first().element()
        if a < b
            aIsLess(a, S); A.remove(A.first())
        else if b < a
            bIsLess(b, S); B.remove(B.first())
        else { b = a }
            bothAreEqual(a, b, S)
            A.remove(A.first()); B.remove(B.first())
    while ¬A.isEmpty()
        a ← A.first().element(); aIsLess(a, S); A.remove(A.first())
    while ¬B.isEmpty()
        b ← B.first().element(); bIsLess(b, S); B.remove(B.first())
    return S

Using Generic Merge for Set Operations
For example, for intersection we only copy elements that appear in both lists:
    aIsLess(a, S): { do nothing }
    bIsLess(b, S): { do nothing }
    bothAreEqual(a, b, S): S.insertLast(a)
For union we copy every element, but only one copy of duplicated elements:
    aIsLess(a, S): S.insertFirst(a)
    bIsLess(b, S): S.insertLast(b)
    bothAreEqual(a, b, S): S.insertLast(a)
Radix-Sort
4/9/2002 13:4

Bucket-Sort (4.5.1)
Let S be a sequence of n (key, element) items with keys in the range [0, N - 1]. Bucket-sort uses the keys as indices into an auxiliary array B of sequences (buckets): Phase 1 empties sequence S by moving each item (k, o) into its bucket B[k]; Phase 2 moves the items of bucket B[i], for i = 0, ..., N - 1, back to the end of sequence S.

Algorithm bucketSort(S, N)
    Input: sequence S of (key, element) items with keys in the range [0, N - 1]
    Output: sequence S sorted by increasing keys
    B ← array of N empty sequences
    while ¬S.isEmpty()
        f ← S.first()
        (k, o) ← S.remove(f)
        B[k].insertLast((k, o))
    for i ← 0 to N - 1
        while ¬B[i].isEmpty()
            f ← B[i].first()
            (k, o) ← B[i].remove(f)
            S.insertLast((k, o))

Analysis: Phase 1 takes O(n) time and Phase 2 takes O(n + N) time, so bucket-sort takes O(n + N) time.

Example
Sorting (1,c), (3,a), (3,b), (7,d), (7,g), (7,e) with N = 10: Phase 1 distributes the items into buckets B[1], B[3] and B[7]; Phase 2 concatenates the buckets B[0], ..., B[9] back into S.

Properties and Extensions
Key-type Property: the keys are used as indices into an array, so they cannot be arbitrary objects. Stable-Sort Property: the relative order of any two items with the same key is preserved. Extensions: for integer keys in the range [a, b], put item (k, o) into bucket B[k - a]; for keys drawn from a finite set D, compute the rank r(k) of each key and put item (k, o) into bucket B[r(k)].

Lexicographic Order
A d-tuple is a sequence of d keys (k1, k2, ..., kd), where ki is the i-th dimension of the tuple. The lexicographic order of two d-tuples is defined recursively: (x1, ..., xd) < (y1, ..., yd) iff x1 < y1, or x1 = y1 and (x2, ..., xd) < (y2, ..., yd).

Lexicographic-Sort
Lexicographic-sort sorts a sequence of d-tuples by executing d stable sorts, one per dimension, from the last to the first:

Algorithm lexicographicSort(S)
    Input: sequence S of d-tuples
    Output: sequence S sorted in lexicographic order
    for i ← d downto 1
        stableSort(S, Ci)

Example:
    (7,4,6) (5,1,5) (2,4,6) (2,1,4) (3,2,4)
    (2,1,4) (3,2,4) (5,1,5) (7,4,6) (2,4,6)
    (2,1,4) (5,1,5) (3,2,4) (7,4,6) (2,4,6)
    (2,1,4) (2,4,6) (3,2,4) (5,1,5) (7,4,6)
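A runnable sketch of bucketSort on (key, element) pairs, showing that it is stable; Python lists stand in for the sequence ADT (an illustrative assumption):

```python
# Runnable sketch of stable bucket-sort for (key, element) items
# with integer keys in the range [0, N-1].
def bucket_sort(S, N):
    B = [[] for _ in range(N)]       # one bucket per possible key
    for k, o in S:                   # Phase 1: distribute into buckets
        B[k].append((k, o))
    out = []
    for i in range(N):               # Phase 2: concatenate buckets
        out.extend(B[i])
    return out

items = [(7, "d"), (1, "c"), (3, "a"), (7, "g"), (3, "b"), (7, "e")]
```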
Radix-Sort (4.5.2)
Radix-sort is a specialization of lexicographic-sort that uses bucket-sort as the stable sorting algorithm in each dimension. Radix-sort is applicable to tuples where the keys in each dimension i are integers in the range [0, N - 1]. Radix-sort runs in time O(d(n + N)).

Algorithm radixSort(S, N)
    Input: sequence S of d-tuples such that (0, ..., 0) ≤ (x1, ..., xd) and (x1, ..., xd) ≤ (N - 1, ..., N - 1) for each tuple (x1, ..., xd) in S
    Output: sequence S sorted in lexicographic order
    for i ← d downto 1
        bucketSort(S, N)

Radix-Sort for Binary Numbers
Consider a sequence of n b-bit integers x = x_{b-1} ... x1 x0. We represent each element as a b-tuple of integers in the range [0, 1] and apply radix-sort with N = 2. This application of the radix-sort algorithm runs in O(bn) time. For example, we can sort a sequence of 32-bit integers in linear time.

Algorithm binaryRadixSort(S)
    Input: sequence S of b-bit integers
    Output: sequence S sorted
    replace each element x of S with the item (0, x)
    for i ← 0 to b - 1
        replace the key k of each item (k, x) of S with bit xi of x
        bucketSort(S, 2)

Example
Sorting a sequence of 4-bit integers, one column per pass:
    1001    0010    1001    1001    0001
    0010    1110    1101    0001    0010
    1101    1001    0001    0010    1001
    0001    1101    0010    1101    1101
    1110    0001    1110    1110    1110
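A runnable sketch of binaryRadixSort, with one stable two-bucket pass per bit (plain Python lists stand in for the bucket sequences, an illustrative assumption):

```python
# Runnable sketch of radix-sort on b-bit integers: one stable
# two-bucket pass per bit, least significant bit first.
def binary_radix_sort(S, b):
    for i in range(b):
        buckets = [[], []]
        for x in S:
            buckets[(x >> i) & 1].append(x)   # stable pass on bit i
        S = buckets[0] + buckets[1]
    return S

seq = [0b1001, 0b0010, 0b1101, 0b0001, 0b1110]
```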
Selection
4/15/2002 11:4

The Selection Problem
Given an integer k and n elements taken from a total order, find the k-th smallest element.

Quick-Select (4.7)
Quick-select is a randomized selection algorithm based on the prune-and-search paradigm. We partition an input sequence S around a random pivot x into L, E, and G (elements less than, equal to, and greater than x), as in quick-sort, and then search only one part:
    if k ≤ |L|, recurse on L with the same k
    if |L| < k ≤ |L| + |E|, the answer is the pivot x
    if k > |L| + |E|, recurse on G with k ← k - |L| - |E|

Algorithm partition(S, p)
    Input: sequence S, position p of pivot
    Output: subsequences L, E, G of the elements of S less than, equal to, or greater than the pivot, resp.
    L, E, G ← empty sequences
    x ← S.remove(p)
    while ¬S.isEmpty()
        y ← S.remove(S.first())
        if y < x
            L.insertLast(y)
        else if y = x
            E.insertLast(y)
        else { y > x }
            G.insertLast(y)
    return L, E, G

Quick-Select Visualization
An execution of quick-select can be visualized by a recursion path: each node represents a recursive call and stores k and the remaining sequence.
Example: k=5, S=(7 4 9 3 2 6 5 1 8); then k=2, S=(7 4 9 6 5 8); then k=2, S=(7 4 6 5); then k=1, S=(7 6 5); answer 5.

Expected Running Time
Consider a recursive call on a sequence of size s. Good call: the sizes of L and G are each less than 3s/4. Bad call: one of L and G has size greater than 3s/4. Half of the possible pivots are good pivots, so a call is good with probability 1/2, and the expected running time of quick-select is O(n).

Deterministic Selection
Selection can also be done in O(n) worst-case time, by using the median of the medians of groups of 5 elements as the pivot; by Fact #1, the minimum sizes guaranteed for L and G ensure each recursive call makes enough progress.
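A runnable sketch of quick-select (k-th smallest, with k starting at 1), using the same three-way partition as quick-sort; Python lists stand in for the sequence ADT:

```python
import random

# Runnable sketch of quick-select via prune-and-search.
def quick_select(S, k):
    """Return the k-th smallest element of S (1 <= k <= len(S))."""
    x = random.choice(S)              # random pivot
    L = [y for y in S if y < x]
    E = [y for y in S if y == x]
    G = [y for y in S if y > x]
    if k <= len(L):                   # answer lies in L
        return quick_select(L, k)
    if k <= len(L) + len(E):          # answer is the pivot
        return x
    return quick_select(G, k - len(L) - len(E))
```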
The Fractional Knapsack Problem
Given: a set S of n items, with each item i having bi, a positive benefit, and wi, a positive weight. Goal: choose amounts xi of the items with maximum total benefit and total weight at most W:
    Objective: maximize Σ_{i∈S} bi (xi / wi)
    Constraint: Σ_{i∈S} xi ≤ W

Example
A knapsack of capacity 10 ml and five items:
    Item:               1      2      3      4      5
    Weight:             4 ml   8 ml   2 ml   6 ml   1 ml
    Benefit:            $12    $32    $40    $30    $50
    Value ($ per ml):   3      4      20     5      50
Solution: 1 ml of item 5, 2 ml of item 3, 6 ml of item 4, and 1 ml of item 2.

The Fractional Knapsack Algorithm
Greedy choice: keep taking the item with the highest value vi = bi / wi. Since Σ bi (xi / wi) = Σ (bi / wi) xi, this locally optimal choice leads to a globally optimal solution.

Algorithm fractionalKnapsack(S, W)
    Input: set S of items with benefit bi and weight wi; maximum weight W
    Output: amount xi of each item i to maximize benefit with weight at most W
    for each item i in S
        xi ← 0
        vi ← bi / wi              {value}
    w ← 0                         {total weight}
    while w < W
        remove item i with highest vi
        xi ← min{wi, W - w}
        w ← w + min{wi, W - w}

Run time: O(n log n). Why? Keeping the items in a heap-based priority queue keyed on value vi, each removal of the highest-value item takes O(log n) time.

Task Scheduling
Given: a set T of n tasks, each having a start time si and a finish time fi (where si < fi). Goal: perform all the tasks using a minimum number of machines, where two tasks may share a machine only if their intervals do not overlap.
Example: the tasks [1,4], [1,3], [2,5], [3,7], [4,7], [6,9], [7,8] (ordered by start) can be scheduled on three machines, by greedily considering the tasks in order of start time and allocating a new machine only when no existing machine is free.
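A runnable sketch of the fractional-knapsack greedy rule above (items as plain (benefit, weight) pairs, an illustrative assumption):

```python
# Runnable sketch of fractional knapsack: take items in decreasing
# benefit/weight ratio, splitting the last item if needed.
def fractional_knapsack(items, W):
    """items: list of (benefit, weight); returns the maximum benefit."""
    total = 0.0
    w = 0.0
    for b, wt in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        take = min(wt, W - w)            # amount x_i of item i
        total += b * (take / wt)
        w += take
        if w >= W:
            break
    return total

# The 10 ml example: (benefit $, weight ml)
items = [(12, 4), (32, 8), (40, 2), (30, 6), (50, 1)]
```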
Divide-and-Conquer
4/26/2002 10:19 AM

Divide-and-Conquer Review
Divide-and-conquer is a general algorithm design paradigm: divide the input data S in two or more disjoint subsets; recur, solving the subproblems recursively; conquer, combining the subproblem solutions into a solution for S. Merge-sort is the classical example. To analyze such algorithms, we solve recurrence equations by iterative substitution, recursion trees, guess-and-test, or the master method.

Recurrence Equation Analysis
The merge-sort recurrence is
    T(n) = b                if n < 2
    T(n) = 2T(n/2) + bn     if n ≥ 2

Iterative Substitution
In the iterative substitution ("plug-and-chug") technique, we iteratively apply the recurrence equation to itself and look for a pattern:
    T(n) = 2T(n/2) + bn
         = 2(2T(n/2^2) + b(n/2)) + bn = 2^2 T(n/2^2) + 2bn
         = 2^3 T(n/2^3) + 3bn
         = 2^4 T(n/2^4) + 4bn
         = ...
         = 2^i T(n/2^i) + ibn
Note that the base case, T(n) = b, occurs when 2^i = n, that is, i = log n. So,
    T(n) = bn + bn log n
Thus, T(n) is O(n log n). A closed-form solution is one that has T(n) only on the left-hand side.
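A numeric sanity check of the closed form derived above, for powers of two and b = 1 (a simplifying assumption):

```python
from functools import lru_cache

# Check that T(n) = b if n < 2, else 2T(n/2) + bn (with b = 1)
# matches the closed form T(n) = bn + bn*log2(n) on powers of two.
@lru_cache(maxsize=None)
def T(n):
    if n < 2:
        return 1
    return 2 * T(n // 2) + n

def closed_form(n):
    k = n.bit_length() - 1           # log2(n) when n is a power of two
    return n + n * k
```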
Merge Sort
4/26/2002 10:19 AM
Guess-and-Test Method
Draw the recursion tree for the recurrence relation and look for a
pattern:
b
if n < 2
T (n ) =
2T ( n / 2) + bn if n 2
time
T (n ) =
2T ( n / 2) + bn log n if n 2
Guess: T(n) < cn log n.
depth
Ts
size
bn
n/2
bn
2i
n/2i
bn
= cn log n cn + bn log n
T (n ) = 2T (n / 2) + bn log n
Guess-and-Test Method,
Part 2
Recall the recurrence equation:
b
T (n ) =
2T ( n / 2) + bn log n
Guess #2: T(n) < cn log2 n.
if n < 2
T (n) =
aT
n
b
(
/
) + f (n)
if n 2
cn log 2 n
T (n) =
aT ( n / b) + f (n )
if n < d
if n d
if c > b.
Master Method
T (n) = 2T (n / 2) + bn log n
Divide-and-Conquer
Divide-and-Conquer
10
if n < d
if n d
The form: T (n ) =
c
aT ( n / b) + f (n )
if n < d
if n d
Example:
Example:
T (n) = 4T (n / 2) + n
T (n) = 2T (n / 2) + n log n
O(n2).
Divide-and-Conquer
12
Master Method, More Examples
Example: T(n) = T(n/3) + n log n. Solution: log_b a = 0, so case 3 says T(n) is Θ(n log n).
Example: T(n) = 8T(n/2) + n^2. Solution: log_b a = 3, so case 1 says T(n) is Θ(n^3).
Example: T(n) = 9T(n/3) + n^3. Solution: log_b a = 2, so case 3 says T(n) is Θ(n^3).
Example: T(n) = T(n/2) + 1 (binary search). Solution: log_b a = 0, so case 2 (with k = 0) says T(n) is Θ(log n).
Example: T(n) = 2T(n/2) + log n (heap construction). Solution: log_b a = 1, so case 1 says T(n) is Θ(n).

Iterative "Proof" Sketch of the Master Theorem
Using iterative substitution, let us see if we can find a pattern:
    T(n) = aT(n/b) + f(n)
         = a(aT(n/b^2) + f(n/b)) + f(n)
         = a^2 T(n/b^2) + a f(n/b) + f(n)
         = a^3 T(n/b^3) + a^2 f(n/b^2) + a f(n/b) + f(n)
         = ...
         = a^(log_b n) T(1) + Σ_{i=0}^{(log_b n)-1} a^i f(n/b^i)
         = n^(log_b a) T(1) + Σ_{i=0}^{(log_b n)-1} a^i f(n/b^i)
We then distinguish the three cases: the first term is dominant, each part of the summation is equally dominant, or the summation is a geometric series.
Integer Multiplication
Algorithm: multiply two n-bit integers I and J. Divide step: split I and J into high-order and low-order halves of their bits:
    I = Ih 2^(n/2) + Il
    J = Jh 2^(n/2) + Jl
Then
    I * J = (Ih 2^(n/2) + Il) * (Jh 2^(n/2) + Jl)
          = Ih Jh 2^n + Ih Jl 2^(n/2) + Il Jh 2^(n/2) + Il Jl
This uses four multiplications of n/2-bit integers, so T(n) = 4T(n/2) + n, which by the master theorem is Θ(n^2).

An Improved Integer Multiplication Algorithm
Instead, define I * J using three recursive multiplications:
    I * J = Ih Jh 2^n + [(Ih - Il)(Jl - Jh) + Ih Jh + Il Jl] 2^(n/2) + Il Jl
          = Ih Jh 2^n + [(Ih Jl - Il Jl - Ih Jh + Il Jh) + Ih Jh + Il Jl] 2^(n/2) + Il Jl
          = Ih Jh 2^n + (Ih Jl + Il Jh) 2^(n/2) + Il Jl
So T(n) = 3T(n/2) + n, which by the master theorem is Θ(n^(log2 3)), that is, O(n^1.585).
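A runnable sketch of the three-multiplication scheme above on Python ints split at n/2 bits (base-case threshold and signed handling of the middle product are implementation choices of this sketch):

```python
# Runnable sketch of the three-multiplication (Karatsuba-style)
# integer multiplication scheme derived above.
def karatsuba(I, J, n):
    """Multiply n-bit nonnegative integers I and J (n a power of two)."""
    if n <= 8:                           # small base case: direct multiply
        return I * J
    half = n // 2
    Ih, Il = I >> half, I & ((1 << half) - 1)
    Jh, Jl = J >> half, J & ((1 << half) - 1)
    hh = karatsuba(Ih, Jh, half)         # Ih*Jh
    ll = karatsuba(Il, Jl, half)         # Il*Jl
    d1, d2 = Ih - Il, Jl - Jh            # middle product may be negative
    m = karatsuba(abs(d1), abs(d2), half)
    if (d1 < 0) != (d2 < 0):
        m = -m
    mid = m + hh + ll                    # equals Ih*Jl + Il*Jh
    return (hh << n) + (mid << half) + ll
```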
Dynamic Programming
4/29/2002 11:40 AM

Matrix Chain-Products
Review: matrix multiplication C = A*B, where A is d × e and B is e × f, takes O(def) time:
    C[i,j] = Σ_{k=0}^{e-1} A[i,k] * B[k,j]
Matrix chain-product: compute A = A0*A1*...*An-1, where Ai is di × di+1. Problem: how to parenthesize?
Example: B is 3 × 100, C is 100 × 5, D is 5 × 5: (B*C)*D takes 1500 + 75 = 1575 ops, while B*(C*D) takes 1500 + 2500 = 4000 ops.

An Enumeration Approach
Try all possible ways to parenthesize A0*A1*...*An-1 and pick the best; but the number of parenthesizations is exponential in n, so this is too slow.

A Greedy Approach
Idea #1: repeatedly select the product that uses the fewest operations. Counter-example: A is 10 × 5, B is 5 × 10, C is 10 × 5, D is 5 × 10. Greedy idea #1 gives (A*B)*(C*D), which takes 500 + 1000 + 500 = 2000 ops, but A*((B*C)*D) takes 500 + 250 + 250 = 1000 ops.

Another Greedy Approach
Idea #2: repeatedly select the product that uses the most operations. Counter-example: A is 101 × 11, B is 11 × 9, C is 9 × 100, D is 100 × 99. Greedy idea #2 gives A*((B*C)*D), which takes 109989 + 9900 + 108900 = 228789 ops, but (A*B)*(C*D) takes 9999 + 89991 + 89100 = 189090 ops.
A Recursive Approach
Define subproblems: find the best parenthesization of Ai*Ai+1*...*Aj, and let Ni,j denote the number of operations done by this subproblem; the optimal solution for the whole problem is N0,n-1.

A Characterizing Equation
The global optimum must be defined in terms of optimal subproblems, depending on where the final multiplication is. Considering all possible places k for that final multiplication:
    Ni,j = min_{i ≤ k < j} { Ni,k + Nk+1,j + di dk+1 dj+1 }

A Dynamic Programming Algorithm
Since subproblems overlap, we don't use recursion. Instead, we construct optimal subproblems bottom-up: the Ni,i's are easy, so start with them; then do subproblems of length 2, 3, ..., and so on. Running time: O(n^3).

Algorithm matrixChain(S)
    Input: sequence S of n matrices to be multiplied
    Output: number of operations in an optimal parenthesization of S
    for i ← 1 to n-1 do
        Ni,i ← 0
    for b ← 1 to n-1 do
        for i ← 0 to n-b-1 do
            j ← i+b
            Ni,j ← +∞
            for k ← i to j-1 do
                Ni,j ← min{Ni,j, Ni,k + Nk+1,j + di dk+1 dj+1}

A Dynamic Programming Algorithm Visualization
The bottom-up construction fills in the N array by diagonals; Ni,j gets values from entries earlier on its row and column.
[Figure: the N array with diagonals filled in bottom-up order]
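A runnable sketch of matrixChain, where `d` is the dimension sequence (matrix Ai is d[i] × d[i+1], an encoding assumption of this sketch):

```python
# Runnable sketch of the matrix chain-product DP.
def matrix_chain(d):
    """d: dimension sequence; returns the minimum number of operations."""
    n = len(d) - 1                     # number of matrices
    N = [[0] * n for _ in range(n)]
    for b in range(1, n):              # b = subproblem length - 1
        for i in range(n - b):
            j = i + b
            N[i][j] = min(N[i][k] + N[k + 1][j] + d[i] * d[k + 1] * d[j + 1]
                          for k in range(i, j))
    return N[0][n - 1]

# The counter-example chain: A 10x5, B 5x10, C 10x5, D 5x10
dims = [10, 5, 10, 5, 10]
```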
The 0/1 Knapsack Problem
Given: a set S of n items, with each item i having bi, a positive benefit, and wi, a positive weight. Goal: choose a subset T of the items with maximum total benefit and weight at most W; unlike the fractional version, each item must be taken entirely or not at all:
    Objective: maximize Σ_{i∈T} bi
    Constraint: Σ_{i∈T} wi ≤ W

Example
A knapsack of capacity 9 in and five items:
    Item:      1       2       3       4       5
    Weight:    4 in    2 in    2 in    6 in    2 in
    Benefit:   $20     $3      $6      $25     $80
Solution: item 5 (2 in), item 3 (2 in), and item 1 (4 in).

A 0/1 Knapsack Algorithm
Define subproblems over Sk, the set of items numbered 1 to k, and a weight bound w: let B[k, w] be the benefit of the best selection from Sk with weight at most w. Then
    B[k, w] = B[k-1, w]                                  if wk > w
    B[k, w] = max{ B[k-1, w], B[k-1, w - wk] + bk }      otherwise

Algorithm 01Knapsack(S, W)
    Input: set S of items with benefit bi and weight wi; maximum weight W
    Output: benefit of the best subset with weight at most W
    {Since B[k, w] is defined in terms of B[k-1, *], we can reuse the same array}
    for w ← 0 to W do
        B[w] ← 0
    for k ← 1 to n do
        for w ← W downto wk do
            if B[w - wk] + bk > B[w] then
                B[w] ← B[w - wk] + bk

Running time: O(nW). Since W can be large, this is a pseudo-polynomial time algorithm.
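A runnable sketch of 01Knapsack with the reused one-dimensional array B[w], iterating w from W down to wk (items as (benefit, weight) pairs, an encoding assumption of this sketch):

```python
# Runnable sketch of the 0/1 knapsack DP with a 1D array.
def knapsack_01(items, W):
    """items: list of (benefit, weight); returns the best total benefit."""
    B = [0] * (W + 1)
    for b, wk in items:
        for w in range(W, wk - 1, -1):   # downto wk keeps each item 0/1
            B[w] = max(B[w], B[w - wk] + b)
    return B[W]

# The 9-inch example: (benefit $, weight in)
items = [(20, 4), (3, 2), (6, 2), (25, 6), (80, 2)]
```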
Graphs
5/3/2002 7:41 AM

Outline: Definition; Applications; Terminology; Properties; ADT

Graph
A graph is a pair (V, E), where V is a set of nodes, called vertices, and E is a collection of pairs of vertices, called edges; vertices and edges are positions and store elements. Example: in a flight network, a vertex represents an airport and stores its three-letter code, and an edge represents a flight route between two airports and stores the mileage of the route.
[Figure: flight network on SFO, ORD, LAX, DFW, HNL, PVD, MIA, LGA with mileages such as 849, 1843, 337, 1233, 802, 2555, 1120, 1387, 142]

Edge Types
Directed edge: an ordered pair of vertices (u, v), with the first vertex u the origin and the second vertex v the destination (e.g., flight AA 1206 from PVD to ORD). Undirected edge: an unordered pair of vertices (e.g., the flight route ORD-PVD, 849 miles). A directed graph has all directed edges; an undirected graph has all undirected edges.

Applications
Electronic circuits; transportation networks (highway network, flight network); computer networks (local area networks and the Internet, with hosts such as cslab1a, cslab1b, math.brown.edu, cs.brown.edu, brown.edu, qwest.net, att.net, cox.net); databases (entity-relationship diagrams, e.g. people John, Paul, David).

Terminology
End vertices (endpoints) of an edge; edges incident on a vertex; adjacent vertices (connected by an edge); degree of a vertex (number of incident edges), e.g. X has degree 5; parallel edges (sharing both endpoints), e.g. h and i are parallel edges; self-loop (both endpoints coincide), e.g. j is a self-loop.
[Figure: example graph with vertices U, V, W, X, Y, Z and edges a through j]
Terminology (cont.)
Path: a sequence of alternating vertices and edges; it begins with a vertex, ends with a vertex, and each edge is preceded and followed by its endpoints. Simple path: a path such that all its vertices and edges are distinct.
[Figure: examples P1 (a simple path) and P2 (a path that is not simple)]

Cycle: a circular sequence of alternating vertices and edges, where each edge is preceded and followed by its endpoints. Simple cycle: a cycle such that all its vertices and edges are distinct.
Examples: C1 = (V, b, X, g, Y, f, W, c, U, a, V) is a simple cycle; C2 = (U, c, W, e, X, g, Y, f, W, d, V, a, U) is a cycle that is not simple.

Properties
Notation: n = number of vertices, m = number of edges, deg(v) = degree of vertex v.
Property 1: Σ_v deg(v) = 2m. Proof: each edge is counted twice.
Property 2: in an undirected graph with no self-loops and no multiple edges, m ≤ n(n - 1)/2. Proof: each vertex has degree at most (n - 1).
Example: n = 4, m = 6, deg(v) = 3.

Main Methods of the Graph ADT
Vertices and edges are positions and store elements.
Accessor methods: aVertex(); incidentEdges(v); endVertices(e); isDirected(e); origin(e); destination(e); opposite(v, e); areAdjacent(v, w).
Update methods: insertVertex(o); insertEdge(v, w, o); insertDirectedEdge(v, w, o); removeVertex(v); removeEdge(e).
Generic methods: numVertices(); numEdges(); vertices(); edges().

Edge List Structure
Vertex object: element; reference to position in the vertex sequence. Edge object: element; origin vertex object; destination vertex object; reference to position in the edge sequence. Vertex sequence: sequence of vertex objects. Edge sequence: sequence of edge objects.

Adjacency List Structure
The adjacency list structure extends the edge list structure with an incidence sequence for each vertex (a sequence of references to the edge objects of its incident edges) and augmented edge objects (holding references to their associated positions in the incidence sequences of their end vertices).
Adjacency Matrix Structure
A 2D adjacency array stores, for each pair of vertices, a reference to the edge object if the vertices are adjacent, and null if they are nonadjacent.

Performance
For a graph with n vertices and m edges, no parallel edges, and no self-loops:

    Operation              Edge List   Adjacency List         Adjacency Matrix
    Space                  n + m       n + m                  n²
    incidentEdges(v)       m           deg(v)                 n
    areAdjacent(v, w)      m           min(deg(v), deg(w))    1
    insertVertex(o)        1           1                      n²
    insertEdge(v, w, o)    1           1                      1
    removeVertex(v)        m           deg(v)                 n²
    removeEdge(e)          1           1                      1
Campus Tour
6/8/2002 2:07 PM

Outline
- Graph assignment: goals, your task, frontend
- Review: adjacency matrix (2D array with a reference to the edge object for adjacent vertices, null for nonadjacent vertices)
- Kruskal's algorithm and the Partition ADT, with partition implementation and amortized analysis
- Decorator pattern
- TSP approximation

Kruskal's Algorithm
The vertices are partitioned into clouds, maintained with the operations of a Partition ADT.

Algorithm KruskalMSF(G)
    Input: weighted graph G
    Output: labeling of the edges of a minimum spanning forest of G

[Figure: example run of Kruskal's algorithm, over four steps, on a weighted graph with edge weights including 2, 3, 5, 6, 7, 8, 9, 10, and 11.]

Decorator Pattern
The graph algorithms attach auxiliary data to vertices and edges via label operations:
- DFS: unexplored/visited labels for vertices and unexplored/discovery/back labels for edges
- Dijkstra and Prim-Jarnik: distance, locator, and parent labels for vertices
- Kruskal: locator labels for vertices and MSF labels for edges

TSP Approximation
Example of a traveling salesperson tour (with weight 17), produced by an approximation algorithm.
Depth-First Search
5/6/2002 11:46 AM

Outline
- Subgraphs
- Connectivity
- Spanning trees and forests
- Depth-first search: algorithm, example, properties, analysis
Subgraphs
A subgraph S of a graph G is a graph such that the vertices of S are a subset of the vertices of G and the edges of S are a subset of the edges of G. A spanning subgraph of G is a subgraph that contains all the vertices of G.

Connectivity
A graph is connected if there is a path between every pair of vertices. A connected component of a graph G is a maximal connected subgraph of G.

Trees and Forests
A (free) tree is an undirected graph T such that T is connected and T has no cycles. A forest is an undirected graph without cycles; the connected components of a forest are trees.

Spanning Trees and Forests
A spanning tree of a connected graph is a spanning subgraph that is a tree; it is not unique unless the graph is a tree. Spanning trees have applications to the design of communication networks. A spanning forest of a graph is a spanning subgraph that is a forest.
Depth-First Search
Depth-first search (DFS) is a general technique for traversing a graph. A DFS traversal of a graph G visits all the vertices and edges of G, determines whether G is connected, computes the connected components of G, and computes a spanning forest of G; it can also be specialized for path finding and cycle finding. Depth-first search is to graphs what Euler tour is to binary trees.

DFS is analogous to exploring a maze: we mark each intersection, corner, and dead end (vertex) visited; we mark each corridor (edge) traversed; and we keep track of the path back to the entrance (start vertex) by means of a rope (recursion stack).

Properties of DFS
Property 1: DFS(G, v) visits all the vertices and edges in the connected component of v.
Algorithm DFS(G, v)
    Input: graph G and a start vertex v of G
    Output: labeling of the edges of G in the connected component of v
            as discovery edges and back edges
    setLabel(v, VISITED)
    for all e ∈ G.incidentEdges(v)
        if getLabel(e) = UNEXPLORED
            w ← opposite(v, e)
            if getLabel(w) = UNEXPLORED
                setLabel(e, DISCOVERY)
                DFS(G, w)
            else
                setLabel(e, BACK)
Example legend: unexplored vertex, visited vertex, unexplored edge, discovery edge, back edge.

Algorithm DFS(G)
    Input: graph G
    Output: labeling of the edges of G as discovery edges and back edges
    for all u ∈ G.vertices()
        setLabel(u, UNEXPLORED)
    for all e ∈ G.edges()
        setLabel(e, UNEXPLORED)
    for all v ∈ G.vertices()
        if getLabel(v) = UNEXPLORED
            DFS(G, v)
Property 2: the discovery edges labeled by DFS(G, v) form a spanning tree of the connected component of v.
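The labeling scheme above can be sketched in Python; the adjacency-list format (a dict from vertex to neighbor list) is an assumption for the example, not the slides' Graph ADT.

```python
# DFS over an undirected graph given as dict: vertex -> list of
# neighbors. Returns the discovery edges in visit order; edges to
# already-visited vertices play the role of the "back" edges.
def dfs(graph):
    visited = set()
    discovery = []

    def visit(v):
        visited.add(v)
        for w in graph[v]:
            if w not in visited:
                discovery.append((v, w))   # discovery edge
                visit(w)

    for v in graph:                        # handles disconnected graphs
        if v not in visited:
            visit(v)
    return discovery
```

By Property 2, on a connected graph the returned discovery edges form a spanning tree, so there are n − 1 of them.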
Analysis of DFS
Setting/getting a vertex/edge label takes O(1) time. Each vertex is labeled twice: once as UNEXPLORED and once as VISITED. Each edge is labeled twice: once as UNEXPLORED and once as DISCOVERY or BACK. Method incidentEdges is called once for each vertex, so DFS runs in O(n + m) time provided the graph is represented by the adjacency list structure. Recall that Σv deg(v) = 2m.

Path Finding
We can specialize the DFS algorithm to find a path between two given vertices v and z using the template method pattern: we use a stack S to keep track of the path between the start vertex and the current vertex, and as soon as destination vertex z is encountered, we return the path as the contents of the stack.
Algorithm pathDFS(G, v, z)
    setLabel(v, VISITED)
    S.push(v)
    if v = z
        return S.elements()
    for all e ∈ G.incidentEdges(v)
        if getLabel(e) = UNEXPLORED
            w ← opposite(v, e)
            if getLabel(w) = UNEXPLORED
                setLabel(e, DISCOVERY)
                S.push(e)
                pathDFS(G, w, z)
                S.pop(e)
            else
                setLabel(e, BACK)
    S.pop(v)
Cycle Finding
We can specialize the DFS algorithm to find a simple cycle using the template method pattern. We use a stack S to keep track of the path between the start vertex and the current vertex. As soon as a back edge (v, w) is encountered, we return the cycle as the portion of the stack from the top to vertex w.

Algorithm cycleDFS(G, v)
    setLabel(v, VISITED)
    S.push(v)
    for all e ∈ G.incidentEdges(v)
        if getLabel(e) = UNEXPLORED
            w ← opposite(v, e)
            S.push(e)
            if getLabel(w) = UNEXPLORED
                setLabel(e, DISCOVERY)
                cycleDFS(G, w)
                S.pop(e)
            else
                T ← new empty stack
                repeat
                    o ← S.pop()
                    T.push(o)
                until o = w
                return T.elements()
    S.pop(v)
Breadth-First Search
5/7/2002 11:06 AM

Outline
- Algorithm, example, properties, analysis, applications
- DFS vs. BFS: comparison of applications, comparison of edge labels

Breadth-first search (BFS) is a general technique for traversing a graph. A BFS traversal of a graph G visits all the vertices and edges of G, determines whether G is connected, computes the connected components of G, and computes a spanning forest of G, while partitioning the vertices into levels L0, L1, L2, ….
Algorithm BFS(G, s)
    L0 ← new empty sequence
    L0.insertLast(s)
    setLabel(s, VISITED)
    i ← 0
    while ¬Li.isEmpty()
        Li+1 ← new empty sequence
        for all v ∈ Li.elements()
            for all e ∈ G.incidentEdges(v)
                if getLabel(e) = UNEXPLORED
                    w ← opposite(v, e)
                    if getLabel(w) = UNEXPLORED
                        setLabel(e, DISCOVERY)
                        setLabel(w, VISITED)
                        Li+1.insertLast(w)
                    else
                        setLabel(e, CROSS)
        i ← i + 1

Example legend: unexplored vertex, visited vertex, unexplored edge, discovery edge, cross edge; vertices are partitioned into levels L0, L1, L2, ….

Algorithm BFS(G)
    Input: graph G
    Output: labeling of the edges and partition of the vertices of G
    for all u ∈ G.vertices()
        setLabel(u, UNEXPLORED)
    for all e ∈ G.edges()
        setLabel(e, UNEXPLORED)
    for all v ∈ G.vertices()
        if getLabel(v) = UNEXPLORED
            BFS(G, v)
Properties
Notation: Gs is the connected component of the start vertex s.
Property 1: BFS(G, s) visits all the vertices and edges of Gs.
Property 2: the discovery edges labeled by BFS(G, s) form a spanning tree Ts of Gs.
Property 3: for each vertex v in level Li, the path of Ts from s to v has i edges, and every path from s to v in Gs has at least i edges.

Analysis
Setting/getting a vertex/edge label takes O(1) time. Each vertex is labeled twice: once as UNEXPLORED and once as VISITED. Each edge is labeled twice: once as UNEXPLORED and once as DISCOVERY or CROSS. Each vertex is inserted once into a sequence Li. Method incidentEdges is called once for each vertex. BFS runs in O(n + m) time provided the graph is represented by the adjacency list structure. Recall that Σv deg(v) = 2m.

Applications
Using the template method pattern, we can specialize BFS to solve, in O(n + m) time, problems such as computing the connected components of G, computing a spanning forest of G, finding a simple cycle in G, and finding a path with the fewest edges between two given vertices.

DFS vs. BFS
- Applications suited to both DFS and BFS: spanning forest, connected components, paths, cycles.
- Shortest paths: BFS. Biconnected components: DFS.
- Back edge (v, w) in DFS: w is an ancestor of v in the tree of discovery edges.
- Cross edge (v, w) in BFS: w is in the same level as v or in the next level.
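The level-by-level traversal above can be sketched in Python; the dict-based adjacency list is an assumption for the example.

```python
# BFS that returns the levels L0, L1, L2, ... from start vertex s.
# graph: dict mapping each vertex to its list of neighbors.
from collections import deque  # noqa: F401  (queue-based variants use this)

def bfs_levels(graph, s):
    visited = {s}
    levels = [[s]]                      # L0 contains only s
    while levels[-1]:
        next_level = []
        for v in levels[-1]:
            for w in graph[v]:
                if w not in visited:
                    visited.add(w)      # (v, w) is a discovery edge
                    next_level.append(w)
        levels.append(next_level)
    return levels[:-1]                  # drop the trailing empty level
```

By Property 3, the index of the level containing v is the minimum number of edges on any path from s to v.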
Biconnectivity
5/7/2002 11:09 AM

Outline
- Definitions: separation vertices and edges, biconnected graph, biconnected components, equivalence classes, link relation
- Algorithms (6.3.2): auxiliary graph, proxy graph

Applications
Separation edges and vertices represent single points of failure in a network and are critical to the operation of the network.
[Figure: example flight networks on airports such as ORD, PVD, LGA, LAX, DFW, SFO, HNL, RDU, and MIA, illustrating separation vertices and edges, a biconnected graph, and biconnected components.]
Link Relation
Edges e and f of a connected graph G are linked if e = f, or G has a simple cycle containing e and f.
Theorem: the link relation on the edges of a graph is an equivalence relation.
Proof sketch: the reflexive and symmetric properties follow from the definition; for the transitive property, consider two simple cycles sharing an edge.
The link components of a connected graph are the equivalence classes of its edges under the link relation.
[Figure: DFS on a graph G and the resulting auxiliary graph B.]
Proxy Graph

Algorithm proxyGraph(G)
    Input: connected graph G
    Output: proxy graph F for G
    F ← empty graph
    DFS(G, s)  { s is any vertex of G }
    for all discovery edges e of G
        F.insertVertex(e)
        setLabel(e, UNLINKED)
    for all vertices v of G in DFS visit order
        for all back edges e = (u, v)
            F.insertVertex(e)
            while u ≠ s
                f ← discovery edge with destination u
                F.insertEdge(e, f, ∅)
                if getLabel(f) = UNLINKED
                    setLabel(f, LINKED)
                    u ← origin of edge f
                else
                    u ← s  { ends the while loop }
    return F

[Figure: DFS on a graph G and the resulting proxy graph F.]
Directed Graphs
5/13/2002 10:55 AM

Outline
- Digraphs
- Directed DFS
- Strong connectivity
- Transitive closure
- Topological sorting

Digraphs
A digraph is a graph whose edges are all directed.
Applications: one-way streets, flights, task scheduling.
[Figure: example digraphs, including a flight digraph on BOS, ORD, JFK, SFO, LAX, DFW, and MIA.]

Digraph Application
Scheduling: an edge (a, b) means task a must be completed before task b can be started.
[Figure: prerequisite digraph on the courses ics21, ics22, ics51, ics52, ics53, ics121, ics131, ics141, ics151, ics161, ics171.]

Directed DFS
We can specialize the traversal algorithms (DFS and BFS) to digraphs by traversing edges only along their direction. In the directed DFS algorithm, we have four types of edges: discovery edges, back edges, forward edges, and cross edges. A directed DFS starting at a vertex s determines the vertices reachable from s.
Reachability and Strong Connectivity
A digraph is strongly connected if each vertex can reach all other vertices.

Strong Connectivity Algorithm
Naively, we can perform a DFS starting at each vertex, which takes O(n(n + m)) time. A better approach: pick a vertex v in G and perform a DFS from v in G; if some vertex is not visited, G is not strongly connected. Then reverse all the edges of G and perform a DFS from v in the reversed digraph; if some vertex is not visited, G is not strongly connected. Otherwise, G is strongly connected, and the whole test takes O(n + m) time.

Strongly Connected Components
The strongly connected components are the maximal subgraphs in which each vertex can reach all the others; example components: {a, c, g} and {f, d, e, b}.

Transitive Closure
Given a digraph G, the transitive closure G* has the same vertices as G, and a directed edge (u, v) whenever G has a directed path from u to v. Computing G* by a DFS from each vertex takes O(n(n + m)) time; alternatively, we can use dynamic programming: the Floyd-Warshall algorithm.
Floyd-Warshall's Algorithm
Floyd-Warshall's algorithm numbers the vertices of G as v1, …, vn and computes a series of digraphs G0, …, Gn, where G0 = G and Gk has a directed edge (vi, vj) if G has a directed path from vi to vj with intermediate vertices in the set {v1, …, vk}. We have that Gn = G*. In phase k, digraph Gk is computed from Gk−1. Running time: O(n³), assuming areAdjacent is O(1) (e.g., adjacency matrix).

Algorithm FloydWarshall(G)
    Input: digraph G
    Output: transitive closure G* of G
    i ← 1
    for all v ∈ G.vertices()
        denote v as vi
        i ← i + 1
    G0 ← G
    for k ← 1 to n do
        Gk ← Gk−1
        for i ← 1 to n (i ≠ k) do
            for j ← 1 to n (j ≠ i, k) do
                if Gk−1.areAdjacent(vi, vk) ∧ Gk−1.areAdjacent(vk, vj)
                    if ¬Gk.areAdjacent(vi, vj)
                        Gk.insertDirectedEdge(vi, vj, k)
    return Gn
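The phase-by-phase construction above collapses to a triple loop over a boolean adjacency matrix; the matrix encoding below is an assumption for the sketch.

```python
# Floyd-Warshall transitive closure over a boolean adjacency matrix:
# adj[i][j] is True if the digraph has edge i -> j.
def transitive_closure(adj):
    n = len(adj)
    reach = [row[:] for row in adj]     # copy: G0 = G
    for k in range(n):                  # phase k allows vk as intermediate
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True  # path i -> k -> j exists
    return reach
```

Updating `reach` in place is safe here because allowing vertex k as an intermediate can only add edges, never remove them.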
Floyd-Warshall Example
[Figure: iterations 1 through 6 of Floyd-Warshall on a flight digraph with airports BOS, ORD, JFK, SFO, LAX, DFW, and MIA numbered v1 through v7, ending with the transitive closure.]
Topological Sorting
A topological ordering of a digraph is a numbering v1, …, vn of the vertices such that for every edge (vi, vj) we have i < j. A digraph admits a topological ordering if and only if it is a DAG (directed acyclic graph).
Example: a task-scheduling DAG G with tasks wake up, study computer sci., eat, nap, more c.s., work out, play, write c.s. program, make cookies for professors, sleep, and dream about graphs, together with a topological ordering of G numbering the tasks 1 through 11.
Topological Sorting Algorithms

Method TopologicalSort(G)
    H ← G  // Temporary copy of G
    n ← G.numVertices()
    while H is not empty do
        let v be a vertex with no outgoing edges
        label v ← n
        n ← n − 1
        remove v from H

Algorithm using DFS — O(n + m) time:

Algorithm topologicalDFS(G)
    Input: dag G
    Output: topological ordering of G
    n ← G.numVertices()
    for all u ∈ G.vertices()
        setLabel(u, UNEXPLORED)
    for all e ∈ G.edges()
        setLabel(e, UNEXPLORED)
    for all v ∈ G.vertices()
        if getLabel(v) = UNEXPLORED
            topologicalDFS(G, v)

Algorithm topologicalDFS(G, v)
    Input: graph G and a start vertex v of G
    Output: labeling of the vertices of G in the connected component of v
    setLabel(v, VISITED)
    for all e ∈ G.incidentEdges(v)
        if getLabel(e) = UNEXPLORED
            w ← opposite(v, e)
            if getLabel(w) = UNEXPLORED
                setLabel(e, DISCOVERY)
                topologicalDFS(G, w)
            else
                { e is a forward or cross edge }
    label v with topological number n
    n ← n − 1
[Figure: step-by-step topological numbering of an example DAG via DFS, assigning numbers 9 down to 1.]
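The DFS-based numbering above can be sketched in Python: numbering vertices n down to 1 in postorder is the same as returning the reversed postorder.

```python
# DFS-based topological sort of a DAG given as dict: vertex -> list
# of successors. Returns the vertices in topological order.
def topological_sort(graph):
    visited, order = set(), []

    def visit(v):
        visited.add(v)
        for w in graph[v]:
            if w not in visited:
                visit(w)
        order.append(v)        # postorder: v gets the next-smaller number

    for v in graph:
        if v not in visited:
            visit(v)
    order.reverse()            # reversed postorder = topological order
    return order
```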
Shortest Paths
5/15/2002 11:40 AM

Outline
- Weighted graphs
- Shortest paths: properties
- Dijkstra's algorithm: algorithm, edge relaxation
- Applications

Weighted Graphs
In a weighted graph, each edge has an associated numerical value, called the weight of the edge; edge weights may represent distances, costs, etc.
Example: in a flight route graph, the weight of an edge represents the distance in miles between the endpoint airports.
[Figure: weighted flight network on ORD, SFO, LAX, DFW, HNL, PVD, LGA, and MIA, with mileage weights such as 849, 802, 1233, 1843, and 2555.]

Shortest Paths
Given a weighted graph and two vertices u and v, we want to find a path of minimum total weight between u and v, where the weight of a path is the sum of the weights of its edges.

Shortest-Path Properties
Property 1: a subpath of a shortest path is itself a shortest path.
Property 2: there is a tree of shortest paths from a start vertex to all the other vertices.
Example: tree of shortest paths from Providence.
Edge Relaxation
Consider an edge e = (u, z) such that u is the vertex most recently added to the cloud and z is not in the cloud. The relaxation of edge e updates the distance of z as follows: d(z) ← min{d(z), d(u) + weight(e)}.
Example: with d(u) = 50 and weight(e) = 10, relaxation improves d(z) from 75 to 60.
Dijkstra's Algorithm

Algorithm DijkstraDistances(G, s)
    Q ← new heap-based priority queue
    for all v ∈ G.vertices()
        if v = s
            setDistance(v, 0)
        else
            setDistance(v, ∞)
        l ← Q.insert(getDistance(v), v)
        setLocator(v, l)
    while ¬Q.isEmpty()
        u ← Q.removeMin()
        for all e ∈ G.incidentEdges(u)
            { relax edge e }
            z ← G.opposite(u, e)
            r ← getDistance(u) + weight(e)
            if r < getDistance(z)
                setDistance(z, r)
                Q.replaceKey(getLocator(z), r)

The priority queue stores the vertices outside the cloud (key: distance; element: vertex), using the locator-based methods insert(k, e), which returns a locator, and replaceKey(l, k), which changes the key of an item.

Analysis
Graph operations: method incidentEdges is called once for each vertex. Label operations: we set/get the distance and locator labels of vertex z O(deg(z)) times; setting/getting a label takes O(1) time. Priority-queue operations: each vertex is inserted once into and removed once from the priority queue, where each insertion or removal takes O(log n) time, and the key of a vertex is modified at most deg(z) times, where each key change takes O(log n) time. Dijkstra's algorithm thus runs in O((n + m) log n) time with the adjacency list structure; recall that Σv deg(v) = 2m. The running time can also be expressed as O(m log n) since the graph is connected.

Extension
Using the template method pattern, we can extend Dijkstra's algorithm to return a tree of shortest paths: Algorithm DijkstraShortestPathsTree(G, s) additionally initializes setParent(v, ∅) for each vertex and sets the parent edge of z whenever its distance improves.
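The locator-based priority queue above can be approximated in Python with `heapq` and "lazy deletion" (pushing a new entry instead of replacing a key and skipping stale entries on pop); the graph format is an assumption for the sketch.

```python
# Dijkstra's algorithm with heapq. graph: dict mapping each vertex
# to a list of (neighbor, weight) pairs. Returns distances from s.
import heapq

def dijkstra(graph, s):
    dist = {v: float('inf') for v in graph}
    dist[s] = 0
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                    # stale entry; skip it
        for z, w in graph[u]:
            r = d + w                   # relax edge (u, z)
            if r < dist[z]:
                dist[z] = r
                heapq.heappush(pq, (r, z))
    return dist
```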
Bellman-Ford Algorithm
Works even with negative-weight edges. It assumes directed edges (for otherwise we would have negative-weight cycles). It iterates over all the edges n − 1 times. Running time: O(nm).

Algorithm BellmanFord(G, s)
    for all v ∈ G.vertices()
        if v = s
            setDistance(v, 0)
        else
            setDistance(v, ∞)
    for i ← 1 to n − 1 do
        for each e ∈ G.edges()
            { relax edge e }
            u ← G.origin(e)
            z ← G.opposite(u, e)
            r ← getDistance(u) + weight(e)
            if r < getDistance(z)
                setDistance(z, r)

Bellman-Ford Example
[Figure: rounds of relaxation on a digraph with negative edge weights such as −1, −2, −5, and −8.]
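The n − 1 rounds of relaxation can be sketched directly over an edge list; the inputs below are illustrative.

```python
# Bellman-Ford over a digraph given as a list of (u, v, w) edges.
# Handles negative edge weights (but assumes no negative cycles).
def bellman_ford(vertices, edges, s):
    dist = {v: float('inf') for v in vertices}
    dist[s] = 0
    for _ in range(len(vertices) - 1):   # n - 1 rounds
        for u, v, w in edges:            # relax every edge
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist
```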
DAG-based Algorithm
Works even with negative-weight edges. Uses topological order. Doesn't use any fancy data structures. Is much faster than Dijkstra's algorithm. Running time: O(n + m).

Algorithm DagDistances(G, s)
    for all v ∈ G.vertices()
        if v = s
            setDistance(v, 0)
        else
            setDistance(v, ∞)
    perform a topological sort of the vertices
    for u ← 1 to n do  { in topological order }
        for each e ∈ G.outEdges(u)
            { relax edge e }
            z ← G.opposite(u, e)
            r ← getDistance(u) + weight(e)
            if r < getDistance(z)
                setDistance(z, r)

DAG Example
[Figure: DAG shortest-path example with negative edge weights, shown in two steps.]
Minimum Spanning Trees
5/13/2002 4:52 PM

Definitions
A spanning tree of a graph G is a subgraph of G containing all the vertices of G that is a tree. A minimum spanning tree (MST) is a spanning tree of a weighted graph with minimum total edge weight.
Applications: communications networks, transportation networks.
[Figure: weighted flight network on BOS, PVD, ORD, JFK, BWI, DFW, SFO, LAX, MIA, and others, with an MST highlighted.]

Cycle Property
A crucial fact: let T be a minimum spanning tree of a weighted graph G, let e be an edge of G that is not in T, and let C be the cycle formed by e with T. Then for every edge f of C, weight(f) ≤ weight(e).
Proof: by contradiction. If weight(f) > weight(e), we can get a spanning tree of smaller weight by replacing e with f.
Partition Property
Consider a partition of the vertices of G into subsets U and V, and let e be an edge of minimum weight across the partition. There is a minimum spanning tree of G containing edge e.
Proof: let T be an MST of G. If T does not contain e, consider the cycle C formed by e with T and let f be an edge of C across the partition. By the cycle property, weight(f) ≤ weight(e); thus weight(f) = weight(e), and we obtain another MST by replacing f with e.
Prim-Jarnik's Algorithm
Similar to Dijkstra's algorithm (for a connected graph). We pick an arbitrary vertex s and we grow the MST as a cloud of vertices, starting from s. We store with each vertex v a label d(v): the smallest weight of an edge connecting v to a vertex in the cloud. At each step, we add to the cloud the vertex u outside the cloud with the smallest distance label, and we update the labels of the vertices adjacent to u.
A priority queue stores the vertices outside the cloud (key: distance; element: vertex), using the locator-based methods insert(k, e), which returns a locator, and replaceKey(l, k), which changes the key of an item. With each vertex we store three labels: distance, parent edge in the MST, and locator in the priority queue.

Algorithm PrimJarnikMST(G)
    Q ← new heap-based priority queue
    s ← a vertex of G
    for all v ∈ G.vertices()
        if v = s
            setDistance(v, 0)
        else
            setDistance(v, ∞)
        setParent(v, ∅)
        l ← Q.insert(getDistance(v), v)
        setLocator(v, l)
    while ¬Q.isEmpty()
        u ← Q.removeMin()
        for all e ∈ G.incidentEdges(u)
            z ← G.opposite(u, e)
            r ← weight(e)
            if r < getDistance(z)
                setDistance(z, r)
                setParent(z, e)
                Q.replaceKey(getLocator(z), r)

Analysis
Each vertex is inserted once into and removed once from the priority queue, where each insertion or removal takes O(log n) time. The key of a vertex w in the priority queue is modified at most deg(w) times, where each key change takes O(log n) time. Prim-Jarnik's algorithm thus runs in O((n + m) log n) time with the adjacency list structure; recall that Σv deg(v) = 2m.
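As with Dijkstra, the locator-based priority queue can be approximated with `heapq` and lazy deletion; the graph format below is an assumption for the sketch.

```python
# Prim-Jarnik with heapq. graph: dict mapping each vertex to a list
# of (neighbor, weight) pairs. Returns the MST as (parent, vertex,
# weight) edges in the order vertices join the cloud.
import heapq

def prim_jarnik(graph, s):
    in_cloud = set()
    mst = []
    pq = [(0, s, None)]                 # (distance label, vertex, parent)
    while pq:
        d, v, parent = heapq.heappop(pq)
        if v in in_cloud:
            continue                    # stale entry; skip it
        in_cloud.add(v)
        if parent is not None:
            mst.append((parent, v, d))  # parent edge in the MST
        for z, w in graph[v]:
            if z not in in_cloud:
                heapq.heappush(pq, (w, z, v))
    return mst
```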
Kruskal's Algorithm
Each vertex starts in its own cloud; a priority queue supplies the edges in increasing weight order, and an edge is kept only if its endpoints lie in different clouds.

Algorithm KruskalMST(G)
    for each vertex v in G do
        define a Cloud(v) of {v}
    let Q be a priority queue
    insert all edges into Q using their weights as the key
    T ← ∅
    while T has fewer than n − 1 edges do
        edge e ← Q.removeMin()
        let u, v be the endpoints of e
        if Cloud(v) ≠ Cloud(u) then
            add edge e to T
            merge Cloud(v) and Cloud(u)
    return T

[Figure: example run of Kruskal's algorithm on a small weighted graph.]
Partition-Based Implementation
A partition-based version of Kruskal's algorithm performs cloud merges as unions and tests as finds. Running time: O((n + m) log n).

Algorithm Kruskal(G):
    Input: a weighted graph G
    Output: an MST T for G
    let P be a partition of the vertices of G, where each vertex forms a separate set
    let Q be a priority queue storing the edges of G, sorted by their weights
    let T be an initially-empty tree
    while Q is not empty do
        (u, v) ← Q.removeMinElement()
        if P.find(u) ≠ P.find(v) then
            add (u, v) to T
            P.union(u, v)
    return T
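The union/find structure above can be sketched with a dict-based union-find (path compression and union by size are common refinements, not mandated by the slides):

```python
# Partition-based Kruskal. edges: list of (weight, u, v) triples.
# Returns the MST edges in the order they are accepted.
def kruskal(vertices, edges):
    parent = {v: v for v in vertices}
    size = {v: 1 for v in vertices}

    def find(v):                        # find with path compression
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    mst = []
    for w, u, v in sorted(edges):       # edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # endpoints in different clouds
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru             # union by size
            size[ru] += size[rv]
            mst.append((w, u, v))
    return mst
```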
Kruskal Example
[Figure: step-by-step run of Kruskal's algorithm on a weighted flight network (BOS, PVD, ORD, JFK, BWI, DFW, SFO, LAX, MIA, and others), adding edges in increasing weight order until the MST is complete.]
Baruvka's Algorithm
Like Kruskal's algorithm, Baruvka's algorithm grows many clouds at once. Each iteration of the while loop at least halves the number of connected components of T, so there are O(log n) iterations and the running time is O(m log n).

Algorithm BaruvkaMST(G)
    T ← V  { just the vertices of G }
    while T has fewer than n − 1 edges do
        for each connected component C in T do
            let edge e be the smallest-weight edge from C to another component in T
            if e is not already in T then
                add edge e to T
    return T
[Figure: example run of Baruvka's algorithm on the same weighted flight network, with each connected component selecting its minimum-weight outgoing edge.]
Maximum Flow
5/13/2002 5:09 PM

Outline
- Flow (8.1.1)
- Cut (8.1.2)
- Augmenting paths and Ford-Fulkerson's algorithm

Flow Network
A flow network consists of a digraph with a nonnegative integer weight on each edge, called the capacity, a distinguished source vertex s with no incoming edges, and a distinguished sink vertex t with no outgoing edges.
Applications: hydraulic systems, electrical circuits, traffic movements, freight transportation.
[Figure: example flow network with edge capacities.]
Flow
A flow f assigns to each edge e a nonnegative value f(e) not exceeding the capacity of e (capacity rule), such that for each vertex v other than the source and the sink, the total incoming flow equals the total outgoing flow (conservation rule):

    Σ_{e ∈ E−(v)} f(e) = Σ_{e ∈ E+(v)} f(e)

where E−(v) and E+(v) are the incoming and outgoing edges of v, respectively.
The value of a flow f, denoted |f|, is the total flow from the source, which is the same as the total flow into the sink.
Example: on the same network, a flow of value 8 = 2 + 3 + 3 = 1 + 3 + 4 and a maximum flow of value 10 = 4 + 3 + 3 = 3 + 3 + 4.
Cut
A cut of the network is a partition χ = (Vs, Vt) of the vertices such that s ∈ Vs and t ∈ Vt.
- Forward edge of cut χ: origin in Vs and destination in Vt.
- Backward edge of cut χ: origin in Vt and destination in Vs.
The flow f(χ) across a cut χ is the total flow on its forward edges minus the total flow on its backward edges; the capacity c(χ) of a cut is the total capacity of its forward edges.
Flows and Cuts
Lemma: the flow f(χ) across any cut χ is equal to the flow value |f|.
Lemma: the flow f(χ) across a cut χ is less than or equal to the capacity c(χ) of the cut.
Theorem: the value of any flow is less than or equal to the capacity of any cut; for any flow f and any cut χ, |f| ≤ c(χ).
Augmenting Path
Consider a path π from s to t. The residual capacity Δf(e) of an edge e of π is c(e) − f(e) for a forward edge and f(e) for a backward edge; an edge e is traversed from u to v provided Δf(u, v) > 0. The residual capacity Δf(π) of π is the smallest residual capacity of its edges, and π is an augmenting path if Δf(π) > 0.
Example: a path π with f(s, u) = 3, f(u, w) = 1, f(w, v) = 1, f(v, t) = 2 and Δf(π) = 1. Example cuts: c(χ1) = 12 = 6 + 3 + 1 + 2 and c(χ2) = 21 = 3 + 7 + 9 + 2, with |f| = 8.

Flow Augmentation
Lemma: let π be an augmenting path for flow f in network N. There exists a flow f′ for N of value |f′| = |f| + Δf(π).
Proof: we compute flow f′ by modifying the flow on the edges of π: for a forward edge, f′(e) = f(e) + Δf(π); for a backward edge, f′(e) = f(e) − Δf(π).

Ford-Fulkerson's Algorithm
Repeatedly search for an augmenting path π and augment by Δf(π) the flow along the edges of π. A specialization of DFS (or BFS) searches for an augmenting path.

Algorithm FordFulkersonMaxFlow(N)
    for all e ∈ G.edges()
        setFlow(e, 0)
    while G has an augmenting path π
        { compute residual capacity Δf of π }
        Δ ← +∞
        for all edges e ∈ π
            Δ ← min(Δ, Δf(e))
        { augment flow along π }
        for all forward edges e ∈ π
            setFlow(e, getFlow(e) + Δ)
        for all backward edges e ∈ π
            setFlow(e, getFlow(e) − Δ)

Max-Flow and Min-Cut
Termination of Ford-Fulkerson's algorithm: when no augmenting path remains, define Vs as the set of vertices reachable from s by augmenting paths and Vt as the set of remaining vertices; the resulting cut χ = (Vs, Vt) satisfies c(χ) = |f| (in the example, c(χ) = |f| = 10).
Theorem: the value of a maximum flow is equal to the capacity of a minimum cut.
Example (2) and Analysis
In the worst case, Ford-Fulkerson's algorithm performs |f*| flow augmentations, where f* is a maximum flow: each augmenting path is found in O(n + m) time with DFS or BFS, but may increase the flow value by as little as 1. Example: a network with two capacity-50 paths from s to t joined by a capacity-1 edge; if the chosen augmenting paths alternate across the capacity-1 edge, each augmentation increases the flow by only 1, so 100 augmentations are needed.
[Figure: example (2) of two successive augmentations, and the pathological network with capacity-50 edges and a capacity-1 middle edge.]
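Using BFS for the augmenting-path search (the Edmonds-Karp refinement, which avoids the pathological behavior above) the algorithm can be sketched as follows; the dict-of-dicts capacity format is an assumption.

```python
# Ford-Fulkerson with BFS path search. cap: dict-of-dicts where
# cap[u][v] is the capacity of edge u -> v. Returns the max-flow value.
from collections import deque

def max_flow(cap, s, t):
    # Residual capacities, including reverse edges initialized to 0.
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u, nbrs in cap.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for an augmenting path in the residual network.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow                  # no augmenting path remains
        # Recover the path, compute its residual capacity, augment.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= delta
            res[v][u] += delta           # allow pushing flow back
        flow += delta
```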
Pattern Matching
5/29/2002 11:27 AM

Strings
A string is a sequence of characters. Examples of strings: Java program, HTML document, DNA sequence, digitized image. Examples of alphabets: ASCII, Unicode, {0, 1}, {A, C, G, T}.
Applications of pattern matching: text editors, search engines, biological research.
[Figure: brute-force matching of the pattern "rithm" against the text "a pattern matching algorithm", showing the comparisons performed at each shift.]
Brute-Force Pattern Matching
The brute-force algorithm compares the pattern P with the text T for each possible shift of P relative to T, until either a match is found or all placements of the pattern have been tried. Brute-force pattern matching runs in time O(nm). Example of worst case: T = aaa…ah and P = aaah; this may occur in images and DNA sequences but is unlikely in English text.

Algorithm BruteForceMatch(T, P)
    Input: text T of size n and pattern P of size m
    Output: starting index of a substring of T equal to P, or −1 if no such substring exists
    for i ← 0 to n − m
        { test shift i of the pattern }
        j ← 0
        while j < m ∧ T[i + j] = P[j]
            j ← j + 1
        if j = m
            return i  { match at i }
        else
            break while loop  { mismatch }
    return −1  { no match anywhere }

Boyer-Moore Heuristics
Boyer-Moore's algorithm compares the pattern with the text moving backwards and, on a mismatch at a text character c, jumps the pattern forward using the last occurrence of c in the pattern.

Last-Occurrence Function
Boyer-Moore's algorithm preprocesses the pattern P and the alphabet Σ to build the last-occurrence function L, mapping Σ to integers: L(c) is the largest index i such that P[i] = c, or −1 if no such index exists.
Example: Σ = {a, b, c, d}, P = abacab: L(a) = 4, L(b) = 5, L(c) = 3, L(d) = −1.
Boyer-Moore: Case Analysis
Suppose a mismatch occurs when comparing P[j] with T[i] = c, and let l = L(c).
- Case 1: j ≤ 1 + l. The last occurrence of c in P is at or beyond the mismatch position, so the pattern is shifted by one position: i is incremented by m − j.
- Case 2: 1 + l < j. The pattern is shifted to align the last occurrence of c in P with T[i]: i is incremented by m − (1 + l).

Analysis
Boyer-Moore's algorithm runs in time O(nm + s), where s is the size of the alphabet. Example of worst case: T = aaa…a and P = baaa; the worst case may occur in images and DNA sequences but is unlikely in English text.
The KMP Algorithm
Knuth-Morris-Pratt's algorithm compares the pattern to the text left-to-right, but shifts the pattern more intelligently than the brute-force algorithm. When a mismatch occurs, what is the most we can shift the pattern so as to avoid redundant comparisons? Answer: the largest prefix of P[0..j] that is a suffix of P[1..j]. There is no need to repeat comparisons that already matched; we resume comparing at the mismatch position in the text.

KMP Failure Function
KMP preprocesses the pattern to find matches of its prefixes with itself: the failure function F(j) is the size of the largest prefix of P[0..j] that is also a suffix of P[1..j]. If a mismatch occurs at P[j], we advance to position F(j − 1) in the pattern.
Example: P = abaaba has F = [0, 0, 1, 1, 2, 3]; after matching "abaab" and mismatching the next character x, the pattern is shifted so that comparison resumes with P[F(4)] = P[2].
Algorithm KMPMatch(T, P)
    F ← failureFunction(P)
    i ← 0
    j ← 0
    while i < n
        if T[i] = P[j]
            if j = m − 1
                return i − j  { match }
            else
                i ← i + 1
                j ← j + 1
        else
            if j > 0
                j ← F[j − 1]
            else
                i ← i + 1
    return −1  { no match }
Analysis: at each iteration, either i increases by one, or the shift amount i − j increases by at least one (observe that F(j − 1) < j); hence KMP's algorithm runs in optimal time O(m + n).

The failure function can be represented by an array and computed in O(m) time by an analogous algorithm:

Algorithm failureFunction(P)
    F[0] ← 0
    i ← 1
    j ← 0
    while i < m
        if P[i] = P[j]
            { we have matched j + 1 characters }
            F[i] ← j + 1
            i ← i + 1
            j ← j + 1
        else if j > 0 then
            { use failure function to shift P }
            j ← F[j − 1]
        else
            F[i] ← 0  { no match }
            i ← i + 1

Example: for P = abacab (F = [0, 0, 1, 0, 1, 2]), matching against the text abacaabaccabacabaabb finds a match after 19 character comparisons.
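The two pseudocode routines above translate directly to Python:

```python
# KMP failure function and matcher, mirroring the pseudocode above.
def failure_function(P):
    F = [0] * len(P)
    i, j = 1, 0
    while i < len(P):
        if P[i] == P[j]:
            F[i] = j + 1          # matched j + 1 characters
            i, j = i + 1, j + 1
        elif j > 0:
            j = F[j - 1]          # shift the pattern using F
        else:
            F[i] = 0
            i += 1
    return F

def kmp_match(T, P):
    F = failure_function(P)
    i = j = 0
    while i < len(T):
        if T[i] == P[j]:
            if j == len(P) - 1:
                return i - j      # index of the match
            i, j = i + 1, j + 1
        elif j > 0:
            j = F[j - 1]
        else:
            i += 1
    return -1                     # no match
```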
Tries
5/24/2002 8:37 AM

Preprocessing Strings
After preprocessing the pattern, KMP's algorithm performs pattern matching in time proportional to the text size. If instead the text is large, immutable, and searched often, we may preprocess the text: a trie is a compact data structure for representing a set of strings that supports pattern matching queries in time proportional to the size of the pattern.
Standard Trie
The standard trie for a set of strings S is an ordered tree such that each node but the root is labeled with a character, the children of a node are alphabetically ordered, and the paths from the external nodes to the root yield the strings of S.

Word Matching with a Trie
We insert the words of the text into a trie; each leaf stores the occurrences of the associated word in the text.
Example text (character positions 0-88):

    see a bear? sell stock! see a bull? buy stock! bid stock! bid stock! hear the bell? stop!

Occurrence lists: see → 0, 24; bear → 6; sell → 12; stock → 17, 40, 51, 62; bull → 30; buy → 36; bid → 47, 58; hear → 69; bell → 78; stop → 84.
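A minimal dict-based sketch of the word-matching trie above (the `'$'` end-of-word marker is an implementation choice, not from the slides):

```python
# Standard trie for word matching: each node is a dict of children;
# the '$' key at a node stores the occurrence list of that word.
def build_trie(words_with_positions):
    root = {}
    for word, pos in words_with_positions:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node.setdefault('$', []).append(pos)
    return root

def occurrences(trie, word):
    node = trie
    for ch in word:
        if ch not in node:
            return []
        node = node[ch]
    return node.get('$', [])
```

A query walks one node per pattern character, so lookup time is proportional to the pattern size, as claimed above.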
Compressed Trie
A compressed trie has internal nodes of degree at least two; it is obtained from a standard trie by compressing chains of redundant nodes. Edge labels become strings, e.g., b, e, ar, ll, id, u, ll, y, s, e, ll, to, ck, p for the trie above.

Compact Representation
A compact representation of a compressed trie for an array of strings stores, at each node, a triple (i, j, k) denoting the substring S[i][j..k] instead of the substring itself. It uses O(s) space, where s is the number of strings in the array, and serves as an auxiliary index structure.
Example: for an array S[0..9] of the words see, bear, sell, stock, bull, buy, bid, hear, bell, and stop, node triples include (1, 0, 0), (1, 1, 1), (6, 1, 2), (8, 2, 3), (4, 1, 1), (4, 2, 3), (0, 0, 0), (7, 0, 3), (0, 1, 1), (5, 2, 2), (0, 2, 2), (3, 1, 2), (2, 2, 3), (3, 3, 4), (9, 3, 3).
Suffix Trie
The suffix trie of a string X is the compressed trie of all the suffixes of X.
Example: X = minimize (character positions 0-7), with node labels given compactly as index ranges such as (7, 7), (4, 7), (2, 7), (0, 1), (6, 7), and (1, 1).

Encoding Tries
An encoding trie represents a prefix code (e.g., the code words 00, 010, 011, 10, 11): each external node stores a character, and the code word of a character is given by the path from the root to that node (left child = 0, right child = 1).
Given a text string X, we want to find a prefix code for the characters of X that yields a small encoding for X: frequent characters should have short code-words, and rare characters should have long code-words.
Example: for X = abracadabra, encoding trie T1 encodes X into 29 bits and encoding trie T2 encodes X into 24 bits.
5/24/2002 8:37 AM
Huffman's Algorithm
Given a string X,
Huffman's algorithm
constructs a prefix
code that minimizes
the size of the
encoding of X
It runs in time
O(n + d log d), where
n is the size of X
and d is the number
of distinct characters
of X
A heap-based
priority queue is
used as an auxiliary
structure
Algorithm HuffmanEncoding(X)
  Input string X of size n
  Output optimal encoding trie for X
  C ← distinctCharacters(X)
  computeFrequencies(C, X)
  Q ← new empty heap
  for all c ∈ C
    T ← new single-node tree storing c
    Q.insert(getFrequency(c), T)
  while Q.size() > 1
    f1 ← Q.minKey()
    T1 ← Q.removeMin()
    f2 ← Q.minKey()
    T2 ← Q.removeMin()
    T ← join(T1, T2)
    Q.insert(f1 + f2, T)
  return Q.removeMin()
Example
X = abracadabra
Frequencies:  a: 5   b: 2   c: 1   d: 1   r: 2
[Figure: the two trees of lowest frequency are merged repeatedly (c and d first, giving a subtree of weight 2) until a single encoding trie of total weight 11 remains.]
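The pseudocode above can be sketched in Python using the heapq module as the heap-based priority queue (the function name huffman_codes and the tuple-based tree representation are mine, not from the slides):

```python
import heapq
from collections import Counter

def huffman_codes(text):
    freq = Counter(text)
    # heap entries are (frequency, tiebreak, tree); a tree is either a
    # character (leaf) or a pair (left, right) of subtrees
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # two lowest-frequency trees
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (t1, t2)))  # join them
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"
    walk(heap[0][2], "")
    return codes

codes = huffman_codes("abracadabra")
encoded = "".join(codes[c] for c in "abracadabra")
print(len(encoded))  # 23
```

For X = abracadabra the Huffman code uses 23 bits, slightly better than the 24-bit code T2 of the earlier example; the total is the same regardless of how ties between equal frequencies are broken.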
Numerical Algorithms
6/8/2002 2:07 PM
Outline
Divisibility and primes
Modular arithmetic
Euclid's GCD algorithm
Multiplicative inverses
Powers
Fermat's little theorem
Euler's theorem
Divisibility and Primes
Prime number p:
  p is an integer, p ≥ 2
  The only divisors of p are 1 and p
Examples
  2, 7, 19 are primes
  -3, 1, 6 are not primes
gcd(18, 30) = 6
gcd(21, 49) = 7
gcd(0, 20) = 20
Prime factorization: 200 = 2^3 · 5^2
Modular Arithmetic
13 mod 13 = 0, since 13 = 0 + 1·13
-1 mod 13 = 12, since 12 = -1 + 1·13
Euclid's GCD Algorithm
Example: gcd(412, 260) = 4

Algorithm EuclidGCD(a, b)
  Input integers a and b
  Output gcd(a, b)
  if b = 0
    return a
  else
    return EuclidGCD(b, a mod b)
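The recursive pseudocode transcribes directly into Python (the function name euclid_gcd is mine):

```python
def euclid_gcd(a, b):
    # gcd(a, b) = gcd(b, a mod b); gcd(a, 0) = a
    if b == 0:
        return a
    return euclid_gcd(b, a % b)

print(euclid_gcd(412, 260))  # 4
```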
Analysis
Let a_i and a_{i+1} be the arguments of the i-th recursive call. Then
  a_{i+2} = a_i mod a_{i+1} = a_i - ⌊a_i / a_{i+1}⌋ · a_{i+1}
so
  a_{i+2} < a_{i+1} ≤ a_i
Multiplicative Inverses
Theorem
A variation of Euclid's GCD algorithm computes the multiplicative
inverse of an element x of Zn or determines that it does not exist
Corollary
If p is prime, every nonzero residue in Zp has a multiplicative
inverse
Example (n = 10): multiplicative inverses in Z10
  x:    1  3  7  9
  x^-1: 1  7  3  9

Example (p = 5):
  2^4 mod 5 = 16 mod 5 = 1
  4^4 mod 5 = 256 mod 5 = 1
Powers
Let p be a prime
The sequences of successive powers of the elements of Zp
exhibit repeating subsequences
The sizes of the repeating subsequences and the number of
their repetitions are the divisors of p - 1
Example (p = 7)
[Table: x, x^2, x^3, x^4, x^5, x^6 for each x in Z7.]
The multiplicative group for Zn, denoted Z*n, is the subset of
elements of Zn relatively prime with n
The totient function of n, denoted φ(n), is the size of Z*n
Example
  Z*10 = { 1, 3, 7, 9 }
  φ(10) = 4
If p is prime, we have φ(p) = p - 1

Corollary
Let p be a prime. For each nonzero residue x of Zp,
the multiplicative inverse of x is x^(p-2) mod p
Proof
  x (x^(p-2) mod p) mod p = x · x^(p-2) mod p = x^(p-1) mod p = 1
Fermat's Little Theorem
Theorem
Let p be a prime. For each nonzero residue x of Zp,
we have x^(p-1) mod p = 1
Example (p = 5):
  1^4 mod 5 = 1
  3^4 mod 5 = 81 mod 5 = 1
Theorem
An element x of Zn has a multiplicative inverse if and only if x and
n are relatively prime
Euler's Theorem
Theorem
For each element x of Z*n, we have x^φ(n) mod n = 1
Example (n = 10)
  3^φ(10) mod 10 = 3^4 mod 10 = 81 mod 10 = 1
  7^φ(10) mod 10 = 7^4 mod 10 = 2401 mod 10 = 1
  9^φ(10) mod 10 = 9^4 mod 10 = 6561 mod 10 = 1
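Both theorems are easy to check numerically. A minimal sketch (the helper phi, a brute-force totient, is mine; it is fine for the small n used here but too slow for cryptographic sizes):

```python
import math

def phi(n):
    # totient: count of 1 <= k <= n with gcd(k, n) = 1
    return sum(1 for k in range(1, n + 1) if math.gcd(k, n) == 1)

print(phi(10))  # 4

# Euler's theorem: x^phi(n) mod n = 1 for every x in Z*n
for x in [1, 3, 7, 9]:            # Z*10
    assert pow(x, phi(10), 10) == 1

# Fermat's little theorem: x^(p-1) mod p = 1 for prime p
for x in [1, 2, 3, 4]:            # nonzero residues mod 5
    assert pow(x, 4, 5) == 1
```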
FFT
11/27/2002 1:42 AM
Polynomials
Polynomial: p(x) = 5 + 2x + 8x^2 + 3x^3 + 4x^4
In general,
  p(x) = a0 + a1 x + a2 x^2 + … + a_{n-1} x^{n-1}
or
  p(x) = Σ_{i=0}^{n-1} a_i x^i

Polynomial Evaluation
Horner's Rule: given x, we can evaluate p(x) in O(n) time using the equation

  Eval(A, x):              [where A = (a0, a1, a2, …, a_{n-1})]
    if n = 1 then
      return a0
    else
      return a0 + x · Eval([a1, a2, …, a_{n-1}], x)
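An iterative Python version of Horner's rule (the name horner_eval is mine); it folds the same nested expression a0 + x(a1 + x(a2 + …)) from the highest coefficient down:

```python
def horner_eval(coeffs, x):
    # coeffs = (a0, a1, ..., a_{n-1}); one multiply and add per coefficient
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result

# p(x) = 5 + 2x + 8x^2 + 3x^3 + 4x^4 at x = 2:
print(horner_eval([5, 2, 8, 3, 4], 2))  # 5 + 4 + 32 + 24 + 64 = 129
```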
Polynomial Multiplication
Problem: given the coefficients of p(x) and q(x), each of degree n - 1,
compute the coefficients of the product, of degree 2n - 2:
  p(x) q(x) = Σ_{i=0}^{2n-2} c_i x^i,  where  c_i = Σ_{j=0}^{i} a_j b_{i-j}
The coefficient vector c is the convolution of the coefficient vectors a and b.
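The convolution formula for the c_i computes the product directly in O(n^2) time; a minimal sketch (the name poly_mult is mine):

```python
def poly_mult(a, b):
    # c[i] = sum of a[j] * b[i - j]: every pairwise product lands in c[i + j]
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
print(poly_mult([1, 2], [3, 4]))  # [3, 10, 8]
```

The FFT, discussed next, brings this down to O(n log n).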
Properties of Primitive Roots of Unity
A primitive n-th root of unity ω satisfies ω^n = 1, and the numbers
1, ω, ω^2, …, ω^{n-1} are all distinct.

Example 1: successive powers in Z*11

   x   x^2 x^3 x^4 x^5 x^6 x^7 x^8 x^9 x^10
   1    1   1   1   1   1   1   1   1   1
   2    4   8   5  10   9   7   3   6   1
   3    9   5   4   1   3   9   5   4   1
   4    5   9   3   1   4   5   9   3   1
   5    3   4   9   1   5   3   4   9   1
   6    3   7   9  10   5   8   4   2   1
   7    5   2   3  10   4   6   9   8   1
   8    9   6   4  10   3   2   5   7   1
   9    4   3   5   1   9   4   3   5   1
  10    1  10   1  10   1  10   1  10   1

The rows of 2, 6, 7, and 8 are all distinct, so these elements are
primitive 10th roots of unity in Z*11.
Further properties (n even where needed):

Inverse property: ω^{-1} = ω^{n-1}
  Proof: ω^{n-1} · ω = ω^n = 1
Summation property: for 1 ≤ k ≤ n - 1,
  Σ_{j=0}^{n-1} ω^{kj} = ((ω^k)^n - 1) / (ω^k - 1) = ((ω^n)^k - 1) / (ω^k - 1) = (1^k - 1) / (ω^k - 1) = 0
Reflective property: ω^{n/2} = -1, since
  0 = Σ_{j=0}^{n-1} ω^{(n/2)j} = ω^0 + ω^{n/2} + ω^0 + ω^{n/2} + … = (n/2)(1 + ω^{n/2})

Discrete Fourier Transform (DFT)
Given the coefficient vector a = (a0, a1, …, a_{n-1}) of p(x) and a
primitive n-th root of unity ω, we produce (y0, y1, …, y_{n-1}),
where y_j = p(ω^j). That is,
  y_j = Σ_{i=0}^{n-1} a_i ω^{ij}
Matrix form: y = F a, where F[i, j] = ω^{ij}.

Correctness of the inverse DFT
The inverse matrix is A = F^{-1} with A[i, j] = ω^{-ij} / n. Then
  If i = j:  (A F)[i, i] = (1/n) Σ_{k=0}^{n-1} ω^{-ki} ω^{ki} = (1/n) Σ_{k=0}^{n-1} ω^0 = (1/n) · n = 1
  If i ≠ j:  (A F)[i, j] = (1/n) Σ_{k=0}^{n-1} ω^{(j-i)k} = 0   (by the summation property)

Convolution
  [a0, a1, …, a_{n-1}] → pad with n zeros → [a0, …, a_{n-1}, 0, …, 0] → DFT → [y0, y1, …, y_{2n-1}]
  [b0, b1, …, b_{n-1}] → pad with n zeros → [b0, …, b_{n-1}, 0, …, 0] → DFT → [z0, z1, …, z_{2n-1}]
  Component multiply: [y0 z0, y1 z1, …, y_{2n-1} z_{2n-1}]
  Inverse DFT: [c0, c1, …, c_{2n-1}]   (the convolution)
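The whole pipeline, pad, DFT, componentwise multiply, inverse DFT, can be sketched over the complex roots of unity ω = e^{2πi/n} with a recursive Cooley-Tukey FFT (the names fft and convolve are mine, not from the slides):

```python
import cmath

def fft(a, inverse=False):
    # recursive Cooley-Tukey FFT; len(a) must be a power of 2
    n = len(a)
    if n == 1:
        return list(a)
    sign = 1 if inverse else -1       # sign convention for the exponent
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)   # ω^k
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]        # uses ω^{n/2} = -1
    return out

def convolve(a, b):
    n = 1
    while n < len(a) + len(b):        # pad to a power of 2 >= 2n
        n *= 2
    fa = fft(a + [0] * (n - len(a)))
    fb = fft(b + [0] * (n - len(b)))
    fc = [x * y for x, y in zip(fa, fb)]              # component multiply
    c = [v.real / n for v in fft(fc, inverse=True)]   # inverse DFT
    return [round(v) for v in c[:len(a) + len(b) - 1]]

# (5 + 2x + 8x^2 + 3x^3 + 4x^4)(1 + x):
print(convolve([5, 2, 8, 3, 4], [1, 1]))  # [5, 7, 10, 11, 7, 4]
```

Each level of the recursion does O(n) work over O(log n) levels, giving the O(n log n) bound.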
11/27/2002 1:42 AM
13
Java Example:
Multiplying Big Integers
FFT
14
Java Integer
Multiply Method
FFT
15
FFT
16
FFT
17
FFT
18
FFT
11/27/2002 1:42 AM
Non-recursive FFT
Experimental Results
FFT
19
FFT
20
Cryptography
6/8/2002 2:08 PM
Outline
Traditional cryptography
Statistical attacks
Secret-key encryption
Public-key encryption
Cryptography
Encryption and Decryption
  plaintext → encrypt → ciphertext → decrypt → plaintext
Traditional Cryptography
Scenario: the sender encrypts the plaintext into ciphertext; the receiver decrypts it.
Caesar cipher:
  replace a with d
  replace b with e
  ...
  replace z with c
Issues: simple substitution ciphers are vulnerable to statistical attacks.

Statistical Attacks
Letter and trigram frequencies of the ciphertext leak information about
the plaintext; for instance, in the substitution-cipher example that
follows, the frequent trigram LBO is THE.
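A minimal Python sketch of the Caesar cipher described above (the function names are mine):

```python
def caesar_encrypt(plaintext, shift=3):
    # shift each letter forward by `shift` positions, wrapping z back to a
    out = []
    for ch in plaintext:
        if ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)            # leave spaces and punctuation alone
    return "".join(out)

def caesar_decrypt(ciphertext, shift=3):
    return caesar_encrypt(ciphertext, -shift)

print(caesar_encrypt("attack at dawn"))  # dwwdfn dw gdzq
```

Decryption with the wrong shift, or simple letter-frequency counting, breaks this in at most 25 tries, which is the statistical-attack weakness noted above.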
Secret-Key Encryption
The sender and the receiver share a secret key, used both to encrypt
and to decrypt. Widely used secret-key ciphers:
  DES
  3DES
  IDEA
  BLOWFISH
Public-Key Encryption
Code:
  plain:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  cipher: X Z A V O I D B Y G E R S P C F H J K L M N Q T U W
Ciphertext:
PCQ VMJYPD LBYK LYSO KBXBJXWXV BXV ZCJPO EYPD KBXBJYUXJ
LBJOO KCPK. CP LBO LBCMKXPV XPV IYJKL PYDBL, QBOP KBO BXV
OPVOV LBO LXRO CI SX'XJMI, KBO JCKO XPV EYKKOV LBO DJCMPV
ZOICJO BYS, KXUYPD: DJOXL EYPD, ICJ X LBCMKXPV XPV CPO PYDBLK
Y BXNO ZOOP JOACMPLYPD LC UCM LBO IXZROK CI FXKL XDOK XPV
LBO RODOPVK CI XPAYOPL EYPDK. SXU Y SXEO KC ZCRV XK LC AJXNO
X IXNCMJ CI UCMJ SXGOKLU?
OFYRCDMO, LXROK IJCS LBO LBCMKXPV XPV CPO PYDBLK
Plaintext:
Now during this time Shahrazad had borne King Shahriyar three sons.
On the thousand and first night, when she had ended the tale of
Ma'aruf, she rose and kissed the ground before him, saying: Great King,
for a thousand and one nights I have been recounting to you the fables
of past ages and the legends of ancient kings. May I make so bold as to
crave a favour of your majesty?
Epilogue, Tales from the Thousand and One Nights
Public-Key Encryption
Anyone can encrypt with the recipient's public key, but only the
holder of the private key can decrypt:
  plaintext → encrypt (public key) → ciphertext → decrypt (private key) → plaintext
RSA Cryptosystem
6/8/2002 2:20 PM
Outline
Euler's theorem (10.1.3)
RSA cryptosystem (10.2.3)
RSA Cryptosystem
Security
[Table: estimated resources needed to break RSA by factoring n]
  Bits     PCs          Memory
  430                   128MB
  760      215,000      4GB
  1,020    342×10^6     170GB
  1,620    1.6×10^15    120TB
RSA Cryptosystem
Setup:
  p, q: large primes (kept secret)
  n = p·q
  φ(n) = (p - 1)(q - 1)
  e: relatively prime with φ(n)
  d: multiplicative inverse of e in Z_φ(n)
Keys:
  public key: the pair (n, e)
  private key: d
Encryption:
  plaintext M in Zn
  C = M^e mod n
Decryption:
  M = C^d mod n

Example
Setup: p = 5, q = 11, n = 5·11 = 55, φ(n) = 4·10 = 40
Keys: e = 3, d = 27  (3·27 = 81 = 2·40 + 1)
Encryption: C = M^3 mod 55
Decryption: M = C^27 mod 55
[Table: C = M^3 mod 55 for M = 1, …, 54; e.g., 2 → 8, 7 → 13, 19 → 39, 51 → 46.]

Example
Setup: p = 7, q = 17, n = 7·17 = 119, φ(n) = 6·16 = 96
Keys: e = 5, d = 77
Encryption: M = 19, C = 19^5 mod 119 = 66
Decryption: M = 66^77 mod 119 = 19
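The small example can be reproduced in a few lines of Python, using the built-in three-argument pow for modular power and (in Python 3.8+) pow with exponent -1 for the modular inverse (variable names are mine):

```python
# toy RSA with the textbook parameters; real RSA uses primes of
# hundreds of digits, but the arithmetic is identical
p, q = 5, 11
n = p * q                  # 55
phi = (p - 1) * (q - 1)    # 40
e = 3
d = pow(e, -1, phi)        # multiplicative inverse of e mod phi: 27

M = 19
C = pow(M, e, n)           # encryption: M^e mod n
print(C, pow(C, d, n))     # 39 19  (decryption recovers M)
```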
Euler's Theorem (recap)
Definition: the multiplicative group Z*n is the subset of elements of
Zn relatively prime with n; the totient function φ(n) is the size of Z*n
Example: Z*10 = { 1, 3, 7, 9 }, φ(10) = 4; if p is prime, φ(p) = p - 1
Theorem: for each element x of Z*n, we have x^φ(n) mod n = 1
Correctness
The definition of d gives e·d = kφ(n) + 1 for some integer k.
Thus, we obtain
  (M^e)^d mod n =
  M^{ed} mod n =
  M^{kφ(n) + 1} mod n =
  M·M^{kφ(n)} mod n =
  M·(M^{φ(n)})^k mod n =
  M·(M^{φ(n)} mod n)^k mod n =
  M·1^k mod n =
  M mod n =
  M
(using Euler's theorem, which requires M relatively prime with n;
see the book for the proof of correctness in the remaining case)

Algorithmic Issues
The implementation of the RSA cryptosystem requires various algorithms.
Modular Power
The repeated squaring algorithm speeds up the computation of a
modular power a^p mod n
Write the exponent p in binary:
  p = p_{b-1} p_{b-2} … p_1 p_0
Start with
  Q_1 = a^{p_{b-1}} mod n
Repeatedly compute
  Q_i = ((Q_{i-1})^2 mod n) · a^{p_{b-i}} mod n
We obtain
  Q_b = a^p mod n
The repeated squaring algorithm performs O(log p) arithmetic operations
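The Q_i recurrence is the left-to-right binary method: square, then multiply by a when the current bit of p is 1. A Python sketch (mod_power is my name; Python's built-in pow(a, p, n) does the same thing):

```python
def mod_power(a, p, n):
    # left-to-right repeated squaring over the bits of p
    q = 1
    for bit in bin(p)[2:]:        # bits of p, most significant first
        q = (q * q) % n           # Q_i = (Q_{i-1})^2 mod n ...
        if bit == '1':
            q = (q * a) % n       # ... times a^{p_{b-i}} when the bit is 1
    return q

# the RSA example from above: 19^5 mod 119
print(mod_power(19, 5, 119))  # 66
```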
Theorem
Given positive integers a and b, let d be the smallest
positive integer such that
  d = ia + jb
for some integers i and j. We have
  d = gcd(a, b)
Example
  a = 21, b = 15
  d = 3, i = 3, j = -4
  3 = 3·21 + (-4)·15 = 63 - 60 = 3
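The extended Euclidean algorithm finds such a pair (i, j) alongside the gcd; this is also how the RSA exponent d is computed from e. A sketch (extended_gcd is my name; note it may return a different valid pair than the one in the example, since (i, j) is not unique):

```python
def extended_gcd(a, b):
    # returns (d, i, j) with d = gcd(a, b) = i*a + j*b
    if b == 0:
        return a, 1, 0
    d, i, j = extended_gcd(b, a % b)
    return d, j, i - (a // b) * j

d, i, j = extended_gcd(21, 15)
print(d, i, j)            # one valid certificate: d = i*21 + j*15

# multiplicative inverse: for gcd(e, m) = 1, i in e*i + m*j = 1
# is the inverse of e mod m (here: 3^-1 mod 40 for the RSA example)
d, i, j = extended_gcd(3, 40)
print(i % 40)             # 27
```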
Setup requires:
  Representation of integers of arbitrarily large size and
  arithmetic operations on them
  Generation of random numbers with a given number of bits
  (to generate candidates p and q)
  Primality testing (to check that candidates p and q are prime)
  Computation of the GCD (to verify that e and φ(n) are
  relatively prime)
  Computation of the multiplicative inverse (to compute d from e)
Encryption and decryption require:
  Modular power

Pseudoprimality Testing
Algorithm RandPrimeTest(n, k)
  Input integer n, confidence parameter k, and composite
    witness function witness(x, n) with error probability q
  Output an indication of whether n is composite or prime,
    wrong with probability at most 2^{-k}
  t ← k / log2(1/q)
  for i ← 1 to t
    x ← random()
    if witness(x, n) = true
      return "n is composite"
  return "n is prime"
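A concrete witness function is the Fermat test: by Fermat's little theorem, if x^(n-1) mod n ≠ 1 for some x, then n is composite. A simplified Python sketch (function names are mine, and I use a fixed trial count t instead of deriving it from k and q as the pseudocode does):

```python
import random

def fermat_witness(x, n):
    # x witnesses that n is composite if x^(n-1) mod n != 1
    return pow(x, n - 1, n) != 1

def rand_prime_test(n, t=20):
    # t independent random trials; composite verdicts are always correct
    for _ in range(t):
        x = random.randrange(2, n - 1)
        if fermat_witness(x, n):
            return "composite"
    return "probably prime"

print(rand_prime_test(119))  # composite (119 = 7 * 17)
print(rand_prime_test(101))  # probably prime
```

The Fermat test can be fooled by rare Carmichael numbers; practical implementations use a stronger witness such as Miller-Rabin with the same overall loop.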
Information Security
6/8/2002 2:20 PM
Outline
Definition (10.2.2)
RSA signature and verification (10.2.3)
Key distribution

Digital Signature
  message M → one-way hash → fingerprint f = H(M)
Integrity: S establishes that M has not been altered
Nonrepudiation: S unequivocally identifies the author A of M and proves
that A did indeed sign M
RSA Signature Example
Setup: p = 5, q = 11, n = 5·11 = 55, φ(n) = 4·10 = 40
Keys: e = 3, d = 27  (3·27 = 81 = 2·40 + 1)
Signature:
  M = 51
  S = 51^27 mod 55 = 6
Verification:
  S^e mod n = 6^3 mod 55 = 216 mod 55 = 51 = M
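Signing is RSA with the exponents swapped: the signer uses the private exponent d, and anyone can verify with the public exponent e. With the example parameters:

```python
# RSA signing with the example parameters: sign with d, verify with e
n, e, d = 55, 3, 27
M = 51
S = pow(M, d, n)            # signature: M^d mod n
print(S, pow(S, e, n))      # 6 51  (verification recovers M)
```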
Outline
Definition (10.3.1)
Applications (10.3.2)
Certificates (10.3.5)
Revocation (10.3.5)
Certificates
  message M → one-way hash → fingerprint f = H(M) → sign → signature S = f^d mod n
A certificate contains:
  Serial number
  Hash and signature schemes (e.g., MD5 and RSA)
  Issuer (certification authority)
  Period of validity (from, to)
  Subject (URL and organization)
  Public key

Certificate Revocation
Convex Hull
6/8/2002 12:42 PM
Convex Hull

Applications
  Motion planning: find a route from start to end that avoids a
  polygonal obstacle
  Geometric algorithms
[Figure: a path from start to end skirting an obstacle along its hull.]

Convex Polygon
[Figure: a convex polygon and a nonconvex polygon.]

Special Cases
  Two points
  All the points are collinear
Orientation
The orientation of three points a, b, c in the plane is clockwise (CW),
counterclockwise (CCW), or collinear (COLL)
orientation(a, b, c) is given by the sign of the determinant

               | xa  ya  1 |
  Δ(a, b, c) = | xb  yb  1 |
               | xc  yc  1 |

  Δ > 0: CCW    Δ < 0: CW    Δ = 0: COLL

Graham Scan
Sorting by Angle
[Figure: points p, q sorted by angle around an anchor point.]
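The determinant expands to a two-term cross product, which is how the predicate is usually coded (the function name orientation and the tuple point representation are mine):

```python
def orientation(a, b, c):
    # sign of the determinant | xa ya 1 ; xb yb 1 ; xc yc 1 |,
    # expanded as the cross product of (b - a) and (c - a)
    delta = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    if delta > 0:
        return "CCW"
    if delta < 0:
        return "CW"
    return "COLL"

print(orientation((0, 0), (1, 0), (1, 1)))  # CCW
print(orientation((0, 0), (1, 1), (2, 2)))  # COLL
```

With exact integer coordinates this predicate is robust; with floating point, a tolerance around zero is usually needed.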
Analysis
[Graham scan details not recovered; each vertex r of the polygon is
pushed and popped a constant number of times, so the scan runs in
linear time after the O(n log n) sort by angle.]
6/8/2002 2:01 PM
Point Location
Given a convex polygon P, a
point location query locate(q)
determines whether a query
point q is inside (IN), outside
(OUT), or on the boundary
(ON) of P
An efficient data structure for
point location stores the top
and bottom chains of P in two
binary search trees, TL and TH
of logarithmic height
6/8/2002 2:01 PM
locate(q): determines if
query point q is inside,
outside or on the convex
hull of S
insert(q): inserts a new
point q into S
hull(): returns the convex
hull of S
Problem
Data structure
Insertion algorithm
Analysis
[Figure: polygon P with its upper chain stored in tree TH and its
lower chain in TL.]

Insertion of a Point
Incremental convex hull data structure
In operation insert(q),
we consider four
cases that depend on
the location of point q
A IN or ON: no change
B OUT and above: add q
to the upper hull
C OUT and below: add q
to the lower hull
D OUT and left or right:
add q to the lower and
upper hull
[Figure: inserting q into the upper hull, with vertices t, u, w, z.]
We remove vertex u:
  u ← t
  t ← right neighbor of u
We remove vertex w:
  w ← z
  z ← left neighbor of w
We add vertex q

Analysis
6/8/2002 2:01 PM
Graphs
6/3/2002 1:41 PM
NP-Completeness
NP-completeness (13.2)

Input size, n
To be exact, let n denote the number of bits in a nonunary
encoding of the input
Definition of P
Definition of NP
Alternate definition of NP

Polynomial-Time Decision Problems

NP example
Equivalence of the Two Definitions

NP example (2)
Problem: Decide if a graph has an MST of weight K
Verification Algorithm:
  1. …
  2. …
  3. …
An Interesting Problem
CIRCUIT-SAT is in NP
[Figure: a boolean circuit built from AND, OR, and NOT gates, with
0/1 inputs and a single 0/1 output.]
Cook-Levin Theorem
CIRCUIT-SAT is NP-complete.
[Figure: every language L in NP reduces in poly-time to CIRCUIT-SAT.]
Cook-Levin Proof
We can build a circuit that simulates the verification of x's
membership in M using y.
Let W be the working storage for D (including registers, such as the
program counter); let D be given in RAM machine code.
Simulate p(n) steps of D by replicating circuit S for each step of D.
Only input: y.
The circuit is satisfiable if and only if x is accepted by D with
some certificate y.
Total size is still polynomial: O(p(n)^3).
[Figure: inputs n, x and certificate y feed p(n) replicated copies of
S over storage W (< p(n) cells), producing a 0/1 output from D.]

Some Thoughts about P and NP
[Figure: the class NP; CIRCUIT-SAT and the other NP-complete
problems live here.]
NP-Completeness (2)

CIRCUIT-SAT is in NP
For every M in NP, M reduces to CIRCUIT-SAT
Types of reductions

SAT
A boolean formula in conjunctive normal form, e.g.
  (a+b+d+e)(a+c)(b+c+d+e)(a+c+e)
where OR is +, AND is · (times), and NOT is an overbar
SAT is NP-complete
Transitivity of Reducibility

Problem Reduction
CNF-SAT and 3SAT
Vertex Cover
Clique
Hamiltonian Cycle

Example: m((a+b)e)(cf)(dg)(eh)(efi)
[Figure: circuit with inputs a, b, d, internal nodes e through i, and
output m.]
3SAT
Example: (a+b+d)(a+c+e)(b+d+e)(a+c+e)

Vertex Cover
Vertex-Cover is NP-complete
Example: (a+b+c)(a+b+c)(b+c+d)
The graph has a vertex cover of size K = 4 + 6 = 10 iff the formula is
satisfiable.
[Figure: reduction graph with literal vertices (a, b, c, …) and clause
gadget vertices labeled 11, 12, 13, 21, 22, 23, 31, 32, 33.]
Some Other NP-Complete Problems
CLIQUE is NP-complete
  Reduction from VERTEX-COVER: a graph G has a vertex cover of size K
  if and only if its complement has a clique of size n - K.
0/1 Knapsack: Given a collection of items with
weights and benefits, is there a subset of weight
at most W and benefit at least K?
Approximation Algorithms
Approximation ratios
Polynomial-Time Approximation Schemes (13.4.1)
2-Approximation for Vertex Cover (13.4.2)
2-Approximation for TSP special case (13.4.3)
Log n-Approximation for Set Cover (13.4.4)

Approximation Ratios
Optimization Problems
Polynomial-Time Approximation Schemes
A 2-Approximation for Vertex Cover
A vertex cover of graph G = (V, E) is a subset W of V,
such that, for every (a, b) in E, a is in W or b is in W.
OPT-VERTEX-COVER: Given a graph G, find a vertex
cover of G with smallest size.
OPT-VERTEX-COVER is NP-hard.

Algorithm VertexCoverApprox(G)
  Input graph G
  Output a vertex cover C for G
  C ← empty set
  H ← G
  while H has edges
    e ← H.removeEdge(H.anEdge())
    v ← H.origin(e)
    w ← H.destination(e)
    C.add(v)
    C.add(w)
    for each f incident to v or w
      H.removeEdge(f)
  return C
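On an edge-list representation the algorithm collapses to a few lines of Python (the function name vertex_cover_approx is mine): repeatedly pick an uncovered edge and take both endpoints. Since any cover must contain at least one endpoint of each picked edge, and the picked edges share no endpoints, the result is at most twice the optimal size.

```python
def vertex_cover_approx(edges):
    # take BOTH endpoints of each edge not already covered
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover

edges = [(1, 2), (2, 3), (3, 4), (4, 1), (2, 4)]
print(sorted(vertex_cover_approx(edges)))  # [1, 2, 3, 4]
```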
A 2-Approximation for a TSP Special Case
OPT-TSP is NP-hard
Special case: edge weights satisfy the triangle
inequality (which is common in many applications):
  w(a, b) ≤ w(a, c) + w(c, b)

Algorithm TSPApprox(G)
  Input weighted complete graph G,
    satisfying the triangle inequality
  Output a TSP tour T for G
  M ← a minimum spanning tree for G
  P ← an Euler tour traversal of M,
    starting at some vertex s
  T ← empty list
  for each vertex v in P (in traversal order)
    if this is v's first appearance in P then
      T.insertLast(v)
  T.insertLast(s)
  return T

Output tour T costs at most the cost of P (by the triangle
inequality), and P traverses each MST edge twice, so
cost(T) ≤ 2·cost(M) ≤ 2·OPT.

Set Cover
OPT-SET-COVER: Given a
collection of m sets, find the
smallest number of them
whose union is the same as
the union of the whole collection.
OPT-SET-COVER is NP-hard

Algorithm SetCoverApprox(S1, …, Sm)
  Input a collection of sets S1, …, Sm
  Output a subcollection C with the same union
  F ← {S1, S2, …, Sm}
  C ← empty set
  U ← union of S1, …, Sm
  while U is not empty
    Si ← set in F with the most elements in U
    F.remove(Si)
    C.add(Si)
    remove all elements in Si from U
  return C
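The greedy rule maps directly onto Python sets (the name set_cover_approx is mine): at each step take the set covering the most still-uncovered elements.

```python
def set_cover_approx(sets):
    # greedy: repeatedly pick the set with the most uncovered elements
    uncovered = set().union(*sets)
    remaining = list(sets)
    cover = []
    while uncovered:
        best = max(remaining, key=lambda s: len(s & uncovered))
        remaining.remove(best)
        cover.append(best)
        uncovered -= best
    return cover

sets = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
print(set_cover_approx(sets))  # [{1, 2, 3}, {4, 5}]
```

This is the classical O(log n)-approximation: the greedy cover is at most a logarithmic factor larger than the optimal one.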