
Chapter 4

Greedy Algorithms

A greedy algorithm builds a solution step by step, at each step choosing the option from the given solution domain that looks best at the moment. Being greedy, it picks the closest choice that seems to lead to an optimum solution.
Greedy algorithms find a locally optimal solution at each step, which may eventually lead to a globally optimal solution. In general, however, greedy algorithms do not guarantee globally optimal solutions.
Counting Coins
The problem is to make up a desired value using the fewest possible coins, and the greedy approach forces the algorithm to pick the largest coin that does not exceed the remaining amount. If we are provided coins of $1, $2, $5 and $10 and are asked to count $18, the greedy procedure is:
1. Select one $10 coin; the remaining count is 8.
2. Then select one $5 coin; the remaining count is 3.
3. Then select one $2 coin; the remaining count is 1.
4. Finally, selecting one $1 coin solves the problem.
This seems to work fine: for this amount we need to pick only 4 coins, which is optimal. But if we slightly change the problem, the same approach may fail to produce an optimal result. In a currency system with coins of value 1, 7 and 10, counting coins for the value 18 is still optimal (10 + 7 + 1), but for an amount like 15 the greedy approach uses more coins than necessary: it picks 10 + 1 + 1 + 1 + 1 + 1, a total of 6 coins, whereas the same amount can be made with only 3 coins, 7 + 7 + 1. Hence we may conclude that the greedy approach picks the immediately best choice at each step and may fail where global optimization is the major concern.
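The greedy procedure above can be sketched in a few lines of Python (a minimal illustration; the function name and interface are our own, not from the chapter):

```python
def greedy_coin_count(amount, denominations):
    """Greedily pick the largest coin that still fits, repeating
    until the amount is exhausted. Returns the list of coins used."""
    coins = []
    for coin in sorted(denominations, reverse=True):
        while amount >= coin:
            amount -= coin
            coins.append(coin)
    return coins

# With denominations 1, 2, 5, 10 the greedy choice is optimal:
print(greedy_coin_count(18, [1, 2, 5, 10]))   # [10, 5, 2, 1] -- 4 coins
# With denominations 1, 7, 10 it is not: greedy uses 6 coins for 15,
# although 7 + 7 + 1 needs only 3.
print(greedy_coin_count(15, [1, 7, 10]))      # [10, 1, 1, 1, 1, 1]
```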
Examples: Many standard graph and optimization algorithms use the greedy approach. Here is a list of a few of them:
• Travelling Salesman Problem
• Prim's Minimal Spanning Tree Algorithm
• Kruskal's Minimal Spanning Tree Algorithm
• Dijkstra's Shortest Path Algorithm
• Graph - Map Coloring
• Graph - Vertex Cover
• Knapsack Problem
• Job Scheduling Problem
Minimum Spanning Trees and Prim’s Algorithm
Spanning Trees: A subgraph T of an undirected graph G = (V, E) is a spanning tree of G if it is a tree and contains every vertex of G. Example:

Every connected graph has a spanning tree. A weighted graph is a graph in which each edge has a weight (some real number); the weight of a graph is the sum of the weights of all its edges.

A minimum spanning tree of an undirected connected weighted graph is a spanning tree of minimum weight (among all spanning trees). Example: in the above graph, Tree 2 with w = 71 is the MST.
The minimum spanning tree need not be unique. However, if the weights of all the edges are pairwise distinct, it is indeed unique. Example:
Generic Algorithm for MST problem
Let A be a set of edges such that A ⊆ T, where T is an MST. An edge (u, v) is a safe edge for A if A ∪ {(u, v)} is also a subset of some MST.
If at each step, we can find a safe edge (u, v), we can ’grow’ a MST. This leads to the following generic
approach:
Generic-MST(G, w)
    A = ∅
    while A does not form a spanning tree
        find an edge (u, v) that is safe for A
        add (u, v) to A
    return A
How can we find a safe edge?
We first give some definitions. Let G = (V, E) be a connected and undirected graph. We define:
Cut: A cut (S, V − S) of G is a partition of V.
Cross: An edge (u, v) ∈ E crosses the cut (S, V − S) if one of its endpoints is in S and the other is in V − S.
Respect: A cut respects a set A of edges if no edge in A crosses the cut.
Light edge: An edge is a light edge crossing a cut if its weight is the minimum of any edge crossing the cut.
Lemma
Let G = (V, E) be a connected, undirected graph with a real-valued weight function defined on E. Let A
be a subset of E that is included in some minimum spanning tree for G, let (S, V – S) be any cut of G
that respects A, and let (u, v) be a light edge crossing the cut (S, V – S). Then, edge (u, v) is safe for A.
It means that we can find a safe edge by:
1. First finding a cut that respects A,
2. Then finding the light edge crossing that cut.
That light edge is a safe edge.
Prim’s Algorithm
The generic algorithm gives us an idea of how to 'grow' an MST. If you read the lemma and its proof carefully, you will notice that the choice of a cut (and hence the corresponding light edge) in each iteration is immaterial: we can select any cut that respects the selected edges and find the light edge crossing that cut to proceed.
Prim's algorithm makes a natural choice of the cut in each iteration: it grows a single tree and adds a light edge to it in each iteration.
Prim’s Algorithm: how to grow a tree
• Start by picking any vertex r to be the root of the tree.
• While the tree does not contain all vertices in the graph, find the shortest edge leaving the tree and add it to the tree.
The running time is O((|V| + |E|) log |V|) using a binary heap.
Step 0: Choose any element r; set S = {r} and A = {}. (Take r as the root of our spanning tree.)
Step 1: Find a lightest edge such that one endpoint is in S and the other is in V \ S. Add this edge to A
and its (other) endpoint to S.
Step 2: If V \ S = {}, then stop and output (minimum) spanning tree (S, A). Otherwise go to Step 1.
The idea: expand the current tree by adding the lightest (shortest) edge leaving it and its endpoint.
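Steps 0–2 can be sketched as follows (a minimal Python illustration using a binary heap; the adjacency-list format and all names are our own assumptions, not from the chapter):

```python
import heapq

def prim_mst(graph, root):
    """Grow a single tree from `root`: repeatedly add the lightest
    edge leaving the tree.  `graph` maps each vertex to a list of
    (weight, neighbour) pairs; returns (total_weight, tree_edges)."""
    in_tree = {root}                 # S: vertices already in the tree
    tree_edges = []                  # A: edges of the spanning tree so far
    total = 0
    # Heap of candidate edges (weight, u, v) with u inside the tree.
    frontier = [(w, root, v) for w, v in graph[root]]
    heapq.heapify(frontier)
    while frontier and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(frontier)   # lightest candidate edge
        if v in in_tree:
            continue                 # edge no longer leaves the tree
        in_tree.add(v)
        tree_edges.append((u, v))
        total += w
        for w2, x in graph[v]:
            if x not in in_tree:
                heapq.heappush(frontier, (w2, v, x))
    return total, tree_edges

# Example (hypothetical weighted graph):
graph = {
    'a': [(1, 'b'), (4, 'c')],
    'b': [(1, 'a'), (2, 'c')],
    'c': [(4, 'a'), (2, 'b')],
}
total, tree = prim_mst(graph, 'a')   # total == 3, tree == [('a','b'), ('b','c')]
```

The heap plays the role of "find a lightest edge leaving S" in Step 1; stale entries whose far endpoint has already joined the tree are simply skipped.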
Exercise:
Find MST.
Answer: A={{a,b},{b,d},{c,d},{c,f},{f,g},{f,e}}
Show the necessary steps.

Analysis of Prim’s Algorithm

Each iteration adds one vertex to the tree, so there are |V| − 1 iterations. With a binary-heap priority queue holding the candidate edges, each extraction and each insertion costs O(log |V|), and at most O(|E|) heap operations occur in total, giving the O((|V| + |E|) log |V|) bound stated above. (A Fibonacci heap improves this to O(|E| + |V| log |V|).)
Scheduling Classes
The next example is slightly less trivial. Suppose you decide to drop out of computer science at the last
minute and change your major to Applied Chaos. The Applied Chaos department has all of its classes
on the same day every week, referred to as "Soberday" by the students (but interestingly, not by the
faculty). Every class has a different start time and a different ending time: AC 101 (‘Toilet Paper
Landscape Architecture’) starts at 10:27pm and ends at 11:51pm; AC 666 (‘Immanentizing the
Eschaton’) starts at 4:18pm and ends at 7:06pm, and so on. In the interests of graduating as quickly as
possible, you want to register for as many classes as you can. (Applied Chaos classes don’t require any
actual work.) The University’s registration computer won’t let you register for overlapping classes, and
no one in the department knows how to override this ‘feature’. Which classes should you take?
More formally, suppose you are given two arrays S[1 .. n] and F[1 .. n] listing the start and finish times of each class. Your task is to choose the largest possible subset X ⊆ {1, 2, . . . , n} so that for any pair i, j ∈ X, either S[i] > F[j] or S[j] > F[i]. We can illustrate the problem by drawing each class as a rectangle whose left and right x-coordinates show the start and finish times. The goal is to find a largest subset of rectangles that do not overlap vertically.

This problem has a fairly simple recursive solution, based on the observation that either you take class
1 or you don’t. Let B4 be the set of classes that end before class 1 starts, and let L8 be the set of classes
that start later than class 1 ends:
B4 = {i | 2 ≤ i ≤ n and F[i] < S[1]}
L8 = {i | 2 ≤ i ≤ n and S[i] > F[1]}
If class 1 is in the optimal schedule, then so are the optimal schedules for B4 and L8 , which we can
find recursively. If not, we can find the optimal schedule for {2, 3, . . . , n} recursively. So we should
try both choices and take whichever one gives the better schedule. Evaluating this recursive algorithm
from the bottom up gives us a dynamic programming algorithm that runs in O(n 2 ) time. I won’t bother
to go through the details, because we can do better.
Intuitively, we’d like the first class to finish as early as possible, because that leaves us with the most
remaining classes. If this greedy strategy works, it suggests the following very simple algorithm. Scan
through the classes in order of finish time; whenever you encounter a class that doesn’t conflict with
your latest class so far, take it!

We can write the greedy algorithm somewhat more formally as follows.
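A minimal Python sketch of this scan (the function name and array convention are our own assumptions):

```python
def greedy_schedule(S, F):
    """Select a maximum set of non-conflicting classes: scan classes
    in increasing order of finish time, taking each one that starts
    after the latest class taken so far ends.  S[i] and F[i] are the
    start and finish times of class i; returns the chosen indices."""
    order = sorted(range(len(S)), key=lambda i: F[i])  # by finish time
    chosen = []
    last_finish = float('-inf')
    for i in order:
        if S[i] > last_finish:       # no conflict with latest class taken
            chosen.append(i)
            last_finish = F[i]
    return chosen

# e.g. greedy_schedule([1, 3, 0, 5], [2, 4, 6, 7]) -> [0, 1, 3]
```

The sort dominates the running time, matching the O(n log n) bound claimed below.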

This algorithm clearly runs in O(n log n) time. To prove that this algorithm actually gives us a maximal
conflict-free schedule, we use an exchange argument, similar to the one we used for tape sorting. We
are not claiming that the greedy schedule is the only maximal schedule; there could be others. (See the
figures on the previous page.) All we can claim is that at least one of the maximal schedules is the one
that the greedy algorithm produces.
Lemma: At least one maximal conflict-free schedule includes the class that finishes first.
Theorem: The greedy schedule is an optimal schedule.
Huffman Codes
A binary code assigns a string of 0s and 1s to each character in the alphabet. A binary code is prefix
free if no code is a prefix of any other. 7-bit ASCII and Unicode’s UTF-8 are both prefix-free binary
codes. Morse code is a binary code, but it is not prefix-free; for example, the code for S (· · ·) includes
the code for E (·) as a prefix. Any prefix-free binary code can be visualized as a binary tree with the
encoded characters stored at the leaves. The code word for any symbol is given by the path from the
root to the corresponding leaf; 0 for left, 1 for right. The length of a codeword for a symbol is the depth
of the corresponding leaf. (Note that the code tree is not a binary search tree. We don't care at all about the sorted order of symbols at the leaves; in fact, the symbols may not have a well-defined order!)
Suppose we want to encode messages in an n-character alphabet so that the encoded message is as short as possible. Specifically, given an array of frequency counts f[1 .. n], we want to compute a prefix-free binary code that minimizes the total encoded length of the message, ∑ f[i] · depth(i), where depth(i) is the depth of the leaf for character i in the code tree.
Let x and y be the two least frequent characters (breaking ties between equally frequent characters
arbitrarily). There is an optimal code tree in which x and y are siblings.
Huffman codes are optimal prefix-free binary codes.
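The greedy construction suggested by the lemma — repeatedly merging the two least-frequent subtrees so that the two rarest symbols become siblings — can be sketched in Python (a minimal illustration; names and interface are our own assumptions):

```python
import heapq
from itertools import count

def huffman_code(freq):
    """Build an optimal prefix-free binary code.  `freq` maps each
    symbol to its frequency count; returns a dict mapping each
    symbol to its codeword as a string of 0s and 1s."""
    tiebreak = count()                     # avoids comparing trees on ties
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)  # two least-frequent subtrees...
        f2, _, right = heapq.heappop(heap)
        # ...become siblings under a new internal node.
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):        # internal node: 0 left, 1 right
            walk(node[0], prefix + '0')
            walk(node[1], prefix + '1')
        else:                              # leaf: record the codeword
            codes[node] = prefix or '0'    # single-symbol edge case
    walk(heap[0][2], '')
    return codes
```

For example, with counts {'a': 5, 'b': 2, 'c': 1, 'd': 1}, the most frequent symbol 'a' gets a 1-bit codeword while 'c' and 'd' get 3-bit codewords, and no codeword is a prefix of another.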
Exercises
