The Role of Algorithms in Computing
1.1 Algorithms
Informally, an algorithm is any well-defined computational procedure
that takes some value, or set of values, as input and produces some
value, or set of values, as output in a finite amount of time. An
algorithm is thus a sequence of computational steps that transform the
input into the output.
You can also view an algorithm as a tool for solving a well-specified
computational problem. The statement of the problem specifies in
general terms the desired input/output relationship for problem
instances, typically of arbitrarily large size. The algorithm describes a
specific computational procedure for achieving that input/output
relationship for all problem instances.
As an example, suppose that you need to sort a sequence of numbers
into monotonically increasing order. This problem arises frequently in
practice and provides fertile ground for introducing many standard
design techniques and analysis tools. Here is how we formally define the
sorting problem:
Input: A sequence of n numbers 〈a1, a2, … , an〉.
Output: A permutation (reordering) 〈a′1, a′2, … , a′n〉 of the input sequence
such that a′1 ≤ a′2 ≤ ⋯ ≤ a′n.
Thus, given the input sequence 〈31, 41, 59, 26, 41, 58〉, a correct sorting
algorithm returns as output the sequence 〈26, 31, 41, 41, 58, 59〉. Such
an input sequence is called an instance of the sorting problem. In
general, an instance of a problem consists of the input (satisfying
whatever constraints are imposed in the problem statement) needed to
compute a solution to the problem.
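To make the problem concrete, here is a short Python sketch (our own illustration, not the book's pseudocode) of insertion sort, a simple correct sorting algorithm studied in Chapter 2, applied to the instance above:

```python
def insertion_sort(a):
    """Sort the list a into monotonically increasing order, in place."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift elements larger than key one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a

print(insertion_sort([31, 41, 59, 26, 41, 58]))  # [26, 31, 41, 41, 58, 59]
```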
Because many programs use it as an intermediate step, sorting is a
fundamental operation in computer science. As a result, you have a
large number of good sorting algorithms at your disposal. Which
algorithm is best for a given application depends on—among other
factors—the number of items to be sorted, the extent to which the
items are already somewhat sorted, possible restrictions on the item
values, the architecture of the computer, and the kind of storage
devices to be used: main memory, disks, or even—archaically—tapes.
An algorithm for a computational problem is correct if, for every
problem instance provided as input, it halts—finishes its computing in
finite time—and outputs the correct solution to the problem instance. A
correct algorithm solves the given computational problem. An incorrect
algorithm might not halt at all on some input instances, or it might halt
with an incorrect answer. Contrary to what you might expect, incorrect
algorithms can sometimes be useful, if you can control their error rate.
We’ll see an example of an algorithm with a controllable error rate in
Chapter 31 when we study algorithms for finding large prime numbers.
Ordinarily, however, we’ll concern ourselves only with correct
algorithms.
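As a rough illustration of a controllable error rate (a sketch of our own, using the simpler Fermat test rather than the algorithm of Chapter 31), consider a randomized primality check: each independent trial that fails to find a witness makes a wrong "prime" answer less likely, so repeating trials trades running time for accuracy.

```python
import random

def probably_prime(n, trials=20):
    """Fermat test: may wrongly report a composite number as prime,
    but increasing `trials` drives the error rate down for most inputs."""
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:
            return False   # a witnesses that n is composite: always correct
    return True            # probably prime; more trials, smaller error rate
```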
An algorithm can be specified in English, as a computer program, or
even as a hardware design. The only requirement is that the
specification must provide a precise description of the computational
procedure to be followed.
Data structures
This book also presents several data structures. A data structure is a
way to store and organize data in order to facilitate access and
modifications. Using the appropriate data structure or structures is an
important part of algorithm design. No single data structure works well
for all purposes, and so you should know the strengths and limitations
of several of them.
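As a small illustration of such trade-offs (a sketch of our own, using Python's built-in structures), the same collection stored as a list supports ordering and positional access, while storing it as a set gives fast membership tests but gives up order and duplicates:

```python
names_list = ["Ada", "Alan", "Grace", "Donald"]  # keeps order; membership test scans linearly
names_set = set(names_list)                      # fast membership; no order, no duplicates

print("Grace" in names_list)   # True, found by scanning the list
print("Grace" in names_set)    # True, found by a hash lookup
print(names_list[2])           # positional access works on the list...
# names_set[2] would raise an error: sets do not support indexing
```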
Technique
Although you can use this book as a “cookbook” for algorithms, you
might someday encounter a problem for which you cannot readily find a
published algorithm (many of the exercises and problems in this book,
for example). This book will teach you techniques of algorithm design
and analysis so that you can develop algorithms on your own, show that
they give the correct answer, and analyze their efficiency. Different
chapters address different aspects of algorithmic problem solving. Some
chapters address specific problems, such as finding medians and order
statistics in Chapter 9, computing minimum spanning trees in Chapter
21, and determining a maximum flow in a network in Chapter 24. Other
chapters introduce techniques, such as divide-and-conquer in Chapters
2 and 4, dynamic programming in Chapter 14, and amortized analysis in
Chapter 16.
Hard problems
Most of this book is about efficient algorithms. Our usual measure of
efficiency is speed: how long does an algorithm take to produce its
result? There are some problems, however, for which we know of no
algorithm that runs in a reasonable amount of time. Chapter 34 studies
an interesting subset of these problems, which are known as NP-
complete.
Why are NP-complete problems interesting? First, although no
efficient algorithm for an NP-complete problem has ever been found,
nobody has ever proven that an efficient algorithm for one cannot exist.
In other words, no one knows whether efficient algorithms exist for NP-
complete problems. Second, the set of NP-complete problems has the
remarkable property that if an efficient algorithm exists for any one of
them, then efficient algorithms exist for all of them. This relationship
among the NP-complete problems makes the lack of efficient solutions
all the more tantalizing. Third, several NP-complete problems are
similar, but not identical, to problems for which we do know of efficient
algorithms. Computer scientists are intrigued by how a small change to
the problem statement can cause a big change to the efficiency of the
best known algorithm.
You should know about NP-complete problems because some of
them arise surprisingly often in real applications. If you are called upon
to produce an efficient algorithm for an NP-complete problem, you are
likely to spend a lot of time in a fruitless search. If, instead, you can
show that the problem is NP-complete, you can spend your time
developing an efficient approximation algorithm, that is, an algorithm
that gives a good, but not necessarily the best possible, solution.
As a concrete example, consider a delivery company with a central
depot. Each day, it loads up delivery trucks at the depot and sends them
around to deliver goods to several addresses. At the end of the day,
each truck must end up back at the depot so that it is ready to be
loaded for the next day. To reduce costs, the company wants to select an
order of delivery stops that yields the lowest overall distance traveled by
each truck. This problem is the well-known “traveling-salesperson
problem,” and it is NP-complete. It has no known efficient algorithm.
Under certain assumptions, however, we know of efficient algorithms
that compute overall distances close to the smallest possible. Chapter 35
discusses such “approximation algorithms.”
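To illustrate "good, but not necessarily the best possible" (a sketch of our own with made-up coordinates; it is a simple greedy heuristic, not one of the guaranteed approximation algorithms of Chapter 35), the nearest-neighbor rule below builds a valid delivery tour by always driving to the closest unvisited stop:

```python
import math

def nearest_neighbor_tour(points):
    """Greedy heuristic for the traveling-salesperson problem:
    repeatedly visit the closest unvisited stop, then return to the depot.
    Produces a valid tour, but not necessarily the shortest one."""
    unvisited = list(range(1, len(points)))
    tour = [0]                       # start at the depot (index 0)
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    tour.append(0)                   # end the day back at the depot
    return tour

# Hypothetical depot and delivery addresses as (x, y) coordinates.
stops = [(0, 0), (2, 3), (5, 1), (6, 4), (1, 5)]
print(nearest_neighbor_tour(stops))  # e.g. [0, 1, 4, 3, 2, 0]
```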
Exercises
1.1-1
Describe your own real-world example that requires sorting. Describe
one that requires finding the shortest distance between two points.
1.1-2
Other than speed, what other measures of efficiency might you need to
consider in a real-world setting?
1.1-3
Select a data structure that you have seen, and discuss its strengths and
limitations.
1.1-4
How are the shortest-path and traveling-salesperson problems given
above similar? How are they different?
1.1-5
Suggest a real-world problem in which only the best solution will do.
Then come up with one in which “approximately” the best solution is
good enough.
1.1-6
Describe a real-world problem in which sometimes the entire input is
available before you need to solve the problem, but other times the
input is not entirely available in advance and arrives over time.