Data Structures Using C++
2 C++ Classes
2.1 Classes
2.2 Constructors and Destructors
2.3 Functions and Class Member Functions
2.4 Constant and Static Members
2.5 Operator Overloading
5.1 Pointers
5.2 Arrays and Array Pointers
5.3 Function Pointers
7.1 Lists
7.2 General Linked Lists
7.3 Application: Large Integers & Lists
9 Templates
11.1 big-oh
11.2 Main Properties of big-oh
11.3 Examples
14.1 Hashing
14.2 Hash Tables
14.3 Collision Problems
14.4 Bucket Technique and Chained Addressing
23 Graphs
This chapter marks an important step in your exploration of computer science. Up to this point,
it has been difficult to divorce your study of computer science from the learning of C++. You have
developed problem-solving skills, but the problems we have encountered have been very focused.
That is, the problems were chosen specifically to illustrate a particular feature of C++. This is the
way problem-solving skills must be developed: Start with small problems and work toward large
ones.
By now, you are familiar with many features of the C++ programming language. You are ready
to direct your attention toward larger, more complex problems that require you to integrate many of
the particular skills you have developed. Now our attention will be directed more toward issues of
algorithm design and data structure and less toward describing C++. If we need a particular feature
of C++ that you have not yet learned, we will introduce it when appropriate.
• should not crash on incorrect input; it should always check input for correctness.
• should always check that a request to the new operator succeeded. In standard C++ a failed
new throws std::bad_alloc, while new (std::nothrow) returns a null pointer; trying to access a
null pointer will cause your program to crash.
• should always fail gracefully (i.e. with an error message, not a segmentation fault).
#define MAX_LENGTH 20
#define MONTHS_IN_YEAR 12
#include <iostream>
using namespace std;

int main() {
    int num;
    cout << "Enter an integer: ";
    cin >> num;
    if (num == 10)
        cout << "The number is 10" << endl;
    else
        cout << "The number is not 10" << endl;
    return 0;
}
The three main characteristics of OOP are encapsulation, inheritance and polymorphism.
• Encapsulation is combining the data and methods that manipulate the data in one place
— an object. All processing for an object is bundled together in a library and hidden from
the user.
• Inheritance is the ability to define an object as a descendant of an existing object.
1 Abstract Data Types 3
• Polymorphism occurs when a single method name, used up and down the hierarchy of
objects, acts in an appropriate—possibly different—way for each object.
When using an ADT we need only worry about what each operation does, not how.
A data structure is an aggregation of simple and composite data types into a set with defined
relationships. The structure is the set of rules that holds the data together. In other words, if
we take a combination of data types and fit them into a structure for which we can define the
relating rules, we have made a data structure.
Pointers:
Enumerated types:
Boolean Data Type: C++ provides a built-in data type called bool (boolean type). A
boolean variable takes one of only two constant values, true and false.
bool GuessIt;
GuessIt = true;
Structures:
struct person {
int age;
const char *firstName; // const: string literals must not be modified
const char *lastName;
};
person bob = {33, "Bob", "Smith"};
Unions:
union charOrInt {
char c;
int i;
};
Arrays:
person people[10];
int a[] = {1, 2, 3, 4, 5};
Pointer examples:
int *iptr;
char *cptr;
int* p = &i;
which means that pp is a constant pointer pointing to a pointer whose type is constant pointer
to a constant int variable. Complicated?? Yes ...
Why? The swap function exchanges the values of its local copies of the arguments a and b rather
than the arguments themselves. Please read and run this program at
http://turing.une.edu.au/~comp282/Lectures/Lecture 01/Examples/test swap.cpp
The default method of argument passing in C++ is to copy the values of the arguments into
the storage of the parameters. This is referred to as pass-by-value.
Under pass-by-value, the function never accesses the arguments of the call. The values that
the function manipulates are its own local copies; they are stored on the run-time stack. When
the function terminates, these values are lost.
Examples:
int i;
int &iref = i; // Reference variable
When the parameters are references, the function receives the lvalue of the argument rather
than a copy of its value. This means that the function knows where the argument resides in
memory and can therefore change its value or take its address.
Any use of a pass-by-reference parameter within the body of the function accesses the
argument in the calling program.
Pass-by-reference is a very powerful way of passing arguments to a function. Its definition
format is similar to that of pass-by-value, but it behaves more like the pointer version.
If the function does not change the value of the argument within its body, then for the sake
of safety you can declare the parameter as a const reference parameter.
The new operator allocates memory for any type, including arrays, and returns a pointer to the
desired type.
In other words, the new operator creates a new dynamic variable of any type. The dynamic
variable does not have an identifier or name, but we know its address through the returned
pointer.
The creation of new dynamic variables is called memory allocation, and the memory is called
dynamic memory.
Dynamic variables created by the new operator must be explicitly freed (destroyed) by the
delete operator. You should always free dynamic memory once you no longer need it, and in
any case before the program terminates.
The delete operator frees memory for all types except arrays; an array allocated with new []
must be freed with delete [].
Examples:
delete iptr;
delete iiptr;
delete [] cptr;
Chapter 2
C++ Classes
2.1 Classes
2.1.1 Concept of C++ class
C++ classes are similar to structures in C, with the main difference being that classes can have
functions (methods) as well as variables (data members) in their definitions.
The other main difference is that classes use the technique of information hiding to avoid
incorrect use of the class. This is done via the public, private and protected keywords.
A class can be considered a type, whereas an object is an instance of a class. For example, the
built-in type int of C++ can be thought of as a class, and any variable declared with type int
can be thought of as an int object. In the following small section of pseudo-code, Date is a
class, and dObj is an object.
class Date {
...
//Definition
...
};
Date dObj;
The class mechanism of C++ is a good tool for implementing the concept of abstract data
types in data structures.
class class_name {
public:
Function prototypes that can be used
outside the class
......
private:
Type definitions and member declarations
accessible only inside the class
......
};
1. The class head: includes the C++ keyword class and the name of the new class.
2. The public section: begins with the C++ keyword public followed by a colon. A list of
items appears after the colon. These are items that are made available to anyone who uses
the new data type.
3. The private section: begins with the C++ keyword private followed by a colon. After the
colon is a list of items that are part of the class but are not directly available to
programmers who use the class.
class Card {
public:
Card(face_t f = ace,
suit_t s = hearts); // constructors
Card(const Card &c);
Card &operator=
(const Card &c); // operator
~Card(); // destructor
private:
face_t face; // data members
suit_t suit;
};
Then we use this class just like using the standard type int
Card c;
Card *cp = new Card(c);
Constructors always have the same name as the class, and never have a return type (not even
void: you must not write void at the front of the constructor’s head).
Constructors are declared like other methods with the above differences.
You may declare as many constructors as you like — one for each different way of initializing
an object.
1. Default constructor: If you write a class with no constructors, then the compiler
automatically creates a simple default constructor. This automatic default constructor doesn’t
do much work. It is good practice to write your own constructors, including your own default
constructor, rather than depending on the automatic one. The default constructor is the
one used when a programmer declares a variable of the class without providing any
arguments for the constructor.
2. Copy constructor: The copy constructor is a special member function that is called
when the program creates a new object as a copy of an existing object, for example when an
object is initialized from another object or passed to a function by value. Assigning to an
object that already exists uses the assignment operator instead. We will talk about it later.
2.2.2 Destructors
Destructors are used to do any final clean-up before an object is destroyed. They are called
automatically, either when delete is used on a dynamically allocated object, or when a
non-dynamic (automatic) object goes out of scope.
Unlike constructors, there can only be one destructor per class. The name of a destructor must
be the class name preceded by a tilde (~). Destructors have no arguments and no return type.
Destructors are particularly useful for freeing any dynamically allocated memory the object
might have. Without appropriate destructors, your program may have memory leaks (this
isn’t Java!).
#include <iostream>
using namespace std;

class A {
public:
    A() { cout << "Making A" << endl; }
    ~A() { cout << "Breaking A" << endl; }
};
class B {
public:
    B() { cout << "Making B" << endl; }
    ~B() { cout << "Breaking B" << endl; }
private:
    A a;   // data member: constructed before the body of B() runs
};
int main() {
    B b;
    return 0;
}
Answer:
Making A
Making B
Breaking B
Breaking A
Why? All the data members of an object are initialized before the body of the object’s
constructor runs. In the above example, the program declares a new object b of class B. Before
the constructor of B runs, the only data member of b, the object a of class A, must be created
first. Thus the constructor A() of class A is called first. That is why we see the output
Making A first and then Making B.
Before exiting the program, the object b is destroyed: its own destructor is called first, and
then the destructor of its data member a. Destruction therefore happens in the reverse order
of construction.
A function’s type is determined only by its return type and its parameter list, including the
type of each parameter on the list.
Example:
Then we say that the type of the function lexicoCompare consists of the return type int and
two parameters of type const string&.
In C++ a function/method can be given default arguments. Default arguments must be given
to the rightmost parameters first. Because the number of arguments to a function with default
values is variable, the default value parameters must appear after all parameters that don’t
have default values.
Correct examples:
Incorrect example:
Example 1:
Or placing a function definition inside a class definition, called an inline member function.
Example 2:
class Card {
public:
face_t getFace() const
{ return face; }
.
.
.
face_t face;
};
Notice that when you declare an inline member function, there is no semicolon before the
opening curly bracket or after the closing curly bracket.
For example, if we want to define the member function getFace() of the class Card outside
the body of the definition of the class Card, then we must prefix the function name with the
class name and the scope resolution operator, writing Card::getFace. This use of the scope
resolution operator (::) tells the compiler that the function is a member function of a
particular class.
We use the term function implementation to describe a full function definition. The function
implementation provides all the details of how the function works.
Usually all the function implementations of a class are written in a separate file, called the
implementation file of the class, while the prototypes and all the information needed to use
the class appear in a header file.
class foo {
public:
foo()
{ b = a = 0; }
int x() const //constant method
{ return y()*2; } //call a non-constant method
int y() //non-constant method
{ return a; }
int *z()
{ return &b; }
private:
int a;
const int b;
};
Corrected Version:
class foo {
public:
foo(): b(0)
{ a = 0; }
int x() const
{ return y()*2; } //works well
int y() const
{ return a; }
const int *z() //return a pointer to int object
{ return &b; } //which is of constant type
private:
int a;
const int b;
};
Usually each object has its own copy of each member variable. However, a static data member
has only one instance (classwide). That is, all of the class’s objects share the same value for a
static data member.
A static member belongs to the whole class, not to a specific object.
Example:
class Card {
public:
static Card getAce()
{ return Card(ace, spades);}
static Card getKing();
static const int cardsInPack;
...
};
In addition to declaring the static data members and methods within the class definition, the
program must also define the static members elsewhere, usually in the implementation file.
For the above example, we must provide the following definition elsewhere:
Card Card::getKing() {
return Card(king, diamonds);
}
class X {
static int foo() const;
...
};
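where the member method foo() is declared as both static and const. Note that C++ does not
accept this combination: a static member function has no object of its own, so it cannot be
declared const.
To use static data or methods, call them with the class identifier rather than a specific object
of the class, for example: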
Card c = Card::getKing();
for (int i = 0; i < Card::cardsInPack; i ++)
...;
because we have only one copy of static members for all the objects of the class.
Binary operators take actions (operations) on two operands. Most C++ operators are binary
operators, for example +, -, == and >=.
Unary operators take actions on a single operand. Typical unary operators are ! (logical
negation) and - (arithmetic negation).
Variables of the primary types can be used as operands for C++ operators. For example, we can
write
int a, b, c;
char letter = '=';
a = 1, b = 2;
c = a + b;
cout << 'c' << letter << c;
Card A, B;
cout << A + B;
because the C++ compiler does not know how to add two Card objects or how to output a Card
object.
Making an operator available to objects of a new class is called operator overloading. For
example, if we want to write cout << A where A is a Card object, we need to tell the compiler
how to output the content of a Card object.
where the operator + can be replaced by any other binary operators such as == etc., and Op1
and Op2 are considered as the left operand and right operand, respectively. The type of the
result of the operation may be different from the type of the operands.
class vector {
public:
vector(); //default constructor
vector(int u, int v, int w) //constructor with three components
{ x = u;
y = v;
z = w; }
~vector(); //destructor
....; //other member function
private:
int x;
int y;
int z;
};
From mathematics we know that the sum of two 3D vectors is also a vector, whose three
components are the sums of the corresponding components of the two vectors. Thus the
overloaded + for the vector class should add the operands component by component and
return a new vector.
In mathematics, another important operation is the inner product of two vectors. The
result of the inner product is the sum of the products of the corresponding components. For
two vectors p1 = (x1, y1, z1) and p2 = (x2, y2, z2) the inner product of p1 and
p2 is
s = p1 ∗ p2 = x1 ∗ x2 + y1 ∗ y2 + z1 ∗ z2
The result is a scalar rather than a vector. Hence the overloaded operator * should take two
vectors as operands and return a scalar.
If a binary operator is overloaded as a member function of a class, then the object that calls
this function is implicitly the first operand of the operation, with the other operand coming
from the argument of the function. Hence the prototype for an operator overloaded as a
member function has only one argument in its parameter list, which is the right-hand
operand.
class myString {
public:
myString(const char* = 0); //constructor
...
myString& operator+=(const myString& rhs); //overload +=
...
private:
int _size;
char *_string;
};
Please read a statement such as mystring1 += mystring2; as follows: the object mystring1
calls its member function += with the argument mystring2.
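In the overloading prototypes for the output operator << and the input operator >>, the first
parameter and the return type must be ostream& and istream&, respectively. Because the
left operand is a stream and not an object of the new class, you cannot overload either
operator as a member function of the new class.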
The source parameter is a const reference parameter, meaning that the function will not
alter the point that it is writing.
You need to provide the complete implementation of the overloaded output and input
operators, that is, how to output and input. Our textbook provides examples for the class point.
Because the functions for the overloaded output and input operators are not member functions
of the new class, they do not have access to the private members of its objects. To solve this
problem, we can declare them as friend functions of the class.
Chapter 3
The bag class is one special kind of container class. A container class is a class where each
object contains a collection of items. The bag class that we will build is a simple version of the
more complex classes defined in the C++ Standard Library.
Generally we assume that the items in a container are of the same type.
A bag is a special kind of container in which items may be repeated and are kept in no
particular order. Our purpose is to implement the bag concept by using a certain data structure.
In C++, in order to declare an array, we need to specify the type of the items of the array. To
be more flexible, we won’t actually use a specific type when we refer to the type of the items
in the bag, i.e., in the array.
We use the name value_type for the data type of the items in a bag. A typedef statement
will be used to link a specified type to the type name value_type.
To keep track of how many items are in a bag, we will use a variable. The type of the variable
is defined as the name size_type. The name is normally bound to an integer type of some
kind, for example the C++ data type size_t, which is an integer data type that can hold only
non-negative numbers.
3 The Bag Class 22
In the implementation, we may need to specify the size of the array in the Bag class, called the
bag’s CAPACITY.
class bag
{
public:
typedef int value_type;
typedef std::size_t size_type;
static const size_type CAPACITY = 30;
....
};
Data member CAPACITY is declared as static and const, so all the instances from this class
share a single copy of the CAPACITY and the value of CAPACITY is defined once and cannot
be changed.
In the above implementation, you can change the int type to any other type so that you
can make a bag class for items of that type.
bag b;
//We put some items into the bag b here
bag c(b); //make a new bag copying from the bag b
We may need a constant member function to count the number of the items contained in the
bag. Once we call this function for an object of the bag, the number of all the items in the bag
can be given. We call this function the size() function with the following prototype
We definitely need methods to allow taking items out of the bag, declared as
The erase_one function removes one copy of target and returns true or false depending
on whether the target is really in the bag. The erase function removes all the copies of target
in the bag and returns the number of copies removed.
We need a method to report how many copies of a particular item are in the bag.
We may also need an external method to merge two bags, called the union operation, which
can be obtained by overloading the + operator. If the + operator is defined for bags, it is
sensible to also overload the += operator.
The entries of a bag will be stored in the front part of an array, as shown in this example. That
is, the true number of items in the bag may be less than the constant CAPACITY.
The entries may appear in any order. This represents the same bag as the previous one. This is
the basic feature of the bag.
Because part of the array can contain garbage, the bag class needs to keep track of how many
items are in the bag. We put this information in the variable used of type size_type. One
solution:
class bag
{
public:
...
private:
value_type data[CAPACITY];
size_type used;
};
Every method that is in the class will use this array, removing something from it or putting
something in, etc. Remember that how you decide to represent a collection determines the
inner workings of the class, the data structure type, including its variables and methods. These
inner workings do not have to be known by the user of the class. The user can use the bag class
without thinking about the array or whatever is being used. All the user of the class needs to
know is what methods the class has and what parameters to send each method, and what each
method returns.
The default constructor initializes a bag as an empty bag, and does no other work. An empty
bag does not have any items in it. That is, no array entry is used, so the variable used is set
to 0:
bag::bag() { used = 0; }
3.2.2 Counting
The size function returns the number of items in the bag. This information is recorded in the
private member used. So the function implementation is very simple
To count the number of occurrences of a particular item target in a bag, we step through the
used portion of the array, that is from data[0] to data[used-1]. The function definition is
The erase_one function removes an item named target from a bag. There may be several
occurrences of the target in the bag; however, the erase_one function removes only one
occurrence. As the items of the bag are stored in the bag’s array, in the first step the function
looks for the target from the beginning of the array up to the entry indexed by used-1. The
first occurrence of the target in the array will be removed. To keep the used entries contiguous
at the front of the array, in the second step the last item, indexed by used-1, is moved to the
place where the first occurrence of the target was.
The following example demonstrates the whole deletion procedure. Suppose the target is the
number 6. There are two occurrences of the number 6. First, the first number 6 is found in the
second entry of the array, and it is removed. Then the number in the fifth entry, i.e., 7,
is moved into the second place so that the array stays partially filled without a gap.
If the target is successfully deleted, the erase_one function returns the boolean value true;
otherwise it returns false, which happens when the target is not in the bag.
With the aid of the erase_one function, it is easy to implement the erase function, which
deletes all the occurrences of a target from the bag, if any.
The core work of the overloaded function is to copy items from the array in one bag into the
array of another bag. The Standard Library contains a copy function for easy copying of items
from one location to another.
Although there are several built-in container classes in the C++ Standard Library, just using
them does not give you a real understanding of the structure. So we choose to implement the
bag container in our own way.
Having to figure out how to handle the removal of an element from a container of a particular
type forces you to learn how the container itself works.
What are the characteristics that are peculiar to this particular container? For example, the bag
container has no order, so the position of an item does not matter when it is removed, and
items can be added anywhere.
What are the operations that must be defined for a container? This tells you what methods
must be defined for the class implementing the container.
Chapter 4
But unlike a bag, the items in a sequence are arranged in an order, that is, you can follow the
order to access the items one after another.
In the implementation of the bag class, the items of a bag are arranged in an array, one after
another. It seems that there is also an order in a bag. Please note that this is a quirk of our
particular bag implementation, and it is not part of the nature of a bag. Actually there is no
order for the items of a bag at all.
4.1.2 Iterator
The items of a sequence are kept one after another. The order is part of the nature of a
sequence. A program can step through the sequence one item at a time. A sequence class
provides a program with methods to control precisely where items are inserted and removed
within the sequence.
As we did when defining the bag class, we use an array of size CAPACITY and type value_type.
The constant CAPACITY is also declared in the class.
4 The Sequence Class 29
For example, our sequence class for double values looks like
class sequence {
public:
typedef double value_type;
typedef std::size_t size_type;
static const size_type CAPACITY = 30;
....
};
The entries of a sequence will be stored in the front part of an array. That is, the true number
of items in the sequence may be less than the constant CAPACITY.
The entries must appear in the sequence’s order. This order is one of the basic features of a
sequence.
Because part of the array can contain garbage, the sequence class needs to keep track of how
many items are in the sequence. We put this information in the variable used of type
size_type.
In the sequence class we define the so-called current item, which indicates the precise location
where we are. The current item is represented by its index in the array, declared as
current_index.
class sequence {
public:
...
private:
value_type data[CAPACITY];
size_type used;
size_type current_index;
};
The size member function returns the number of items in the sequence.
We will have member functions to examine a sequence which has already been built. For a bag,
all the information we can gather is how many copies of a particular item are in the bag. For a
sequence, we may examine the items one after another, and the items must be examined in
order, from the front to the back of the sequence. Three member functions are provided by the
sequence class to enforce the in-order retrieval rule. The start function moves to the front of
the sequence and makes the front item the current item. The advance function changes the
current item to the next item in the sequence. The current function returns the information
of the current item in the sequence.
Generally the current function provides the information of the current item. However, there
exists a case where we cannot get the information of the current item. For example, if the
current item is the last item in the sequence, then the current item does not exist after the
advance function is called. So the sequence class provides a function, called is_item, to
examine whether there actually is another item for current to provide, or whether current
has advanced right off the end of the sequence.
Like the bag class, the sequence class also provides member functions to insert new items into
the sequence. The insert function places a new item before the current item, while the attach
function adds a new item to a sequence after the current item. The operation here is different
from that of the insert function of the bag class, where the new item is simply placed at the
first available position in the array.
We can remove the current item from a sequence with the remove_current function. This
function has no parameters. In the bag class, the erase_one function has a parameter
target, and one occurrence of the target, if any, will be removed from the bag. For a sequence,
we have no such deletion function. If you really want to delete a particular item from a
sequence, you first need to move the current position to the target and then call the
remove_current function to remove the target item.
class sequence {
public:
// TYPEDEFS and MEMBER CONSTANTS
In our sequence class declaration we have a user-defined copy constructor. Its purpose is to
create a new sequence object from another existing sequence object. The algorithm is to copy
every item in the existing object into the created object, one by one. Here is an example of its
implementation. You may have your own version.
By our definition, the insert function places a new item before the current item. There is a
lot of work to be done in the insert function. First of all we need to check whether there is
any extra space in the array for the new item. If the array is full, then you cannot insert any
new items into the sequence.
The current item is indexed by the private member current_index, so the new item should
be put in data[current_index]. If there is no current item, then the new item should
be put at the front of the sequence. Whatever the case is, we need to move all the items from
data[current_index] onward one place toward the end of the sequence, so that we can make
the space data[current_index] available to the new item and keep the order of the items in
the sequence.
5.1 Pointers
5.1.1 Pointer Variables and Types
The values belonging to pointer data types are the memory addresses of a computer. A pointer
variable is a variable whose content is an address, or a memory location.
When you declare a pointer variable, you need to specify the data type of the value to be stored
in the memory location pointed to by the pointer variable.
Please review Lecture 1 on how to declare a pointer variable. Because C++ does not
automatically initialize variables, a pointer variable must be explicitly initialized to the null
pointer if you do not want it to point to anything. For example
int * p;
p = NULL; // or
p = 0;
double *d_ptr;
d_ptr = new double;
From the above example, you can see that dynamic variables are not declared; a dynamic
variable does not even have an identifier and can be reached only through a pointer.
5 Pointers and Dynamic Arrays 34
Dynamic variables are created during the execution of a program. Only at that time does a
dynamic variable come into existence.
A value parameter of a function can be a pointer. An example was introduced in section 1.2.5
of Lecture 1.
Can you tell what the difference is between this swapping function and the one defined in
section 1.2.5 of Lecture 1?
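As a minimal sketch of a swapping function whose value parameters are pointers, consider the following. The name swap_by_pointer is hypothetical, and the version in Lecture 1 is not reproduced here, so the comparison is left to the reader.

```cpp
// Hypothetical name. The two pointers are value parameters, so the
// function receives copies of the two addresses.
void swap_by_pointer(int* a, int* b) {
    int temp = *a;   // read the int that a points to
    *a = *b;         // overwrite it through the pointer
    *b = temp;       // store the saved value where b points
}
```

A call has to pass addresses, for example swap_by_pointer(&x, &y) for two int variables x and y.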
An array definition consists of a type specifier, an identifier (name for the array) and a
dimension
int myArray[10];
An array cannot be initialized with another array, nor can one array be assigned to another.
Additionally, it is not permitted to declare an array of references.
In most expressions the array identifier evaluates to the address of its first element, and the type
of that value is pointer to the type of the elements the array contains. For example, myArray is
converted to a value of type int*.
To declare an array parameter in a function you can use any of three declaration formats, as
shown in the following examples
void myFunc(int*);
void myFunc(int[]);
void myFunc(int[90]);
Because an array is passed as a pointer, changes to an array parameter within the called
function are actually made to the array argument itself and not to a local copy.
The array’s size is not part of its parameter type. The function being passed an array does not
know its actual size, and neither does the compiler. Usually the size information is passed to a
function by using an extra size parameter.
However, you can declare the parameter as a reference to an array. When the parameter
is a reference to an array type, the array size becomes part of the parameter and argument
types, and the compiler checks that the size of the array argument matches the one specified in the
function parameter type.
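The two ways of passing an array can be compared in a short sketch. The function names sum and sum10 are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>

// Size passed separately: the function cannot know the length otherwise.
int sum(const int values[], std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += values[i];
    return total;
}

// Reference to an array of exactly 10 ints: the size is part of the type.
int sum10(const int (&values)[10]) {
    int total = 0;
    for (std::size_t i = 0; i < 10; ++i)
        total += values[i];
    return total;
}
```

With int a[10], both sum(a, 10) and sum10(a) compile. With int b[5], the call sum10(b) is rejected by the compiler because the sizes do not match.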
int *p;
p = new int[10];
*p = 25; // storing 25 into the first memory location
In the above example, the pointer p is pointing to the first memory location of the allocated
memory for the created array. Moreover, p is simply called a dynamic array.
Any component in a dynamic array is accessed by index operator [ ]. For example, the third
component in the dynamic array p is p[2].
delete [] p;
For a multidimensional array, note that only the first dimension of a dynamically allocated
array can be specified using an expression evaluated at run time. The other dimensions must
be constant values known at compile time.
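A minimal sketch of this rule, assuming a hypothetical table with a fixed number of columns (COLS) and a number of rows chosen at run time:

```cpp
#include <cstddef>

const std::size_t COLS = 4;      // assumed compile-time column count
typedef int row_type[COLS];      // one row of the table

// rows may be computed at run time; COLS must be a constant.
row_type* make_table(std::size_t rows) {
    row_type* table = new int[rows][COLS];
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < COLS; ++c)
            table[r][c] = static_cast<int>(r * COLS + c);
    return table;
}
```

The caller releases the whole table with delete [] table, exactly as for a one-dimensional dynamic array.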
Like a variable, a function also has a type. A function’s name is not part of its type.
A function type is determined only by its return type and its parameter list.
where the function name is sizeCompare, and the type of the function is given by its parameter
list consisting of two const string& parameters, and its return type int. Thus the
following function shares the same type as the above
In C++, you can define functions with the same name but of different types. The following
example actually declares two “different” functions, as the compiler performs function type
checking before selecting the appropriate function.
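Both ideas can be illustrated in one sketch. The names sizeCompare and lexicoCompare come from the notes, but their bodies here are assumptions, and the overloaded function larger is hypothetical.

```cpp
#include <string>
using std::string;

// Two functions of the same type: int(const string&, const string&).
int sizeCompare(const string& a, const string& b) {      // assumed body
    return static_cast<int>(a.size()) - static_cast<int>(b.size());
}
int lexicoCompare(const string& a, const string& b) {    // assumed body
    return a.compare(b);   // negative, zero or positive
}

// Overloading: the same name with different parameter lists, hence
// different function types. The compiler picks one by the argument types.
int larger(int a, int b) { return a > b ? a : b; }
double larger(double a, double b) { return a > b ? a : b; }
```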
A pointer to a member function must match the type of the function it is assigned, in three
areas: (1) the type and number of parameters, (2) the return type, and (3) the class type of
which it is a member.
The declaration of a pointer to member function requires an expanded syntax that takes the
class type into account. A pointer to a member function can be declared, initialized, and
assigned, as follows
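A minimal sketch of the expanded syntax, using a hypothetical class Screen (the class, its members and the helper read are assumptions made for illustration):

```cpp
class Screen {                        // hypothetical class
public:
    Screen(int h, int w) : height_(h), width_(w) {}
    int height() const { return height_; }
    int width() const { return width_; }
private:
    int height_;
    int width_;
};

// Pointer to a const member function of Screen that takes no
// parameters and returns int: all three areas are part of the type.
typedef int (Screen::*ScreenGetter)() const;

int read(const Screen& s, ScreenGetter getter) {
    return (s.*getter)();             // call through the pointer to member
}
```

A pointer is initialized with the address-of operator and the qualified name, for example ScreenGetter g = &Screen::height, and can later be assigned &Screen::width.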
where F_PtrVar is the variable name being defined. The following example defines a function
pointer variable named myPtr
Note: as a variable, myPtr has the type pointer to function taking two const
string& parameters and returning int.
Like an array name, a function name is treated as a pointer (value) to the function. Thus we
can assign a function name to a function pointer variable of the
same type. For example, we can write
myPtr = sizeCompare;
int (*testCases[10])();
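The declarations above can be put together in a small sketch. The body of sizeCompare and the functions alwaysZero and alwaysOne are assumptions made for illustration.

```cpp
#include <string>
using std::string;

int sizeCompare(const string& a, const string& b) {   // assumed body
    return static_cast<int>(a.size()) - static_cast<int>(b.size());
}

// myPtr is a variable: a pointer to function taking two const string&
// parameters and returning int. It starts as a null pointer.
int (*myPtr)(const string&, const string&) = 0;

// Hypothetical functions stored in an array of ten function pointers.
int alwaysZero() { return 0; }
int alwaysOne() { return 1; }
int (*testCases[10])() = { alwaysZero, alwaysOne };   // the rest are null
```

After myPtr = sizeCompare, the function can be called either as myPtr(s1, s2) or as (*myPtr)(s1, s2); both forms are equivalent.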
To see how to declare a “pointer to function” parameter, let’s have a look at the following
example
where the first line defines the name “PFunc” as the type “pointer to function” pointing to
functions that take two const string& parameters and return int.
We can pass any function that takes two const string& parameters and returns
int to the function sortFunc. See the example below
In this example, we are calling the function sortFunc with two string arguments s1, s2
and a function argument lexicoCompare which itself is a function.
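A minimal sketch of such a declaration and call follows. The names PFunc, sortFunc and lexicoCompare are from the notes; what sortFunc actually does is an assumption (here it simply puts the two strings into the order given by the comparison function).

```cpp
#include <string>
using std::string;

typedef int (*PFunc)(const string&, const string&);

int lexicoCompare(const string& a, const string& b) {  // assumed body
    return a.compare(b);
}

// Assumed behaviour: order s1 and s2 according to compare.
void sortFunc(string& s1, string& s2, PFunc compare) {
    if (compare(s1, s2) > 0) {     // s1 should come after s2: swap them
        string temp = s1;
        s1 = s2;
        s2 = temp;
    }
}
```

The call sortFunc(s1, s2, lexicoCompare) passes the function name itself as the third argument.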
Chapter 6
Dynamic Implementation of
Container Classes
A class may be a dynamic data structure (it may use dynamic memory) by using pointer member
variables in the class.
Several new factors come into play when a class has dynamic memory. The pointer member
variables point to dynamic memory allocated at run time, which is generally not considered
“part” of the class objects. Extra care must be taken in dealing with the dynamic
memory when the class objects are created and destroyed.
The original bag class in lecture note 3 has a member variable that is a static array containing
the bag’s items. Now we will use a pointer member variable to create a dynamic bag.
class bag {
public:
    ...
private:
    value_type *data;    // Pointer to dynamic array
    size_type used;      // How much of array is being used
    size_type capacity;  // Current capacity of the bag
};
The initial size of the dynamic array is determined by the data member capacity. The size
of the array may increase to whatever capacity is needed to hold all the items of the bag objects in a
program.
Increasing the size of the array ensures more items can be inserted.
The prototype and definition of the constructor of the dynamic bag class may look like
where DEFAULT_CAPACITY is a default value defined elsewhere for the initial size of the
dynamic array.
As the dynamic memory for the bag items is allocated by the new operator in the constructor,
the memory should be explicitly returned to the system by using the delete operator when
the object goes out of scope. This can be done by the user-defined destructor.
bag::~bag() {
    delete [] data;
}
However, if a class has a pointer member variable such as the pointer data in our bag class, only
the pointer itself is copied from the existing object to the new object. The dynamic
array pointed to by data of the bag object won’t be copied automatically.
If you want to avoid the simple copying of member variables (in fact you must avoid this),
then you must provide a copy constructor to do the job, copying the contents in the dynamic
array.
The copy constructor of the dynamic bag class should have the following prototype and
implementation
or
Like the automatic copy constructor, the automatic assignment operator makes a memberwise
copy between the objects of a class. If a class, like our dynamic bag class, contains any pointer
member variables, we must overload the assignment operator to make sure the contents of the
dynamic array are correctly copied.
When you overload the assignment operator, C++ requires the overloaded operator to be a
member function of the class. In an assignment statement y=x (both x and y are objects of
the same class), the object y is activating the function, and the object x is the argument for the
parameter of the function.
The prototype and implementation of the overloaded assignment operator look like
As a general rule we always check for self-assignment so that an assignment such as
y = y is handled safely.
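The constructor, destructor, copy constructor and assignment operator discussed in this chapter can be sketched together. This is a minimal version under stated assumptions: the item type (int), the value of DEFAULT_CAPACITY, the growth rule in insert and the helper members reserve, size and count are not taken from the notes.

```cpp
#include <cstddef>

class bag {
public:
    typedef int value_type;                        // assumed item type
    typedef std::size_t size_type;
    static const size_type DEFAULT_CAPACITY = 30;  // assumed value

    bag(size_type initial_capacity = DEFAULT_CAPACITY)
        : data(new value_type[initial_capacity]), used(0),
          capacity(initial_capacity) {}

    // Copy constructor: allocate a new array and copy the items.
    bag(const bag& source)
        : data(new value_type[source.capacity]), used(source.used),
          capacity(source.capacity) {
        for (size_type i = 0; i < used; ++i)
            data[i] = source.data[i];
    }

    ~bag() { delete [] data; }

    // Assignment operator with the self-assignment check.
    bag& operator=(const bag& source) {
        if (this == &source)
            return *this;                          // y = y: nothing to do
        value_type* new_data = new value_type[source.capacity];
        for (size_type i = 0; i < source.used; ++i)
            new_data[i] = source.data[i];
        delete [] data;                            // release the old array
        data = new_data;
        used = source.used;
        capacity = source.capacity;
        return *this;
    }

    // Enlarge the array; the existing items are kept.
    void reserve(size_type new_capacity) {
        if (new_capacity <= capacity)
            return;
        value_type* larger = new value_type[new_capacity];
        for (size_type i = 0; i < used; ++i)
            larger[i] = data[i];
        delete [] data;
        data = larger;
        capacity = new_capacity;
    }

    void insert(const value_type& entry) {
        if (used == capacity)
            reserve(capacity * 2 + 1);             // assumed growth rule
        data[used++] = entry;
    }

    size_type size() const { return used; }
    size_type count(const value_type& target) const {
        size_type answer = 0;
        for (size_type i = 0; i < used; ++i)
            if (data[i] == target)
                ++answer;
        return answer;
    }

private:
    value_type* data;      // pointer to dynamic array
    size_type used;        // how much of the array is being used
    size_type capacity;    // current capacity of the bag
};
```

Without the self-assignment check, y = y would still work in this particular version because the new array is filled before the old one is deleted, but the check avoids the needless allocation and copy.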
We may need a constant member function to count the number of the items contained in the
bag. Once we call this function for an object of the bag, the number of all the items in the bag
can be given.
We definitely need methods to allow taking items out of the bag. The erase_one function
removes one copy of target and returns true or false depending on whether the target is
really in the bag. The erase function removes all the copies of target in the bag and returns
the number of copies removed.
We need a method to report how many copies of a particular item are in the bag.
Or we may need an external method to merge two bags, called the union operation, which can
be obtained by overloading the + operator. If the + operator is defined for bags, it is sensible
to also overload the += operator.
You must override the automatic copy constructor and the automatic assignment operator.
The class must have a destructor to return all dynamic memory allocated to the object from the
heap.
Overall, when a member variable of a class is a pointer to dynamic memory, the class should
always be given a destructor, a user-defined copy constructor, and an overloaded assignment
operator.
or
When the return value of a function is an object of the class, by default the copying is done by
the automatic copy constructor, which copies all the member variables from the local
variable to the return location. If you want to avoid the simple copying of member variables,
then you must provide your own copy constructor.
When a function uses a value parameter of an object of the class, the actual argument is copied
to the formal parameter. If there is any pointer member, you need to provide your own copy
constructor.
Chapter 7
In this lecture, we will talk about the linked lists, and discuss the dynamic implementation for
the linked lists.
7.1 Lists
7.1.1 Basic Definition
A list is a linear data structure. We can insert and delete elements in any order from the list.
A linked list is a new data structure which is used to implement a list of elements arranged in
some kind of order. It is a sequence of elements arranged one after another, with each element
connected to the next by a link.
A node is a container for data in a list. In the list, each node may have a successor and a
predecessor, but no explicit order. Each node knows only its successor and predecessor, if any.
class node {
    ...
private:
    data_type data_field;
    node *link_field;
};
7 Linked Lists and Their Applications 46
where data_type may be any valid data type such as the built-in types (int, double) or
user-defined classes.
For the node class we would like to provide members for the node constructor, for setting
data information and link, and for getting (reading) data information and link. For example, a
simplified version of the node class is
class node {
public:
    node(const value_type& init_data = A_Default,
         node* init_link = NULL) {
        data_field = init_data;
        link_field = init_link;
    }
    void set_data(const value_type& new_data) {
        data_field = new_data;
    }
    void set_link(node* new_link) {
        link_field = new_link;
    }
    value_type data() const
    { return data_field; }
    node* link() { return link_field; }
private:
    value_type data_field;
    node *link_field;
};
A pointer to the first node of the list is called the head pointer. And the pointer to the last node
of the list is called the tail pointer.
We could also implement and maintain pointers to other nodes in a linked list. For example,
we can define a head pointer, a current pointer to the so-called current node and a pointer to the
previous node of the current node.
Each pointer to a node must be declared as a pointer variable. In our implementation we won’t
declare a linked list class but instead we declare two pointer variables
The implementation of a linked list is defined and manipulated by a linked-list toolkit although
we can declare a new class for the linked list data structure.
An empty linked list means the value in the pointer variable head_ptr is NULL. And the
link field of the node pointed to by the variable tail_ptr is always NULL (why?).
• Inserting a new node at the head of a linked list. The work to be done in this insertion
process is to allocate a new place (a new node) for the new element, to put the new node at
the head of the linked list and to update the variable head_ptr of the linked list.
The prototype and implementation are as follows
void list_head_insert(node*& head_ptr, const value_type& entry) {
    head_ptr = new node(entry, head_ptr);
}
The new operator allocates memory for the node, and the node constructor stores the entry and
sets the link field of the newly allocated node to the head node (pointed to by head_ptr) of the
list. The function then sets head_ptr to the new node. That is, we have added the new node at the
head of the list.
• Inserting a new node at a place other than the head of a linked list. In this insertion,
we need to know where the new entry is to be added. We assume that a particular node
is pointed to by a node pointer previous_ptr and we will add the new entry just after
that node. The work to be done includes allocating a new node (to be done by the node
constructor), placing the new entry in the data field, making the link field point to the node
after the new node’s location and connecting the link field of the node pointed to by
previous_ptr to the new node that we just created.
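The two insertion cases can be sketched as follows. The node class is a cut-down version with only the members needed here, and the item type (double) is an assumption.

```cpp
#include <cstddef>

typedef double value_type;               // assumed item type

class node {
public:
    node(const value_type& init_data = value_type(), node* init_link = NULL)
        : data_field(init_data), link_field(init_link) {}
    void set_link(node* new_link) { link_field = new_link; }
    value_type data() const { return data_field; }
    node* link() { return link_field; }
private:
    value_type data_field;
    node* link_field;
};

void list_head_insert(node*& head_ptr, const value_type& entry) {
    head_ptr = new node(entry, head_ptr);
}

void list_insert(node* previous_ptr, const value_type& entry) {
    // The new node links to the node that used to follow previous_ptr.
    node* insert_ptr = new node(entry, previous_ptr->link());
    // The node before it now links to the new node.
    previous_ptr->set_link(insert_ptr);
}
```

Note that head_ptr is a reference parameter because list_head_insert changes the caller's head pointer, while previous_ptr is a value parameter because only the node it points to is changed.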
Searching is also an important task in manipulating linked lists. This operation finds the node in
the linked list by comparing the data field of each node with a given data entry, and the function
returns a pointer to the found node. The prototype of the function is
The searching process begins from the head node of the linked list and searches for the target
node by node until the target is found or the end of the list is reached.
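A minimal sketch of the search, again with a cut-down node class and an assumed item type. Returning NULL when the target is absent is an assumption about the intended behaviour.

```cpp
#include <cstddef>

typedef double value_type;               // assumed item type

class node {
public:
    node(const value_type& init_data = value_type(), node* init_link = NULL)
        : data_field(init_data), link_field(init_link) {}
    value_type data() const { return data_field; }
    node* link() { return link_field; }
    const node* link() const { return link_field; }
private:
    value_type data_field;
    node* link_field;
};

node* list_search(node* head_ptr, const value_type& target) {
    for (node* cursor = head_ptr; cursor != NULL; cursor = cursor->link())
        if (cursor->data() == target)
            return cursor;       // found the target
    return NULL;                 // reached the end without finding it
}

// Version for lists that must not be modified through the result.
const node* list_search(const node* head_ptr, const value_type& target) {
    for (const node* cursor = head_ptr; cursor != NULL; cursor = cursor->link())
        if (cursor->data() == target)
            return cursor;
    return NULL;
}
```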
Sometimes it is very important to remove a certain element from the linked list. Just like
inserting an element into a linked list, we define two different functions for removing the head node
and any other node, respectively. We need to provide the location of the node in the linked list.
The two functions take the following formats respectively
where previous_ptr is the pointer to the node just before the node to be removed. In other
words, the node pointed to by previous_ptr is in front of the node to be removed. Here
is an example of its implementation
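The two removal functions can be sketched as follows, with a cut-down node class and an assumed item type. Both functions assume that the node to be removed exists.

```cpp
#include <cstddef>

typedef double value_type;               // assumed item type

class node {
public:
    node(const value_type& init_data = value_type(), node* init_link = NULL)
        : data_field(init_data), link_field(init_link) {}
    void set_link(node* new_link) { link_field = new_link; }
    value_type data() const { return data_field; }
    node* link() { return link_field; }
private:
    value_type data_field;
    node* link_field;
};

void list_head_remove(node*& head_ptr) {
    node* remove_ptr = head_ptr;
    head_ptr = head_ptr->link();     // the second node becomes the head
    delete remove_ptr;               // return the old head to the heap
}

void list_remove(node* previous_ptr) {
    node* remove_ptr = previous_ptr->link();
    previous_ptr->set_link(remove_ptr->link());   // bypass the node
    delete remove_ptr;
}
```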
class node {
public:
    node(const value_type& init_data = value_type(),
         node* init_link = NULL)
    { data_field = init_data;
      link_field = init_link; }
    node* link() { return link_field; }
    void set_data(const value_type& new_data) {
        data_field = new_data; }
    void set_link(node* new_link) {
        link_field = new_link; }
    value_type data() const {
        return data_field; }
    const node* link() const {
        return link_field; }
private:
    value_type data_field;
    node* link_field;
};
// Functions for the linked list
std::size_t list_length(const node* head_ptr);
void list_head_insert(node*& head_ptr, const value_type& entry);
void list_insert(node* previous_ptr, const value_type& entry);
node* list_search(node* head_ptr, const value_type& target);
const node* list_search(const node* head_ptr,
                        const value_type& target);
node* list_locate(node* head_ptr, std::size_t position);
const node* list_locate(const node* head_ptr,
                        std::size_t position);
void list_head_remove(node*& head_ptr);
void list_remove(node* previous_ptr);
void list_clear(node*& head_ptr);
void list_copy(const node* source_ptr, node*& head_ptr,
               node*& tail_ptr);
Example: the integer 631
[Figure: a linked list holding the digits 6, 3 and 1, reached from head]
7.3.2 Pointer addition
Given a pointer: type *p
When we add 1 to a pointer, we actually add sizeof(type) bytes to the memory location
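The effect of adding 1 to a pointer can be checked with a small sketch. The function name step_in_bytes is hypothetical.

```cpp
#include <cstddef>

// Number of bytes between p and p + 1 for a pointer to double.
// p must point to a double object (or into an array of doubles).
std::ptrdiff_t step_in_bytes(double* p) {
    char* here = reinterpret_cast<char*>(p);
    char* next = reinterpret_cast<char*>(p + 1);
    return next - here;
}
```

For any array a, the expression *(a + 1) therefore refers to the same element as a[1], whatever the element type is.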
Basic idea: digit by digit
Example: [Figure: a linked list reached from head]
Advantage: Starting from any node, we can traverse the list to any other node.
A possible set of definitions for a doubly linked list of items is the following
class dnode {
public:
    ... ...
private:
    value_type data_field;
    dnode* link_fore;
    dnode* link_back;
};
Example: (length $2^3 = 8$)
00011101
de Bruijn sequences can be implemented as circular lists.
[Figure: a circular list of eight nodes labelled A to H, each node holding one bit]
The data field of a node in the list consists of a bit (1 or 0) and a char.
Suppose a doubly linked list of a de Bruijn sequence has been given; then we can design a
program to encode a text file.
The algorithm is simple. Read each char from the text file, look for the char in the doubly
linked list, read n consecutive bits from the list starting at that node and write down the code
for the char, until no chars remain in the text file.
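The encoding idea can be sketched without the linked list, using a string and an index that wraps around in place of the circular links. This is only an illustration under stated assumptions: the sequence 00011101 with n = 3, and the letters 'A' to 'H' attached to positions 0 to 7 in that order (the actual assignment in the figure may differ).

```cpp
#include <cstddef>
#include <string>

const std::string SEQUENCE = "00011101";   // assumed de Bruijn sequence
const std::size_t N = 3;                   // bits per code

// Code for ch: N consecutive bits read cyclically from ch's position.
// ch must be one of 'A'..'H'.
std::string encode_char(char ch) {
    std::size_t start = static_cast<std::size_t>(ch - 'A');
    std::string code;
    for (std::size_t k = 0; k < N; ++k)
        code += SEQUENCE[(start + k) % SEQUENCE.size()];
    return code;
}

// Encode every char of the text in turn.
std::string encode_text(const std::string& text) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i)
        out += encode_char(text[i]);
    return out;
}
```

Because every 3-bit pattern occurs exactly once in the cyclic sequence, the eight letters receive eight different codes.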
Templates
In this lecture, we study the concept of template class and show several examples of how to use
template classes and template functions.
We can instantiate many functions (following the same algorithm but for different types of
data) from a template function.
Actually a template function is not an ordinary function but just a template for instantiating
new functions which apply the same task to arguments of different types.
• Type parameters
• Non-Type parameters
• Template parameters
...
}
In the case of the above swap template function, the type parameter T can be any of the C++
built-in types, or it may be a class. That is, we can use the above template function to swap two
elements of any type.
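A swap template function of this kind might look like the following sketch. The name swap_values is hypothetical and was chosen to avoid a clash with std::swap.

```cpp
// T can be a built-in type or a class that can be copied and assigned.
template <class T>
void swap_values(T& a, T& b) {
    T temp = a;
    a = b;
    b = temp;
}
```

The compiler instantiates a separate function for each type it is used with, for example one for int when called with two int variables and another for double.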
• Type parameters
• Non-type parameters
• Template parameters (the parameter is itself a template class).
Template classes are useful when we want the same basic class structure for an unlimited
number of possible classes.
They are particularly suited to structures that store different kinds of things.
Templates provide the means with which programs can build their own generic containers
(classes) and provide a mechanism for data abstraction.
The expression template < ... ...> is called the template prefix. It warns the compiler
that the following definition will use unspecified data types, such as T1, T2 etc.
Each of T1, T2 etc. can be used as a type within the template class definition.
Each of v1, v2 etc. is a normal variable throughout the template class definition.
Example:
This example is very interesting. Template class foo has a type parameter T and a non-type
parameter n of the type int.
In the body of the class definition, an array of type T (the actual type value is unspecified), x,
is defined and the length of the array x is n which is also unspecified.
From this template class one may build up a class including an array of any given length where
the type of the array elements can be specified according to applications.
For the function implementation, you must follow some rules to tell the compiler about the
dependency on the type parameters etc. Thus each method defined outside the class definition
and each static data member must also be prefixed by the template keyword and the template
parameters.
The function definition outside the class template body looks much like the syntax for template
functions. For example, the replace function in the foo template class should be defined outside
the template class body in this way
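A minimal sketch of such a template class and its out-of-class member definition follows. The template parameters T and n and the array x are as described above; the signature of replace and the accessor get are assumptions made for illustration.

```cpp
template <class T, int n>
class foo {
public:
    // Assumed meaning: store value at position index of x.
    void replace(int index, const T& value);
    T get(int index) const { return x[index]; }   // hypothetical accessor
private:
    T x[n];    // array of n elements of type T
};

// Defined outside the class body: the template prefix is repeated and
// the class name carries the template parameters.
template <class T, int n>
void foo<T, n>::replace(int index, const T& value) {
    x[index] = value;
}
```

An object is then declared with actual arguments for both parameters, for example foo<double, 5>.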
For methods which use extra template arguments, we need to include template keyword within
the class definition. This is also true for normal classes with method templates.
class Foo {
template <class T>
void fooThis(T &x);
...
};
This way means that we are using a template function within a template class or a normal
class.
Before we can instantiate a template class, its definition, not declaration, must be known.
// declaration of X
template <class T>
class X;
// definition of Y
template <class T>
class Y { ... };
For example, if we were writing a generic function for sorting an array of type T, we would
assume that two values of type T could be compared.
Fortunately C++ lets us make any assumptions we want about the type arguments. If we
instantiate the template with arguments which make any of the assumptions false, the compiler
will complain.
Example:
template <class T>
class X {                     // class name assumed
public:
    void bar()
    { t.doBar();              // assumption: T has a method doBar
      t.print(); }            // assumption: T has a method print
private:
    T t;
};
class Y {
public:
Y();
Y(int, char*);
void doBar();
void print();
};
Example:
template<class T = int>
class foo {
...
};
int main() {
int a, b;
cin >> a >> b;
Pic<a, b> p3; // Error: a & b not const
}
The compiler doesn’t know the values of a and b at compile time.
One set of static data members per instantiation. Normally there is only one copy of static data
members for all the objects of a class. One can make many classes from a class template. If
there are static data members defined in a class template, then each newly created class from
the template has its own copy of static data members when instantiating.
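This behaviour can be observed with a small sketch. The class template counter is hypothetical.

```cpp
template <class T>
class counter {                 // hypothetical class template
public:
    counter() { ++count; }      // count the objects that are created
    static int count;           // one copy per instantiated class
};

// Definition of the static member, prefixed by the template header.
template <class T>
int counter<T>::count = 0;
```

After creating two counter<int> objects and one counter<double> object, counter<int>::count is 2 while counter<double>::count is 1, because the two instantiations are different classes.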
Chapter 10
By using template class, we will replace the alias value type by a type parameter Item.
All member functions will be re-defined following the syntax of template classes.
From the definition we can see that the data field is of type Item, which is currently
unknown and will be fixed when the template is instantiated.
Here is an example to declare a multiset (bag) object mySet containing int items
multiset<int> mySet;
The above statement invokes the default constructor creating an empty multiset object mySet.
10 Templates and Iterators 64
mySet.insert(4);
10.2.3 Iterators
The multiset’s insert function is different from the insert function that we defined for our
bag class. The multiset insertion function returns a special value called an iterator, whereas
the bag’s insertion function returns nothing (void).
An iterator is an object that permits a programmer to easily step through all the items in a
container, examining the items and perhaps changing them.
Any STL container has a standard member function called begin which returns an iterator
that provides access to the first item in the container. Remember we have no order among the
elements in a container (e.g. a set here), but the STL maintains ways in which elements are
stored in certain order. Then the begin function gives the iterator to the first element. For
example
multiset<string> actors;
//insert information into actors
multiset<string>::iterator myIndex;
myIndex = actors.begin(); //the iterator to the first actor
You can imagine that an iterator of an element is the “index” to the element. Through the
iterator you can get access to the element. For example, suppose myRole is an iterator to
the actors set, then *myRole is the item that the iterator myRole is “pointing” to. That is, an
iterator behaves much like an ordinary pointer.
When you apply ++ to an iterator, the iterator is moved to the “next” item in the
container. This is the easy way to traverse all the elements in the container. For example,
after ++myRole the iterator myRole is “pointing” to the next actor in the set actors.
A container also has an end member function that returns an iterator to mark the end of its
items in the container. The typical way to traverse all the elements in actors is as follows
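A traversal from begin() to end() can be sketched as follows. The function list_actors is hypothetical; it collects the items into a vector so that the visiting order can be inspected.

```cpp
#include <set>
#include <string>
#include <vector>

// Visit every element between begin() and end() and collect them.
std::vector<std::string> list_actors(const std::multiset<std::string>& actors) {
    std::vector<std::string> names;
    std::multiset<std::string>::const_iterator myRole;
    for (myRole = actors.begin(); myRole != actors.end(); ++myRole)
        names.push_back(*myRole);   // *myRole is the current item
    return names;
}
```

The loop stops when the iterator equals end(), which marks the position just past the last item and must not be dereferenced.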
• Forward Iterator
• Bidirectional Iterator
• Random Access Iterator
From the definition we can see there is only one private member in the node_iterator
template class, which is a pointer to node<Item>.
We will overload the * operator for the aim of getting information of the item “pointed to” by
the iterator
Also we should define ++ operations for our iterator, i.e., the prefix and postfix ++ operators
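A possible shape for such an iterator is sketched below. It is not the definition from the notes: the cut-down node template, the comparison operators and the exact signatures are assumptions, although the single private pointer member and the *, prefix ++ and postfix ++ operators follow the description above.

```cpp
#include <cstddef>

template <class Item>
class node {
public:
    node(const Item& init_data = Item(), node* init_link = NULL)
        : data_field(init_data), link_field(init_link) {}
    Item& data() { return data_field; }
    node* link() { return link_field; }
private:
    Item data_field;
    node* link_field;
};

template <class Item>
class node_iterator {
public:
    node_iterator(node<Item>* initial = NULL) : current(initial) {}
    Item& operator*() const { return current->data(); }
    node_iterator& operator++() {          // prefix ++
        current = current->link();
        return *this;
    }
    node_iterator operator++(int) {        // postfix ++
        node_iterator original(*this);     // remember the old position
        current = current->link();
        return original;
    }
    bool operator==(const node_iterator& other) const
    { return current == other.current; }
    bool operator!=(const node_iterator& other) const
    { return current != other.current; }
private:
    node<Item>* current;                   // the only private member
};
```

An iterator built from a NULL pointer plays the role of the end marker, so a list can be traversed until the iterator compares equal to node_iterator<Item>().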
The content of this lecture is from Section 1.2 of the textbook. The big-oh technique is used to
analyze the efficiency of algorithms. We provide many examples to demonstrate the application
of big-oh.
11.1 big-oh
11.1.1 Time and Space Complexity
In doing a time analysis for an algorithm, we count certain operations that occur while carrying
out an algorithm or a method, rather than measuring the actual elapsed time of each algorithm.
That is, we do not usually measure the actual time taken to run the algorithm because the number
of seconds can depend on too many extraneous factors — such as the speed of the processor, and
whether the processor is busy with other tasks.
In computing the efficiency of an algorithm, some measure of the amount of work done or the
space requirements must be used. This measure is called the complexity or order of magnitude of
the algorithm.
For most algorithms/programs, the number of operations depends on the input of the
algorithm/program. The complexity of an algorithm is measured in terms of the size of the
problem/input, using an expression in the size n. [Those of you who have done amth140 should know
what I’m talking about].
The method that we will use is to estimate upper bounds on quantities such as the complexity
and efficiency.
11 Complexity Analysis of Algorithms 68
11.1.2 An Example
A polynomial $P_n(x)$ of order n is defined by
We want to evaluate the polynomial $P_n(x)$ at a given point x. Two algorithms are available
Now we want to analyze the number of multiplications used in both algorithms (we don’t care
about addition as it is much easier than multiplication on a computer)
There are $\frac{n(n+1)}{2} = 0.5n^2 + 0.5n$ multiplications in algorithm A, and only $n$
multiplications in algorithm B.
You may note that Algorithm B is faster than Algorithm A, especially for larger n.
If we apply both algorithms to a new polynomial of twice the order, that is, $P_{2n}(x)$, then the
number of multiplications in algorithm A increases about 4-fold while that in algorithm B only
doubles. If we increase the order of the polynomial to 10n, then algorithm A needs about 100 times
as many multiplications while algorithm B needs ten times as many.
The critical amount for Algorithm A is $n^2$ (not its coefficient 0.5) and the critical amount for
Algorithm B is $n$. We can express this kind of information in a format called big-O notation. For
example, we say the complexity of algorithm A is $O(n^2)$ (we don’t care about the coefficient
0.5 and the second term 0.5n) and that of algorithm B is $O(n)$.
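The two multiplication counts can be reproduced with a sketch. The forms of the two algorithms are assumptions that match the counts quoted above: algorithm A computes each term $a_i x^i$ from scratch, and algorithm B is Horner's rule. The polynomial is taken to be $P_n(x) = a_0 + a_1 x + \cdots + a_n x^n$ with coeff[i] holding $a_i$.

```cpp
#include <cstddef>
#include <vector>

// Algorithm A (assumed form): term i costs i multiplications,
// giving 1 + 2 + ... + n = n(n+1)/2 in total. coeff must not be empty.
double eval_a(const std::vector<double>& coeff, double x, long& mults) {
    double sum = coeff[0];
    for (std::size_t i = 1; i < coeff.size(); ++i) {
        double term = coeff[i];
        for (std::size_t k = 0; k < i; ++k) {
            term *= x;
            ++mults;
        }
        sum += term;
    }
    return sum;
}

// Algorithm B (assumed form): Horner's rule, one multiplication per
// coefficient after the leading one, giving n in total.
double eval_b(const std::vector<double>& coeff, double x, long& mults) {
    std::size_t n = coeff.size() - 1;
    double result = coeff[n];
    for (std::size_t i = n; i > 0; --i) {
        result = result * x + coeff[i - 1];
        ++mults;
    }
    return result;
}
```

For the cubic $1 + 2x + 3x^2 + 4x^3$ at x = 2, both functions return 49; algorithm A uses 6 multiplications and algorithm B uses 3.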
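$f = O(g)$ if there exist a constant $C$ and an integer $n_0$ such that
$|f(n)| \le C\,|g(n)|$ for all $n \ge n_0$.
The values $C$ and $n_0$ in the above definition themselves are not important; the significance lies
in the existence of such $C$ and $n_0$, rather than the magnitude of these values.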
http://mcs.une.edu.au/~amth140
$f$ is of smaller complexity than $g$, written $O(f) < O(g)$, if $f = O(g)$ but $g \ne O(f)$.
For example, Algorithm B ($O(n)$) has a smaller complexity than Algorithm A ($O(n^2)$). Why?
Because $n = O(n^2)$ but $n^2 \ne O(n)$ (i.e., you cannot find a constant $C$ and an integer $n_0$
such that $|n^2| \le C|n|$ for all $n \ge n_0$).
Generally, for a problem of size n, suppose that there are two algorithms A and B which
need f(n) and g(n) operations (or units of space), respectively.
If O(f) < O(g), then, for large enough values of n, A will be more efficient than B.
The following table shows you why an O(n2 ) algorithm is much worse than an O(n) algorithm
$e^n + n^{1000} = O(e^n)$
$e^n + n^{1000} \ne O(n^{1000})$
$O(1) < O(\log_2 n) < O(n) < O(n^2) < \cdots < O(n^{1000}) < \cdots < O(e^n) < O(n!) < O(n^n)$
An exponential-time algorithm has complexity $O(a^n)$ for $a > 1$ (much harder than a
polynomial-time algorithm).
11.3 Examples
11.3.1 Examples: Code segments
For a fixed sequence of simple statements (no loops or function calls), the complexity is O(1).
For the following code segments, what is the time complexity with regard to n (assuming that
all non-output statements are negligible)?
Example 1:
int i, j;
Answer: We should find out how many times the inner output statement is run.
The outer for has n iterations (i from 0 to n − 1). For each iteration we run another for, which
has n iterations too (j from 0 to n − 1). Thus the output statement is run n × n times by the two
fors. Hence the final time complexity of this piece of program is $O(n^2)$.
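The count in this answer can be checked with a sketch. The loop bounds are an assumption consistent with the answer (both loops run n times), and a counter stands in for the output statement.

```cpp
// Count how many times the inner statement of Example 1 would run.
long count_example1(int n) {
    long outputs = 0;
    int i, j;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            ++outputs;          // stands in for the output statement
    return outputs;
}
```

For n = 10 the function returns 100, and doubling n to 20 gives 400, four times as many.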
Example 2:
int i, j, k;
Answer:
For the innermost for, when both i and j are fixed, we execute j output statements, i.e., k
runs from 0 to j − 1. In the middle for block, j runs from 0 to i − 1 where i is fixed. Thus to finish
the two inner for blocks we need to execute
$T(i) = 0 + 1 + 2 + \cdots + (i-1) = \frac{(i-1)i}{2}$, for a given $i$,
output statements. Now let us vary i from 0 to n − 1. In order to finish the outer for block,
the total number of output statements to be executed is
$\sum_{i=0}^{n-1} T(i) = \sum_{i=0}^{n-1} \frac{(i-1)i}{2} = \frac{n(n-1)(n-2)}{6}$,
so the time complexity is $O(n^3)$.
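The total can be checked numerically. The loop bounds below are assumptions that follow the answer (j below i, k below j), and a counter stands in for the output statement. Summing $T(i) = (i-1)i/2$ over i from 0 to n − 1 gives $n(n-1)(n-2)/6$, which is what the function should return.

```cpp
// Count how many times the innermost statement of Example 2 would run.
long count_example2(int n) {
    long outputs = 0;
    int i, j, k;
    for (i = 0; i < n; i++)
        for (j = 0; j < i; j++)
            for (k = 0; k < j; k++)
                ++outputs;      // stands in for the output statement
    return outputs;
}
```

For n = 5 the count is 10 and for n = 10 it is 120, in agreement with the closed form, so the growth is of order $n^3$.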
For the recursive Fibonacci function below, what is the space complexity with regard to n?
int fib(int n) {
if (n <= 2)
return 1;
return fib(n-1) + fib(n-2);
}
Take n = 5 as an example. When the program begins to call fib(5), in the body of fib(5)
the program needs to do fib(4) first. The program suspends the calculation and begins to call
fib(4) before the addition (i.e., fib(4)+fib(3)). At this stage, the program will store the
information of the current module, i.e., the parameter 5 will be stored on the function stack by
the operating system. Please note that at this stage nothing happens to fib(3).
As the program is running fib(4), it is stepping into the second level of the recursive
block. In the second level block the program is going to call fib(3) (please note this fib(3)
is not the fib(3) at the upper level needed for fib(5)). When the program moves into this
fib(3) the system will store the information of its caller fib(4), i.e., 4, on the stack.
Now the program is running on fib(3). In the same way, within the body of fib(3) the
program is going to call fib(2) first. At this point, caller fib(3)’s information (i.e. 3) will
be pushed on to the stack and then the program is about to process fib(2) call.
In processing fib(2) the program meets the terminal condition and returns the value 1 as the
result of the function call fib(2). Then the program starts the second function
call fib(1) in the body of fib(3). Similarly the call fib(1) returns with a value of 1.
Now the stack uses 3 unit spaces to hold the information 5, 4 and 3.
Then the program returns to the caller fib(3), thus the information 3 will be popped from
the stack and fib(3) returns the value of fib(2)+fib(1). And the program comes back to
the body of the caller fib(4). In the body the program needs to process the second function
call fib(2), in which the terminal condition is met. After calculating fib(3)+fib(2),
the program goes back to the upper level caller fib(5).
Then the information 4 will be popped from the stack; the program has just finished the first
function call in the body of fib(5) and then moves to the second function call fib(3).
Within fib(3) the program needs to push the information 3 onto the stack again in order to
process the inner calls fib(2) and fib(1). Now the stack holds the information 5 and 3.
After finishing the calls fib(2) and fib(1) the program pops the information
3 and returns to the body of its caller fib(5) and calculates the addition fib(4)+fib(3).
Finally the program pops the information 5 and finishes the call fib(5).
In the whole procedure only 3 (5 − 2) unit spaces are needed to implement recursive function
call fib(5). Generally in the procedure of computing fib(n) the program needs about n − 2
unit spaces for the stack. Thus the space complexity of fib(n) is O(n).
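The linear growth of the stack can be observed with an instrumented copy of the function. The names traced_fib, depth and max_depth are hypothetical. Note that this sketch counts all calls that are active at the same time, including the innermost one, so it reports n − 1 for n ≥ 2, one more than the n − 2 stored caller records counted above.

```cpp
int depth = 0;        // calls currently active
int max_depth = 0;    // deepest nesting reached so far

// Same recursion as fib, with bookkeeping of the nesting depth.
int traced_fib(int n) {
    ++depth;
    if (depth > max_depth)
        max_depth = depth;
    int result = (n <= 2) ? 1 : traced_fib(n - 1) + traced_fib(n - 2);
    --depth;
    return result;
}
```

Although the total number of calls grows very quickly with n, the deepest nesting grows only by one when n grows by one, which is why the space complexity is O(n).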
Chapter 12
In general, there are two approaches to writing repetitive algorithms. One uses iteration; the
other uses recursion. Recursion is a repetitive process in which an algorithm calls itself. In this
lecture, we study recursion and iteration as well as backtracking algorithms.
See Figure 9.1 on page 420 and Figure 9.2 on page 423.
if (stopping_condition)
terminal_case
else
reducing_case
or
if (!stopping_condition)
reducing_case
12 Recursion and Iteration 74
Example
Recursive version of n!
double recFact(int n) {
if (n == 0)
return 1;
return n * recFact(n-1);
}
Iterative version of n!
double itFact(int n) {
double retVal = 1.0;
int i;
for (i = 2; i <= n; i++)   // multiply by 2, 3, ..., n
retVal *= i;
return retVal;
}
The computer also provides memory for the called function’s formal parameters and any local
variables that the called function uses.
Next, the actual arguments are plugged in for the formal parameters, and the code of the called
function begins to execute.
If execution encounters another function call, possibly a recursive one, the first function’s
computation is suspended temporarily, because the second function call must be executed
before the first function call can continue.
Information is saved that indicates precisely where the first function call should resume when
the second call is completed.
The second function is given memory for its own parameters and local variables, and execution
proceeds with the second function call.
When the second function call is completed, execution returns to the saved location within
the first function, and the first function resumes its computation.
Suppose we call recFact(5). The computer allocates memory for the formal parameter n
(the function has no local variables) and copies the actual argument value 5 into the memory of
the formal parameter n.
As the value of n is nonzero, the function executes the second statement in its body, i.e.,
the return statement.
There it encounters another function call, recFact(4). The computer suspends the computation
of the first function call and begins processing the second function call. At the same
time, information is saved that records the point the first function call had reached.
The computer allocates new memory for the formal parameter n and plugs the value 4 into it.
Note that this memory for n is different from the memory for n in the first function call, which
holds the value 5.
Then the computer begins processing the second call, in which a third function call is encountered,
and the same procedure is repeated. When the second function call eventually finishes
its job, it returns its result to the body of the first function call.
Finally the first function call resumes its computation from the point where it stopped, i.e., it
computes 5 times the result of the second function call and returns the product. Now the first
function call has finished its job.
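To see that every call really gets its own memory for n, one can record the address of n in each call. This is only an illustrative sketch: recFactTraced, frameAddresses and distinctFrames are names invented here, and the addresses are stored as integers purely for comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Address of the parameter n in every call, in the order the calls were made.
std::vector<std::uintptr_t> frameAddresses;

double recFactTraced(int n) {
    frameAddresses.push_back(reinterpret_cast<std::uintptr_t>(&n));
    if (n == 0)
        return 1;
    return n * recFactTraced(n - 1);   // the caller waits here for the result
}

// Number of different addresses that were recorded.
std::size_t distinctFrames() {
    std::set<std::uintptr_t> unique(frameAddresses.begin(), frameAddresses.end());
    return unique.size();
}
```

Calling recFactTraced(5) makes six calls (n = 5, 4, 3, 2, 1, 0), and since all six are active at the same moment, each n lives at a different address.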
12.2 Examples
12.2.1 The Problem
In this example we consider a recursive function for deleting an object from a black and white
image.
Two black pixels are in the same object if we can get from one to the other with horizontal and
vertical moves without going over any white pixels.
The base case occurs if the pixel is outside the bounds of the image, or if it is already white.
[Figure: the original black and white image, columns 0 to 4]
Otherwise we white out the pixel and then call the function recursively for the pixels directly
above, below, to the left and to the right of the current pixel.
Recursive case: make the pixel white and call the function for each of the four neighbouring pixels.
E.g. Figure 12.2 shows the result when we delete the object at pixel (3,2).
[Figure 12.2: the image after deleting the object at pixel (3,2), columns 0 to 4]
The code:
Suppose we call the above function at pixel (3,2), then can you trace the order of pixels chang-
ing from black to white?
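A minimal sketch of such a function is given below. It is not the original listing: the type Image, the name deleteObject and the (row, col) indexing convention are assumptions made for illustration. The neighbours are visited in the order above, below, left, right, as in the description.

```cpp
#include <cassert>
#include <vector>

// Assumed representation: true = black pixel, false = white pixel.
using Image = std::vector<std::vector<bool>>;

void deleteObject(Image& img, int row, int col) {
    // Base case 1: the pixel is outside the bounds of the image.
    if (row < 0 || row >= static_cast<int>(img.size()) ||
        col < 0 || col >= static_cast<int>(img[row].size()))
        return;
    // Base case 2: the pixel is already white.
    if (!img[row][col])
        return;
    // Recursive case: white out the pixel, then visit the four neighbours.
    img[row][col] = false;
    deleteObject(img, row - 1, col);   // above
    deleteObject(img, row + 1, col);   // below
    deleteObject(img, row, col - 1);   // left
    deleteObject(img, row, col + 1);   // right
}
```

Pixels that touch the object only diagonally are left untouched, because only horizontal and vertical moves are followed.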
• Recursive functions use more stack space than their iterative counterparts.
• Recursive functions can require more CPU time.
In Favour of Recursion:
The problem is to place eight queens on a chessboard such that no queen can take another.
$\binom{64}{8} = 4,426,165,368$ combinations.
If this fails then we find another unguarded position for this column.
If there are no more unguarded positions then we signal failure and exit (base case).
If there are no more columns to consider then we signal success and exit (base case).
or
if (!stopping_condition)
while (more_guesses)
try_guess_reduce
The job of the addQueen function is to play the game at the location (row, col) and report the
result (success or failure) from that location. Within the function, we first check whether the
location is outside the board. If so, the game is over and the function returns the game status
(success or failure). Otherwise we check whether the location is guarded by the queens already
placed. If it is guarded, we play the game at the next location (row+1, col); if not, we put a new
queen at the location. After that we continue the game in a new column, that is, from the location
(0, col+1). If the result returned from that location is failure, the location (row, col) is not a good
location for the new queen. Hence we go back one step by removing the queen at (row, col) and
try the next location in the same column, that is, the location (row+1, col).
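The description above can be sketched as follows. The board representation (one row index per column in queenRow) and the helper guarded are assumptions made for this example; only the name addQueen and the order of the steps come from the text.

```cpp
#include <cassert>
#include <cstdlib>

const int N = 8;
// queenRow[c] is the row of the queen placed in column c.
// Entries for columns that have not been reached yet are never read.
int queenRow[N];

// True if (row, col) is attacked by a queen in one of the columns 0..col-1.
bool guarded(int row, int col) {
    for (int c = 0; c < col; ++c) {
        int r = queenRow[c];
        if (r == row || std::abs(r - row) == col - c)
            return true;               // same row or same diagonal
    }
    return false;
}

bool addQueen(int row, int col) {
    if (col >= N) return true;         // no more columns: success
    if (row >= N) return false;        // no more rows in this column: failure
    if (!guarded(row, col)) {
        queenRow[col] = row;           // place a queen at (row, col)
        if (addQueen(0, col + 1))
            return true;
        queenRow[col] = -1;            // backtrack: remove the queen
    }
    return addQueen(row + 1, col);     // try the next row in the same column
}
```

The game starts with addQueen(0, 0); when it returns true, queenRow holds one complete placement.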
Chapter 13
In this lecture, we discuss the concept of a key and how search algorithms are compared, then
study sequential search and binary search (two versions) together with their properties.
13.1.1 Objectives
Search an array for a target using two algorithms:
• Sequential search.
• Binary search.
13.1.2 Keys
To search an array, we must be able to compare elements.
This is done via a key. A key is something which can be used to identify or classify objects/elements.
It is unique for each object of a class.
For simple types like int, char, and char*, the key may be the element itself.
For more complex types such as structures, this key may be one specific field.
For example, in the following structure, we could use studentNumber as the key.
13 Algorithms for Search 82
struct student {
char firstName[32];
char lastName[32];
int studentNumber;
};
The algorithms are compared by the number of key comparisons they require.
Finish when:
The technique is often used because it is easy to write the algorithm code and is applicable to
many situations including both sorted and unsorted arrays.
i = 0
found = FALSE
while ( (i < n) and (not found) ) do the following
{
if (x is a[i])
found = TRUE
else
i = i + 1
}
if (i == n)
i = - 1
return i
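The pseudocode translates almost line by line into C++. The following is a sketch under the assumption that the element type supports ==; the function name sequentialSearch is chosen here.

```cpp
#include <cassert>

// Returns the index of x in a[0..n-1], or -1 if x does not occur.
template <class Item>
int sequentialSearch(const Item a[], int n, const Item& x) {
    int i = 0;
    bool found = false;
    while (i < n && !found) {
        if (x == a[i])
            found = true;    // stop at the first match
        else
            i = i + 1;       // one more key comparison has failed
    }
    return found ? i : -1;
}
```

For example, searching the array {3, 7, 8} for 8 returns 2, and searching it for 5 returns -1.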
The perfect answer depends on the distribution of elements in the array. Usually when we
analyze the performance of an algorithm, we consider the “hardest” case. This is called the
worst-case.
For the sequential search algorithm, the worst-case search occurs when the target is the last
element in the array. In this case, the algorithm compares every element in the array with the
target thus the algorithm takes n comparisons.
Another worst-case is that the target is not in the array at all. In this case, the algorithm has to
complete comparison with each element in the array.
The best case occurs when the target is the first element in the array, thus only one comparison
operation is needed.
• A successful search: (n + 1)/2 on average (the best case: 1, and the worst case: n).
• An unsuccessful search: n (no such element x in the array a).
index: 0 1 2 3 4 5 6 7
In this example, the algorithm needs five comparisons between the target and the elements in
the array.
If the target x is ’W’, then the algorithm makes 8 comparisons and the search is unsuccessful.
first = 0
last = n-1
found = FALSE
while ( (first <= last) and (not found) ) do the following
{
mid = (first + last)/2
if ( x < a[mid] )
last = mid - 1
else if ( x > a[mid] )
first = mid+1
else
found = TRUE
}
if ( not found ) then
mid = -1
return mid
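A C++ sketch of this first version is shown below. It assumes the array is sorted in ascending order and that the element type supports < and >. The midpoint is computed as first + (last - first)/2, which gives the same index as (first + last)/2 but cannot overflow for very large arrays.

```cpp
#include <cassert>

// Binary search, version 1: returns the index of x in the sorted
// array a[0..n-1], or -1 if x does not occur.
template <class Item>
int binarySearch(const Item a[], int n, const Item& x) {
    int first = 0;
    int last = n - 1;
    while (first <= last) {
        int mid = first + (last - first) / 2;
        if (x < a[mid])
            last = mid - 1;      // continue in the left half
        else if (x > a[mid])
            first = mid + 1;     // continue in the right half
        else
            return mid;          // found
    }
    return -1;
}
```

Note that each pass through the loop may cost two key comparisons (x < a[mid] and x > a[mid]).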
a: 3 7 8 9 14 22 24 28 31
index: 0 1 2 3 4 5 6 7 8
Step Two
a: 3 7 8 9 14 22 24 28 31
index: 0 1 2 3 4 5 6 7 8
Step Three
a: 3 7 8 9 14 22 24 28 31
index: 0 1 2 3 4 5 6 7 8
first last
mid
first = 0
last = n-1
while ( first < last ) do the following
{
mid = (first + last) / 2
if ( x > a[mid] )
first = mid+1
else
last = mid
}
if (first = last and x = a[last])
Question: What will happen if the key is the middle element at first comparison?
Consider two special cases: (1) the key is the first element and (2) the key is the last element.
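To experiment with these questions, here is a sketch of the second version in C++. The final test is an assumption: it returns last when x equals a[last] and -1 otherwise. The name binarySearch2 is chosen here.

```cpp
#include <cassert>

// Binary search, version 2: one key comparison per loop iteration.
// The range is narrowed until a single candidate remains.
template <class Item>
int binarySearch2(const Item a[], int n, const Item& x) {
    if (n <= 0)
        return -1;
    int first = 0;
    int last = n - 1;
    while (first < last) {
        int mid = first + (last - first) / 2;
        if (x > a[mid])
            first = mid + 1;     // x can only be to the right of mid
        else
            last = mid;          // x, if present, is at mid or to its left
    }
    // Assumed final step: here first == last, so test the one candidate.
    return (x == a[last]) ? last : -1;
}
```

In this version the loop does not stop early when the key happens to be the middle element; it keeps halving the range until one element is left and only then tests for equality.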
Binary Search:
10 5.5 5 3
100 50.5 9 7
1000 500.5 14 10
10000 5000.5 19 13
100000 50000.5 24 17
10 10 6 3
100 100 10 7
1000 1000 15 10
10000 10000 20 13
100000 100000 25 17
Chapter 14
14.1 Hashing
14.1.1 Definitions
A table is a finite sequence of records such that each entry has a key which uniquely identifies
it.
The load factor is a measure of the saturation of a table. It is the ratio of the number of items in
the table to the number of table locations.
The table structure is the main component of a relational database. A relational database is
composed of many tables, called relations.
Example Table
14 Hashing and Complexity 90
1. Array:
(a) Sorted
(b) Unsorted
2. Linked List:
(a) Sorted
(b) Unsorted
The characteristics of a particular application, including the frequency of doing each operation,
determine the best implementation of ADT table.
To match a table implementation with an application, we must have a clear understanding of
each implementation’s time and space complexities.
• Deletions: Delete the entry with key key at the location h(key).
• Retrieve: Check if the entry with key key is at the location h(key).
This direct access approach provides very efficient insertions, deletions and retrievals.
In the remainder method of hashing, the hash function has the form
hash(k) = k%c
Note: The hash function may map two different keys to the same index. This leads to the
possibility of collisions.
hash(k) = k % 11
We wish numbers to be dispersed in a semi-random fashion throughout the table. Eg: Suppose
all keys were even, then we would only use half of the table.
A probe into a hash table is a check of a location for a key. Many collision-resolving methods
are possible:
• Linear probing
• Quadratic probing
• Random probing
• Chained addressing
Linear probing is an open addressing technique. If there is a collision, we continue from the
hash location on, looking for the next available position.
When using open addressing, each location is assigned one of three states:
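A minimal sketch of linear probing follows. It uses hash(k) = k % 11 from the example above and assumes that the three states are called EMPTY, OCCUPIED and DELETED; these names, the struct Table and the functions insertKey and findKey are all chosen here for illustration. Keys are assumed to be non-negative, and duplicate keys are not checked.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t CAPACITY = 11;
enum State { EMPTY, OCCUPIED, DELETED };   // assumed state names

struct Table {
    int key[CAPACITY];
    State state[CAPACITY];
    Table() {
        for (std::size_t i = 0; i < CAPACITY; ++i)
            state[i] = EMPTY;
    }
};

std::size_t hashKey(int k) {
    return static_cast<std::size_t>(k) % CAPACITY;   // hash(k) = k % 11
}

// Stores k in the first free location at or after its hash location.
bool insertKey(Table& t, int k) {
    std::size_t h = hashKey(k);
    for (std::size_t probes = 0; probes < CAPACITY; ++probes) {
        std::size_t i = (h + probes) % CAPACITY;      // wrap around
        if (t.state[i] != OCCUPIED) {
            t.key[i] = k;
            t.state[i] = OCCUPIED;
            return true;
        }
    }
    return false;                                     // table is full
}

// Returns the location of k, or -1 if k is not in the table.
int findKey(const Table& t, int k) {
    std::size_t h = hashKey(k);
    for (std::size_t probes = 0; probes < CAPACITY; ++probes) {
        std::size_t i = (h + probes) % CAPACITY;
        if (t.state[i] == EMPTY)
            return -1;                                // path ends here
        if (t.state[i] == OCCUPIED && t.key[i] == k)
            return static_cast<int>(i);
    }
    return -1;
}
```

The DELETED state matters during a search: a deleted location must not end the probing path, because keys inserted later may lie beyond it.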
14.3.6 Clustering
Any key that maps to a location h follows the same path, searching for an available cell.
The clustering of records in a hash table that follow the same path from a hash location is called
primary clustering.
When a probing path for one key merges with other hash values and paths, secondary clustering
occurs.
h + 1, h + 4, h + 9, h + 16, ...
This method reduces secondary clustering, but does not reduce primary clustering, and it may
not examine every location.
This probing method reduces both primary and secondary clustering. It is usually slower
because of time taken in generating random numbers.
As the table fills, there can be a severe degradation in the table performance.
Load factors between 60% and 70% are common. Load factors > 70% are undesirable.
The search time is dependent only on the load factor, not on the table size. We can use the
desired load factor to determine appropriate table size:
Allow collisions to produce many entries at the same location. When the bucket becomes full,
we must again deal with handling collisions.
A chain is a linked list of elements that share the same hash location.
Collisions are resolved by adding the element to the linked list at that location.
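Chaining can be sketched with a vector of linked lists. The class name ChainedTable, the use of std::list for the chains and the remainder hash are assumptions made for this example; keys are assumed to be non-negative integers.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

class ChainedTable {
public:
    explicit ChainedTable(std::size_t buckets) : chains(buckets) {}

    // A collision simply makes the chain at that location longer.
    void insert(int key) { chains[index(key)].push_back(key); }

    bool contains(int key) const {
        for (int k : chains[index(key)])
            if (k == key)
                return true;
        return false;
    }

    std::size_t chainLength(std::size_t location) const {
        return chains[location].size();
    }

private:
    // Remainder hashing on the number of locations.
    std::size_t index(int key) const {
        return static_cast<std::size_t>(key) % chains.size();
    }
    std::vector<std::list<int>> chains;
};
```

With 17 locations, the keys 54 and 20 both hash to location 3, so the second of them is added to the chain that already starts there.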
Chained Example
0
1 element with key = 18
2
3 element with key = 54 add new element here
In this lecture we discuss the stack structure. Stacks are widely used in many computing
techniques. For example, recursion is implemented in the computer by means of a run-time stack. A stack
is appropriate in situations where the last data item placed into the structure should be the first one processed.
The operation to place a data item on the top of the stack is push.
The top operation returns the top of the stack without removing it.
The IsEmpty operation returns true if the stack is empty (i.e. has no elements).
The C++ Standard Template Library (STL) has a stack class. You can simply use it to define
your own stack object. For example, the following statement declares a stack of strings:
stack<string> myStrStack;
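A short usage sketch of the STL stack is given below. The function lastPushed exists only for this example; push, top and pop are the standard std::stack member functions.

```cpp
#include <cassert>
#include <stack>
#include <string>

// Pushes two strings and reports them in the order they come off the stack.
std::string lastPushed() {
    std::stack<std::string> myStrStack;
    myStrStack.push("first");
    myStrStack.push("second");
    std::string top = myStrStack.top();   // look at the top without removing it
    myStrStack.pop();                     // remove the top element
    return top + "," + myStrStack.top();
}
```

The result is "second,first": the last string pushed is the first one processed.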
    void pop();
    void operator=(const stack& source);
    bool empty() const
    { return (top_ptr == NULL); }
    Item top() const;
private:
    node<Item>* top_ptr;
};
15.2 Applications
15.2.1 Where are Stacks used?
Stack frames are used to store return addresses, parameters, and local variables during a function
call.
Robotics: Instructions are stored in a stack. We can apply stack controllers such as repeat loops
to these stacks.
Use recursion
Use Stack
Postfix notation: AB + C∗
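A postfix expression such as AB + C∗ can be evaluated with a stack of operands. The sketch below is an illustration under simplifying assumptions: operands are single decimal digits, the operators are + - * /, and the expression is well formed. The name evalPostfix is chosen here.

```cpp
#include <cassert>
#include <cctype>
#include <stack>
#include <string>

// Evaluates a well-formed postfix expression with single-digit operands.
int evalPostfix(const std::string& expr) {
    std::stack<int> operands;
    for (char ch : expr) {
        if (std::isdigit(static_cast<unsigned char>(ch))) {
            operands.push(ch - '0');            // operand: push its value
        } else if (ch == '+' || ch == '-' || ch == '*' || ch == '/') {
            int right = operands.top(); operands.pop();
            int left = operands.top();  operands.pop();
            switch (ch) {
                case '+': operands.push(left + right); break;
                case '-': operands.push(left - right); break;
                case '*': operands.push(left * right); break;
                default:  operands.push(left / right); break;   // '/'
            }
        }
        // any other character (for example a space) is ignored
    }
    return operands.top();                      // the final result
}
```

With A = 2, B = 3 and C = 4, the expression AB + C∗ becomes "23+4*" and evaluates to (2 + 3) * 4 = 20.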
15 Stacks and Their Application 101
Algorithm
Algorithm
In this lecture, we describe the queue structure, which models, for example, a line of people
waiting at the salad bar. We discuss the abstract data type definition of a queue as well as the
appropriate operations on the queue structure.
16.1 Queues
16.1.1 Definition of a Queue
A list where additions and deletions are done at opposite ends.
Additional operations:
queue<string> myStrQueue;
When adding elements to the queue, the index last increases. When deleting elements from
the queue, the index first also increases. That is, each time we enqueue an element, the
queue grows to the right (i.e. towards larger indices). Each time an element is dequeued, the queue
shrinks from the left. Thus, the entire queue will slowly drift to the right. Eventually, we will
reach the end of the array and no longer be able to add elements to the queue, even though there is
space on the left-hand side of the array.
To solve this problem, consider a queue of people waiting to be served. Each time someone
leaves the queue from the front, everyone else shuffles forward to fill the gap. Similarly, for an
array based queue, we could shift each element of the queue one index to the left each time we
dequeued something.
16 Queues and Their Application 104
The array wraps around, so that if we reach the end of the array we start enqueueing elements
at the start. Of course, if the queue is full, we still won’t be able to enqueue any more elements.
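The wrap-around idea can be sketched as follows. This version keeps the index first and a count of stored elements instead of a separate last index, which is one common way to tell a full queue from an empty one. The class name ArrayQueue and the small capacity are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>

class ArrayQueue {
public:
    static const std::size_t CAPACITY = 4;

    ArrayQueue() : first(0), count(0) {}

    bool empty() const { return count == 0; }
    bool full() const { return count == CAPACITY; }

    // Adds entry at the rear; returns false if the queue is full.
    bool push(int entry) {
        if (full())
            return false;
        data[(first + count) % CAPACITY] = entry;   // rear index wraps around
        ++count;
        return true;
    }

    // Removes the front element; returns false if the queue is empty.
    bool pop() {
        if (empty())
            return false;
        first = (first + 1) % CAPACITY;             // front index wraps around
        --count;
        return true;
    }

    // Precondition: the queue is not empty.
    int front() const { return data[first]; }

private:
    int data[CAPACITY];
    std::size_t first;   // index of the front element
    std::size_t count;   // number of elements currently stored
};
```

No element is ever shifted: both ends of the queue move through the array and return to index 0 after the last position.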
Operating System
To recognize palindromes, we first enter the line into both a stack and a queue of characters.
Then we compare the output of both structures char by char. Because a stack is a LIFO struc-
ture and a queue is FIFO, the order of the stack output is reversed, while the queue output is
in order.
Algorithm: Palindrome
return TRUE
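The palindrome test can be sketched with the STL stack and queue. Ignoring non-letters and letter case is an assumption of this example, as is the function name isPalindrome.

```cpp
#include <cassert>
#include <cctype>
#include <queue>
#include <stack>
#include <string>

bool isPalindrome(const std::string& line) {
    std::stack<char> s;
    std::queue<char> q;
    // Enter every letter into both structures.
    for (char ch : line) {
        unsigned char uc = static_cast<unsigned char>(ch);
        if (std::isalpha(uc)) {
            char upper = static_cast<char>(std::toupper(uc));
            s.push(upper);
            q.push(upper);
        }
    }
    // The stack yields the letters reversed, the queue yields them in order.
    while (!s.empty()) {
        if (s.top() != q.front())
            return false;
        s.pop();
        q.pop();
    }
    return true;
}
```

For example, "Able was I ere I saw Elba" is accepted, while "stack" is rejected at the first comparison.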
A queue of customers is maintained. (In this example, service consists only of decrementing the
remaining service time.)
Algorithm:
Dynamic Queues
Our approach uses two pointers: one points to the first node (front_ptr), and the other
points to the last node (rear_ptr).
};
17.1.2 Implementation
The constructor will initialize an empty queue. The implementation is simple.
The copy constructor will create a new queue object from the given queue. The implementation
uses the list copy function of the node class to copy items from the source to the newly
created queue.
Before you read the text about the details of the destructor, can you give the implementation of the
destructor ~queue()?
The push function of the queue class adds a node at the rear of the queue. In the implementation
we need to create the node for the given entry, add the node just after the current rear
node (via the insertion function of the node), and move rear_ptr to point to the newly
added node.
}
++count;
}
The pop function of the queue class removes a node from the front of the queue. After the removal,
the second node becomes the new front node. In the implementation we need to return the
space for the first node to the system and move front_ptr to the second node.
The assignment operation is also important. Here is the implementation of the overloaded =
operator.
Finally we need a function to return the data in the front node.
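The operations described above can be put together in a compact sketch. It is not the class from the text: the name LinkedQueue and the nested Node struct are assumptions, the node toolkit functions are replaced by direct pointer manipulation, and copying is disabled to keep the example short.

```cpp
#include <cassert>
#include <cstddef>

template <class Item>
class LinkedQueue {
public:
    LinkedQueue() : front_ptr(nullptr), rear_ptr(nullptr), count(0) {}
    ~LinkedQueue() { while (!empty()) pop(); }   // return every node to the heap

    // Copy constructor and assignment are omitted in this sketch.
    LinkedQueue(const LinkedQueue&) = delete;
    LinkedQueue& operator=(const LinkedQueue&) = delete;

    void push(const Item& entry) {
        Node* added = new Node(entry);
        if (empty())
            front_ptr = added;          // first node is both front and rear
        else
            rear_ptr->link = added;     // attach after the current rear node
        rear_ptr = added;
        ++count;
    }

    void pop() {
        assert(!empty());
        Node* old_front = front_ptr;
        front_ptr = front_ptr->link;    // the second node becomes the front
        if (front_ptr == nullptr)
            rear_ptr = nullptr;         // the queue is now empty
        delete old_front;
        --count;
    }

    const Item& front() const { assert(!empty()); return front_ptr->data; }
    bool empty() const { return count == 0; }
    std::size_t size() const { return count; }

private:
    struct Node {
        Item data;
        Node* link;
        explicit Node(const Item& d) : data(d), link(nullptr) {}
    };
    Node* front_ptr;
    Node* rear_ptr;
    std::size_t count;
};
```

Because additions touch only rear_ptr and removals touch only front_ptr, both push and pop take constant time.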
For example, a computer operating system that keeps a queue of programs waiting to use
some resource, such as a printer, may give interactive programs a higher priority than batch
processing programs that will not be picked up by the user.
A priority queue is a container class that allows entries to be retrieved according to some speci-
fied priority levels. The highest priority entry is removed first.
If there are several entries with equally high priorities, then the priority queue’s implementa-
tion determines which will come out first.
If the value of the Item type can be compared with a built-in “less than” operator, then that
operator can be used to specify a priority for each item in the queue.
For your own class type Item, you may have defined your own “less than” operator (an overloaded
operator <).
A programmer may also define a priority function other than the operator <.
A common priority queue implementation uses a tree data structure that achieves high efficiency.
There are several less efficient alternatives. One possibility is to implement the priority queue
as an ordinary linked list, where the items are kept in order from highest to lowest priority.
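As an illustration, the STL provides std::priority_queue, which by default uses operator < to decide priorities. The struct Job, its fields and the helper nextJob below are hypothetical names for this example.

```cpp
#include <cassert>
#include <queue>
#include <string>

struct Job {
    std::string name;
    int priority;       // larger value = higher priority
};

// Overloaded "less than": a job is "less" if its priority is lower.
bool operator<(const Job& a, const Job& b) {
    return a.priority < b.priority;
}

// Removes and returns the name of the highest priority job.
std::string nextJob(std::priority_queue<Job>& jobs) {
    std::string name = jobs.top().name;
    jobs.pop();
    return name;
}
```

If an interactive program has priority 5 and a batch program has priority 1, the interactive program comes out first no matter in which order the two were inserted.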
Chapter 18
Unlike lists, a tree is a nonlinear structure. You already know about trees from the discrete
mathematics course. In this lecture we discuss how to define the abstract data type tree and its
implementation methods.
For any nonempty tree, there is a hierarchical arrangement with one node at the top. This
representation is a rooted tree; the node at the top is called the root.
Animal
The successors of the root are roots of the subtrees of the root. The parent of a node (if any) is its
predecessor. The children of a node (if any) are its successors.
The depth/level of a node is its distance from the root, and the depth/height of a tree is the
maximum level of the nodes in the tree.
[Figure: a tree rooted at Animal (level 1) with leaves Parrot, Cat and Dog (level 3), labelled with root, parent, child, edge, interior node and leaf; height = 3]
A child on the left of the node is called the left child and a child on the right of the node is called
the right child.
The left (right) subtree is the subtree containing the left (right) child.
Fruit
In a full binary tree, every leaf has the same depth, and every nonleaf has two children.
18 Trees and Binary Trees 113
In a complete binary tree, every level except the deepest must contain as many nodes as possible,
and at the deepest level, all the nodes are as far left as possible.
In this representation, the root node is stored as the first entry of an array, the left child
of the root as the second entry and the right child of the root as the third entry.
Then the leftmost node on the third level goes into the fourth place of the array, followed by
the other nodes of that level from left to right, and so on level by level. See the example on pages
460-461.
• Suppose that the data for a nonroot node appears in entry [i] of the array, then the data
for its parent is always at location [(i-1)/2].
• Suppose that the data for a node appears in entry [i] of the array, then its children (if
they exist) always have their data at these locations: the left child at entry [2i+1] and
the right child at entry [2i+2].
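The two index rules can be written as small helper functions. The names parentIndex, leftChildIndex and rightChildIndex are chosen here for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Index of the parent of the node stored at entry i (requires i > 0).
std::size_t parentIndex(std::size_t i) { return (i - 1) / 2; }

// Indices of the children of the node stored at entry i.
std::size_t leftChildIndex(std::size_t i) { return 2 * i + 1; }
std::size_t rightChildIndex(std::size_t i) { return 2 * i + 2; }
```

For example, the node at entry 3 has its children at entries 7 and 8, and both of them have entry 3 as their parent because the integer division discards the remainder.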
A binary tree is represented by nodes linked together. We need a function
returning the data of a tree node.
One important operation is to destroy a binary tree, which means returning the nodes of the tree
to the heap. Our function has one parameter, the root pointer of the tree. This
function employs a recursive algorithm: before we return the root node to the heap,
we return all the nodes in the left subtree and the right subtree.
Our second function also has a simple recursive implementation. The function copies a tree.
The tree can be copied in three steps.
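Both functions can be sketched as follows. TreeNode is a hypothetical stand-in for the node class of the text, and the three copying steps used here (copy the left subtree, copy the right subtree, create a new root that points to the two copies) are the usual ones but are an assumption of this example.

```cpp
#include <cassert>

template <class Item>
struct TreeNode {
    Item data;
    TreeNode* left;
    TreeNode* right;
    TreeNode(const Item& d, TreeNode* l = nullptr, TreeNode* r = nullptr)
        : data(d), left(l), right(r) {}
};

// Returns all nodes of the tree to the heap and sets root to null.
template <class Item>
void treeClear(TreeNode<Item>*& root) {
    if (root == nullptr)
        return;
    treeClear(root->left);     // first the left subtree
    treeClear(root->right);    // then the right subtree
    delete root;               // finally the root itself
    root = nullptr;
}

// Returns the root pointer of a newly allocated copy of the tree.
template <class Item>
TreeNode<Item>* treeCopy(const TreeNode<Item>* root) {
    if (root == nullptr)
        return nullptr;
    TreeNode<Item>* leftCopy = treeCopy(root->left);     // step 1
    TreeNode<Item>* rightCopy = treeCopy(root->right);   // step 2
    return new TreeNode<Item>(root->data, leftCopy, rightCopy);   // step 3
}
```

The copy shares no nodes with the original, so clearing one tree leaves the other intact.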
• Preorder
• Postorder
• Inorder
Algorithm:
10
7 14
2 11 20
We travel anticlockwise along the line and visit each node as we pass:
G R
B H Z
A F M X
T Y
Preorder: N G B A F H M R Z X T Y
Postorder: A F B M H G T Y X Z R N
Inorder: A B F G H M N R T X Y Z
Process is a type parameter for a function type. You instantiate the type parameter Process
by any type of function.
Following the same rule, you can define functions for postorder and inorder traversal algo-
rithms.
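A sketch of all three traversals with a Process type parameter is shown below. The struct BinaryNode, the helper bstInsert (used only to build the example tree) and the function object AppendTo are assumptions of this example. The nodes are never freed, to keep the sketch short.

```cpp
#include <cassert>
#include <string>

struct BinaryNode {
    char data;
    BinaryNode* left;
    BinaryNode* right;
};

// Helper for building the example: ordinary binary search tree insertion.
BinaryNode* bstInsert(BinaryNode* root, char c) {
    if (root == nullptr)
        return new BinaryNode{c, nullptr, nullptr};
    if (c < root->data)
        root->left = bstInsert(root->left, c);
    else
        root->right = bstInsert(root->right, c);
    return root;
}

template <class Process>
void preorder(Process f, const BinaryNode* node) {
    if (node == nullptr) return;
    f(node->data);                 // visit the node first
    preorder(f, node->left);
    preorder(f, node->right);
}

template <class Process>
void inorder(Process f, const BinaryNode* node) {
    if (node == nullptr) return;
    inorder(f, node->left);
    f(node->data);                 // visit the node between the subtrees
    inorder(f, node->right);
}

template <class Process>
void postorder(Process f, const BinaryNode* node) {
    if (node == nullptr) return;
    postorder(f, node->left);
    postorder(f, node->right);
    f(node->data);                 // visit the node last
}

// One possible Process: append each visited character to a string.
struct AppendTo {
    std::string* out;
    void operator()(char c) const { *out += c; }
};
```

Inserting the letters N G B A F H M R Z X T Y in this order builds the example tree, and the three functions then reproduce the preorder, inorder and postorder sequences listed above.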
Chapter 19
• The entry in node is greater than all entries in its left subtree
• The entry in node is less than all entries in its right subtree
• Both subtrees are binary search trees
19.1.2 Examples
Two binary search trees
13 chicken
7 40 bread eggs
20 35
13 chicken
9 40 bread milk
7 smaller than 9
private:
    binary_tree_node *root_ptr;
};
Three cases:
1. The root is the node containing the entry we are looking for
2. The entry in the root node is larger than the target entry, then search left subtree
3. The entry in the root node is smaller than the target entry, then search right subtree
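The three cases, together with the empty tree as the stopping condition, lead directly to a recursive search. The struct BstNode and the function bstSearch are names chosen for this sketch.

```cpp
#include <cassert>

struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

bool bstSearch(const BstNode* root, int target) {
    if (root == nullptr)
        return false;                           // empty tree: not found
    if (root->data == target)
        return true;                            // case 1: found at the root
    if (root->data > target)
        return bstSearch(root->left, target);   // case 2: search left subtree
    return bstSearch(root->right, target);      // case 3: search right subtree
}
```

At every step one whole subtree is discarded, so the number of comparisons is bounded by the depth of the tree.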
b f
a c e g
d d d
b f b f b f
a c e g a c e g a c e g
start at the root d to the right since e > d to the left since e < f
Example
19 Binary Search Trees and B-Trees 121
g g g
d j d j d j
a h m a h m a e h m
• No children: easy
• One child: easy
• Two children: difficult
e e
e
a g a f
a g
f p f r
e h n r e h
n
move left subtree
inorder predecessor to the right of
inorder predecessor
This method tends to produce higher trees which is undesirable, and causes the operations to
take longer.
19.3 B-Trees
19.3.1 The Problem of Unbalanced Trees
A binary search tree can be used to implement a bag class. However, the efficiency of binary
search trees can go awry.
Suppose we add numbers in increasing order into a binary search tree. Then we get an
unbalanced tree in the sense that the levels of the tree are only sparsely filled. See Figure 11.3
of the text.
It takes a long time to look up the “maximal” item in such an unbalanced binary search tree.
To make algorithms on binary search trees efficient we need to reduce the depth of the trees.
Another important property of B-trees is that each node contains more than just a single entry.
Every B-tree depends on a positive constant integer called MINIMUM. This constant determines
how many entries are held in a single node.
B-tree Rule 1: The root may have as few as one entry (or even no entries if it also has no
children); every other node has at least MINIMUM entries.
B-tree Rule 2: The maximum number of entries in a node is twice the value of MINIMUM.
B-tree Rule 3: The entries of each B-tree node are stored in a partially filled array, sorted from
the smallest entry (at index 0) to the largest entry (at the final used position of the array).
B-tree Rule 4: The number of subtrees below a nonleaf node is always one more than the num-
ber of entries in the node.
B-tree Rule 5: For any nonleaf node: (a) An entry at index i is greater than all the entries in
subtree number i of the node, and (b) Any entry at index i is less than all the entries in subtree
number i + 1 of the node.
B-tree example
For each B-tree node, we use data_count to denote the actual number of the entries and
child_count to denote the actual number of children (subtrees).
From the above structure we can see that the subtrees of the B-tree root node (set) are pointed
to by pointers stored in the subset array of type set, that is, each array entry subset[i] is
a pointer to another set object.
Start with the entire B-tree, checking whether the target is in the root. If the target does
appear in the root, the search is done. If the target is not in the root and the root has no
children, the work is also done: the target is not in the tree.
If the target is not in the root and there are subtrees below, there is only one
possible subtree where the target can appear, so the algorithm makes a recursive call to search
that one subtree for the target.
The pseudocode is
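One way to write this search in C++ is sketched below. The struct BTreeNode is a simplified stand-in for the set class: the member names data_count, child_count and subset follow the text, while the array sizes, the int entry type and the function name contains are assumptions of this example.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t MINIMUM = 1;
const std::size_t MAXIMUM = 2 * MINIMUM;

struct BTreeNode {
    std::size_t data_count;           // number of entries in use
    int data[MAXIMUM + 1];            // sorted entries
    std::size_t child_count;          // number of subtrees in use
    BTreeNode* subset[MAXIMUM + 2];   // pointers to the subtrees
};

bool contains(const BTreeNode* node, int target) {
    // Find the first index i with data[i] >= target.
    std::size_t i = 0;
    while (i < node->data_count && node->data[i] < target)
        ++i;
    if (i < node->data_count && node->data[i] == target)
        return true;                  // the target is in this node
    if (node->child_count == 0)
        return false;                 // no children: the target is absent
    // By Rule 5, subtree i is the only one that can hold the target.
    return contains(node->subset[i], target);
}
```

Each recursive call descends one level, so the number of nodes examined is at most the depth of the B-tree.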
To illustrate how an insertion works, suppose we are developing a B-tree with MINIMUM=1,
so that each node has a maximum of two data items and three subtrees.
Suppose we wish to insert the following values into the tree: 25, 15, 8, 20, 23, 10, 17, and 12.
Because the root can accept two values, then at the beginning the root node contains items 15
and 25 (data[0]=15 and data[1]=25).
15 25 15
8 25
There is no room for the next item 8. To perform the insertion, we split the node in two and
form a new root with these nodes as children. The middle of the sorted values 8, 15 and 25 goes
into the root.
Because the next value 20 is greater than 15, it goes in the rightmost child.
15 15 23
8 20 25 8 20 25
There is no room for the next value 23, which should also be in the rightmost subtree. We sort
the values 20, 23, 25 and, as before, divide the node. The middle value 23 goes to the root,
while 20 and 25 proceed to separate nodes.
15 23
8 10 17 20 25
The value 12 also needs to go into the left child, but there is no room. Thus we sort 8, 10, 12,
place 8 and 12 in their own nodes, and send the middle value 10 to the parent. However, the
parent (the root) is already full. Consequently, we repeat the process. We sort 10, 15, and 23,
move the middle value 15 to a new root and place 10 and 23 in the leftmost and rightmost children,
respectively. Instead of propagating downward as with binary search trees, insertions into B-trees
propagate upward.
15
15 23
10 23
10
8 12 17 20 25 8 12 17 20 25
The algorithm is fully described in the textbook. Before you read the text, please try
to work out the algorithm yourself.
Chapter 20
Quadratic Sorting
One of the most common applications in computer science is sorting, the process through
which data are arranged according to their values. We discuss six sorting algorithms.
To learn the efficiency of various sorting methods and about limitations on the best possible
methods
To learn how to select a sorting technique that is well-matched to the characteristic of a problem
you need to solve
That is, if entry i comes before entry j in the array, then the key of entry i is less than or equal
to the key of entry j.
20 Quadratic Sorting 128
• An in-place sort of an array occurs within the array and uses no external storage or other
arrays
• In-place sorts are more efficient in space utilization
• An external sort uses primary memory for the data currently being sorted and secondary
storage for any data that will not fit in primary memory.
Stable Sorts
• A sort is stable if it preserves the ordering of elements with the same key.
• i.e. If elements e1 and e2 have the same key, and e1 appears earlier than e2 before
sorting, then e1 is located before e2 after sorting.
• Example:
◦ Suppose we have an array of names and addresses that are already sorted by name.
◦ We want to have an ordering of these people by city.
◦ We want to preserve the alphabetical ordering for each city.
◦ We must use a stable sorting algorithm.
The general idea of insertion sort is for each element, find the slot where it belongs.
At each step of the sort, we insert the next unsorted element into the sorted sub-array
Sorted Unsorted
hen
cow
cat
ram
ewe
dog
cow hen
cat
ram
ewe
dog
cat cow hen
ram
ewe
dog
cat cow hen ram
ewe
dog
cat cow ewe hen ram
dog
Post:
Algorithm:
In−Place Sort Y
Stable Algorithm Y
Recursive Algorithm N
The outer for loop is executed n − 1 times, from 1 to n − 1. On the i-th iteration we execute the
while loop to find and prepare the location for temp, the value of the i-th array element. If a[i]
is already in its proper location, the while loop does not execute. Thus the best case is when
the data is already sorted. In the worst case the array is in descending order, and
insertion sort needs a total of 1 + 2 + ... + (n − 1) = n(n − 1)/2
iterations of the while loop to place the array in ascending order. Now you can find the average
number of iterations.
One disadvantage of insertion sort on an array is the amount of data movement. In an
array of lengthy records, those reassignments can be quite time-consuming.
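The loops discussed above can be sketched in C++ as follows. The variable temp plays the role described in the text; the function name insertionSort and the template parameter are assumptions of this example.

```cpp
#include <cassert>
#include <string>

template <class Item>
void insertionSort(Item a[], int n) {
    for (int i = 1; i < n; ++i) {          // a[0..i-1] is already sorted
        Item temp = a[i];                  // next unsorted element
        int j = i;
        // Shift larger elements one place to the right to open a slot.
        while (j > 0 && temp < a[j - 1]) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = temp;                       // insert into the sorted sub-array
    }
}
```

Because an element is shifted only when temp is strictly smaller, elements with equal keys keep their original order, which is why insertion sort is stable.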
[Figure: selection sort trace from Unsorted to Sorted; intermediate states include "cat cow hen ram ewe dog", "cat cow dog ram ewe hen" and "cat cow dog ewe ram hen".]
Post:
Algorithm:
Post:
The function returns the index of the minimum element in the subarray a[i..(n − 1)].
Algorithm:
minIndex = i
for j from i+1 to n-1 do
if a[j] < a[minIndex] then
minIndex = j
return minIndex
In−Place Sort Y
Stable Algorithm N
Recursive Algorithm N
To discover the complexity, we notice that the outer loop is a for loop that always executes n − 1
times. The inner for loop, in the indexOfMin function, is performed (n − 1) − (i + 1) + 1 = n − i − 1
times. The total number of executions of the if statement in the inner loop is
$$(n-1) + \dots + 3 + 2 + 1 = \tfrac{1}{2}n^2 - \tfrac{1}{2}n$$
Regardless of the ordering of the data, the number of comparisons in a selection sort is O(n^2).
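The count can be confirmed experimentally. The sketch below implements indexOfMin as in the algorithm above and a selectionSort that calls it; the global counter comparisons is added here only to count executions of the if statement.

```cpp
#include <cassert>
#include <utility>

long comparisons = 0;   // number of executions of the if statement

// Index of the minimum element in the subarray a[i..(n-1)].
template <class Item>
int indexOfMin(const Item a[], int i, int n) {
    int minIndex = i;
    for (int j = i + 1; j <= n - 1; ++j) {
        ++comparisons;
        if (a[j] < a[minIndex])
            minIndex = j;
    }
    return minIndex;
}

template <class Item>
void selectionSort(Item a[], int n) {
    for (int i = 0; i < n - 1; ++i) {
        int minIndex = indexOfMin(a, i, n);
        std::swap(a[i], a[minIndex]);   // move the minimum into place
    }
}
```

For n = 6 the counter ends at 5 + 4 + 3 + 2 + 1 = 15, whatever the initial order of the array.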
Selection Sort:
• Insensitive to original ordering, slower than insertion sort if array is almost sorted
Recursive Sorting
21.1 Mergesort
We learn the idea of divide and conquer. The mergesort method is faster than Insertion and
Selection Sort, but not often used due to the time spent on combining the two subarrays.
21.2 Algorithm
21.2.1 Mergesort(a, first, last) Algorithm
Pre:
Post:
a[first..last] is sorted
Algorithm:
first, mid, and last are indices of a, where first ≤ mid ≤ last.
Post:
a[first..last] is sorted
Algorithm:
ndx1 = first
last1 = mid
ndx2 = mid+1
last2 = last
i = 0
i = 0
for j from first to last
a[j] = temp[i++]
21 Recursive Sorting 138
i = 0;
for(j = first; j <= last; j++)
a[j] = temp[i++];
delete [] temp;
}
In−Place Sort N
Stable Algorithm Y
Recursive Algorithm Y
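A complete sketch of mergesort is given below for reference. The names ndx1 and ndx2 follow the algorithm above, while mergeHalves and the use of std::vector for the temporary storage (instead of new and delete) are choices made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Merges the sorted halves a[first..mid] and a[mid+1..last].
template <class Item>
void mergeHalves(Item a[], int first, int mid, int last) {
    std::vector<Item> temp;
    temp.reserve(last - first + 1);
    int ndx1 = first;
    int ndx2 = mid + 1;
    while (ndx1 <= mid && ndx2 <= last) {
        if (a[ndx2] < a[ndx1])
            temp.push_back(a[ndx2++]);
        else
            temp.push_back(a[ndx1++]);   // ties come from the left half: stable
    }
    while (ndx1 <= mid)  temp.push_back(a[ndx1++]);   // rest of the left half
    while (ndx2 <= last) temp.push_back(a[ndx2++]);   // rest of the right half
    std::size_t i = 0;
    for (int j = first; j <= last; ++j)
        a[j] = temp[i++];                // copy the merged result back
}

template <class Item>
void mergesort(Item a[], int first, int last) {
    if (first < last) {                  // one element is already sorted
        int mid = first + (last - first) / 2;
        mergesort(a, first, mid);        // divide: sort the left half
        mergesort(a, mid + 1, last);     // divide: sort the right half
        mergeHalves(a, first, mid, last);   // conquer: combine the halves
    }
}
```

The temporary storage is the reason mergesort is not an in-place sort.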
21.3 Quicksort
We discuss the idea of quicksort, how it works for both arrays and linked lists, how to choose the
pivot, and how to implement it in C++ code.
The method is to choose an element as the pivot, then to partition the array into two subar-
rays:
cow rat
hen
cat ram
ewe
dog emu
dog
cow cat ewe hen rat
emu ram
dog ram
cat hen rat
cow emu
a is an array
Post:
Going from the left, find an entry which belongs on the right of the pivot (≥ pivot)
Going from the right, find an entry which belongs on the left of the pivot (≤ pivot)
pivot = ham
ham fish pork lamb veal beef
i loc
ham fish pork lamb veal beef
i loc
ham fish beef lamb veal pork
i loc
ham fish beef lamb veal pork
i loc
beef fish ham lamb veal pork
i loc
a is an array
first and last are indices within the range of a and first < last
Post:
a[first..(loc − 1)] contains elements less than or equal to the pivot
Algorithm:
i = first
loc = last + 1
pivot = a[first]
do
loc = loc - 1
while (a[loc] > pivot)
if (i < loc)
swap(a[i], a[loc])
swap(a[first], a[loc])
return loc;
}
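A full version of the partition step, consistent with the ham/fish/pork example above, might look as follows. The two scanning loops are a reconstruction and should be read as an assumption; the names quickPartition and quicksort are chosen for this sketch.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Partitions a[first..last] around the pivot a[first] and returns the
// final index of the pivot. Requires first < last.
template <class Item>
int quickPartition(Item a[], int first, int last) {
    int i = first;
    int loc = last + 1;
    Item pivot = a[first];
    while (i < loc) {
        // From the left, find an entry that belongs on the right (>= pivot).
        do { ++i; } while (i < last && a[i] < pivot);
        // From the right, find an entry that belongs on the left (<= pivot).
        do { --loc; } while (a[loc] > pivot);
        if (i < loc)
            std::swap(a[i], a[loc]);
    }
    std::swap(a[first], a[loc]);   // put the pivot between the two parts
    return loc;
}

template <class Item>
void quicksort(Item a[], int first, int last) {
    if (first < last) {
        int loc = quickPartition(a, first, last);
        quicksort(a, first, loc - 1);
        quicksort(a, loc + 1, last);
    }
}
```

On the array ham, fish, pork, lamb, veal, beef the partition returns 2 and leaves beef, fish, ham, lamb, veal, pork, as in the trace above.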
In both cases, the size of the subarray to be sorted at each recursive step is only reduced by
one.
The median-of-three method avoids the poor performance of quicksort when the elements are
already sorted.
Instead of choosing a[first] as the pivot, the median of the first, last and middle elements is
used.
A B C D E F G
A B C D E F G A<D<G
D B C A E F G
A B C D E F G After partition
In−Place Sort Y
Stable Algorithm N
Recursive Algorithm Y
Heap Sort
What is a heap?
Efficiency
comparison functions
P Level 0
A X Level 1
D O G B Level 2
E Level 3
Y E
U O I
A X
D O G
Y E
U O I
A 0
B 1 C 2
Tree:
D 3 E 4 F 5 G 6
8 10 12
H 7 I J 9 K L 11 M N 13 O 14
List: A B C D E F G H I J K L M N O
If a node has index k then its left child has index 2k+1 and its right
child has index 2k+2
22.2 Heaps
22.2.1 Definition of Heaps
Definition:
A heap is a complete binary tree in which the value stored in each node is greater than or equal
to the values of its children.
The heapsort function works on a list not on a tree, but it is always convenient to draw a tree
to show the hierarchical relationships between the entries of the list.
For Arrays:
(a) Swap the first element with the last element of the heap
grows
22.3 Algorithms
22.3.1 PlaceInHeap Algorithm
while ((not a heap) and (the child is not at the root)) do:
find the parent
if parent and child are out of order then
swap parent and child
move the child marker up the tree one level
else
we have a heap
childNX is an index of h
Post:
h[0..childNX] is a heap
parentNX = (childNX-1)/2
heap = FALSE
childNX = parentNX
parentNX = (childNX-1)/2
else
heap = TRUE
end if
end while
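A C++ rendering of PlaceInHeap might look like the sketch below. It assumes that h[0..childNX-1] is already a heap when the function is called, and it uses std::vector for the list; both are assumptions of this sketch rather than statements from the notes.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Pre:  childNX is a valid index of h and h[0..childNX-1] is a heap (assumed).
// Post: h[0..childNX] is a heap.
template <class Item>
void placeInHeap(std::vector<Item>& h, std::size_t childNX) {
    bool heap = false;
    while (!heap && childNX > 0) {
        std::size_t parentNX = (childNX - 1) / 2;
        if (h[parentNX] < h[childNX]) {
            // Parent and child are out of order: swap and move up one level.
            std::swap(h[parentNX], h[childNX]);
            childNX = parentNX;
        } else {
            heap = true;
        }
    }
}
```

Starting from the heap 9 7 8 1 with the new entry 10 appended at index 4, the call placeInHeap(h, 4) swaps 10 with 7 and then with 9, leaving 10 9 8 1 7.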
Post:
heap sorted
grows
Post:
h is a sorted array
h is an array
Post:
h is a sorted array
BuildHeap(h, n)
BuildSortTree(h, n)
PlaceInHeap(h, childNX);
}
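The two phases of heapsort can be sketched as one C++ function. This is an illustration under assumptions: the names heapSort and siftDown are not from the notes, the build phase inserts entries one at a time with a PlaceInHeap-style loop, and the sort phase swaps the first element with the last element of the heap and then restores the heap on the shortened range.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sift h[childNX] up until h[0..childNX] is a heap (largest value at the root).
template <class Item>
void placeInHeap(std::vector<Item>& h, std::size_t childNX) {
    while (childNX > 0) {
        std::size_t parentNX = (childNX - 1) / 2;
        if (!(h[parentNX] < h[childNX])) break;
        std::swap(h[parentNX], h[childNX]);
        childNX = parentNX;
    }
}

// Restore the heap property in h[0..size-1] after the root was replaced.
template <class Item>
void siftDown(std::vector<Item>& h, std::size_t size) {
    std::size_t k = 0;
    while (2 * k + 1 < size) {
        std::size_t child = 2 * k + 1;  // left child
        if (child + 1 < size && h[child] < h[child + 1]) ++child;  // larger child
        if (!(h[k] < h[child])) break;
        std::swap(h[k], h[child]);
        k = child;
    }
}

template <class Item>
void heapSort(std::vector<Item>& h) {
    // Phase 1: build the heap by placing one entry at a time.
    for (std::size_t childNX = 1; childNX < h.size(); ++childNX)
        placeInHeap(h, childNX);
    // Phase 2: move the largest remaining entry behind the heap,
    // so the sorted part at the end of the list grows by one each time.
    for (std::size_t size = h.size(); size > 1; --size) {
        std::swap(h[0], h[size - 1]);
        siftDown(h, size - 1);
    }
}
```

The sort is in place and uses loops rather than recursion, which agrees with the property table that follows.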
In-Place Sort Y
Stable Algorithm N
Recursive Algorithm N
Comparisons Assignments
Quicksort O(n^2) O(n^2)
Graphs
A graph, like a tree, is a nonlinear data structure consisting of nodes and links between the
nodes.
Example
[Figure: two example graphs, one with vertices a, b, c, d, e, f and one with vertices frog, toad, flea, bat, donkey.]
If each edge in a graph has an orientation connecting its first vertex, called the edge’s source, to
its second vertex, called the edge’s destination or target, then the graph is called a directed graph.
Example
23 Graphs 153
An undirected graph can be considered as a directed graph if each edge in the undirected
graph is considered as two edges (two directions) in the directed graph.
A path in a graph is a sequence of vertices, v0, v1, ..., vn, such that each consecutive pair of
vertices vi and vi+1 is connected by an edge. In a directed graph, the connection must go
from the source vi to the target vi+1, i.e., (vi, vi+1) is an edge of the graph.
In principle, a graph may have two or more edges connecting the same two vertices in the
same direction. These are called multiple edges. A loop is an edge that connects a vertex to itself.
If there are no loops and no multiple edges in an (undirected or directed) graph, the graph is
called a simple graph. We only investigate simple graphs in this course.
Dynamic:
We write
• TRUE = 1
• FALSE = 0
[Figure: undirected graph with vertices a, b, c, d.]
count = 4
          0 1 2 3
nodes = [ a b c d ]

            0 1 2 3
edges = 0 [ 0 0 1 1 ]
        1 [ 0 0 0 1 ]
        2 [ 1 0 0 1 ]
        3 [ 1 1 1 0 ]
The adjacency matrix of an undirected graph is always symmetric. This may not be true for a
directed graph. Why?
#define NUMNODES 40
private:
size_t count;
Item nodes[NUMNODES];
bool edges[NUMNODES][NUMNODES];
...
}
The same technique also works with any undirected graph. We use an array of pointers to the
linked lists to represent the graph.
Following the same idea, a graph can be represented by an array of sets of integers. For exam-
ple, to represent a graph with 10 vertices, we can declare an array of 10 sets of integers. The
i-th set contains the vertex numbers of all of the vertices that vertex i is connected to.
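The array-of-sets representation can be sketched with the standard library. The type alias SetGraph and the two helper functions are assumptions for illustration; std::vector stands in for the array and std::set for each set of vertex numbers.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// g[i] holds the numbers of all vertices that vertex i is connected to.
using SetGraph = std::vector<std::set<std::size_t>>;

// Add the directed edge source -> target.
void addDirectedEdge(SetGraph& g, std::size_t source, std::size_t target) {
    g[source].insert(target);
}

// Is there an edge source -> target?
bool isEdge(const SetGraph& g, std::size_t source, std::size_t target) {
    return g[source].count(target) > 0;
}
```

A graph with 10 vertices is then declared as SetGraph g(10), which creates 10 empty sets. Compared with the adjacency matrix, this uses space proportional to the number of edges rather than to the square of the number of vertices.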
Adding an edge means setting an appropriate element of the array edges to the value true.
The above code is for a directed graph. Can you change it for an undirected graph?
Removing an edge from a graph is just as simple: set the corresponding element of edges to false.
Other methods include extracting entry information of the i-th node and getting all edges for
a given node.
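The edge operations on the adjacency matrix can be sketched as below. The class name MatrixGraph and its member function names are assumptions; only the three data members come from the declaration shown earlier. A constant replaces the #define, and addUndirectedEdge shows one possible answer to the question about undirected graphs: set both directions.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t NUMNODES = 40;

// Hypothetical adjacency-matrix graph; Item must be default-constructible.
template <class Item>
class MatrixGraph {
public:
    MatrixGraph() : count(0), edges() {}  // edges() sets every entry to false

    // Store a new vertex and return its index.
    std::size_t addVertex(const Item& label) {
        assert(count < NUMNODES);
        nodes[count] = label;
        return count++;
    }
    // Directed edge: only one matrix entry changes.
    void addEdge(std::size_t source, std::size_t target) {
        assert(source < count && target < count);
        edges[source][target] = true;
    }
    // Undirected edge: keep the matrix symmetric.
    void addUndirectedEdge(std::size_t u, std::size_t v) {
        addEdge(u, v);
        addEdge(v, u);
    }
    void removeEdge(std::size_t source, std::size_t target) {
        assert(source < count && target < count);
        edges[source][target] = false;
    }
    bool isEdge(std::size_t source, std::size_t target) const {
        assert(source < count && target < count);
        return edges[source][target];
    }
    std::size_t size() const { return count; }

private:
    std::size_t count;
    Item nodes[NUMNODES];
    bool edges[NUMNODES][NUMNODES];
};
```

Each operation touches a fixed number of matrix entries, so adding, removing and testing an edge all take constant time.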
Chapter 24
Process v
Find the next unprocessed node adjacent to v (or the target from v) and repeat.
Process v
Recursively search each unprocessed node adjacent to v (or the target from v)
The algorithm continues until the node is found, or we have exhausted every possibility
24 Algorithms for Graphs 158
24.1.4 Implementation
As we discussed in Lecture 23, an undirected graph can be considered as a directed graph.
Hence we simply consider the implementation for directed graphs.
The textbook shows the Depth-First Search algorithm with an example on pages 717-720 and
the Breadth-First Search algorithm with another example on pages 720-722.
1. Check that the start vertex is a valid vertex number of the graph
2. Set all the components of the marked array to false; marked[v] records whether vertex v
has been processed by the algorithm
3. Call a separate recursive function that processes the start vertex, its neighbors, the
neighbors' neighbors, and so on, until all reachable vertices are processed.
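The three steps can be sketched as follows. This is an illustration under assumptions: the graph is passed as a plain adjacency matrix (the alias AdjMatrix), "processing" a vertex just records its number in a result list, and the function names are not taken from the textbook.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using AdjMatrix = std::vector<std::vector<bool>>;  // edges[v][w]: edge v -> w

// Process v, then recursively search each unprocessed target of v.
void dfsVisit(const AdjMatrix& edges, std::size_t v,
              std::vector<bool>& marked, std::vector<std::size_t>& order) {
    marked[v] = true;
    order.push_back(v);  // "process" v by recording it
    for (std::size_t w = 0; w < edges.size(); ++w)
        if (edges[v][w] && !marked[w])
            dfsVisit(edges, w, marked, order);
}

// Returns the vertices reachable from start in depth-first order.
std::vector<std::size_t> depthFirst(const AdjMatrix& edges, std::size_t start) {
    assert(start < edges.size());                   // step 1
    std::vector<bool> marked(edges.size(), false);  // step 2
    std::vector<std::size_t> order;
    dfsVisit(edges, start, marked, order);          // step 3
    return order;
}
```

On the four-vertex graph a, b, c, d from Lecture 23 (indices 0 to 3), a search from vertex 0 visits 0, 2, 3, 1: it goes as deep as possible through c and d before reaching b.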
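The Breadth-First Search is implemented with a queue of vertex numbers. The start vertex is
processed, marked, and placed in the queue. Then the following steps are repeated until the
queue is empty:
1. Remove a vertex v from the front of the queue
2. For each unmarked neighbor u of v: process u, mark u, and place u in the queue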
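A queue-based breadth-first search can be sketched as below. As with any sketch here, the details are assumptions: the graph is a plain adjacency matrix (AdjMatrix), processing a vertex means recording its number, and the loop body is the standard formulation (take a vertex from the front of the queue, then process, mark and enqueue each of its unmarked neighbors).

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

using AdjMatrix = std::vector<std::vector<bool>>;  // edges[v][w]: edge v -> w

// Returns the vertices reachable from start in breadth-first order.
std::vector<std::size_t> breadthFirst(const AdjMatrix& edges, std::size_t start) {
    assert(start < edges.size());
    std::vector<bool> marked(edges.size(), false);
    std::vector<std::size_t> order;
    std::queue<std::size_t> pending;

    // The start vertex is processed, marked, and placed in the queue.
    order.push_back(start);
    marked[start] = true;
    pending.push(start);

    while (!pending.empty()) {
        std::size_t v = pending.front();
        pending.pop();
        for (std::size_t w = 0; w < edges.size(); ++w) {
            if (edges[v][w] && !marked[w]) {
                order.push_back(w);
                marked[w] = true;
                pending.push(w);
            }
        }
    }
    return order;
}
```

Marking a vertex when it enters the queue, rather than when it leaves, guarantees that no vertex is queued twice.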
Example
[Figure: a network of Australian cities (Darwin, Brisbane, Sydney, Perth, Adelaide, Hobart) with edge weights 2600, 2700, 2500, 3500, 700, 1600 and 900.]
The implementation for a network is similar to graph implementation. The adjacency matrix
stores numbers instead of booleans.
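[Figure: the network with vertices 0 to 4 and edge weights 50 (0-4), 70 (2-3), 120 (2-4) and 100 (3-4).]
count = 5
nodes = [0 1 2 3 4]

             0    1    2    3    4
edges = 0 [  0    0    0    0   50 ]
        1 [  0    0    0    0    0 ]
        2 [  0    0    0   70  120 ]
        3 [  0    0   70    0  100 ]
        4 [ 50    0  120  100    0 ]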
A tree is a connected graph with no cycles. A spanning tree of a connected graph is a tree whose
edges are in the graph and that contains all of the vertices in the graph.
A minimal spanning tree of a connected network is a spanning tree that has the smallest possible
weight sum.
[Figure: three diagrams of a network with vertices Boston, San Francisco and Atlanta and edge weights 3100, 1000 and 2500.]
Prim’s algorithm is used to find a minimal spanning tree t of a connected network g of n vertices:
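A simple version of Prim's algorithm on a weight matrix can be sketched as follows. The assumptions are loud ones: weights are positive integers, 0 in the matrix means "no edge" (as in the network matrix above), the tree is grown from vertex 0, and the function names are invented for this sketch. In each round it adds the cheapest edge that joins a vertex inside the tree to a vertex outside it.

```cpp
#include <cstddef>
#include <vector>

using WeightMatrix = std::vector<std::vector<int>>;  // 0 means "no edge"

// Returns fromVertex, where fromVertex[v] is the tree vertex through which
// v was attached (fromVertex[0] is 0 for the start vertex).
std::vector<std::size_t> primTree(const WeightMatrix& g) {
    const std::size_t n = g.size();
    std::vector<bool> inTree(n, false);
    std::vector<std::size_t> fromVertex(n, 0);
    if (n == 0) return fromVertex;
    inTree[0] = true;
    for (std::size_t added = 1; added < n; ++added) {
        std::size_t bestFrom = 0, bestTo = 0;
        int bestWeight = 0;  // 0 means "no candidate edge found yet"
        for (std::size_t u = 0; u < n; ++u) {
            if (!inTree[u]) continue;
            for (std::size_t v = 0; v < n; ++v) {
                if (inTree[v] || g[u][v] == 0) continue;
                if (bestWeight == 0 || g[u][v] < bestWeight) {
                    bestWeight = g[u][v];
                    bestFrom = u;
                    bestTo = v;
                }
            }
        }
        if (bestWeight == 0) break;  // the network is not connected
        inTree[bestTo] = true;
        fromVertex[bestTo] = bestFrom;
    }
    return fromVertex;
}

// Weight sum of the tree described by fromVertex.
int treeWeight(const WeightMatrix& g, const std::vector<std::size_t>& fromVertex) {
    int total = 0;
    for (std::size_t v = 1; v < fromVertex.size(); ++v)
        total += g[fromVertex[v]][v];
    return total;
}
```

This version rescans all tree vertices in every round, so it takes time proportional to n cubed; it is meant to show the idea, not to be efficient.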
Dijkstra’s algorithm finds the shortest path from vertex 0 to all other vertices (1, 2, . . ., n-1). A
shortest path between two vertices may not cover every vertex of the network, because the
algorithm is not required to produce a spanning tree.
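\[
\mathit{FromVertex}[i] \leftarrow
\begin{cases}
0 & \text{if } 0 \text{ and } i \text{ are adjacent,} \\
-1 & \text{otherwise}
\end{cases}
\]
\[
\mathit{processed}[i] \leftarrow
\begin{cases}
\mathrm{TRUE}\ (1) & \text{if } i = 0, \\
\mathrm{FALSE}\ (0) & \text{otherwise}
\end{cases}
\]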
The text proposes another version of the algorithm which simply outputs the shortest distance
between the start vertex and all other vertices. Please read the pseudocode on page 735 carefully
and see the example of Dijkstra’s algorithm on pages 729-738.
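The distance-only version can be sketched as below. This is not the textbook's pseudocode: the function name, the UNREACHABLE constant and the convention that 0 in the weight matrix means "no edge" are assumptions of this sketch, and all weights are assumed to be non-negative. In each round the closest unprocessed vertex is processed and the distances of its neighbors are improved where possible.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using WeightMatrix = std::vector<std::vector<int>>;  // 0 means "no edge"
const int UNREACHABLE = std::numeric_limits<int>::max();

// Returns the shortest distance from start to every vertex of g.
std::vector<int> shortestDistances(const WeightMatrix& g, std::size_t start) {
    const std::size_t n = g.size();
    assert(start < n);
    std::vector<int> dist(n, UNREACHABLE);
    std::vector<bool> processed(n, false);
    dist[start] = 0;
    for (std::size_t round = 0; round < n; ++round) {
        // Pick the closest vertex that has not been processed yet.
        std::size_t next = n;
        for (std::size_t v = 0; v < n; ++v)
            if (!processed[v] && dist[v] != UNREACHABLE &&
                (next == n || dist[v] < dist[next]))
                next = v;
        if (next == n) break;  // the remaining vertices cannot be reached
        processed[next] = true;
        // Try to shorten the distance to each unprocessed neighbor of next.
        for (std::size_t v = 0; v < n; ++v)
            if (g[next][v] != 0 && !processed[v] &&
                dist[next] + g[next][v] < dist[v])
                dist[v] = dist[next] + g[next][v];
    }
    return dist;
}
```

On the five-vertex network shown earlier, starting from vertex 0, the distances are 50 to vertex 4, 150 to vertex 3 (via 4) and 170 to vertex 2 (via 4), while vertex 1 has no edges and stays unreachable.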