Chapter 1 - Data Struct N Algo
1
Introduction to Data Structures
Syllabus :
Data Management concepts, Data types - primitive and non-primitive, Performance
Analysis and Measurement (Time and space analysis of algorithms - Average, best
and worst case analysis), Types of Data structures - Linear and Non-Linear Data
Structures.
1.1 Definition :
1.1.1 Data :
Fig. 1.1 shows an example of atomic and composite data. In this example, a date of birth (say
20/08/2006) can be separated into three atomic values: the first gives the day of the month,
the second gives the month and the last gives the year.
Data & File Structures (GTU) 1-2 Introduction to Data Structures
A data object is a set of elements; the set may be finite or infinite.
For example, the set of employees working in a bank is a finite set, whereas the set of
natural numbers is an infinite set.
Primitive :
Integers, reals, logical data, character data, pointers and references are primitive data
structures. These data types are available in most programming languages as built-in types. Data
objects of primitive data types can be operated upon by machine-level instructions.
Non-Primitive :
These data structures are derived from primitive data structures. A set of homogeneous
or heterogeneous data elements is stored together. Examples of non-primitive data
structures : array, structure, union, linked list, stack, queue, tree, graph.
2. Linear and Non-Linear Data Structure :
Linear :
Elements are arranged in a linear fashion (one dimension). All one-one relations can be
handled through linear data structures. Lists, stacks and queues are examples of linear data
structures. Fig. 1.2 shows the representation of a linear data structure.
Non-Linear :
All one-many, many-one or many-many relations are handled through non-linear data
structures. Every data element can have a number of predecessors as well as successors. Trees,
graphs and tables are examples of non-linear data structures (Refer Fig. 1.3). The representation
of a binary tree in linked and array structure is given in Fig. 1.4.
(a) Representation of the binary tree through linked structure
(b) Representation of binary tree through an array
Fig. 1.4 : Representation of tree of Fig. 1.3(a)
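The two representations named in Fig. 1.4 can be sketched in C++ as follows. This is only an illustration under assumptions: the names Node, kEmpty and toArray are hypothetical, and the index rule used (children of index i at 2*i + 1 and 2*i + 2, root at index 0) is one common convention, not necessarily the one drawn in the figure.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linked representation: each node stores its data and two child pointers.
struct Node {
    char data;
    Node* left;
    Node* right;
};

// Assumed sentinel marking an unused slot in the array representation.
const char kEmpty = '\0';

// Array representation: copies the subtree rooted at 'node' into 'out',
// placing that node at index i and its children at 2*i + 1 and 2*i + 2.
void toArray(const Node* node, std::size_t i, std::vector<char>& out) {
    if (node == nullptr) return;
    if (i >= out.size()) out.resize(i + 1, kEmpty);  // grow, filling gaps
    assert(i < out.size());
    out[i] = node->data;
    toArray(node->left, 2 * i + 1, out);
    toArray(node->right, 2 * i + 2, out);
}
```

For a hypothetical tree with root A, children B and C, and D as the left child of B, calling toArray on the root with index 0 fills the array with A, B, C, D in that order.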
Static :
In case of a static data structure, memory for objects is allocated at the time the
program is loaded. The amount of memory required is determined by the compiler during compilation.
Example : int A[50]; // C++ Statement
Remember the following points :
• A static data structure may cause under-utilization of memory.
• A static data structure may cause overflow.
• There is no re-usability of allocated memory.
• It is difficult to guess the exact size of the data at the time of writing the program.
Dynamic :
In case of dynamic data structure, the memory space required by variables is calculated
and allocated during execution.
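The contrast between static and dynamic allocation can be sketched as below. The array A repeats the static example given earlier; the function makeDynamic is a hypothetical name, and std::vector is used here as one convenient way of obtaining memory during execution (new/delete or malloc/free would illustrate the same point).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Static: the size (50) is fixed when the program is compiled.
int A[50];

// Dynamic: n is known only at run time. std::vector requests memory for
// n integers during execution and releases it automatically.
std::vector<int> makeDynamic(std::size_t n) {
    std::vector<int> v(n);                 // allocation happens here
    assert(v.size() == n);                 // size was chosen at run time
    for (std::size_t i = 0; i < n; ++i) {
        v[i] = static_cast<int>(i);
    }
    return v;
}
```

Unlike A, the vector returned by makeDynamic can still grow afterwards (for example through push_back), so the exact size need not be guessed when the program is written.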
Persistent :
A data structure is said to be persistent if it can be accessed but cannot be modified in place.
Any update operation on such a data structure results in two versions: the previous version
is saved and all changes are made in the new version. Functional data structures are persistent.
Ephemeral :
If we are able to create data cells and modify their contents, we can create ephemeral
data structures. These are data structures that change over time.
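A minimal sketch of the persistent idea, assuming a stack built from immutable cells: the names Cell, Stack, push, top and pop are hypothetical. Each push returns a new version and leaves the old version untouched, whereas an ephemeral container such as std::vector is changed in place by push_back.

```cpp
#include <cassert>
#include <memory>

// One immutable cell of the stack; versions share their common tail.
struct Cell {
    int value;
    std::shared_ptr<const Cell> next;
};

// A version of the stack is a pointer to its top cell (nullptr = empty).
using Stack = std::shared_ptr<const Cell>;

// Returns a NEW version with v on top; the version s is not modified.
Stack push(const Stack& s, int v) {
    return std::make_shared<Cell>(Cell{v, s});
}

// Returns the top value of a non-empty version.
int top(const Stack& s) {
    assert(s != nullptr);
    return s->value;
}

// Returns the version without its top cell; s itself is not modified.
Stack pop(const Stack& s) {
    assert(s != nullptr);
    return s->next;
}
```

If v1 = push(v0, 1) and v2 = push(v1, 2), then top(v2) is 2 while top(v1) is still 1: both versions remain accessible after the update.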
1.2 Algorithm :
1.2.1 Definition :
The word Algorithm comes from the name of the Persian author Abu Jafar Mohammad
ibn Musa al Khowarizmi (c. 825 A.D.), who wrote a textbook on mathematics (see
Fig. 1.5). The word has taken on special significance in computer science, where Algorithm has
come to refer to a method that can be used by a computer for the solution of a problem. This is
what distinguishes algorithm from words such as process, technique or method.
Fig. 1.5 : Abu Jafar Mohammad ibn Musa al Khowarizmi (825 AD)
Fig. 1.6
• Input : All the algorithms should have some input. The logic of the algorithm should
work on this input to give the desired result.
• Output : At least one output should be produced from the algorithm based on the input
given.
Example :
If we write a program to check the given number to be prime or not, we should get an
output as ‘number is prime’ or ‘number is not prime’.
• Definiteness : Every step of the algorithm should be clear and unambiguous.
• Finiteness : Every algorithm should have a proper end. It must terminate after a finite
number of steps.
Example :
If we write an algorithm to find the number of primes between 10 and 100, the output
should give the number of primes and execution should end. The computation should not get
into an infinite loop.
• Effectiveness : Every step in the algorithm should be easy to understand and should be
implementable in any programming language.
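The two examples used in the properties above (testing whether a number is prime, and counting the primes between 10 and 100) can be sketched as follows. The function names isPrime and countPrimes are hypothetical, and trial division is just one simple method chosen for illustration.

```cpp
#include <cassert>

// Output property: returns true ("number is prime") or false
// ("number is not prime") for the given input n.
bool isPrime(int n) {
    if (n < 2) return false;
    // Finiteness: d increases on every pass, so the loop must end.
    for (int d = 2; d * d <= n; ++d) {
        if (n % d == 0) return false;
    }
    return true;
}

// Counts the primes in the closed range [lo, hi] and then stops.
int countPrimes(int lo, int hi) {
    assert(lo <= hi);
    int count = 0;
    for (int n = lo; n <= hi; ++n) {
        if (isPrime(n)) ++count;
    }
    return count;
}
```

Calling countPrimes(10, 100) returns 21 and the execution ends, since both loops have fixed upper bounds.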
An algorithm is described as a series of steps of basic operations. These steps must be
performed in sequence. Each step of the algorithm is labelled.
An algorithm is the backbone of any implementation in a chosen language. One can
implement an algorithm very effectively if some method is followed in writing it.
To make the approach systematic, we use an algorithmic notation. Let us
see one example of an algorithm. Refer Fig. 1.7.
Fig. 1.7
Step 1 : Start.
Step 2 : Read two positive integers and store them in X and Y.
Step 3 : Divide X by Y. Let the remainder be R and the quotient be Q.
Step 4 : If R is zero then go to step 8.
Step 5 : Assign Y to X.
Step 6 : Assign R to Y.
Step 7 : Go to Step 3.
Step 8 : Print Y (the required GCD).
Step 9 : Stop.
The steps mentioned in the above algorithm are simple and unambiguous. Anybody
carrying out these steps will know exactly what to do at each step. Hence, the above algorithm
satisfies the definiteness property of an algorithm.
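The steps of the GCD algorithm above can be sketched in C++ as follows. The function name gcdBySteps is hypothetical, and the inputs are passed as arguments instead of being read from the user; the comments map each line back to the step it implements. The quotient Q is not needed for the result and is therefore not computed.

```cpp
#include <cassert>

// Returns the GCD of two positive integers x and y.
int gcdBySteps(int x, int y) {
    assert(x > 0 && y > 0);   // Step 2 : two positive integers X and Y
    int r = x % y;            // Step 3 : remainder R of X divided by Y
    while (r != 0) {          // Step 4 : if R is zero, go to Step 8
        x = y;                // Step 5 : assign Y to X
        y = r;                // Step 6 : assign R to Y
        r = x % y;            // Step 7 : go back to Step 3
    }
    return y;                 // Step 8 : Y is the required GCD
}
```

For example, gcdBySteps(48, 18) passes through the remainders 12, 6 and 0 and returns 6.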
For a sorting algorithm, the number of inputs is the total number of elements to be
arranged in a specific order. The number of outputs is the total number of sorted elements.
The number of operations involved in the algorithm.
Example :
For a searching algorithm, the number of operations is equal to the total number of
comparisons made with the search element.
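As an illustration of counting operations, the sketch below performs a linear search and records how many comparisons are made with the search element. Linear search is an assumed choice (the text does not name a particular searching algorithm), and SearchResult and linearSearch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SearchResult {
    int index;                 // position of the key, or -1 if absent
    std::size_t comparisons;   // comparisons made with the search element
};

// Scans the array from left to right until the key is found.
SearchResult linearSearch(const std::vector<int>& a, int key) {
    SearchResult r{-1, 0};
    for (std::size_t i = 0; i < a.size(); ++i) {
        ++r.comparisons;                   // one comparison per element
        if (a[i] == key) {
            r.index = static_cast<int>(i);
            break;
        }
    }
    assert(r.comparisons <= a.size());     // never more than n comparisons
    return r;
}
```

With an array of 6 elements, finding the first element costs 1 comparison, while finding the last element or searching for an absent one costs 6.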
• If we are searching for an element in an array having ‘n’ elements, the problem size is the
same as the number of elements in the array to be searched. Here the problem size and the
input size are the same and are equal to ‘n’.
• If we are sorting elements in an array, there might be some copy operations (swaps)
involved. The problem size could be the number of elements in the array or the number
of copies performed during sorting.
• If two arrays of size n and m are merged, the problem size is the sum of the two array
sizes (= n+m).
• If the factorial of n is being computed, the problem size is n.
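The merge case in the list above can be sketched as follows, assuming both input arrays are already sorted in ascending order (an assumption made here for illustration; the name mergeSorted is hypothetical). Every element of both arrays is copied exactly once, which is why the problem size is n+m.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Merges two ascending arrays of sizes n and m into one ascending array.
std::vector<int> mergeSorted(const std::vector<int>& a,
                             const std::vector<int>& b) {
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    // Repeatedly copy the smaller of the two front elements.
    while (i < a.size() && j < b.size()) {
        if (a[i] <= b[j]) out.push_back(a[i++]);
        else              out.push_back(b[j++]);
    }
    // Copy whatever remains in either array.
    while (i < a.size()) out.push_back(a[i++]);
    while (j < b.size()) out.push_back(b[j++]);
    assert(out.size() == a.size() + b.size());   // n + m elements handled
    return out;
}
```

Merging {1, 4, 9} with {2, 3} gives {1, 2, 3, 4, 9}, a result of 3 + 2 = 5 elements.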
higher values of ‘n’, the effect of ‘c’ (constant) is not significant. Thus, the constant
can be ignored.
Example 2 :
for (i = 0; i < n; i++)
{
    for (j = 0; j < m; j++)
    {
        /* assume we have 'c' number of statements inside the loop */
        ……..
        ……..
    }
}
In the above example, based on earlier assumptions,
The total time for one execution of the body of the inner loop = c*1 = c
Since the body of the inner loop is executed ‘m’ times,
The total time of execution of the inner loop = m*c
As per the nested loop concept, for every iteration of the outer loop, the inner loop
is executed ‘m’ times.
Since the outer loop is executed ‘n’ times, we have
The total time of execution of the algorithm = n*(m*c)
Since ‘c’ is a constant, the order of magnitude of the above algorithm is
approximately equal to ‘n*m’.
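The count n*(m*c) can be checked with a small sketch in which the 'c' statements of the loop body are replaced by adding c to a counter. The function name countOperations is hypothetical, and each statement is assumed to cost one unit of time, as in the analysis above.

```cpp
#include <cassert>

// Returns the number of unit-time statements executed by the nested loop.
long countOperations(long n, long m, long c) {
    assert(n >= 0 && m >= 0 && c >= 0);
    long count = 0;
    for (long i = 0; i < n; i++) {        // outer loop : n iterations
        for (long j = 0; j < m; j++) {    // inner loop : m iterations each
            count += c;                   // stands for the c statements
        }
    }
    assert(count == n * (m * c));         // matches the derived total
    return count;
}
```

For n = 3, m = 4 and c = 5 the function returns 60. Doubling n or m doubles the result for any value of c, which is why only n*m matters for the order of magnitude.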
The calculation of the order of magnitude in the examples discussed above is the a priori
analysis of the algorithm.
1.2.7 Worst Case, Average Case and Best Case Running Time of an Algorithm :
While analyzing an algorithm based on a priori analysis principles, three different cases
are identified. It is important to note that this classification is based purely on the nature of
the problem for which we are developing the algorithm.
Worst case :
Average case :
Note