
Unit- 1. Introduction to Data Structures.

- Data : Data is a collection of information in raw form; when it becomes meaningful after processing, it is called Information.
- Data type : A data type refers to the type/kind of data that a variable can hold/store.
- e.g. in C programming, the data types are int, float, char, double etc.
- Data object : A data object refers to a set of elements; such a set may be finite or infinite.
- e.g. the set of students studying in second year is a finite set; the set of natural numbers is an infinite set.

- Definition of Data structures : -


- A data structure is a mathematical or logical way of representing or organizing data in memory, where not only the data items stored in it but also the relationships among the data items are considered.

- Data structure specifies the following four things:


1. how the data should be organized in computer,
2. how the flow of data should be controlled,
3. how a data can be utilized, and
4. how data can be manipulated.

- Therefore, Data Structure = Organized Data + Allowed Operations

- For writing an efficient algorithm, proper data structures should be selected so that,

o The relationships between the data elements can be expressed.
o Processing and accessing of the data should be easy.

- Need of Data Structure :-

- Data structures and algorithms are interrelated; together they enable efficient problem solving.
- Algorithms manipulate the data in these structures in various ways, such as inserting a new data element, searching for a particular element, or sorting the elements.
- Thus the study of data structures helps us create and identify the best data structure and select the most suitable algorithm to execute on it for efficient operation.
- E.g. in a sequential data structure implemented using an array, accessing data is very easy but modifications are difficult; if we implement the sequential data structure using a linked list instead, modification becomes very easy but accessing a large list becomes inefficient.
- Thus we need to study data structures so that we can select the most suitable data structure and algorithm to solve a given problem.

Abstract Data Type : -


- Abstraction of the primitive data types (int, char, float) is provided by the compiler itself: we can use a variable of the int data type and perform various operations on it without knowing how the compiler allocates the memory, how it stores values inside the int variable, etc.
- But sometimes the built-in data types are NOT enough to handle complex data.
- So it is the programmer's responsibility to create their own (user-defined) data types, specifying where and how the data values are stored and the operations that can be carried out on them. Such custom data types are called Abstract Data Types, e.g. struct in 'C'.
- ADT = Type + Function Names + Behavior of Each Function



- Stack, Queue, List, Tree and Graph, along with their operations, are examples of ADTs.

- While creating an ADT, one has to define the following things:

1. The name of the type, so the programmer/others can refer to it.


2. The names of all primitive operations over elements of that type.
3. The conditions, under which these operations are applicable.
4. A precise specification of the behavior of those operations, when they are applicable.

- Advantages of ADT as compared to primitive data types:


1. An ADT is reusable and ensures a robust data structure.
2. Encapsulation ensures that data cannot be corrupted.
3. It reduces coding effort.
4. The ADT is a useful guideline to implementers and a useful tool to programmers who wish to use the data type correctly (a small sketch follows this list).
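
- For illustration, a minimal sketch of a user-defined ADT in 'C' is given below: a complex-number type together with its allowed operations. The names (Complex, cadd, cmagnitude) are assumptions chosen for this example, not a fixed standard; a user of this type need not know how the values are stored internally.

#include <math.h>

/* The type */
typedef struct {
    double re, im;                      /* internal representation, hidden from the user */
} Complex;

/* The operation names and their behavior */
Complex cadd(Complex a, Complex b)      /* behavior: component-wise sum */
{
    Complex r = { a.re + b.re, a.im + b.im };
    return r;
}

double cmagnitude(Complex a)            /* behavior: |a| = sqrt(re*re + im*im) */
{
    return sqrt(a.re * a.re + a.im * a.im);
}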

Types / Classification Of Data Structure : -



Primitive Data Structures:
- Primitive data structures are basic data structures that are directly operated upon by machine instructions.
- e.g. integer, character. Normally, primitive data structures have different representations on different computers.
- A primitive data type defines how the data will be internally represented in, stored in and retrieved from the memory. Examples are integer, character, real/float numbers etc.

Non-Primitive Data Structures : -

- The non-primitive data structures are highly developed complex data structures.
- Non-primitive data structures are derived from the primitive data structures.
- The non-primitive data structures emphasize the structuring of a group of homogeneous (same type) or heterogeneous (different types) data items.
- Examples of non-primitive data structure in 'C' Language are arrays, structures, Union, linked list,
queue, tree, graph.
For example:
int a[10];
char b[10];
int arr1[5] = {10,20,30,40,50};

- Structure is the collection of heterogeneous (variables of different types) data elements.


struct address
{
char name[30];
char street[50];
char city[30];
char state[15];
int Pin;
};

- Linear Data Structure :-


- A linear data structure is one in which all its elements form a sequence.
- The different types of Linear data structures are : -

1. Array:

- An array is a collection of homogeneous (similar type) data elements.


- All the elements in an array are stored contiguously.
- The elements of array are stored in linear fashion.
- Array declared as int a[10];



2. Linked list : -

- A linked list is an alternative to arrays: if we do not know the upper limit on the number of elements in advance, a linked list is one of the options.
- In this organization, data can be placed anywhere in memory, and the data items are linked to each other using a pointer field. This pointer field stores the address of the next element in the list.
- A linked list is a collection of nodes, where each node consists of two parts, i.e. data and a pointer to the next node.
- The first node in a linked list is called the head/start node, which points to the first data element in the list, and the last node contains a null pointer, as shown in Fig. above.
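
- For example, a minimal sketch of such a node and a two-element list in 'C' (assuming an integer data field; all names are illustrative):

#include <stdio.h>
#include <stdlib.h>

struct node
{
    int data;              /* the data part            */
    struct node *next;     /* address of the next node */
};

int main(void)
{
    /* Build the list: head -> 10 -> 20 -> NULL */
    struct node *second = malloc(sizeof(struct node));
    second->data = 20;   second->next = NULL;     /* last node: null pointer */

    struct node *head = malloc(sizeof(struct node));
    head->data = 10;     head->next = second;

    for (struct node *p = head; p != NULL; p = p->next)
        printf("%d ", p->data);                   /* prints: 10 20 */

    free(second);
    free(head);
    return 0;
}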

3. Stack : -

- A stack is a data structure in which all insertions and deletions are performed at one end, known as the Top of the stack.
- The insertion operation on a stack is known as push, and the deletion operation is known as pop, as shown in Fig. below.
- A stack follows the LIFO (Last In First Out) mechanism. Real-life examples of stacks are plates on a tray, a stack of coins, etc.
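
- A minimal array-based sketch of push and pop in 'C' (the global array, SIZE and top are assumptions made for this example):

#define SIZE 100

int stack[SIZE];
int top = -1;                        /* -1 means the stack is empty */

void push(int x)
{
    if (top == SIZE - 1) return;     /* overflow: stack is full    */
    stack[++top] = x;                /* insert at the Top end      */
}

int pop(void)
{
    if (top == -1) return -1;        /* underflow: stack is empty  */
    return stack[top--];             /* delete from the same end   */
}

- After push(1); push(2); a call to pop() returns 2, illustrating LIFO.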

4. Queue : -



- A queue is a linear data structure in which insertion and deletion are performed at the rear and the front respectively, as shown in Fig.
- A queue follows the FIFO (First In First Out) mechanism.
- Real-life examples of queues are people waiting at a bus stop for tickets; the call-log facility in a mobile phone, where calls are stored in order of arrival and the oldest are deleted automatically after some time; and, in a computer, the processor's use of a queue for task scheduling.
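
- A minimal array-based sketch of the queue operations in 'C' (the names enqueue/dequeue, SIZE, front and rear are assumptions made for this example):

#define SIZE 100

int queue[SIZE];
int front = 0, rear = -1;

void enqueue(int x)                  /* insertion at the rear  */
{
    if (rear == SIZE - 1) return;    /* queue is full          */
    queue[++rear] = x;
}

int dequeue(void)                    /* deletion at the front  */
{
    if (front > rear) return -1;     /* queue is empty         */
    return queue[front++];
}

- After enqueue(1); enqueue(2); a call to dequeue() returns 1, illustrating FIFO.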

- Q. Write the Differences between array and linked list.

Sr. No. | Parameter          | Array                                                                                              | Linked List
1.      | Definition         | Array is a collection of elements of similar data type.                                           | Linked list is an ordered collection of elements of the same type, connected to each other using pointers.
2.      | Accessing elements | Array supports random access, so the nth element can be accessed directly by specifying its index. | Linked list supports only sequential access, so to access the nth element one has to traverse the list from the start.
3.      | Memory location    | Elements are stored in contiguous (consecutive) memory locations.                                 | New elements can be stored anywhere in memory.
4.      | Operations         | Insertion and deletion take more time, as the memory locations are consecutive and fixed.         | Insertion and deletion operations are fast.
5.      | Memory allocation  | Memory is allocated as soon as the array is declared, at compile time (static memory allocation). | Memory is allocated at runtime, as and when a new node is added (dynamic memory allocation).
6.      | Types              | Array can be 1-dimensional, 2-dimensional or multidimensional.                                    | Linked list can be singly, doubly or circular.
7.      | Size               | Size of the array is fixed and must be known in advance.                                          | Size of a linked list is variable.

- Non - Linear Data Structure :-


- In non-linear data structures, data elements are not organized in a sequential fashion. Each element in a non-linear data structure is not constrained to have a unique predecessor and a unique successor.

1. Tree : -
- A tree maintains a hierarchical structure on a collection of data items.
- A tree is a collection of one or more nodes such that there is a special node called the root, and the remaining nodes can be partitioned into n (n >= 0) disjoint subsets.
- Each disjoint subset is itself a tree, as shown in Fig.



2. Graph : -

- A graph is a collection of vertices (nodes) and edges (arcs), where each edge is specified by a pair of vertices.
- A graph can be directed or undirected, as shown in Fig. below.

Sr. No. | Linear Data Structure                                                                     | Non-Linear Data Structure
1.      | A data structure is said to be linear if its elements form a sequence or a linear list.  | A data structure is said to be non-linear if a data item can be attached to several other data items.
2.      | Data is arranged in a linear sequence.                                                    | Data is not arranged in a sequence.
3.      | Easy to implement in computer's memory, since the elements are organized sequentially.   | Difficult to implement in computer's memory, since a data element can be attached to various other data elements.
4.      | Example: List, Stack, Queue etc.                                                          | Example: Tree, Graph etc.

Operations on Data Structures : -

Creation, Destroy, Selection, Update, Insertion, Deletion, Sorting, Merging, Searching, Traversing, Copying and Concatenation.

- Creation: To create a new data structure; this operation allocates memory for the data at compile time or at run time.
- Destroy: To destroy a data structure; this operation deallocates memory that was previously allocated (e.g. with the malloc() or calloc() function), typically using free().
- Selection: As per the application or requirement, one may want to work on some specific data while another wants to work on all the data available in the data structure. Selection is used by programmers to access data within data structures.
- Update: This operation is used to edit or change the data within the data structure; it updates, alters or changes the data.



- Insertion: The insert operation adds one or more data elements to an existing data structure at a particular position.
- Deletion: Removes one or more data elements from an existing data structure at a particular position. Deletion should be performed in such a way that the remaining elements are not left in an inconsistent state.
- Traversing: The traverse operation accesses each and every element of the data structure exactly once.
- Searching: The search operation finds the location of a particular data element in the data structure.
- Sorting: Arranges all the data of the data structure in a specific order, either ascending or descending.
- Merging: The merge operation combines two data structures into a single one.
- Copying: The copy operation copies the contents of one data structure to another.
- Concatenation: This operation combines the data from two or more data structures (a small sketch of insertion and traversal follows this list).
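
- As a small sketch (an assumed example, not from the notes) of two of these operations on a fixed-size 'C' array, the following program inserts an element at a given position and then traverses the result:

#include <stdio.h>
#define MAX 10

void traverse(int a[], int n)
{
    for (int i = 0; i < n; i++)      /* visit each element exactly once */
        printf("%d ", a[i]);
    printf("\n");
}

int main(void)
{
    int a[MAX] = {10, 20, 40, 50};
    int n = 4;

    /* Insertion at position 2: shift elements right to make room */
    for (int i = n; i > 2; i--)
        a[i] = a[i - 1];
    a[2] = 30;
    n++;

    traverse(a, n);                  /* prints: 10 20 30 40 50 */
    return 0;
}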

Algorithm:-
Definition :- An algorithm is a step-by-step method of solving a problem. Algorithms form the foundation of writing a program.
Properties/Criteria/Characteristics of an algorithm : -
1. Input – input data supplied externally.
2. Output – the result of the program.
3. Definiteness – each instruction must be clear and unambiguous.
4. Finiteness – the algorithm must stop/terminate after a finite number of steps.
5. Effectiveness – each instruction must be basic enough to be carried out in a finite amount of time, and it should be feasible to convert the algorithm into a computer program.

Different approaches for designing an algorithm : -


- Top-Down Approach : -
- Top-down approaches emphasize first planning and a complete understanding of the system.
- No coding can begin until a sufficient level of detail has been reached in the design of at least some part
of the system.
- The programmer tries to partition the solution into subtasks.
- Each subtask is similarly decomposed until all tasks can be expressed directly in the programming language.



Bottom – up Approach : -

- This is the reverse of the top-down approach.


- Different sub parts of the problem are first solved using a programming language.
- And then these pieces of parts are combined into a complete program.

Algorithmic complexity/Analysis: [what is Algorithmic complexity]


1. Algorithmic complexity is a measure of how much time and memory an algorithm would take to
complete for input of size n.
2. This complexity is calculated in terms of
o Time: how much time an algorithm takes to complete.
o Space: how much space an algorithm takes to complete.

3. Finding the exact complexity of an algorithm is difficult because the algorithmic complexity depends on many factors, such as:
o the number of inputs and outputs,
o the instructions used in the algorithm,
o the software on which the algorithm is executed,
o the hardware, and
o the machine instruction set.
4. In complexity analysis, only the dominant term is retained OR we consider only the most significant
term.
e.g. if an algorithm requires 2n³ + log n + 4 operations, its order is said to be O(n³), since 2n³ is the dominant term.
5. Constants and scaling factors are ignored since we are concerned only about asymptotic behavior.

Why to calculate complexity:


1. To estimate how much time and memory an algorithm will take to complete for an input of size n.
2. To select the most efficient algorithm when comparing two algorithms on the basis of time and memory.

When to calculate complexity:


- When you are going to execute an algorithm only a very few times, we are NOT interested in calculating the algorithmic complexity; we only look at
o how easy it is to understand,
o how writable and readable it is,
o how easy it is to debug.
- But when we execute the algorithm many times (say 100 times), then we need to calculate the time and memory requirements of the algorithm.

Time complexity :-
- Time complexity gives the total time required by a program to run to completion.
- It is calculated by counting the number of elementary steps performed by the algorithm to finish execution (keeping only the dominant/most significant term).
- The time complexity of algorithms is most commonly expressed using the big-O notation.
- It is an asymptotic notation to represent the time complexity.
- As an algorithm's performance may vary with different types of input data, we usually use the worst-case time complexity, because that is the maximum time taken for any input of a given size.

- E.g. Algorithm A : a = a + 1.
- The above statement is independent and not contained in any loop; thus it is executed only once, and the frequency count for this algorithm is 1.

- E.g. Algorithm B : for i=0 to n step 1


a = a + 1
loop
- The above algorithm has one basic operation, a = a + 1, but it is contained within a for loop which executes n times. The frequency count for this algorithm is n.

- E.g. Algorithm C : for i=0 to n step 1

    for j=0 to n step 1
        a = a + 1
    loop
loop

- In the above algorithm, a = a + 1 is executed n² times: there is a nested loop, the inner loop executes n times on every pass of the outer loop, and the outer loop itself runs n times, so the frequency count for this algorithm is n².
Example 1: Let us take an example of an algorithm:

int i=1              // 1 time
loop ( i <= n )      // n+1 times (the final failing test also counts)
    print i          // n times
    i = i + 1        // n times

- Thus 1 + (n+1) + n + n => 3n + 2.


- Thus frequency count is 3n + 2. Neglecting the constants and by considering the order of magnitude we will
get O(n) as run-time complexity.
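
- As a quick check, a small 'C' program (an illustrative sketch, not part of the original algorithm) can count the elementary steps of this loop and compare the total against 3n + 2:

#include <stdio.h>

int main(void)
{
    int n = 10;
    long steps = 0;

    int i = 1;                       steps++;   /* executed once          */
    while (1) {
        steps++;                                /* test i<=n : n+1 times  */
        if (!(i <= n)) break;
        /* print i */                steps++;   /* body      : n times    */
        i = i + 1;                   steps++;   /* increment : n times    */
    }

    printf("n = %d, steps = %ld, 3n+2 = %d\n", n, steps, 3 * n + 2);
    return 0;
}

- For n = 10 this prints steps = 32, which matches 3n + 2.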



Example 2: Let us take an example of an algorithm:

int i=1
loop ( i <= n )
{
    x = x + 1
    i = i + 1
}
- The result is exactly the same as above: the frequency count is 3n + 2, giving O(n).

Example 3: Let us take an example of an algorithm:

int sum(int a[], int n)
{
    int total = 0;                  /* 1 time                                  */
    for (int i = 0; i < n; i++)     /* i=0: 1, i<n: n+1, i++: n  => 2n+2 times */
    {
        total = total + a[i];       /* n times                                 */
    }
    return total;                   /* 1 time                                  */
}

- Here, 1 + (2n+2) + n + 1 => 3n + 4.


- Thus, frequency count is 3n + 4. Neglecting the constants and by considering the order of magnitude we
will get O(n) as run-time complexity.

ONLY IF NEEDED ………

Use the following sum formulas if needed for counting the number of steps executed by an algorithm:
1 + 2 + ... + (n-1) = n(n-1)/2 (the step count of a triangular nested loop, giving O(n²); see the sketch below)
1² + 2² + ... + n² = n(n+1)(2n+1)/6 (giving O(n³))
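
- For instance, the n(n-1)/2 count arises in a triangular nested loop; the sketch below (an assumed example) verifies it:

#include <stdio.h>

int main(void)
{
    int n = 10;
    long count = 0;

    for (int i = 0; i < n; i++)
        for (int j = 0; j < i; j++)  /* inner body runs 0+1+...+(n-1) times */
            count++;

    printf("count = %ld, n(n-1)/2 = %d\n", count, n * (n - 1) / 2);
    return 0;
}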

Relationship between input size and Time Complexity : -


- As the input size increases, the time complexity also increases in order, as shown in the table below:

Input size | O(1) | O(log n) | O(n) | O(n log n) | O(n²) | O(n³) | O(2ⁿ)
1          | 1    | 0        | 1    | 0          | 1     | 1     | 2
2          | 1    | 1        | 2    | 2          | 4     | 8     | 4
4          | 1    | 2        | 4    | 8          | 16    | 64    | 16
8          | 1    | 3        | 8    | 24         | 64    | 512   | 256
16         | 1    | 4        | 16   | 64         | 256   | 4096  | 65536
32         | 1    | 5        | 32   | 160        | 1024  | 32768 | 4294967296



How to Measurement of Growth Rate :- [Asymptotic Consideration.] [Constant Factor Consideration.]
Two things needs to be considered when we compare the time complexities of 2 function.
1. Consider the behavior /result of algorithm only for the large value of ‘n’.
2. Do not consider the constant term because the time complexity largely depends on input ‘n’.
- e.g. consider the f1(n)=100n2 f2(n)=5n3. When we calculate these two function as.
N f1(n) f2(n)
1 100 5
5 2500 625
10 10000 5000
20 40000 40000
- We still prefer the solution given by f1(n) because by neglecting the constant and taking result only on the
larger value of input ‘n’, we get f1(n) ≤ f2(n) for all n≥20.

Space Complexity :-
- The space complexity of an algorithm is the amount of memory that it needs to run to completion.
- The space needed by the program is the sum of following components:

1. Fixed/Constant Space Requirement : - This includes the instruction space and the space for simple variables, fixed-size structured variables and constants.
2. Variable/Linear Space Requirement : - This consists of the space needed by structured variables whose size depends on the particular instance of the problem. It also includes the additional space required when a function uses recursion.

- In the C programming language, a typical space requirement for the primitive data types (on an older 16-bit compiler; actual sizes are platform dependent) is:
2 bytes to store an Integer value,
4 bytes to store a Floating-point value,
1 byte to store a Character value,
8 bytes to store a Double value.
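
- Since these sizes vary from compiler to compiler, the sizeof operator can be used to check them on the current platform, as in this small sketch:

#include <stdio.h>

int main(void)
{
    printf("int    : %zu bytes\n", sizeof(int));
    printf("float  : %zu bytes\n", sizeof(float));
    printf("char   : %zu bytes\n", sizeof(char));
    printf("double : %zu bytes\n", sizeof(double));
    return 0;
}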



Asymptotic Notations
- Asymptotic notations are languages that allow us to analyze an algorithm’s running time by identifying its behavior as the input size for the algorithm increases. This is also known as an algorithm’s growth rate.
- Asymptotic notation is the simplest and easiest way of describing the running time of an algorithm.
- It represents the efficiency and performance of an algorithm using meaningful notation.
- The asymptotic notation of an algorithm is a mathematical representation of its complexity.
- In other words, asymptotic notation is a way to express an algorithm's efficiency.
- It is called asymptotic because it deals with the behavior of the algorithm as the input size approaches the asymptotic limit of infinity.
- There are three types of Asymptotic Notations of time complexity
 Big - Oh (O) for Worst case ( slowest time),
 Big - Omega (Ω) for best case (fastest possible) and
 Big – Theta (θ) for Average case (Average time)

 Best case, Worst case and Average case behavior : -


 For example: let us take searching for an element in an unordered array of length N. The time complexity is given by:

1. Best Case: the fastest time to complete, with optimal inputs chosen. The best case for a searching algorithm is the element being found at the first position, needing only 1 comparison. It is denoted by Big - Omega (Ω).
2. Worst Case: the slowest time to complete. The worst case for a searching algorithm is the element being found at the last position, or not found at all. Here we need to do N comparisons, so the time complexity of the searching algorithm is O(N) in the worst case. It is denoted by the Big - Oh (O) notation.
3. Average Case: the average time to complete, with random inputs chosen. The average case for a searching algorithm is the element being found somewhere in the middle of the list; here we need to do about N/2 comparisons. It is denoted by Big - Theta (θ). A small sketch follows this list.
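
- A minimal 'C' sketch (an assumed example) of linear search that also reports the number of comparisons, so the three cases can be observed directly:

#include <stdio.h>

int search(int a[], int n, int key, int *comparisons)
{
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == key)
            return i;                /* found at index i         */
    }
    return -1;                       /* not found: n comparisons */
}

int main(void)
{
    int a[] = {7, 3, 9, 1, 5};
    int c;
    search(a, 5, 7, &c);  printf("best case : %d comparison(s)\n", c);   /* 1 */
    search(a, 5, 5, &c);  printf("worst case: %d comparison(s)\n", c);   /* 5 */
    return 0;
}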

- Big - Oh (O) : -

- Big - Oh is the formal method of expressing the upper bound of an algorithm's running time. It is a measure of the longest amount of time the algorithm can take to complete.
- Commonly written as O, it is an asymptotic notation for the worst case, or the ceiling of growth, of a given function.

- Definition -- Let f(n) be the time complexity of an algorithm and g(n) its most significant term.
- If f(n) <= C g(n) for all n >= n0, where C > 0 and n0 >= 1 are constants,
- then we can represent f(n) as O(g(n)), i.e. f(n) = O(g(n)).



- Definition -- Consider the graph: after a particular input value n0, C g(n) is always greater than f(n), which indicates the algorithm's upper bound.

- Example 1 : - f(n) = 3n + 2, g(n) = n

- If we want to represent f(n) as O(g(n)), then it must satisfy f(n) ≤ C g(n) for some C > 0 and all n ≥ n0:
  f(n) ≤ C g(n)
  => 3n + 2 ≤ C n
- The above condition is always TRUE for C = 4 and all n ≥ 2. By using Big - Oh notation we can represent the time complexity as 3n + 2 = O(n).
- It means we have an upper bound for f(n) in terms of g(n), with C = 4 and n ≥ 2.
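
- As a numeric spot-check of this bound (a sketch assuming C = 4 and n ≥ 2), the following loop confirms 3n + 2 ≤ 4n over a range of n:

#include <stdio.h>

int main(void)
{
    for (int n = 2; n <= 1000; n++) {
        if (3 * n + 2 > 4 * n) {     /* would disprove the claimed bound */
            printf("bound fails at n = %d\n", n);
            return 1;
        }
    }
    printf("3n + 2 <= 4n holds for all tested n >= 2\n");
    return 0;
}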

- Example 2 : - f(n) = 3n², g(n) = n

- Is f(n) O(g(n))?
- Is 3n² O(n)? Let's look at the definition of Big-O:
  3n² ≤ C n
- Is there some pair of constants C > 0 and n0 that satisfies this for all n ≥ n0?
- No, there isn't: f(n) is NOT O(g(n)).
- It means we cannot find an upper bound of the form C n for f(n) for any value of C > 0.

[Any one example from the above can be given for big-O]
Big-Omega
- Big-Omega, commonly written as Ω, is an asymptotic notation for the best case of a given function.
- It provides us with an asymptotic lower bound for the growth rate of the runtime of an algorithm.
- It is a measure of the fastest amount of time required for the algorithm to complete.

- Definition -- Let f(n) be the time complexity of an algorithm and g(n) its most significant term.
- If f(n) >= C g(n) for all n >= 1 and some constant C > 0,
- then we can represent f(n) as Ω(g(n)), i.e. f(n) = Ω(g(n)).
- In the graph, after a particular input 'n', C g(n) is always less than f(n); this indicates the algorithm's lower bound, i.e. the best case or fastest execution time.

- Example 1 : - f(n) = 3n + 2, g(n) = n

- If we want to represent f(n) as Ω(g(n)), then it must satisfy f(n) ≥ C g(n) for some C > 0 and all n ≥ 1:
  f(n) ≥ C g(n)
  => 3n + 2 ≥ C n
- The above condition is always TRUE for C = 1 and all n ≥ 1. By using Big - Omega notation we can represent the time complexity as 3n + 2 = Ω(n).
- It means we have a lower bound for f(n) with C = 1 and n ≥ 1.
Big - Theta Notation (Θ)
- Big - Theta notation is used to define the average (tight) bound of an algorithm in terms of time complexity.
- That means Big - Theta notation always indicates the average time required by an algorithm for all input values.
- That means Big - Theta notation describes the average case of an algorithm's time complexity.
- Definition -- Let f(n) be the time complexity of an algorithm and g(n) its most significant term.
- If C1 g(n) <= f(n) <= C2 g(n) for all n >= n0 and constants C1, C2 > 0, then we can represent f(n) as Θ(g(n)).

- In the graph, after a particular input value n0, C1 g(n) is always less than f(n) and C2 g(n) is always greater than f(n), which indicates the algorithm's average (tight) bound.
- Example : - Consider the following f(n) and g(n):
  f(n) = 3n + 2
  g(n) = n
- If we want to represent f(n) as Θ(g(n)), then it must satisfy C1 g(n) <= f(n) <= C2 g(n) for all values of C1, C2 > 0 and n >= n0:

  C1 g(n) <= f(n) <= C2 g(n)
  => C1 n <= 3n + 2 <= C2 n

- The above condition is always TRUE for C1 = 1, C2 = 4 and all n >= 2.

- By using Big - Theta notation we can represent the time complexity as 3n + 2 = Θ(n).

