MCS 021 Data and File Structures
This assignment has twenty questions in all and carries 80 marks. The remaining 20 marks are
for viva-voce. Answer all the questions. All questions carry equal marks (i.e. 4 marks each).
Please go through the guidelines regarding assignments given in the Programme Guide for the
format of presentation.
Disclaimer: This Assignment is prepared by our students. The Institution and publisher are not
responsible for any omission and errors.
Important Terms
Following are the important terms with respect to tree.
Path: the sequence of nodes along the edges of a tree.
Root: the node at the top of the tree. There is only one root per tree and exactly one
path from the root node to any node.
Parent: every node except the root node has one edge upward to a node called its parent.
Child: a node directly below a given node, connected by an edge downward, is called its
child node.
Leaf: a node that does not have any child node.
Visiting: checking the value of a node when control is on that node.
Levels: the level of a node represents its generation. If the root node is at level 0,
then its children are at level 1, its grandchildren are at level 2, and so on.
We are going to implement the tree using node objects connected to one another through
references.
Tree Node
The code to write a tree node would be similar to what is given below. It has a
data part and references to its left and right child nodes.
struct node {
   int data;
   struct node *leftChild;
   struct node *rightChild;
};
We shall learn creating (inserting into) a tree structure and searching a data item
in a tree in this chapter. We shall learn about tree traversing methods in the
coming chapter.
Insert Operation
The very first insertion creates the tree. Afterwards, whenever an element is to
be inserted, first locate its proper location. Start searching from the root node,
then if the data is less than the key value, search for the empty location in the left
subtree and insert the data. Otherwise, search for the empty location in the right
subtree and insert the data.
Algorithm
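If root is NULL
   then create root node and return
else
   while the insertion position has not been located
      If data is less than node.data
         go to left subtree
      else
         go to right subtree
   endwhile
   insert data
end If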
Implementation
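The implementation of the insert function should look like this:
#include <stdlib.h>

struct node *root = NULL;   /* root of the tree; NULL while the tree is empty */

void insert(int data) {
   struct node *tempNode = malloc(sizeof(struct node));
   struct node *current;
   struct node *parent;
   if(tempNode == NULL) {
      return;   /* allocation failed; leave the tree unchanged */
   }
   tempNode->data = data;
   tempNode->leftChild = NULL;
   tempNode->rightChild = NULL;
   /* if the tree is empty, the new node becomes the root */
   if(root == NULL) {
      root = tempNode;
   } else {
      current = root;
      parent = NULL;
      while(1) {
         parent = current;
         /* go to the left of the tree */
         if(data < parent->data) {
            current = current->leftChild;
            if(current == NULL) {
               parent->leftChild = tempNode;
               return;
            }
         }
         /* go to the right of the tree */
         else {
            current = current->rightChild;
            if(current == NULL) {
               parent->rightChild = tempNode;
               return;
            }
         }
      }
   }
}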
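Searching follows the same comparisons as insertion. The sketch below is a minimal, self-contained variant that passes the root explicitly instead of using a global; the names bst_insert and bst_search are illustrative assumptions, not part of the original notes.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *leftChild;
    struct node *rightChild;
};

/* Insert data into the tree rooted at root and return the (possibly new) root.
 * Values equal to a node's data go to the right, as in the description above. */
struct node *bst_insert(struct node *root, int data) {
    struct node *tempNode = malloc(sizeof *tempNode);
    if (tempNode == NULL) return root;   /* allocation failed: tree unchanged */
    tempNode->data = data;
    tempNode->leftChild = NULL;
    tempNode->rightChild = NULL;
    if (root == NULL) return tempNode;   /* the first insertion creates the tree */

    struct node *current = root;
    for (;;) {
        /* pick the link to follow: left for smaller values, right otherwise */
        struct node **next = (data < current->data) ? &current->leftChild
                                                    : &current->rightChild;
        if (*next == NULL) {             /* empty location found */
            *next = tempNode;
            return root;
        }
        current = *next;
    }
}

/* Return the node holding data, or NULL when the value is not in the tree. */
struct node *bst_search(struct node *root, int data) {
    struct node *current = root;
    while (current != NULL && current->data != data) {
        current = (data < current->data) ? current->leftChild
                                         : current->rightChild;
    }
    return current;
}
```

Calling bst_insert repeatedly, starting from a NULL root, builds the tree; bst_search then walks a single root-to-leaf path, so its cost is proportional to the height of the tree.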
A real-world stack allows operations at one end only. For example, we can place
or remove a card or plate from the top of the stack only. Likewise, Stack ADT
allows all data operations at one end only. At any given time, we can only access
the top element of a stack.
This feature makes it a LIFO data structure. LIFO stands for last-in, first-out: the
element which is placed (inserted or added) last is accessed first. In stack
terminology, the insertion operation is called PUSH and the removal operation
is called POP.
Stack Representation
The following diagram depicts a stack and its operations
Basic Operations
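Stack operations may involve initializing the stack, using it, and then de-initializing
it. Apart from these basics, a stack is used for the following two primary
operations:
push() stores an element on the top of the stack.
pop() removes the element on the top of the stack.
To use a stack safely, we also need to be able to inspect it:
peek() gets the top data element of the stack, without removing it.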
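The operations above can be sketched with a fixed-size array. This is a minimal illustration, not the author's implementation; the capacity MAXSIZE, the helper names isEmpty and isFull, and the choice to report failure through a bool return value are all assumptions made for the example.

```c
#include <stdbool.h>

#define MAXSIZE 8               /* capacity chosen only for illustration */

static int stack[MAXSIZE];
static int top = -1;            /* index of the top element; -1 means empty */

bool isEmpty(void) { return top == -1; }
bool isFull(void)  { return top == MAXSIZE - 1; }

/* PUSH: place data on the top; returns false instead of overflowing. */
bool push(int data) {
    if (isFull()) return false;
    stack[++top] = data;
    return true;
}

/* POP: remove the top element into *out; returns false when empty. */
bool pop(int *out) {
    if (isEmpty()) return false;
    *out = stack[top--];
    return true;
}

/* PEEK: copy the top element into *out without removing it. */
bool peek(int *out) {
    if (isEmpty()) return false;
    *out = stack[top];
    return true;
}
```

Because every operation touches only the element at index top, the last value pushed is always the first one popped, which is the LIFO behaviour described earlier.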
#include <stdio.h>
#define SIZE 10

int ar[SIZE];     /* single array shared by both stacks */
int top1 = -1;    /* stack 1 grows upward from index 0 */
int top2 = SIZE;  /* stack 2 grows downward from index SIZE - 1 */

//Functions to push data
void push_stack1 (int data)
{
  if (top1 < top2 - 1)
  {
    ar[++top1] = data;
  }
  else
  {
    printf ("Stack Full! Cannot Push\n");
  }
}
void push_stack2 (int data)
{
  if (top1 < top2 - 1)
  {
    ar[--top2] = data;
  }
  else
  {
    printf ("Stack Full! Cannot Push\n");
  }
}

//Functions to pop data
void pop_stack1 ()
{
  if (top1 >= 0)
  {
    int popped_value = ar[top1--];
    printf ("%d is being popped from Stack 1\n", popped_value);
  }
  else
  {
    printf ("Stack Empty! Cannot Pop\n");
  }
}
void pop_stack2 ()
{
  if (top2 < SIZE)
  {
    int popped_value = ar[top2++];
    printf ("%d is being popped from Stack 2\n", popped_value);
  }
  else
  {
    printf ("Stack Empty! Cannot Pop\n");
  }
}

//Functions to print both stacks, top element first
void print_stack1 ()
{
  int i;
  for (i = top1; i >= 0; --i)
  {
    printf ("%d ", ar[i]);
  }
  printf ("\n");
}
void print_stack2 ()
{
  int i;
  for (i = top2; i < SIZE; ++i)
  {
    printf ("%d ", ar[i]);
  }
  printf ("\n");
}

int main ()
{
  int i;
  int num_of_ele;

  printf ("We can push a total of 10 values\n");

  //Push 6 values on Stack 1 and 4 values on Stack 2
  for (i = 1; i <= 6; ++i)
  {
    push_stack1 (i);
    printf ("Value Pushed in Stack 1 is %d\n", i);
  }
  for (i = 1; i <= 4; ++i)
  {
    push_stack2 (i);
    printf ("Value Pushed in Stack 2 is %d\n", i);
  }

  //Print Both Stacks
  print_stack1 ();
  print_stack2 ();

  //Pushing on Stack Full
  printf ("Pushing Value in Stack 1 is %d\n", 11);
  push_stack1 (11);

  //Popping All Elements From Stack 1
  num_of_ele = top1 + 1;
  while (num_of_ele)
  {
    pop_stack1 ();
    --num_of_ele;
  }

  //Trying to Pop From Empty Stack
  pop_stack1 ();

  return 0;
}
Output :
gcc TwoStacksSingleArray.c
./a.out
We can push a total of 10 values
Value Pushed in Stack 1 is 1
Value Pushed in Stack 1 is 2
Value Pushed in Stack 1 is 3
Value Pushed in Stack 1 is 4
Value Pushed in Stack 1 is 5
Value Pushed in Stack 1 is 6
Value Pushed in Stack 2 is 1
Value Pushed in Stack 2 is 2
Value Pushed in Stack 2 is 3
Value Pushed in Stack 2 is 4
6 5 4 3 2 1
4 3 2 1
Pushing Value in Stack 1 is 11
Stack Full! Cannot Push
6 is being popped from Stack 1
5 is being popped from Stack 1
4 is being popped from Stack 1
3 is being popped from Stack 1
2 is being popped from Stack 1
1 is being popped from Stack 1
Stack Empty! Cannot Pop
Course Code : MCS-021 Course Title : Data and File Structures
Assignment Number : MCA(2)/021/Assignment/16-17 http://ignousolvedassignments.com
IGNOU Solved Assignments By http://ignousolvedassignments.com
Algorithm
Linear Search ( Array A, Value x)
Step 1: Set i to 1
Step 2: if i > n then go to step 7
Step 3: if A[i] = x then go to step 6
Step 4: Set i to i + 1
Step 5: Go to Step 2
Step 6: Print Element x Found at index i and go to step 8
Step 7: Print element not found
Step 8: Exit
Pseudocode
procedure linear_search (list, value)
   for each item in the list
      if item == value
         return the item's location
      end if
   end for
end procedure
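The steps above translate almost directly into C. The following is a small sketch rather than a definitive implementation: it uses 0-based indexing (the algorithm above counts from 1) and returns -1 for "not found"; both conventions, and the function name, are assumptions made for the example.

```c
#include <stddef.h>

/* Scan a[0..n-1] from the front and return the 0-based index of the first
 * element equal to x, or -1 when x does not occur in the array. */
int linear_search(const int *a, size_t n, int x) {
    for (size_t i = 0; i < n; i++) {
        if (a[i] == x) {
            return (int)i;      /* corresponds to "Element x found at index i" */
        }
    }
    return -1;                  /* corresponds to "element not found" */
}
```

In the worst case every one of the n elements is inspected, so the running time grows linearly with the size of the array.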
2) Binary Search
Binary search is a fast search algorithm with run-time complexity of O(log n). This search
algorithm works on the principle of divide and conquer. For this algorithm to work
properly, the data collection must be in sorted form.
Binary search looks for a particular item by comparing it with the middle item of the
collection. If a match occurs, then the index of the item is returned. If the middle item
is greater than the item, then the item is searched for in the sub-array to the left of the
middle item. Otherwise, the item is searched for in the sub-array to the right of the
middle item. This process continues on the sub-array until the item is found or the size
of the sub-array reduces to zero.
We shall learn the process of binary search with an example. Assume a sorted array whose
locations are numbered 0 to 9, and let us assume that we need to search for the location
of the value 31 using binary search.
First, we determine the middle of the array using the formula mid = low + (high - low) / 2.
Here it is 0 + (9 - 0) / 2 = 4 (the integer part of 4.5). So, 4 is the mid of the array.
Now we compare the value stored at location 4 with the value being searched, i.e. 31. We
find that the value at location 4 is 27, which is not a match. As 31 is greater than 27
and the array is sorted, we also know that the target value must be in the upper
portion of the array.
We change our low to mid + 1 and find the new mid value again.
low = mid + 1
mid = low + (high - low) / 2
Our new mid is 5 + (9 - 5) / 2 = 7. We compare the value stored at location 7 with our
target value 31.
The value stored at location 7 is not a match; rather, it is greater than what we are
looking for. So, the value must be in the lower part from this location.
We change our high to mid - 1 and calculate the mid again: 5 + (6 - 5) / 2 = 5. We compare
the value stored at location 5 with our target value. We find that it is a match.
Binary search halves the searchable items at every step and thus reduces the number of
comparisons to a very small count.
Pseudocode
Procedure binary_search
   A <- sorted array
   n <- size of array
   x <- value to be searched
   Set lowerBound = 1
   Set upperBound = n
   while x not found
      if upperBound < lowerBound
         EXIT: x does not exist
      set midPoint = lowerBound + (upperBound - lowerBound) / 2
      if A[midPoint] < x
         set lowerBound = midPoint + 1
      if A[midPoint] > x
         set upperBound = midPoint - 1
      if A[midPoint] = x
         EXIT: x found at location midPoint
   end while
end procedure
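A C sketch of the same procedure is given below. It uses 0-based bounds (low = 0, high = n - 1) so that it follows the walk-through with locations 0 to 9, and it returns -1 when the value is absent; the function name and these conventions are assumptions for illustration.

```c
/* Binary search over a[0..n-1], which must be sorted in ascending order.
 * Returns the 0-based location of x, or -1 when x is not present. */
int binary_search(const int *a, int n, int x) {
    int low = 0;
    int high = n - 1;
    while (low <= high) {
        /* written this way to avoid overflow in low + high */
        int mid = low + (high - low) / 2;
        if (a[mid] < x) {
            low = mid + 1;      /* target can only be in the upper part */
        } else if (a[mid] > x) {
            high = mid - 1;     /* target can only be in the lower part */
        } else {
            return mid;         /* match */
        }
    }
    return -1;                  /* range became empty: x does not exist */
}
```

With a ten-element sorted array that holds 27 at location 4 and 31 at location 5, a search for 31 probes locations 4, 7 and 5 in that order, matching the steps traced in the text.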
Ans4. B-Tree
Introduction
Tree structures support various basic dynamic set operations including Search,
Predecessor, Successor, Minimum, Maximum, Insert, and Delete in time proportional
to the height of the tree. Ideally, a tree will be balanced and the height will be log n,
where n is the number of nodes in the tree. To ensure that the height of the tree is as
small as possible and therefore provide the best running time, a balanced tree
structure like a red-black tree, AVL tree, or b-tree must be used.
When working with large sets of data, it is often not possible or desirable to maintain
the entire structure in primary storage (RAM). Instead, a relatively small portion
of the data structure is maintained in primary storage, and additional data is read
from secondary storage as needed. Unfortunately, a magnetic disk, the most common
form of secondary storage, is significantly slower than random access memory (RAM).
In fact, the system often spends more time retrieving data than actually processing
data.
B-trees are balanced trees that are optimized for situations when part or all of the
tree must be maintained in secondary storage such as a magnetic disk. Since disk
accesses are expensive (time consuming) operations, a b-tree tries to minimize the
number of disk accesses. For example, a b-tree with a height of 2 and a branching factor
of 1001 can store over one billion keys but requires at most two disk accesses to
search for any node (Cormen 384).
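The "over one billion keys" figure can be checked with a short calculation. The sketch below assumes every node is completely full, so each node has 1001 children and 1000 keys, and that the root is at depth 0; the function name max_keys is hypothetical.

```c
#include <stdint.h>

/* Maximum number of keys in a tree where every node has `children` children
 * (and therefore children - 1 keys) and the leaves sit at depth `height`. */
uint64_t max_keys(uint64_t children, unsigned height) {
    uint64_t nodes_at_depth = 1;    /* just the root at depth 0 */
    uint64_t total = 0;
    for (unsigned d = 0; d <= height; d++) {
        total += nodes_at_depth * (children - 1);
        nodes_at_depth *= children; /* each node fans out to `children` nodes */
    }
    return total;
}
```

For children = 1001 and height = 2 this gives 1000 + 1,001,000 + 1,002,001,000 = 1,003,003,000 keys, which is 1001^3 - 1.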
A b-tree has a minimum number of allowable children for each node known as the
minimization factor (also called the minimum degree). If t is this minimization
factor, every node must have at least t - 1 keys. Under certain circumstances, the
root node is allowed to violate this property by having fewer than t - 1 keys. Every
node may have at most 2t - 1 keys or, equivalently, 2t children.
Since each node tends to have a large branching factor (a large number of children),
it is typically necessary to traverse relatively few nodes before locating the desired
key. If access to each node requires a disk access, then a b-tree will minimize the
number of disk accesses required. The minimization factor is usually chosen so that
the total size of each node corresponds to a multiple of the block size of the
underlying storage device. This choice simplifies and optimizes disk access. Consequently,
a b-tree is an ideal data structure for situations where all data cannot reside in
primary storage and accesses to secondary storage are comparatively expensive (or time
consuming).
Height of B-Trees
For n greater than or equal to one, the height h of an n-key b-tree T with a
minimum degree t greater than or equal to 2 satisfies
h <= log_t((n + 1) / 2)
For a proof of the above inequality, refer to Cormen, Leiserson, and Rivest pages 383-384.
The worst case height is O(log n). Since the "branchiness" of a b-tree can be large
compared to many other balanced tree structures, the base of the logarithm tends to
be large; therefore, the number of nodes visited during a search tends to be smaller
than required by other tree structures. Although this does not affect the asymptotic
worst case height, b-trees tend to have smaller heights than other trees with the same
asymptotic height.
Operations on B-Trees
The algorithms for the search, create, and insert operations are shown below. Note
that these algorithms are single pass; in other words, they do not traverse back up
the tree. Since b-trees strive to minimize disk accesses and the nodes are usually
stored on disk, this single-pass approach will reduce the number of node visits and thus
the number of disk accesses. Simpler double-pass approaches that move back up the
tree to fix violations are possible.
Since all nodes are assumed to be stored in secondary storage (disk) rather than
primary storage (memory), all references to a given node must be preceded by a read
operation denoted by Disk-Read. Similarly, once a node is modified and it is no longer
needed, it must be written out to secondary storage with a write operation denoted
by Disk-Write. The algorithms below assume that all nodes referenced in parameters
have already had a corresponding Disk-Read operation. New nodes are created and
assigned storage with the Allocate-Node call. The implementation details of the
Disk-Read, Disk-Write, and Allocate-Node functions are operating system and
implementation dependent.
B-Tree-Search(x, k)
i <- 1
while i <= n[x] and k > key_i[x]
    do i <- i + 1
if i <= n[x] and k = key_i[x]
    then return (x, i)
if leaf[x]
    then return NIL
    else Disk-Read(c_i[x])
         return B-Tree-Search(c_i[x], k)
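The search procedure can be sketched in C for a tree held entirely in memory. This is an illustration under stated assumptions: the struct layout, the names btree_node and btree_search, the small minimum degree T = 2, and 0-based key positions are all choices made for the example, and the Disk-Read step is reduced to following a pointer.

```c
#include <stdbool.h>
#include <stddef.h>

#define T 2                                /* minimum degree t (illustrative) */

struct btree_node {
    int n;                                 /* number of keys currently stored */
    bool leaf;                             /* true when the node has no children */
    int keys[2 * T - 1];                   /* keys[0..n-1] in ascending order */
    struct btree_node *children[2 * T];    /* children[0..n]; unused in leaves */
};

/* Return the node containing k and store the key's 0-based position in *pos,
 * or return NULL when k is not in the subtree rooted at x. */
struct btree_node *btree_search(struct btree_node *x, int k, int *pos) {
    while (x != NULL) {
        int i = 0;
        while (i < x->n && k > x->keys[i]) {
            i++;                           /* skip keys smaller than k */
        }
        if (i < x->n && k == x->keys[i]) {
            *pos = i;
            return x;
        }
        if (x->leaf) {
            return NULL;                   /* nowhere left to look */
        }
        x = x->children[i];                /* a disk-based tree would Disk-Read here */
    }
    return NULL;
}
```

The loop replaces the tail recursion of the pseudocode; each iteration descends one level, so the number of nodes visited is at most the height of the tree plus one.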
B-Tree-Create(T)
x <- Allocate-Node()
leaf[x] <- TRUE
n[x] <- 0
Disk-Write(x)
root[T] <- x
The B-Tree-Create operation creates an empty b-tree by allocating a new root node
that has no keys and is a leaf node. Only the root node is permitted to have these
properties; all other nodes must meet the criteria outlined previously. The
B-Tree-Create operation runs in time O(1).
B-Tree-Split-Child(x, i, y)
z <- Allocate-Node()
leaf[z] <- leaf[y]
n[z] <- t - 1
for j <- 1 to t - 1
    do key_j[z] <- key_{j+t}[y]
if not leaf[y]
    then for j <- 1 to t
        do c_j[z] <- c_{j+t}[y]
n[y] <- t - 1
for j <- n[x] + 1 downto i + 1
    do c_{j+1}[x] <- c_j[x]
c_{i+1}[x] <- z
for j <- n[x] downto i
    do key_{j+1}[x] <- key_j[x]
key_i[x] <- key_t[y]
n[x] <- n[x] + 1
Disk-Write(y)
Disk-Write(z)
Disk-Write(x)
If a node becomes "too full," it is necessary to perform a split operation. The split
operation moves the median key of node y into its parent x, where y is the ith child of
x. A new node, z, is allocated, and all keys in y right of the median key are moved
to z. The keys left of the median key remain in the original node y. The new node, z,
becomes the child immediately to the right of the median key that was moved to the
parent x, and the original node, y, becomes the child immediately to the left of the
median key that was moved into the parent x.
The split operation transforms a full node with 2t - 1 keys into two nodes with t - 1
keys each. Note that one key is moved into the parent node. The B-Tree-Split-Child
algorithm runs in time O(t), where t is a constant for a given tree.
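B-Tree-Insert(T, k)
r <- root[T]
if n[r] = 2t - 1
    then s <- Allocate-Node()
         root[T] <- s
         leaf[s] <- FALSE
         n[s] <- 0
         c_1[s] <- r
         B-Tree-Split-Child(s, 1, r)
         B-Tree-Insert-Nonfull(s, k)
    else B-Tree-Insert-Nonfull(r, k)
B-Tree-Insert-Nonfull(x, k)
i <- n[x]
if leaf[x]
    then while i >= 1 and k < key_i[x]
             do key_{i+1}[x] <- key_i[x]
                i <- i - 1
         key_{i+1}[x] <- k
         n[x] <- n[x] + 1
         Disk-Write(x)
    else while i >= 1 and k < key_i[x]
             do i <- i - 1
         i <- i + 1
         Disk-Read(c_i[x])
         if n[c_i[x]] = 2t - 1
             then B-Tree-Split-Child(x, i, c_i[x])
                  if k > key_i[x]
                      then i <- i + 1
         B-Tree-Insert-Nonfull(c_i[x], k)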
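The split and insert procedures can be put together as an in-memory C sketch. It is an illustration under assumptions rather than a reference implementation: indices are 0-based, T = 2 is chosen so that splits happen after only a few insertions, the Disk-Read and Disk-Write calls are omitted, and the names allocate_node, split_child, insert_nonfull and btree_insert are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define T 2                                /* minimum degree t (illustrative) */

struct btree_node {
    int n;                                 /* number of keys in the node */
    bool leaf;
    int keys[2 * T - 1];
    struct btree_node *children[2 * T];
};

static struct btree_node *allocate_node(bool leaf) {
    struct btree_node *x = calloc(1, sizeof *x);   /* zeroed: n = 0, no children */
    if (x == NULL) abort();
    x->leaf = leaf;
    return x;
}

/* Split the full child y = x->children[i]; x itself must not be full. */
static void split_child(struct btree_node *x, int i) {
    struct btree_node *y = x->children[i];
    struct btree_node *z = allocate_node(y->leaf);
    assert(y->n == 2 * T - 1 && x->n < 2 * T - 1);
    z->n = T - 1;
    for (int j = 0; j < T - 1; j++) z->keys[j] = y->keys[j + T];   /* upper keys */
    if (!y->leaf)
        for (int j = 0; j < T; j++) z->children[j] = y->children[j + T];
    y->n = T - 1;
    for (int j = x->n; j >= i + 1; j--) x->children[j + 1] = x->children[j];
    x->children[i + 1] = z;
    for (int j = x->n - 1; j >= i; j--) x->keys[j + 1] = x->keys[j];
    x->keys[i] = y->keys[T - 1];           /* median key moves up into x */
    x->n++;
}

/* Insert k into the subtree rooted at x, which is known not to be full. */
static void insert_nonfull(struct btree_node *x, int k) {
    int i = x->n - 1;
    if (x->leaf) {
        while (i >= 0 && k < x->keys[i]) { /* shift larger keys right */
            x->keys[i + 1] = x->keys[i];
            i--;
        }
        x->keys[i + 1] = k;
        x->n++;
        return;
    }
    while (i >= 0 && k < x->keys[i]) i--;
    i++;                                   /* child that should receive k */
    if (x->children[i]->n == 2 * T - 1) {  /* split full children on the way down */
        split_child(x, i);
        if (k > x->keys[i]) i++;
    }
    insert_nonfull(x->children[i], k);
}

/* Insert k and return the (possibly new) root; pass NULL for an empty tree. */
struct btree_node *btree_insert(struct btree_node *root, int k) {
    if (root == NULL) root = allocate_node(true);
    if (root->n == 2 * T - 1) {            /* full root: the tree grows in height */
        struct btree_node *s = allocate_node(false);
        s->children[0] = root;
        split_child(s, 0);
        root = s;
    }
    insert_nonfull(root, k);
    return root;
}
```

Because full nodes are split on the way down, the insertion never has to move back up the tree, which is the single-pass property described earlier. The height of the tree only increases when the root itself is split.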
B-Tree-Delete
Deletion of a key from a b-tree is possible; however, special care must be taken to ensure that
the properties of a b-tree are maintained. Several cases must be considered. If the deletion
reduces the number of keys in a node below the minimum of t - 1, this violation must be
corrected by borrowing a key from a sibling or by combining nodes, possibly reducing the
height of the tree. If the key lies in an internal node, so that it separates two children,
the children must be rearranged.
Examples
Sample B-Tree
Applications
Databases
It is not uncommon for a database to contain millions of records requiring many
gigabytes of storage. For example, TELSTRA, an Australian telecommunications
company, maintains a customer billing database with 51 billion rows (yes, billion) and
4.2 terabytes of data. In order for a database to be useful and usable, it must support
the desired operations, such as retrieval and storage, quickly. Because databases
cannot typically be maintained entirely in memory, b-trees are often used to index
the data and to provide fast access. For example, searching an unindexed and unsorted
database containing n key values will have a worst case running time of O(n); if
the same data is indexed with a b-tree, the same search operation will run in O(log n).
To perform a search for a single key on a set of one million keys (1,000,000), a
linear search will require at most 1,000,000 comparisons. If the same data is indexed
with a b-tree of minimum degree 10, 114 comparisons will be required in the worst
case. Clearly, indexing large amounts of data can significantly improve search
performance. Although other balanced tree structures can be used, a b-tree also
optimizes costly disk accesses that are of concern when dealing with large data sets.
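One way to arrive at the figure of 114 is sketched below. It is an interpretation, not a derivation given in the source: it assumes the height bound h <= log_t((n + 1) / 2), that the search visits h + 1 nodes on a root-to-leaf path, and that each visited node holds the maximum of 2t - 1 keys and is scanned linearly. The function names are hypothetical.

```c
#include <stdint.h>

/* Largest height h for which a b-tree of minimum degree t can hold n keys,
 * i.e. the largest h with 2 * t^h - 1 <= n. Requires n >= 1 and t >= 2. */
unsigned btree_max_height(uint64_t n, uint64_t t) {
    unsigned h = 0;
    uint64_t power = t;                    /* t^(h + 1) */
    while (2 * power - 1 <= n) {
        h++;
        power *= t;
    }
    return h;
}

/* Worst-case key comparisons when every node on one root-to-leaf path is
 * full (2t - 1 keys) and is scanned key by key. */
uint64_t btree_worst_comparisons(uint64_t n, uint64_t t) {
    return (uint64_t)(btree_max_height(n, t) + 1) * (2 * t - 1);
}
```

For n = 1,000,000 and t = 10 the height is at most 5, so at most 6 nodes are visited, and 6 * 19 = 114 comparisons.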
Concurrent Access to B-Trees
Databases typically run in multiuser environments where many users can concurrently
perform operations on the database. Unfortunately, this common scenario introduces
complications. For example, imagine a database storing bank account balances. Now assume
that someone attempts to withdraw $40 from an account containing $60. First, the current
balance is checked to ensure sufficient funds. After funds are disbursed, the balance of the
account is reduced. This approach works flawlessly until concurrent transactions are
considered. Suppose that another person simultaneously attempts to withdraw $30 from the
same account. At the same time the account balance is checked by the first person, the
account balance is also retrieved for the second person. Since neither person is requesting
more funds than are currently available, both requests are satisfied for a total of $70.
After the first person's transaction, $20 should remain ($60 - $40), so the new balance is
recorded as $20. Next, the account balance after the second person's transaction, $30
($60 - $30), is recorded, overwriting the $20 balance. Unfortunately, $70 has been
disbursed, but the account balance has only been decreased by $30. Clearly, this behavior
is undesirable, and special precautions must be taken.
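The interleaving described above can be reproduced deterministically without real threads. The sketch below is a single-threaded simulation under assumptions: the struct account and both function names are hypothetical, and withdraw_atomic only models what a lock or database transaction would guarantee (that the check and the update happen as one step); it is not itself thread-safe.

```c
#include <stdbool.h>

struct account {
    int balance;
};

/* Unsafe pattern: the decision and the new balance are both based on a
 * balance that was read earlier (snapshot), so a write made by someone
 * else in the meantime is silently overwritten. */
bool withdraw_from_snapshot(struct account *acct, int snapshot, int amount) {
    if (snapshot < amount) return false;
    acct->balance = snapshot - amount;
    return true;
}

/* Safer pattern: check and update against the current balance in one step.
 * In a concurrent system this step must be protected by a lock or run
 * inside a transaction. */
bool withdraw_atomic(struct account *acct, int amount) {
    if (acct->balance < amount) return false;
    acct->balance -= amount;
    return true;
}
```

If both people read the balance of $60 before either writes, two calls to withdraw_from_snapshot with that same snapshot both succeed and leave a balance of $30. With withdraw_atomic, the $40 withdrawal succeeds and the later $30 request is refused, because only $20 remains.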