
Advanced Data Structures

Module 2
Trees
Computer Science and Engineering
IIIT Nagpur

26/11/2024 1
• A data structure is a technique of storing and organizing data so that it can be used in an
efficient manner.

• A data structure is classified into two categories:


1. Linear data structure
2. Non-linear data structure

Linear Data Structure vs. Non-Linear Data Structure

Basic: In a linear data structure, the elements are arranged sequentially and attached to one another. In a non-linear data structure, the elements are arranged hierarchically or in a non-linear manner.

Types: Arrays, linked lists, stacks and queues are linear data structures. Trees and graphs are non-linear data structures.

Implementation: Due to the linear organization, linear structures are easy to implement. Due to the non-linear organization, non-linear structures are more difficult to implement.

Traversal: As a linear data structure has a single level, a single run suffices to traverse each data item. The data items in a non-linear data structure cannot be accessed in a single run; multiple runs are required.

Arrangement: In a linear structure, each data item is attached to its previous and next items. In a non-linear structure, an item can be attached to many other items.

Levels: A linear data structure contains no hierarchy; all data elements are organized on a single level. In a non-linear structure, the data elements are arranged on multiple levels.

Memory utilization: In linear structures, memory utilization is not efficient. In non-linear structures, memory is utilized in a very efficient manner.

Time complexity: The time complexity of operations on a linear data structure increases with the input size, while for a non-linear data structure it often grows much more slowly (e.g., logarithmically) as the input size increases.

Applications: Linear data structures are mainly used in general software development. Non-linear data structures are used in image processing and Artificial Intelligence.
Trees

Trees

• A tree data structure is a hierarchical structure that is used to represent and


organize data in a way that is easy to navigate and search.
• It is a collection of nodes that are connected by edges and has a hierarchical
relationship between the nodes.
• The topmost node of the tree is called the root, and the nodes below it are called
the child nodes.
• Each node can have multiple child nodes, and these child nodes can also have their
own child nodes, forming a recursive structure.

Basic Terminologies In Tree Data Structure:
1. Parent Node: The node which is a predecessor of a
node is called the parent node of that node. {B} is the
parent node of {D, E}.

2. Child Node: The node which is the immediate


successor of a node is called the child node of that
node. Examples: {D, E} are the child nodes of {B}.

3. Root Node: The topmost node of a tree or the node


which does not have any parent node is called the root
node. {A} is the root node of the tree. A non-empty
tree must contain exactly one root node

4. Leaf Node or External Node: The nodes which do not


have any child nodes are called leaf nodes. {K, L, M, N,
O, P} are the leaf nodes of the tree.

5. Ancestor of a Node: Any predecessor nodes on the


path of the root to that node are called Ancestors of
that node. {A, B} are the ancestor nodes of the node {E}.
Basic Terminologies In Tree Data Structure:
6. Descendant: Any successor node on the path from
that node down to a leaf node. {E, I} are the
descendants of the node {B}.

7. Sibling: Children of the same parent node are called


siblings. {D,E} are called siblings.

8. Level of a node: Distance of any node from the root is


called as level. The root node has level 0.

9. Neighbour of a Node: Parent or child nodes of that


node are called neighbors of that node.

10. Subtree: Any node of the tree along with its


descendant.

11. Height: The height of the binary tree is the longest


path from root node to any leaf node in the tree.
Binary Tree
• In a binary tree, each node can have at most two children.
• The name binary itself suggests 'two'; therefore, each node can have either 0, 1
or 2 children.

• Node 1 contains two pointers, i.e., a left and a right pointer pointing to the left and
right child respectively.
• Node 2 also contains both children (left and right); therefore, it has two
pointers (left and right). The nodes 3, 5 and 6 are leaf nodes, so all these nodes
contain a NULL pointer in both their left and right parts.
Properties of Binary Tree
1. At each level i, the maximum number of nodes is 2^i.
2. The height of the tree is defined as the length of the longest path from the root node
to a leaf node. The example tree has height 3, so the maximum number of nodes is
(1+2+4+8) = 15. In general, the maximum number of nodes possible in a tree of
height h is (2^0 + 2^1 + 2^2 + … + 2^h) = 2^(h+1) − 1.

3. The minimum number of nodes possible in a tree of height h is h+1.

Properties of Binary Tree

4. Total Number of leaf nodes in a Binary Tree = Total Number of nodes with 2
children + 1

5. If the number of nodes is minimum, then the height of the tree would be
maximum. Conversely, if the number of nodes is maximum, then the height of the
tree would be minimum.

6. In a non-empty binary tree, if n is the total number of nodes and e is the total number
of edges, then e = n − 1.

Tree Traversal-
• Tree Traversal refers to the process of visiting each node in a tree data
structure exactly once.

1. Preorder Traversal-

Algorithm-

1.Visit the root


2.Traverse the left sub tree i.e. call Preorder (left sub tree)
3.Traverse the right sub tree i.e. call Preorder (right sub tree)

Root → Left → Right


Preorder Traversal Shortcut: traverse the entire tree starting from the root node, keeping
yourself to the left. For the example tree with root A, children B and C, and leaf nodes
D, E, F and G, this gives the order A, B, D, E, C, F, G.
2. Inorder Traversal-

Algorithm-

• Traverse the left sub tree i.e. call Inorder (left sub tree)
• Visit the root
• Traverse the right sub tree i.e. call Inorder (right sub tree)

Left → Root → Right


2. Inorder Traversal-

Inorder Traversal Shortcut: keep a plane mirror horizontally at the bottom of the tree
and take the projection of all the nodes from left to right. For the same example tree,
this gives the order D, B, E, A, F, C, G.

3. Postorder Traversal-

1. Traverse the left sub tree i.e. call Postorder (left sub tree)
2. Traverse the right sub tree i.e. call Postorder (right sub tree)
3. Visit the root

Left → Right → Root


3. Postorder Traversal-

Postorder Traversal Shortcut: pluck the leftmost leaf nodes one by one. For the same
example tree, this gives the order D, E, B, F, G, C, A.

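The three recursive traversals above can be sketched in Python. The `Node` class is a minimal assumed node type, and the tree built below is the example tree with root A, children B and C, and leaves D, E, F, G:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def preorder(n, out):   # Root -> Left -> Right
    if n:
        out.append(n.key); preorder(n.left, out); preorder(n.right, out)

def inorder(n, out):    # Left -> Root -> Right
    if n:
        inorder(n.left, out); out.append(n.key); inorder(n.right, out)

def postorder(n, out):  # Left -> Right -> Root
    if n:
        postorder(n.left, out); postorder(n.right, out); out.append(n.key)

# Example tree: A(B(D, E), C(F, G))
root = Node('A', Node('B', Node('D'), Node('E')),
                 Node('C', Node('F'), Node('G')))
pre, ino, post = [], [], []
preorder(root, pre); inorder(root, ino); postorder(root, post)
print(pre)   # ['A', 'B', 'D', 'E', 'C', 'F', 'G']
print(ino)   # ['D', 'B', 'E', 'A', 'F', 'C', 'G']
print(post)  # ['D', 'E', 'B', 'F', 'G', 'C', 'A']
```

The three orders match the shortcuts described on the previous slides.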
In-order, Pre-order, and Post-order Traversal
Without Recursion

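As a sketch of the non-recursive approach, an explicit stack can replace the call stack; in-order traversal is shown here (the same idea extends to pre-order, and to post-order with a second stack). The `Node` class is an assumed minimal node type:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def inorder_iterative(root):
    """In-order traversal using an explicit stack instead of recursion."""
    out, stack, cur = [], [], root
    while cur or stack:
        while cur:               # push the whole left spine
            stack.append(cur)
            cur = cur.left
        cur = stack.pop()        # leftmost unvisited node
        out.append(cur.key)
        cur = cur.right          # then walk its right subtree
    return out

tree = Node(2, Node(1), Node(3))
result = inorder_iterative(tree)
print(result)  # [1, 2, 3]
```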
Binary Search Tree
Binary Search Tree is a node-based binary tree data structure which has the following
properties:

1. The left subtree of a node contains only nodes with keys lesser than the node’s key.
2. The right subtree of a node contains only nodes with keys greater than the node’s key.
3. The left and right subtree each must also be a binary search tree.

Why do we need a Binary Search Tree?

Due to parent-child relations, the algorithm knows in which location of the tree the elements
need to be searched. This decreases the number of key-value comparisons the program has
to make to locate the desired element.

The algorithm efficiently supports operations like search, insert, and delete.

BST is commonly utilized to implement complex searches, robust game logic, auto-complete
features, and graphics.

BST primarily offers the following three types of operations:
1. Search: searches for an element in the binary search tree
2. Insert: adds an element to the binary search tree
3. Delete: deletes an element from the binary search tree

Search Operation

Always start the analysis at the root node, then move to either the left or right subtree of
the root depending on whether the element to be located is less than or greater than the
root.

The element to be searched is 10


1. Compare the element with the root node 12, 10 <
12, hence you move to the left subtree. No need to
analyze the right-subtree.
2. Now compare 10 with node 7, 10 > 7, so move to the
right-subtree
3. Then compare 10 with the next node, which is 9, 10
> 9, look in the right subtree child
4. 10 matches with the value in the node, 10 = 10,
return the value to the user.

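The search walk just described can be sketched as follows. The `Node` class is an assumed minimal node type, and the tree reproduces the example (12 at the root, with 7, 5, 9, 10 and 19 placed by the BST rules):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def bst_search(node, target):
    """Walk down from the root, going left or right by comparison."""
    while node:
        if target == node.key:
            return node
        node = node.left if target < node.key else node.right
    return None  # not present

# Example BST: 12 -> (7 -> (5, 9 -> (_, 10)), 19)
root = Node(12, Node(7, Node(5), Node(9, None, Node(10))), Node(19))
found = bst_search(root, 10)
print(found.key if found else None)  # 10
```

The comparisons visited on the way down (12, 7, 9, 10) are exactly the four steps listed above.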
Insert Operation

This is a very straightforward operation. First, the root node is inserted; then each next value
is compared with the root node. If the value is greater than the root, it is added to the right
subtree, and if it is less than the root, it is added to the left subtree.

1. Insert 12 as the root node, then compare the
next values 7 and 9 and insert them accordingly
into the left subtree (7 < 12, and 9 > 7, so 9
becomes the right child of 7).

2. Compare the remaining values 19, 5, and


10 with the root node 12 and place them
accordingly. 19 > 12 place it as the right
child of 12, 5 < 12 & 5 < 7, hence place it
as left child of 7. Now compare 10, 10 is <
12 & 10 is > 7 & 10 is > 9, place 10 as right
subtree of 9.
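The insertion procedure can be sketched as a short recursive function; inserting the example keys in order reproduces the tree above (`Node` is an assumed minimal node type):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def bst_insert(root, key):
    """Insert key by descending: smaller goes left, larger goes right."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root  # duplicates are ignored

root = None
for k in (12, 7, 9, 19, 5, 10):
    root = bst_insert(root, k)
# 12 is the root, 19 its right child, and 10 the right child of 9
print(root.key, root.right.key, root.left.right.right.key)  # 12 19 10
```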
Delete Operation

1. Case 1- Node with zero children: this is the easiest situation, you just need to delete the
node which has no further children on the right or left.
2. Case 2 – Node with one child: once you delete the node, simply connect its child node
with the parent node of the deleted value.
3. Case 3 Node with two children: this is the most difficult situation, and it works on the
following two rules
1. 3a – In Order Predecessor (left subtree rightmost node): you need to delete the
node with two children and replace it with the largest value on the left-subtree of
the deleted node
2. 3b – In Order Successor (right subtree leftmost node): you need to delete the node
with two children and replace it with the smallest value in the right subtree of the
deleted node

Delete Operation

1. The node to be deleted is a leaf node


• Delete the value 19 and remove the link from the node.
• View the new structure of the BST without 19

Delete Operation

Case 2 – Node with one child: once you delete the node, simply connect its child node with
the parent node of the deleted value.

• Delete the node 9 and replace it with its


child 10 and add a link from 7 to 10
• View the new structure of the BST without
9

Delete Operation

3a – In Order Predecessor: You need to delete the node with two children and replace it
with the largest value on the left-subtree of the deleted node

• The deletion of the node will occur


based upon the in order predecessor
rule, which means that the largest
element on the left subtree of 12 will
replace it.
• Delete the node 12 and replace it with
10 as it is the largest value on the left
subtree
• View the new structure of the BST after
deleting 12

Delete Operation

3b – In Order Successor: you need to delete the node with two children and replace it with
the smallest value in the right subtree of the deleted node

1. Delete a node 12 that has two


children
2. The deletion of the node will occur
based upon the In Order Successor
rule, which means that the smallest
element in the right subtree of 12
will replace it
3. Delete the node 12 and replace it
with 19 as it is the smallest value in
the right subtree
4. View the new structure of the BST
after deleting 12
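All three deletion cases can be sketched in one recursive function; this version uses the in-order successor (case 3b) for the two-children case, and the example deletes 12 from the BST built earlier (`Node` and `insert` are assumed minimal helpers):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def delete(root, key):
    """BST deletion covering all three cases (in-order successor variant)."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    else:
        if root.left is None:       # cases 1 and 2: zero or one child
            return root.right
        if root.right is None:
            return root.left
        succ = root.right           # case 3: in-order successor is the
        while succ.left:            # leftmost node of the right subtree
            succ = succ.left
        root.key = succ.key
        root.right = delete(root.right, succ.key)
    return root

def inorder(n):
    return inorder(n.left) + [n.key] + inorder(n.right) if n else []

root = None
for k in (12, 7, 9, 19, 5, 10):
    root = insert(root, k)
root = delete(root, 12)             # two-children case: 19 replaces 12
print(root.key, inorder(root))  # 19 [5, 7, 9, 10, 19]
```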
Complexities of Binary Tree

Space Complexity:
• The space complexity of a binary tree is O(n), where n is the
number of nodes in the tree.
• This is because each node requires memory to store its data
and references to its left and right children.

Time Complexity

• Searching: For searching element 2, we have to traverse all


elements (assuming we do breadth first traversal). Therefore,
searching in binary tree has worst case complexity of O(N).

• Insertion: insertion in a binary tree has worst case complexity of O(N).

• Deletion: for deletion of element 2, we have to traverse all elements to
find 2 (assuming we do breadth first traversal). Therefore,
deletion in a binary tree has worst case complexity of O(N).
Complexities of Binary Search Tree
Space Complexity:
• The space complexity of a binary tree is O(n), where n is the number of nodes in
the tree.
• This is because each node requires memory to store its data and references to
its left and right children.

Time Complexity

• Searching: for searching element 1, we have to traverse all elements (in order
3, 2, 1). Therefore, searching in a binary search tree has worst case complexity
of O(n). In general, the time complexity is O(log n) for a balanced tree.
• Insertion: for inserting element 0, it must be inserted as the left child of 1.
Therefore, we need to traverse all elements (in order 3, 2, 1) to insert 0, which
has worst-case complexity of O(n). In general, the time complexity is
O(log n) for a balanced tree.
• Deletion: for deletion of element 1, we have to traverse all elements to find 1
(in order 3, 2, 1). Therefore, deletion in a binary search tree has worst case
complexity of O(n). In general, the time complexity is O(h), where h is the
height of the tree.
Height Balanced Trees (AVL)

• AVL Tree, named after its inventors Adelson-Velsky and Landis who published it in their
1962 paper “An algorithm for the organization of information”.

• It is a special variation of Binary Search Tree which exhibits self-balancing property i.e.,
AVL Trees automatically attain the minimal possible height of the tree after the execution
of any operation.

• AVL Trees were developed to achieve logarithmic time complexity in BSTs irrespective of
the order (Ascending or Descending) in which the elements were inserted.

• For all nodes, the subtrees’ height difference should be at most 1 i.e. The value of
balance factor should always be -1, 0 or +1.

balance_factor = (Height of Left sub-tree) - (Height of right sub-tree)

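The balance-factor definition above can be sketched directly; heights are measured in edges (an empty subtree has height −1), and `Node` is an assumed minimal node type. This checks the AVL criterion rather than performing rotations:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(n):
    """Height in edges; an empty subtree has height -1."""
    return -1 if n is None else 1 + max(height(n.left), height(n.right))

def balance_factor(n):
    # balance_factor = (height of left sub-tree) - (height of right sub-tree)
    return height(n.left) - height(n.right)

def is_avl(n):
    """Every node's balance factor must be -1, 0 or +1."""
    if n is None:
        return True
    return abs(balance_factor(n)) <= 1 and is_avl(n.left) and is_avl(n.right)

balanced = Node(2, Node(1), Node(3))
left_heavy = Node(3, Node(2, Node(1)))   # 1 <- 2 <- 3 chain: needs rebalancing
print(is_avl(balanced), balance_factor(left_heavy))  # True 2
```

A balance factor of +2 on the chain signals exactly the LL case handled by the rotations that follow.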
Tree Rotations:

• It is the process of changing the structure of the tree by moving smaller subtrees down and larger subtrees
up, without interfering with the order of the elements.

• If the balance factor of any node doesn't follow the AVL Balancing criterion, the AVL Trees make use of 4
different types of Tree rotations to re-balance themselves.

1. LL Rotation

An LL imbalance arises when a node is inserted into the left subtree of the left child:
node A is unbalanced, with left child B (subtrees BL and BR) and right subtree AR. A single
right rotation fixes it: B becomes the new subtree root with left subtree BL, A becomes
B's right child, and B's former right subtree BR becomes A's left subtree.
2. RR Rotation

An RR imbalance arises when a node is inserted into the right subtree of the right child:
node A is unbalanced, with left subtree AL and right child B (subtrees BL and BR). A single
left rotation fixes it: B becomes the new subtree root with right subtree BR, A becomes
B's left child, and B's former left subtree BL becomes A's right subtree.
3. LR Rotation (RR + LL)

An LR imbalance arises when a new node is inserted into the right subtree of the left
child: A's left child is B, and the new node lands in B's right subtree, rooted at C.
Rebalancing takes two steps: first an RR (left) rotation at B makes C the left child of A,
with B as C's left child; then an LL (right) rotation at A makes C the new subtree root,
with B as its left child and A as its right child. C's original subtrees CL and CR end up
as B's right subtree and A's left subtree, respectively.
4. RL Rotation (LL + RR)

An RL imbalance arises when a new node is inserted into the left subtree of the right
child: A's right child is B, and the new node lands in B's left subtree, rooted at C.
First an LL (right) rotation at B makes C the right child of A, with B as C's right child;
then an RR (left) rotation at A makes C the new subtree root, with A as its left child and
B as its right child. C's original subtrees CL and CR end up as A's right subtree and B's
left subtree, respectively.
Deletion in AVL Trees

Step 1: Find the element in the tree.

Step 2: Delete the node, as per the BST Deletion.

Step 3: Two cases are possible:-

Case 1: Deleting from the right subtree.

1A. If BF(node) = +2 and BF(node -> left-child) = +1, perform LL rotation.


1B. If BF(node) = +2 and BF(node -> left-child) = -1, perform LR rotation.
1C. If BF(node) = +2 and BF(node -> left-child) = 0, perform LL rotation.

Deletion in AVL Trees

Step 1: Find the element in the tree.

Step 2: Delete the node, as per the BST Deletion.

Step 3: Two cases are possible:-

Case 2: Deleting from left subtree.


2A. If BF(node) = -2 and BF(node -> right-child) = -1, perform RR rotation.
2B. If BF(node) = -2 and BF(node -> right-child) = +1, perform RL rotation.
2C. If BF(node) = -2 and BF(node -> right-child) = 0, perform RR rotation.

Heap Data structure

Heap

• A binary heap is a Binary Tree with the following two properties-

• Elements in the heap tree are arranged in a specific order; this gives rise to two types
of heaps: min heap and max heap.
• A binary heap is an almost complete binary tree: it has all its levels completely filled
except possibly the last level, and the last level is strictly filled from left to right.
Nearly complete binary tree

• Every level except bottom is complete.


• On the bottom, nodes are placed as left as possible.

For example, a tree whose bottom-level nodes are packed as far left as possible is a
nearly complete binary tree; if a bottom-level node leaves a gap to its left, it is not.
Heap

• The binary heap data structures is an array that can be viewed as a complete binary
tree. Each node of the binary tree corresponds to an element of the array. The array
is completely filled on all levels except possibly lowest.

For example, the complete binary tree with root 19, children 12 and 16, and leaves 1, 4
and 7 corresponds to the array A = [19, 12, 16, 1, 4, 7].
Heap

• The root of the tree is A[1]; given the index i of a node, the indices of its parent,
left child and right child can be computed (the second form applies to 0-based arrays):

PARENT (i)
return floor(i/2)   or   floor((i-1)/2)

LEFT (i)
return 2i   or   (2i+1)

RIGHT (i)
return 2i + 1   or   (2i+2)
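The 0-based forms of these index formulas can be sketched and checked against the example heap:

```python
def parent(i):
    return (i - 1) // 2   # 0-based: floor((i-1)/2)

def left(i):
    return 2 * i + 1      # 0-based: 2i+1

def right(i):
    return 2 * i + 2      # 0-based: 2i+2

# The max-heap [19, 12, 16, 1, 4, 7] from the example:
A = [19, 12, 16, 1, 4, 7]
print(A[left(0)], A[right(0)])   # 12 16  (children of the root 19)
print(A[parent(5)])              # 16     (parent of the element 7)
```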
Types of Heap

• Max Heap
• Has the property A[Parent(i)] ≥ A[i]. Example: the heap with root 19, children 12 and
16, and leaves 1, 4 and 7 is stored as the array A = [19, 12, 16, 1, 4, 7].
Types of Heap

• Min Heap
• Has the property A[Parent(i)] ≤ A[i]. Example: the heap with root 1, children 4 and
16, and leaves 7, 12 and 19 is stored as the array A = [1, 4, 16, 7, 12, 19].
Insertion

Add the new element to the next available position at the lowest level.

1. Restore the max-heap property if violated:
   the general strategy is to percolate up (or bubble up): if the parent of the element is
   smaller than the element, then interchange the parent and child.

OR

2. Restore the min-heap property if violated:
   the general strategy is to percolate up (or bubble up): if the parent of the element is
   larger than the element, then interchange the parent and child.
Example: inserting 17 into the max-heap [19, 12, 16, 1, 4, 7] places it in the next free
position, as the right child of 16. Since 17 > 16, the parent and child are swapped
(percolate up), giving [19, 12, 17, 1, 4, 7, 16] and restoring the heap property.
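The percolate-up strategy can be sketched on a max-heap stored as a 0-based array; the example reproduces the insertion of 17:

```python
def heap_insert(A, key):
    """Append at the next free position, then percolate up (max-heap)."""
    A.append(key)
    i = len(A) - 1
    while i > 0 and A[(i - 1) // 2] < A[i]:   # parent smaller: swap upward
        A[i], A[(i - 1) // 2] = A[(i - 1) // 2], A[i]
        i = (i - 1) // 2

A = [19, 12, 16, 1, 4, 7]
heap_insert(A, 17)      # 17 lands under 16 and swaps up once
print(A)  # [19, 12, 17, 1, 4, 7, 16]
```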
Deletion

• Delete max
• Copy the last number to the root ( overwrite the maximum element stored there ).
• Restore the max heap property by percolate down.

• Delete min
• Copy the last number to the root ( overwrite the minimum element stored there ).
• Restore the min heap property by percolate down.
Heap

Heap Sort: A sorting algorithm that works by first organizing the data to be sorted into a
special type of binary tree called a heap

Procedures on Heap

1. Heapify
2. Build Heap
3. Heap Sort
Heapify

• Heapify picks the largest child key and compares it to the parent key. If the parent key is
larger, heapify quits; otherwise it swaps the parent key with the largest child key,
so that the parent now becomes larger than its children.
Heapify(A, i)
{
largest=i
l  left(i) (2i)
r  right(i) (2i+1)
if l <= heapsize[A] and A[l] > A[i]
then largest l
if r <= heapsize[A] and A[r] > A[largest]
then largest  r
if largest != i
then swap A[i]  A[largest]
Heapify(A, largest)
}
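The pseudocode above can be rendered in Python (a sketch using 0-based indices, so left and right children sit at 2i+1 and 2i+2):

```python
def heapify(A, i, heapsize):
    """Sift A[i] down: swap with the largest child until the
    max-heap property holds (0-based indices)."""
    largest = i
    l, r = 2 * i + 1, 2 * i + 2
    if l < heapsize and A[l] > A[largest]:
        largest = l
    if r < heapsize and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        heapify(A, largest, heapsize)

A = [1, 12, 16, 4, 7, 9]   # the root violates the max-heap property
heapify(A, 0, len(A))
print(A)  # [16, 12, 9, 4, 7, 1]
```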
Build Heap

• We can use the procedure 'Heapify' in a bottom-up fashion to convert an array A[1 . . n]
into a heap.
• Since the elements in the subarray A[n/2 +1 . . n] are all leaves, the procedure
BUILD_HEAP goes through the remaining nodes of the tree and runs 'Heapify' on each
one.
• The bottom-up order of processing node guarantees that the subtree rooted at
children are heap before 'Heapify' is run at their parent.

Buildheap(A)
{
heapsize[A] length[A]
for i  length[A]/2 //down to 1
do Heapify(A, i)
}
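Under the same 0-based conventions, Build Heap can be sketched as follows (the Heapify helper is repeated so the snippet is self-contained; the input is the classic example array):

```python
def heapify(A, i, heapsize):
    largest = i
    l, r = 2 * i + 1, 2 * i + 2
    if l < heapsize and A[l] > A[largest]:
        largest = l
    if r < heapsize and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        heapify(A, largest, heapsize)

def build_heap(A):
    """Heapify every internal node, from the last one up to the root.
    Nodes from len(A)//2 onward are leaves and need no work."""
    for i in range(len(A) // 2 - 1, -1, -1):
        heapify(A, i, len(A))

A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_heap(A)
print(A)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

The bottom-up order guarantees that both subtrees of node i are already heaps when Heapify runs at i.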
Heap Sort

• The heap sort algorithm starts by using procedure BUILD-HEAP to build a heap on the
input array A[1 . . n].
• Since the maximum element of the array stored at the root A[1], it can be put into its
correct final position by exchanging it with A[n] (the last element in A).
• If we now discard node n from the heap than the remaining elements can be made into
heap.
• Note that the new element at the root may violate the heap property. All that is
needed to restore the heap property is one call to Heapify on the root.
Heapsort(A)
{
Buildheap(A)
for i  length[A] //down to 2
do swap A[1]  A[i]
heapsize[A]  heapsize[A] - 1
Heapify(A, 1)
}
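Putting the pieces together, the full heap sort can be sketched in 0-based Python:

```python
def heapify(A, i, heapsize):
    largest = i
    l, r = 2 * i + 1, 2 * i + 2
    if l < heapsize and A[l] > A[largest]:
        largest = l
    if r < heapsize and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        heapify(A, largest, heapsize)

def heapsort(A):
    """Build a max-heap, then repeatedly move the maximum to the end."""
    for i in range(len(A) // 2 - 1, -1, -1):   # build heap
        heapify(A, i, len(A))
    for end in range(len(A) - 1, 0, -1):
        A[0], A[end] = A[end], A[0]            # max goes to its final slot
        heapify(A, 0, end)                     # restore heap on the prefix

A = [19, 12, 16, 1, 4, 7]
heapsort(A)
print(A)  # [1, 4, 7, 12, 16, 19]
```

Sorting in place like this is what gives heap sort its O(1) auxiliary space, discussed later in the comparison with quick sort and merge sort.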
(Figure: BUILD-HEAP traced step by step on the array [4, 1, 3, 2, 16, 9, 10, 14, 8, 7].
Heapify is applied to each internal node, from the last one up to the root, producing the
max-heap [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]. A second figure shows an element percolating
up to the root via repeated swaps during insertion.)
Heap Sort

• The heapsort algorithm consists of two phases:


- build a heap from an arbitrary array
- use the heap to sort the data

• To sort the elements in the decreasing order, use a min heap


• To sort the elements in the increasing order, use a max heap

(The max-heap used in the example below: root 19, children 12 and 16, leaves 1, 4 and 7;
A = [19, 12, 16, 1, 4, 7].)
Example of Heap Sort

Starting from the max-heap A = [19, 12, 16, 1, 4, 7]:
1. Take out the biggest element (19), move the last element (7) to the root and
   HEAPIFY(): the heap becomes [16, 12, 7, 1, 4]. Sorted: 19
2. Take out 16, move 4 to the root and HEAPIFY(): [12, 4, 7, 1]. Sorted: 16, 19
3. Take out 12, move 1 to the root and HEAPIFY(): [7, 4, 1]. Sorted: 12, 16, 19
4. Take out 7, move 1 to the root and HEAPIFY(): [4, 1]. Sorted: 7, 12, 16, 19
5. Take out 4; 1 remains as the root. Sorted: 4, 7, 12, 16, 19
6. Take out 1. Sorted: 1, 4, 7, 12, 16, 19
Time Analysis Heap Sort

• The Build Heap algorithm runs in O(n) time.

• There are n−1 calls to Heapify, and each call requires O(log n) time.
• The heap sort program combines the Build Heap program and Heapify; therefore it has a
running time of O(n log n).
• Total time complexity: O(n log n)
Comparison with Quick Sort and Merge Sort

• Quick sort is typically somewhat faster, due to better cache behavior and other
factors, but the worst-case running time for quick sort is O(n²), which is
unacceptable for large data sets and can be deliberately triggered given enough
knowledge of the implementation, creating a security risk.

• The quick sort algorithm also requires Ω (log n) extra storage space, making it
not a strictly in-place algorithm. This typically does not pose a problem except
on the smallest embedded systems, or on systems where memory allocation is
highly restricted. Constant space (in-place) variants of quick sort are possible to
construct, but are rarely used in practice due to their extra complexity.
Comparison with Quick Sort and Merge Sort

• Thus, because of the O(n log n) upper bound on heap sort’s running time and
constant upper bound on its auxiliary storage, embedded systems with real-time
constraints or systems concerned with security often use heap sort.

• Heap sort also competes with merge sort, which has the same time bounds, but
requires Ω(n) auxiliary space, whereas heap sort requires only a constant
amount. Heap sort also typically runs more quickly in practice. However, merge
sort is simpler to understand than heap sort, is a stable sort, parallelizes better,
and can be easily adapted to operate on linked lists and very large lists stored on
slow-to-access media such as disk storage or network attached storage. Heap
sort shares none of these benefits; in particular, it relies strongly on random
access.
Possible Application

• When we want to know the task that carry the highest priority
given a large number of things to do

• Interval scheduling, when we have a list of tasks with start
and finish times and we want to do as many tasks as possible

• Sorting a list of elements that needs an efficient sorting
algorithm
Conclusion

• The primary advantage of the heap sort is its efficiency. The execution time efficiency of
the heap sort is O(n log n). The memory efficiency of the heap sort, unlike the other n log
n sorts, is constant, O(1), because the heap sort algorithm is not recursive.
• The heap sort algorithm has two major steps. The first major step involves transforming
the complete tree into a heap. The second major step is to perform the actual sort by
extracting the largest element from the root and transforming the remaining tree into a
heap.
Red Black Tree

• The performance of a binary search tree is highly dependent on its shape,
and in the worst case it can degenerate into a linear structure with a
time complexity of O(n).
• "Balanced" binary search trees guarantee an O(lg n) running time.

• Red Black Trees are self-balancing, meaning that the tree adjusts itself
automatically after each insertion or deletion operation.

• It is a Binary search tree with an additional attribute for its nodes: color which can
be red or black
Properties of Red Black Tree:

1. Every node is either red or black

2. The root is black

3. Every leaf which is (NIL) is black

4. For each node, all paths from that node to descendant NIL leaves
contain the same number of black nodes

5. If a node is red, then both its children are black

• i.e., there are no two consecutive red nodes on a simple path from the root to a leaf
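These properties can be checked mechanically. A sketch follows, assuming a minimal `Node` class with a `color` field ('R' or 'B'); the tree checked is the final red-black tree from the insertion problem later in this module:

```python
class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color   # color: 'R' or 'B'
        self.left, self.right = left, right

def check(n):
    """Return the black-height of subtree n, or -1 on a violation.
    NIL leaves count as black (property 3)."""
    if n is None:
        return 1
    if n.color == 'R':                       # property 5: a red node has
        for c in (n.left, n.right):          # only black children
            if c is not None and c.color == 'R':
                return -1
    lh, rh = check(n.left), check(n.right)
    if lh == -1 or rh == -1 or lh != rh:     # property 4: equal black-heights
        return -1
    return lh + (1 if n.color == 'B' else 0)

def is_red_black(root):
    return root is not None and root.color == 'B' and check(root) != -1

# 4(2R(1, 3), 10R(7(6R), 14(12R))) -- the worked insertion answer
t = Node(4, 'B',
         Node(2, 'R', Node(1, 'B'), Node(3, 'B')),
         Node(10, 'R',
              Node(7, 'B', Node(6, 'R')),
              Node(14, 'B', Node(12, 'R'))))
print(is_red_black(t))  # True
```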
Properties of Red Black Tree:

• The root node is BLACK,
• there are no two adjacent RED nodes, and the same number of BLACK nodes is present on
every path from any node to its descendant NULL nodes.
Red Black Tree

• Search: This operation involves traversing the tree from the root to the leaf nodes,
following the correct path based on whether the key is less than or greater than the
node's key. Since Red-Black trees are balanced, the time complexity of search operation is
O(log n).

• Insertion: When a new node is inserted, the tree might become unbalanced, violating the
Red-Black tree properties. Therefore, the tree is restructured and recolored via rotation
operations to restore balance. Despite these additional steps, the time complexity of
insertion remains O(log n).

• Deletion: Similar to insertion, deleting a node might disrupt the tree balance. After the
node is removed, rotations and recoloring are performed to maintain the Red-Black tree
properties. The time complexity of deletion is also O(log n).
Insertion Red Black Tree

• In the Red-Black tree, we use two tools to do the balancing.


1. Recoloring
2. Rotation
RBT
1. Perform standard BST insertion and make the color of newly inserted nodes as RED.
2. If x is the root, change the color of x as BLACK
3. If the parent is black then no need of change
4. Do the following if the color of x’s parent is RED and x is not the root.
a) If x’s uncle is RED (Grandparent must have been black from property 4)
(i) Change the colour of parent and uncle as BLACK.
(ii) Colour of a grandparent as RED.
(iii) Change x = x’s grandparent, repeat steps 2 and 3 for new x.
b) If x’s uncle is BLACK or NULL, then there can be four configurations for x, x’s parent (p)
and x’s grandparent (g) (This is similar to AVL Tree)
(i) Left Left Case (p is left child of g and x is left child of p)
(ii) Left Right Case (p is left child of g and x is the right child of p)
(iii) Right Right Case (Mirror of case i)
(iv) Right Left Case (Mirror of case ii)
Red Black Tree
Creating a red-black tree with elements 3, 21, 32 and 15 in an empty tree.
When the first element is inserted it is inserted as a root node and as root node has black colour so it
acquires the colour black.
Multiway (m-Way) Search Trees

• The m-way search trees are multi-way trees which are generalized versions of binary trees
where each node contains multiple elements.

• Each node can point to at most m subtree.

• In an m-Way tree of order m, each node contains a maximum of m – 1 elements and m


children.
Problem
• Insert the following numbers (one by one) into a Red Black Tree. Draw the tree after each
insertion. Place an ‘R’ or circle red nodes.

• 1,2,3,4,10,14,7,6,12
Answer

• Here is the final tree.


4
/ \
2R 10R
/ \ / \
1 3 7 14
/ /
6R 12R
Recall the rules for BST deletion

1. If vertex to be deleted is a leaf, just delete it.


2. If the vertex to be deleted has just one child, replace it with that child
3. If the vertex to be deleted has two children, replace its value with its
in-order predecessor's value, then delete the in-order predecessor
(a recursive step)
What can go wrong?

1. If the deleted node is red?

Not a problem – no RB properties are violated

2. If the deleted node is black?

If the node is not the root, deleting it will change the black-height
along some path
Deletion Steps

Let v be the node to be deleted and u be the child that replaces v.


Simple Case: If either u or v is red, mark the replacement node u as black; the
black-height along the path is unchanged.
Deletion Steps

3) If Both u and v are Black.


3.1) Color u as double black. Now our task reduces to convert this double black to
single black. Note that If v is leaf, then u is NULL and color of NULL is considered black.
So the deletion of a black leaf also causes a double black.
Deletion Steps

3.2) Do following while the current node u is double black, and it is not the root. Let sibling
of node be s.
Deletion Steps

3.2) Do following while the current node u is double black, and it is not the root. Let sibling
of node be s.
(a) If sibling s is black and at least one of sibling’s children is red, perform rotation(s).
Let the red child of s be r. This case can be divided in four subcases depending upon
positions of s and r.
(i) Left Left Case (s is left child of its parent and r is left child of s or both children of s are red). This is
mirror of right right case shown in below diagram.
(ii) Left Right Case (s is left child of its parent and r is right child). This is mirror of right left case shown in
below diagram.
(iii) Right Right Case (s is right child of its parent and r is right child of s or both children of s are red)
(iv) Right Left Case (s is right child of its parent and r is left child of s)
Deletion Steps

(b): If sibling is black and its both children are black, perform recoloring, and recur for the parent if parent is
black
In this case, if the parent was red, we do not need to recur for the parent; we can simply make it black (red +
double black = single black)

(c): If sibling is red, perform a rotation to move old sibling up, recolor the old sibling and parent. The new
sibling is always black (See the below diagram). This mainly converts the tree to black sibling case (by rotation)
and leads to case (a) or (b). This case can be divided in two subcases.
(i) Left Case (s is left child of its parent). This is the mirror of the right case shown in the diagram below. We
right rotate the parent p.
(ii) Right Case (s is right child of its parent). We left rotate the parent p.

If u is root, make it single black and return (Black height of complete tree reduces by 1).
Deletion Steps
Splay Tree

• Splay tree is a self-adjusting binary search tree data structure, which means that the tree
structure is adjusted dynamically based on the accessed or inserted elements.

• The splay tree was first introduced by Daniel Dominic Sleator and Robert Endre Tarjan in
1985.

• The basic idea behind splay trees is to bring the most recently accessed or inserted
element to the root of the tree by performing a sequence of tree rotations, called
splaying.

• Splaying is a process of restructuring the tree by making the most recently accessed or
inserted element the new root and gradually moving the remaining nodes closer to the
root.
Splay Tree

• A splay tree is a self-balancing binary search tree, designed for efficient access to data
elements based on their key values.
Splay Tree

• Splay trees are highly efficient in practice due to their self-adjusting nature, which
reduces the overall access time for frequently accessed elements.

• This makes them a good choice for applications that require fast and dynamic data
structures, such as caching systems, data compression, and network routing algorithms.
Splay Tree

• Operations in a splay tree:

• Insertion: To insert a new element into the tree, start by performing a regular binary search
tree insertion. Then, apply rotations to bring the newly inserted element to the root of the
tree.
• Deletion: To delete an element from the tree, first locate it using a binary search tree
search. Then, if the element has no children, simply remove it. If it has one child, promote
that child to its position in the tree. If it has two children, find the successor of the element
(the smallest element in its right subtree), swap its key with the element to be deleted, and
delete the successor instead.
• Search: To search for an element in the tree, start by performing a binary search tree
search. If the element is found, apply rotations to bring it to the root of the tree. If it is not
found, apply rotations to the last node visited in the search, which becomes the new root.
• Rotation: Splaying uses three rotation patterns — Zig (a single rotation when the node is a
child of the root), Zig-Zig (two rotations in the same direction when node and parent are
same-side children), and Zig-Zag (two rotations in opposite directions), plus their mirrors.
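The search-with-splaying operation above can be sketched as a recursive splay function. This is a minimal illustration, not the authors' code; the `Node` fields and function names are my own, and the recursion handles the zig, zig-zig, and zig-zag cases (and their mirrors) described above:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def rotate_right(y):   # zig: lift y's left child above y
    x = y.left
    y.left = x.right
    x.right = y
    return x

def rotate_left(x):    # zag: lift x's right child above x
    y = x.right
    x.right = y.left
    y.left = x
    return y

def splay(root, key):
    """Bring the node holding `key` (or the last node visited) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:          # key not present: splay last visited node
            return root
        if key < root.left.key:        # zig-zig: two right rotations
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:      # zig-zag: left rotation, then right
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    else:                              # mirror cases: zag, zag-zag, zag-zig
        if root.right is None:
            return root
        if key > root.right.key:
            root.right.right = splay(root.right.right, key)
            root = rotate_left(root)
        elif key < root.right.key:
            root.right.left = splay(root.right.left, key)
            if root.right.left is not None:
                root.right = rotate_right(root.right)
        return root if root.right is None else rotate_left(root)

# Searching splays the accessed key to the root:
root = Node(10)
root.left, root.right = Node(5), Node(15)
root.left.left, root.left.right = Node(2), Node(7)
root = splay(root, 7)
print(root.key)   # 7
```

Note that splaying never violates the binary-search-tree ordering: rotations only restructure the tree, so an in-order traversal yields the same sorted sequence before and after.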
Splay Tree

• Rotations:

1. Zig rotation [Right Rotation]


2. Zig zig [Two Right Rotations]
3. Zag rotation [Left Rotation]
4. Zag zag [Two Left Rotations]
5. Zig zag [Zig followed by Zag]
6. Zag zig [Zag followed by Zig]
1. Zig rotation [Right Rotation] 2. Zig zig [Two Right Rotations]
Splay Tree

Search for 9
Search for 3
3. Zag rotation [Left Rotation] 4. Zag zag [Two Left Rotations]
Splay Tree

Search for 7
Search for 13
5. Zig zag [Zig followed by Zag] 6. Zag zig [Zag followed by Zig]
Splay Tree

Search for 7

Search for 5
Insertion in a Splay Tree
• We normally insert a node in a splay tree and splay it to the root.
Deletion in a Splay Tree
1. Splay the node to be deleted to the root.
2. Split the tree into two trees: Tree1 = root’s left subtree and Tree2 = root’s right subtree, and delete the root node.
3. Splay the maximum node (node having the maximum value) of Tree1.
4. After the splay procedure, make Root2 the right child of Root1 and return Root1.
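The splay-delete procedure above can be sketched as follows. This is a minimal, self-contained illustration (names are my own, not the authors'); note the trick in step 3: splaying the just-deleted key inside Tree1 brings Tree1's maximum to its root, because every key in Tree1 is smaller than the deleted key.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def rotate_right(y):
    x = y.left; y.left = x.right; x.right = y; return x

def rotate_left(x):
    y = x.right; x.right = y.left; y.left = x; return y

def splay(root, key):
    """Bring the node holding `key` (or the last node visited) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None: return root
        if key < root.left.key:                       # zig-zig
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                     # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right: root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    else:                                             # mirror cases
        if root.right is None: return root
        if key > root.right.key:                      # zag-zag
            root.right.right = splay(root.right.right, key)
            root = rotate_left(root)
        elif key < root.right.key:                    # zag-zig
            root.right.left = splay(root.right.left, key)
            if root.right.left: root.right = rotate_right(root.right)
        return root if root.right is None else rotate_left(root)

def delete(root, key):
    if root is None: return None
    root = splay(root, key)                 # step 1: splay the key to the root
    if root.key != key: return root         # key absent: tree unchanged (re-rooted)
    tree1, tree2 = root.left, root.right    # step 2: split, discard the old root
    if tree1 is None: return tree2
    root1 = splay(tree1, key)               # step 3: max of Tree1 becomes Root1
    root1.right = tree2                     # step 4: attach Tree2 on the right
    return root1

def inorder(n):
    return inorder(n.left) + [n.key] + inorder(n.right) if n else []

root = Node(10)
root.left, root.right = Node(5), Node(15)
root.left.left, root.left.right = Node(2), Node(7)
root = delete(root, 10)
print(root.key, inorder(root))   # 7 [2, 5, 7, 15]
```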
B-Tree
• Binary trees can have a maximum of 2 children, while a Multiway Search Tree, commonly
known as an M-way tree, can have a maximum of m children, where m > 2.

• Due to their structure, M-way trees are mainly used in external searching, i.e. in
situations where data is to be retrieved from secondary storage like disk files. Following
is how an m-way tree looks like.

• Each node can contain a maximum of m-1 keys and a minimum of ⌈m/2⌉ - 1 keys.
A very small B tree

Bottom nodes are leaf nodes: all their pointers are NULL
In reality

[Figure: internal layout of a B-tree node — alternating in-tree pointers and key/data-pointer slots. The pictured node holds keys 7 and 16 with two slots unused; its in-tree pointers lead to leaf nodes, whose pointers are all NULL.]
Why use B-Tree
• Here are reasons for using a B-Tree:
• Reduces the number of reads made on the disk
• B Trees can be easily optimized to adjust its size (that is the number of child nodes)
according to the disk size
• It is a specially designed technique for handling a bulky amount of data.
• It is a useful algorithm for databases and file systems.
• A good choice to opt when it comes to reading and writing large blocks of data
B-Tree
• All leaves will be created at the same level.

• A B-Tree is characterized by its degree, also called its “order” (specified by an external
actor, like a programmer), referred to as m.

• Keys in a node’s left subtree are smaller than keys in its right subtree. This means that
the keys are sorted in ascending order from left to right.

• The maximum number of keys a node (root or child) can contain is m − 1; the maximum
number of children is m.
Search
Search 120 in the given B-Tree.
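The slide's figure is not reproduced here, but the multiway search it illustrates can be sketched as follows — scan the keys within a node, and either match the key or descend into the child bracketed by the keys around it. The sample tree below is a hypothetical stand-in for the one on the slide:

```python
class BNode:
    def __init__(self, keys, children=None):
        self.keys = keys                 # sorted keys in this node
        self.children = children or []   # empty list for a leaf

def btree_search(node, key):
    """Return (node, index) where the key lives, or None if absent."""
    if node is None:
        return None
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1                           # skip keys smaller than the target
    if i < len(node.keys) and node.keys[i] == key:
        return node, i                   # found within this node
    if not node.children:                # leaf reached without a match
        return None
    return btree_search(node.children[i], key)

# A hypothetical B-tree containing 120 (stand-in for the slide's figure):
root = BNode([100, 200], [
    BNode([10, 50]),
    BNode([110, 120, 150]),
    BNode([210, 300]),
])
found = btree_search(root, 120)          # descends into the middle child
missing = btree_search(root, 45)         # falls off a leaf: None
```

Each step examines one node (one disk block in external storage), which is why the cost is proportional to the tree height rather than the number of keys.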
B-Tree Example with m = 5

12

2 3 8 13 27

• The root has between 2 and m children and contains a minimum of 1 key.
• Each non-root internal node has between ⌈m/2⌉ and m children, and between
⌈m/2⌉ − 1 and m − 1 keys.
• All external nodes are at the same level. (External nodes are actually
represented by null pointers in implementations.)
Insert 10

12

2 3 8 10 13 27

• We find the location for 10 by following a path from the root using the
stored key values to guide the search.
• The search falls out the tree at the 4th child of the 1st child of the root.
• The 1st child of the root has room for the new element, so we store it
there.
Insert 11

12

2 3 8 10 11 13 27

• We fall out of the tree at the child to the right of key 10.
• But there is no more room in the left child of the root to hold 11.
• Therefore, we must split this node...

Insert 11 (Continued)

8 12

2 3 10 11 13 27

• The m + 1 children are divided evenly between the old and new
nodes.
• The parent gets one new child. (If the parent become overfull, then it,
too, will have to be split).
CSE 373 -- AU 2004 -- B-Trees
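The insert-then-split behavior shown on these slides can be sketched as follows. This is a simplified illustration under my own conventions (not the course's code): a node of order M holds at most M − 1 keys; we insert into the proper leaf, then split any node that becomes overfull, promoting its median key to the parent. M = 3 is used below to keep the splits frequent (the slides use m = 5).

```python
M = 3   # order: at most M children and M - 1 keys per node

class BNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty for leaves

def split(node):
    """Split an overfull node around its median key.
    The node keeps the left half; the median and the new right half move up."""
    mid = len(node.keys) // 2
    median = node.keys[mid]
    right = BNode(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right

def insert(root, key):
    def _ins(node):
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if not node.children:            # leaf: insert in sorted position
            node.keys.insert(i, key)
        else:                            # internal: recurse, then fix overfull child
            _ins(node.children[i])
            child = node.children[i]
            if len(child.keys) > M - 1:
                median, right = split(child)
                node.keys.insert(i, median)
                node.children.insert(i + 1, right)
    _ins(root)
    if len(root.keys) > M - 1:           # overfull root: tree grows one level
        median, right = split(root)
        root = BNode([median], [root, right])   # old root object is the left half
    return root

def inorder(n):
    if not n.children:
        return list(n.keys)
    out = []
    for i, k in enumerate(n.keys):
        out += inorder(n.children[i]) + [k]
    return out + inorder(n.children[-1])

root = BNode()
for k in [10, 20, 30, 40, 50]:
    root = insert(root, k)
print(root.keys, inorder(root))   # [20, 40] [10, 20, 30, 40, 50]
```

As on the slides, splitting the root is the only way a B-tree gains height, which is why all leaves always remain at the same level.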
Remove 8

8 12

2 3 10 11 13 27

• Removing 8 might force us to move another key up from one of
the children. It could either be the 3 from the 1st child or the 10
from the second child.
• However, neither child has a key to spare (each holds the minimum
of 2 keys), so the two nodes will have to be merged. Nothing
moves up.
Remove 8 (Continued)

12

2 3 10 11 13 27

The root contains one fewer key, and has one fewer
child.

Remove 13

12

2 3 10 11 13 27

• Removing 13 would cause the node containing it to become underfull.


• To fix this, we try to reassign one key from a sibling that has spares.

Remove 13 (Cont)

11

2 3 10 12 27

• The 13 is replaced by the parent’s key 12.


• The parent’s key 12 is replaced by the spare key 11 from the left sibling.
• The sibling has one fewer element.



Remove 11

11

2 3 10 12 27

• 11 is in a non-leaf, so replace it by the value immediately preceding: 10.


• 10 is at leaf, and this node has spares, so just delete it there.

Remove 11 (Cont)

10

2 3 12 27

Remove 2

10

2 3 12 27

• Although 2 is at leaf level, removing it leads to an underfull node.


• The node has no left sibling. It does have a right sibling, but that node is
at its minimum occupancy already.
• Therefore, the node must be merged with its right sibling.
Remove 2 (Cont)

3 10 12 27

• The result is illegal, because the root does not have at least 2 children.
• Therefore, we must remove the root, making its child the new root.



Remove 2 (Cont)

3 10 12 27

The new B-tree has only one node, the root.



Insert 49

3 10 12 27

Let’s put an element into this B-tree.



Insert 49 (Cont)

3 10 12 27 49

Adding this key make the node overfull, so it must be split into two.
But this node was the root.
So we must construct a new root, and make these its children.



Insert 49 (Cont)

12

3 10 27 49

The middle key (12) is moved up into the root.


The result is a B-tree with one more level.



B-Tree performance

Let h = height of the B-tree.


get(k): at most h disk accesses. O(h)
put(k): at most 3h + 1 disk accesses. O(h)
remove(k): at most 3h disk accesses. O(h)

h ≤ log_d((n + 1)/2) + 1, where d = ⌈m/2⌉ (Sahni, p. 641).


An important point is that the constant factors are relatively low.
m should be chosen so as to match the maximum node size to
the block size on the disk.
Example: m = 128, d = 64, n ≈ 64³ = 262144, so h ≈ 4.
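The height bound above is easy to evaluate directly. The short sketch below plugs in the slide's example (m = 128, so d = 64, and n = 64³ keys):

```python
import math

def btree_height_bound(n, m):
    """Upper bound on the height of a B-tree with n keys and order m:
    h <= log_d((n + 1) / 2) + 1, where d = ceil(m / 2)."""
    d = math.ceil(m / 2)
    return math.log((n + 1) / 2, d) + 1

# The slide's example: m = 128 gives d = 64, and n = 64**3 = 262144 keys.
bound = btree_height_bound(262144, 128)
print(round(bound, 2))   # about 3.83: roughly four levels of disk accesses
```

A quarter of a million keys can thus be reached in about four disk reads, which is why m is chosen to make a node fill one disk block.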
B+Tree
• Variant of B trees
• Two types of nodes
• Internal nodes have no data pointers
• Leaf nodes have no in-tree pointers (in a B-tree’s leaves these were all null anyway)
B+Tree
• If m is the order of the tree
• Every internal node has at most m children.
• Every internal node (except root) has at least ⌈m ⁄ 2⌉ children.
• The root has at least two children if it is not a leaf node.
• Every leaf has at most m − 1 keys
• An internal node with k children has k − 1 keys.
• All leaves appear in the same level
B+Tree
• We need to use the following data to create the B+ Tree: 1, 4, 7, 10, 17, 21, 31
• We suppose the order (m) of the tree to be 4. The following facts can be deduced from this:
• Max Children = m = 4
• Min Children = ⌈m/2⌉ = 2
• Max Keys = m − 1 = 3
• Min Keys = ⌈m/2⌉ − 1 = 1
• https://www.scaler.com/topics/data-structures/b-plus-trees/
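The bounds deduced above follow mechanically from the order. A small helper (my own naming, under the convention on this slide: at most m children and m − 1 keys per node) reproduces the m = 4 figures:

```python
import math

def bplus_bounds(m):
    """Node bounds for a B+ tree of order m, under the convention used on
    this slide: at most m children and m - 1 keys per node."""
    return {
        "max_children": m,
        "min_children": math.ceil(m / 2),
        "max_keys": m - 1,
        "min_keys": math.ceil(m / 2) - 1,
    }

print(bplus_bounds(4))
# {'max_children': 4, 'min_children': 2, 'max_keys': 3, 'min_keys': 1}
```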
Parameter | B+ Tree | B Tree
Structure | Separate leaf nodes for data storage and internal nodes for indexing | Nodes store both keys and data values
Leaf Nodes | Leaf nodes form a linked list for efficient range-based queries | Leaf nodes do not form a linked list
Order | Higher order (more keys) | Lower order (fewer keys)
Key Duplication | Typically allows key duplication in leaf nodes | Usually does not allow key duplication
Disk Access | Better disk access due to sequential reads in the linked-list structure | More disk I/O due to non-sequential reads in internal nodes
Applications | Database systems, file systems, where range queries are common | In-memory data structures, databases, general-purpose use
Performance | Better performance for range queries and bulk data retrieval | Balanced performance for search, insert, and delete operations
Memory Usage | Requires more memory for internal nodes | Requires less memory as keys and values are stored in the same node
Structure of the internal nodes of a B+ tree

[Figure: an internal node alternates in-tree pointers and keys and carries no data pointers; a leaf node holds key/data-pointer pairs instead.]
B+Tree
• Each internal node is of the form: <P1, K1, P2, K2, ….., Pc-1, Kc-1, Pc> where c ≤ m (the
order of the tree), each Pi is a tree pointer (i.e. points to another node of the tree), and
each Ki is a key value (see diagram I for reference).

• Every internal node has : K1 < K2 < …. < Kc-1

• For each search field values ‘X’ in the sub-tree pointed at by Pi, the following condition
holds : Ki-1 < X <= Ki, for 1 < i < c and, Ki-1 < X, for i = c (See diagram I for reference)
structure of the leaf nodes of a B+ tree
• The structure of the leaf nodes of a B+ tree of order m is as follows:
B+Tree
• Each leaf node is of the form: <<K1, D1>, <K2, D2>, ….., <Kc-1, Dc-1>, Pnext> where c
≤ m and each Di is a data pointer (i.e. points to the actual record on disk whose
key value is Ki, or to a disk file block containing that record), each Ki is a key
value, and Pnext points to the next leaf node in the B+ tree (see diagram II for
reference).
• Every leaf node has: K1 < K2 < …. < Kc-1, c ≤ m
• Each leaf node has at least ⌈m/2⌉ values.
• All leaf nodes are at the same level.
B+Tree

• Using the Pnext pointer it is viable to traverse all the leaf nodes, just like a linked list,
thereby achieving ordered access to the records stored in the disk.
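This linked-list traversal is what makes range queries cheap in a B+ tree. The sketch below (a simplified illustration — the `Leaf` class and leaf contents are my own; data pointers are omitted) walks the Pnext chain to collect all keys in a range, using leaves that would result from the slide's example data 1, 4, 7, 10, 17, 21, 31:

```python
class Leaf:
    def __init__(self, keys, nxt=None):
        self.keys = keys     # sorted keys (the data pointers are omitted here)
        self.next = nxt      # Pnext: link to the next leaf in key order

def range_query(first_leaf, lo, hi):
    """Collect all keys in [lo, hi] by walking the leaf chain left to right.
    In a real B+ tree the starting leaf is found by a root-to-leaf search."""
    out, leaf = [], first_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:              # keys are sorted, so we can stop early
                return out
            if k >= lo:
                out.append(k)
        leaf = leaf.next            # follow Pnext to the neighboring leaf
    return out

# Leaves holding the example keys 1, 4, 7, 10, 17, 21, 31:
l3 = Leaf([17, 21, 31])
l2 = Leaf([7, 10], l3)
l1 = Leaf([1, 4], l2)
print(range_query(l1, 4, 21))   # [4, 7, 10, 17, 21]
```

A B-tree, whose leaves are not linked, would instead need repeated root-to-leaf descents (or an in-order traversal) to answer the same query.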

You might also like