Module 2: Trees
Computer Science and Engineering
IIIT Nagpur
26/11/2024
• A data structure is a technique for storing and organizing data so that the data can be used efficiently.
Linear vs Non-Linear Data Structures

• Basic: Linear: the elements are arranged sequentially and attached one after another. Non-linear: the elements are arranged hierarchically.
• Types: Linear: arrays, linked lists, stacks, and queues. Non-linear: trees and graphs.
• Implementation: Linear: easy to implement due to the sequential organization. Non-linear: more difficult to implement.
• Traversal: Linear: a single level, so a single run visits every data item. Non-linear: the data items cannot be accessed in a single run; traversal requires multiple runs.
• Arrangement: Linear: each data item is attached to its previous and next items. Non-linear: an item can be attached to many other items.
• Levels: Linear: no hierarchy; all data elements are organized on a single level. Non-linear: the data elements are arranged on multiple levels.
• Memory utilization: Linear: memory utilization is often not efficient. Non-linear: memory is utilized in a very efficient manner.
• Time complexity: Linear: the time complexity of operations grows with the input size. Non-linear: the time complexity of operations often remains nearly the same (e.g. logarithmic) as the input size grows.
• Applications: Linear: mainly used in general software development. Non-linear: used in areas such as image processing and Artificial Intelligence.
Trees
Basic Terminologies in Tree Data Structure:
1. Parent Node: the node which is a predecessor of a node is called that node's parent. For example, {B} is the parent node of {D, E}.
• Node 1 contains two pointers, a left and a right pointer, pointing to its left and right child respectively.
• Node 2 has both children (left and right), so it also carries two pointers. Nodes 3, 5, and 6 are leaf nodes, so their left and right pointers are all NULL.
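The linked representation described above can be sketched in Python; the tree built below is one shape consistent with the slide's description (node 1 as root, node 2 with two children, nodes 3, 5, 6 as leaves):

```python
class Node:
    """A binary tree node with left and right child pointers."""
    def __init__(self, data):
        self.data = data
        self.left = None   # NULL pointer until a left child is attached
        self.right = None  # NULL pointer until a right child is attached

# Build a tree matching the description: 1 is the root,
# 2 is an internal node with two children, 3/5/6 are leaves.
root = Node(1)
root.left = Node(2)
root.right = Node(3)
root.left.left = Node(5)
root.left.right = Node(6)
```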
Binary Tree
Properties of Binary Tree
1. At level i, the maximum number of nodes is 2^i.
2. The height of the tree is the length of the longest path from the root node to a leaf node. A tree of height 3 therefore has at most 1 + 2 + 4 + 8 = 15 nodes. In general, the maximum number of nodes possible in a tree of height h is 2^0 + 2^1 + 2^2 + … + 2^h = 2^(h+1) − 1.
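The counting argument above can be checked directly by summing the level capacities (a small sketch):

```python
def max_nodes(h):
    """Maximum number of nodes in a binary tree of height h (root at level 0)."""
    return sum(2 ** i for i in range(h + 1))

# For h = 3 the level capacities are 1 + 2 + 4 + 8 = 15 = 2^4 - 1.
print(max_nodes(3))  # 15
assert all(max_nodes(h) == 2 ** (h + 1) - 1 for h in range(10))
```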
Properties of Binary Tree
4. Total number of leaf nodes in a binary tree = total number of nodes with 2 children + 1.
5. If the number of nodes is minimum, then the height of the tree is maximum; conversely, if the number of nodes is maximum, then the height of the tree is minimum.
6. In a non-empty binary tree, if n is the total number of nodes and e is the total number of edges, then e = n − 1.
Tree Traversal
• Tree traversal is the process of visiting each node in a tree data structure exactly once.
1. Preorder Traversal

Algorithm:
• Visit the root
• Traverse the left subtree, i.e. call Preorder(left subtree)
• Traverse the right subtree, i.e. call Preorder(right subtree)

Intuition: traverse the entire tree starting from the root node, keeping yourself to the left.
2. Inorder Traversal

Algorithm:
• Traverse the left subtree, i.e. call Inorder(left subtree)
• Visit the root
• Traverse the right subtree, i.e. call Inorder(right subtree)

Intuition: keep a plane mirror horizontally at the bottom of the tree and take the projection of all the nodes.
3. Postorder Traversal

Algorithm:
1. Traverse the left subtree, i.e. call Postorder(left subtree)
2. Traverse the right subtree, i.e. call Postorder(right subtree)
3. Visit the root

Intuition: pluck all the leftmost leaf nodes one by one.
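The three recursive traversals can be sketched together as follows; the node labels mirror the D, E, F, G leaves on the slides, with A, B, C as illustrative internal nodes:

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right

def preorder(n, out):
    if n:                                    # root, then left, then right
        out.append(n.data); preorder(n.left, out); preorder(n.right, out)

def inorder(n, out):
    if n:                                    # left, then root, then right
        inorder(n.left, out); out.append(n.data); inorder(n.right, out)

def postorder(n, out):
    if n:                                    # left, then right, then root
        postorder(n.left, out); postorder(n.right, out); out.append(n.data)

# Sample tree:        A
#                   /   \
#                  B     C
#                 / \   / \
#                D   E F   G
root = Node('A', Node('B', Node('D'), Node('E')),
                 Node('C', Node('F'), Node('G')))
```

For this tree, preorder yields A B D E C F G, inorder D B E A F C G, and postorder D E B F G C A.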
In-order, Pre-order, and Post-order Traversal Without Recursion
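As one illustration of recursion-free traversal, in-order can be done with an explicit stack (a sketch; the sample tree is illustrative):

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right

def inorder_iterative(root):
    """In-order traversal using an explicit stack instead of recursion."""
    out, stack, node = [], [], root
    while stack or node:
        while node:                  # push the whole left spine
            stack.append(node)
            node = node.left
        node = stack.pop()           # leftmost unvisited node
        out.append(node.data)
        node = node.right            # then traverse its right subtree
    return out

root = Node('A', Node('B', Node('D'), Node('E')),
                 Node('C', Node('F'), Node('G')))
```

Pre-order and post-order can be written the same way with slightly different stack disciplines.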
Binary Search Tree
A Binary Search Tree is a node-based binary tree data structure with the following properties:
1. The left subtree of a node contains only nodes with keys less than the node's key.
2. The right subtree of a node contains only nodes with keys greater than the node's key.
3. The left and right subtrees must each also be a binary search tree.
Why do we need a Binary Search Tree?
Due to the ordering of keys between parents and children, the algorithm knows in which part of the tree an element must be searched. This decreases the number of key comparisons the program has to make to locate the desired element.
The structure efficiently supports operations like search, insert, and delete.
BSTs are commonly used to implement complex searches, robust game logic, auto-complete features, and graphics.
A BST primarily offers three types of operations:
1. Search: finds an element in the binary search tree
2. Insert: adds an element to the binary search tree
3. Delete: removes an element from the binary search tree
Search Operation
Always start at the root node, then move into either the left or the right subtree depending on whether the element to be located is less than or greater than the current node.
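The search walk can be sketched directly (the example tree and key values are illustrative):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def bst_search(node, key):
    """Start at the root; go left if key is smaller, right if larger."""
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node  # the matching node, or None if the key is absent

# Hypothetical example tree:  8 -> (3, 10), 3 -> (1, 6)
root = Node(8)
root.left, root.right = Node(3), Node(10)
root.left.left, root.left.right = Node(1), Node(6)
```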
Insert Operation
This is a very straightforward operation. The first value inserted becomes the root; each subsequent value is compared with the current node. If the value is greater, it goes into the right subtree, and if it is less, it goes into the left subtree.
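The insertion walk above can be sketched as follows (key values are illustrative; duplicates are ignored in this sketch):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def bst_insert(root, key):
    """Insert key by walking down: left if smaller, right if greater."""
    if root is None:
        return Node(key)           # empty spot found: attach the new node
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root                    # duplicates are ignored

root = None
for k in [8, 3, 10, 1, 6]:
    root = bst_insert(root, k)
```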
Delete Operation
1. Case 1 – node with zero children: this is the easiest situation; simply delete the node, which has no children on the left or right.
2. Case 2 – node with one child: once you delete the node, simply connect its child node to the parent of the deleted node.
3. Case 3 – node with two children: this is the most difficult situation, and it works by one of the following two rules:
   3a – In-order predecessor (rightmost node of the left subtree): replace the deleted node with the largest value in its left subtree.
   3b – In-order successor (leftmost node of the right subtree): replace the deleted node with the smallest value in its right subtree.
Space Complexity:
• The space complexity of a binary tree is O(n), where n is the
number of nodes in the tree.
• This is because each node requires memory to store its data
and references to its left and right children.
Time Complexity
• Searching: to search for element 1 in a left-skewed tree 3 → 2 → 1, we must traverse all elements. Therefore, searching in a binary search tree has a worst-case complexity of O(n); in general (for a balanced tree), the time complexity is O(log n).
• Insertion: to insert element 0, it must be placed as the left child of 1, so we again traverse all elements (3, 2, 1). Insertion therefore has a worst-case complexity of O(n); in general, the time complexity is O(log n).
• Deletion: to delete element 1, we must first find it by traversing all elements (3, 2, 1). Deletion therefore has a worst-case complexity of O(n); in general, the time complexity is O(h), where h is the height of the tree.
Height Balanced Trees (AVL)
• The AVL Tree is named after its inventors Adelson-Velsky and Landis, who published it in their 1962 paper "An algorithm for the organization of information".
• It is a special variation of the Binary Search Tree with a self-balancing property: AVL trees automatically keep the height of the tree close to the minimum possible after the execution of any operation.
• AVL trees were developed to achieve logarithmic time complexity in BSTs irrespective of the order (ascending or descending) in which the elements are inserted.
• For all nodes, the heights of the two subtrees may differ by at most 1, i.e. the value of the balance factor must always be −1, 0, or +1.
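The balance-factor condition can be expressed directly; this is a checking sketch (with the height of an empty subtree taken as −1), not the full AVL insertion algorithm:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(n):
    """Height of a subtree; an empty subtree has height -1."""
    if n is None:
        return -1
    return 1 + max(height(n.left), height(n.right))

def balance_factor(n):
    """height(left) - height(right); must be -1, 0, or +1 in an AVL tree."""
    return height(n.left) - height(n.right)

def is_avl(n):
    if n is None:
        return True
    return abs(balance_factor(n)) <= 1 and is_avl(n.left) and is_avl(n.right)

balanced = Node(2, Node(1), Node(3))
chain = Node(3, Node(2, Node(1)))   # left-left chain: balance factor +2 at the root
```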
Tree Rotations:
• A rotation changes the structure of the tree by moving smaller subtrees down and larger subtrees up, without interfering with the order of the elements.
• If the balance factor of any node violates the AVL balancing criterion, AVL trees use 4 different types of tree rotations to re-balance themselves.
1. LL Rotation
Applied when node A becomes unbalanced because of an insertion into the left subtree of its left child B. A single right rotation is performed at A: B becomes the new root of the subtree, A becomes B's right child, and B's former right subtree BR is reattached as A's left subtree (BL remains B's left subtree and AR remains A's right subtree).
2. RR Rotation
The mirror image of the LL rotation, applied when node A becomes unbalanced because of an insertion into the right subtree of its right child B. A single left rotation is performed at A: B becomes the new root, A becomes B's left child, and B's former left subtree BL is reattached as A's right subtree.
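Both single rotations can be sketched as small pointer rearrangements (these are the primitives that LR and RL compose; node fields are illustrative):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(a):
    """LL rotation: a's left child b becomes the new subtree root;
    b's old right subtree (BR) is reattached as a's left subtree."""
    b = a.left
    a.left = b.right
    b.right = a
    return b

def rotate_left(a):
    """RR rotation: mirror image; a's right child becomes the new root."""
    b = a.right
    a.right = b.left
    b.left = a
    return b

# Left-left imbalance 3 -> 2 -> 1 is fixed by one right rotation at 3.
root = rotate_right(Node(3, Node(2, Node(1))))
```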
3. LR Rotation (RR + LL)
Applied when node A becomes unbalanced because of an insertion into the right subtree of its left child B; the inserted node lands in the subtree rooted at C, B's right child. First an RR (left) rotation is performed at B, then an LL (right) rotation at A. Afterwards C becomes the new root of the subtree, with B as its left child and A as its right child; C's former left subtree CL becomes B's right subtree and C's former right subtree CR becomes A's left subtree.
4. RL Rotation (LL + RR)
The mirror image of the LR rotation, applied when node A becomes unbalanced because of an insertion into the left subtree of its right child B; the inserted node lands in the subtree rooted at C, B's left child. First an LL (right) rotation is performed at B, then an RR (left) rotation at A. Afterwards C becomes the new root, with A as its left child and B as its right child; C's former left subtree CL becomes A's right subtree and C's former right subtree CR becomes B's left subtree.
Deletion in AVL Trees
Deletion proceeds as in an ordinary BST; afterwards, balance factors are recomputed on the path back to the root, and rotations are applied wherever the AVL criterion is violated.
Heap Data Structure

Heap
• A binary heap is an almost complete binary tree: all its levels are completely filled except possibly the last level, and the last level is filled strictly from left to right.
• The elements in the heap tree are arranged in a specific order, which gives rise to two types of heaps: min heap and max heap.
• Nearly complete binary tree: a tree whose lowest level is filled from the left qualifies; a tree with a gap before its rightmost node on the lowest level does not.
Heap
• The binary heap data structure is an array that can be viewed as a complete binary tree. Each node of the binary tree corresponds to an element of the array. The array is completely filled on all levels except possibly the lowest.
• Example: the tree with root 19, children 12 and 16, and leaves 1, 4, 7 corresponds to the array A = [19, 12, 16, 1, 4, 7].
Heap
• If the root of the tree is stored at A[1], then for a node at index i the indices of its parent, left child, and right child can be computed as follows (the second formula in each line applies when the root is stored at A[0]):
PARENT(i) = ⌊i/2⌋ (or ⌊(i−1)/2⌋)
LEFT(i) = 2i (or 2i + 1)
RIGHT(i) = 2i + 1 (or 2i + 2)
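The 0-based versions of these index formulas can be sketched and checked against the example array:

```python
def parent(i): return (i - 1) // 2   # 0-based indexing throughout
def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2

# For A = [19, 12, 16, 1, 4, 7], the children of index 1 (key 12)
# are the keys 1 and 4, and its parent is the root 19.
A = [19, 12, 16, 1, 4, 7]
```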
Types of Heap
• Max Heap: has the property A[Parent(i)] ≥ A[i]. Example: A = [19, 12, 16, 1, 4, 7].
• Min Heap: has the property A[Parent(i)] ≤ A[i]. Example: A = [1, 4, 16, 7, 12, 19].
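The two heap properties can be verified mechanically over the array form (a small sketch using 0-based indices):

```python
def is_max_heap(A):
    """A[parent(i)] >= A[i] for every non-root index i (0-based array)."""
    return all(A[(i - 1) // 2] >= A[i] for i in range(1, len(A)))

def is_min_heap(A):
    """A[parent(i)] <= A[i] for every non-root index i."""
    return all(A[(i - 1) // 2] <= A[i] for i in range(1, len(A)))
```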
Insertion
• Add the new element to the next available position at the lowest level.
• Then percolate up to maintain the heap property: while the new element is larger than its parent, swap the two.
• Example: inserting 17 into a max heap places it at the bottom, after which it swaps upward past smaller ancestors until the heap property holds.
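The percolate-up step can be sketched as follows; the starting heap values are illustrative, and the inserted key 17 follows the slide's example:

```python
def heap_insert(A, key):
    """Append key at the next free position, then percolate it up."""
    A.append(key)
    i = len(A) - 1
    while i > 0 and A[(i - 1) // 2] < A[i]:   # parent smaller: swap up
        A[i], A[(i - 1) // 2] = A[(i - 1) // 2], A[i]
        i = (i - 1) // 2

heap = [16, 12, 7, 1, 4]   # a valid max heap (illustrative values)
heap_insert(heap, 17)       # 17 percolates past 7 and 16 to the root
```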
Deletion
• Delete max: copy the last element to the root (overwriting the maximum element stored there), shrink the heap by one, and restore the max-heap property by percolating down.
• Delete min: the same procedure on a min heap, restoring the min-heap property by percolating down.
Heap
Heap Sort: A sorting algorithm that works by first organizing the data to be sorted into a
special type of binary tree called a heap
Procedures on Heap
1. Heapify
2. Build Heap
3. Heap Sort
Heapify
• Heapify picks the largest child key and compares it to the parent key. If the parent key is larger, heapify stops; otherwise it swaps the parent key with the largest child key, so that the parent becomes larger than its children, and recurses.

Heapify(A, i)
{
    largest ← i
    l ← left(i)      // 2i
    r ← right(i)     // 2i + 1
    if l ≤ heapsize[A] and A[l] > A[i]
        then largest ← l
    if r ≤ heapsize[A] and A[r] > A[largest]
        then largest ← r
    if largest ≠ i
        then swap A[i] ↔ A[largest]
             Heapify(A, largest)
}
Build Heap
• We can use the procedure Heapify in a bottom-up fashion to convert an array A[1 . . n] into a heap.
• Since the elements in the subarray A[⌊n/2⌋ + 1 . . n] are all leaves, the procedure BUILD_HEAP goes through the remaining nodes of the tree and runs Heapify on each one.
• The bottom-up order of processing guarantees that the subtrees rooted at a node's children are already heaps before Heapify is run at that node.

Buildheap(A)
{
    heapsize[A] ← length[A]
    for i ← ⌊length[A]/2⌋ downto 1
        do Heapify(A, i)
}
Heap Sort
• The heap sort algorithm starts by using the procedure BUILD-HEAP to build a max heap on the input array A[1 . . n].
• Since the maximum element of the array is stored at the root A[1], it can be put into its correct final position by exchanging it with A[n] (the last element in A).
• If we now discard node n from the heap, the remaining elements can be made into a heap again.
• Note that the new element at the root may violate the heap property; a single call to Heapify(A, 1) is all that is needed to restore it.

Heapsort(A)
{
    Buildheap(A)
    for i ← length[A] downto 2
        do swap A[1] ↔ A[i]
           heapsize[A] ← heapsize[A] − 1
           Heapify(A, 1)
}
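The three procedures combine into a runnable version; this sketch uses 0-based Python indexing rather than the 1-based pseudocode above:

```python
def heapify(A, i, heapsize):
    """Sift A[i] down until the max-heap property holds (0-based indices)."""
    largest = i
    l, r = 2 * i + 1, 2 * i + 2
    if l < heapsize and A[l] > A[largest]:
        largest = l
    if r < heapsize and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        heapify(A, largest, heapsize)

def build_heap(A):
    # Indices len(A)//2 .. len(A)-1 are leaves; heapify the rest bottom-up.
    for i in range(len(A) // 2 - 1, -1, -1):
        heapify(A, i, len(A))

def heapsort(A):
    build_heap(A)
    for end in range(len(A) - 1, 0, -1):
        A[0], A[end] = A[end], A[0]   # move current maximum to its final slot
        heapify(A, 0, end)            # restore the heap on the shrunk prefix

A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapsort(A)
```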
Worked example: running BUILD-HEAP on A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7] heapifies the internal nodes from ⌊n/2⌋ down to the root and produces the max heap [16, 14, 10, 8, 7, 9, 3, 2, 4, 1].

Worked example (insertion): a new key appended at the last position percolates up, swapping with its parent at each step, until the max-heap property is restored.
Heap Sort
Example of heap sort on the max heap A = [19, 12, 16, 1, 4, 7]:
1. Take out the biggest (19); move the last element (7) to the root; HEAPIFY → heap [16, 12, 7, 1, 4]; sorted: 19
2. Take out 16; move 4 to the root; HEAPIFY → heap [12, 4, 7, 1]; sorted: 16 19
3. Take out 12; move 1 to the root; HEAPIFY → heap [7, 4, 1]; sorted: 12 16 19
4. Take out 7; move 1 to the root; HEAPIFY → heap [4, 1]; sorted: 7 12 16 19
5. Take out 4; move 1 to the root → heap [1]; sorted: 4 7 12 16 19
6. Take out 1; sorted: 1 4 7 12 16 19
Time Analysis of Heap Sort
• Quick sort is typically somewhat faster, due to better cache behavior and other factors, but its worst-case running time is O(n²), which is unacceptable for large data sets and can be deliberately triggered given enough knowledge of the implementation, creating a security risk.
• The quick sort algorithm also requires Ω(log n) extra storage space, making it not a strictly in-place algorithm. This typically does not pose a problem except on the smallest embedded systems or on systems where memory allocation is highly restricted. Constant-space (in-place) variants of quick sort are possible to construct but are rarely used in practice due to their extra complexity.
Comparison with Quick Sort and Merge Sort
• Thus, because of the O(n log n) upper bound on heap sort's running time and the constant upper bound on its auxiliary storage, embedded systems with real-time constraints or systems concerned with security often use heap sort.
• Heap sort also competes with merge sort, which has the same time bounds but requires Ω(n) auxiliary space, whereas heap sort requires only a constant amount. Heap sort also typically runs more quickly in practice. However, merge sort is simpler to understand, is a stable sort, parallelizes better, and can easily be adapted to operate on linked lists and on very large lists stored on slow-to-access media such as disk storage or network-attached storage. Heap sort shares none of these benefits; in particular, it relies strongly on random access.
Possible Applications
• Finding the task that carries the highest priority, given a large number of things to do (a priority queue).
• The primary advantage of heap sort is its efficiency: its execution time is O(n log n), while its memory overhead, unlike the other n log n sorts, is constant, O(1), because the algorithm can be implemented without recursion.
• The heap sort algorithm has two major steps. The first major step involves transforming
the complete tree into a heap. The second major step is to perform the actual sort by
extracting the largest element from the root and transforming the remaining tree into a
heap.
Red Black Tree
• Red Black Trees are self-balancing, meaning that the tree adjusts itself automatically after each insertion or deletion operation.
• It is a binary search tree with an additional attribute on each node: a color, which can be red or black.

Properties of Red Black Tree:
1. Every node is either red or black.
2. The root is black.
3. A red node cannot have a red child (both children of a red node are black).
4. For each node, all paths from that node to descendant NIL leaves contain the same number of black nodes.
• Search: This operation involves traversing the tree from the root to the leaf nodes,
following the correct path based on whether the key is less than or greater than the
node's key. Since Red-Black trees are balanced, the time complexity of search operation is
O(log n).
• Insertion: When a new node is inserted, the tree might become unbalanced, violating the
Red-Black tree properties. Therefore, the tree is restructured and recolored via rotation
operations to restore balance. Despite these additional steps, the time complexity of
insertion remains O(log n).
• Deletion: Similar to insertion, deleting a node might disrupt the tree balance. After the
node is removed, rotations and recoloring are performed to maintain the Red-Black tree
properties. The time complexity of deletion is also O(log n).
Insertion in Red Black Tree
• Example: insert the keys 1, 2, 3, 4, 10, 14, 7, 6, 12 into an initially empty Red-Black tree, rebalancing after each insertion.

Deletion in Red Black Tree
• If the node is not the root, deleting it will change the black-height along some path.
Deletion Steps
3.2) Do the following while the current node u is double black and is not the root. Let the sibling of the node be s.
(a) If sibling s is black and at least one of s's children is red, perform rotation(s). Let the red child of s be r. This case divides into four subcases depending on the positions of s and r:
(i) Left Left Case: s is the left child of its parent and r is the left child of s, or both children of s are red. This is the mirror of the Right Right Case.
(ii) Left Right Case: s is the left child of its parent and r is the right child. This is the mirror of the Right Left Case.
(iii) Right Right Case: s is the right child of its parent and r is the right child of s, or both children of s are red.
(iv) Right Left Case: s is the right child of its parent and r is the left child of s.
(b) If the sibling is black and both its children are black, perform recoloring, and recur for the parent if the parent is black.
In this case, if the parent was red, we do not need to recur for the parent; we can simply make it black (red + double black = single black).
(c) If the sibling is red, perform a rotation to move the old sibling up, and recolor the old sibling and the parent. The new sibling is always black. This mainly converts the tree to the black-sibling case (by rotation) and leads to case (a) or (b). It divides into two subcases:
(i) Left Case: s is the left child of its parent (the mirror of the Right Case); we right-rotate the parent p.
(ii) Right Case: s is the right child of its parent; we left-rotate the parent p.
If u is the root, make it single black and return (the black height of the complete tree reduces by 1).
Splay Tree
• A splay tree is a self-adjusting binary search tree: the tree structure is adjusted dynamically based on the elements that are accessed or inserted.
• The splay tree was first introduced by Daniel Dominic Sleator and Robert Endre Tarjan in 1985.
• The basic idea behind splay trees is to bring the most recently accessed or inserted element to the root of the tree by performing a sequence of tree rotations, called splaying.
• Splaying restructures the tree by making the most recently accessed or inserted element the new root and gradually moving the remaining nodes closer to the root.
Splay Tree
• Splay trees are highly efficient in practice due to their self-adjusting nature, which
reduces the overall access time for frequently accessed elements.
• This makes them a good choice for applications that require fast and dynamic data
structures, such as caching systems, data compression, and network routing algorithms.
Splay Tree
• Insertion: to insert a new element, first perform a regular binary search tree insertion. Then apply rotations to bring the newly inserted element to the root of the tree.
• Deletion: to delete an element, first locate it using a binary search tree search. If the element has no children, simply remove it. If it has one child, promote that child to its position in the tree. If it has two children, find the successor of the element (the smallest element in its right subtree), swap its key with the element to be deleted, and delete the successor instead.
• Search: perform a binary search tree search. If the element is found, apply rotations to bring it to the root of the tree. If it is not found, apply rotations to the last node visited in the search, which becomes the new root.
• Rotation: the rotations used in splaying are Zig (a single rotation, used when the node is a child of the root) together with Zig-Zig and Zig-Zag steps (and their Zag mirror images), which handle deeper nodes and balance the tree after repeated accesses within the same subtree.
Splay Tree
• Rotations used while splaying (the position of the searched key determines which case applies):
• Zig rotation (a single right rotation) and Zag rotation (a single left rotation)
• Zig-Zig (two right rotations) and Zag-Zag (two left rotations)
• Zig-Zag (a Zig followed by a Zag) and Zag-Zig (a Zag followed by a Zig)
Insertion in a Splay Tree
• We normally insert a node as in a BST and then splay it to the root.
Deletion in a Splay Tree
1. Splay the node to be deleted, bringing it to the root.
2. Split the tree into two trees: Tree1 = the root's left subtree and Tree2 = the root's right subtree, and delete the root node.
3. Splay the maximum node (the node having the maximum value) of Tree1.
4. After the splay procedure, make Root2 the right child of Root1 and return Root1.
B-Tree
• Binary trees can have a maximum of 2 children, while a Multiway Search Tree, commonly known as an M-way tree, can have a maximum of m children, where m > 2.
• Due to their structure, M-way trees are mainly used in external searching, i.e. in situations where data is to be retrieved from secondary storage such as disk files.
• Each node can contain a maximum of m − 1 keys and a minimum of ⌈m/2⌉ − 1 keys.

A very small B tree
• The bottom nodes are leaf nodes: all their child pointers are NULL. In memory, a node such as [7, 16, --, --] stores its keys in the first slots, leaves unused key slots empty, and holds child pointers that either point to leaf nodes or are NULL.
Why use a B-Tree?
• Reduces the number of reads made on the disk.
• B-Trees can easily be optimized to adjust their size (the number of child nodes) to the disk block size.
• It is a technique specially designed for handling bulky amounts of data.
• It is a useful structure for databases and file systems.
• A good choice when it comes to reading and writing large blocks of data.
B-Tree
• All leaves are created at the same level.
• The left subtree of a key contains smaller values than its right subtree, which means the keys are sorted in ascending order from left to right across the tree.
• A node (the root as well as its child nodes) can contain at most m children and therefore at most m − 1 keys.
Search
Search 120 in the given B-Tree.
B-Tree Example with m = 5

Insert 10 into the tree with root [12] and leaves [2 3 8] and [13 27]:
• We find the location for 10 by following a path from the root, using the stored key values to guide the search.
• The search falls out of the tree at the 4th child position of the 1st child of the root.
• The 1st child of the root has room for the new element, so we store it there, giving leaves [2 3 8 10] and [13 27].
Insert 11
• With root [12] and leaves [2 3 8 10] and [13 27], we fall out of the tree at the child position to the right of key 10.
• But there is no more room in the left child of the root to hold 11.
• Therefore, we must split this node...
Insert 11 (continued)
• The left child splits around its middle key, 8, which moves up into the root: the root becomes [8 12] with children [2 3], [10 11], and [13 27].
• The m + 1 children are divided evenly between the old and new nodes.
• The parent gets one new child. (If the parent becomes overfull, it too will have to be split.)
CSE 373 -- AU 2004 -- B-Trees
Remove 8
• Before: root [8 12] with children [2 3], [10 11], [13 27].
• Key 8 is removed and its two neighboring children are merged: the tree becomes root [12] with children [2 3 10 11] and [13 27].
• The root contains one fewer key and has one fewer child.
Remove 13
• Before: root [12] with children [2 3 10 11] and [13 27].
• Removing 13 leaves the right leaf with too few keys, so a key is borrowed through the parent: 12 moves down into the right leaf and 11 moves up into the root.
• After: root [11] with children [2 3 10] and [12 27].
Remove 11
• 11 lives in the root, so it is replaced by its in-order predecessor, 10, taken from the left leaf: the tree becomes root [10] with children [2 3] and [12 27].
Remove 2
• Before: root [10] with children [2 3] and [12 27].
• Removing 2 leaves the left leaf underfull, so the nodes are merged.
• The intermediate result is illegal because the root does not have at least 2 children; therefore, we must remove the old root, making its merged child the new root: a single node [3 10 12 27].

Insert 49
• Adding this key to [3 10 12 27] makes the node overfull, so it must be split into two. But this node was the root, so we must construct a new root and make the two halves its children: root [12] with children [3 10] and [27 49].
B+Tree
• A variant of B-Trees.
• Two types of nodes:
• Internal nodes have no data pointers.
• Leaf nodes have no in-tree (child) pointers; at the leaf level of a B-Tree those pointers were all NULL anyway.
B+Tree
• If m is the order of the tree:
• Every internal node has at most m children.
• Every internal node (except the root) has at least ⌈m/2⌉ children.
• The root has at least two children if it is not a leaf node.
• Every leaf has at most m − 1 keys.
• An internal node with k children has k − 1 keys.
• All leaves appear at the same level.
B+Tree
• We use the following data to create the B+ Tree: 1, 4, 7, 10, 17, 21, 31.
• Suppose the order m of the tree is 4. The following facts can be deduced: max children = 4; min children = ⌈m/2⌉ = 2; max keys = m − 1 = 3; min keys = ⌈m/2⌉ − 1 = 1.
• https://www.scaler.com/topics/data-structures/b-plus-trees/
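The node-capacity arithmetic for a given order m can be sketched as a small helper (the function name is illustrative):

```python
import math

def bplus_bounds(m):
    """Node-capacity bounds for a B+ tree of order m."""
    return {
        "max_children": m,
        "min_children": math.ceil(m / 2),
        "max_keys": m - 1,
        "min_keys": math.ceil(m / 2) - 1,
    }

# For the slide's example with m = 4:
print(bplus_bounds(4))
```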
B+ Tree vs B Tree

• Leaf nodes: in a B+ Tree, leaf nodes form a linked list for efficient range-based queries; in a B Tree, leaf nodes do not form a linked list.
• Key duplication: a B+ Tree typically allows key duplication in leaf nodes; a B Tree usually does not allow key duplication.
• Disk access: a B+ Tree gives better disk access due to sequential reads along the linked leaf list; a B Tree incurs more disk I/O due to non-sequential reads in internal nodes.
• Applications: B+ Trees suit database systems and file systems where range queries are common; B Trees suit in-memory data structures, databases, and general-purpose use.
• Performance: B+ Trees perform better for range queries and bulk data retrieval; B Trees give balanced performance for search, insert, and delete operations.
• Memory usage: a B+ Tree requires more memory for internal nodes; a B Tree requires less memory because keys and values are stored in the same node.
Structure of the internal nodes of a B+ tree
• An internal node stores an alternating sequence of in-tree pointers and keys, and carries no data pointers; data pointers appear only in leaf nodes.
B+Tree
• Each internal node is of the form <P1, K1, P2, K2, …, Pc−1, Kc−1, Pc>, where c ≤ m, each Pi is a tree pointer (i.e. it points to another node of the tree), and each Ki is a key value.
• For every search-field value X in the subtree pointed at by Pi, the following holds: K(i−1) < X ≤ Ki for 1 < i < c, and K(c−1) < X for i = c.
Structure of the leaf nodes of a B+ tree
• Each leaf node of a B+ tree of order m is of the form <<K1, D1>, <K2, D2>, …, <Kc−1, Dc−1>, Pnext>, where c ≤ m, each Di is a data pointer (i.e. it points to the actual record on disk whose key value is Ki, or to a disk file block containing that record), each Ki is a key value, and Pnext points to the next leaf node in the B+ tree.
• Every leaf node satisfies K1 < K2 < … < K(c−1), with c ≤ m.
• Each leaf node has at least ⌈m/2⌉ values.
• All leaf nodes are at the same level.
B+Tree
• Using the Pnext pointers, it is possible to traverse all the leaf nodes just like a linked list, thereby achieving ordered access to the records stored on disk.