Now that you've discovered the tree in Java, it is time to dive into one of the most used trees: the Java binary tree, aka binary search tree.
What is the Binary Search Tree
A binary search tree (BST) is a data structure that stores nodes in a structured and sorted manner. Binary search trees are nice as they are quite logical and relatively easy to wrap your head around. They work on the simple premise of comparison. Like most other abstract data structures, you typically use a node class to hold keys and values.
The image below is a simple representation of a BST, where the numbers represent keys that are associated with corresponding values.
Rules of Java Binary Trees
A BST enforces the following rules:
- The left subtree of any given node (root or otherwise) solely contains nodes with keys that are lesser than the given node’s key.
- The right subtree of any given node (root or otherwise) solely contains nodes with keys that are greater than the given node’s key.
Therefore:
- If you are searching for, inserting, updating, or deleting a node in a tree, you can use a simple comparison on each node to determine whether you have found the appropriate node/location or whether you need to traverse right or left to continue searching.
- If the key that I'm searching for (or inserting/updating/deleting) is smaller than the key of the current node, then you traverse to the left child.
- Inversely, if the key that I'm searching for (or inserting/updating/deleting) is greater than the key of the current node, then you traverse to the right child.
- Then, you simply repeat the process. (Compare the current node, return, or traverse right/left.) We do this until you either find the location of the node you're searching for, updating or deleting, or until you find the empty leaf where you can insert the node at the appropriate location.
Imagine you need to insert the number (or key of) 7 in the image above. Following the rules, you will arrive at the following steps:
- Compare 7 against 10
- Since 7 is less than 10, go to left-child (5)
- Since 7 is greater than 5, go to right-child (8)
- Since 7 is less than 8, go to left-child (6)
- Since 7 is greater than 6, and 6 has no child nodes, attach 7 as the right-child of 6
Basically, you continue to compare the key of each Node as you traverse against the key of the Node you're looking for.
Tree Terminology Review
- Root is the root node of the tree
- Edge is a link between two nodes
- Child is a node that has a parent node
- Parent is a node that has a child node
- Leaf is a node that does not have a child node in the tree
- Height is the length of the longest path between a node and a leaf
- Depth is the length of the path from a node to its root
Binary Search Tree Big O Complexity
Binary search trees are known for their excellent complexity. Take a look at the table below:
| Operation | Average | Worst case | |
|---|---|---|---|
| Search | $`O(log \, n)`$ | $`O(n)`$ | |
| Insert | $`O(log \, n)`$ | $`O(n)`$ | |
| Delete | $`O(log \, n)`$ | $`O(n)`$ |
$`O(log n)`$ is the best complexity you've seen yet! It basically means that with each "hop" or comparison, you cut the complexity of the algorithm in half.
Real World Explanation
For instance, if you're looking for someone with the last name "Peters" in the phone book, you would open the phone book to the middle and compare whether "P" is greater than or less than the letter you're seeing on the page. If it's greater than that, you can "throw out" the "less than" half of the phone book. You can repeat this process 3 or 4 times to get very close to the "P" and again to the "Peters". This is called a binary search.
It is MUCH faster than starting at the beginning of the phone book and comparing each name till you find it. That would be horribly inefficient (with a Big O of $`O(n)`$).
Worst Case Scenario
You'll notice in the Big O chart above that the worst case for a BST can also be $`O(n)`$. This happens when the tree is unbalanced. For instance, if you insert ordered, increasing values to the tree, i.e., "1", then "2", then "3", then "4", then "5" and so on. You'll effectively just create a linked list that looks like the image below:
This is because each value will be added to the last right child. This means you have lost the $`O(log n)`$ benefits of the tree data structure and will now simply have to iterate through each element to find the element you're looking for.
Rebalancing Trees
In order to avoid the situation outlined above (a tree basically becoming a linked list), you do what is called rebalancing. When rebalancing a tree, you take into consideration the width, height, and max depth of the tree, and you rotate the tree.
You'll dig into the details on how to do this when you take up an example with code, but here's a picture to give an idea of how it works.
Summary: What is a Java Binary Tree
- Binary search trees (BST) store nodes in a structured and ordered manner through comparison
- The left subtree contains nodes with keys that are lesser than the given node’s key
- The right subtree contains nodes with keys that are greater than the given node’s key
- Using the lesser and greater key values, a BST can be efficiently traversed to find the desired element
- BSTs may have to be rebalanced to avoid the worst-case complexity: linear growth
Big O Complexity
| Operation | Average | Worst case | |
|---|---|---|---|
| Search | $`O(log \, n)`$ | $`O(n)`$ | |
| Insert | $`O(log \, n)`$ | $`O(n)`$ | |
| Delete | $`O(log \, n)`$ | $`O(n)`$ |