Deleting items from a Binary Search Tree

Special : Web Developer's Collection CD-ROM
50,000+ Web Templates, Logos, Cliparts, Scripts. Order Now!

Add to Favorites Make Home Page

767 Online

Language Categories

Source Codes Home
Project Ideas New!
Interview Questions
FAQs Home
ASP Home: ASP Source Codes; ASP Script Directory New!; ASP .Net Script Directory New!; ASP Interview Questions; ASP FAQs; ASP How Tos
C Home: C Source Codes; C Script Directory New!; C Interview Questions; C FAQs; C How Tos
C++ Home: C++ Source Codes; C++ Script Directory New!; C++ Interview Questions; C++ FAQs; C++ How Tos
Java Home: Java Source Codes; Java Directory New!; Java Interview Questions; Java FAQs; Java How Tos
JavaScript Home: JavaScript Directory New!; JavaScript Source Codes; JavaScript FAQs; JavaScript How Tos
COBOL Home: COBOL Source Codes; COBOL FAQs; COBOL How Tos
Pascal Home: Pascal Source Codes; Pascal FAQs; Pascal How Tos
PHP Script Directory New!
Python Script Directory New!
Perl & CGI Script Directory New!
Flash Script Directory New!
CFML Script Directory New!
Remotely Hosted Scripts New!
Tools & Utilities Directory New!
XML Script Directory New!
Best Programmers: Amit Mathur; Vishal Bhardwaj; Deepesh Jain
Vyom NetWork

Our Services

Get 9,000 Interview Questions & Answers in an eBook.

9500+ Pages
9000 Question & Answers
All Tech. Categories

14 MB Content

Get it now !!

Java Data Structures - Contents

Deleting items from a Binary Search Tree...

A D V E R T I S E M E N T

Every database system must provide for deletion of items. A Binary Search Tree is a very good option for implementing a database (fast searches, sorts, inserts, etc.). One problem that stands in the way though is that it's not very straight forward to delete an item from a database represented by a binary tree. More precisely, the problem is in maintaining the tree structure while the deletion process.

Deleting items from a Binary Search Tree can be rather tricky. On one hand, you'd like to remove the object from the tree, on the other, you don't want the process to destroy the tree structure. There are many algorithms for this, and they all vary in degree of simplicity and efficiency.

The simplest one (which we will not cover here), is to simply have a boolean variable inside a node. That boolean variable tells us if that node is valid or not. Whenever we want to delete that node, we simply set that valid variable to false; making all the traversal functions simply skip that node. The actual removal can take place when the tree is rebuilt. This may seem like a waste of space, but most of the time, it's not (in today's world, several extra bytes inside the tree structure don't make much of a difference).

The above can actually be used in conjunction with a more complicated approach. For example, lets take a regular company database. Throughout the day, there are thousands of transactions, deletions, insertions, etc., all this is handled by the algorithm described above (a boolean valid variable). At the end of the day (night), when the system becomes free, the program does a routine clean-up of the tree from these "deleted" items. Since setting or unsetting a boolean variable can be fast, the database is really fast during the day, i.e.: when it matters.

Cleaning up a tree from these types of "deleted" items can be accomplished in many different ways. If there are too many of them, then it might be worthwhile to simply rebuild the tree from scratch (making sure it doesn't loose it's advantageous structure, i.e.: doesn't become a linked list). The program could also fire up a more complicated approach to delete each node individually. (While at it, the system could also be optimizing the tree structure ;-)

Now, what is that "more complicated" approach that I keep talking of? It is the approach of switching the node being deleted with the one currently in the tree, only at lower level. Deleting a node that has no children is pretty simple, we just remove that node (and set the parent's pointer to it to null). Deleting a node that has one child is also easy. We delete the node, and make it's parent point to that only child of the deleted node.

The problem comes when you try to delete a node which has two valid children. Which one do you pick to be it's successor (take it's parent's place)? Actually, in most cases, neither!

You have to realize that we're dealing with a binary search tree. Search trees have very specific properties. For example, if we need to remove a node, we look for nearest right child that doesn't have a left son (or nearest left child, that doesn't have a right son). We then replace that newly found node with the one we are trying to delete (and making sure that all the links go where they should). The process of deleting that right child with no left son actually involves a simple removal described above: i.e.: the node is simply replaced by it's only child (in this case the right child; since there is no left). Confusing? Lets jump into the code to clear it up...

import java.lang.*;
import java.io.*;

public class pBSTRemoveNode{

    public pBSTRemoveNode left,right;
    public Comparable data;

    public static pBSTRemoveNode tree_AddNumber(
        pBSTRemoveNode r,Comparable n){
        if(r == null){
            r = new pBSTRemoveNode();
            r.left = r.right = null;
            r.data = n;
        }else if(r.data.compareTo(n) < 0)
            r.right = tree_AddNumber(r.right,n);
        else if(r.data.compareTo(n) > 0)
            r.left = tree_AddNumber(r.left,n);
        return r;
    }

    public static pBSTRemoveNode tree_removeNumber(
        pBSTRemoveNode r,Comparable n){
        if(r != null){
            if(r.data.compareTo(n) < 0){
                r.right = tree_removeNumber(r.right,n);
            }else if(r.data.compareTo(n) > 0){
                r.left = tree_removeNumber(r.left,n);
            }else{
                if(r.left == null && r.right == null){
                    r = null;
                }else if(r.left != null && r.right == null){
                    r = r.left;
                }else if(r.right != null && r.left == null){
                    r = r.right;
                }else{
                    if(r.right.left == null){
                        r.right.left = r.left;
                        r = r.right;
                    }else{
                        pBSTRemoveNode q,p = r.right;
                        while(p.left.left != null)
                            p = p.left;
                        q = p.left;
                        p.left = q.right;
                        q.left = r.left;
                        q.right = r.right;
                        r = q;
                    }
                }
            }
        }
        return r;
    }

    public static void tree_InOrderPrint(
        pBSTRemoveNode r){
        if(r != null){
            tree_InOrderPrint(r.left);
            System.out.print(" "+r.data);
            tree_InOrderPrint(r.right);
        }
    }

    public static void main(String[] args){
        pBSTRemoveNode tree = null;
        int[] numbers = {56,86,71,97,82,99,65,36,16,10,28,52,46};
        System.out.print("inserting: ");
        for(int i = 0;i<numbers.length;i++){
            Integer n = new Integer(numbers[i]);
            System.out.print(" "+n);
            tree = tree_AddNumber(tree,n);
        }
        System.out.print("\ntree: ");
        tree_InOrderPrint(tree);
        for(int j = 0;j < numbers.length;j++){
            Integer n = new Integer(numbers[j]);
            System.out.print("\nremove: "+n+" tree: ");
            tree = tree_removeNumber(tree,n);
            tree_InOrderPrint(tree);
        }
        System.out.println("\ndone ;-)");
    }
}

If you look through the above code, you'll quickly realize that the most relevant function (to this section), is the tree_removeNumber(). It accepts a reference to the root of the tree, and a number to remove. Actually, the way it's implemented, it doesn't necessarily has to be a number. It could just as well be a java.lang.String object; anything that's implementing a Comparable interface will work. The title of the function is a bit misleading; telling you that you can only work with numbers.

The main() is rather straight forward; it first adds numbers to the tree, and then removes them, displaying what it does at each step. The tree_AddNumber() function will not be explained here (since it's already been explained somewhere within this document).

The tree_removeNumber() method is where most of the interesting action takes place. We first check to see if the root of the tree is not null, since if it is, we have nothing to remove, and we simply return. The next two if() and else if() statements do what is know as binary search. If the item that we're looking for is greater than the value of the current node, we recursively search the right child of the current node. If it's less than the value of the current node, then we recursively search the left child of the current node.

The remainder of the function is kind of a big else statement. Once we're there, it means we have found the item we were looking for. We start by first checking for simple cases, where the node we're removing has no children, or has only one child. If it has two valid children, we fall into another else statement.

Once we know it is not one of the simple cases, we have to do a bit of thinking. First, we check a bit easier case, where the right child of the node doesn't have a left child. If that's the case, we need go no further, we simply replace the removing node with it's right child, and make sure the links are not lost. If the right child has a left child, we have to loop to find the closest right child with no left child, and that's what that while() loop is doing. The moment we find that node we would like to put in place of the one being deleted, we remove the found node. This removal is rather simple, since the node only has a right child. We then simply replace the removed node with the new one, making sure we don't loose any links, and we are done.

Inside main(), I have picked numbers to work with, and to construct the tree (sample data). The numbers generate a pretty good binary tree. The removal starts with the root node, so, the example does quite extensive testing in all the cases described above. The output from the above program follows:

inserting:  56 86 71 97 82 99 65 36 16 10 28 52 46
tree:  10 16 28 36 46 52 56 65 71 82 86 97 99
remove: 56 tree:  10 16 28 36 46 52 65 71 82 86 97 99
remove: 86 tree:  10 16 28 36 46 52 65 71 82 97 99
remove: 71 tree:  10 16 28 36 46 52 65 82 97 99
remove: 97 tree:  10 16 28 36 46 52 65 82 99
remove: 82 tree:  10 16 28 36 46 52 65 99
remove: 99 tree:  10 16 28 36 46 52 65
remove: 65 tree:  10 16 28 36 46 52
remove: 36 tree:  10 16 28 46 52
remove: 16 tree:  10 28 46 52
remove: 10 tree:  28 46 52
remove: 28 tree:  46 52
remove: 52 tree:  46
remove: 46 tree:
done ;-)

Trace the output if you like, it's quite interesting. This type of removing actually improves the tree structure, it makes it shorter (decreases tree's depth), thus, making searches faster. One thing that I'd like to mention before we go on, is that this example is for JDK 1.2 or above. It will not compile (nor run) on anything less due to the fact that it uses the java.util.Comparable interface, which is not supported in earlier versions of the JDK. Anyway, I guess that's it for this topic.

Back to Table of Contents

A D V E R T I S E M E N T

Google Groups Subscribe to SourceCodesWorld - Techies Talk

Email:

Free eBook - Interview Questions: Get over 1,000 Interview Questions in an eBook for free when you join JobsAssist. Just click on the button below to join JobsAssist and you will immediately receive the Free eBook with thousands of Interview Questions in an ebook when you join.

Free Tutorials - HTML, Java, SQL, PHP, ASP, Perl, SQL Tutorials
Discuss Programming Problems and Freshers Jobs related issues here.
Discuss Final Year Project Ideas here.