Sorting Using Quicksort - Java Data Structures

Special : Web Developer's Collection CD-ROM
50,000+ Web Templates, Logos, Cliparts, Scripts. Order Now!

Add to Favorites Make Home Page

1162 Online

Language Categories

Source Codes Home
Project Ideas New!
Interview Questions
FAQs Home
ASP Home: ASP Source Codes; ASP Script Directory New!; ASP .Net Script Directory New!; ASP Interview Questions; ASP FAQs; ASP How Tos
C Home: C Source Codes; C Script Directory New!; C Interview Questions; C FAQs; C How Tos
C++ Home: C++ Source Codes; C++ Script Directory New!; C++ Interview Questions; C++ FAQs; C++ How Tos
Java Home: Java Source Codes; Java Directory New!; Java Interview Questions; Java FAQs; Java How Tos
JavaScript Home: JavaScript Directory New!; JavaScript Source Codes; JavaScript FAQs; JavaScript How Tos
COBOL Home: COBOL Source Codes; COBOL FAQs; COBOL How Tos
Pascal Home: Pascal Source Codes; Pascal FAQs; Pascal How Tos
PHP Script Directory New!
Python Script Directory New!
Perl & CGI Script Directory New!
Flash Script Directory New!
CFML Script Directory New!
Remotely Hosted Scripts New!
Tools & Utilities Directory New!
XML Script Directory New!
Best Programmers: Amit Mathur; Vishal Bhardwaj; Deepesh Jain
Vyom NetWork

Our Services

Get 9,000 Interview Questions & Answers in an eBook.

9500+ Pages
9000 Question & Answers
All Tech. Categories

14 MB Content

Get it now !!

Java Data Structures - Contents

Sorting using Quicksort...

A D V E R T I S E M E N T

Sorting using Quicksort is something that I wasn't planning to cover in this document due to the fact that the algorithm solely depends on the data it receives. If the data has certain properties, quicksort is one of the fastest, if not, quicksort can be excruciatingly slow. By now, you probably already figured out that bubble sort (covered earlier) is pretty slow; put it another way: forget about using bubble sort for anything important.

Bubble sort tends to be O(n^2); in other words, pretty slow. Now, quicksort can perform quite fast, on average about O(n log n), but it's worst case, is a humiliating O(n^2). For quicksort, the worst case is usually when the data is already sorted. There are many algorithms to make this "less likely to occur," but it does happen. (the O notation is paraphrased "on the order of")

If you need a no-worst-case fast sorting algorithm, then by all means, use heap sort (covered earlier). In my opinion, heap sort is more elegant than any other sort <it is in place O(n log n) sort>. However, on average, it does tend to perform twice as slow as quicksort.

UPDATE: I've recently attended a conference at the [insert fancy name here] Science Society, and heard a nice talk by a professor from Princeton. The talk was totally dedicated to Quicksort. Apparently, if you modify the logic a bit, you can make Quicksort optimal! This is a very grand discovery! (unfortunately, I didn't write down the name of anybody; not even the place!) The idea (among other things) involves moving duplicate elements to one end of the list, and at the end of the run, move all of them to the middle, then sort the left and right sides of the list. Since now Quicksort performs well with lots of duplicate values (really well actually ;-), you can create a radix-like sort that uses embedded Quicksort to quickly (very quickly) sort things. There's also been a Dr.Dobbs article (written by that same professor ;-) on this same idea, and I'm sorry for not having any more references than I should. I still think this is something that deserved to be mentioned.

We will start out by writing a general (simplest) implementation of quicksort. After we thoroughly understand the algorithm we will re-write it implementing all kinds of tricks to make it faster.

Quicksort is naturally recursive. We partition the array into two sub-arrays, and then re-start the algorithm on each of these sub-arrays. The partition procedure involves choosing some object (usually, already in the array); If some other object is greater than the chosen object, it is added to one of the sub-arrays, if it's less than the chosen object, it's added to another sub-array. Thus, the entire array is partitioned into two sub-arrays, with one sub-array having everything that's larger than the chosen object, and the other sub-array having everything that's smaller.

The algorithm is fairly simple, and it's not hard to implement. The tough part is making it fast. The implementation that follows is recursive and is not optimized, which makes this function inherently slow (most sorting functions try to avoid recursion and try to be as fast as possible). If you need raw speed, you should consider writing a native version of this algorithm.

public class pSimpleQuicksort{

    public static void qsort(Comparable[] c,int start,int end){
        if(end <= start) return;
        Comparable comp = c[start];
        int i = start,j = end + 1;
        for(;;){
            do i++; while(i<end && c[i].compareTo(comp)<0);
            do j--; while(j>start && c[j].compareTo(comp)>0);
            if(j <= i)   break;
            Comparable tmp = c[i];
            c[i] = c[j];
            c[j] = tmp;
        }
        c[start] = c[j];
        c[j] = comp;
        qsort(c,start,j-1);
        qsort(c,j+1,end);
    }

    public static void qsort(Comparable[] c){
        qsort(c,0,c.length-1);
    }

    public static void main(String[] args){
        int i;
        Integer[] arr = new Integer[20];
        System.out.println("inserting: ");
        for(i=0;i<arr.length;i++){
            arr[i] = new Integer((int)(Math.random()*99));
            System.out.print(arr[i]+" ");
        }
        pSimpleQuicksort.qsort(arr);
        System.out.println("\nsorted: ");
        for(i=0;i<arr.length;i++)
            System.out.print(arr[i]+" ");
        System.out.println("\nDone ;-)");
    }
}

As you can see, the qsort() functions itself is fairly short. One of them is just a wrapper for the other one. The idea is that qsort() receives the array, and then the start, and end pointer between which you want everything sorted. So, the starting call starts at array position zero, and ends at the last valid position in the array. Some sample output follows:

inserting:
58 52 82 27 23 67 37 63 68 18 95 41 87 6 53 85 65 30 10 3
sorted:
3 6 10 18 23 27 30 37 41 52 53 58 63 65 67 68 82 85 87 95
Done ;-)

The sorts starts out by first checking to see if the end pointer is less then or equal to the start pointer. It if is less, then there is nothing to sort, and we return. If they are equal, we have only one element to sort, and an element by itself is always already sorted, so we return.

If we did not return, we make a pick. In our example, we simply choose the first element in the array to use as our partition element (some call it pivot element). To some people, this is the most critical point of the sort, since depending on what element you choose, you'll get either good performance, or bad. A lot of the times, instead of choosing the first, people choose the last, or take a median of the first, last and middle elements. Some even have a random number generator to pick a random element. However, all these techniques are useless against random data, in which case, all those tricky approaches can actually prove to worsen results. Then again, most of the times, they do work quite nicely... (it really depends on the type of data you are working with)

After we have picked the element, we setup i and j, and fall into the a for loop. Inside that loop, we have two inner loops. Inside the first inner loop, we scan the array looking for the first element which is larger than our picked element, moving from left to right of the array. Inside the second inner loop, we scan to find an element smaller than the picked, moving from right to left in the array. Once we've found them (fallen out of both loops), we check to see if the pointers haven't crossed (since if they did, we are done). We then swap the elements, and continue on to the next iteration of the loop.

You should notice that in all these loops, we are doing one simple thing. We are making sure that only one element gets to it's correct position. We deal with one element at a time. After we find that correct position of our chosen element, all we need to do is sort the elements on it's right, and left sides, and we're done. You should also notice that the algorithm above is lacking in optimization. The inner loops, where most of the time takes place are not as optimized as they should be. However, for the time being, try to understand the algorithm; and we will get back to optimizing a bit later.

When we fall out of the outer loop, we put our chosen element into it's correct position, calculate the upper and lower bounds of the left and right arrays (from that chosen elements), and recursively call the method on these new computed arrays.

That's basically it for the general algorithm. In the next section, you'll be amazed at how much faster we can make this work. Basically, in the next section, we will implement a sort function to use everywhere (it will be faster than most other approaches).

Back to Table of Contents

A D V E R T I S E M E N T

Google Groups Subscribe to SourceCodesWorld - Techies Talk

Email:

Free eBook - Interview Questions: Get over 1,000 Interview Questions in an eBook for free when you join JobsAssist. Just click on the button below to join JobsAssist and you will immediately receive the Free eBook with thousands of Interview Questions in an ebook when you join.

Free Tutorials - HTML, Java, SQL, PHP, ASP, Perl, SQL Tutorials
Discuss Programming Problems and Freshers Jobs related issues here.
Discuss Final Year Project Ideas here.