Thursday, 20 August 2015

Fundamental Similarities and Differences of HashSet and TreeSet in Java

Details:


The TreeSet and HashSet in Java are both part of the Collections framework. They are both  advance concepts. On this document we would be discussing their similarities and differences in terms of usage and functionality. Both of these classes implements the same java.util.Set interface. The Set interface makes sure that no duplicate elements are allowed thus this behavior is also true for both the TreeSet and HashSet class. Both HashSet and TreeSet are used for to store unique elements, but HashSet doesn't care about any order and TreeSet keeps thing in order. Ordering or sorting on TreeSet can be customized by using Comparator interface, by default TreeSet uses elements natural order for sorting, which is defined by compareTo() method of java.lang.Comparable interface. What is difference between HashSet and TreeSet is is also one the frequently asked Java interview question, So you should know about similarities and difference between them. It also helps you to understand when to use both TreeSet and HashSet and what are the scenario when we should use what type of set. Let's go through the similarities and difference between HashSet and TreeSet in Java.

If you are only need basic container and would not require the advance features of HashSet and TreeSet, I would suggest stick to Array, HashMap and List.

TreeSet and HashSet Similarities

Below are some important similarities between the two implementation of java.util.Set interface from Collection framework which is the TreeSet and HashSet.

Set Interface

Both HashSet and TreeSet implements Set interface.


The order the elements are inserted

TreeSet And HashSet don't care on the order of insertion of elements because anyway both of these classes are not maintaining the order of elements. Not unlike if you are using Lists or simple Array where it follows the index of elements. Imagine that these two classes are one big bucket. Just put elements on it regardless of location and you can get anything inside the big container.


Duplicate value

Both class doesn't allow duplicate elements. The use of add() method when adding elements will return false if the element is already existing. You can use this flag to evaluate if the insertion is success or not.

Synchronization

TreeMap and HashSet is not synchronized.

Iterator

Iterator of both TreeSet and HashSet are fail fast. They will throw ConcurrentModificationException if  the Set is modified once the iteration begins.

Cloneable and Serializable

Both classes implements Cloneable and Serializable interface, which means you can serialize a TreeSet and HashSet and save it into disk or send over network to other JVM.

TreeSet and HashSet Differences


Below are the differences of HashSet and TreeSet. This tutorial will not be complete if we will not differentiate the two.

Internal Structure

HashSet internally uses HashMap to store its element, while TreeSet uses TreeMap to store its element internally. See this article to learn more about how HashSet works in Java.

Ordering

HashSet is unordered which means that the position of elements is random. It will not follow the FIFO rule. Imagine that a HashSet is a big bucket where you store information you get the elements one by one probably using an iterator and what comes first in the iteration will be random.

But in case of TreeSet, the order of elements are defined by supplied Comparator, and if you don't give any Comparator for TreeSet objects  then it will use natural ordering of elements for example if the elements are integer then it will get sorted in numerical order meanwhile if the type is String then it will get sorted alphabetically.

Null Value

HashSet allow maximum  of one null element, which means you can store only one null value inside HashSet. But TreeSet won't allow any null object and you will get NullPointerException if you try to store null values in TreeSet. Why? because TreeSet sorts element as soon as you add it into TreeSet, if the object is null then calling compareTo() on that will throw NullPointerException.

Comparison method

HashSet uses equal() and hashcode() methods to compare the elements, while TreeSet we can implements compareTo() method of Comparator interface so we have compare() and compareTo() method ,TreeSet  does not use equal() and  hashcode() method.

Speed and Performance

HashSet is faster than TreeSet for all general purpose .The add, remove, and contains methods has constant time complexity O(1)  for HashSet, which means HashSet offers constant time cost for adding, removing and checking if an object exists in Set.
TreeSet performance is less as compared to HashSet as TreeSet has to sort the element after each insertion and removal Operation. TreeSet offers log(n) time cost for dd, remove, and contains  operations.

Functionality

TreeSet offers sorting which is not provided by HashSet. HashSet also has less methods as compared to TreeSet. TreeSet is rich in functionality as compare to HashSet. Functions like pollFirst(), pollLast(), first(), last(), ceiling(), lower() etc makes TreeSet rich in functionality wise.