Java 6 Collections: Why need HashSet


Share/Save/Bookmark

If the ordering of the elements is not important, then the Hash set allows you to find the element much faster. But anyway, if you access these elements in the iterative manner, you may get random ordering each time.

The well known data structure for finding objects quickly is the hash table. Java HashSet and HashTable are based on the hash table data structure. Specially, if the universe U is larger, it is impractical to storing objects in the Table T which is size of U. Instead Key can represent the actual object because K is relatively smaller compared to associated object. The storage requirement is Θ(K) and to search a element in the hash table required only Ο(1) time.

In the direct addressing such as ArrayList, an element is stored is referenced in the k slot of the table, instead hashing, this element is stored in slot h(k) for the hash function h. In other words, k3 an k10 can be in the same slot. If two keys are fallen to a same slot which is called hash collision. The collision should be minimize is the rule of thumb. There are number of effective techniques to create hash function. Anyway, hash function should be deterministic. In Java, hash collision is resolved with chaining that is the elements who has the same hash value has been put in linked list which is referenced by the slot.

package com.ojitha.set.item;

/**
* Book represent the real world book.
*
* @author Ojitha
*
*/
public class Book {

// Name of the book
private String name;

// Year of the published
private int year;

/**
 * Books can be constructed providing name and the published year.
 *
 * @param name
 *            Name of the book.
 * @param year
 *            Published year.
 */
public Book(String name, int year) {
 this.name = name;
 this.year = year;
}

/**
 * Override this method to display book information.
 * @return Book name and the published year.
 */
public String toString() {
 return "Book: " + name + " published on: " + year;
}

/**
 * The equals() method has been overridden to equate the books on the
 * name. Anyway year of the published is not important.
 */
public boolean equals(Object other) {

 if (other instanceof Book
   && this.name.toLowerCase().equals(
     ((Book) other).name.toLowerCase())) {
  return true;
 }
 return false;
}

@Override
public int hashCode() {
 return this.year;
}

/**
 * Get the year of the book published.
 * @return Book published year.
 */
public int getyear(){
 return this.year;
}

}

As shown in the following sample books.txt file, the book "A changed Man" is duplicated, but they were published on two different years. Although book names are equal, because published years are different, these books are stored in different buckets.  As a conclusion, to identify a object, equals() should be “true” as well as hashCode() should be return the same value.

The Emperor's New Mind, 1989
The Six Value Medals, 1988
The World is Flat, 2005
A changed Man, 2004
A changed Man, 2005

In the above code, there are two important methods which have been overridden. Those are equals() and hashCode() methods. The first method should be overridden to find the particular book. As mentioned earlier, hashCode() method return the bucket where the book belongs to. 

package com.ojitha.set;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;

import com.ojitha.set.item.Book;

/**
* This program demonstrate the use of HashSet.
* @author Ojitha
*
*/
public class HashSetExample {

/**
 * @param args
 * @throws IOException
 */
public static void main(String[] args) throws IOException {

 //create set of books
 Set books = new HashSet();

 //The file books.txt available at the class path
 File file = new File("books.txt");

 //Create filereader to read the file
 FileReader fReader = new FileReader(file);

 //use more easy buffered reader to read the file line by line
 BufferedReader reader = new BufferedReader(fReader);

 //populate books and store in the hash set
 String line=null;
 while ((line=reader.readLine())!=null){
  String[] data =line.split(",");
  books.add(new Book(data[0], Integer.parseInt(data[1].trim())));
 }

 //Iterate over the available books
 for (Book book : books) {
  System.out.println(book);
 }

 //Search the book
 Book bookToFind = new Book("A CHANGED MAN",2004);

 if (books.contains(bookToFind)){
  System.out.println("Book is available");
 }

 //Remove the book also very inexpensive
 if (books.remove(bookToFind)){
  System.out.println("Book has been removed.");
 }

 //Iterate over the sorted available books
 Book[] bookArray = new Book[books.size()];
 books.toArray(bookArray);

 //create the anonymous class to sort on descending order of the
 //book published year. You will get latest book first.
 Arrays.sort(bookArray, new Comparator(){

  @Override
  public int compare(Book b1, Book b2) {
   return b2.getyear()-b1.getyear();
  }
 
 });

 //Display the sorted book list, first is the latest.
 for (Book book : bookArray) {
  System.out.println(book);
 }
}

}

In the above sample client, we’ve created Book instance which is for “A changed Man” published on 2004, in other words, this object should be available in the bucket 2004.

To create ordered collection from the Set, that should be transferred to either Array or List. In the above code, converted array has been sorted on descending order passing anonymous class which is extended from the Comparator, to the Array.sort() method.

The output should be similar to the following.

Book: A changed Man published on: 2004
Book: The Emperor's New Mind published on: 1989
Book: A changed Man published on: 2005
Book: The World is Flat published on: 2005
Book: The Six Value Medals published on: 1988
Book is available
Book has been removed.
Book: A changed Man published on: 2005
Book: The World is Flat published on: 2005
Book: The Emperor's New Mind published on: 1989
Book: The Six Value Medals published on: 1988

Comments

Popular posts from this blog

How To: GitHub projects in Spring Tool Suite

Spring 3 Part 7: Spring with Databases

Parse the namespace based XML using Python