The Java Collections Framework
1. What is a collection?
A collection is an object that groups multiple elements into a single unit. The elements are manipulated through the methods of the collection object.
2. What is a collections framework?
• A collections framework is a unified architecture for representing and manipulating collections.
• All collections frameworks consist of interfaces, implementations and algorithms.
o Interfaces allow collections to be manipulated independently of the details of their representation (information hiding).
o Implementations are concrete classes that implement the interfaces.
o Algorithms are methods that perform useful functions, like sort, shuffle, copy, or search.
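The three parts can be seen together in a short sketch. The class name FrameworkParts and the sample words are assumptions made for illustration; List, ArrayList and Collections.sort come from java.util.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class FrameworkParts {
    // Returns the sample words in sorted order, rendered as a string.
    static String sortedWords() {
        // Interface (List) on the left, implementation (ArrayList) on the right.
        List<String> words = new ArrayList<String>();
        words.add("pear");
        words.add("apple");
        words.add("fig");
        // Algorithm: a static method that works on any List.
        Collections.sort(words);
        return words.toString();
    }

    public static void main(String[] args) {
        System.out.println(sortedWords()); // prints [apple, fig, pear]
    }
}
```

Because the variable is declared with the interface type, the implementation could be swapped (for example, to LinkedList) without changing the rest of the code.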
3. Internal Iterators versus External Iterators
Internal Iterators
• With an internal iterator, a user (program) of a collection iterates over its elements directly through public methods of the collection class.
• Example:
public class Sequence<E>
{
    private E[] data;
    private int dataSize;
    private int nextIndex; // index of the next element to return

    @SuppressWarnings("unchecked")
    public Sequence()
    { data = (E[]) new Object[100]; }

    public void add(E item)
    {
        data[dataSize] = item;
        dataSize++;
    }

    // Methods of the internal iterator
    public void start() { nextIndex = 0; }
    public boolean hasNext() { return nextIndex < dataSize; }
    public E next() { return data[nextIndex++]; }

    public static void main(String[] args)
    {
        Sequence<String> list = new Sequence<String>();
        list.add("A");
        list.add("B");
        list.add("C");
        list.start();   // first traversal
        while (list.hasNext())
            System.out.println(list.next());
        list.start();   // restart before the second traversal
        while (list.hasNext())
            System.out.println(list.next());
    }
}
• Limitation
Multiple traversals of the collection must occur one after another: step through all the elements once, restart the iterator, step through all the elements again, and so on. Because the collection holds a single position (nextIndex), two traversals cannot be in progress at the same time.
External Iterators
• An external iterator is an object separate from the collection object.
• Example
interface Iterator<E>
{
    boolean hasNext();
    E next();
    void remove(); // optional
}

public class Sequence<E>
{
    private E[] data;
    private int dataSize;

    @SuppressWarnings("unchecked")
    public Sequence()
    { data = (E[]) new Object[100]; }

    public void add(E item)
    {
        data[dataSize] = item;
        dataSize++;
    }

    public Iterator<E> iterator() { return new Helper(); }

    // Each Helper object keeps its own position in the sequence.
    private class Helper implements Iterator<E>
    {
        private int nextIndex;
        public boolean hasNext() { return nextIndex < dataSize; }
        public E next() { return data[nextIndex++]; }
        public void remove() { /* optional; not implemented here */ }
    }

    public static void main(String[] args)
    {
        Sequence<String> list = new Sequence<String>();
        list.add("A");
        list.add("B");
        list.add("C");
        Iterator<String> iter1 = list.iterator();
        while (iter1.hasNext())
        {
            System.out.print(iter1.next() + ":");
            Iterator<String> iter2 = list.iterator(); // a second, independent traversal
            while (iter2.hasNext())
                System.out.print(iter2.next() + " ");
            System.out.println();
        }
    }
}
4. Collection and Iterator Interface
In the Java collections framework, the core interfaces form a hierarchy, as shown below. We will study the framework with a focus on the Collection and Iterator interfaces.
Collection<E>
    Set<E>
        SortedSet<E>
    List<E>
Iterator<E>
    ListIterator<E>
There are several methods defined in the Collection<E> interface. Among them, iterator() is one of the fundamental methods to understand.
Iterator<E> iterator();
The iterator method returns an object of a class that implements the Iterator<E> interface. This object is simply called an iterator. The Iterator interface defines the following three methods:
E next() returns the next element in the iteration.
boolean hasNext() returns true if the iteration has more elements.
void remove() removes from the underlying collection the last element returned by the iterator.
You can traverse elements of a collection through an iterator of the collection. In the following loop, we process all items in LinkedList<Integer> aList through an Iterator.
LinkedList<Integer> aList = new LinkedList<Integer>();
// Obtain an Iterator to the list aList
Iterator<Integer> iter = aList.iterator();
while (iter.hasNext())
{
    int value = iter.next(); // auto-unboxing
    // Do something with value.
}
An Iterator does not point to a particular object at any given time. Rather you should think of an Iterator as pointing between objects within a collection. (We will discuss later how to implement iterators.)
[Figure: an iterator positioned between elements. The current position is shown before the next call; after the call, the iterator sits just past the returned element.]
The hasNext method tells you if the next method will succeed. If hasNext returns false, a call to next will cause the NoSuchElementException to be thrown. If hasNext returns true, you can safely call next. With a next call, the iterator jumps over the next element, and returns a reference to the object that it just passed.
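The behavior of next when hasNext is false can be checked with a small sketch. The class and method names here are hypothetical; only LinkedList, Iterator and NoSuchElementException are taken from java.util.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.NoSuchElementException;

class NextWithoutHasNext {
    // Returns true if calling next() past the end throws NoSuchElementException.
    static boolean throwsAtEnd() {
        LinkedList<Integer> list = new LinkedList<Integer>();
        list.add(1);
        Iterator<Integer> iter = list.iterator();
        iter.next();                 // jumps over the only element
        try {
            iter.next();             // hasNext() is false at this point
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsAtEnd()); // prints true
    }
}
```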
Note that Iterator.remove removes the last element returned by next(). Therefore, each remove operation must be paired with a preceding call to next().
import java.util.Iterator;
import java.util.LinkedList;

public class Test
{
    public static void main(String[] args)
    {
        LinkedList<Integer> list = new LinkedList<Integer>();
        list.add(1);
        list.add(2);
        list.add(3);
        list.add(4);
        list.add(5);
        System.out.println(list);   // [1, 2, 3, 4, 5]
        removeDivisibleBy(list, 2);
        System.out.println(list);   // [1, 3, 5]
    }

    /** Remove items divisible by a given value.
        pre: LinkedList aList contains Integer objects.
        post: Elements divisible by div have been removed.
    */
    public static void removeDivisibleBy(LinkedList<Integer> aList, int div)
    {
        Iterator<Integer> iter = aList.iterator();
        while (iter.hasNext())
        {
            int nextInt = iter.next(); // auto-unboxing
            if (nextInt % div == 0)
                iter.remove();
        }
    } // removeDivisibleBy
} // Test
If remove() is called without a preceding call to next(), an IllegalStateException is thrown.
Iterator<Integer> iter = aList.iterator();
iter.next();
iter.remove();
iter.remove(); // throws an exception
5. The Collection Hierarchy
According to the J2SE 1.5 API, the Collection<E> interface defines a large number of methods.
boolean add(E o)
boolean addAll(Collection<? extends E> c)
void clear()
boolean contains(Object o)
boolean containsAll(Collection<?> c)
boolean equals(Object o)
int hashCode()
boolean isEmpty()
Iterator<E> iterator()
boolean remove(Object o)
boolean removeAll(Collection<?> c)
boolean retainAll(Collection<?> c)
int size()
Object[] toArray()
<T> T[] toArray(T[] a)
By the definition of an interface, a class that implements the Collection interface must implement all of these methods; otherwise, the class must be declared abstract. Supplying so many routine methods in every class would be a burden. The Java designers therefore provide abstract classes that supply complete implementations of some of the routine methods of an interface. Here is an example:
There is an abstract class called AbstractCollection that implements the Collection interface. This class leaves fundamental methods such as size and iterator abstract, but implements some of the routine methods in terms of them. Therefore, a class that extends AbstractCollection does not have to write the methods that AbstractCollection already implements (if desired, the class can override the inherited methods).
public abstract class AbstractCollection<E> implements Collection<E>
{
    . . .
    public abstract Iterator<E> iterator();
    public abstract int size();

    public boolean addAll(Collection<? extends E> from)
    {
        Iterator<? extends E> iter = from.iterator(); // iterate over the source collection
        boolean modified = false;
        while (iter.hasNext())
            if (add(iter.next()))
                modified = true;
        return modified;
    }
    . . .
}
Suppose a class, MyCollection, extends AbstractCollection as shown below. To be a concrete class, it must implement all inherited abstract methods (such as size and iterator). However, it does not have to write the inherited methods that are already implemented (such as addAll).
Collection
AbstractCollection
MyCollection
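A minimal sketch of such a class is shown here, built on the real java.util.AbstractCollection. The ArrayList field is an assumption made only to keep the example short; it is not how a collection would be written from scratch. In java.util.AbstractCollection, add is not abstract but throws UnsupportedOperationException unless it is overridden, so the sketch overrides it as well.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical concrete collection; it delegates storage to an ArrayList
// (an assumption for brevity).
class MyCollection<E> extends AbstractCollection<E> {
    private final ArrayList<E> items = new ArrayList<E>();

    // The two abstract methods inherited from AbstractCollection.
    @Override public Iterator<E> iterator() { return items.iterator(); }
    @Override public int size() { return items.size(); }

    // Overridden so that the inherited addAll can add elements.
    @Override public boolean add(E item) { return items.add(item); }

    static String demo() {
        MyCollection<String> c = new MyCollection<String>();
        c.addAll(Arrays.asList("A", "B", "C")); // inherited; calls add for each element
        // toString and contains are inherited too; they rely on iterator().
        return c.toString() + " " + c.contains("B") + " " + c.size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A, B, C] true 3
    }
}
```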
6. Concrete Collection
A concrete collection is a completed collection class, and is also called a data structure. You need to understand a data structure from both the user's viewpoint and the developer's viewpoint. A user of a data structure creates an instance of a collection class and invokes the public methods of that class to manipulate the object. A developer considers the specific private fields and the implementation of the public methods of the class.
There are two frequently used concrete collections in the Java collections framework: ArrayList and LinkedList. As a user, you can create an object of ArrayList or LinkedList and manipulate the data in the collection object through its public methods. These concrete collections are already implemented by the Java developers and included in the JDK. You need to learn how to use existing data structures as well as how to implement them from scratch. We start with the user's viewpoint and then turn to the developer's viewpoint.
Object
    AbstractCollection (implements Collection)
        AbstractList (implements List)
            ArrayList
            AbstractSequentialList
                LinkedList
7. java.util.LinkedList and ListIterator
1. Introduction
Arrays and ArrayLists have a major disadvantage when removing (or inserting) an element in the middle. Removal (or insertion) degrades performance because all the elements beyond the affected position must be shifted toward the beginning (or the end) of the array. This is because array elements are stored at consecutive memory locations.
LinkedLists solve this problem. A LinkedList consists of nodes that contain data and one or more links. A link holds a reference to an adjacent node in the linked list. The nodes do not have to be stored in consecutive memory because the links connect them into a list. Removing or inserting an element requires only changing the links around that element; no shifting of elements is needed.
If each node has only one link, which refers to the next node, the list is a singly linked list. It can be traversed in one direction only, toward the end of the list. A node of a doubly linked list contains two links, one referring to the previous node and the other to the next node. Thus, you can traverse a doubly linked list in either direction. The concrete class java.util.LinkedList is an implementation of a doubly linked list.
[Figure: a LinkedList of three nodes, shown before removing a node and after removing it.]
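The link changes involved in a removal can be sketched with a hand-written node type. Node, link and unlink are hypothetical names chosen for this illustration and are not part of java.util.LinkedList; the unlink method assumes the node has both a previous and a next neighbor.

```java
class NodeDemo {
    // Hypothetical node of a doubly linked list.
    static class Node {
        String data;
        Node prev;
        Node next;
        Node(String data) { this.data = data; }
    }

    // Connects a and b so that b follows a.
    static void link(Node a, Node b) {
        a.next = b;
        b.prev = a;
    }

    // Removes a middle node by redirecting its neighbors' links.
    // No other element is moved. Assumes n has both neighbors.
    static void unlink(Node n) {
        n.prev.next = n.next;
        n.next.prev = n.prev;
    }

    static String demo() {
        Node a = new Node("A");
        Node b = new Node("B");
        Node c = new Node("C");
        link(a, b);
        link(b, c);
        unlink(b);
        StringBuilder sb = new StringBuilder();
        for (Node n = a; n != null; n = n.next)
            sb.append(n.data);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints AC
    }
}
```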
2. java.util.LinkedList and listIterator
java.util.LinkedList implements the List interface, which extends the Collection interface. Thus, unlike general collections, a linked list is an ordered collection in the sense that the position of each element matters. The following example shows how to create an instance of a LinkedList and manipulate its elements using public methods and iterators. The add method of LinkedList appends a new element to the end of the list. (Note that a ListIterator also has an add method, which inserts an element at the position of the iterator.)
LinkedList<String> myList = new LinkedList<String>();
myList.add("Apple");
myList.add("Carrot");
myList.add("Milk");
Iterator<String> iter = myList.iterator();
iter.next();
iter.next();
iter.remove(); // removes the last element returned by next(), Carrot
myList.add("Juice");
System.out.println(myList); // now the list contains [Apple, Milk, Juice]
[Figure: a doubly linked list. The LinkedList object holds head and tail references; each node holds data together with prev and next links.]
java.util.LinkedList also supplies another type of iterator, called ListIterator, that allows you to traverse the list in either direction. The listIterator method of the LinkedList class returns a listIterator, an object that implements the ListIterator interface.
public interface ListIterator<E> extends Iterator<E>
{ boolean hasNext();
E next();
boolean hasPrevious();
E previous();
int nextIndex();
int previousIndex();
void remove();
void set(E obj);
void add(E obj);
}
In a list of n elements, there are n + 1 valid cursor positions for a listIterator, indexed from 0 to n (see the picture below). A listIterator that was just returned from the listIterator method points to position 0. Using previous() or next(), you can move the listIterator in either direction.
  Element 0   Element 1   Element 2   …   Element n-1
^           ^           ^               ^             ^
0           1           2               n-1           n
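The cursor positions can be observed through nextIndex(). The following is a small sketch with a three-element list; the class name CursorPositions and the sample data are assumptions for illustration.

```java
import java.util.LinkedList;
import java.util.ListIterator;

class CursorPositions {
    // Records the cursor position after each move of the listIterator.
    static String demo() {
        LinkedList<String> list = new LinkedList<String>();
        list.add("A");
        list.add("B");
        list.add("C");
        ListIterator<String> iter = list.listIterator(); // cursor at position 0
        StringBuilder sb = new StringBuilder();
        sb.append(iter.nextIndex());
        while (iter.hasNext()) {
            iter.next();                              // move forward by one element
            sb.append(' ').append(iter.nextIndex());
        }
        iter.previous();                              // move back from position n
        sb.append(' ').append(iter.nextIndex());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 0 1 2 3 2
    }
}
```

With n = 3 elements the cursor visits positions 0 through 3, which are the n + 1 valid positions.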
3. ListIterator Operations
(1) ListIterator.add inserts a new element at the position of the listIterator. More precisely, the element is inserted immediately before the cursor, so the cursor ends up just after the new element. If the list was empty, the new element becomes the first and only element of the list. A subsequent call to next is not affected (the next element is the same after adding a new element), and a subsequent call to previous returns the new element.
LinkedList<String> myList = new LinkedList<String>();
ListIterator<String> iter = myList.listIterator();
iter.add("A"); // [A ^]
iter.add("B"); // [A B ^]
After the above code, a subsequent next() is illegal because no element follows the cursor. With a subsequent previous(), the listIterator jumps back over B and returns it. If you then add a new element C, the list will contain A, C, B and the listIterator lies between C and B.
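The A, C, B case can be traced in a runnable sketch. The class name AddAfterPrevious is hypothetical; the comments use ^ for the cursor.

```java
import java.util.LinkedList;
import java.util.ListIterator;

class AddAfterPrevious {
    static String demo() {
        LinkedList<String> myList = new LinkedList<String>();
        ListIterator<String> iter = myList.listIterator();
        iter.add("A");                   // [A ^]
        iter.add("B");                   // [A B ^]
        String back = iter.previous();   // returns B; [A ^ B]
        iter.add("C");                   // [A C ^ B]
        String next = iter.next();       // next() is unaffected by add: returns B
        return myList + " " + back + " " + next;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A, C, B] B B
    }
}
```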
(2) ListIterator.set replaces the last element returned by next() or previous() with a new element. Repeated calls to set() are possible, but each call replaces the same element again. This call can be made only if neither ListIterator.remove nor ListIterator.add has been called after the last next() or previous().
ListIterator<String> iter = list.listIterator();
String oldValue = iter.next(); // returns the first element
iter.set("X"); // sets the first element to the new value "X"
iter.set("Y"); // replaces the same element again, now with "Y"
(3) ListIterator.remove removes from the list the element returned by the last next() or previous(). This call can only be made once per call to next or previous. It can be made only if ListIterator.add has not been called after the last next() or previous().
Here is the summary of the rules to call remove() or set().
Rule #1: remove() and set() of listIterator are not defined in terms of the iterator position;
they are operating on the last element returned by next() or previous().
Rule #2: add() and remove() change the object that lies at either the left side or the right side of the listIterator. You cannot call remove() or set() right after these methods without a preceding next() or previous().
Preceding method call    Current method call: remove() or set()
add()                    illegal
remove()                 illegal
Let’s apply these rules to the following examples. When you evaluate the validity of a statement, consider preceding valid statements only. ^ indicates where the cursor (listIterator) lies.
Example
LinkedList<String> myList = new LinkedList<String>();
ListIterator<String> iter = myList.listIterator();
Line 1: iter.add("A"); legal [A ^]
Line 2: iter.add("B"); legal [A B ^]
Line 3: iter.add("C"); legal [A B C ^]
Line 4: iter.remove(); illegal
Line 5: iter.set("X"); illegal
Line 6: iter.next(); illegal
Line 7: iter.previous(); legal [A B ^ C]
Line 8: iter.set("Y"); legal [A B ^ Y]
Line 9: iter.set("Z"); legal [A B ^ Z]
Line 10: iter.remove(); legal [A B ^]
Line 11: iter.remove(); illegal
Line 12: iter.previous(); legal [A ^ B]
Line 13: iter.remove(); legal [A ^]
Line 14: iter.remove(); illegal
Line number Comments
1 add() does not need a preceding next() or previous()
2 & 3 You don’t have to check the validity of the current call to add().
4 Rule #1
5 Rule #1
6 There is no next element in this list
8 Rule #1
9 Rule #2.
10 Rule #2
11 Rule #2
12 & 13 Rule #1
14 Rule #2
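The example can be checked by running it. In the sketch below, the helper attempt is a hypothetical name introduced here; it runs one step and reports "illegal" if the iterator throws a runtime exception (IllegalStateException for remove and set, NoSuchElementException for next). Only the steps marked illegal above are wrapped in attempt.

```java
import java.util.LinkedList;
import java.util.ListIterator;

class RulesDemo {
    // Runs one step and reports whether the iterator accepted it.
    static String attempt(Runnable step) {
        try {
            step.run();
            return "legal";
        } catch (RuntimeException e) {
            return "illegal";
        }
    }

    static String demo() {
        LinkedList<String> myList = new LinkedList<String>();
        ListIterator<String> iter = myList.listIterator();
        StringBuilder log = new StringBuilder();
        iter.add("A");                                          // Line 1
        iter.add("B");                                          // Line 2
        iter.add("C");                                          // Line 3
        log.append(attempt(() -> iter.remove())).append(' ');   // Line 4
        log.append(attempt(() -> iter.set("X"))).append(' ');   // Line 5
        log.append(attempt(() -> iter.next())).append(' ');     // Line 6
        iter.previous();                                        // Line 7
        iter.set("Y");                                          // Line 8
        iter.set("Z");                                          // Line 9
        iter.remove();                                          // Line 10
        log.append(attempt(() -> iter.remove())).append(' ');   // Line 11
        iter.previous();                                        // Line 12
        iter.remove();                                          // Line 13
        log.append(attempt(() -> iter.remove()));               // Line 14
        return log + " " + myList;
    }

    public static void main(String[] args) {
        // prints illegal illegal illegal illegal illegal [A]
        System.out.println(demo());
    }
}
```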
4. More on java.util.LinkedList and ListIterator
Multiple iterators
If one iterator traverses a collection while another iterator is modifying it, confusing situations can occur. Thus, you can attach as many iterators to a container as you like, provided that all of them are only readers. Alternatively, you can attach a single iterator that both reads and writes.
LinkedList<String> list = . . .
ListIterator<String> iter1 = list.listIterator();
ListIterator<String> iter2 = list.listIterator();
iter1.next();
iter1.remove();
iter2.next(); /* throws ConcurrentModificationException */
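A complete version of this fragment is sketched below, assuming a two-element list in place of the elided one. The class and method names are hypothetical. The Java documentation describes this fail-fast check as best-effort, so it should be treated as a debugging aid rather than a guarantee.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.ListIterator;

class TwoIterators {
    // Returns true if the second iterator detects the change made through the first.
    static boolean secondIteratorFails() {
        LinkedList<String> list = new LinkedList<String>();
        list.add("A");
        list.add("B");
        ListIterator<String> iter1 = list.listIterator();
        ListIterator<String> iter2 = list.listIterator();
        iter1.next();
        iter1.remove();          // structural change made through iter1
        try {
            iter2.next();        // iter2 notices that the list changed behind its back
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondIteratorFails()); // prints true
    }
}
```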
Random access
Linked lists do not support fast random access. The get method lets you access a particular element like this: Object obj = list.get(n). However, what it actually does is step through the nodes one at a time until it reaches the element at index n.
for (int i = 0; i < list.size(); i++)
    do something with list.get(i);
The above code is inefficient. Each iteration of the loop performs a sequential walk through the list to reach the ith element.
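The two traversal styles can be placed side by side. This is a sketch with hypothetical method names; both methods compute the same sum, but the iterator version visits each node once, while the index version walks the list again on every call to get(i).

```java
import java.util.Iterator;
import java.util.LinkedList;

class Traversal {
    // Index-based traversal: each get(i) walks the list to reach index i.
    static int sumByIndex(LinkedList<Integer> list) {
        int sum = 0;
        for (int i = 0; i < list.size(); i++)
            sum += list.get(i);
        return sum;
    }

    // Iterator-based traversal: each next() follows a single link.
    static int sumByIterator(LinkedList<Integer> list) {
        int sum = 0;
        Iterator<Integer> iter = list.iterator();
        while (iter.hasNext())
            sum += iter.next();
        return sum;
    }

    static String demo() {
        LinkedList<Integer> list = new LinkedList<Integer>();
        for (int i = 1; i <= 5; i++)
            list.add(i);
        return sumByIndex(list) + " " + sumByIterator(list);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 15 15
    }
}
```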
toString() Method
AbstractCollection overrides the toString method so that it generates a string in the format [A, B, C]. Therefore, System.out.println(a) prints all elements of the linked list a in the format [A, B, C].
Comparison of Iterator and ListIterator
• ListIterator is a subinterface of Iterator; classes that implement ListIterator provide all the capabilities of both.
• The Iterator interface requires fewer methods and can be used to iterate over more general data structures, but only in one direction.
• Iterator is required by the Collection interface, whereas ListIterator is required only by the List interface.
Conversion between a ListIterator and an Index
• ListIterator has the methods nextIndex and previousIndex, which return the index values of the items that would be returned by a call to next or previous, respectively.
• The LinkedList class has the method listIterator(int index), which creates a new listIterator at the position index. (However, the method has to walk through the links to reach the position index. It is an O(n) operation except for special cases such as index equal to size().)
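A short sketch of these conversions follows, assuming a four-element list; the class name IndexConversion is hypothetical.

```java
import java.util.LinkedList;
import java.util.ListIterator;

class IndexConversion {
    static String demo() {
        LinkedList<String> list = new LinkedList<String>();
        list.add("A");
        list.add("B");
        list.add("C");
        list.add("D");
        // Cursor at position 2, that is, between B and C.
        ListIterator<String> iter = list.listIterator(2);
        int prev = iter.previousIndex();   // index of B
        int next = iter.nextIndex();       // index of C
        String element = iter.next();      // returns C
        // Cursor at position size(), that is, after the last element.
        ListIterator<String> end = list.listIterator(list.size());
        return prev + " " + next + " " + element + " "
                + end.hasNext() + " " + end.previous();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 1 2 C false D
    }
}
```

Starting at listIterator(list.size()) and calling previous() repeatedly is one way to traverse a list backward.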
The Enhanced for Statement
LinkedList<String> myList = …
count = 0;
for (String nextStr: myList)
{ if (target.equals(nextStr)) count++; }
The above enhanced for creates an Iterator object and implicitly calls its hasNext and next methods. Other Iterator methods, such as remove, are not available.
LinkedList<Integer> aList = …
sum = 0;
for (int nextInt : aList) sum += nextInt;
Each Integer object is auto-unboxed, and its int value is stored in nextInt.
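To make the implicit calls visible, the counting loop can be written both ways. The sketch below uses hypothetical method names and sample data; the explicit loop is roughly what the enhanced for statement does behind the scenes.

```java
import java.util.Iterator;
import java.util.LinkedList;

class EnhancedForDemo {
    // Counting with the enhanced for statement.
    static int countWithFor(LinkedList<String> list, String target) {
        int count = 0;
        for (String nextStr : list)
            if (target.equals(nextStr))
                count++;
        return count;
    }

    // The same loop with the iterator calls written out.
    static int countWithIterator(LinkedList<String> list, String target) {
        int count = 0;
        Iterator<String> iter = list.iterator();
        while (iter.hasNext()) {
            String nextStr = iter.next();
            if (target.equals(nextStr))
                count++;
        }
        return count;
    }

    static String demo() {
        LinkedList<String> list = new LinkedList<String>();
        list.add("a");
        list.add("b");
        list.add("a");
        return countWithFor(list, "a") + " " + countWithIterator(list, "a");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2 2
    }
}
```

When elements must be removed during the traversal, the explicit form is needed, because only it exposes the iterator and its remove method.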
The Iterable Interface
• This interface requires only that a class that implements it provide an iterator method
• The Collection interface extends the Iterable interface, so all classes that implement the List interface (a subinterface of Collection) must provide an iterator method
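Any class that implements Iterable can be used in an enhanced for statement, not only the library collections. The sketch below uses a hypothetical class SimpleSequence, modeled on the Sequence example from earlier in these notes; the fixed capacity of 100 is carried over from that example and is an assumption, not a recommendation.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical array-backed sequence that supports the enhanced for statement.
class SimpleSequence<E> implements Iterable<E> {
    private final Object[] data = new Object[100]; // fixed capacity, for brevity
    private int dataSize;

    void add(E item) { data[dataSize++] = item; }

    @Override
    public Iterator<E> iterator() {
        return new Iterator<E>() {
            private int nextIndex; // position of this iterator only

            @Override
            public boolean hasNext() { return nextIndex < dataSize; }

            @Override
            @SuppressWarnings("unchecked")
            public E next() {
                if (!hasNext())
                    throw new NoSuchElementException();
                return (E) data[nextIndex++];
            }

            @Override
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    static String demo() {
        SimpleSequence<String> seq = new SimpleSequence<String>();
        seq.add("A");
        seq.add("B");
        seq.add("C");
        StringBuilder sb = new StringBuilder();
        for (String s : seq)   // works because SimpleSequence is Iterable
            sb.append(s);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints ABC
    }
}
```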
public interface Iterable<E>
{ Iterator<E> iterator(); }
Arrays vs. LinkedList
Array/ArrayList
• Advantage: accessing an element (data[i] or get(i)) takes O(1) time.
• Disadvantage: adding or removing an element takes O(n) time in the worst case because of the extra shift operations.
Linked List
• Advantage: adding or removing an element through an iterator takes O(1) time.
• Disadvantage: accessing a particular element takes O(n) time. Therefore, traversing a linked list using an index is an O(n^2) operation, because the walk must be repeated each time the index changes. An Iterator provides a general way to traverse a list, and traversing a linked list using an iterator is O(n).