Chapter 8: Sets

Overview

The Set interface is the first type of Collection we'll discuss. This interface represents a group of elements without duplicates. Nothing in the interface definition itself forces the group to be free of duplicates; it is the actual implementations that enforce this part of the definition. The elements of the group have no perceived order, though specific implementations of the interface may place an order on those items.

Figure 8-1 shows the class hierarchy diagram for the Set interface, along with the base abstract implementation, the concrete implementation classes, and other interfaces implemented by these classes.

Figure 8-1: The Set class hierarchy.

Set Basics

In Java, a set represents a collection of unique elements. Not only must these elements be unique, but while they are in the set, each element must not be modified. While there is no programmatic restriction preventing you from modifying elements in a set, if an element were to change, it could become forever lost in the set.

While not immediately obvious in Table 8-1, the Set interface definition is the same as the Collection interface definition shown in Chapter 7.

Table 8-1: Summary of Interface Set

    VARIABLE/METHOD NAME   VERSION   DESCRIPTION
    add()                  1.2       Adds an element to the set.
    addAll()               1.2       Adds a collection of elements to the set.
    clear()                1.2       Clears all elements from the set.
    contains()             1.2       Checks if the set contains an element.
    containsAll()          1.2       Checks if the set contains a collection of elements.
    equals()               1.2       Checks for equality with another object.
    hashCode()             1.2       Returns a computed hash code for the set.
    isEmpty()              1.2       Checks if the set is empty.
    iterator()             1.2       Returns an object from the set that allows all of the set's elements to be visited.
    remove()               1.2       Removes a specific element from the set.
    removeAll()            1.2       Removes a collection of elements from the set.
    retainAll()            1.2       Removes all elements from the set not in another collection.
    size()                 1.2       Returns the number of elements in the set.
    toArray()              1.2       Returns the elements of the set as an array.

The Collection Framework provides two concrete set implementations: HashSet and TreeSet. The HashSet represents a set backed by a hash table, providing constant-time lookup of unordered elements. The TreeSet, on the other hand, maintains its elements in an ordered fashion within a balanced tree. Since the Set interface is identical to the Collection interface, we'll immediately jump into the concrete implementations.

HashSet Class

The Collection Framework introduces the HashSet collection. This implementation is backed by a hash table (HashMap, actually) for storing unique elements. The backing hash table ensures that duplicate elements are avoided, as each element is stored and retrieved through its hash code, providing constant retrieval time.

Note: For more on hash codes, see Chapter 5. The HashMap is the framework's replacement for the historical Hashtable class. It, too, provides storage for key-value pairs. HashSet just stores a dummy "present" object as the value for every key.

Most of the HashSet functionality is provided through the AbstractCollection and AbstractSet superclasses, which HashSet shares with TreeSet. As Table 8-2 shows, the HashSet class only needs to provide constructors and customize eight of its methods.

Table 8-2: Summary of the HashSet Class

    VARIABLE/METHOD NAME   VERSION   DESCRIPTION
    HashSet()              1.2       Constructs a hash set.
    add()                  1.2       Adds an element to the set.
    clear()                1.2       Removes all elements from the set.
    clone()                1.2       Creates a clone of the set.
    contains()             1.2       Checks if an object is in the set.
    isEmpty()              1.2       Checks if the set has any elements.
    iterator()             1.2       Returns an object from the set that allows all of the set's elements to be visited.
    remove()               1.2       Removes an element from the set.
    size()                 1.2       Returns the number of elements in the set.
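The note above about HashSet storing a dummy "present" object against every key can be sketched roughly as follows. This is a minimal illustration, not the actual java.util.HashSet source: the class name MapBackedSet and the field name PRESENT are hypothetical, and the code assumes a Java 7 or later compiler for generics and the diamond operator.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not the real java.util.HashSet source.
class MapBackedSet<E> {
    // One shared dummy value stored against every key.
    private static final Object PRESENT = new Object();
    private final Map<E, Object> map = new HashMap<>();

    // Returns true only if the element was not already a key in the map.
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return map.containsKey(element);
    }

    // Returns true only if the element was a key in the map.
    boolean remove(Object element) {
        return map.remove(element) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MapBackedSet<String> dogs = new MapBackedSet<>();
        System.out.println(dogs.add("Pug"));  // true
        System.out.println(dogs.add("Pug"));  // false
        System.out.println(dogs.size());      // 1
    }
}
```

Because the keys of a map are already unique, delegating to the map's key set gives uniqueness with very little extra code.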
While only eight methods are shown here, we should look at all those defined within the Set interface. As with Collection, these methods can be broken down into eight groups, with HashSet adding a ninth for its constructors. The methods of the eight groups allow you to add and remove elements as well as perform operations on the set as a whole.

Creating a HashSet

The HashSet class provides four constructors broken into two sets. The first three constructors create empty sets of varying sizes:

    public HashSet()
    public HashSet(int initialCapacity)
    public HashSet(int initialCapacity, float loadFactor)

If unspecified, the initial set size for storing elements will be the default size of a HashMap, which happens to be eleven or 101, depending upon what Java version you are using. The set counts as full once it reaches its load factor, which is 75% of capacity by default. When the set reaches that point and a new element is added, the internal structure will double in size before adding the new element (and copying in the old elements). If you don't like the 75%-full aspect, you can provide the constructor with a custom load factor.

Tip: When creating any collection, it is always best to have the local variable be of the interface type, as in Set set = new HashSet(). That way, if you later decide to change the set to a TreeSet or some other set implementation, you won't have to change any code, as you'll only be using the methods of the Set interface.

The fourth constructor acts as a copy constructor, copying the elements from one collection into the newly created set:

    public HashSet(Collection col)

With this constructor, you cannot provide a custom initial capacity or load factor. Instead, the internal map will be sized at twice the size of the collection, or eleven if the collection is small (five or fewer elements), keeping the default load factor of 75%.

Note: If the original collection had duplicates, only one of the duplicates will be in the final created set.
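As a minimal sketch of the constructors described above, the following shows a tuned empty set and the copy constructor dropping duplicates. The class name HashSetCreation and the capacity and load factor values are arbitrary illustration choices, and the code assumes a Java 7 or later compiler for generics.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HashSetCreation {
    // Copies a list (possibly with duplicates) into a new set and reports the set's size.
    static int distinctCount(List<String> source) {
        Set<String> set = new HashSet<>(source);  // copy constructor keeps one of each
        return set.size();
    }

    public static void main(String[] args) {
        // Capacity 32 and load factor 0.5f are arbitrary values for illustration.
        Set<String> tuned = new HashSet<>(32, 0.5f);
        tuned.add("Poodle");
        System.out.println(tuned.size());  // 1

        List<String> withDuplicates = Arrays.asList("Pug", "Poodle", "Pug", "Pug");
        System.out.println(distinctCount(withDuplicates));  // 2
    }
}
```

Note that both variables are declared as Set rather than HashSet, in line with the tip above.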
An easy way to initialize a set without manually adding each element is to create an array of the elements, create a List from that array with Arrays.asList(), then call this constructor with the list as the collection:

    String elements[] = {"Irish Setter", "Poodle", "English Setter", "Gordon Setter", "Pug"};
    Set set = new HashSet(Arrays.asList(elements));

Adding Elements

When you need to add elements to a set, you can either add a single element or a group of elements.

Adding Single Elements

To add a single element, call the add() method:

    public boolean add(Object element)

The add() method takes a single argument of the element to add. If the element is not in the set, it is added and true is returned. If the element happens to be in the set already, because element.equals(oldElement) returns true (for some element in the set), then the set is left unchanged: the element already in the set stays in place and false is returned. The rejected element is not stored, so if it has no other references, it becomes eligible for garbage collection. See Figure 8-2 to help you visualize this scenario.

Figure 8-2: Adding a contained element to a set.

If you are working with a set that was made to be read-only, or if adding elements is not supported, then an UnsupportedOperationException will be thrown.

Tip: If you need to modify an element in a set, you should remove it, modify it, and then re-add it. If you don't, you can consider the object lost, as there is no way of finding the object without manually digging through all the objects in the set. This is true when the change affects the results of hashCode(). If the change doesn't affect the method results, the change can be made. However, you should then question why the hashCode() results didn't change.
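The tip above can be sketched as follows. The Dog class is a hypothetical mutable element whose hash code depends on its name; it is an assumption made for illustration and does not come from the text. The code assumes a Java 7 or later compiler.

```java
import java.util.HashSet;
import java.util.Set;

class LostElementDemo {
    // Hypothetical mutable element; equals() and hashCode() both depend on name.
    static final class Dog {
        String name;
        Dog(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof Dog && ((Dog) o).name.equals(name);
        }
        @Override public int hashCode() { return name.hashCode(); }
    }

    // Changes an element while it is still in the set, then tries to look it up.
    static boolean foundAfterInPlaceChange() {
        Set<Dog> dogs = new HashSet<>();
        Dog dog = new Dog("Pug");
        dogs.add(dog);
        dog.name = "Poodle";  // the hash code changes behind the set's back
        return dogs.contains(dog);
    }

    // Follows the tip: remove, modify, then re-add.
    static boolean foundAfterRemoveModifyAdd() {
        Set<Dog> dogs = new HashSet<>();
        Dog dog = new Dog("Pug");
        dogs.add(dog);
        dogs.remove(dog);
        dog.name = "Poodle";
        dogs.add(dog);
        return dogs.contains(dog);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterInPlaceChange());    // false
        System.out.println(foundAfterRemoveModifyAdd());  // true
    }
}
```

In the first method the object is still inside the set and would still turn up during iteration, but a lookup by hash code can no longer reach it.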
Adding Another Collection

You can add a group of elements from another collection to the set with the addAll() method:

    public boolean addAll(Collection c)

Each element in the collection passed in will be added to the current set via the equivalent of calling the add() method on each element. If the underlying set changes, true is returned. If no elements are added, false is returned. As with add(), an element equal to one already in the set is not added again and does not count as a change. An UnsupportedOperationException will be thrown when working with a set that doesn't support adding elements.

Removing Elements

You can remove elements from a set in four different ways.

Removing All Elements

The simplest removal method, clear(), clears all of the elements from the set:

    public void clear()

While there is no return value, you can still get an UnsupportedOperationException thrown when working with a read-only set.

Removing Single Elements

Use the remove() method if you wish to remove a single element at a time:

    public boolean remove(Object element)

Determining whether the element is in the set is done via the equals() method of the element. If the element is found, the element is removed from the set and true is returned. If not found, false is returned. If removal is not supported, you'll get an UnsupportedOperationException thrown whether the element is present or not.

Removing Another Collection

The third way to remove elements is with removeAll():

    public boolean removeAll(Collection c)

The removeAll() method takes a Collection as an argument and removes from the set all instances of each element in the Collection passed in. The Collection passed in can be a Set or some other Collection implementation.
For instance, if the original set consisted of the following elements:

    {"Irish Setter", "Poodle", "English Setter", "Gordon Setter", "Pug"}

and the collection passed in was

    {"Poodle", "Pug", "Poodle", "Pug", "Poodle", "Pug", "Pug", "Pug"}

the resulting set would have every instance of "Poodle" and "Pug" removed. Since a set cannot have duplicates, it would remove one for each:

    {"Irish Setter", "English Setter", "Gordon Setter"}

As with most of the previously shown set methods, removeAll() returns true if the underlying set changed and false otherwise, or throws an UnsupportedOperationException if removal is not supported. And again, the equals() method is used to check for element equality.

Retaining Another Collection

The retainAll() method works like removeAll(), but in the opposite direction:

    public boolean retainAll(Collection c)

In other words, only those elements within the collection argument are kept in the original set. Everything else is removed.

Figure 8-3 should help you visualize the difference between removeAll() and retainAll(). The contents of the starting set are the five dogs listed previously (Irish Setter, Poodle, English Setter, Gordon Setter, Pug). The acting collection consists of the elements Pug, Poodle, and Samoyed. The Samoyed element is shown to demonstrate that in neither case will this be added to the original collection. Remember that sets are unordered, so the resultant set may not keep the elements in the original order.

Figure 8-3: The removeAll() method versus the retainAll() method.

Set Operations

Besides methods for filling a set, there are several other operations you can perform, all of which deal with the elements. You'll find support for fetching, finding, and copying elements, among some other secondary tasks.
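Before moving on, the removeAll() versus retainAll() contrast from Figure 8-3 can be sketched in code using the same five dogs and the same acting collection. The class and method names here are hypothetical, and the TreeSet wrapper in main() is only there to make the printed order predictable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class BulkRemovalDemo {
    static final List<String> DOGS = Arrays.asList(
        "Irish Setter", "Poodle", "English Setter", "Gordon Setter", "Pug");
    static final List<String> ACTING = Arrays.asList("Pug", "Poodle", "Samoyed");

    // Removes every element that also appears in ACTING.
    static Set<String> afterRemoveAll() {
        Set<String> set = new HashSet<String>(DOGS);
        set.removeAll(ACTING);
        return set;
    }

    // Keeps only the elements that also appear in ACTING.
    static Set<String> afterRetainAll() {
        Set<String> set = new HashSet<String>(DOGS);
        set.retainAll(ACTING);
        return set;
    }

    public static void main(String[] args) {
        // Sorted copies are printed so the output does not depend on hash order.
        System.out.println(new TreeSet<String>(afterRemoveAll()));
        // [English Setter, Gordon Setter, Irish Setter]
        System.out.println(new TreeSet<String>(afterRetainAll()));
        // [Poodle, Pug]
    }
}
```

In neither result does Samoyed appear, since both methods can only shrink the original set.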
Fetching Elements

To work with all of the elements of the set, you can call the iterator() method to get an Iterator:

    public Iterator iterator()

Since the elements of a hash set are unordered, the order of the elements returned has nothing to do with the order in which they were added or any kind of sort order. And as the capacity of the hash set changes, the elements may be reordered. In other words, don't rely on the element order returned to be consistent between calls.

Working with an Iterator returned from a Set is no different than every other Iterator:

    String elements[] = {"Irish Setter", "Poodle", "English Setter", "Gordon Setter", "Pug"};
    Set set = new HashSet(Arrays.asList(elements));
    Iterator iter = set.iterator();
    while (iter.hasNext()) {
        System.out.println(iter.next());
    }

Running this will result in output similar to the following; remember that order doesn't matter as long as each element in the set is displayed:

    Gordon Setter
    Poodle
    English Setter
    Pug
    Irish Setter

Finding Elements

If, instead of fetching elements in the set, you desire to know only whether a specific element or set of elements is in the set, this next set of methods provides discovery capabilities about the set.

Checking for Existence

The contains() method reports if a specific element is within the set:

    public boolean contains(Object element)

If found, contains() returns true; if not, false. As with remove(), equality checking is done through the element's equals() method.

Checking for Set Containment

Besides checking to see if a set contains a single element, the containsAll() method allows you to check if a set contains another whole collection:

    public boolean containsAll(Collection c)

This method takes a Collection as its argument and reports if the elements of the passed-in collection are a subset of the current set. In other words, is each element of the collection also an element of the current set?
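The contains() and containsAll() checks described above can be sketched as follows. The class and method names are hypothetical and the dog names are reused from the earlier examples.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class ContainmentDemo {
    static final Set<String> DOGS = new HashSet<String>(Arrays.asList(
        "Irish Setter", "Poodle", "English Setter", "Gordon Setter", "Pug"));

    // True when every element of candidate is also an element of DOGS.
    static boolean allPresent(Collection<String> candidate) {
        return DOGS.containsAll(candidate);
    }

    public static void main(String[] args) {
        System.out.println(DOGS.contains("Pug"));                         // true
        System.out.println(allPresent(Arrays.asList("Pug", "Poodle")));   // true
        System.out.println(allPresent(Arrays.asList("Pug", "Samoyed")));  // false
    }
}
```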
The current set can contain other elements, but the passed-in collection cannot, or containsAll() will return false. If an element is in the passed-in collection multiple times, it only needs to be in the source set once to be successful.

Checking Size

To find out how many elements are in a set, use the size() method:

    public int size()

To combine the process of getting the size and checking for no elements in the set, use the isEmpty() method instead:

    public boolean isEmpty()

You can think of the isEmpty() method as returning the following value; however, calling isEmpty() can be faster:

    return (size() == 0);

Copying and Cloning Sets

There are many ways to duplicate a HashSet. You can clone it, serialize it, copy it, or simply call the previously shown copy constructor.

Hash sets are Cloneable and have a public clone() method:

    public Object clone()

Calling the clone() method of a HashSet creates a shallow copy of that HashSet. In other words, the elements of the set aren't duplicated. Initially, both sets will refer to the same elements. However, adding or removing elements in one has no effect on the other. Changes to the attributes of a common element will be reflected in both sets. And while changes to a single element will be reflected in both collections, changing elements while they are in the set should be avoided. If you do, and an individual element's hash code is changed, the set no longer knows of the existence of the element in the set except through its iterator.

Calling clone() can be a little ugly if you only have a reference to the interface. You must call the clone() method of a concrete class, not an interface. Calling it on an interface is effectively like calling it on Object, and the clone() method of Object is protected. Thus, you must cast both the reference you call the method on and the return value of the method call, like so:

    Set set = ...
    Set set2 = ((Set)((HashSet)set).clone());

Besides implementing Cloneable, HashSet implements the empty Serializable interface. If, and only if, all of the elements of a HashSet are Serializable can you save the HashSet to an ObjectOutputStream and later read it back in through an ObjectInputStream. The following demonstrates this:

    FileOutputStream fos = new FileOutputStream("set.ser");
    ObjectOutputStream oos = new ObjectOutputStream(fos);
    oos.writeObject(set);
    oos.close();
    FileInputStream fis = new FileInputStream("set.ser");
    ObjectInputStream ois = new ObjectInputStream(fis);
    Set anotherSet = (Set)ois.readObject();
    ois.close();
    System.out.println(anotherSet);

This is also helpful for passing sets across an HttpServletResponse through a servlet, across a URL connection, or some other socket connection.

Another manner of copying elements out of a set is through the toArray() methods:

    public Object[] toArray()
    public Object[] toArray(Object[] a)

The first toArray() method will return an Object array containing all the elements in the collection. The position of an element in the array is not meant to imply any position for that element inside the set. It just so happens that elements in an array need an index. Because this method returns an Object[], every time you need to get an element out of the array, you need to cast it to the appropriate type.

When all elements of a collection are of the same type, it is easier for you to work with an array of that specific type. As with the generic Collection interface definition, this is where the second version of toArray() comes in handy: Object[] toArray(Object[] a). With this version, the toArray() method consults the passed-in array to determine the return type (and size). If the passed-in array is large enough to contain all the elements in the collection [set.size() <= a.length], the elements are placed in the array and returned.
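As a minimal sketch of the typed toArray() call just described, the following passes in an array that is exactly large enough. It assumes a Java 5 or later compiler, where the method is declared generically and the result needs no cast; with the pre-generics signature shown above, the result would have to be cast to String[]. The class and method names are hypothetical, and the array is sorted only so the result is predictable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ToArrayDemo {
    // Copies the set into a String[] of matching size, then sorts it for a stable result.
    static String[] sortedNames(Set<String> set) {
        String[] names = set.toArray(new String[set.size()]);
        Arrays.sort(names);
        return names;
    }

    public static void main(String[] args) {
        Set<String> dogs = new HashSet<String>(Arrays.asList("Pug", "Poodle", "Irish Setter"));
        System.out.println(Arrays.toString(sortedNames(dogs)));
        // [Irish Setter, Poodle, Pug]
    }
}
```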
If the array is too small, a new array of the same type will be created, sized to the current number of elements in the set as reported by size(), and used to store all the elements. It is this new array that is returned in the second case, not the original. If the array passed in is too large, the element located at the index in the array one position after the last item from the set will be set to null (a[collection.size()] = null). This may be useful if you know there are no null elements in the set and don't want to ask the set its size.

Warning: If the elements of the collection are not assignment-compatible with the array type, an ArrayStoreException will be thrown.

Checking for Equality

The HashSet class defines equality through its equals() method:

    public boolean equals(Object o)

A HashSet is equal to another object if the other object implements the Set interface, has the same size(), and contains all the same elements. Calling the equals() method on a HashSet effectively calls the equals() method on each element within the set, or at least until one reports that there is no equivalent element in the passed-in set.

Hashing Collections

The HashSet class overrides the hashCode() method to define an appropriate hash code for the set:

    public int hashCode()

The hashCode() method works such that no matter what internal order the hash set elements are in, the same hashCode() must be returned. In other words, it sums up the hash codes of all the elements.

Performance Note: Starting with Java 1.3, hash codes for Strings are cached. Previously, they were recomputed each time they were needed.

TreeSet Class

The other concrete Set implementation is the TreeSet. The TreeSet class works exactly the same as the HashSet class with one notable exception: instead of keeping its elements unordered, a TreeSet keeps its elements ordered internally. Not only are the elements ordered, but the tree is balanced. More specifically, it's a red-black tree.
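The ordering difference can be sketched briefly: the same five dogs, placed in a TreeSet, come back in the natural (alphabetical) ordering of strings regardless of insertion order. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

class OrderingDemo {
    // Builds a TreeSet from the elements and returns its printed form.
    static String sortedView(Collection<String> elements) {
        return new TreeSet<String>(elements).toString();
    }

    public static void main(String[] args) {
        List<String> dogs = Arrays.asList(
            "Irish Setter", "Poodle", "English Setter", "Gordon Setter", "Pug");
        System.out.println(sortedView(dogs));
        // [English Setter, Gordon Setter, Irish Setter, Poodle, Pug]
    }
}
```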
Having a balanced tree guarantees a quick O(log n) search time at the cost of a more time-intensive insertion (and deletion). Of course, elements added to the tree must be orderable.

Note: Red-black tree rules refresher:
1. Every node in the tree is either black or red.
2. The root is always black.
3. If a node is red, its children must be black.
4. Every path from the root to a leaf (or null child) must contain the same number of black nodes.

Because TreeSet implements the SortedSet interface as well as the Set interface, understanding TreeSet is a little more involved than HashSet. Table 8-3 lists the methods you'll need to know to use TreeSet.

Table 8-3: Summary of the TreeSet Class

    VARIABLE/METHOD NAME   VERSION   DESCRIPTION
    TreeSet()              1.2       Constructs a tree set.
    add()                  1.2       Adds an element to the set.
    addAll()               1.2       Adds a collection of elements to the set.
    clear()                1.2       Removes all elements from the set.
    clone()                1.2       Creates a clone of the set.
    comparator()           1.2       Retrieves a comparator for the set.
    contains()             1.2       Checks to see if an object is in the set.
    first()                1.2       Retrieves the first element of the set.
    headSet()              1.2       Retrieves a subset at the beginning of the entire set.
    isEmpty()              1.2       Checks if the set has any elements.
    iterator()             1.2       Returns an object from the set that allows all of the set's elements to be visited.
    last()                 1.2       Retrieves the last element of the set.
    remove()               1.2       Removes an element from the set.
    size()                 1.2       Returns the number of elements in the set.
    subSet()               1.2       Retrieves a subset of the entire set.
    tailSet()              1.2       Retrieves a subset at the end of the entire set.

As the behavior of most of the TreeSet methods duplicates that of HashSet, we'll only look at those methods that are new or specialized to TreeSet. These new methods happen to be those implemented for the SortedSet interface.

Note: In addition to their coverage here, the SortedSet interface and sorting support in general will be examined in more depth in Chapter 11.
Creating a TreeSet

The TreeSet class provides four constructors broken into two sets. The first two constructors create empty tree sets:

    public TreeSet()
    public TreeSet(Comparator comp)

In order to maintain an ordering, elements added to a tree set must provide some way for the tree to order them. If the elements implement the Comparable interface, the first constructor is sufficient. If, however, the objects aren't comparable or you don't like the default ordering provided, you can pass along a custom Comparator to the constructor that will be used to keep elements ordered. Once the TreeSet is created, you cannot change the comparator.

Note: Similar to the HashSet relying on a HashMap for the internal storage, the TreeSet relies on a TreeMap internally.

The second two constructors are copy constructors, copying all elements from one collection into another:

[...]

... alphabetical, which is the natural ordering of strings:

    English Setter
    Gordon Setter
    Irish Setter
    Poodle
    Pug

Working with Subsets

Since a TreeSet is ordered, a subset of the tree set is also ordered. As such, the TreeSet class provides several methods for working with these subsets. The two that are simplest to explain are headSet() and tailSet():

    public SortedSet headSet(Object toElement)
    public SortedSet ...

... easy. For the headSet(), the first element will always be the first() element. Similarly, for the tailSet(), the last element will be the last() element. These are always included in their respective subsets. As far as the other element specifying the range goes, the fromElement will be in the subset while the toElement will not:

    fromElement ...
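Since the excerpt above is elided in places, the following is only a hedged sketch of the boundary rule it states (fromElement included, toElement excluded) and of the Comparator constructor, using the standard java.util.SortedSet methods. The class and method names are hypothetical, and Collections.reverseOrder() is just one convenient comparator chosen for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class SubsetDemo {
    static final List<String> DOGS = Arrays.asList(
        "Irish Setter", "Poodle", "English Setter", "Gordon Setter", "Pug");

    // Tree set using the natural ordering of strings.
    static SortedSet<String> naturalOrder() {
        return new TreeSet<String>(DOGS);
    }

    // Tree set ordered by a custom Comparator (reverse of natural ordering).
    static SortedSet<String> reverseOrder() {
        Comparator<String> reverse = Collections.reverseOrder();
        SortedSet<String> set = new TreeSet<String>(reverse);
        set.addAll(DOGS);
        return set;
    }

    public static void main(String[] args) {
        SortedSet<String> set = naturalOrder();
        // headSet: everything before toElement; toElement itself is excluded.
        System.out.println(set.headSet("Irish Setter"));         // [English Setter, Gordon Setter]
        // tailSet: fromElement itself is included.
        System.out.println(set.tailSet("Irish Setter"));         // [Irish Setter, Poodle, Pug]
        // subSet: fromElement included, toElement excluded.
        System.out.println(set.subSet("Gordon Setter", "Pug"));  // [Gordon Setter, Irish Setter, Poodle]
        System.out.println(reverseOrder().first());              // Pug
    }
}
```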