Java Utilities

Raj Shekhar

You will find several standard utility interfaces and classes in the java.util package.  The java.util classes covered in this article are

  • Formatter A class for producing formatted text.
  • BitSet A dynamically sized bit vector.
  • Observer/Observable An interface/class pair that enables an object to be Observable by having one or more Observer objects that are notified when something interesting happens in the Observable object.
  • Random A class to generate sequences of pseudorandom numbers.
  • Scanner A class for scanning text and parsing it into values of primitive types or strings, based on regular expression patterns.
  • StringTokenizer A class that splits a string into tokens based on delimiters (by default, whitespace).
  • Timer/TimerTask A way to schedule tasks to be run in the future.
  • UUID A class that represents a universally unique identifier (UUID).

Formatter

The Formatter class allows you to control the way in which primitive values and objects are represented as text. The common way to represent objects or values as text is to convert the object or value to a string, using either the toString method of the object or the toString method of the appropriate wrapper class. This is often done implicitly, either through use of string concatenation or by invoking a particular overload of the PrintStream or PrintWriter print methods. This is easy and convenient, but it doesn't give you any control over the exact form of the string. For example, if you invoke

System.out.println("The value of Math.PI is " + Math.PI);

the output is

The value of Math.PI is 3.141592653589793

This is correct and informative, but it might provide more detail than the reader really needs to see, or more than you have room to show. Using a formatter you can control the format of the converted text, such as restricting the value to only three decimal places, or padding the converted string with spaces so that it occupies a minimum width regardless of the actual value being converted. All of these operations are possible starting from the result of toString, but it is much easier to have a formatter do it for you.

The primary method of a Formatter object is the format method. In its simplest form it takes a format string followed by a sequence of arguments, which are the objects and values you want to format. For convenience the PrintStream and PrintWriter classes provide a printf method (for "print formatted") that takes the same arguments as format and passes them through to an associated Formatter instance. We use the System.out.printf method to demonstrate how format strings are used.

The format string contains normal text together with format specifiers that tell the formatter how you want the following values to be formatted. For example, you can print the value of Math.PI to three decimal places using

System.out.printf("The value of Math.PI is %.3f %n", Math.PI);

which prints

The value of Math.PI is 3.142

A format specifier starts with a % character and ends with a character that indicates the type of conversion to perform. In the example above, the f conversion means that the argument is expected to be a floating-point value and that it should be formatted in the normal decimal format. In contrast, an e conversion is a floating-point conversion that produces output in scientific notation (such as 3.142e+00). Other conversions include d for integers in decimal format, x for integers in hexadecimal, and s for strings or general object conversions. Conversion indicators that can produce non-digit text are defined in both a lowercase and uppercase form, such as e and E, where the uppercase form indicates that all text in the output will be converted to uppercase (as if the resulting String had toUpperCase invoked on it).

In addition to the conversion indicator, a format specifier can contain other values that control the layout of the converted value. In the example above the ".3" is a precision indicator, which for a floating-point decimal conversion indicates how many decimal places should appear in the result, with the value rounded up or down as appropriate. You can also control the width of the output text, ensuring that different formatted elements line up correctly (such as for printing data in a tabular form) regardless of the actual value being formatted. There are also flags that can be specified to control things like the justification (left or right) and the padding character to use to maintain the minimum width. The exact meaning of precision, width, and flags depends on the conversion being applied.
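As a rough illustration of width, precision, and flags working together, the sketch below formats a small table row. It uses String.format, which accepts the same format strings as printf; the class name, the method names, and the choice of Locale.ROOT (to keep the decimal separator fixed) are assumptions made for this example.

```java
import java.util.Locale;

class FormatLayoutDemo {
    // Name left-justified in 10 columns (the - flag), value right-justified
    // in 8 columns with 2 decimal places (width 8, precision .2).
    static String row(String name, double value) {
        // Locale.ROOT is this sketch's choice so the decimal separator is always '.'
        return String.format(Locale.ROOT, "%-10s|%8.2f|", name, value);
    }

    // The 0 flag pads with leading zeros instead of spaces, here to width 5.
    static String zeroPadded(int n) {
        return String.format(Locale.ROOT, "%05d", n);
    }

    public static void main(String[] args) {
        System.out.println(row("pi", Math.PI));   // pi        |    3.14|
        System.out.println(row("e", Math.E));     // e         |    2.72|
        System.out.println(zeroPadded(42));       // 00042
    }
}
```

Because every row uses the same widths, the separators line up regardless of how long each name or value happens to be.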

There are two special conversions that we quickly mention. The first is the % conversion used to output the % character. Because the % character marks the start of a format specifier, you need a way to include a % character that should actually appear in the output. The format specifier %% will do just that (just as the escape sequence \\ is used to produce the single character \). You can specify a width with this conversion to pad the output with spaces: on the left if no flags are given, and on the right if the - flag is given to request left-justification.

The second special conversion is the line separator conversion n. A format specifier of %n (as used in the example) outputs the platform specific line separator. The line separator string is defined by the system property line.separator and is not necessarily a single newline character (\n). Unlike println that outputs the line separator for you, printf and format require that you remember to do this yourself. No width, precision, or flag values can be used with the line separator conversion.
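A minimal sketch of the two special conversions used together follows. The class and method names are invented for the example, and String.format stands in for printf so that the result can be inspected as a string.

```java
import java.util.Locale;

class SpecialConversionsDemo {
    // Builds a line such as "Progress: 75%" ending with the platform line separator.
    static String progressLine(int percent) {
        // %d formats the number, %% emits a literal percent sign,
        // and %n emits the platform-specific line separator
        return String.format(Locale.ROOT, "Progress: %d%%%n", percent);
    }

    public static void main(String[] args) {
        // print, not println: the line separator is already in the string
        System.out.print(progressLine(75));
    }
}
```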

The Formatter class also supports formatting for instances of the java.math.BigInteger and java.math.BigDecimal classes, but those are not discussed here; consult the Formatter class documentation for information concerning those classes.

BitSet

The BitSet class provides a way to create a bit vector that grows dynamically. In effect, a bit set is a vector of true or false bits indexed from 0 to Integer.MAX_VALUE, all of them initially false. These bits can be individually set, cleared, or retrieved. BitSet uses only sufficient storage to hold the highest index bit that has been set; any bits beyond that are deemed to be false.

Methods that take indices into the set throw IndexOutOfBoundsException if a supplied index is negative or, where relevant, if the from index is greater than the to index.

There are two constructors for BitSet:

public BitSet(int size)

Creates a new bit set with enough initial storage to explicitly represent bits indexed from 0 to size-1. All bits are initially false.

Other methods modify the current bit set by applying bitwise logical operations using the bits from another bit set:

public void and(BitSet other)

Logically ANDs this bit set with other and changes the value of this set to the result. The resulting value of a bit in this bit set is true only if it was originally true and the corresponding bit in other is also true.

public void andNot(BitSet other)

Clears all bits in this bit set that are set in other. The resulting value of a bit in this bit set is true only if it was originally true and the corresponding bit in other is false.

public void or(BitSet other)

Logically ORs this bit set with other and changes the value of this set to the result. The resulting value of a bit in this bit set is true only if it was originally true or the corresponding bit in other is true.

public void xor(BitSet other)

Logically XORs this bit set with other and changes the value of this set to the result. The resulting value of a bit in this bit set is true only if it has a value different from the corresponding bit in other.

You can also ask whether the current bit set has any true bits in common with a second bit set by using the intersects method.
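The logical operations can be tried out with a small sketch like the one below. The helper method of and the class name are made up for the example; note that each operation modifies the bit set it is invoked on, so the sketch works on clones to keep the original intact.

```java
import java.util.BitSet;

class BitSetOpsDemo {
    // Builds a bit set with the given indices set to true.
    static BitSet of(int... indices) {
        BitSet bits = new BitSet();
        for (int i : indices) {
            bits.set(i);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet a = of(1, 2, 3);
        BitSet b = of(3, 4);

        System.out.println(a.intersects(b));   // true: bit 3 is set in both

        // Each operation changes its receiver, so apply it to a copy of a.
        BitSet and = (BitSet) a.clone();
        and.and(b);
        BitSet or = (BitSet) a.clone();
        or.or(b);
        BitSet xor = (BitSet) a.clone();
        xor.xor(b);
        BitSet andNot = (BitSet) a.clone();
        andNot.andNot(b);

        System.out.println(and);      // {3}
        System.out.println(or);       // {1, 2, 3, 4}
        System.out.println(xor);      // {1, 2, 4}
        System.out.println(andNot);   // {1, 2}
    }
}
```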

The remaining methods are

public int cardinality()

Returns the number of bits in this bit set that are true.

public int size()

Returns the number of bits actually stored in this BitSet. Setting a bit index greater than or equal to this value may increase the storage used by the set.

public int length()

Returns the index of the highest set bit in this BitSet plus one.

public boolean isEmpty()

Returns true if this bit set has no true bits.

public int hashCode()

Returns a hash code for this set based on the values of its bits. Do not change values of bits while a BitSet is in a hash map, or the set will be misplaced.

public boolean equals(Object other)

Returns true if all the bits in other are the same as those in this set.

Here is a class that uses a BitSet to mark which characters occur in a string. Each position in the bit set represents the numerical value of a character: The 0th position represents the null character (\u0000), the 97th bit represents the character a, and so on. The bit set can be printed to show the characters that it found:

import java.util.BitSet;

public class WhichChars {
    private BitSet used = new BitSet();

    public WhichChars(String str) {
        for (int i = 0; i < str.length(); i++)
            used.set(str.charAt(i));    // set bit for char
    }

    public String toString() {
        String desc = "[";
        // visit each set bit in ascending index order
        for (int i = used.nextSetBit(0);
             i >= 0;
             i = used.nextSetBit(i+1) ) {
            desc += (char) i;
        }
        return desc + "]";
    }
}

If we pass WhichChars the string "Testing 1 2 3" we get back

[ 123Teginst]

which shows each of the characters (including the spaces) that were used in the input string, and which, incidentally, have now been sorted into numerical order. Notice how easy it is to iterate through all the set bits in a bit set.

public BitSet()

Creates a new bit set with a default amount of initial storage. All bits are initially false.

There are four methods for dealing with individual bits.

public void set(int index)

Sets the bit specified by index to true.

public void clear(int index)

Sets the bit specified by index to false.

public void flip(int index)

Sets the bit specified by index to the complement of its current value: true becomes false, and false becomes true.

public boolean get(int index)

Returns the value of the bit specified by index.

A second overloaded form of each of the above methods works on a range of bits. These overloads take a from index and a to index and either set, clear, flip, or return all bits in the range, starting with from and up to but not including to. For get, the values are returned as a new BitSet. A third overload of clear takes no arguments and clears the entire set to false. A second variant of the set method takes both the index (or range) and a boolean value to apply to the bits. This makes it easier for you to change bits arbitrarily without having to work out whether you need to invoke set or clear.

You can find the index of the next clear or set bit that is at or after a given index, using the nextClearBit and nextSetBit methods. If there is no next set bit from that index then -1 is returned.
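The range overloads and the search methods might be exercised as in the following sketch. The class name and the sample method are invented for the example; the comments show what each call is expected to produce for this particular bit set.

```java
import java.util.BitSet;

class BitSetRangeDemo {
    // Builds the bit set {2, 3, 5, 8} using range and boolean overloads.
    static BitSet sample() {
        BitSet bits = new BitSet();
        bits.set(2, 6);       // sets bits 2, 3, 4, 5 (the "to" index is excluded)
        bits.clear(4);        // clears bit 4 again
        bits.set(8, true);    // same effect as bits.set(8)
        return bits;
    }

    public static void main(String[] args) {
        BitSet bits = sample();
        System.out.println(bits);                   // {2, 3, 5, 8}
        // get(from, to) returns a new BitSet whose bit 0 corresponds to bit "from"
        System.out.println(bits.get(2, 6));         // {0, 1, 3}
        System.out.println(bits.nextSetBit(4));     // 5
        System.out.println(bits.nextClearBit(2));   // 4
        System.out.println(bits.nextSetBit(9));     // -1: no set bit at or after 9
    }
}
```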

Observer/Observable

The Observer/Observable types provide a protocol for an arbitrary number of Observer objects to watch for changes and events in any number of Observable objects. An Observable object subclasses the Observable class, which provides methods to maintain a list of Observer objects that want to know about changes in the Observable object. All objects in the "interested" list must implement the Observer interface. When an Observable object experiences a noteworthy change or an event that Observer objects may care about, the Observable object invokes its notifyObservers method, which invokes each Observer object's update method.

The Observer interface consists of a single method:

public void update(Observable obj, Object arg)

This method is invoked when the Observable object obj has a change or an event to report. The arg parameter is a way to pass an arbitrary object to describe the change or event to the observer.

The Observer/Observable mechanism is designed to be general. Each Observable class is left to define the circumstances under which an Observer object's update method will be invoked. The Observable object maintains a "changed" flag which subclass methods use to indicate when something of interest has occurred.

protected void setChanged()

Marks this object as having been changed, so that hasChanged will now return true, but does not notify observers.

protected void clearChanged()

Indicates that this object is no longer changed or has notified all observers of the last change; hasChanged will now return false.

public boolean hasChanged()

Returns the current value of the "changed" flag.

When a change occurs, the Observable object should invoke its setChanged method and then notify its observers with one of the following methods:

public void notifyObservers(Object arg)

Notifies all Observer objects in the list that something has happened, and then clears the "changed" flag. For each observer in the list, its update method is invoked with this Observable object as the first argument and arg as the second.

public void notifyObservers()

Equivalent to notifyObservers(null).

The following Observable methods maintain the list of Observer objects:

public void addObserver(Observer o)

Adds the observer o to the observer list if it's not already there.

public void deleteObserver(Observer o)

Deletes the observer o from the observer list.

public void deleteObservers()

Deletes all Observer objects from the observer list.

public int countObservers()

Returns the number of observers in the observer list.

The methods of Observable use synchronization to ensure consistency when concurrent access to the object occurs. For example, one thread can be trying to add an observer while another is trying to remove one and a third is effecting a change on the Observable object. While synchronization is necessary for maintaining the observer list and making changes to the "changed" flag, no synchronization lock should be held when the update method of the observers is invoked. Otherwise it would be very easy to create deadlocks. The default implementation of notifyObservers takes a synchronized snapshot of the current observer list before invoking update. This means that an observer that is removed while notifyObservers is still in progress will still be notified of the last change. Conversely, an observer that is added while notifyObservers is still in progress will not be notified of the current change. If the Observable object allows concurrent invocations of methods that generate notifications, it is possible for update to be called concurrently on each Observer object. Consequently, Observer objects must use appropriate synchronization within update to ensure proper operation.

The default implementation of notifyObservers uses the invoking thread to invoke update on each observer. The order in which observers are notified is not specified. A subclass could specialize notifyObservers to use a different threading model, and/or to provide ordering guarantees.

The following example illustrates how Observer/Observable might be used to monitor users of a system. First, we define a Users class that is Observable:

import java.util.*;

public class Users extends Observable {
    private Map<String, UserState> loggedIn =
        new HashMap<String, UserState>();

    public void login(String name, String password)
        throws BadUserException
    {
        if (!passwordValid(name, password))
            throw new BadUserException(name);

        UserState state = new UserState(name);
        loggedIn.put(name, state);
        setChanged();
        notifyObservers(state);
    }

    public void logout(UserState state) {
        loggedIn.remove(state.name());
        setChanged();
        notifyObservers(state);
    }

    // ...

}



A Users object stores a map of users who are logged in and maintains UserState objects for each login. When someone logs in or out, all Observer objects will be passed that user's UserState object. The notifyObservers method sends messages only if the state changes, so you must invoke setChanged on Users; otherwise, notifyObservers would do nothing.

Here is how an Observer that maintains a constant display of logged-in users might implement update to watch a Users object:


import java.util.*;

public class Eye implements Observer {
    Users watching;

    public Eye(Users users) {
        watching = users;
        watching.addObserver(this);
    }

    public void update(Observable users, Object whichState)
    {
        if (users != watching)
            throw new IllegalArgumentException();

        UserState state = (UserState) whichState;
        if (watching.loggedIn(state))   // user logged in
            addUser(state);             // add to my list
        else
            removeUser(state);          // remove from list
    }

    // ...
}

Each Eye object watches a particular Users object. When a user logs in or out, Eye is notified because it invoked the Users object's addObserver method with itself as the interested object. When update is invoked, it checks the correctness of its parameters and then modifies its display depending on whether the user in question has logged in or out.

The check for what happened with the UserState object is simple here. You could avoid it by passing an object describing what happened and to whom instead of passing the UserState object itself. Such a design makes it easier to add new actions without breaking existing code.
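One possible shape for that alternative design is sketched below. Everything here is hypothetical: the UserEvent type, its Kind enum, and the simplified EventUsers and EventEye classes are stand-ins invented for the example, not the Users and Eye classes shown above. Also note that Observable and Observer have been deprecated since Java 9, so the sketch suppresses the resulting warnings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Observable;
import java.util.Observer;

// Hypothetical event object: what happened, and to whom.
final class UserEvent {
    enum Kind { LOGIN, LOGOUT }

    final Kind kind;
    final String userName;

    UserEvent(Kind kind, String userName) {
        this.kind = kind;
        this.userName = userName;
    }
}

@SuppressWarnings("deprecation")   // Observable is deprecated as of Java 9
class EventUsers extends Observable {
    void login(String name) {
        setChanged();              // without this, notifyObservers does nothing
        notifyObservers(new UserEvent(UserEvent.Kind.LOGIN, name));
    }

    void logout(String name) {
        setChanged();
        notifyObservers(new UserEvent(UserEvent.Kind.LOGOUT, name));
    }
}

@SuppressWarnings("deprecation")
class EventEye implements Observer {
    final List<String> display = new ArrayList<>();

    @Override
    public void update(Observable source, Object arg) {
        // The event says what happened, so no query back to the source is needed.
        UserEvent event = (UserEvent) arg;
        switch (event.kind) {
            case LOGIN:  display.add(event.userName);    break;
            case LOGOUT: display.remove(event.userName); break;
        }
    }

    public static void main(String[] args) {
        EventUsers users = new EventUsers();
        EventEye eye = new EventEye();
        users.addObserver(eye);
        users.login("ada");
        users.login("grace");
        users.logout("ada");
        System.out.println(eye.display);   // [grace]
    }
}
```

Adding a new kind of event then means adding an enum constant and a case in the observers that care about it, while observers that ignore it keep working unchanged.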

Random

The Random class creates objects that manage independent sequences of pseudorandom numbers. If you don't care what the sequence is and want it as a sequence of double values, use the method java.lang.Math.random, which creates a single Random object the first time it is invoked and returns pseudorandom numbers from that object. You can gain more control over the sequence (for example, the ability to set the seed) by creating a Random object and getting values from it.

public Random()

Creates a new random number generator. Its seed will be initialized to a value based on the current time.

public Random(long seed)

Creates a new random number generator using the specified seed. Two Random objects created with the same initial seed will return the same sequence of pseudorandom numbers.

public void setSeed(long seed)

Sets the seed of the random number generator to seed. This method can be invoked at any time and resets the sequence to start with the given seed.
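The effect of seeding can be checked with a short sketch such as the following. The class name, the firstInts helper, and the seed value 42 are arbitrary choices for the example; the actual numbers produced are deliberately not shown, only the fact that they repeat.

```java
import java.util.Arrays;
import java.util.Random;

class SeedDemo {
    // Returns the first n ints produced by a generator created with the given seed.
    static int[] firstInts(long seed, int n) {
        Random rnd = new Random(seed);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = rnd.nextInt();
        }
        return out;
    }

    public static void main(String[] args) {
        // Two generators with the same seed produce the same sequence.
        System.out.println(Arrays.equals(firstInts(42L, 3), firstInts(42L, 3)));  // true

        // setSeed restarts the sequence of an existing generator.
        Random rnd = new Random(42L);
        int first = rnd.nextInt();
        rnd.nextInt();                                // advance the sequence
        rnd.setSeed(42L);
        System.out.println(rnd.nextInt() == first);   // true
    }
}
```

A fixed seed like this is mostly useful for reproducible tests and simulations.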

public boolean nextBoolean()

Returns a pseudorandom uniformly distributed boolean value.

public int nextInt()

Returns a pseudorandom uniformly distributed int value between the two values Integer.MIN_VALUE and Integer.MAX_VALUE, inclusive.

public int nextInt(int ceiling)

Like nextInt(), but returns a value that is at least zero and is less than the value ceiling. Use this instead of using nextInt() and % to get a range. If ceiling is not positive (zero or negative), an IllegalArgumentException is thrown.
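For instance, a die roll could be written along these lines. The class and method names are invented for the example. One reason to prefer this over nextInt() % 6 is that nextInt() can return negative values, so the remainder can be negative as well.

```java
import java.util.Random;

class DiceDemo {
    // Returns a value from 1 to 6 inclusive.
    static int roll(Random rnd) {
        return rnd.nextInt(6) + 1;   // nextInt(6) yields 0 through 5
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.print(roll(rnd) + " ");
        }
        System.out.println();

        try {
            rnd.nextInt(0);          // a ceiling that is not positive is rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```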

public long nextLong()

Returns a pseudorandom uniformly distributed long value between Long.MIN_VALUE and Long.MAX_VALUE, inclusive.

public void nextBytes(byte[] buf)

Fills the array buf with random bytes.

public float nextFloat()

Returns a pseudorandom uniformly distributed float value between 0.0f (inclusive) and 1.0f (exclusive).

public double nextDouble()

Returns a pseudorandom uniformly distributed double value between 0.0 (inclusive) and 1.0 (exclusive).

public double nextGaussian()

Returns a pseudorandom Gaussian-distributed double value with mean of 0.0 and standard deviation of 1.0.

All the nextType methods use the protected method next. The next method takes an int that represents the number of random bits to produce (between 1 and 32) and returns an int with that many bits. These random bits are then converted to the requested type. For example, nextInt simply returns next(32), while nextBoolean returns true if next(1) is not zero, else it returns false.

You can safely use Random from multiple threads.

The Random class specifies the algorithms to be used to generate the pseudo-random numbers but permits different algorithms to be used provided the general contract of each method is adhered to. The basic algorithm (a linear congruential generator) is defined in the next method and is used for all other methods except nextGaussian. You can create your own random number generator by overriding the next method to provide a different generating algorithm.
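As a sketch of what overriding next might look like, the subclass below swaps in a simple 64-bit xorshift step. The class name and the choice of xorshift are assumptions of this example, not something the Random class prescribes, and the generator is meant only to show the mechanism. In this sketch a later call to setSeed does not reset the xorshift state.

```java
import java.util.Random;

// Illustrative subclass: replaces the generating algorithm by overriding next.
class XorShiftRandom extends Random {
    private static final long serialVersionUID = 1L;

    private long state;   // xorshift state; must never be zero

    XorShiftRandom(long seed) {
        super(seed);      // the superclass seed is not used by this sketch
        state = (seed == 0) ? 0x9E3779B97F4A7C15L : seed;
    }

    @Override
    protected synchronized int next(int bits) {
        // one xorshift64 step
        state ^= state << 13;
        state ^= state >>> 7;
        state ^= state << 17;
        // hand back the requested number of bits, taken from the high end
        return (int) (state >>> (64 - bits));
    }

    public static void main(String[] args) {
        Random rnd = new XorShiftRandom(123L);
        // The inherited methods now draw their bits from the overridden next.
        System.out.println(rnd.nextInt(100));
        System.out.println(rnd.nextDouble());
        System.out.println(rnd.nextBoolean());
    }
}
```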

Scanner

The Scanner class will help you read files of formatted data, such as those you might generate from a method that used printf. It uses regular expressions to locate the desired data, and parsers to convert them into known types. For example, it knows how to use localized number parsing to read in values that humans might type (such as "20,352").

Many of the methods of Scanner can take a pattern to indicate how the scanner should match input. These methods all have two overloaded forms: One takes the pattern as a String and the other takes it as a java.util.regex.Pattern. When we say that a method takes a pattern, you should infer from this that there are two variants of the method as just described. Supplying a pattern as a string may require that it be compiled each time (using Pattern.compile), so if a pattern is to be reused it may be more efficient to use a Pattern object directly.

There are two primary approaches to using Scanner, although they can be reasonably intermixed as desired, with care. The first is to use it to read a stream of values. The second is line-oriented.
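The two approaches might look roughly like this when scanning a string. The class and method names are invented for the example, and Locale.US is set explicitly (an assumption of this sketch) so that a comma is treated as the digit-group separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

class ScannerDemo {
    // Stream-of-values style: pull typed tokens one after another.
    static int sumInts(String input) {
        int sum = 0;
        try (Scanner in = new Scanner(input).useLocale(Locale.US)) {
            while (in.hasNextInt()) {
                sum += in.nextInt();   // "20,352" is parsed as 20352
            }
        }
        return sum;
    }

    // Line-oriented style: read whole lines and handle each one separately.
    static List<String> lines(String input) {
        List<String> result = new ArrayList<>();
        try (Scanner in = new Scanner(input)) {
            while (in.hasNextLine()) {
                result.add(in.nextLine());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sumInts("20,352 8 40"));   // 20400
        System.out.println(lines("a b\nc d"));        // [a b, c d]
    }
}
```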

StringTokenizer

A StringTokenizer breaks a string into parts, using delimiter characters. A sequence of tokens broken out of a string is, in effect, an ordered enumeration of those tokens, so StringTokenizer implements the Enumeration interface. (For historical reasons it implements Enumeration<Object>, not Enumeration<String>.) StringTokenizer provides methods that are more specifically typed than Enumeration, which you can use if you know you are working on a StringTokenizer object. The StringTokenizer enumeration is effectively a snapshot because String objects are read-only. For example, the following loop breaks a string into tokens separated by spaces or commas:

String str = "Gone, and forgotten";
StringTokenizer tokens = new StringTokenizer(str, " ,");
while (tokens.hasMoreTokens())
System.out.println(tokens.nextToken());

By including the comma in the list of delimiter characters in the StringTokenizer constructor, the tokenizer consumes commas along with spaces, leaving only the words of the string to be returned one at a time. The output of this example is

Gone
and
forgotten

The StringTokenizer class has several constructors and methods that control which characters act as delimiters and how tokens are returned:

public StringTokenizer(String str, String delim, boolean returnTokens)

Constructs a StringTokenizer on the string str, using the characters in delim as the delimiter set. The returnTokens boolean determines whether delimiters are returned as tokens or skipped. If they are returned as tokens, each delimiter character is returned separately.

public StringTokenizer(String str, String delim)

Equivalent to StringTokenizer(str, delim, false), meaning that delimiters are skipped, not returned.

public StringTokenizer(String str)

Equivalent to StringTokenizer(str, " \t\n\r\f"), meaning that the delimiters are the whitespace characters and are skipped.

public boolean hasMoreTokens()

Returns true if more tokens exist.

public String nextToken()

Returns the next token of the string. If there are no more tokens, a NoSuchElementException is thrown.

public String nextToken(String delim)

Switches the delimiter set to the characters in delim and returns the next token. There is no way to set a new delimiter set without getting the next token. If there are no more tokens, a NoSuchElementException is thrown.

public int countTokens()

Returns the number of tokens remaining in the string using the current delimiter set. This is the number of times nextToken can return before it will generate an exception. When you need the number of tokens, this method is faster than repeatedly invoking nextToken because the token strings are merely counted, not constructed and returned.
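The following sketch shows the effect of the returnTokens flag and one use of countTokens. The class and method names are invented for the example; countTokens is used here only to presize the result list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Collects all tokens. When returnDelims is true, each delimiter
    // character is returned as a one-character token of its own.
    static List<String> tokens(String str, String delim, boolean returnDelims) {
        StringTokenizer st = new StringTokenizer(str, delim, returnDelims);
        List<String> out = new ArrayList<>(st.countTokens());
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a+b-c", "+-", false));   // [a, b, c]
        System.out.println(tokens("a+b-c", "+-", true));    // [a, +, b, -, c]
        System.out.println(
            new StringTokenizer("Gone, and forgotten", " ,").countTokens());  // 3
    }
}
```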

The methods StringTokenizer implements for the Enumeration interface (hasMoreElements and nextElement) are equivalent to hasMoreTokens and nextToken, respectively.

The delimiter characters are processed individually when StringTokenizer looks for the next token, so it cannot identify tokens separated by a multicharacter delimiter. If you need a multicharacter delimiter, use either the String class's split method or a Scanner.
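A two-character delimiter such as "::" could be handled with either alternative, as in this sketch. The class and method names are invented for the example. Both split and Scanner.useDelimiter interpret their argument as a regular expression, so the sketch quotes the delimiter with Pattern.quote to match it literally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class MultiCharDelimiterDemo {
    static List<String> withSplit(String str, String delimiter) {
        return Arrays.asList(str.split(Pattern.quote(delimiter)));
    }

    static List<String> withScanner(String str, String delimiter) {
        List<String> out = new ArrayList<>();
        try (Scanner in = new Scanner(str).useDelimiter(Pattern.quote(delimiter))) {
            while (in.hasNext()) {
                out.add(in.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A single ':' is not a delimiter here, so "a:b" stays together.
        System.out.println(withSplit("a:b::c", "::"));     // [a:b, c]
        System.out.println(withScanner("a:b::c", "::"));   // [a:b, c]
    }
}
```

A StringTokenizer given "::" as its delimiter set would instead treat every single colon as a delimiter and return a, b, and c.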

Timer and TimerTask

The Timer class helps you set up tasks that will happen at some future point, including repeating events. Each Timer object has an associated thread that wakes up when one of its TimerTask objects is destined to run. For example, the following code will set up a task that prints the virtual machine's memory usage approximately once a second:

Timer timer = new Timer(true);
timer.scheduleAtFixedRate(new MemoryWatchTask(), 0, 1000);

This code creates a new Timer object that will be responsible for scheduling and executing a MemoryWatchTask. The true passed to the Timer constructor tells Timer to use a daemon thread so that the memory tracing activity will not keep the virtual machine alive when other threads are complete.

The scheduleAtFixedRate invocation shown tells timer to schedule the task starting with no delay (the 0 that is the second argument) and repeat it every thousand milliseconds (the 1000 that is the third argument). So starting immediately, timer will invoke the run method of a MemoryWatchTask:

import java.util.TimerTask;
import java.util.Date;

public class MemoryWatchTask extends TimerTask {
    public void run() {
        System.out.print(new Date() + ": " );
        Runtime rt = Runtime.getRuntime();
        System.out.print(rt.freeMemory() + " free, ");
        System.out.print(rt.totalMemory() + " total");
        System.out.println();
    }
}

MemoryWatchTask extends the abstract TimerTask to define a task that prints the current free and total memory, prefixed by the current time. TimerTask implements the Runnable interface, and its run method is what is invoked by a Timer object when a task is to be run. Because the setup code told timer to execute once a second, the thread used by timer will wait one second between task executions.

TimerTask has three methods:

public abstract void run()

Defines the action to be performed by this TimerTask.

public boolean cancel()

Cancels this TimerTask so that it will never run again (or at all if it hasn't run yet). Returns true if the task was scheduled for repeated execution or was a once-only task that had not yet been run. Returns false if the task was a once-only task that has already run, the task was never scheduled, or the task has previously been cancelled. Essentially, this method returns true if it prevented the task from having a scheduled execution.

public long scheduledExecutionTime()

Returns the scheduled execution time of the most recent actual execution (possibly the in-progress execution) of this TimerTask. The returned value represents the time in milliseconds. This method is most often used inside run to see if the current execution of the task occurred soon enough to warrant execution; if the task was delayed too long run may decide not to do anything.

You can cancel either a single task or an entire timer by invoking the cancel method of TimerTask or Timer. Cancelling a task means that it will not be scheduled in the future. Cancelling a Timer object prevents any future execution of any of its tasks. If you purge a timer then all references to cancelled tasks are removed from its internal queue of tasks to schedule.
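The cancellation rules can be observed with a sketch like the one below. The class name and the noop helper are invented for the example, and the one-minute delay is simply long enough that the task is still pending when it is cancelled.

```java
import java.util.Timer;
import java.util.TimerTask;

class CancelDemo {
    // A task that does nothing; only its scheduling state matters here.
    static TimerTask noop() {
        return new TimerTask() {
            @Override
            public void run() { }
        };
    }

    public static void main(String[] args) {
        Timer timer = new Timer(true);
        TimerTask task = noop();
        timer.schedule(task, 60_000);         // one-shot, a minute from now

        System.out.println(task.cancel());    // true: a pending execution was prevented
        System.out.println(task.cancel());    // false: the task was already cancelled
        System.out.println(timer.purge());    // 1: one cancelled task left the queue

        timer.cancel();                       // the timer accepts no further tasks
        try {
            timer.schedule(noop(), 1000);
        } catch (IllegalStateException e) {
            System.out.println("timer already cancelled");
        }
    }
}
```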

Each Timer object uses a single thread to schedule its task executions. You can control whether this thread is a daemon thread by the boolean you specify to the Timer constructor.

public Timer(boolean isDaemon)

Creates a new Timer whose underlying thread has its daemon state set according to isDaemon.

public Timer()

Equivalent to Timer(false).

public Timer(String name)

Equivalent to Timer(false) but gives the associated thread the given name. This is useful for debugging and monitoring purposes.

public Timer(String name, boolean isDaemon)

Equivalent to Timer(isDaemon) but gives the associated thread the given name. This is useful for debugging and monitoring purposes.

If the thread is not specified to be a daemon, it will be a user thread. When the timer is garbage collected (which can only happen when all references to it are dropped and no tasks remain to be executed), the user thread will terminate. You should not rely on this behavior since it depends on the garbage collector discovering the unreachability of the timer object, which can happen anytime or never.

UUID

The UUID class provides immutable objects that represent universally unique identifiers. These are 128-bit values that are generated in such a way that they are, for practical purposes, unique. There are four different versions of UUID values, generally known as types 1, 2, 3, and 4. You can ask a UUID object for its version by using the version method.

The usual way to create a type 4 (random) UUID is to use the static factory method randomUUID. A type 3 (name-based) UUID can be obtained from the static factory method nameUUIDFromBytes, which takes an array of bytes representing the name.

You can directly construct an arbitrary form UUID by supplying two long values that represent the upper and lower halves. Of course, there are no guarantees about uniqueness for such a constructed UUID; it is up to you to supply appropriate values. The two halves of a UUID object can be retrieved as long values from the getMostSignificantBits and getLeastSignificantBits methods.

The layout of a UUID is determined by which variant is used to implement it. There are four variants: 0, 2, 6, and 7. The UUID class always generates variant 2 UUID values, known as the Leach-Salz variant. The details of these variants are not important for this article. You can ask a UUID object for its variant by using the variant method. Two UUID objects are equal only if they have the same variant and 128-bit value.

The toString method returns a string representation of a UUID that can be passed into the static fromString factory method to get a UUID object.
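The creation paths described above can be compared in a short sketch. The class name, the fromName helper, the UTF-8 encoding of the name, and the sample name "example.org" are all choices made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class UuidDemo {
    // Type 3 (name-based) UUID derived from the UTF-8 bytes of the name.
    static UUID fromName(String name) {
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        UUID random = UUID.randomUUID();
        System.out.println(random.version());   // 4
        System.out.println(random.variant());   // 2

        UUID named = fromName("example.org");
        System.out.println(named.version());                          // 3
        System.out.println(named.equals(fromName("example.org")));    // true: same name, same UUID

        // Round trip through the string form
        UUID parsed = UUID.fromString(random.toString());
        System.out.println(parsed.equals(random));                    // true

        // Rebuild from the two 64-bit halves
        UUID rebuilt = new UUID(random.getMostSignificantBits(),
                                random.getLeastSignificantBits());
        System.out.println(rebuilt.equals(random));                   // true
    }
}
```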

The remaining methods are applicable only to type 1 (time-based) UUID values, and throw UnsupportedOperationException if invoked on other types:

public long timestamp()

Returns the 60-bit timestamp associated with this UUID.

public int clockSequence()

Returns the 14-bit clock sequence value associated with this UUID.

public long node()

Returns the 48-bit node value associated with this UUID. This value relates to the host address of the machine on which this UUID was generated.
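To see these methods in action without a time-based generator (the UUID class does not provide one), the sketch below parses a hand-built type 1 value. The sample string is made up purely for illustration and was not produced by a real generator; the class and method names are likewise invented.

```java
import java.util.UUID;

class TimeBasedUuidDemo {
    // Hand-built value: version digit 1, variant 2, timestamp 1, node 0x2a.
    static final UUID SAMPLE =
        UUID.fromString("00000001-0000-1000-8000-00000000002a");

    // Reports whether the time-based accessors can be used on this UUID.
    static boolean supportsTimestamp(UUID id) {
        try {
            id.timestamp();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(SAMPLE.version());         // 1
        System.out.println(SAMPLE.timestamp());       // 1
        System.out.println(SAMPLE.clockSequence());   // 0
        System.out.println(SAMPLE.node());            // 42

        // A random (type 4) UUID has no timestamp to report.
        System.out.println(supportsTimestamp(UUID.randomUUID()));   // false
    }
}
```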







