String Tools

In Chapter 2: Java we briefly discussed the String class. Here we look again at the String class and some of its methods. We also introduce the StringBuffer, StringBuilder, and StringTokenizer classes, which provide additional string handling and processing tools.

String Methods

In Chapter 3 we discussed the valueOf() methods in the String class that convert primitive type values to strings. The String class contains a large number of other useful methods. Here we briefly examine a sample of these methods.

int length ()
This method returns the number of characters in the string as in

String str = "A string";
int x = str.length ();

This results in variable x holding the value 8.

String trim ()
Removes whitespace from the leading and trailing edges of the string.

String string = "  14 units  ";
String str = string.trim ();

This results in the variable str referencing the string "14 units".
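
As a small illustration of length() and trim() together, the sketch below checks whether a string is empty or holds only whitespace. The class name TrimDemo and the helper isBlank() are hypothetical names chosen for this example, not part of the String class discussed here.

```java
class TrimDemo {

    // Returns true if text is empty or contains only whitespace.
    // (Hypothetical helper, for illustration only.)
    static boolean isBlank(String text) {
        return text.trim().length() == 0;
    }

    public static void main(String[] args) {
        String raw = "  14 units  ";
        String trimmed = raw.trim();

        System.out.println(raw.length());        // 12
        System.out.println(trimmed.length());    // 8
        // Only the leading and trailing whitespace is removed;
        // the space inside "14 units" is kept.
        System.out.println("[" + trimmed + "]"); // [14 units]
        System.out.println(isBlank(" \t\n"));    // true
    }
}
```

Note that trim() returns a new string; the original string is left unchanged.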

int indexOf (int ch)
int lastIndexOf (int ch)
The indexOf() method returns the index, counting from 0, of the first occurrence of the given character in the string; lastIndexOf() returns the index of the last occurrence. (A char argument is widened to int.) For example,

  String string = "One fine day";
  int x = string.indexOf ('f');

This results in a value of 4 in the variable x. If the string holds no such character, the method returns -1.  To continue searching for more instances of the character, you can use the method

  indexOf(int ch, int fromIndex)

This will start the search at the fromIndex location in the string and search to the end of the string. The methods

indexOf (String str)
indexOf (String str, int fromIndex)

provide similar functions but search for a substring rather than just for a single character. Similarly, the methods

lastIndexOf (int ch)
lastIndexOf (int ch, int fromIndex)

lastIndexOf (String str)
lastIndexOf (String str, int fromIndex)

search backwards for characters and strings, starting from the right end of the string and moving from right to left. (The fromIndex second parameter still counts from the left, with the search continuing from that index position toward the beginning of the string.)
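
The following sketch shows one way to use the fromIndex parameter to find every occurrence of a character, and how lastIndexOf() behaves with and without a starting index. The class name IndexOfDemo and the helper allIndexesOf() are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfDemo {

    // Collects every index at which ch occurs in text, scanning left to right.
    // (Hypothetical helper, for illustration only.)
    static List<Integer> allIndexesOf(String text, char ch) {
        List<Integer> positions = new ArrayList<Integer>();
        int index = text.indexOf(ch);
        while (index != -1) {
            positions.add(index);
            // Resume the search one position past the last match.
            index = text.indexOf(ch, index + 1);
        }
        return positions;
    }

    public static void main(String[] args) {
        String text = "One fine day";
        System.out.println(allIndexesOf(text, 'n'));   // [1, 6]
        System.out.println(text.lastIndexOf('n'));     // 6
        // Search backwards starting at index 5, so the 'n' at 6 is skipped.
        System.out.println(text.lastIndexOf('n', 5));  // 1
        System.out.println(text.indexOf("day"));       // 9
    }
}
```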

boolean startsWith (String prefix)
boolean endsWith (String str)

These two methods indicate whether a string begins or ends with a particular substring. For example:

String [] str = {"Abe", "Arthur", "Bob"};
// For an array, length is a field, not a method.
for (int i=0; i < str.length; i++) {
    if (str[i].startsWith ("Ar")) doSomething ();
}

String toLowerCase ()
String toUpperCase ()

The first method returns a new string with all the characters set to lower case, while the second returns a new string with all the characters set to upper case.

String [] str = {"Abe", "Arthur", "Bob"};
// Convert to lower case first so the test ignores case.
for (int i=0; i < str.length; i++){
    if (str[i].toLowerCase ().startsWith ("ar")) doSomething ();
}
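
A self-contained version of the case-insensitive prefix test might look like the sketch below. The class name PrefixDemo, the helper countWithPrefix(), and the extra name "ARNOLD" are assumptions added for illustration; they stand in for the doSomething() placeholder above.

```java
class PrefixDemo {

    // Counts the names that start with prefix, ignoring upper/lower case.
    // (Hypothetical helper, for illustration only.)
    static int countWithPrefix(String[] names, String prefix) {
        String lowerPrefix = prefix.toLowerCase();
        int count = 0;
        for (int i = 0; i < names.length; i++) {
            if (names[i].toLowerCase().startsWith(lowerPrefix)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] names = {"Abe", "Arthur", "Bob", "ARNOLD"};
        System.out.println(countWithPrefix(names, "ar"));  // 2
        System.out.println("Arthur".endsWith("ur"));       // true
    }
}
```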

Below we discuss the split() method in the String class after discussing the StringTokenizer class. See the String entry in the Java 2 Platform API Specifications to examine other methods in the class.

java.lang.StringBuffer

String objects are immutable, meaning that once created they cannot be altered. Concatenating two strings does not modify either string but instead creates a new string object:

  String str = "This is ";
  str = str + "a new string object";

Here the str variable now references a completely new object that holds the "This is a new string object" string.

This is inefficient if you are doing extensive string manipulation, because every such append operation allocates a new String object and copies the characters of both operands into it. The Java virtual machine also maintains a pool of strings in memory. String literals are saved there, so if two string literals are the same, the second reference points to the string already in the pool rather than to a duplicate. Strings created at run time by concatenation are not added to the pool unless you call intern() on them; they are ordinary objects, and the intermediate strings left behind by repeated appends occupy memory until the garbage collector reclaims them.
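
The behavior of identical literals can be observed with the == operator, which compares object references rather than contents. The sketch below is a minimal illustration; the class name StringPoolDemo and the helper concatAtRuntime() are hypothetical names, and the results shown rely on the language rule that a concatenation involving a non-constant operand produces a newly created String.

```java
class StringPoolDemo {

    // Concatenates two strings at run time, producing a new String object.
    // (Hypothetical helper, for illustration only.)
    static String concatAtRuntime(String left, String right) {
        return left + right;
    }

    public static void main(String[] args) {
        String a = "hello";
        String b = "hello";                       // same literal, same pooled object
        String c = concatAtRuntime("hel", "lo");  // equal contents, different object

        System.out.println(a == b);               // true
        System.out.println(a == c);               // false
        System.out.println(a.equals(c));          // true
        // intern() returns the pooled string with the same contents.
        System.out.println(a == c.intern());      // true
    }
}
```

This is also why string contents should be compared with equals() rather than ==.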

The class java.lang.StringBuffer offers more efficient string creation. For example:

  StringBuffer strb = new StringBuffer ("This is ");
  strb.append ("a new string object");
  System.out.println (strb.toString());

The StringBuffer uses an internal char array for the intermediate steps so that new string objects are not created. If the array becomes full, its contents are copied into a new, larger array with additional space available for more append operations.
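
The growth of the internal array can be observed through the capacity() method. The sketch below is illustrative only: the class name BufferGrowthDemo and the helper joinNumbers() are hypothetical, and the capacity after growth is printed rather than assumed, since the exact growth policy is an implementation detail.

```java
class BufferGrowthDemo {

    // Appends the numbers 0..count-1, separated by commas, into one buffer.
    // (Hypothetical helper, for illustration only.)
    static String joinNumbers(int count) {
        StringBuffer buffer = new StringBuffer();
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                buffer.append(',');
            }
            buffer.append(i);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        StringBuffer buffer = new StringBuffer();
        // The no-argument constructor starts with room for 16 characters.
        System.out.println(buffer.capacity());   // 16

        buffer.append("This text is longer than sixteen characters");
        // The internal array has been replaced by a larger one.
        System.out.println(buffer.length() + " chars, capacity " + buffer.capacity());

        System.out.println(joinNumbers(5));      // 0,1,2,3,4
    }
}
```

If the final size is roughly known in advance, passing it to the StringBuffer(int capacity) constructor avoids the intermediate copies.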

java.lang.StringBuilder

J2SE 5.0 added the StringBuilder class, which is a drop-in replacement for StringBuffer in cases where thread safety is not an issue. Because StringBuilder is not synchronized, it offers faster performance than StringBuffer.

In general, you should use StringBuilder in preference to StringBuffer. In fact, the J2SE 5.0 javac compiler normally uses StringBuilder instead of StringBuffer whenever you perform string concatenation as in

System.out.println ("The result is " + result);

All the methods available on StringBuffer are also available on StringBuilder, so it really is a drop-in replacement.
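
As a brief sketch of StringBuilder in use, the example below chains append() calls and uses reverse(), both of which work the same way on StringBuffer. The class name BuilderDemo and the helpers describe() and reversed() are hypothetical names chosen for this example.

```java
class BuilderDemo {

    // Builds a message with chained append() calls; each call returns the builder.
    // (Hypothetical helper, for illustration only.)
    static String describe(String name, int result) {
        StringBuilder sb = new StringBuilder();
        sb.append("The result for ").append(name).append(" is ").append(result);
        return sb.toString();
    }

    // Reverses the characters of text using the builder's reverse() method.
    static String reversed(String text) {
        return new StringBuilder(text).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("run 1", 42));  // The result for run 1 is 42
        System.out.println(reversed("stressed"));   // desserts
    }
}
```

Replacing StringBuilder with StringBuffer in this code requires no other changes.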

java.util.StringTokenizer

The java.util.StringTokenizer class allows you to break a string into substrings, or tokens, that are separated by delimiters. The delimiters are whitespace characters (spaces, tabs, newlines, carriage returns, etc.) by default, but you can define others.

StringTokenizer implements the Enumeration interface, and its hasMoreTokens() and nextToken() methods step through the tokens:

  String str = "This is a string object";
  StringTokenizer st = new StringTokenizer (str);
  while (st.hasMoreTokens ()) {
    System.out.println (st.nextToken ());
    ...
  }

On the console this shows:

This
is
a
string
object

An overloaded constructor allows you to specify the delimiters. For example,

String str = "A*bunch*of*stars";
StringTokenizer st = new StringTokenizer (str,"*");

This breaks the string into the tokens separated by the "*" character.
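
The delimiter string may contain several characters, each of which acts as a delimiter on its own. The sketch below collects the tokens into an array; the class name TokenizerDemo and the helper tokens() are hypothetical names chosen for this example.

```java
import java.util.StringTokenizer;

class TokenizerDemo {

    // Splits text on any one of the characters in delimiters.
    // (Hypothetical helper, for illustration only.)
    static String[] tokens(String text, String delimiters) {
        StringTokenizer st = new StringTokenizer(text, delimiters);
        // countTokens() reports how many tokens remain to be read.
        String[] result = new String[st.countTokens()];
        for (int i = 0; st.hasMoreTokens(); i++) {
            result[i] = st.nextToken();
        }
        return result;
    }

    public static void main(String[] args) {
        String[] stars = tokens("A*bunch*of*stars", "*");
        for (int i = 0; i < stars.length; i++) {
            System.out.println(stars[i]);          // A, bunch, of, stars (one per line)
        }
        // Both '=' and ';' act as delimiters here.
        System.out.println(tokens("x=1;y=2", "=;").length);  // 4
    }
}
```

Consecutive delimiters do not produce empty tokens: tokenizing "A**bunch" on "*" yields just "A" and "bunch".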

String.split ()

J2SE 1.4 added the split() method to the String class to simplify the task of breaking a string into substrings, or tokens. This method uses a regular expression to specify the delimiters. Regular expressions are familiar from Unix tools such as grep (the name comes from the ed editor command g/re/p, "globally search for a regular expression and print").

A full discussion of regular expressions is beyond the scope of this book. See almost any introductory Unix text or the Java API documentation for the java.util.regex.Pattern class.

In its simplest form, searching for a regular expression consisting of a single character finds a match of that character. For example, the character 'x' is a match for the regular expression "x".

The split() method takes a parameter giving the regular expression to use as a delimiter and returns a String array containing the tokens so delimited. Using split(), the first example above becomes

String str = "This is a string object";
String[] words = str.split (" ");
for (int i=0; i < words.length; i++)
  System.out.println (words[i]);

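To use "*" as a delimiter, specify "\\*". In a regular expression, * is a special character (the "zero or more" quantifier), so it must be escaped with a backslash to be treated as a literal character to look for; the backslash is doubled because it must itself be escaped inside a Java string literal: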

String str2 = "A*bunch*of*stars";
String[] starwords = str2.split ("\\*");
for (int i=0; i < starwords.length; i++)
  System.out.println (starwords[i]);
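
Because the delimiter is a regular expression, it can match more than a single fixed character. The sketch below contrasts a single-space delimiter with the pattern "\\s+" (one or more whitespace characters) and shows a character class. The class name SplitDemo and the helper words() are hypothetical names chosen for this example.

```java
import java.util.Arrays;

class SplitDemo {

    // Splits text into words separated by runs of whitespace.
    // (Hypothetical helper, for illustration only.)
    static String[] words(String text) {
        return text.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String spaced = "This  is   spaced";

        // A single space as delimiter yields empty strings between adjacent spaces.
        System.out.println(Arrays.toString(spaced.split(" ")));   // [This, , is, , , spaced]

        // "\\s+" treats each run of whitespace as one delimiter.
        System.out.println(Arrays.toString(words(spaced)));       // [This, is, spaced]

        // A character class matches any one of the listed characters.
        System.out.println(Arrays.toString("x=1;y=2".split("[=;]")));  // [x, 1, y, 2]
    }
}
```

Unlike StringTokenizer, split() returns an empty string for each pair of adjacent delimiters inside the text, so the choice of pattern matters when the input may contain repeated delimiters.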

For most string splitting tasks, the String.split() method is much easier and more natural to use than the StringTokenizer class. However, StringTokenizer is still useful for some tasks. For example, an overloaded StringTokenizer constructor allows you to specify that the tokens to be returned include the delimiter characters themselves.
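
The three-argument constructor StringTokenizer(String str, String delim, boolean returnDelims) is the overload in question. A minimal sketch, assuming a simple arithmetic expression as input, is shown below; the class name ReturnDelimsDemo and the helper tokensWithDelimiters() are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ReturnDelimsDemo {

    // Tokenizes expression and keeps each delimiter as a one-character token.
    // (Hypothetical helper, for illustration only.)
    static List<String> tokensWithDelimiters(String expression, String delimiters) {
        // The third argument, true, asks for the delimiters to be returned too.
        StringTokenizer st = new StringTokenizer(expression, delimiters, true);
        List<String> tokens = new ArrayList<String>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokensWithDelimiters("12+3*4", "+*"));  // [12, +, 3, *, 4]
    }
}
```

This is convenient when the delimiters carry meaning, such as the operators in an expression.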


Latest update: July 28, 2006


Java is a trademark of Sun Microsystems, Inc.