Home : Course Map : Chapter 12 : Java :
Timing & Performance
JavaTech
Course Map
Chapter 12

Printing
  Demo 1
Cursor Icons
  Demo 2
MouseButtons
  Demo 3
PopupMenu
  Demo 4
Keystrokes
  Demo 5
Audio
  Demo 6
Timing & Speed
  Demo 7
Toolkit
  Demo 8
AppletContext
  Demo 9
Exercises

    Supplements
Java Beans
More APIs
Java & Browsers
  Demo 1
     About JavaTech
     Codes List
     Exercises
     Feedback
     References
     Resources
     Tips
     Topic Index
     Course Guide
     What's New

Improving execution times in a Java program can be very important part of making it a usable tool. There are various profiler tools available commercially that give detailed information on the time taken by various parts of a program. The Sun J2SE java program, in fact, includes the options -Xprof and -Xrunhprof to produce time profiles.

There are some optimization steps that the programmer can make to improve the performance. For example, if you are sure that you will not be overriding a method then you should you use the modifier final. The compiler can then inline the method since the interpreter will not have to search for possible overriding methods.

Inline puts the program code directly in the execution flow and therefore avoids jumping to and back from another section of memory.

For AWT components we saw that double buffering and other techniques (see Chapter 11: Supplements: AWT Flicker ) reduce the number of calls to the host windowing system and this speeds execution considerably. (Swing components already employ double bufferiing.)

Avoiding multiple drawing calls reduces drawing times. We mentioned in a Chapter 6: Java : Drawing Polygons, a considerable speedup can be obtained by using a single drawPolyline() call instead of multiple drawLine()calls when drawing a curve.

Timing with System.currentTimeMillis()

When you are unsure about the relative speed of different coding techniques, you can test them with the static method System.currentTimeMillis(). This method returns a long value equal to the number of milliseconds since a standard date in 1970. That number in itself is of little value, but the difference between two millisecond time readings can be meaningful. Thus, you can bracket a code section of interest with this method and find the difference in milliseconds to get an idea of how long that section of code takes to run.

On today's high performance machines, you often need to loop over an operation a large number of times to produce a time in the millisecond range. Such a timing measurement has come to be known as a microbenchmark and is fraught with difficulties for a variety of reasons including just-in-time and dynamic compilation techniques, compilation warm-up periods, the use of separate compile threads, background compilation, compiler optimizations, dead code removal, and others.

Nevertheless, many people still rely on System.currentTimeMillis() to measure execution time differences. We explain here how that is normally done and attempt to ameliorate many of the concerns that make microbenchmarks untrustworthy.

With multiprocessing happening in the operating system and multithreading in the JVM, the measured times will vary from run to run, so a statistical average is more meaningful than a single run. It is wise to unload any unnecessary programs that might steal time slices away from the program you are testing.

Demo Timing Application Program

As an example, consider the java.util.Vector class we discussed in Chapter 10 : Java. It is often very useful compared to a regular fixed-length array because of Vector's ability to remove and add elements. However, in situations where high performance is required, you may find that an array is preferable because Vector offers noticeably slower performance than an array.

One reason that Vector is slow is because it is synchronized for thread safety. An alternative that can be used in most situations is the ArrayList class which does not have the synchronization overhead. In the test code below we compare timings of object arrays, Vector, and ArrayList objects. For consistency, we populate each with Integer objects.

In the main() method shown below in the class TimeTest, we read in some command line parameters and then call doTests() several times. Inside doTests() we call four methods that test fetching Integer objects from four different container types - Vector, ArrayList, regular object arrays, and an ArrayList collection object that takes advantage of the new generics feature in Java 5.0 (see Chapter 10). If you want to run this code on a pre-5.0 Java system, you'll need to comment out the generics version.

The program uses the following methods to check the four container types:

testv() - vector
testal() - ArrayList
testali() - ArrayList<Integer>
testa() - array

In each case, the fetches from the containers are in a loop in order to make the total time large enough to measure. We have to do something with the Integer objects fetched from the containers or else the compiler will notice that nothing is changing inside the loop and remove the loop altogether. So for our timing loop we first cast the Object type retrieved from the container to an Integer and then we accumulate a total sum of all the values after converting from Integer type to int type using the intValue() method on Integer.

TimeTest.java
import java.util.*;

/**
  * Run timing tests on four types of code:
  *  1 - Vector element access
  *  2 - ArrayList element access
  *  3 - ArrayList element access.
  *  4 - Object array access (using Integer objects
**/
public class TimeTest
{
  private int fWarmup, fNum, fOuter, fSleep;

  /** Fill timing parameters. **/
  public TimeTest (int warmup, int sleep, int num, int outer) {
    fWarmup = warmup;
    fSleep = sleep;
    fNum = num;
    fOuter = outer;
  } // ctor

  /** Run tests. **/
  public static void main (String[] args) {
    int warmup=0, sleep=0, num=0, outer=0, total=0;
    if (args.length == 5) {
       warmup = Integer.parseInt (args[0]);
       sleep  = Integer.parseInt (args[1]);
       num    = Integer.parseInt (args[2]);
       outer  = Integer.parseInt (args[3]);
       total  = Integer.parseInt (args[4]);
       System.out.println ("Testing with warmup = " + warmup +
         ", sleep = " + sleep + ", num = " + num +
         ", outer = " + outer + ", total = " + total);
    }
    else {
        System.err.println ("Usage: java TimeTest warmup " +
        "sleep loop outer total");
        System.exit (1);
    }

    System.out.println ("V\tAL\tALI\tarray");
    TimeTest tt = new TimeTest (warmup, sleep, num, outer);
    for (int i = 0; i < total; i++) {
      tt.doTests ();
    }
  } // main

  /** Carry out the different set of timing tests.**/
  public void doTests () {
    long vectime    = testv ();
    long arlisttime = testal ();
    long alitime    = testali ();
    long arraytime  = testa ();

    System.out.println (vectime + "\t" + arlisttime +
      "\t" + alitime + "\t" + arraytime);
  } // doTests

  /**
    * Create a Vector and an array whose elements reference
    * Integer objects. A time comparison is made between accessing
    * the Integer objects with the Vector and the array.
   **/
  public long testv () {
    // Create Vector and fill elements with Integer objects
    Vector vec = new Vector (fNum);
    for (int i=0; i < fNum; i++) {
        vec.add (new Integer (i));
    }

    // Now test access times by looping through the Vector
    // to access each Integer element.

    // First warmup the hotspot compiler.
    long sum = 0;
    for (int i = 0; i < fWarmup; i++) {
        sum += ((Integer)(vec.elementAt (i))).intValue ();
    }
    // And give it time to finish the JIT compile
    // in the background.
    if (fSleep > 0) {
        try {
          Thread.sleep (fSleep);
        }
        catch (InterruptedException e)
        {}
    }
    // Then do the loop for real and time it.
    long t1 = System.currentTimeMillis ();
    for (int j = 0; j < fOuter; j++) {
        for (int i = 0; i < fNum; i++) {
            sum += ((Integer)(vec.elementAt (i))).intValue ();
        }
    }
    long t2 = System.currentTimeMillis ();
    long elapsed = t2 - t1;
    vec.clear ();
    vec = null;
    System.gc ();
    return elapsed;
  } // testv

  /** Create Object array and fill elements with Integer objects.**/
  public long testa () {
    Object[] array = new Object[fNum];
    for (int i=0; i < fNum; i++){
        array[i] = new Integer(i);
    }

    // Now test access times by looping through the array
    // to access each Integer element.

    // First warmup the hotspot compiler.
    int sum = 0;
    for (int i=0; i < fWarmup; i++) {
        sum += ((Integer) (array[i])).intValue();
    }
    // And give it time to finish the JIT compile inthe background.
    try {
      Thread.sleep (fSleep);
    }
    catch (InterruptedException e) {}

    // Then do the loop for real and time it.
    long t1 = System.currentTimeMillis();
    for (int j=0; j < fOuter; j++) {
        sum = 0;
        for (int i=0; i < fNum; i++) {
            sum += ((Integer) (array[i])).intValue();
        }
    }
    long t2 = System.currentTimeMillis();
    long elapsed = t2 - t1;
    array = null;
    System.gc ();
    return elapsed;
  } // testa

  /** Create ArrayList and fill elements with Integer objects.**/
  public long testal () {
    ArrayList al = new ArrayList (fNum);
    for (int i=0; i < fNum; i++){
        al.add (new Integer(i));
    }

    // Now test access times by looping through the array list
    // to access each Integer element.

    // First warmup the hotspot compiler.
    int sum = 0;
    for (int i=0; i < fWarmup; i++) {
        sum += ((Integer)(al.get (i))).intValue();
    }
    // And give it time to finish the JIT compile inthe background.
    try {
      Thread.sleep (fSleep);
    }
    catch (InterruptedException e) {}

    // Then do the loop for real and time it.
    long t1 = System.currentTimeMillis();
    for (int j=0; j < fOuter; j++) {
        sum = 0;
        for (int i=0; i < fNum; i++) {
            sum += ((Integer)(al.get (i))).intValue();
        }
    }
    long t2=System.currentTimeMillis();
    long elapsed = t2 - t1;
    al.clear ();
    al = null;
    System.gc ();
    return elapsed;
  } // testal

  /** Create ArrayList and fill elements with Integer objects. **/
  long testali () {
    ArrayList ali = new ArrayList (fNum);
    for (int i=0; i < fNum; i++){
        ali.add (i);
    }

    // Now test access times by looping through the array list
    // to access each Integer element.

    // First warmup the hotspot compiler.
    int sum = 0;
    for (int i=0; i < fWarmup; i++) {
        sum += ali.get (i);
    }
    // And give it time to finish the JIT compile in the background.
    try {
      Thread.sleep (fSleep);
    }
    catch (InterruptedException e) {}

    // Then do the loop for real and time it.
    long t1 = System.currentTimeMillis ();
    for (int j=0; j < fOuter; j++) {
        sum = 0;
        for (int i=0; i < fNum; i++) {
            sum += ali.get (i);
        }
    }
    long t2 = System.currentTimeMillis();
    long elapsed = t2 - t1;
    ali.clear ();
    ali = null;
    System.gc ();
    return elapsed;
  } // testali

} // class TimeTest

In the testv() method, for example, we first populate a Vector with Integer objects. Then we begin to fetch from the Vector and accumulate the sum. We are using Sun's Hotspot(tm) compiler, which uses adaptive optimization, so we want to give it time to warm up, we first perform a short summation loop without timing. The number of warm up loops is specified by one of the command line parameters. After warming up we sleep a short while.

The purpose of the sleep is to relinquish the processor so that the Hotspot compiler can finish a background compile of the loop if needed. Finally we begin the timing loop by calling System.currentTimeMillis(), followed by the loop itself (actually an inner and outer loop) and then another call to System.currentTimeMillis() so we can calculate the elapsed time. We then clear the Vector, set the reference to null, and request garbage collection (System.gc ()). The elapsed time is returned to doTests() where it is printed to the console.

The command line parameters adjust the number of warm up loops, the sleep time, the number of inner and outer loops to be timed, and the total number of times to run the tests. If you download this code you can experiment with the various parameters to see what effect they have on the timing results. The results can vary widely based on the platform in use.

In a run of the program, via the command line we set the five parameters as follows:

num warmup iterations = 3000
sleep = 300 msecs
num inner loop interations = 3000
num outer loop iterations = 3000

num test runs = 4

The following four runs show typical output variations:

C:\Java\...>java TimeTest 3000 3000 3000 3000 4
Testing with warmup = 3000, sleep = 10, num = 3000, outer = 3000, total = 4
V       AL      AL<I>   array
1032    203     187     47
1031    219     187     47
1062    218     203     110
1031    203     187     63

C:\Java\...>java TimeTest 3000 10 3000 3000 4
Testing with warmup = 3000, sleep = 10, num = 3000, outer = 3000, total = 4
V       AL      AL<I>   array
1062    219     188     46
1406    328     343     109
1391    328     359     110
1407    328     360     109

C:\Java\...>java TimeTest 3000 10 3000 3000 4
Testing with warmup = 3000, sleep = 10, num = 3000, outer = 3000, total = 4
V       AL      AL<I>   array
1047    204     360     110
1406    219     359     47
1406    344     343     109
1406    344     360     109
 

 

For this case we see that the Vector is slowest, the ArrayList is up to four to five times faster than the Vector, the ArrayList<Integer> is varies from faster to slower than the non-generic version, and the simple object array is fastest of all; up to 14 times faster than the Vector and 2-4 times faster than the ArrayList.

While the exact times will vary considerably from platform to platform, these general relavtive timings will hold. So from these tests you would clearly not want to use a Vector in a time critical code segment and use a plain array where time is paramount. So cleary we see that such timing tests can be valuable even if the absolute numbers are not especially meaningful.

Other performance tips

Here are some tips on performance coding. See the reference section below for links to a number of sites with extensive lists of performance coding recommendations.

  • Local variables run faster than instance variables.

  • The long and double type variables typically require extra time to access and modify since they contain additional bytes compared to int and float. However, significantly slower performance is not true of all JVMs on all platforms since some take advantage of particular processor capabilities.

  • " The JVM bytecode is weighted towards int operations so use int type except where you specifically need to use one of the other types. Math operations with the shorter integer types are widened in the JVM to int. For example, a sum of two byte values results in an int value (that may be cast back to a byte if the result is to be stored into a byte variable).

  • Use x += A; assignment commands rather than x = x + A; The first requires 1 instruction and the latter 4.

  • If you are concatenating lots of strings, use StringBuffer and its append() method instead of the String "+" operator. If a Java 5.0 JVM is available, use StringBuilder instead.

  • As demonstrated in the TimeTest example above, the original Java utility classes like Vector, Enumeration, and Hashtable are slow because they use synchronization for thread safety. Although synchronization overhead has been significantly reduced in modern versions of Java, if you do not require synchronization, then the newer utilities from the Java Collections Framework are normally superior. The new classes ArrayList and HashMap generally replace Vector and Hashtable, respectively, and Iterator is preferred over Enumeration.

  • Avoid creating lots of objects if possible, especially in loop in which speed is important. Creating an object takes time and uses up memory resources.

  • Avoid method calls in a loop as much as possible. That is, do your loops inside of methods rather than calling methods inside of loops.

  • Performance can vary significantly among different JVMs for the same platform. Do timing measurements for all the JVMs that would potentially be used with your program

References & Web Resources

Latest update: Oct. 3, 2005

              Tech
Tech APIs
Exercises

           Physics
Math/Science APIs
JAIDA Program
  Demo 1
Demo 1 Discussion
Exercises

  Part I Part II Part III
Java Core 1  2  3  4  5  6  7  8  9  10  11  12 13 14 15 16 17
18 19 20
21
22 23 24
Supplements

1  2  3  4  5  6  7  8  9  10  11  12

Tech 1  2  3  4  5  6  7  8  9  10  11  12
Physics 1  2  3  4  5  6  7  8  9  10  11  12

Java is a trademark of Sun Microsystems, Inc.