The Java Virtual Machine (JVM) is a program that takes
a Java bytecode file (the ".class"
file) as input and interprets and executes the instructions in that
file.
We will not go deeply into the workings of the JVM
in this course, but we give an overview here and come back to
it several times in later chapters. We hope to provide enough insight
into the JVM to help explain various aspects of the Java
language design and performance.
For detailed descriptions of the JVM specification,
see the references, which list books that cover the JVM design.
Note that as long as a
JVM implementation meets the specification, its developers
have wide latitude in how their JVM actually carries out the job.
Bytecode
To illustrate what the bytecode looks like we take
the following simple program Test.java:
public class Test
{
  public static void main(String args[])
  {
    int i;
    i = 2;
    i = i + 7;
  }
}
and compile it with
> javac Test.java
This produces the file Test.class,
which is a binary file that's not readable by most humans. We can
convert the file to a readable form with the javap
tool as shown here:
C:\> javap -c Test
Compiled from Test.java
public class Test extends java.lang.Object {
    public Test();    // a default constructor created
    public static void main(java.lang.String[]);
}

Method Test()
   0 aload_0
   1 invokespecial #3
   4 return

Method void main(java.lang.String[])
   0 iconst_2    // Put integer 2 on stack
   1 istore_1    // Store the top stack value at location 1
   2 iload_1     // Put the value at location 1 on stack
   3 bipush 7    // Put the value 7 on the stack
   5 iadd        // Add two top stack values together
   6 istore_1    // The sum, on top of stack, stored at location 1
   7 return      // Finished processing
The output of javap
shows the class file in an assembler-style format. Assembler
code lies very close to the machine level, but it gives text names
to the opcodes and organizes the code so that people can read
it (at least with some practice).
Note: Sun
does not provide an assembler tool to take handwritten assembler
code and turn it into a class file. However, there are some independent
assemblers available.
The JVM uses 8-bit opcodes, allowing
up to 256 possible instructions. Most of the instructions are
quite basic, such as the iconst_2
instruction above, which pushes the value 2 onto the operand stack
(see below). A few of the instructions are quite elaborate (compared
to instructions in most hardware processors), such as one that
creates an array and also initializes all of its elements
to zero.
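To make the one-byte opcode idea concrete, here is a minimal sketch of a disassembler for only the opcodes that appear in main above. The byte values are the ones the JVM specification assigns to these instructions; the class name MiniDisassembler and its method are made up for this illustration, and a real class file contains much more than the raw method code shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only: it decodes the handful of
// opcodes used by Test.main, not the full instruction set.
class MiniDisassembler {

    // The code bytes of Test.main. Each instruction starts with a one-byte opcode.
    static final byte[] MAIN_CODE = {
        0x05,           // iconst_2
        0x3c,           // istore_1
        0x1b,           // iload_1
        0x10, 0x07,     // bipush 7 (opcode followed by a one-byte operand)
        0x60,           // iadd
        0x3c,           // istore_1
        (byte) 0xb1     // return
    };

    // Returns one "offset mnemonic" line per instruction.
    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xff;   // read the byte as unsigned, 0..255
            int length = 1;                 // most of these instructions are one byte long
            String text;
            switch (opcode) {
                case 0x05: text = "iconst_2"; break;
                case 0x3c: text = "istore_1"; break;
                case 0x1b: text = "iload_1"; break;
                case 0x10: text = "bipush " + code[pc + 1]; length = 2; break;
                case 0x60: text = "iadd"; break;
                case 0xb1: text = "return"; break;
                default:
                    throw new IllegalArgumentException(
                        "opcode not handled in this sketch: " + opcode);
            }
            lines.add(pc + " " + text);
            pc += length;
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : disassemble(MAIN_CODE)) {
            System.out.println(line);
        }
    }
}
```

The offsets it prints match the left-hand column of the javap listing. The jump from 3 to 5 comes from bipush, which carries its operand in the byte after the opcode.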
We'll discuss the instruction
set and other aspects of the bytecode further in later chapters.
JVM Design
Unlike many hardware processors, the JVM does not
allow access to registers that hold program counters, operands,
etc. Instead it uses operand stacks and local variables.
Every time a method is called, or invoked,
a new stack (Last-In-First-Out memory) is created to hold
operand values for instructions and to receive the results of instruction
operations. Argument values are passed to a method via the stack, and the
method's return value is passed back via the stack. The stack values are
32 bits wide. The iconst_2
instruction in the above program puts the integer value 2 on top
of the stack.
Note: This is an example
of where knowing something about the JVM helps explain an important
aspect of the Java language. double
and long
values, which are 64 bits, require two of the 32-bit-wide slots
on the stack.
A JVM is likewise permitted to write a long or double field that is
not declared volatile as two separate 32-bit operations. This can cause
problems if a thread is stopped, or another thread reads the value, in between these
two operations: the data will be seen in an indeterminate state.
Leaving data in an inconsistent state is the kind of hazard that led to the stop()
method of the original Thread
class being deprecated in version 1.2; suspend()
and resume()
were deprecated at the same time because they are prone to deadlock.
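The two 32-bit slots mentioned in the note can be pictured with ordinary Java arithmetic. The sketch below splits a long into two 32-bit halves and shows the value that results when only one half has been replaced. The class and method names are invented for this illustration, and the code models the effect only; it says nothing about how any particular JVM actually stores values.

```java
// Illustration only: models a 64-bit value as two 32-bit halves.
class TwoSlots {

    // Upper 32 bits of a 64-bit value.
    static int high(long v) {
        return (int) (v >>> 32);
    }

    // Lower 32 bits of a 64-bit value.
    static int low(long v) {
        return (int) v;
    }

    // Reassembles a 64-bit value from two 32-bit halves.
    static long join(int high, int low) {
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    // The value seen if the high half of newValue has been written
    // but the low half still belongs to oldValue.
    static long tornWrite(long oldValue, long newValue) {
        return join(high(newValue), low(oldValue));
    }

    public static void main(String[] args) {
        long oldValue = 0x0000000100000002L;
        long newValue = 0x0000000300000004L;
        long torn = tornWrite(oldValue, newValue);
        // The result is neither the old nor the new value.
        System.out.println(Long.toHexString(torn)); // prints "300000002"
    }
}
```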
Similarly, memory is allocated for local variables
in each method invocation, and each variable is given a number. In the
above example, the variable "i"
becomes variable 1 (variable 0 holds the args parameter of main).
The instruction istore_1
pops the value at the top of the stack and stores it in local variable
1.
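The interplay between the operand stack and the numbered local variables can be followed by replaying the instructions of main by hand. This is only a sketch: the class OperandStackDemo is hypothetical, an ArrayDeque stands in for the operand stack, and an int array stands in for the local variables (in the real JVM, slot 0 would hold the args reference rather than an int).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustration only: replays the bytecode of Test.main one instruction at a time.
class OperandStackDemo {

    // Returns the final value of local variable 1 (the variable "i").
    static int run() {
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack for this invocation
        int[] locals = new int[2];                 // slot 0 stands in for args, slot 1 is i

        stack.push(2);                             // iconst_2
        locals[1] = stack.pop();                   // istore_1
        stack.push(locals[1]);                     // iload_1
        stack.push(7);                             // bipush 7
        stack.push(stack.pop() + stack.pop());     // iadd
        locals[1] = stack.pop();                   // istore_1
        return locals[1];                          // the real method returns nothing (void)
    }

    public static void main(String[] args) {
        System.out.println("i = " + run()); // prints "i = 9"
    }
}
```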
There are a number of other features used in the JVM,
such as a constant pool that holds symbolic data for a class,
but we will discuss these later.
JVM Implementation
Although the bytecode cannot access registers or directly
reference memory locations and must obey various other restrictions,
the actual JVM program can internally use whatever techniques are
convenient for a particular platform. As long as the Java
bytecode sees only a system that complies with the JVM
specification, the JVM programmer has broad
discretion in its implementation.
As mentioned in the history
section, Java was always intended for a wide array of platforms,
including very simple embedded processors that might provide few
or no registers. So the stack approach was taken to allow for Java
to run on such basic hardware. Of course, the JVM program itself
will run as normal on a processor with a register architecture.
Typically the JVM is written in C or C++ (since virtually
every platform has a C compiler). The simplest interpreter-style
approach would involve just a big switch statement, e.g.
in pseudocode:
switch ( instruction )
{
  case iconst_2:
    ...
  case istore_1:
    ...
  ...
}
in which each instruction would jump to the code in
the appropriate case section.
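A runnable version of this idea is sketched below. It is written in Java only to stay in the language of this course, whereas a real JVM would do this in native code, and it handles only the opcodes used by Test.main. The class SwitchInterpreter, its fixed-size stack, and its execute method are assumptions made for the illustration, not part of any actual JVM.

```java
// Illustration only: a switch-based interpreter loop for the opcodes in Test.main.
class SwitchInterpreter {

    // Opcode values as assigned by the JVM specification.
    static final int ICONST_2 = 0x05;
    static final int BIPUSH   = 0x10;
    static final int ILOAD_1  = 0x1b;
    static final int ISTORE_1 = 0x3c;
    static final int IADD     = 0x60;
    static final int RETURN   = 0xb1;

    // The code bytes of Test.main.
    static final byte[] MAIN_CODE = {
        0x05, 0x3c, 0x1b, 0x10, 0x07, 0x60, 0x3c, (byte) 0xb1
    };

    // Executes the code and returns the local variables once 'return' is reached.
    static int[] execute(byte[] code) {
        int[] stack = new int[2];   // operand stack; depth 2 is enough for this method
        int sp = 0;                 // index of the next free stack slot
        int[] locals = new int[2];  // slot 0 stands in for args, slot 1 is i
        int pc = 0;                 // program counter

        while (true) {
            int instruction = code[pc++] & 0xff;
            switch (instruction) {
                case ICONST_2: stack[sp++] = 2;                  break;
                case BIPUSH:   stack[sp++] = code[pc++];         break;
                case ILOAD_1:  stack[sp++] = locals[1];          break;
                case ISTORE_1: locals[1] = stack[--sp];          break;
                case IADD:     sp--; stack[sp - 1] += stack[sp]; break;
                case RETURN:   return locals;
                default:
                    throw new IllegalStateException(
                        "opcode not handled in this sketch: " + instruction);
            }
        }
    }

    public static void main(String[] args) {
        int[] locals = execute(MAIN_CODE);
        System.out.println("local 1 = " + locals[1]); // prints "local 1 = 9"
    }
}
```

Each pass through the loop fetches one opcode and jumps to its case, which is simple to write but pays the cost of the fetch and dispatch for every instruction executed.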
However, as we will discuss in the next
section, most JVMs employ far more sophisticated approaches
to optimize the performance of the bytecode and approach
C-like speeds.
References & Web Resources
Latest update: Mar.14.2004