The Java Virtual Machine

Operations on the JVM Operand Stack

Here we review the basic operations on a stack, viewed from both the ADT (Abstract Data
Type) approach and also from the modified approach taken by the JVM. As an ADT, a
stack is a LIFO data structure with only a few basic operations. The two basic operations
we shall consider are Push and Pop. These are described first using the IA–32 assembler syntax.

PUSH X ; Push the value at location X onto the stack.

POP Y ; Pop the last value pushed onto the stack
; and store it into the location Y.

The description of the stack as a LIFO data structure illustrates its basic operation. Whatever
is last put on the stack is the first removed. We shall adopt the standard terminology that any
item is added to the top of the stack and removed from that location. While any correct stack
implementation must be based on a SP (Stack Pointer), we shall not use that detail here. In the
IA–32 architecture, the 32–bit stack pointer is called ESP.

Here is a short sequence of stack operations, just to serve as a review for the ADT.
We begin by assuming the following associations of values with locations.

Location	Value
X1	1
X2	2
Y	0

In this sequence, we show only that part of the stack that is directly relevant.

PUSH X1 ; Push the value at location X1 onto the stack.

PUSH X2 ; Push the value at location X2 onto the stack.

POP Y ; Pop the value from the stack and place into Y.
; The value 2 is no longer a part of the stack.

PUSH X1 ; Push the value at location X1 onto the stack.

The JVM operand stack is handled fairly much as the abstract data type.
Here are the key differences.
      1.      The push operation is replaced by a variety of load instructions.
      2.      The pop operation is augmented by a variety of store operations.
      3.      There are a number of operations that pop two operands at a time.
      4.      Some of the above operations pop two operands and then push one result.

Java Primitive Data Types

The primitive data types for the JVM are those for the Java language. Here is a table.

Data Type	Bytes	Format	Range
char	2	Unicode character	\u0000 to \uFFFF *
byte	1	Signed Integer	–128 to + 127
short	2	Signed Integer	– 32768 to + 32767
int	4	Signed Integer	– 2147483648 to + 2147483647
long	8	Signed Integer	– (2⁶³) to 2⁶³ – 1
float	4	IEEE Single Precision Float	Magnitude: 1.40·10^–45 to 3.40·10³⁸, and zero
double	8	IEEE Double Precision Float	Magnitude: 4.94·10^–3224 to 1.798·10³⁰⁸, and zero

* The character with 16–bit code 0x0000 through that with 16–bit code 0xFFFF.

While the primitive data types are as expected, the JVM storage is based on 32–bit (4 byte)
entities. Specifically, the operand stack is organized into 32–bit slots and the local variable
storage is considered as an array of 32–bit words of no particular data type.

The correspondence between the Java language primitive data types and the storage allocations
of the JVM is simple: the two 64–bit data types (long and double) are each stored in two
32–bit entries in the local variable array and stored as two entries on the operand stack. All
other data types are assigned one 32–bit storage location.

Put another way, from the JVM storage view, each data type is either a single word or a
double word. The JVM instructions interpret these as appropriate. Consider the following
two Java byte code instructions

dadd pop two double–precision floating point values from the stack,
add them, and push the result back onto the stack

ladd pop two long (64 bit) signed integer values from the stack,
add them, and push the result back onto the stack.

In either case, we have the following stack conditions.

Before	After
Value1–Word1	Result–Word1
Value1–Word2	Result–Word2
Value2–Word1
Value2–Word1
Something else

It is interesting to note that Java stores all multi–byte values in big–endian order, as opposed
to the Intel architectures, which use little–endian order. Big–endian order is also called
“network order”, in that it is used for values transmitted over the global Internet. As the
JVM provides no instructions for direct access of the memory of the base hardware (Intel,
Motorola, Sparc, etc.), this hardware independence causes no problem.

A Sample of Some Instructions

At this time, it might be convenient to show a sample of Java source code and discuss its
representation as instructions at the bytecode level. To avoid confusion, this author finds
it expedient to discuss the bytecode instructions before using them.

In Java bytecodes, each instruction contains a one–byte opcode followed by zero or more
operands. The name “bytecode” comes from the size of the opcode. It appears that the JVM
can accommodate at most 2⁸ = 256 different opcodes; though there is a way around this limit.

The following example shows two ways to load integers; that is, push the values onto the
operand stack. The iload instructions load from the indexed variable storage, and an
interesting variety of iconst instructions load constant values. We first examine the
iconst instructions, noting that each has a separate opcode.

Opcode           Instruction           Comment
0x02                iconst_m1        Push the integer constant –1 onto the stack.
0x03                iconst_0          Push the integer constant 0 onto the stack
0x04                iconst_1          Push the integer constant 1 onto the stack
0x05                iconst_2          Push the integer constant 2 onto the stack
0x06                iconst_3          Push the integer constant 3 onto the stack
0x07                iconst_4          Push the integer constant 4 onto the stack
0x08                iconst_5          Push the integer constant 5 onto the stack

There are two variants of the iload instruction, one for general indices into the local
variable array and one dedicated to the first four elements of this array. Here we focus on
the latter instructions, each with its one byte opcode. None of these has an argument.

Opcode           Instruction           Comment
0x1A               iload_0             Copy the integer in location 0 and push onto the stack.
0x1B               iload_1             Copy the integer in location 1 and push onto the stack.
0x1C               iload_2             Copy the integer in location 2 and push onto the stack.
0x1D               iload_3             Copy the integer in location 3 and push onto the stack.

The other variant of the iload instruction has opcode 0x15. Its appearance in byte code
is as the 2 byte entry 0x15 NN, where NN is the hexadecimal value of the variable index. Thus
the code 0x15 0A would be read as “push the integer in element 10 of the variable array
onto the stack”. Remember that 0x0A is the hexadecimal representation of decimal 10.

There are two variants of the istore instruction, one for general indices into the local
variable array and one dedicated to the first four elements of this array. Here we focus on
the latter instructions, each with its one byte opcode. None of these has an argument.

Opcode           Instruction           Comment
0x3B               istore_0          Pop the integer and store into location 0.
0x3C               istore_1          Pop the integer and store into location 1.
0x3D               istore_2          Pop the integer and store into location 2.
0x3E                istore_3          Pop the integer and store into location 3.

The other variant of the istore instruction has opcode 0x36. Its appearance in byte code
is as the 2 byte entry 0x36 NN, where NN is the hexadecimal value of the variable index.

With all of that said, here is the Java source code example.

int a = 3 ;

int b = 2 ;

int sum = 0 ;

sum = a + b ;

Here is the local variable table for this example. Note that we make the reasonable assumption
that table space is reserved for the variables in the order in which they were declared.

Location Index	Variable stored
0	a
1	b
2	sum

Here is the Java bytecode for the above simple source language sequence. In this, we add a
number of comments, each in the Java style, to help the reader make the connection.

// int a = 3 ;

iconst_3 // Push the constant value 3 onto the stack

istore_0 // Pop the value and store in location 0.

// int b = 2 ;

iconst_2 // Push the constant value 2 onto the stack

istore_1 // Pop the value and store in location 1

// int sum = 0 ;

iconst_0 // Push the constant value 0 onto the stack

istore_2 // Pop the value and store in location 2

// sum = a + b ;

iload_0 // Push value from location 0 onto stack

iload_1 // Push value from location 1 onto stack

iadd // Pop the top two values, add them,
// and push the value onto the stack.

istore_2 // Pop the value and store in location 2.

Just to be very explicit about this, we shall show the state of the operand stack and
variable array as the bytecode above is executed.

In the diagrams below, the gray color in a box indicates that the entry is not valid logically
within the current context. For example, following the execution of the iconst_3 instruction,
the stack has only one valid entry, which contains the value 3. Any entry on the stack that is
“below” the current entry might have meaning within another context, but not in this one.

Similarly, there are values in the variable array, but the values have no meaning in this context.

// int a = 3 ;

iconst_3 // Push the constant value 3 onto the stack

istore_0 // Pop the value and store in location 0.

// int b = 2 ;

iconst_2 // Push the constant value 2 onto the stack

istore_1 // Pop the value and store in location 1

// int sum = 0 ;

iconst_0 // Push the constant value 0 onto the stack

istore_2 // Pop the value and store in location 2

To facilitate reading the example, we repeat the state of the JVM just before the addition.

// sum = a + b ;

iload_0 // Push value from location 0 onto stack

iload_1 // Push value from location 1 onto stack

iadd // Pop the top two values, add them,
// and push the value onto the stack.

istore_2 // Pop the value and store in location 2.

To complete the discussion of the example, we need to list the integer (32 bit two’s complement) arithmetic operations.
Here is a table of the standard operations.

Opcode       Instruction    Comment
0x60            iadd               Pop the top two values, add them, and push the result onto the stack.
0x64            isub                Pop the two values, subtract the top one from the second one (next to
                                          top on the stack before subtraction), and push the result onto stack.

0x68 imul Pop the two values, multiply them, and push the result onto the stack.

0x6C           idiv                Pops the two values, divide the value that was second from top by the
                                          value that was at the top of the stack, truncates the result to the nearest
                                          integer, and pushes the result onto the stack.

Here is the Java bytecode for the short sequence used in the example above. It has ten bytes.

06 3B 05 3C 03 3D 1A 1B 60 3D

The Conditional Instructions

Conditional instructions are one of the class of instructions that separate a stored program
computer from a simple calculator. It must be possible to compare two values of the same
type and make branching decisions based on the result. The facility to compare two values
of different types (integer and float, float and double) will probably be provided by the
compiler of a higher level language, and involve type conversions.

The JVM divides these operations into two classes: conditional branch instructions and
comparison instructions. The division between the two classes is based on operand type,
with the operators called “comparison instructions” used to compare long integers (64 bit),
single precision floating point values, and double precision floating point values.

In the JVM, the conditional each branch instructions pops one value from the stack, examines it,
and branches accordingly. There are fourteen instructions in this class. Each of these fourteen
is a three-byte instruction, with the following form.

Byte    Type                                          Range                      Comment
1          unsigned 1-byte integer field     0 to 255                    8-bit opcode
2, 3      signed 2-byte integer field         -32768 to 32767       branch offset

For each of these instructions, the branch target address is computed as (pc + branch_offset),
where pc (program counter) references the address of the opcode of the branch instruction,
and branch_offset is a 16-bit signed integer indicating the offset of the branch target.

The first set of instructions to discuss cover the Java null object.

Opcode Instruction   Comment
0xC6             ifnull             This pops the top item from the operand stack. If it is an object
                                            reference to the special object null, the branch is taken.

0xC7 ifnonnull The branch is taken if the item does not reference null.

The following twelve instructions treat each numeric value as a 32-bit signed integer. While
the Java language supports signed integers with length of 8 bits and 16 bits, it appears that
each of these types is extended to 32 bits when pushed onto the stack.

Each of the instructions in the next set pops a single item off the stack, examines it as a 32-bit
integer value, and branches accordingly.

Opcode Instruction Comment

0x99                ifeq                  jump if the value is zero
0x9E                ifle                   jump if the value is zero or less than zero (not positive)
0x9B               iflt                   jump if the value is less than zero (negative)
0x9C               ifge                  jump if the value is zero or greater than zero (not negative)
0x9D               ifgt                  jump if the value is greater than zero (positive)
0x9A               ifne                  jump if the value is not zero

Each of the next instructions pops two items off the stack, compares them as 32-bit signed
integers, and branches accordingly. For each of these instructions, we assume that the stack
status before the execution of the branch instruction is as follows.

V1 the 32-bit signed integer at the top of the stack
V2 the 32-bit signed integer next to the top of the stack.

Opcode Instruction    Comment
0x9F      if_icmpeq      Branch if V1 == V2
0xA2      if_icmpge      Branch if V2 ≥ V1
0xA3      if_icmpgt      Branch if V2 > V1
0xA4      if_icmple       Branch if V2 ≤ V1
0xA1      if_icmplt       Branch if V2 < V1
0xA0      if_icmple       Branch if V1 ≠ V2

In the JVM, the comparison operators pop two values from the stack, compare them, and
then push a single integer onto the top of the stack to indicate the result. Assume that,
prior to the execution of the comparison operation; the state of the stack is as follows.

V1_Word 1
V1_Word 2
V2_Word 1
V2_Word 2

Here is a description of the effect of this class of instructions.

Result of comparison	Value Pushed onto Stack
V1 > V2	– 1
V1 = V2	0
V1 < V2	1

Here is a list of the comparison instructions.

Opcode     Instruction     Comment
0x94          lcmp                Takes long integers from the stack (each being two words, four
                                          stack entries are popped), compares them, and pushes the result.

0x97          fcmpl               Each takes single–precision floats from the stack (two stack
0x98          fcmpg              entries in total), compares them and pushes the result. The values
                                          +0.0 and –0.0 are treated as being equal.

0x95          dcmpl              Each takes double–precision floats from the stack (four stack
0x96          dcmpg             entries in total), compares them and pushes the result. The values
                                          +0.0 and –0.0 are treated as being equal.

Note that there are two comparison operations for each floating point type. This is based on
how one wants to compare the IEEE standard value NaN (Not a Number, use for results of
arithmetic operations such as 0/0). The fcmpl and dcmpl instructions return –1 if either
value is NaN. The fcmpg and dcmpg instructions return 1 if either value is NaN.

The JVM includes a number of other useful bytecode instructions, including the following.

1. Instructions to push other integer constant values onto the stack.

2. Instructions to handle other stack operations, such as discarding the top elements,
and swapping the top element with the one just below it.

3. A complete suite of logical operations.

4. A complete suite of type conversion operations.

5. Instructions for unconditional branches, subroutine calls, and subroutine returns.

The purpose of this sub–chapter has been to use the JVM (Java Virtual Machine) to illustrate
a commonly used stack–oriented architecture. A full discussion of the JVM must include
very much more.