Why Assembler Language
Does Not Use Variables

Edward L. Bosworth, Ph.D.

Associate Professor

TSYS Department of Computer Science

Columbus State University

Columbus, GA 31907 – 5645

Sample of Assembler Language Code

Consider the assignment statement Z = X + Y.

We are using IBM® 370 Series Assembler Language as an example.
Here is a possible translation of the high–level language above.

STD 0,Z     STORE RESULT INTO ADDRESS Z

The first question in examining this text is to determine what it does.
I have already told you much of what it does, but let’s start at the beginning.

PLEASE NOTE:      This lecture will use much that has yet to be discussed.
Please focus on the “big picture”.  The details, such as
how to write the code, will be discussed in due time.

A Two – Pass Assembler

Here, we shall focus only on the first pass of either a compiler or assembler.
The goal is to read and interpret the symbols found in the text of the code.

Here is what a two–pass assembler would first do with this text.

STD 0,Z     STORE RESULT INTO ADDRESS Z

The symbols LD, AD, and STD would be identified as assembler language
operations, and the symbol
0 would be identified as a reference to register 0.

The symbols X, Y, and Z would be identified properly only if those symbols
had been properly identified.  Here is the way to do it for this code.

X        DC  D‘3.0’  DOUBLE-PRECISION FLOAT

Y        DC  D‘4.0’

Z        DC  D‘0.0’

Here is a somewhat literal translation of the text.  Note that the
spacing has been altered so that I could get more on a line.

STD 0,Z     Store result into 8 bytes at address Z

X  DC  D‘3.0’  Set aside eight bytes (64 bits)
at an address to be associated with
the label X, initialize it to 3.0

Y  DC  D‘4.0’  Set aside eight bytes (64 bits)
at an address to be associated with
the label Y, initialize it to 4.0

Z  DC  D‘0.0’  Set aside eight bytes (64 bits)
at an address to be associated with
the label Z, initialize it to 0.0

Cautions on the Assembler Process

In the above fragments, we see two independent processes at work.

1)    Use of data declarations to reserve space in memory to

2)    Use of assembly code to perform operations on these data.

Note that these are inherently independent.

It is the responsibility of the coder to apply the operations to
the correct data types.

Occasionally, it is proper to apply a different (and apparently
inconsistent) operation to a data type.  Consider the following.

XX      DS  D    Double-precision floating point

All that really says is “Set aside an eight–byte memory area, and
associate it with the symbol
XX.”

Any eight–byte data item could be placed here,
even a 15–digit packed decimal format.  (This is commonly done)

To show what could happen, and commonly does in student programs,
lets rewrite the above fragment.

STD 0,Z     STORE RESULT INTO ADDRESS Z

X        DC  E‘3.0’  SINGLE-PRECISION FLOAT, 4 BYTES

Y        DC  E‘4.0’  ANOTHER SINGLE-PRECISION

Z        DC  D‘0.0’  A DOUBLE PRECISION

The first instruction “LD  0,X” will go to address X and extract the next
eight bytes.  This will be four bytes for 3.0 and four bytes for 4.0.

The value retrieved will be 0x4130 0000 4140 0000, which represents
a double–precision number with value slightly larger than 3.0.

Had X and Y been properly declared, the value retrieved would have been
0x4130 0000 0000 0000.

What A Modern Compiler Does

Consider the following fragments of Java code.

double x = 3.0;  // 64 bits or eight bytes

double y = 4.0;  // 64 bits or eight bytes

double z = 0.0;  // 64 bits or eight bytes

// More declarations and code here.

z = x + y;       // Do the addition that is
// proper for this data type.

// Here, it is double-precision

Note that the compiler will interpret the source–language statement
z = x + y” according to the data types of the operands.

Here is an informal translation of the Java code.  We assume that the Java
compiler is multi–pass, and that it has a standard first pass.

The standard first pass will identify all of the symbols, allocate storage for
each, and assign a data type for each.

The compiler, as a system program with input and output, maintains a number
of internal tables to assist in generating the appropriate code.

Here is our reading.  We assume a table used to describe the variables.
For each variable, this table stores:      the character representation,
the storage location allocated, and
the data type for the variable.

When an operation on a variable is called for, it is the description stored
in this compiler table that is used to select the proper low–level code.

Rereading the Java Text (page 2)

Let’s begin by rereading and interpreting the code we have seen.

double x = 3.0;
// Allocate 8 bytes of storage to be associated with
// the symbol “x”.  Initialize the value to the
// double–precision floating–point value 3.0.
// Place entries in the compiler tables allocating
// eight bytes for storage and indicating that this
// variable is double–precision floating point.

double y = 4.0;  // Same for this symbol,
// but set the value to 4.0.

double z = 0.0;  // Same for this symbol,
// but set the value to 4.0.

z = x + y;       // Do the addition that is
// proper for this data type.

// The data declarations determine the operation.

Another Java Code Fragment

Here is more code, similar to the first fragment.

float  a = 3.0;  // 32 bits or four bytes

float  b = 4.0;  // 32 bits or four bytes

float  c = 0.0;  // 32 bits or four bytes

double x = 3.0;  // 64 bits or eight bytes

double y = 4.0;  // 64 bits or eight bytes

double z = 0.0;  // 64 bits or eight bytes

// More declarations and code here.

c = a + b;       // Single-precision floating-point
z = x + y;       // Double-precision floating-point

The operations “c = a + b” and “z = x + y” have no meaning, apart
from the data types recorded by the compiler.

More on This Code: IBM® 370 Assembler Equivalents

Here is the sort of thing that might happen.  Assume the data declarations
given above, and repeated here is somewhat altered fashion.

float  a = 3.0, b = 4.0, c = 0.0 ;

double x = 3.0, y = 4.0, z = 0.0 ;

c = a + b ;     // LE  0,A  Instructions appropriate
// AE  0,B  for single-precision
// STE 0,C  floating point data.

z = x + y ;     // LD  2,X  Instructions appropriate
// STD 2,Z  floating point data.

IMPORTANT POINT
The assembler code emitted is entirely dependent on the data types for
the operands, as declared earlier in the program.

Another Note:     When possible, the compiler will avoid immediate reuse
of registers, in an attempt to keep as much data in local
registers for later use.  The code is more efficient.

Side Note: How to Write a Better Compiler

Consider the following Java fragment, focusing on how it might be
compiled and how it should be compiled.

double v = 0.0, w = 0.0, x = 3.0, y = 4.0, z = 5.0 ;

w = x + y ;

v = x + y + z ;

The example below shows inefficient code of the type actually emitted by an
early 1970’s era compiler.  The modern compiler keeps the sum
x + y in
register 0 and reuses it as a partial sum in the next result
x + y + z.

Older Compiler              Modern Compiler

LD  0, X          LD  0, X
STD 0, W          STD 0, W

LD  0, X
STD 0, V          STD 0, V

A Final Comment on Assembler Addresses

Here is some code that will work, though it is strange.

STD 0,Z     STORE RESULT INTO ADDRESS Z

LD  2,Z     RESULT NOW IN REGISTER 2

X        DC  D‘3.0’  DOUBLE-PRECISION FLOAT: 8 bytes

Y        DC  D‘4.0’  DOUBLE-PRECISION FLOAT: 8 bytes

Z        DC  F       32-BIT INTEGER: 4 BYTES

Z1       DC  H       16-BIT INTEGER: 2 BYTES

Z2       DC  H       16-BIT INTEGER: 2 BYTES

The code above will work.  At the end, register 2 will have the correct result.

What about the strange declarations for Z, Z1, and Z2?  All that matters is
that the
STD 0,Z instruction has eight contiguous bytes into which to place
its result so that the
LD 2,Z instruction can retrieve it.  This is done.

Summary

The most obvious conclusion is that it is not appropriate to discuss
assembler language code in terms of variables.

The name “variable” should be reserved for higher–level compiled
languages in which a data type is attached to each data symbol.

Here is a brief comparison.

Language                               Assembler                 Compiled

Data type determined by         Operation                  Data
Declaration