Chapter 14: Data Comparison and Branching

The purpose of this chapter is to bring together a number of topics discussed in earlier chapters, so that we may emphasize a few of the more important points.  As the chapter emphasizes comparison and branching, we begin with a theoretical discussion of sorting.

All comparison is based on the idea of a sort order.  There are four possible operators that can be applied to give a sort order; these are usually denoted as “<”, “>”, “£”, and “³”.  The two operators “=” and “¹” are interesting, but will not do for sorting.

As we can choose any of the four operators to support a sort, your author arbitrarily chooses
the “less than” operator and, by implication, its opposite “greater than or equal”.

A < B        A is less than B.  A precedes B in sort order.
A ³ B        A is not less than B.  A does not precede B in sort order.

In order to support a sort order, a data type must have the following two properties.

1)   Every pair of elements in the data type can be explicitly compared.  Let A and B
belong to the data type.  Then, it is the case that either A < B or A ³ B.

2)   The transitivity property holds.  Let A, B, and C belong to the data type.
If A < B and B < C, it necessarily holds that A < C.

As an aside, it is easy to discover data types that do not suggest an explicit comparison.  The easiest example is that of points in a plane, which are specified by two coordinates.  Data types that always support explicit comparison but do not support the transitivity property can be devised, but they are more artificial.

The natural tendency would be to define two basic sort orders: numeric and alphabetic.  While this works for data handled manually by humans, it is not really appropriate for data stored in computers.  All data stored in computers, being binary, are ultimately subject to a numeric comparison.  This statement holds for all primitive data types.

The numeric data types (halfword integer, fullword integer, packed decimal, and all floating point types) all support the expected sort order.  Another way to put this is, that subject to precision and range limitations, the standard rules of algebra apply.  The sort order for character data follows the EBCDIC format, which is ultimately an 8–bit integer format.

The reader should note that the EBCDIC sort order is equivalent to alphabetical order only when comparing characters all in upper case or all in lower case.  There are a few surprises resulting from the decision to design EBCDIC for easy translation from punch card codes.

The following is a true sort sequence in EBCDIC: “a” < ‘h” < “z” < “A” < “H” < “Z”.  The reason for this is seen in the following table of codes.

 Character “A” “H” “Z” “a” “h” “z” EBCDIC X‘C1’ X‘C8’ X‘E9’ X‘81’ X‘81’ X‘A9’

The following is also a true sort sequence in EBCDIC: “I” < “!” < “J”, as the hexadecimal codes are  X‘C8’, X‘D0’, and X‘D1’.   In standard alphabetical order, there is no letter that stands between “I” and “J”.

One may imagine a sort order on Boolean data, but it does not help very much.  Boolean variables can have only two values 0 (False) or 1 (True).  What we can clearly say is that
0 ¹ 1, or False¹  True.  It is also commonly claimed that False < True, but this seem to your author to have little practical significance.  In any case, we cannot have transitivity with only two values to consider.  The reader should arrive at an independent opinion on this issue.

Branching Strategy

The essence of conditional branching is the two–step process of data–type specific comparison followed by a branch based on the results.  There are a number of design strategies that can be used to design the appropriate instruction set.  IBM seems to have selected the idea of a standard set of branch conditions (with synonyms) and designing each comparison instruction to set one of the branch conditions.

The reader should remember that the following instructions also set the branch conditions.

Binary arithmetic:             A, S, AH, SH, AR, SR

Boolean            N, NC, NI, NR,
O, OC, OI, OR,
X, XC, XI, XR

Packed Format:                 AP, SP, ZAP

The conditional branch strategy operates based on the condition codes, which are derived from bits in the PSW (Program Status Word).  For the System/370 these are bits 18 and 19 of the 64 bit PSW.  Remember that bits are numbered left to right; the leftmost bit is bit 0.

The two bits stored in the PSW give rise to four standard condition codes.  These are:

 Condition Code Interpretation Binary Decimal 0 0 0 Result is zero or comparison is equal. For Boolean operators, the result has no bits with value 1. 0 1 1 Result is negative or the first argument sorts lower than the second. For Boolean operators, the result has at least one 1 bit. 1 0 2 Result is positive or the first argument sorts higher than the second. 1 1 3 Arithmetic overflow has occurred.

As noted previously, it seems a bit odd that the Boolean operator would set the negative bit when the result contains a 1 bit.  Consider the following two binary values, each having 16 bits and possible to interpret as a halfword.

0011 1100 1101 1110   As a 16–bit integer, this is positive.

1111 1100 1101 1110   As a 16–bit integer, this is negative.

One might expect an integer interpretation of the results of the bitwise Boolean operations as implemented on the System/370.  IBM must have had a good reason to implement the instructions this way; but all we say is “that is the way it was done”.

Branch Instructions

One of the encodings used to minimize the instruction size is to use the idea of a condition mask to extend two basic branch instructions into fourteen equivalent branch instructions.  This device is often called “syntactic sugar” or extended mnemonics.  There are two basic branch instructions in the IBM instruction set.

BC   MASK,TARGET     A TYPE RX INSTRUCTON

BCR  MASK,REGISTER   A TYPE RR INSTRUCTION

In the Type RX instruction, the target address is computed using the base register and displacement method, with an optional index register: D2(X2,B2).  In the Type RR instruction, the target address is found as the contents of the register.

Each of these instruction formats uses a four–bit mask, with bit numbers based on the 2–bit value of the condition code in the PSW, to determine the conditions under which the branch will be taken.  The mask should be considered as having bits numbered left to right as 0 – 3.

Bit 0 is the equal/zero bit.                   Bit 2 is the high/plus bit.

Bit 1 is the low/minus bit.                   Bit 3 is the overflow bit.

The Standard Combinations

The following table shows the standard conditional branch instructions and their translation to the BC (Branch on Condition).  The same table applies to BCR (Branch on Condition, Register), so that there is another complete set of mnemonics for that set.

 Bit Mask Flags Condition Extended instructions 0 1 2 3 Sort Arithmetic 0 0 0 0 No branch BC 0,XX NOP 0 0 0 1 Bit 3: Overflow BC 1,XX BO XX 0 0 1 0 Bit 2: High/Plus BC 2,XX BH XX BP 0 1 0 0 Bit 1: Low/Minus BC 4,XX BL XX BM 0 1 1 1 1, 2, 3: Not Equal BC 7,XX BNE XX BNZ 1 0 0 0 Bit 0: Equal/Zero BC 8,XX BE XX BZ 1 0 1 1 0, 2, 3: Not Low BC 11.XX BNL XX BNM 1 1 0 1 0, 1, 3: Not high BC 13,XX BNH XX BNP 1 1 1 1 0, 1, 2, 3: Any BC 15,XX B XX

Note the two sets of extended mnemonics: one for comparisons and an equivalent
set for the results of arithmetic operations.

These equivalent sets are provided to allow the assembler code to read more naturally.

Here is the complete listing of S/370 assembler mnemonics, showing what is called either “extended mnemonics” or “syntactic sugar” for each of the BC and BCR instructions.

 Branch on Condition (Register) Branch on Condition Condition Basic Extended Instruction Basic Extended Instruction Sort Arithmetic Sort Arithmetic No branch BCR 0,R NOPR BC 0,X NOP Bit 3: Overflow BCR 1,R BOR R BC 1,X BO X Bit 2: High/Plus BCR 2,R BHR R BPR R BC 2,X BH X BP X Bit 1: Low/Minus BCR 4,R BLR R BMR R BC 4,X BL X BM X 1, 2, 3: Not Equal BCR 7,R BNER R BNZR R BC 7,X BNE X BNZ X Bit 0: Equal/Zero BCR 8,R BER R BZR R BC 8,X BE X BZ X 0, 2, 3: Not Low BCR 11.R BNLR R BNMR R BC 11.X BNL X BNM X 0, 1, 3: Not high BCR 13,R BNHR R BNPR R BC 13,X BNH X BNP X 0,1, 2 No overflow BCR 14,R BNOR R BC 14,X BNO X 0, 1, 2, 3: Any BCR 15,R BR R BC 15,X B X

The BC Instruction

The BC instruction is a type RX instruction with opcode X‘47’.

The source code format for the instruction is BC M,Address or BC M,D2(X2,B2).

The object code format for this instruction is as follows.

 Type Bytes Operands 1 2 3 4 RX 4 R1,D2(X2,B2) 47 M1 X2 B2 D2 D2D2

The first byte contains the 8–bit instruction code, which is X‘47’.

The second byte contains two 4–bit fields.

The first four bits contain the mask for the branch condition codes

The second four bits contain the number of the index register used in computing
the address of the jump target.  If X2 = 0, indexed addressing is not used.

The next two bytes contain the 4–bit number of the base register and the 12–bit displacement used to form the basic address of the target.  This address may be indexed.

Suppose that address TARGET is formed by offset X‘666’ using base register 8.
No index is used and the instruction is BNE TARGET, equivalent to BC 7,TARGET,
as the condition mask for “Not Equal” is the 4–bit number 0111, or decimal 7.

The object code for this is 47 70 86 66.

The BCR Instruction

The BCR instruction is a two–byte instruction of the form OP M1,R2.

 Type Bytes Operands RR 2 M1,R2 07 M1 R2

The first byte contains the 8–bit instruction code, which is X‘07’.

The second byte contains two 4–bit fields.

The first 4–bit field encodes a branch condition

The second 4–bit field encodes the number of the register containing
the target address for the branch instruction.

For example, the instruction BR R8 is the same as BCR 15,R8.  Again, source code written in this form assumes that the symbol R8 has been defined with an EQU directive.

The object code is 07 F8.   The effect of this instruction is to branch unconditionally to the address in register 8.  An instruction such as this is used to implement a subroutine return.

The two code fragments below may be considered to be equivalent.  The first fragment is:
BR R8           Branch to the address in R8.

The second fragment avoids the register as follows:
B  TARGET       Branch to the target address.

Comparing Binary Data: C, CH, and CR

There are three instructions for binary comparison with the value in a register.

Mnemonic         Description                                Type            Format

C                 Compare full–word                    RX               C R1,D2(X2,B2)

CH               Compare half–word                   RX               CH R1,D2(X2,B2)

CR               Compare register to register       RR                CR R1,R2

Each comparison sets the expected condition code.

Condition                      Condition Code            Branch Taken

Operand 1 = Operand 2             0 (Equal/Zero)                BE, BZ

Operand 1 < Operand 2             1 (Low/Minus)               BL, BM

Operand 1 > Operand 2             2 (High/Plus)                  BH, BP

Each of the C and CH instructions is a type RX instruction, with source code shown above.  The object code is of the form that is standard for a type RX instruction.

 Type Bytes Operands 1 2 3 4 RX 4 R1,D2(X2,B2) OP R1 X2 B2 D2 D2D2

The first byte contains the 8–bit instruction code.  The opcode for C is X‘59’ and that for CH is X‘49’.  The second byte contains two 4–bit fields, each of which encodes a register number.  The first hexadecimal digit specifies the register containing either the fullword or the halfword that will be compared.  The second hexadecimal digit specifies the optional register that may be used if indexed addressing is used.  If this digit is 0, then the addressing is simple base/displacement with out any addressing.

The next two bytes contain the 4–bit number of the base register and the 12–bit displacement used to form the basic address of the target.  This address may be indexed.

For the CH (Compare Halfword) instruction, the address computed will reference a halfword.  The value in this halfword will be sign extended to a 32–bit fullword before the comparison.

The CR instruction is a type RR instruction, with opcode X‘19’.

This is a two–byte instruction of the form OP R1,R2.

 Type Bytes Operands RR 2 R1,R2 OP R1 R2

The first byte contains the 8–bit instruction code.

The second byte contains two 4–bit fields, each of which encodes a register number.

Don’t forget that literal arguments can be used with either C or CH, as in this example.

C  R9,=F‘0’   COMPARE THE REGISTER TO ZERO

BH ISPOS      IT IS POSITIVE

BL ISNEG      NO, IT IS NEGATIVE.

If this line is reached, R9 contains the value 0.
More code here

B  DOMORE

ISPOS    Code for R9 > 0 is placed here

B  DOMORE

ISNEG    Code for R9 < 0 is placed here

DOMORE   Common code is placed here.

CP: Compare Packed Decimal Fields

The CP instruction is a type SS instruction with opcode X‘F9’.  The source code is of the form OP D1(L1,B1),D2(L2,B2), which provide a 4–bit number representing the length for each of the two operands.  The object code format is as follows.

 Type Bytes Operands 1 2 3 4 5 6 SS(2) 6 D1(L1,B1),D2(L2,B2) X‘F9’ L1 L2 B1 D1 D1D1 B2 D2 D2D2

The first byte contains the operation code, which is X‘FA’.

The second byte contains a two values, each a 4–bit binary number (one hex digit).

L1        A value that is one less than the length of the first operand.

L2        A value that is one less than the length of the second operand.

Bytes 3 and 4 specify the address of the first operand, using the standard base register and displacement format.  Bytes 5 and 6 specify the address of the second operand, using the
standard base register and displacement format.

Some of the rules for the CP instruction are as follows.

1.   The maximum field length for either operand is 16 bytes; thus 31 digits.

2.   Both operands must contain valid packed data.

3.   If the fields are not the same length, CP extends the shorter field (in the CPU,
but not the copy in memory) by padding it with the proper number of leftmost
zeroes before comparison.

4.   The CP instruction follows the rules of standard algebra.  Any positive number
will sort greater than any negative number, except that +0 = –0.

5.   The implicit number of decimal places in each operand must be the same, or
the comparison results will not reflect the true meaning of the data.

Example

Consider the assembly language statement below, which compares AMOUNT to TOTAL.

CP TOTAL,AMOUNT

Assume:    1.      TOTAL is 4 bytes long, so it can hold at most 7 digits.

2.      AMOUNT is 3 bytes long, so it can hold at most 5 digits.

3.      The label TOTAL is at an address specified by a displacement
of X‘50A’ from the value in register R3, used as a base register.

4.      The label AMOUNT is at an address specified by a displacement
of X‘52C’ from the value in register R3, used as a base register.

The object code looks like this:           F9 32 35 0A 35 2C

The Disassembly of the Above Example

Consider F9 32 35 0A 35 2C.  The operation code X‘F9’ is that for the
Compare Packed (CP) instruction, which is a type SS(2).  The above format applies.

The field 32 is of the form L1 L2.

The first value is X‘3’, or 3 decimal.  The first operand is 4 bytes long.

The second value is X‘2’, or 2 decimal.  The second operand is 3 bytes long.

The two–byte field 35 0A indicates that register 3 is used as the base register
for the first operand, which is at displacement X‘50A’.

The two–byte field 35 2C indicates that register 3 is used as the base register
for the second operand, which is at displacement X‘52C’.

It is quite common for both operands to use the same base register.

Character Comparison: CLC

The CLC (Compare Logical Character) instruction is one of the two used to compare
character fields, one byte at a time, left to right.  The instruction is type SS with opcode X‘D5’.  The source code is of the form CLC D1(L,B1),D2(B2), which provide a length for only operand 1.  The length is specified as an 8–bit byte.  The object code format is:

 Type Bytes Operands 1 2 3 4 5 6 SS(1) 6 D1(L,B1),D2(B2) X‘D5’ L B1 D1 D1D1 B2 D2 D2D2

The first byte contains the operation code, which is X‘D5’ for CLC.

The second byte contains a value storing one less than the length of the first operand,
which is the destination for any move.  Bytes 3 and 4 specify the address of the first operand, using the standard base register and displacement format.  Bytes 5 and 6 specify the address of the second operand, using the standard base register and displacement format.

Comparison is based on the binary contents (EBCDIC code) contents of the bytes.
The sort order is from X’00’ through X’FF’.

The instruction may be written as       CLC Operand1,Operand2

An example of the instruction is         CLC NAME1,NAME2

This instruction sets the condition code that is used by the conditional branch instructions.  The condition code is set as follows:

If Operand1 is equal Operand2                      Condition Code = 0

If Operand1 is lower than Operand2              Condition Code = 1

If Operand1 is higher than Operand2             Condition Code = 2

The operation moves, byte by byte, from left to right and terminates as soon as an unequal comparison is found or one of the operands runs out.

Example of Character Instructions

Consider the example assembly language statement, which compares the string of characters at label CONAME with those at the location associated with the label TITLE.

CLC TITLE,CONAME

Suppose that:     1.      There are fourteen bytes associated with TITLE, say that it was
declared as TITLE DS CL14.  Decimal 14 is hexadecimal E.

2.      The label TITLE is referenced by displacement X‘40A’
from the value stored in register R3, used as a base register.

3.      The label CONAME is referenced by displacement X‘42C’
from the value stored in register R3, used as a base register.

Given that the operation code for CLC is X‘D5’, the instruction assembles as

D5 0D 34 0A 34 2C   Length is 14 or X‘0E’; L – 1 is X‘0D’

Final Comments on the Comparison Operators

This chapter has covered a variety of typical comparison operators.
C, CH, and CR for integer comparison,
CP for packed decimal comparison, and
CLC for character comparison.

At this point, it would be helpful to restate the warning that it is the instruction that sets the data type for the section of memory that is being referenced.  If the data are of the wrong type for the instruction, strange results may be produced.

For example, suppose that the labels DATA1 and DATA2 have been defined somehow and that they have been initialized with some content.

The instruction C R6,DATA1 will compare the contents of register R6 with the contents of
the four bytes beginning at address DATA1, assuming that those four bytes represent a
32–bit integer in two’s–complement form.

The instruction CH R6,DATA1 will compare the contents of register R6 with the sign extended contents of the two bytes beginning at address DATA1, assuming that those two bytes represent a 16–bit integer in two’s–complement form.

The instruction CP DATA1,DATA2 will compare the contents of the two locations, assuming that each location contains data in valid packed decimal format.

The instruction CLC DATA1,DATA2 will compare the contents of the two locations, assuming that each location contains EBCDIC character data.

For data that are properly initialized (this is a bit of a trick), it might be possible that each of the four instructions will execute to completion within the same program.  All four might give results, but only one will give a correct result.