Instruction Set Design Issues - Guide to RISC Processors

There are several design issues that influence the instruction set of a processor. We have already discussed one issue, the number of addresses used in an instruction. In this section, we present some other design issues.

Operand Types

Processor instructions typically support only the basic data types. These include characters, integers, and floating-point numbers. Because most memories are byte addressable, representing characters does not require special treatment. In a byte-addressable memory, the smallest memory unit we can address, and therefore access, is one byte. We can, however, use multiple bytes to represent larger operands. Processors provide instructions to load various operand sizes. Often, the same instruction is used to load operands of different sizes. For example, the IA-32 instruction

mov AL,address ; Loads an 8-bit value

loads the AL register with an 8-bit value from memory at address. The same instruction can also be used to load 16- and 32-bit values as shown in the following two instructions.

mov AX,address ; Loads a 16-bit value

mov EAX,address ; Loads a 32-bit value

In these instructions, the size of the operand is indirectly given by the size of the register used. AL, AX, and EAX are 8-, 16-, and 32-bit registers, respectively. In those instructions that do not use a register, we can use size specifiers. This type of specification is typical for the CISC processors.

RISC processors specify the operand size in their load and store operations. Note that only the load and store instructions move data between memory and registers. All other instructions operate on register-wide data. Below we give some examples of the MIPS load instructions:

lb Rdest,address ; Loads a byte

lh Rdest,address ; Loads a halfword (16 bits)

lw Rdest,address ; Loads a word (32 bits)

ld Rdest,address ; Loads a doubleword (64 bits)

The last instruction is available only on 64-bit processors. In general, when the size of the data moved is smaller than the destination register, it is sign-extended to the size of Rdest. There are separate instructions to handle unsigned values. For unsigned numbers, we use lbu and lhu instead of lb and lh, respectively.
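The difference between the signed and unsigned loads can be sketched in a few lines of Python. This is only an illustrative model, not MIPS code: the function names sign_extend and zero_extend and the default 32-bit register width are assumptions made for the example.

```python
def sign_extend(value: int, bits: int, width: int = 32) -> int:
    """Widen a `bits`-wide value to `width` bits by copying its sign bit (as lb, lh do)."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1                   # keep only the bits that were loaded
    extended = (value ^ sign_bit) - sign_bit   # negative when the sign bit is set
    return extended & ((1 << width) - 1)       # register contents as a bit pattern


def zero_extend(value: int, bits: int) -> int:
    """Widen a `bits`-wide value by filling the upper bits with zeros (as lbu, lhu do)."""
    return value & ((1 << bits) - 1)


byte_in_memory = 0x80                          # -128 if signed, 128 if unsigned
print(hex(sign_extend(byte_in_memory, 8)))     # models lb
print(hex(zero_extend(byte_in_memory, 8)))     # models lbu
```

The same byte in memory therefore produces two different register values, which is why the instruction set needs both forms of the load.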

Similar instructions are available for store operations. In store operations, the size is reduced to fit the target memory size. For example, storing a byte from a 32-bit register causes only the lower byte to be stored at the specified address. SPARC also uses a similar set of instructions.
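A byte store can be modeled in the same spirit. In this minimal sketch, memory is a Python bytearray and store_byte is a hypothetical helper name; only the low eight bits of the register value reach memory.

```python
def store_byte(memory: bytearray, address: int, register: int) -> None:
    """Store only the lower byte of a 32-bit register value at `address`."""
    memory[address] = register & 0xFF          # upper 24 bits are discarded


memory = bytearray(4)                          # four zeroed bytes of "memory"
store_byte(memory, 2, 0x12345678)
print(memory.hex())                            # only byte 2 has changed
```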

So far we have seen operations on operands located either in registers or in memory. In most instructions, we can also use constants. These constants are called immediate values because the constants are encoded as part of the instruction. In RISC processors, instructions excluding the load and store use registers only; any nonregister value is treated as a constant. In most assembly languages, a special notation is used to indicate registers. For example, in MIPS assembly language, the instruction

add $t0,$t0,-32 ; $t0 = $t0 - 32

subtracts 32 from the $t0 register and places the result back in the $t0 register. Notice the special notation to represent registers. But there is no special notation for constants. Some assemblers, however, use the “#” sign to indicate a constant.

Addressing Modes

Addressing mode refers to how the operands are specified. As we have seen in the last section, operands can be in one of three places: in a register, in memory, or part of the instruction as a constant. Specifying a constant as an operand is called the immediate addressing mode. Similarly, specifying an operand that is in a register is called the register addressing mode. All processors support these two addressing modes.

The difference between the RISC and CISC processors is in how they specify the operands in memory. CISC designs support a large variety of memory addressing modes. RISC designs, on the other hand, support just one or two addressing modes in their load and store instructions. Most RISC architectures support the following two memory addressing modes.

• The address of the memory operand is computed by adding the contents of a register and a constant. If this constant is zero, the contents of the register are treated as the operand address. In this mode, the memory address is computed as

Address = contents of a register + constant.

• The address of the memory operand is computed by adding the contents of two registers. If one of the register contents is zero, this addressing mode becomes the same as the one above with zero constant. In this mode, the memory address is computed as

Address = contents of register 1 + contents of register 2.
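The two address computations above can be sketched as follows. The register names, the dictionary used as a register file, and the 32-bit address width are assumptions made for illustration only.

```python
MASK32 = 0xFFFFFFFF                            # assumed 32-bit address space


def address_reg_plus_const(regs: dict, base: str, constant: int) -> int:
    """Address = contents of a register + constant."""
    return (regs[base] + constant) & MASK32


def address_reg_plus_reg(regs: dict, reg1: str, reg2: str) -> int:
    """Address = contents of register 1 + contents of register 2."""
    return (regs[reg1] + regs[reg2]) & MASK32


regs = {"r0": 0, "r1": 0x1000, "r2": 0x20}     # hypothetical register contents
print(hex(address_reg_plus_const(regs, "r1", 8)))
print(hex(address_reg_plus_reg(regs, "r1", "r2")))
```

With a zero constant, or with a register that holds zero, both functions return the contents of r1 unchanged, which matches the special cases noted in the two bullets.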

Among the RISC processors we discuss, ARM and Itanium provide slightly different addressing modes. The Itanium uses the computed address to update the register. For example, in the first addressing mode, the register is loaded with the value obtained by adding the constant to the contents of the register.
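The register update can be illustrated with a small model. The sketch assumes that the memory access uses the register's old value and that the register is updated afterwards; the text above does not spell out this ordering, and the helper name load_with_update is hypothetical.

```python
def load_with_update(memory: dict, regs: dict, base: str, constant: int) -> int:
    """Load from the address in `base`, then write address + constant back to `base`."""
    address = regs[base]                       # assumption: access uses the old value
    value = memory[address]
    regs[base] = address + constant            # the computed address updates the register
    return value


memory = {0x100: 11, 0x108: 22}                # hypothetical memory contents
regs = {"r2": 0x100}
first = load_with_update(memory, regs, "r2", 8)
second = load_with_update(memory, regs, "r2", 8)
print(first, second, hex(regs["r2"]))
```

Because the register advances on every access, a loop can walk through consecutive memory locations without a separate add instruction.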

The IA-32 provides a variety of addressing modes. The main motivation for this is the desire to support high-level language data structures. For example, one of its addressing modes can be used to access elements of a two-dimensional array.
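As an illustration, the following sketch models an address of the form base + index * scale + displacement, where the scale is limited to 1, 2, 4, or 8 as in IA-32. The mapping onto a row-major two-dimensional array (row offset in the base, column in the index, element size as the scale) and all function names are assumptions made for the example.

```python
VALID_SCALES = (1, 2, 4, 8)


def effective_address(base: int, index: int, scale: int, displacement: int) -> int:
    """Address = base + index * scale + displacement (32-bit wraparound)."""
    if scale not in VALID_SCALES:
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


def element_address(array_start: int, row: int, col: int,
                    num_cols: int, elem_size: int) -> int:
    """Address of array[row][col] for a row-major array."""
    row_offset = row * num_cols * elem_size    # computed separately, held in the base register
    return effective_address(row_offset, col, elem_size, array_start)


# A 3 x 4 array of 4-byte integers assumed to start at 0x2000.
print(hex(element_address(0x2000, row=2, col=3, num_cols=4, elem_size=4)))
```

Only the row offset has to be computed by a separate instruction; the column scaling and the addition of the array's starting address are folded into the addressing mode itself.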

Instruction Types

Instruction sets provide different types of instructions. We describe some of these instruction types here.

Data Movement Instructions

All instruction sets support data movement instructions. The type of instructions supported depends on the architecture. We can divide these instructions into two groups: those that move data between memory and registers, and those that move data between registers. Some instruction sets have special data movement instructions. For example, the IA-32 has special instructions such as push and pop to move data to and from the stack.

In RISC processors, data movement between memory and registers is restricted to load and store instructions. Some RISC processors do not provide any explicit instructions to move data between registers. This data transfer is accomplished indirectly. For example, we can use the add instruction

add Rdest,Rsrc,0 ; Rdest = Rsrc + 0

to copy the contents of Rsrc to Rdest. The IA-32 provides an explicit mov instruction to copy data. The instruction

mov dest,src

copies the contents of src to dest. The src and dest can be either registers or memory. In addition, src can be a constant. The only restriction is that src and dest cannot both be located in memory. Thus, we can use the mov instruction to transfer data between registers as well as between memory and registers.
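The operand rules just described can be written down as a small check. This is only a sketch of the rules as stated in the text; the operand-kind strings and the function name is_valid_mov are hypothetical.

```python
DEST_KINDS = {"register", "memory"}
SRC_KINDS = {"register", "memory", "constant"}


def is_valid_mov(dest: str, src: str) -> bool:
    """Return True if a `mov dest,src` with these operand kinds obeys the rules above."""
    if dest not in DEST_KINDS or src not in SRC_KINDS:
        return False
    return not (dest == "memory" and src == "memory")   # both cannot be in memory


for dest, src in [("register", "memory"), ("memory", "constant"), ("memory", "memory")]:
    print(dest, src, is_valid_mov(dest, src))
```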

Arithmetic and Logical Instructions

Arithmetic instructions support floating-point as well as integer operations. Most processors provide instructions to perform the four basic arithmetic operations: addition, subtraction, multiplication, and division. Because the 2’s complement number system is used, addition and subtraction operations do not need separate instructions for unsigned and signed integers. However, multiplication and division need separate instructions for signed and unsigned numbers.
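A short experiment with 8-bit values shows why one add instruction suffices but one multiply instruction does not. The helper names and the choice of 8-bit operands with a 16-bit product are assumptions made to keep the numbers small.

```python
def to_signed(value: int, bits: int = 8) -> int:
    """Interpret a bit pattern as a 2's complement signed number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def add8(a: int, b: int) -> int:
    """8-bit addition: the same bit-level operation serves signed and unsigned operands."""
    return (a + b) & 0xFF


def mul8_unsigned(a: int, b: int) -> int:
    """Full 16-bit product, treating the operands as unsigned."""
    return (a * b) & 0xFFFF


def mul8_signed(a: int, b: int) -> int:
    """Full 16-bit product, treating the operands as 2's complement signed."""
    return (to_signed(a) * to_signed(b)) & 0xFFFF


a, b = 0xFE, 0x03            # 254 and 3 if unsigned, -2 and 3 if signed
print(hex(add8(a, b)))       # one result pattern, valid under both interpretations
print(hex(mul8_unsigned(a, b)), hex(mul8_signed(a, b)))   # two different patterns
```

The sum has the same bit pattern whether the operands are read as 254 and 3 or as -2 and 3, whereas the two products differ in their upper byte.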

Some processors do not provide division instructions, whereas others support division only partially. What do we mean by partially? Remember that the division operation produces two outputs: a quotient and a remainder. We say that the division operation is fully supported if the division instruction produces both results. For example, the IA-32 and MIPS provide full division support. On the other hand, SPARC and PowerPC provide only the quotient.
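When only the quotient is available, the remainder can still be recovered with one multiplication and one subtraction. The sketch below assumes that the quotient is truncated toward zero, which is a modeling choice made here rather than something stated in the text; the function names are hypothetical.

```python
def quotient_only(dividend: int, divisor: int) -> int:
    """Model of a divide instruction that returns just the quotient (truncated toward zero)."""
    magnitude = abs(dividend) // abs(divisor)
    return -magnitude if (dividend < 0) != (divisor < 0) else magnitude


def remainder_from_quotient(dividend: int, divisor: int) -> int:
    """Recover the remainder as dividend - quotient * divisor."""
    quotient = quotient_only(dividend, divisor)
    return dividend - quotient * divisor


print(quotient_only(17, 5), remainder_from_quotient(17, 5))
print(quotient_only(-17, 5), remainder_from_quotient(-17, 5))
```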

Logical instructions provide the basic bitwise logical operations. Processors typically provide logical and and or operations. Other logical operations including the not and xor operations are also supported by most processors.

Most of these instructions set the condition code bits, either by default or when explicitly instructed. In the IA-32 architecture, the condition code bits are set by default. In other processors, two versions of arithmetic and logical instructions are provided. For example, in SPARC, ADD does not update the condition codes, whereas the ADDcc instruction updates the condition codes.
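The two flavors of addition can be modeled as a pair of functions. This sketch assumes 32-bit operands given as unsigned bit patterns and models four common flags (negative, zero, overflow, and carry); the Python names add and addcc merely mirror the mnemonics and are not an implementation of any real instruction.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def add(a: int, b: int) -> int:
    """Addition that leaves the condition codes untouched."""
    return (a + b) & MASK32


def addcc(a: int, b: int):
    """Addition that also reports the condition codes it would set."""
    full = a + b
    result = full & MASK32
    flags = {
        "N": bool(result & SIGN32),                        # result is negative
        "Z": result == 0,                                  # result is zero
        "V": bool(~(a ^ b) & (a ^ result) & SIGN32),       # signed overflow
        "C": full > MASK32,                                # carry out of bit 31
    }
    return result, flags


print(addcc(0x7FFFFFFF, 1))      # largest positive value plus one
print(addcc(0xFFFFFFFF, 1))      # wraps around to zero
```

A conditional branch that follows would consult these flags, which is why an instruction set that offers both versions lets the programmer avoid disturbing flags that a later branch still depends on.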

Instruction Formats

Processors use two basic types of instruction format: fixed-length or variable-length instructions. In fixed-length encoding, all (or most) instructions have the same size. In variable-length encoding, the length of the instructions varies quite a bit. Typically, RISC processors use fixed-length instructions and the CISC designs use variable-length instructions.

All 32-bit RISC architectures discussed in this book use instructions that are 32 bits wide. Some examples are the SPARC, MIPS, ARM, and PowerPC. The Intel Itanium, which is a 64-bit processor, uses fixed-length, 41-bit-wide instructions. We discuss instruction encoding schemes of these processors in Part II of the book.

The size of the instruction depends on the number of addresses and whether these addresses identify registers or memory locations. Figure 2.1 shows how the size of the instruction varies with the number of addresses when all operands are located in registers. This format assumes that eight bits are reserved for the operation code (opcode). Thus we can have 256 different instructions. Each operand address is five bits long, which means we can have 32 registers. This is the case in architectures like the MIPS. The Itanium, in contrast, uses seven bits because it has 128 registers.
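The arithmetic behind these numbers is easy to reproduce. The following sketch uses the 8-bit opcode and 32 registers from the paragraph above as defaults; the function names are made up for the example.

```python
def register_field_bits(num_registers: int) -> int:
    """Bits needed to name one of `num_registers` registers."""
    return (num_registers - 1).bit_length()


def instruction_bits(num_addresses: int, num_registers: int = 32,
                     opcode_bits: int = 8) -> int:
    """Instruction size when every operand address names a register."""
    return opcode_bits + num_addresses * register_field_bits(num_registers)


for n in range(4):                             # zero- to three-address formats
    print(n, "addresses:", instruction_bits(n), "bits")
print("field width for 128 registers:", register_field_bits(128), "bits")
```

Each additional register address costs five bits under these assumptions, so the three-address format is 15 bits longer than the zero-address format.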

As you can see from this figure, using fewer addresses reduces the length of the instruction. The size of the instruction also depends on whether the operands are in memory or in registers. As mentioned before, RISC designs keep their operands in registers. In CISC architectures, operands can be in memory. If we use 32-bit memory addresses for each of the two addresses, we would need 72 bits for each instruction (see Figure 2.9), whereas the register-based instruction requires only 18 bits. For this and other efficiency reasons, the IA-32 does not permit both addresses to be memory addresses. It allows at most one address to be a memory address.

Register format (18 bits): Opcode (8 bits), Rdest (5 bits), Rsrc (5 bits)

Memory format (72 bits): Opcode (8 bits), destination address (32 bits), source address (32 bits)

Figure 2.9 Instruction size depends on whether the operands are in registers or memory.
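The two sizes quoted for Figure 2.9 follow directly from the field widths. In this sketch the field lists are taken from the figure, while the names REGISTER_FORMAT, MEMORY_FORMAT, and total_bits are assumptions introduced for illustration.

```python
REGISTER_FORMAT = [("opcode", 8), ("Rdest", 5), ("Rsrc", 5)]
MEMORY_FORMAT = [("opcode", 8), ("destination address", 32), ("source address", 32)]


def total_bits(fields) -> int:
    """Sum the widths of all fields in an instruction format."""
    return sum(width for _, width in fields)


def whole_bytes(bits: int) -> int:
    """Smallest number of bytes that can hold `bits` bits."""
    return (bits + 7) // 8


for name, fmt in [("register", REGISTER_FORMAT), ("memory", MEMORY_FORMAT)]:
    bits = total_bits(fmt)
    print(name, "format:", bits, "bits,", whole_bytes(bits), "bytes")
```

The memory format is four times as long as the register format, which illustrates one of the efficiency reasons for not allowing two memory addresses in a single instruction.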

The instruction size in IA-32 varies from one byte to several bytes. Part of the reason for using variable-length instructions is that CISC tends to provide complex addressing modes. For example, in the IA-32 architecture, if we use register-based operands, we need just 3 bits to identify a register. On the other hand, if we use a memory-based operand, we need up to 32 bits. In addition, if we use an immediate operand, we need an additional 32 bits to encode this value into the instruction. Thus, an instruction that uses a memory address and an immediate operand needs 8 bytes just for these two components. As this description shows, providing flexibility in specifying an operand leads to dramatic variations in instruction sizes.

The opcode is typically partitioned into two fields: one identifies the major operation type, and the other defines the exact operation within that group. For example, the major operation could be a branch operation, and the exact operation could be “branch on equal.” These points become clearer as we describe the instruction formats of various processors in later chapters.
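To make the idea concrete, the following sketch splits an 8-bit opcode into a 3-bit major field and a 5-bit minor field. This layout, the group number chosen for branches, and the code chosen for "branch on equal" are all hypothetical; they do not describe the encoding of any particular processor.

```python
MAJOR_BITS, MINOR_BITS = 3, 5                  # hypothetical split of an 8-bit opcode

BRANCH_GROUP = 0b010                           # hypothetical major operation type
BRANCH_ON_EQUAL = 0b00001                      # hypothetical operation within the group


def join_opcode(major: int, minor: int) -> int:
    """Pack the major and minor fields into one opcode."""
    return (major << MINOR_BITS) | minor


def split_opcode(opcode: int):
    """Unpack an opcode into its (major, minor) fields."""
    if not 0 <= opcode < (1 << (MAJOR_BITS + MINOR_BITS)):
        raise ValueError("opcode does not fit in 8 bits")
    return opcode >> MINOR_BITS, opcode & ((1 << MINOR_BITS) - 1)


opcode = join_opcode(BRANCH_GROUP, BRANCH_ON_EQUAL)
print(bin(opcode), split_opcode(opcode))
```

A decoder organized this way can first select the functional group from the major field and only then examine the minor field to pick the exact operation.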

Summary

When designing a processor, several design choices will have to be made. These choices are dictated by the available technology as well as the requirements of the target user group. Processor designers will have to make compromises in order to come up with the best design. This chapter looked at some of the important design issues involved in such an endeavor.

Here we looked at how the processor design at the ISA level is affected by various design choices. We stated that the number of addresses in an instruction is one of the choices that can have an impact on the instruction set design. It is possible to have instruction sets with zero, one, two, or three addresses; however, most recent processors use the three-address format. The IA-32, on the other hand, uses the two-address format. The addressing mode is another characteristic that affects the instruction set. RISC designs tend to use the load/store architecture and use simple addressing modes. Often, they support just one or two addressing modes. In contrast, CISC architectures provide a wide variety of addressing modes.

Both of these choices, the number of addresses and the complexity of addressing modes, affect the instruction format. RISC architectures use fixed-length instructions and support simple addressing modes. In contrast, CISC designs use variable-length instructions to accommodate various complex addressing modes.
