IA-64 Application Programming Model
4.1.2 Register Stack Instructions
The alloc instruction is used to change the size of the current register stack frame. An alloc instruction must be the first instruction in an instruction group; otherwise, the results are undefined. An alloc instruction affects the register stack frame seen by all instructions in an instruction group, including the alloc itself. An alloc cannot be predicated. An alloc does not affect the values or NaT bits of the allocated registers. When a register stack frame is expanded, newly allocated registers may have their NaT bit set.
In addition, there are three instructions that provide explicit control over the state of the register stack. These instructions are used in thread and context switching, which necessitate a corresponding switch of the backing store for the register stack. See Chapter 6 of Volume 2 for details on explicit management of the RSE.
The flushrs instruction is used to force all previous stack frames out to backing store memory. It stalls instruction execution until all active frames in the physical register stack up to, but not including, the current frame are spilled to the backing store by the RSE. A flushrs instruction must be the first instruction in an instruction group; otherwise, the results are undefined. A flushrs cannot be predicated.
The cover instruction creates a new frame of zero size (sof = sol = 0). The new frame is created above (not overlapping) the present frame. Both the local and output areas of the previous stack frame are automatically saved. A cover instruction must be the last instruction in an instruction group; otherwise, an Illegal Operation fault is taken. A cover cannot be predicated.
Figure 4-1. Register Stack Behavior on Procedure Call and Return
[Figure: stacked GRs and frame markers (CFM, PFM) at each step of instruction execution.]
Caller's frame (procA): Local A starts at GR 32, Output A spans GR 46 to GR 52; CFM sol = 14, sof = 21; PFM sol = x, sof = x.
Callee's frame (procB) after call: Output B1 spans GR 32 to GR 38; CFM sol = 0, sof = 7; PFM sol = 14, sof = 21.
Callee's frame (procB) after alloc: Local B starts at GR 32, Output B2 spans GR 48 to GR 50; CFM sol = 16, sof = 19; PFM sol = 14, sof = 21.
Caller's frame (procA) after return: Local A starts at GR 32, Output A spans GR 46 to GR 52; CFM sol = 14, sof = 21; PFM sol = 14, sof = 21.
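The frame-marker transitions in Figure 4-1 can be traced with a small model. The following Python sketch is illustrative only: the names FrameMarker, RegisterStackModel, call, alloc and ret are hypothetical, only the sol and sof fields are tracked, and a single PFM is kept, so saving and restoring AR[PFS] across nested calls, register renaming and the RSE are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FrameMarker:
    sol: int  # size of the local area of the frame
    sof: int  # size of the whole frame (local area plus output area)


class RegisterStackModel:
    """Toy model of the CFM/PFM transitions shown in Figure 4-1."""

    def __init__(self, sol: int, sof: int) -> None:
        self.cfm = FrameMarker(sol, sof)
        self.pfm: Optional[FrameMarker] = None  # "x" in the figure

    def call(self) -> None:
        # The callee's initial frame is the caller's output area:
        # no locals yet, and sof equals the number of caller outputs.
        self.pfm = self.cfm
        self.cfm = FrameMarker(sol=0, sof=self.cfm.sof - self.cfm.sol)

    def alloc(self, sol: int, sof: int) -> None:
        # alloc resizes the current frame; it does not touch PFM here.
        if not 0 <= sol <= sof:
            raise ValueError("require 0 <= sol <= sof")
        self.cfm = FrameMarker(sol, sof)

    def ret(self) -> None:
        # Return restores the caller's frame from the previous frame marker.
        if self.pfm is None:
            raise RuntimeError("no previous frame marker to restore")
        self.cfm = self.pfm


rs = RegisterStackModel(sol=14, sof=21)  # procA
rs.call()                                # procB sees Output A as its frame
after_call = rs.cfm
rs.alloc(sol=16, sof=19)                 # procB resizes its frame
after_alloc = rs.cfm
rs.ret()                                 # back in procA
after_return = rs.cfm
```

With the values from the figure, the model steps through (sol, sof) = (14, 21), (0, 7), (16, 19) and back to (14, 21).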
The loadrs instruction ensures that the specified portion of the register stack is present in the physical registers. It stalls instruction execution until the number of bytes specified in the loadrs field of the RSC application register has been filled from the backing store by the RSE (starting from the current BSP). By specifying a zero value for RSC.loadrs, loadrs can be used to indicate that all stacked registers outside the current frame must be loaded from the backing store before being used. In addition, stacked registers outside the current frame (that have not been spilled by the RSE) will not be stored to the backing store. A loadrs instruction must be the first instruction in an instruction group; otherwise, the results are undefined. A loadrs cannot be predicated.
Table 4-1 lists the architecturally visible state relating to the register stack. Table 4-2 summarizes the register stack management instructions. Call- and return-type branches, which affect the stack, are described in “Branch Instructions” on page 4-26.
4.2 Integer Computation Instructions
The integer execution units provide a set of arithmetic, logical, shift and bit-field-manipulation instructions. Additionally, they provide a set of instructions to accelerate operations on 32-bit data and pointers.
Arithmetic, logical and 32-bit acceleration instructions can be executed on both I- and M-units.
4.2.1 Arithmetic Instructions
Addition and subtraction (add, sub) are supported with regular two input forms and special three input forms. The three input addition form adds one to the sum of two input registers. The three input subtraction form subtracts one from the difference of two input registers. The three input forms share the same mnemonics as the two input forms and are specified by appending a “1” as a third source operand.
Table 4-1. Architecturally Visible State Related to the Register Stack
Register Description
AR[PFS].pfm Previous Frame Marker field
AR[RSC] Register Stack Configuration application register
AR[BSP] Backing store pointer application register
AR[BSPSTORE] Backing store pointer application register for memory stores
AR[RNAT] RSE NaT collection application register
Table 4-2. Register Stack Management Instructions
Mnemonic Operation
alloc Allocate register stack frame
flushrs Flush register stack to backing store
loadrs Load register stack from backing store
Immediate forms of addition and subtraction use a register and a 14-bit immediate. The immediate form is obtained simply by specifying an immediate rather than a register as the first operand. Also, addition can be performed between a register and a 22-bit immediate; however, the source register must be GR 0, 1, 2 or 3.
A shift left and add instruction (shladd) shifts one register operand to the left by 1 to 4 bits and adds the result to a second register operand. Table 4-3 summarizes the integer arithmetic instructions.
Note that an integer multiply instruction is defined which uses the floating-point registers. See “Integer Multiply and Add Instructions” on page 5-17 for details. Integer divide is performed in software similarly to floating-point divide.
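As a rough illustration of the two-input, three-input and shift-and-add forms described above, the following Python sketch models the results on 64-bit values. The function names and the plus_one and minus_one keyword arguments are hypothetical conveniences, not assembler syntax, and wrap-around at 64 bits is assumed.

```python
MASK64 = (1 << 64) - 1  # general registers are treated as 64-bit values


def add(a: int, b: int, plus_one: bool = False) -> int:
    """Two-input add, or the three-input form that also adds one."""
    return (a + b + (1 if plus_one else 0)) & MASK64


def sub(a: int, b: int, minus_one: bool = False) -> int:
    """Two-input subtract, or the three-input form that also subtracts one."""
    return (a - b - (1 if minus_one else 0)) & MASK64


def shladd(a: int, count: int, b: int) -> int:
    """Shift a left by 1 to 4 bits, then add b."""
    if not 1 <= count <= 4:
        raise ValueError("shift count must be between 1 and 4")
    return ((a << count) + b) & MASK64


three_input_sum = add(2, 3, plus_one=True)     # 2 + 3 + 1
three_input_diff = sub(5, 3, minus_one=True)   # 5 - 3 - 1
scaled_index = shladd(3, 2, 10)                # (3 << 2) + 10
```

A typical use of the shift-and-add form is scaling an array index by the element size before adding a base address, which is why the shift count is limited to small values.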
4.2.2 Logical Instructions
Instructions to perform logical AND (and), OR (or), and exclusive OR (xor) between two registers or between a register and an immediate are defined. The andcm instruction performs a logical AND of a register or an immediate with the complement of another register. Table 4-4 summarizes the integer logical instructions.
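The and-complement operation is the least familiar of these, so a short sketch may help. The Python below assumes 64-bit operands and uses a hypothetical function name; it shows only the register-to-register arithmetic, not the encoding of the immediate form.

```python
MASK64 = (1 << 64) - 1  # general registers are treated as 64-bit values


def andcm(a: int, b: int) -> int:
    """AND the first operand with the complement of the second."""
    return a & ~b & MASK64


# Clear the low four bits of a value by and-complementing with a mask.
cleared = andcm(0xFF, 0x0F)
```

Here andcm acts as a "clear these bits" operation: every bit set in the second operand is forced to zero in the result.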
4.2.3 32-bit Addresses and Integers
Support for IA-64 32-bit addresses is provided in the form of add instructions that perform region bit copying. This supports the virtual address translation model (see “32-bit Virtual Addressing” on page 4-22 of Volume 2 for details). The add 32-bit pointer instruction (addp) adds two registers or a register and an immediate, zeroes the most significant 32 bits of the result, and copies bits 31:30 of the second source to bits 62:61 of the result. The shladdp instruction operates similarly but shifts the first source to the left by 1 to 4 bits before performing the add, and is provided only in the two-register form.
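The region bit copying can be expressed as plain bit arithmetic. The following Python sketch follows the description above under the assumption that operands are 64-bit values; the function names mirror the mnemonics but the signatures are hypothetical.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def addp(src1: int, src2: int) -> int:
    """32-bit pointer add with region bit copying."""
    total = (src1 + src2) & MASK32   # upper 32 bits of the sum are zeroed
    region = (src2 >> 30) & 0x3      # bits 31:30 of the second source
    return total | (region << 61)    # copied to bits 62:61 of the result


def shladdp(src1: int, count: int, src2: int) -> int:
    """Shift the first source left by 1 to 4 bits, then do a pointer add."""
    if not 1 <= count <= 4:
        raise ValueError("shift count must be between 1 and 4")
    return addp((src1 << count) & MASK64, src2)


pointer = addp(0x10, 0xC0000000)           # second source has bits 31:30 = 11
indexed = shladdp(0x1, 4, 0x80000000)      # second source has bits 31:30 = 10
```

The two top bits of the 32-bit pointer thus end up selecting bits 62:61 of the 64-bit result, while the low 32 bits carry the truncated sum.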
Table 4-3. Integer Arithmetic Instructions
Mnemonic Operation
add Addition
add ...,1 Three input addition
sub Subtraction
sub ...,1 Three input subtraction
shladd Shift left and add
Table 4-4. Integer Logical Instructions
Mnemonic Operation
and Logical and
or Logical or
andcm Logical and complement
In addition, support for 32-bit integers is provided through 32-bit compare instructions and instructions to perform sign and zero extension. Compare instructions are described in “Compare Instructions and Predication” on page 4-7. The sign and zero extend (sxt, zxt) instructions take an 8-bit, 16-bit, or 32-bit value in a register, and produce a properly extended 64-bit result. Table 4-5 summarizes 32-bit pointer and 32-bit integer instructions.
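Sign and zero extension can be sketched as follows. This Python model assumes 64-bit registers and takes the source width as a hypothetical bits parameter; how the width is selected in assembly is not shown.

```python
MASK64 = (1 << 64) - 1


def zxt(value: int, bits: int) -> int:
    """Zero-extend the low 8, 16 or 32 bits of value to 64 bits."""
    if bits not in (8, 16, 32):
        raise ValueError("width must be 8, 16 or 32")
    return value & ((1 << bits) - 1)


def sxt(value: int, bits: int) -> int:
    """Sign-extend the low 8, 16 or 32 bits of value to 64 bits."""
    low = zxt(value, bits)
    sign = 1 << (bits - 1)
    # Flipping and then subtracting the sign bit propagates it upward.
    return ((low ^ sign) - sign) & MASK64


negative_byte = sxt(0x80, 8)          # top bit of the byte is set
positive_byte = sxt(0x7F, 8)          # top bit of the byte is clear
low_half = zxt(0x1234ABCD, 16)        # keeps only the low 16 bits
```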
4.2.4 Bit Field and Shift Instructions
Four classes of instructions are defined for shifting and operating on bit fields within a general register: variable shifts, fixed shift-and-mask instructions, a 128-bit-input funnel shift, and special compare operations to test an individual bit within a general register. The compare instructions for testing a single bit (tbit), or for testing the NaT bit (tnat) are described in “Compare Instructions and Predication” on page 4-7.
The variable shift instructions shift the contents of a general register by an amount specified by another general register. The shift right signed (shr) and shift right unsigned (shr.u) instructions shift the contents of a register to the right with the vacated bit positions filled with the sign bit or zeroes respectively. The shift left (shl) instruction shifts the contents of a register to the left.
The fixed shift-and-mask instructions (extr, dep) are generalized forms of fixed shifts. The extract instruction (extr) copies an arbitrary bit field from a general register to the least-significant bits of the target register. The remaining bits of the target are written with either the sign of the bit field (extr) or with zero (extr.u). The length and starting position of the field are specified by two immediates. This is essentially a shift-right-and-mask operation. A simple right shift by a fixed amount can be specified by using shr with an immediate value for the shift amount. This is just an assembly pseudo-op for an extract instruction where the field to be extracted extends all the way to the left-most register bit.
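The relationship between the shifts and extract can be checked with a small model. The Python sketch below assumes 64-bit registers, restricts shift counts to 0 through 63 and requires the extracted field to fit inside the register; those restrictions and the function names are simplifying assumptions, not statements about the hardware.

```python
MASK64 = (1 << 64) - 1


def _check(pos: int, length: int = 1) -> None:
    if not (0 <= pos <= 63 and 1 <= length <= 64 - pos):
        raise ValueError("field must lie within a 64-bit register")


def shl(value: int, count: int) -> int:
    _check(count)
    return (value << count) & MASK64


def shr_u(value: int, count: int) -> int:
    """Shift right, filling vacated positions with zeroes."""
    _check(count)
    return (value & MASK64) >> count


def extr_u(value: int, pos: int, length: int) -> int:
    """Copy the field [pos, pos+length) to the low bits, zero-filled."""
    _check(pos, length)
    return (value >> pos) & ((1 << length) - 1)


def extr(value: int, pos: int, length: int) -> int:
    """Same field, but the upper bits take the sign of the field."""
    field = extr_u(value, pos, length)
    sign = 1 << (length - 1)
    return ((field ^ sign) - sign) & MASK64


def shr(value: int, count: int) -> int:
    """Shift right signed: an extract whose field reaches the top bit."""
    return extr(value, count, 64 - count)


x = 0x8000000000000F0F
unsigned_shift = shr_u(x, 4)
signed_shift = shr(x, 4)
nibble = extr_u(0xABCD, 4, 8)
```

Defining shr in terms of extr mirrors the pseudo-op relationship in the text: a fixed right shift by n is an extract of the 64 - n bits starting at position n.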
The deposit instruction (dep) takes a field from either the least-significant bits of a general register, or from an immediate value of all zeroes or all ones, places it at an arbitrary position, and fills the result to the left and right of the field with either bits from a second general register (dep) or with zeroes (dep.z). The length and starting position of the field are specified by two immediates. This is essentially a shift-left-mask-merge operation. A simple left shift by a fixed amount can be specified by using shl with an immediate value for the shift amount. This is just an assembly pseudo-op for dep.z where the deposited field extends all the way to the left-most register bit.
The shift right pair (shrp) instruction performs a 128-bit-input funnel shift. It extracts an arbitrary 64-bit field from a 128-bit field formed by concatenating two source general registers. The starting position is specified by an immediate. This can be used to accelerate the adjustment of unaligned data. A bit rotate operation can be performed by using shrp and specifying the same register for both operands. Table 4-6 summarizes the bit field and shift instructions.
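Deposit and the funnel shift can be modeled in the same style. In this Python sketch the function names, argument order and the choice of which source forms the upper half of the 128-bit value are assumptions made for illustration; operands are treated as 64-bit values and the field must fit inside the register.

```python
MASK64 = (1 << 64) - 1


def dep(field_src: int, background: int, pos: int, length: int) -> int:
    """Place the low `length` bits of field_src at `pos`, merged into background."""
    if not (0 <= pos <= 63 and 1 <= length <= 64 - pos):
        raise ValueError("field must lie within a 64-bit register")
    mask = ((1 << length) - 1) << pos
    return ((field_src << pos) & mask) | (background & ~mask & MASK64)


def dep_z(field_src: int, pos: int, length: int) -> int:
    """Deposit into a background of zeroes."""
    return dep(field_src, 0, pos, length)


def shrp(high: int, low: int, count: int) -> int:
    """Funnel shift: take 64 bits starting at `count` from the pair high:low."""
    if not 0 <= count <= 63:
        raise ValueError("count must be between 0 and 63")
    combined = ((high & MASK64) << 64) | (low & MASK64)
    return (combined >> count) & MASK64


def rotate_right(value: int, count: int) -> int:
    """Rotate by using the same register for both shrp operands."""
    return shrp(value, value, count)


merged = dep(0xAB, MASK64, 8, 8)                     # field into all-ones background
zero_filled = dep_z(0xAB, 8, 8)                      # field into zero background
rotated = rotate_right(0x0123456789ABCDEF, 16)       # low 16 bits wrap to the top
```

A fixed left shift by n corresponds to dep_z(value, n, 64 - n), which matches the pseudo-op relationship described in the text.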
Table 4-5. 32-bit Pointer and 32-bit Integer Instructions
Mnemonic Operation
addp 32-bit pointer addition
shladdp Shift left and add 32-bit pointer
sxt Sign extend