The Instruction Set Architecture (ISA) captures the constraints and information about the processor that software depends on. The compiler and the assembler must generate machine code that is compatible with the ISA in order for the program to execute as intended. ISA is an umbrella term and usually includes information about the following:
• Instruction format
• Register set
• Memory model
• Addressing modes
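To make the first item concrete, the following minimal sketch packs and unpacks a fixed-width instruction word. The 16-bit layout (four 4-bit fields named opcode, rd, rs and imm) is hypothetical and chosen only for illustration; it does not correspond to any ISA discussed in this chapter.

```python
from typing import NamedTuple


class Decoded(NamedTuple):
    """Fields of a hypothetical 16-bit instruction word (illustrative only)."""
    opcode: int  # bits 15..12
    rd: int      # bits 11..8, destination register
    rs: int      # bits 7..4, source register
    imm: int     # bits 3..0, small immediate


def encode(opcode: int, rd: int, rs: int, imm: int) -> int:
    """Pack four 4-bit fields into one 16-bit word."""
    for name, value in (("opcode", opcode), ("rd", rd), ("rs", rs), ("imm", imm)):
        if not 0 <= value <= 0xF:
            raise ValueError(f"{name} does not fit in 4 bits: {value}")
    return (opcode << 12) | (rd << 8) | (rs << 4) | imm


def decode(word: int) -> Decoded:
    """Split a 16-bit word back into its four 4-bit fields."""
    return Decoded(
        opcode=(word >> 12) & 0xF,
        rd=(word >> 8) & 0xF,
        rs=(word >> 4) & 0xF,
        imm=word & 0xF,
    )
```

An assembler would play the role of encode and the decode stage of the processor the role of decode; both must agree on the field layout, which is why the instruction format is part of the ISA.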
ISAs can be classified in a number of ways. For example, Hennessy and Patterson [8] consider characteristics such as the type of internal storage of the processor (e.g. registers or stack). Register architectures fall into two classes: register-memory and load-store. Hennessy and Patterson also describe a third class, the memory-memory architecture, which keeps all operands in memory. An example of a register-memory architecture is the 8086 (CISC), an example of a load-store architecture is ARM (RISC), and an example of a memory-memory architecture is Tiny, described by Audsley and Ward [11].
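The difference between the three classes can be sketched by expressing the statement c = a + b for each of them on a toy machine. The mnemonics below (LOAD, STORE, ADD_RRR, ADD_RM, ADD_MMM) are hypothetical and are not 8086, ARM or Tiny syntax; the sketch only illustrates where operands are allowed to live.

```python
def run(program, memory):
    """Execute a toy program; return (registers, number of data memory accesses)."""
    regs = {}
    accesses = 0
    for op, *args in program:
        if op == "LOAD":        # register <- memory
            rd, addr = args
            regs[rd] = memory[addr]
            accesses += 1
        elif op == "STORE":     # memory <- register
            rs, addr = args
            memory[addr] = regs[rs]
            accesses += 1
        elif op == "ADD_RRR":   # load-store style: ALU operands are registers only
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        elif op == "ADD_RM":    # register-memory style: one operand may be in memory
            rd, addr = args
            regs[rd] = regs[rd] + memory[addr]
            accesses += 1
        elif op == "ADD_MMM":   # memory-memory style: all operands are in memory
            dst, a, b = args
            memory[dst] = memory[a] + memory[b]
            accesses += 3
        else:
            raise ValueError(f"unknown operation: {op}")
    return regs, accesses


# c = a + b expressed for each class of architecture.
LOAD_STORE = [("LOAD", "r1", "a"), ("LOAD", "r2", "b"),
              ("ADD_RRR", "r3", "r1", "r2"), ("STORE", "r3", "c")]
REGISTER_MEMORY = [("LOAD", "r1", "a"), ("ADD_RM", "r1", "b"), ("STORE", "r1", "c")]
MEMORY_MEMORY = [("ADD_MMM", "c", "a", "b")]


def demo(program):
    """Run a program with a = 2 and b = 3; return (value of c, instruction count)."""
    memory = {"a": 2, "b": 3, "c": 0}
    run(program, memory)
    return memory["c"], len(program)
```

In this toy model all three programs touch data memory three times, but the instruction count falls from four (load-store) to three (register-memory) to one (memory-memory), since more of the operand movement is folded into each instruction.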
Another common way of classifying the ISA is the level of abstraction of the instructions. Table 5.2 illustrates the common types of ISAs organised as an abstraction hierarchy.
ISA | Example
Application Specific |
High Level Language Specific | Tiny
Language Specific Virtual Machine | JBC, Forth, p-code
CPU Architecture | 8086, ARM
Table 5.2: ISAs
At the highest level of abstraction the ISA supports specific applications or algorithms. The benefit of this type of ISA is that the instruction stream can be very compact for a given application. The processor is not burdened with the overhead of interpreting general purpose instructions which implement application or algorithmic behaviour. However, a severe disadvantage is that the processor is not general enough for widespread use. A common compromise is to allow a general purpose processor to be augmented with abstract, application specific instructions. This type of processor is called an Application Specific Instruction Processor (ASIP). The new instructions can be implemented using an FPGA or microcode. An example is Xtensa [55].
The next type of ISA is High Level Language Specific. With this type of ISA the processor interprets a direct representation of the source language. In other words, there is a one-to-one mapping between the source language constructs and the processor instructions. This means it is not necessary to interpret low-level or fine-grained instructions which implement high-level programming abstractions and notations. If an instruction is semantically closer to the source program, then fewer instructions are required to represent that program. This is likely to shorten the instruction stream for a given program, which should reduce the burden on the memory system during execution. However, the disadvantage is that the processor is coupled to a single source language. This is likely to limit its commercial adoption and applicability.
It is even possible for the processor to interpret the source language directly rather than an equivalent representation of it. Machines that provide direct support for high-level languages in this way were proposed in the 1960s and are by no means novel. Chu and Cannon [4] present a taxonomy of High Level Language Systems for directly executing high-level languages. This is summarised in Table 5.3 and is briefly discussed below.
High Level Language System Type | Subtype | Description
Interactive Compilation Systems | 1(a) | Editing, compiling, executing the entire source code
Interactive Compilation Systems | 1(b) | Editing, syntax checking, compiling, executing the entire source code
Interactive Compilation Systems | 1(c) | Editing, syntax checking each line, compiling and executing the entire source code
Interactive Interpretation Systems | 2(a) | Editing, syntax checking and interpreting the entire source code
Interactive Interpretation Systems | 2(b) | Editing, syntax checking and interpreting each line of source code
Interactive Direct Execution Systems | 3 | Editing, syntax checking and executing each symbol of source code
Table 5.3: Types of High Level Language Systems, taken from [4]
Type 1(a) systems are similar to traditional compiler tool chains and processors. For example, a set of C files is compiled and linked to produce a single binary which is executed on the processor. A Type 1(b) system differs in that the source code can be syntax checked during development, and is only fully compiled when an executable is required. This may appear to be an outdated approach to development, but a modern equivalent is the syntax checking provided by an Integrated Development Environment (IDE) such as Eclipse. A Type 1(c) system interactively syntax checks each line (e.g. as with the Bash shell); however, once programming is complete, the program is compiled and executed on the processor. Type 2 systems are both unconventional and uncommon. In Type 2(a) the program is created and syntax checked as normal. However, the source text is then directly interpreted by the processor. Type 2(b) differs in that the processor allows the program source to be entered interactively, line by line. An analogy would be a Bash shell implemented in hardware. Type 3 systems process symbols. For example, these can be strings of Reverse Polish notation or Forth dictionaries [19].
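The symbol-at-a-time behaviour of a Type 3 system can be sketched in software with a small Reverse Polish notation evaluator in which each symbol is checked and executed as soon as it is read. This is only an illustration of the idea; the function names and the restriction to integers and three operators are assumptions, and the sketch is not the design described in [4].

```python
import operator

# Operators recognised by this sketch (an illustrative subset).
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def execute_symbol(stack, symbol):
    """Syntax check and execute a single symbol, updating the operand stack."""
    if symbol in OPERATORS:
        if len(stack) < 2:
            raise SyntaxError(f"operator {symbol!r} needs two operands")
        right = stack.pop()
        left = stack.pop()
        stack.append(OPERATORS[symbol](left, right))
    else:
        try:
            stack.append(int(symbol))
        except ValueError:
            raise SyntaxError(f"unknown symbol {symbol!r}") from None


def execute(source):
    """Execute whitespace-separated symbols one at a time; return the final stack."""
    stack = []
    for symbol in source.split():
        execute_symbol(stack, symbol)
    return stack
```

For example, execute("3 4 + 2 *") leaves [14] on the stack. No separate compilation or whole-program syntax check takes place: an error is reported at the symbol where it occurs.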
It can be argued that Type 1(b) systems are outdated due to the availability of processor time during the development life cycle. Type 1(c) systems are not of interest, since these imply an interactive shell development environment. Type 2 systems, whilst interesting, have had little or no adoption in the commercial world. Type 3 systems may be of interest since their use may yield some benefits for general processor design.
Continuing on from High Level Language Specific ISAs, the next level of ISA abstraction is the Language Specific Virtual Machine ISA. This type of ISA provides an abstraction layer for a virtual machine for a specific language. The virtual machine is usually software based and interprets bytecode. Bytecode is higher level than a CPU-based ISA (discussed below), but lower level than a language ISA. The virtual machine may interpret or compile the bytecode in a number of ways, as discussed by Smith and Nair [32]. The motivation for this additional layer in the system stack is portability. Examples of bytecode ISAs include Forth [19], p-code [40], and Java bytecode [24]. Ironically, there have been many attempts at implementing parts of a virtual machine in hardware in order to improve performance. The bytecode-based ISAs were developed to support porting onto CPU-based ISAs (discussed below). This is why these ISAs resemble either a stack- or register-based design. This type of ISA is a bottom-up design where instructions map onto the requirements of an abstract processor model intended for implementation in software.
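A software virtual machine of the stack-based kind can be sketched as a fetch-decode-execute loop over a flat bytecode sequence. The opcode names and numbering below are hypothetical and far simpler than Java bytecode, Forth or p-code; the sketch shows only the general shape of such an interpreter.

```python
# Hypothetical opcodes for a toy stack-based bytecode (not JBC, Forth or p-code).
PUSH, LOAD, STORE, ADD, MUL, HALT = range(6)


def interpret(code):
    """Interpret a flat bytecode sequence; return the variable store at HALT."""
    stack = []
    variables = {}
    pc = 0
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH:       # operand: constant to push
            stack.append(code[pc])
            pc += 1
        elif op == LOAD:     # operand: variable name to read
            stack.append(variables[code[pc]])
            pc += 1
        elif op == STORE:    # operand: variable name to write
            variables[code[pc]] = stack.pop()
            pc += 1
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == HALT:
            return variables
        else:
            raise ValueError(f"unknown opcode {op} at position {pc - 1}")


# Bytecode for: x = (2 + 3) * 4; y = x + 1
PROGRAM = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, STORE, "x",
           LOAD, "x", PUSH, 1, ADD, STORE, "y", HALT]
```

Because the interpreter depends only on the host language, the same bytecode runs unchanged wherever the interpreter has been ported, which is the portability argument made above.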
The most common type of ISA is the CPU Architecture ISA. These are general purpose ISAs. The processor interprets a number of instructions that together implement each source-level construct. In other words, there is a one-to-many mapping between the source language constructs and the processor instructions. This has the advantage that the processor is applicable to a wide range of source languages, making it more commercially viable. Examples include RISC and CISC ISAs. The design philosophy of these ISAs has remained largely unchanged for the past 40 years or so.
It appears that ISA design has been either too general (e.g. RISC- and CISC-based ISAs), resulting in a many-to-one mapping between processor instructions and programming constructs, or too specific (e.g. ASIPs), which may limit the reusability of a particular instruction. A possible compromise is to support a set of abstract instructions representing fundamental imperative language constructs. In turn this would provide support for the C programming language and the many languages based on it and influenced by it. The ISA could be supported by an SDLP as suggested by Audsley and Ward [11]. This would be an SDLP supporting fundamental imperative programming constructs; in other words, an SDLP for imperative languages.
The reduction in the semantic gap between the source language and the ISA for such a processor may yield a number of benefits whilst not limiting its support for most practical languages. The possible benefits may include:
• Reduce the dependency on the memory system. If the instruction stream is more compact, the processor will require fewer instruction memory interactions for a given program.
• Reduce the frequency of external memory accesses. If there is less dependency on the memory system, there should be fewer accesses to external memory.
• Reduce power consumption. According to Verma et al. [3], the memory system consumes the most power, so reducing the dependency on it should reduce overall power consumption.
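The first of these benefits can be illustrated with a toy model that counts instruction fetches for the same loop (summing the integers 1 to n) under two encodings. Everything in the sketch is an assumption made for illustration: the fine-grained mnemonics, the hypothetical construct-level FOR instruction, and in particular the premise that FOR is fetched once while the hardware sequences the iterations. The counts come from this model only and are not measurements of any real processor.

```python
def run_fine_grained(n):
    """Sum 1..n on a toy fine-grained ISA; return (sum, instruction fetches)."""
    program = [
        ("LI", "sum", 0),            # 0: sum = 0
        ("LI", "i", 1),              # 1: i = 1
        ("BGT", "i", "n", 6),        # 2: leave the loop when i > n
        ("ADD", "sum", "sum", "i"),  # 3: sum = sum + i
        ("ADDI", "i", "i", 1),       # 4: i = i + 1
        ("JMP", 2),                  # 5: back to the loop test
        ("HALT",),                   # 6
    ]
    regs = {"n": n}
    pc = fetches = 0
    while True:
        op, *args = program[pc]
        fetches += 1
        pc += 1
        if op == "LI":
            regs[args[0]] = args[1]
        elif op == "BGT":
            if regs[args[0]] > regs[args[1]]:
                pc = args[2]
        elif op == "ADD":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "ADDI":
            regs[args[0]] = regs[args[1]] + args[2]
        elif op == "JMP":
            pc = args[0]
        elif op == "HALT":
            return regs["sum"], fetches


def run_construct_level(n):
    """Sum 1..n with a hypothetical FOR instruction; return (sum, fetches)."""
    program = [
        ("LI", "sum", 0),
        ("FOR", "i", 1, "n", 1),     # loop variable, start, bound register, body length
        ("ADD", "sum", "sum", "i"),  # loop body
        ("HALT",),
    ]
    regs = {"n": n}
    pc = fetches = 0
    while True:
        op, *args = program[pc]
        fetches += 1
        pc += 1
        if op == "LI":
            regs[args[0]] = args[1]
        elif op == "FOR":
            # Assumption: FOR is fetched once; only body instructions are
            # fetched again on each iteration.
            var, start, bound, body_len = args
            body = program[pc:pc + body_len]
            for value in range(start, regs[bound] + 1):
                regs[var] = value
                for body_op, dst, lhs, rhs in body:
                    assert body_op == "ADD"  # the sketch supports ADD bodies only
                    fetches += 1
                    regs[dst] = regs[lhs] + regs[rhs]
            pc += body_len
        elif op == "HALT":
            return regs["sum"], fetches
```

Under these assumptions, n = 10 gives 44 fetches for the fine-grained program and 13 for the construct-level one, with both computing 55. The gap arises because the loop test, increment and jump are no longer separate instructions in the stream; whether a real implementation would achieve a comparable reduction depends on details outside this model.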