
1. Define Pipelining.

Pipelining means starting the execution of the next instruction before the current instruction has finished, using the hardware resources that are already available. It is achieved by splitting the execution of each instruction into more than one stage and allocating appropriate hardware to each stage. A pipelined organization improves both the utilization of the hardware resources and the processor throughput.

2. What is the sequence of steps in pipelining?

A typical pipelining sequence may be as follows:

• Fetch – fetch instruction from memory.

• Decode – generate the control signals for the instruction.

• Register – access any operands from the register bank.

• ALU – combine the operands to produce a result or a memory address.

• Memory – access memory for a data operand.

• Result – write the result back to the register bank.
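The benefit of overlapping these stages can be sketched with a small timing model. This is a minimal illustration, assuming an ideal pipeline with no stalls in which every stage takes one cycle; the function names are made up for the example.

```python
# Stage names taken from the list above.
STAGES = ["Fetch", "Decode", "Register", "ALU", "Memory", "Result"]

def pipeline_schedule(n_instructions, stages=STAGES):
    """Map each instruction index to the cycle (1-based) in which it occupies each stage."""
    return {
        i: {stage: i + s + 1 for s, stage in enumerate(stages)}
        for i in range(n_instructions)
    }

def total_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed to complete n instructions in an ideal, stall-free pipeline."""
    if n_instructions == 0:
        return 0
    # The first instruction fills the pipeline; each later one finishes a cycle apart.
    return n_stages + n_instructions - 1

if __name__ == "__main__":
    print(total_cycles(4))       # pipelined
    print(4 * len(STAGES))       # the same four instructions executed one at a time
```

Under these assumptions four instructions need 9 cycles instead of 24, which is the throughput gain that question 1 refers to.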

3. What hazards occur in pipelining?

Read-after-write pipeline hazard – occurs when an instruction needs an operand that is the result of the previous instruction and must wait for it.

Branch hazards – since branch instructions modify the flow of the program, they flush and refill the pipeline.
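A read-after-write dependency between neighbouring instructions can be detected mechanically. The sketch below is only an illustration: the `Instr` record and the three-instruction program are hypothetical, and only back-to-back dependencies are checked.

```python
from collections import namedtuple

# Hypothetical instruction record: text, destination register, source registers.
Instr = namedtuple("Instr", ["text", "dest", "sources"])

def raw_hazards(program):
    """Return (producer index, consumer index, register) for adjacent RAW dependencies."""
    hazards = []
    for i in range(1, len(program)):
        prev, cur = program[i - 1], program[i]
        # Hazard: the current instruction reads what the previous one writes.
        if prev.dest is not None and prev.dest in cur.sources:
            hazards.append((i - 1, i, prev.dest))
    return hazards

program = [
    Instr("ADD r1, r2, r3", "r1", ("r2", "r3")),
    Instr("SUB r4, r1, r5", "r4", ("r1", "r5")),  # reads r1 written just before
    Instr("MOV r6, r7", "r6", ("r7",)),
]

if __name__ == "__main__":
    print(raw_hazards(program))
```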

4. List the features of RISC architecture.

• Fixed 32-bit instruction size with predefined formats.

• Load-store architecture.

• Large register bank of 32–bit registers.

5. List the features of a RISC processor.

• Hardwired decode logic.

• Pipelined execution.

• Single-cycle execution.

6. Mention the advantages and drawbacks of RISC.

Advantages:

• Smaller die size.

• Shorter development time.

• Higher performance.

• Higher clock rate with single-cycle execution.

Drawbacks:

• RISCs generally have poor code density.

• RISCs don’t execute x86 code.

7. How are higher clock rates achieved in RISC?

Higher clock rates are achieved by:

• Single-cycle execution.

• High memory access rate.

8. What are the factors to be considered for low power circuit design?

• Minimize the power supply voltage, Vdd.

• Minimize the circuit activity, A.

• Minimize the number of gates.

• Minimize the clock frequency.
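The four factors all appear in the usual first-order estimate of CMOS switching power. The sketch below assumes every gate has the same activity and the same load capacitance, which is a simplification; the `c_load` default of 10 fF and the example figures are hypothetical values chosen only for illustration.

```python
def switching_power(vdd, activity, n_gates, f_clk, c_load=1e-14):
    """First-order switching power in watts: 0.5 * f * Vdd^2 * (gates * A * C)."""
    return 0.5 * f_clk * vdd ** 2 * n_gates * activity * c_load

if __name__ == "__main__":
    base = switching_power(vdd=3.3, activity=0.1, n_gates=100_000, f_clk=50e6)
    low_v = switching_power(vdd=1.65, activity=0.1, n_gates=100_000, f_clk=50e6)
    # Halving Vdd cuts the estimate to a quarter; the other factors scale linearly.
    print(low_v / base)
```

The quadratic dependence on Vdd is why the supply voltage is listed first.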

9. Mention the features of RISC which are used and rejected in ARM processors.

Features used:

1. Load-store architecture.

2. Fixed-length 32-bit instructions.

3. 3-address instruction formats.

Features rejected:

1. Register windows.

2. Delayed branches.

3. Single-cycle execution of all instructions.

10. Explain the ARM CPSR format.

CPSR (Current Program Status Register) is used to store the status bits.

Bits:  31–28   | 27–8   | 7 | 6 | 5 | 4–0

Field: N Z C V | unused | I | F | T | mode

• Mode [4:0] (the lowest 5 bits) – represents the processor operating mode.

• T – bit 5: indicates whether ARM or Thumb instructions are currently being executed.

• I, F – bits 7 and 6: the interrupt (IRQ) disable flag and the fast interrupt (FIQ) disable flag.

• N, Z, C, V – bits 31 to 28: the Negative, Zero, Carry and oVerflow flags.
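The field layout can be checked by decoding a sample value. This is a minimal sketch; the mode names in `MODES` follow the standard ARM mode encodings, which are not listed in the text above, and `decode_cpsr` is a name invented for the example.

```python
# Standard ARM mode encodings for CPSR[4:0] (not given in the text above).
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}

def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its flag, control and mode fields."""
    return {
        "N": (cpsr >> 31) & 1,
        "Z": (cpsr >> 30) & 1,
        "C": (cpsr >> 29) & 1,
        "V": (cpsr >> 28) & 1,
        "I": (cpsr >> 7) & 1,   # IRQ disable
        "F": (cpsr >> 6) & 1,   # FIQ disable
        "T": (cpsr >> 5) & 1,   # Thumb state
        "mode": MODES.get(cpsr & 0x1F, "unknown"),
    }

if __name__ == "__main__":
    print(decode_cpsr(0x600000D3))
```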

11. How are data items arranged in the memory system?

Memory may be viewed as a linear array of bytes numbered from zero up to 2^32 - 1. Data items may be 8-bit bytes, 16-bit half-words or 32-bit words. A word-sized data item must occupy a group of four byte locations starting at a byte address which is a multiple of four. Half-words occupy two byte locations starting at an even byte address.
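The alignment rules amount to a simple divisibility test on the byte address. A minimal sketch, with the helper name `is_aligned` chosen for the example:

```python
# Size in bytes of each data item described above.
SIZES = {"byte": 1, "half-word": 2, "word": 4}

def is_aligned(address, kind):
    """True if a data item of the given kind may start at this byte address."""
    if not 0 <= address < 2 ** 32:
        raise ValueError("address outside the 32-bit byte address space")
    return address % SIZES[kind] == 0

if __name__ == "__main__":
    for addr in (0x1000, 0x1002, 0x1003):
        print(hex(addr), {kind: is_aligned(addr, kind) for kind in SIZES})
```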

12. Define Load-store architecture.

This means that the instruction set will only process (add, subtract, and so on) values which are in registers (or specified directly within the instruction itself), and will always place the results of such processing into a register. The only operations which apply to memory state are ones which copy memory values into registers (Load instructions) or copy register values into memory (store instructions).

13. List the types of ARM instructions.

All ARM instructions fall into one of the following three categories:

1. Data processing instructions.

2. Data transfer instructions.

3. Control flow instructions.

14. Define supervisor mode.

The ARM processor supports a protected supervisor mode. The protection mechanism ensures that user code cannot gain supervisor privileges without appropriate checks being carried out to ensure that the code is not attempting illegal operations. The functions reserved for supervisor mode generally include any accesses to hardware peripheral registers, and widely used operations such as character input and output.

15. List the features of ARM instruction set.

The most notable features of the ARM instruction set are:

• The load-store architecture.

• 3-address data processing instructions.

• Conditional execution of every instruction.

• Load and store multiple register instructions.

• The ability to perform a general shift operation and a general ALU operation in a single instruction that executes in a single clock cycle.

• Open instruction set extension through the coprocessor instruction set

• Highly dense 16-bit compressed representation of the instruction set in the Thumb architecture.

16. How are I/O systems handled in ARM?

The ARM handles I/O (input/output) peripherals (such as disk controllers, network interfaces, and so on) as memory-mapped devices with interrupt support.

The internal registers in these devices appear as addressable locations within the ARM's memory map and may be read and written using the same (load-store) instructions as any other memory locations.

17. Mention the development tools available for ARM.

• ARM C compiler.

• ARM assembler.

• The linker.

• ARM symbolic debugger.

• ARMulator.

18. Define ARMulator. Give its various levels of accuracy.

The ARMulator (ARM emulator) is a suite of programs that models the behavior of various ARM processor cores in software on a host system. It can operate at various levels of accuracy:

Instruction-accurate modeling gives the exact behavior of the system state without regard to the precise timing characteristics of the processor.

Cycle-accurate modeling gives the exact behavior of the processor on a cycle-by-cycle basis, allowing the exact number of clock cycles that a program requires to be established.

Timing-accurate modeling presents signals at the correct time within a cycle, allowing logic delays to be accounted for.

19. Give the steps in exception handling.

The current state is saved by copying the PC into r14_exc and the CPSR into SPSR_exc (where exc stands for the exception type). The processor operating mode is changed to the appropriate exception mode. The PC is forced to a value between 0x00 and 0x1C (hexadecimal), the particular value depending on the type of exception.
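These steps can be modelled on a small state dictionary. The sketch below is a simplification: the vector addresses and mode names are the standard ARM ones (they are not spelled out in the text above), the return-address adjustment that real hardware applies to r14 is ignored, and `enter_exception` is a name chosen for the example.

```python
# Exception type -> (exception mode, vector address). Standard ARM values,
# not listed in the text above.
EXCEPTIONS = {
    "reset": ("svc", 0x00),
    "undefined": ("und", 0x04),
    "swi": ("svc", 0x08),
    "prefetch_abort": ("abt", 0x0C),
    "data_abort": ("abt", 0x10),
    "irq": ("irq", 0x18),
    "fiq": ("fiq", 0x1C),
}

def enter_exception(state, kind):
    """Return the new processor state after taking an exception of the given kind."""
    mode, vector = EXCEPTIONS[kind]
    new = dict(state)
    new["r14_" + mode] = state["pc"]     # save the PC in the banked link register
    new["spsr_" + mode] = state["cpsr"]  # save the CPSR in the banked SPSR
    new["mode"] = mode                   # switch to the exception mode
    new["pc"] = vector                   # force the PC to the vector address
    return new

if __name__ == "__main__":
    before = {"pc": 0x8000, "cpsr": 0x10, "mode": "usr"}
    print(enter_exception(before, "irq"))
```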

20. Define Jump-start tools.

The Jumpstart tools from VLSI Technology, Inc., include the same basic set of development tools but present a full X-windows interface on a suitable workstation rather than the command-line interface of the standard ARM toolkit. There are many other suppliers of tools that support ARM development.

21. What are the principal components in 3-stage pipelining?

The principal components in 3-stage pipelining are:

• The register bank.

• The barrel shifter.

• The ALU.

• The address register and incrementer.

• The data registers.

22. What factors are considered when viewing breaks in the ARM pipeline?

The simplest way to view breaks in the ARM pipeline is to observe that:

• All instructions occupy the data-path for one or more adjacent cycles.

• For each cycle that an instruction occupies the data-path, it occupies the decode logic in the immediately preceding cycle.

• During the first data-path cycle each instruction issues a fetch for the next instruction but one.

• Branch instructions flush and refill the instruction pipeline.

23. How can the performance of a processor be increased?

There are only two ways to increase performance:

• Increase the clock rate, fclk. This requires the logic in each pipeline stage to be simplified and, therefore, the number of pipeline stages to be increased.

• Reduce the average number of clock cycles per instruction, CPI. This requires either that instructions which occupy more than one pipeline slot in a 3-stage pipeline ARM are re-implemented to occupy fewer slots, or that pipeline stalls caused by dependencies between instructions are reduced, or a combination of both.
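Both options follow from the usual expression for program execution time, T = N_inst × CPI / fclk. The sketch below evaluates it for made-up figures; the instruction count, CPI values and clock rates are hypothetical.

```python
def execution_time(n_instructions, cpi, f_clk_hz):
    """Program execution time in seconds: instructions * cycles per instruction / clock rate."""
    return n_instructions * cpi / f_clk_hz

if __name__ == "__main__":
    base = execution_time(1_000_000, cpi=2.0, f_clk_hz=50e6)
    faster_clock = execution_time(1_000_000, cpi=2.0, f_clk_hz=100e6)
    lower_cpi = execution_time(1_000_000, cpi=1.5, f_clk_hz=50e6)
    print(base, faster_clock, lower_cpi)
```

For a fixed instruction count, raising fclk and lowering CPI are the only two terms left to change.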

24. What is meant by Von-Neumann bottleneck?

• Any stored-program computer with a single instruction and data memory will have its performance limited by the available memory bandwidth.

• The fundamental problem with reducing the CPI relative to a 3-stage core is related to the von Neumann bottleneck.

• To get a significantly better CPI the memory system must deliver more than one value in each clock cycle.

25. What is Data forwarding?

The only way to resolve data dependencies without stalling the pipeline is to introduce forwarding paths.

Forwarding paths allow results to be passed between stages as soon as they are available, and the 5-stage ARM pipeline requires each of the three source operands to be forwarded from any of three intermediate result registers.

26. How is the clocking scheme implemented in ARM?

The design is based around 2-phase non-overlapping clocks, which are generated internally from a single input clock signal. This scheme allows the use of level-sensitive transparent latches. Data movement is controlled by passing the data alternately through latches which are open during phase 1 and latches which are open during phase 2. The non-overlapping property of the phase 1 and phase 2 clocks ensures that there are no race conditions in the circuit.

27. Mention the delays to be considered for minimum data path cycle time.

The minimum data-path cycle time depends on:

• The register read time;

• The shifter delay;

• The ALU delay;

• The register write set-up time;

• The phase 2 to phase 1 non-overlap time.

28. Explain the types of multiplier used in ARM processors.

Two styles of multiplier have been used:

• Older ARM cores include low-cost multiplication hardware that supports only the 32-bit result multiply and multiply-accumulate instructions.

• Recent ARM cores have high-performance multiplication hardware and support the 64-bit result multiply and multiply-accumulate instructions.
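The difference between a 32-bit result and a 64-bit result multiply can be shown with plain integer arithmetic. This is a behavioural sketch only, not a model of the multiplier hardware; the function names are invented, loosely following the ARM `MUL` and `UMULL` mnemonics.

```python
MASK32 = 0xFFFFFFFF

def mul32(a, b):
    """32-bit result multiply: only the low 32 bits of the product are kept."""
    return (a * b) & MASK32

def umull64(a, b):
    """Unsigned 64-bit result multiply: return (high word, low word)."""
    product = (a & MASK32) * (b & MASK32)
    return (product >> 32) & MASK32, product & MASK32

if __name__ == "__main__":
    a = b = 0x10000
    print(hex(mul32(a, b)))                 # the high half of the product is lost
    print([hex(w) for w in umull64(a, b)])  # both halves are preserved
```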

29. Mention the structural components of ARM control logic.

The control logic on the simpler ARM cores has three structural components:

1. An instruction decoder PLA (programmable logic array).

2. Distributed secondary control - to select other instruction bits and/or processor state information to control the datapath.

3. Decentralized control units for specific instructions that take a variable number of cycles to complete (load and store multiple, multiply and coprocessor operations).

30. Define the mechanisms used to implement processor core.

There are two principal mechanisms used to implement an ARM processor core (or any other core, for that matter) on a particular process:

• A hard macrocell is delivered as physical layout ready to be incorporated into the final design;

• A soft macrocell is delivered as a synthesizable design expressed in a hardware description language such as VHDL.

31. List the features of co-processor architecture.

The coprocessor’s most important features are:

• Support for up to 16 logical coprocessors.

• Each coprocessor can have up to 16 private registers of any reasonable size; they are not limited to 32 bits.

• Coprocessors use load-store architecture, with instructions to perform internal operations on registers, instructions to load and save registers from and to memory, and instructions to move data to or from an ARM register.

32. What is a co-processor?

• The ARM architecture supports a general mechanism for extending the instruction set through the addition of coprocessors. The most common use of a coprocessor is the system coprocessor used to control on-chip functions such as the cache and memory management unit on the ARM720.

• A floating-point ARM coprocessor has also been developed, and application-specific coprocessors are a possibility.

• Coprocessor data operations are completely internal to the coprocessor and cause a state change in the coprocessor registers.

• ARM coprocessors have their own private register sets and their state is controlled by instructions that mirror the instructions that control ARM registers.

33. Define the handshaking signals used for coprocessor interfacing.

The handshake uses three signals:

1. cpi (from ARM to all coprocessors).

This signal, which stands for 'Coprocessor Instruction', indicates that the ARM has identified a coprocessor instruction and wishes to execute it.

2. cpa (from the coprocessors to ARM).

This is the 'Coprocessor Absent' signal which tells the ARM that there is no coprocessor present that is able to execute the current instruction.

3. cpb (from the coprocessors to ARM).

This is the 'Coprocessor Busy' signal which tells the ARM that the coprocessor cannot begin executing the instruction yet.
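How the ARM might react to the three signals can be written as a small decision function. The sketch treats each signal as a logical condition (True means asserted) and ignores electrical polarity; the undefined instruction trap for an absent coprocessor is standard ARM behaviour rather than something stated above, and the returned strings are labels chosen for the example.

```python
def coprocessor_response(cpi, cpa, cpb):
    """Outcome of the handshake, given logical (not electrical) signal values."""
    if not cpi:
        return "no coprocessor instruction"
    if cpa:
        # No coprocessor claims the instruction.
        return "undefined instruction trap"
    if cpb:
        # A coprocessor accepted the instruction but cannot start it yet.
        return "busy-wait"
    return "execute"

if __name__ == "__main__":
    for signals in [(True, True, True), (True, False, True), (True, False, False)]:
        print(signals, "->", coprocessor_response(*signals))
```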

34. What are the signals used in ARM bus transactions?

The memory bus interface signals include the following:

• A 32-bit address bus, A [31:0], which gives the byte address of the data to be accessed.

• A 32-bit bidirectional data bus, D [31:0], along which the data is transferred.

• Signals that specify whether the memory is needed (mreq) and whether the address is sequential (seq); these are issued in the previous cycle so that the memory control logic can prepare appropriately.

• Signals that specify the direction (r/w) and size (b/w on earlier processors; mas[1:0] on later processors) of the transfer.

35. List the functions of control logic in memory interfacing.

The control logic performs the following functions:

• It decides when to activate the RAM and when to activate the ROM.

• It controls the byte write enables during a write operation.

• It ensures that the data is ready before the processor continues.
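The first two functions can be sketched in a few lines. Everything specific here is an assumption made for the example: the memory map (ROM in the lower half of the address space, RAM in the upper half, selected by address bit 31), the little-endian byte lanes, and the function names. Wait-state generation, the third function, is not modelled.

```python
def select_device(address):
    """Hypothetical address decode: bit 31 set selects RAM, clear selects ROM."""
    return "RAM" if address & 0x80000000 else "ROM"

def byte_write_enables(address, size):
    """4-bit mask in which bit n enables byte lane n (little-endian assumption)."""
    lane = address & 0b11
    if size == 4:
        return 0b1111                   # word: all four lanes
    if size == 2:
        return 0b0011 << (lane & 0b10)  # half-word: lower or upper pair
    if size == 1:
        return 0b0001 << lane           # byte: a single lane
    raise ValueError("size must be 1, 2 or 4 bytes")

if __name__ == "__main__":
    print(select_device(0x00000100), select_device(0x80000100))
    print(bin(byte_write_enables(0x80000002, 2)))
```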

36. Define the buses specified in AMBA.

Three buses are defined within the AMBA specification:

• The Advanced High-performance Bus (AHB) is used to connect high-performance system modules. It supports burst mode data transfers and split transactions, and all timing is referenced to a single clock edge.

• The Advanced System Bus (ASB) is used to connect high-performance system modules. It supports burst mode data transfers.

• The Advanced Peripheral Bus (APB) offers a simpler interface for low-performance peripherals.

37. Define AMBA.

• ARM processor cores have bus interfaces that are optimized for high-speed cache interfacing. When a core is one component of a larger chip, additional interfacing is required to allow the ARM to communicate with other on-chip macrocells.

• ARM Limited specified the Advanced Microcontroller Bus Architecture, AMBA, to standardize the on-chip connection of different macrocells.

38. What are the signals used by the bus master?

The bus master that holds the grant then proceeds with the bus transaction using the following signals:

• Bus transaction, BTRAN[1:0], indicates whether the next bus cycle will be address-only, sequential or non-sequential.

• The address bus, BA[31:0].

• Bus transfer direction, BWRITE.

• Bus protection signals, BPROT[1:0], which indicate instruction or data fetches and supervisor or user access.

• The transfer size, BSIZE[1:0], specifies a byte, half-word or word transfer.

• Bus lock, BLOK, allows a master to retain the bus to complete an atomic read-modify-write transaction.

• The data bus, BD[31:0], used to transmit write data and to receive read data.

39. What are the signals used by the bus slave unit?

A slave unit may process the requested transaction immediately, accepting write data or issuing read data on BD[31:0], or signal one of the following responses:

• Bus wait, BWAIT, allows a slave module to insert wait states when it cannot complete the transaction in the current cycle.

• Bus last, BLAST, allows a slave to terminate a sequential burst to force the bus master to issue a new bus transaction request to continue.

• Bus error, BERROR, indicates a transaction that cannot be completed. If the master is a processor it should abort the transfer.

40. Why is a bus reset signal required?

• It takes some time for a clock oscillator to stabilize after power-up, so there may be no reliable clock available to sequence all the modules into a known state.

• In any case, if two or more modules power-up trying to drive bus lines in opposite directions, the output drive clashes may cause power supply crow-bar problems which may prevent the chip from powering up properly at all.

• Correct ASB power-up is ensured by imposing an asynchronous reset mode that forces all drivers off the bus independently of the clock.

41. Define APB.

• The ASB offers a relatively high-performance on-chip interconnect which suits processor, memory and peripheral macrocells with some built-in interface sophistication.

• For very simple, low-performance peripherals, the overhead of the interface is too high. The Advanced Peripheral Bus is a simple, static bus which operates as a stub on an ASB to offer a minimalist interface to very simple peripheral macrocells.

42. Why has the AHB replaced the ASB in high-performance systems?

The following features differentiate the AHB from the ASB:

• It supports split transactions, where a slave with long response latency can free up the bus for other transfers while it prepares its data for transmission.

• It uses a single clock edge to control all of its operations, aiding synthesis and design verification (through the use of static timing analysis and similar tools).

• It uses a centrally multiplexed bus scheme rather than a bidirectional bus with tri-state drivers.

43. Mention the protocol followed to initiate a bus transfer.

The ASB specifies the protocol which must be followed:

• The master, x, issues a request (AREQx) to the central arbiter.

• When the bus is available, the arbiter issues a grant (AGNTx) to the master.
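The request/grant exchange can be sketched as a function from the set of AREQx signals to the set of AGNTx signals. The arbitration policy is an assumption: the text does not say how the arbiter chooses between simultaneous requests, so this example simply grants the requesting master whose name sorts first.

```python
def arbitrate(requests, bus_free=True):
    """Map master -> AREQx (bool) to master -> AGNTx (bool); at most one grant is issued."""
    grants = {master: False for master in requests}
    if bus_free:
        # Assumed fixed-priority policy: the first requesting master in sorted order wins.
        for master in sorted(requests):
            if requests[master]:
                grants[master] = True
                break
    return grants

if __name__ == "__main__":
    print(arbitrate({"a": False, "b": True, "c": True}))
    print(arbitrate({"a": True}, bus_free=False))
```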

44. List the components of ARMulator.

It is made up of four components:

• The processor core model, which can emulate any current ARM core.

• A memory interface which allows the characteristics of the target memory system to be modelled.

• A coprocessor interface that supports custom coprocessor models.

• An operating system interface that allows individual system calls to be handled by the host or emulated on the ARM model.

45. Mention the test signals used by JTAG.

The interface works with five dedicated signals which must be provided on each chip that supports the test standard:

• TRST is a test reset input which initializes the test interface.

• TCK is the test clock which controls the timing of the test interface independently from any system clocks.

• TMS is the test mode select which controls the operation of the test interface state machine.

• TDI is the test data input line which supplies the data to the boundary scan or instruction registers.

• TDO is the test data output line which carries the sampled values from the boundary scan chain and propagates data to the next chip in the serial test circuit.

46. Define TAP controller.

• The operation of the test interface is controlled by the Test Access Port (TAP) controller. This is a state machine whose state transitions are controlled by TMS.

• All the states have two exits so the transitions can be controlled by one signal, TMS.

• The two main paths in the state transition diagram control the operation of a data register (DR) and the instruction register (IR).
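Because every state has exactly two exits, the whole controller fits in a table indexed by TMS. The sketch below uses the state names and transitions of the standard IEEE 1149.1 TAP state diagram, which the text above describes but does not list; `run_tap` is a helper name chosen for the example.

```python
# state -> (next state when TMS = 0, next state when TMS = 1)
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),
}

def run_tap(state, tms_bits):
    """Advance the TAP controller by one TCK cycle per TMS bit and return the final state."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state

if __name__ == "__main__":
    print(run_tap("Test-Logic-Reset", [0, 1, 0, 0]))  # path into the DR shift state
    print(run_tap("Run-Test/Idle", [1, 1, 0, 0]))     # path into the IR shift state
```

Holding TMS high for five clock cycles returns the controller to Test-Logic-Reset from any state, which the table makes easy to verify.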

47. What are the public instructions supported by compliant devices?

The minimum set of public instructions that all compliant devices must support is:

• BYPASS: here the device connects TDI to TDO through a single clock delay.

• EXTEST: here the boundary scan register is connected between TDI and TDO and the pin states are captured and controlled by the register.

• IDCODE: here the ID register is connected between TDI and TDO. In the Capture DR state the device ID is copied into the register, which is then shifted out in the Shift DR state.

• INTEST: here the boundary scan register is connected between TDI and TDO and the core logic input and output states are captured and controlled by the register.

48. What is an Embedded-ICE?

• The Embedded-ICE module consists of two watch-point registers and control and status registers. The watch-point registers can halt the ARM core when the address, data and control signals match the value programmed into the watch-point register.

• Since the comparison is performed under a mask, either watch-point register can be configured to operate as a breakpoint register capable of halting the processor when an instruction in either ROM or RAM is executed.

• The breakpoint and watch-point registers which are accessed as additional data registers using special JTAG instructions and a trace
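A comparison "under a mask" can be sketched as follows. The polarity is an assumption made for the example (a mask bit of 1 means that bit is ignored), and `watchpoint_hit` is a hypothetical helper rather than a description of the real register interface.

```python
def watchpoint_hit(value, pattern, mask):
    """True when value equals pattern on every bit the mask does not exclude.

    Assumption: a 1 in the mask marks a bit that is ignored in the comparison.
    """
    return ((value ^ pattern) & ~mask & 0xFFFFFFFF) == 0

if __name__ == "__main__":
    # Watch a 16-byte region starting at 0x8000 by ignoring the low four address bits.
    print(watchpoint_hit(0x8004, 0x8000, 0x0000000F))
    print(watchpoint_hit(0x8014, 0x8000, 0x0000000F))
```

With an all-zero mask the comparison is exact, which corresponds to halting on a single instruction address in the breakpoint case.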
