Implicit in the goal of providing a meaningful abstraction is the goal for the HWTI to provide a target for HLL to HDL translation. As discussed in section 2.5, this is not an easy task. Whereas software can target an existing processor’s instruction set with a Von Neumann architecture behind it, hardware does not have any preexisting target. Because of this, there is no preexisting support for high-level language semantics. Pointers and a function call stack are two capabilities that are often left out. Support for pointers has largely been solved by translating
uses this method to support pointers. However, a hardware equivalent of the function call stack has not been addressed. Without a stack’s functionality, parameter passing is difficult and true recursion is impossible.
To address this problem, the HWTI creates a function call stack using its local memory. The HWTI’s function call stack works analogously to a software function call stack, with three key differences. First, the stack and frame pointers are maintained as registers within the HWTI and point to its local memory instead of traditional global memory. During a call, the HWTI pushes the frame pointer and the number of passed function parameters onto the stack. The stack and frame pointers are then incremented appropriately for the new function.
The second difference is specific to how parameters are passed and stored during the call. RISC CPUs such as the MIPS architecture [71] and the PowerPC use a register convention that reserves general purpose registers to hold parameters during a call. However, the HWTI does not maintain any general purpose registers that are shared with the user logic, which prohibits it from using this option. Instead, it uses a method similar to CISC architectures such as the x86, where all function parameters are passed by pushing their values onto the stack. The HWTI has a PUSH operation for this purpose. Once called, the callee function reads the parameters by using a POP operation.
The third difference is that instead of saving the contents of the program counter during a function call, as is done on CPUs, the HWTI pushes the return state of the user logic’s state machine onto the stack. The user logic is required to pass the return state to the HWTI, along with the function to call, during a CALL operation. To be more specific, the user logic passes a 16-bit variable representing the return state. The user logic is responsible for mapping this variable to its return state when control is returned to the caller function.
Function returns are implemented with the RETURN operation. Here, the stack register is set to the current frame register (minus the number of previously pushed parameters, which was stored on the stack), and the frame register is restored by popping its saved value from the stack. The return state and the return value, which is limited to 32 bits, are passed back to the user logic.
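The PUSH, CALL, POP, and RETURN behavior described above can be sketched as a small software model. This is only an illustrative sketch, not the HWTI implementation: the class name `HwtiStackModel`, the order in which the three bookkeeping words are saved, the upward-growing stack, and the index argument to `pop` are all assumptions made for the example.

```python
class HwtiStackModel:
    """Toy model of the call stack; memory is a list of 32-bit words."""

    FRAME_WORDS = 3  # assumed layout: parameter count, return state, saved frame pointer

    def __init__(self, size_words=64):
        self.mem = [0] * size_words
        self.sp = 0        # stack pointer register (next free word)
        self.fp = 0        # frame pointer register (base of current frame)
        self.pending = 0   # parameters pushed since the last CALL

    def push(self, value):
        """PUSH: the caller places one parameter on the stack."""
        self.mem[self.sp] = value & 0xFFFFFFFF
        self.sp += 1
        self.pending += 1

    def call(self, return_state):
        """CALL: save parameter count, 16-bit return state, and frame pointer."""
        for word in (self.pending, return_state & 0xFFFF, self.fp):
            self.mem[self.sp] = word
            self.sp += 1
        self.fp = self.sp  # the new frame starts above the saved words
        self.pending = 0

    def pop(self, index):
        """POP: the callee reads parameter `index` (0 is the first one pushed)."""
        count = self.mem[self.fp - self.FRAME_WORDS]
        return self.mem[self.fp - self.FRAME_WORDS - count + index]

    def ret(self, value):
        """RETURN: unwind the frame; give back (return state, 32-bit value)."""
        saved_fp = self.mem[self.fp - 1]
        return_state = self.mem[self.fp - 2]
        count = self.mem[self.fp - 3]
        self.sp = self.fp - self.FRAME_WORDS - count  # drop frame and parameters
        self.fp = saved_fp
        return return_state, value & 0xFFFFFFFF


# Caller pushes two parameters and calls a function, naming state 7 as its return state.
hwti = HwtiStackModel()
hwti.push(10)
hwti.push(20)
hwti.call(return_state=7)
first, second = hwti.pop(0), hwti.pop(1)   # callee reads its parameters
state, result = hwti.ret(first + second)   # callee returns their sum
```

After the return, both registers are back at their values from before the first PUSH, and the user logic receives the return state it supplied together with the return value.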
The HWTI supports calling system and library functions, as well as user-defined functions. The interface and protocol for calling any type of function are identical for the user logic. The implementation difference is that for a system or library call, the HWTI performs the method on behalf of the user logic. For a user-defined function call, the HWTI sets up the function stack for the new function and then returns control to the user logic, specifying the start state of the function.

In order to give the user logic easy access to the local memory, the HWTI supports semantics similar to HLL variable declarations. To declare local variables, the user logic uses the DECLARE operation with the number of words (4 bytes each) in memory that it wants to set aside for local variables. The HWTI reserves space on the stack by incrementing its stack pointer by the specified number of words. The user logic accesses this memory using READ and WRITE operations in conjunction with an index number that corresponds to the declared variables. The first declared variable has index 0, the second declared variable has index 1, and so on. Since the variables are declared and granted space within the HWTI’s local memory, they each have an address in global memory. The ADDRESSOF operator works by converting the index number into its equivalent memory address, taking into account the HWTI’s base address and current frame pointer.
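The index-based variable access can be illustrated with a short model. It is a sketch under stated assumptions rather than the HWTI's actual logic: the class name `LocalVarsModel`, the example base address, and the exact address formula (base address plus the word offset of frame pointer plus index, times 4 bytes) are hypothetical choices consistent with the description above.

```python
WORD_BYTES = 4  # one word is 4 bytes, as stated in the text


class LocalVarsModel:
    """Toy model of DECLARE, READ, WRITE, and ADDRESSOF on local memory."""

    def __init__(self, base_address, fp=0, size_words=64):
        self.base_address = base_address  # hypothetical global address of local memory
        self.mem = [0] * size_words
        self.fp = fp   # frame pointer: word offset of the current frame
        self.sp = fp   # stack pointer: starts at the frame base

    def declare(self, num_words):
        """DECLARE: reserve space by advancing the stack pointer."""
        self.sp += num_words

    def write(self, index, value):
        """WRITE: store a 32-bit value into declared variable `index`."""
        self.mem[self.fp + index] = value & 0xFFFFFFFF

    def read(self, index):
        """READ: load declared variable `index`."""
        return self.mem[self.fp + index]

    def addressof(self, index):
        """ADDRESSOF: convert an index into a global byte address (assumed formula)."""
        return self.base_address + (self.fp + index) * WORD_BYTES


# A frame that starts at word 5 and declares two local variables.
frame = LocalVarsModel(base_address=0x63000000, fp=5)
frame.declare(2)
frame.write(1, 99)
value = frame.read(1)
address = frame.addressof(1)   # base + (5 + 1) * 4
```

Because the address depends on the current frame pointer, the same index resolves to a different global address in each active call of a function.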
quently allows the HWTI to support recursive function calls in hardware. A hardware thread may repeatedly call the same function without incurring the cost of duplicating function logic within the FPGA fabric. The caller function’s state is saved to the HWTI’s local memory and then restored when the callee function returns. The recursive depth of a function is limited only by the availability of local memory. Two examples of recursive functions are given in the results section: quicksort (section 6.2.1) and factorial (section 6.2.2).
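To make the role of the saved return state concrete, the following sketch runs a recursive factorial as a single state machine whose only per-call storage is an explicit stack. It is an illustration, not the factorial implementation from section 6.2.2: the state names `FACT`, `AFTER_CALL`, and `DONE`, and the use of a Python list in place of the HWTI's local memory, are assumptions.

```python
FACT, AFTER_CALL, DONE = 0, 1, 2  # hypothetical user-logic states


def factorial_state_machine(n):
    """Compute n! with one copy of the function logic and an explicit stack."""
    stack = []            # stands in for the HWTI local memory
    stack.append(n)       # PUSH the parameter
    stack.append(DONE)    # CALL, naming the state to resume in afterwards
    state = FACT
    result = None
    while state != DONE:
        arg = stack[-2]   # parameter of the active call
        if state == FACT:
            if arg <= 1:
                result = 1
                state = stack.pop()       # RETURN: restore the return state
                stack.pop()               # and discard the parameter
            else:
                stack.append(arg - 1)     # PUSH the parameter for the recursive call
                stack.append(AFTER_CALL)  # CALL with the resume state
                state = FACT
        elif state == AFTER_CALL:
            result = arg * result         # combine with the callee's return value
            state = stack.pop()           # RETURN to this call's own caller
            stack.pop()
    return result


answer = factorial_state_machine(5)
```

Each recursive call adds only two words to the stack in this sketch, so the recursion depth is bounded by the stack's capacity rather than by the amount of logic.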
To help understand how the HWTI function call stack is implemented, consider the pseudocode and stack representation given in Figure 4.2. The figure depicts the state of the HWTI after calling the foo function.
Lastly, Table 4.2 lists the performance of operations associated with the HWTI support for function calls, variable declaration, and variable use.
Operation    Clock Cycles
POP          5
PUSH         1
DECLARE      1
READ         3
WRITE        1
ADDRESSOF    1
CALL         3
RETURN       7
Table 4.2. Performance of function call stack operations
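As a rough worked example, the cycle counts in Table 4.2 can be summed to estimate the stack overhead of one function call. The estimate assumes that the operations execute strictly one after another and that their costs simply add; it ignores any cycles spent in the user logic itself. The function name `call_overhead` is hypothetical.

```python
# Clock cycles per operation, copied from Table 4.2.
CYCLES = {
    "POP": 5, "PUSH": 1, "DECLARE": 1, "READ": 3,
    "WRITE": 1, "ADDRESSOF": 1, "CALL": 3, "RETURN": 7,
}


def call_overhead(num_params, declares_locals=True):
    """Estimate cycles for one call: PUSH and POP per parameter, CALL, RETURN."""
    cycles = num_params * (CYCLES["PUSH"] + CYCLES["POP"])
    cycles += CYCLES["CALL"] + CYCLES["RETURN"]
    if declares_locals:
        cycles += CYCLES["DECLARE"]  # one DECLARE covers all local variables
    return cycles


two_param_call = call_overhead(2)   # 2 * (1 + 5) + 3 + 7 + 1
```

Under these assumptions, a call that passes two parameters and declares local variables costs 23 cycles, of which 10 are spent reading the parameters with POP.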