1 Introduction
2.3 Distributed Computing in Embedded Systems
2.3.3 Real-Time Distributed Computing
2.3.3.1 Real-Time Java for Embedded Network Applications
In embedded environments, already tight constraints on memory, CPU and battery resources are further complicated when real-time requirements are considered. Java would not normally be the first choice for programming such systems, but it has many features that make it very attractive to ENS designers in the mobile computing field, including:
• Heterogeneity, with a JVM for platform independence
• Security features and API
• Dynamic memory management with garbage collectors
• Simple, object-oriented language
• Suited to networkability
• Suited to parallel operations
When choosing between C, C++ and Java as the programming language, many benchmarks are available that compare compilers and interpreters, with results varying between benchmarks [38, 39, 40], but C and C++ are typically around an order of magnitude faster than Java. In wireless systems, however, the communication scheme may be the slowest segment of the whole system, which could make the choice of programming language less significant. Many efforts are still being made to improve Java’s deterministic behaviour, including the use of just-in-time (JIT) compilers, the introduction of a real-time specification for Java (RTSJ), and various hardware implementations of the JVM.
Chapter 2.Distributed Computing Systems
2.3.3.1.1 Just-in-Time Compilation
Just-in-time compilers convert bytecode to machine code at run time and are considered advantageous in many scenarios because the code remains portable and flexible while performance is greatly improved. Combined with dynamic class loading, this leads to a constantly changing Java environment. The two major drawbacks are 1) a start-up delay, which can be significantly higher, and 2) an unpredictable memory model. An embedded JIT compilation method on a YARI processor using 1 MB of RAM is discussed in [41], which also comments on the large Java classes required; these must be loaded either statically (precompiled) or dynamically (compiled at run time). An even smaller virtual machine written in C, KVM [42], uses only 80 kB, much like an embedded OS, but it is not open source and has limited functionality.
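The start-up cost of JIT compilation can be observed with a small timing experiment. The following is a minimal sketch, not taken from [41] or [42]: the class name `JitWarmupDemo`, the `checksum` workload and the batch sizes are all hypothetical choices made for illustration. On a JIT-enabled JVM the first batches are usually slower than later ones, although the exact figures depend on the JVM and the platform.

```java
import java.util.Arrays;

/** Sketch: time the same method over repeated batches to expose JIT warm-up. */
final class JitWarmupDemo {

    // Written to a volatile field so the timed loop cannot be optimised away.
    static volatile long sink;

    // Hypothetical workload: a small loop that a JIT compiler is likely to compile once hot.
    static long checksum(int[] data) {
        long sum = 0;
        for (int v : data) {
            sum = sum * 31 + v;
        }
        return sum;
    }

    // Runs 'batches' batches of 'callsPerBatch' calls and returns the elapsed nanoseconds per batch.
    static long[] timeBatches(int batches, int callsPerBatch, int[] data) {
        long[] elapsed = new long[batches];
        for (int b = 0; b < batches; b++) {
            long acc = 0;
            long start = System.nanoTime();
            for (int i = 0; i < callsPerBatch; i++) {
                acc += checksum(data);
            }
            elapsed[b] = System.nanoTime() - start;
            sink = acc;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int[] data = new int[1024];
        Arrays.setAll(data, i -> i);
        long[] ns = timeBatches(10, 2_000, data);
        for (int b = 0; b < ns.length; b++) {
            System.out.println("batch " + b + ": " + (ns[b] / 1_000) + " us");
        }
    }
}
```

The spread between the first and the last batch gives a rough indication of how much execution time varies while the compiler is still active, which is the behaviour a real-time design has to bound.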
2.3.3.1.2 Real Time Specification for Java
As Java is such a popular language, the Real-Time for Java Expert Group (RTJEG) produced a real-time specification for Java (RTSJ) to address timing constraints in Java [43]. The main areas enhanced by this specification are:
1. Thread Scheduling and Dispatching
2. Memory Management
3. Synchronisation and Resource Sharing
4. Asynchronous Event Handling (similar to hardware interrupts)
5. Asynchronous Transfer of Control (time-bounding context switching for complex programs)
6. Asynchronous Real-time Thread Termination (an ‘oracle’ to manage real-time thread activity)
7. Physical Memory Access (direct access to raw or physical memory areas)
These additions allow programmers to use real-time multi-threaded operation and a new long- and short-term memory management scheme called scoped memory. There is also a handler to perform context switching between tasks to guarantee real-time slot completion.
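The idea of a periodic real-time thread with overrun detection can be illustrated without an RTSJ implementation. The RTSJ provides this through `javax.realtime` classes such as `RealtimeThread` and `PeriodicParameters`, which a standard JVM does not ship. The sketch below is therefore only an approximation in plain Java: the class `PeriodicTaskSketch` and the method `runPeriodic` are hypothetical names, and the code gives no real-time guarantee because scheduling and garbage collection remain under the control of the standard JVM.

```java
import java.util.concurrent.locks.LockSupport;

/** Sketch: periodic release of a task with deadline-miss counting (plain Java, not RTSJ). */
final class PeriodicTaskSketch {

    /**
     * Runs 'work' once per period for the given number of releases.
     * The deadline of each release is assumed to be the end of its period.
     * Returns how many releases finished after their deadline.
     */
    static int runPeriodic(Runnable work, long periodNanos, int releases) {
        int missed = 0;
        long deadline = System.nanoTime() + periodNanos;
        for (int i = 0; i < releases; i++) {
            work.run();
            long now = System.nanoTime();
            if (now - deadline > 0) {
                // Overrun: count it and restart the period from the current time.
                missed++;
                deadline = now + periodNanos;
            } else {
                // Finished early: wait for the end of the period (parkNanos may return early).
                long remaining;
                while ((remaining = deadline - System.nanoTime()) > 0) {
                    LockSupport.parkNanos(remaining);
                }
                deadline += periodNanos;
            }
        }
        return missed;
    }

    public static void main(String[] args) {
        // Hypothetical task: busy-waits for about 2 ms inside a 10 ms period.
        Runnable task = () -> {
            long end = System.nanoTime() + 2_000_000L;
            while (System.nanoTime() - end < 0) { /* simulate work */ }
        };
        int missed = runPeriodic(task, 10_000_000L, 20);
        System.out.println("missed deadlines: " + missed);
    }
}
```

Restarting the period after an overrun is one possible policy among several; an RTSJ implementation would instead let the application attach a handler for missed deadlines.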
Two versions of the ORBexpress middleware, referred to as ‘C++ ORB’ and ‘Java ORB’, were compared in [44]. Timing experiments showed Java program speeds to be very comparable to those of C++. Overall, the study concluded that Java optimisations are very necessary, but that even simple tests of the C++ and Java implementations gave very comparable results. In one example experiment, 32 kB data packets were transported through sockets, with an ORB overhead of 81 µs in C++ and 85 µs in Java.
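A baseline socket measurement of this kind can be sketched in plain Java. The example below is not the test harness of [44] and involves no ORB: it only times a 32 kB payload echoed over a loopback TCP socket, which is the raw transport cost against which a middleware overhead would be compared. The class `SocketTransferTiming` and its methods are hypothetical names.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Arrays;

/** Sketch: round-trip time of a 32 kB payload over a loopback TCP socket (no ORB involved). */
final class SocketTransferTiming {

    static final int PAYLOAD_BYTES = 32 * 1024;

    // Accepts one connection, reads 'length' bytes and writes them straight back.
    private static void echoOnce(ServerSocket server, int length) {
        try (Socket peer = server.accept()) {
            byte[] buf = new byte[length];
            new DataInputStream(peer.getInputStream()).readFully(buf);
            OutputStream out = peer.getOutputStream();
            out.write(buf);
            out.flush();
        } catch (IOException e) {
            // The client observes the failure as an EOF or a connection reset.
        }
    }

    /** Sends the payload to a local echo thread and returns the round-trip time in nanoseconds. */
    static long roundTripNanos(byte[] payload) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread echo = new Thread(() -> echoOnce(server, payload.length));
            echo.setDaemon(true);
            echo.start();
            try (Socket client = new Socket(loopback, server.getLocalPort())) {
                byte[] reply = new byte[payload.length];
                OutputStream out = client.getOutputStream();
                DataInputStream in = new DataInputStream(client.getInputStream());
                long start = System.nanoTime();
                out.write(payload);
                out.flush();
                in.readFully(reply);
                long elapsed = System.nanoTime() - start;
                if (!Arrays.equals(payload, reply)) {
                    throw new IOException("echoed data does not match the payload");
                }
                return elapsed;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = new byte[PAYLOAD_BYTES];
        for (int run = 0; run < 5; run++) {
            System.out.println("round trip " + run + ": " + (roundTripNanos(payload) / 1_000) + " us");
        }
    }
}
```

Each run includes connection-independent effects such as JIT warm-up, so several runs should be taken and the first ones discarded before comparing figures.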
The latest Java real-time specification is used in [45] for memory management and real-time applications, using ‘active’ software components that hide coding complexities and can be reused as blocks of code. That work does, however, assume that applications are scheduled offline; parameters and mechanisms for handling overruns and missed deadlines are ignored and left as future work. An example of using Java in embedded systems is also proposed in [46], where Java is used to reduce hardware communication overheads on an FPGA (tested in Simics), but it does not address any real-time issues.
2.3.3.1.3 Hardware Implementations of the JVM
The most recent hardware solutions in this research use early Java revisions to implement deterministic Java execution processors. Two open source Java processor options based on VHDL IP cores are the Java Optimized Processor (JOP) [47, 48] and the Secure Hardware Agent Platform (SHAP) [49]. JOP, shown in Figure 2-8, was designed by M. Schoeberl in 2003 and implements a JVM in hardware with predictable execution time for embedded real-time systems. It is a RISC architecture with 4 pipeline stages, which makes it possible to predict the number of clock cycles, and hence the execution time, of the Java bytecodes to be run. Results in [50] showed that JOP is the smallest hardware realization of the JVM available to date, at between 1100 and 1800 logic cells (LCs), depending on configuration. Implemented in an FPGA, JOP also has the highest known operating clock frequency of the Java processors, running at 100 MHz (limited only by the choice of target FPGA device). JOP was also compared against several embedded Java systems and, as a reference, with Java on a standard PC; the comparison found that a Java processor in hardware is up to 500 times faster than an interpreting JVM in software on a standard processor for an embedded system. JOP requires additional software for networking, garbage collection and scheduling.
[Figure 2-8: block diagram of the JOP core, with Bytecode Fetch, Fetch, Decode and Stack/Execute stages and a Memory Interface]
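The link between per-bytecode cycle counts and execution time can be made concrete with a small calculation. In the sketch below, the cycle counts in `CYCLES` are hypothetical placeholders and are not taken from the JOP documentation; only the 100 MHz clock frequency comes from the text above. The class and method names are likewise assumptions made for illustration.

```java
import java.util.Map;

/** Sketch: execution time of a bytecode sequence from fixed per-bytecode cycle counts. */
final class BytecodeTimingSketch {

    // Hypothetical cycle costs per bytecode; NOT the real JOP figures.
    static final Map<String, Integer> CYCLES = Map.of(
            "iload", 1,
            "iadd", 1,
            "istore", 1,
            "getfield", 10);

    // Sums the cycle cost of each bytecode in the sequence.
    static long totalCycles(String... bytecodes) {
        long total = 0;
        for (String bc : bytecodes) {
            Integer cost = CYCLES.get(bc);
            if (cost == null) {
                throw new IllegalArgumentException("no cycle count for " + bc);
            }
            total += cost;
        }
        return total;
    }

    // Converts a cycle count to nanoseconds for a given clock frequency in Hz.
    static double executionTimeNanos(long cycles, double clockHz) {
        return cycles * 1e9 / clockHz;
    }

    public static void main(String[] args) {
        long cycles = totalCycles("iload", "iload", "iadd", "istore");
        double nanos = executionTimeNanos(cycles, 100e6); // 100 MHz clock
        System.out.println(cycles + " cycles take " + nanos + " ns at 100 MHz");
    }
}
```

Because every bytecode has a known cost, the total is a bound that can be computed before the program runs, which is what makes execution time predictable on such a processor.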
JOP research has been extended in [51] to chip-multiprocessor (CMP) use, where 3 test-bench applications are run on between 1 and 8 JOP cores. The results show that the parallel architecture is faster for multi-threaded behaviour than other Java processors, but the disadvantages include a) complexity in software design, b) saturation if more tasks are added, and c) loss of processor speed due to processor bus arbitration.
SHAP, shown in Figure 2-9, is another VHDL IP core for running real-time Java, with some improvements over JOP. It is also a RISC architecture with 4 pipeline stages, but it adds hardware support for a scheduler, general garbage collection, real-time garbage collection and dynamic class loading, rather than implementing these in software. This functionality increases the number of logic cells required, however, and the implementation can be 50-100% more costly than JOP. To date, a CMP version of SHAP has not been published. The research aims eventually to run software Agents (1 per core), but no existing Agent software or new Agent definitions have been published.
[Figure: SHAP microarchitecture block diagram. Labelled blocks include the CPU, ALU, Stack, Method Cache, Memory Manager, Garbage Collector, DMA Ctrl, Memory, UART, Graphics Unit and Ethernet MAC]
Figure 2-9. SHAP Architecture
Other existing processors include various aJile processors, such as the aJ-100 [52], a single-chip microprocessor that directly executes JVM bytecodes and supports the J2ME CDC/CLDC stack and the RTSJ. Their next-generation processor, the aJ-102 [53], due at the end of June 2009, focuses directly on highly embedded and networked applications, adding a 10/100 Ethernet core, an encryption/decryption core and an AMBA AHB interface, as shown in Figure 2-10.
It is worth noting that the new aJ-102 architecture follows a similar approach to the one proposed in Chapter 4 of this thesis, which was first published in March 2008 [54] and further published in [55, 56]. The aJ-102 architecture differs subtly from the one in this thesis in that it does not have the LEON3 processor and has an increased ROM and cache size.
[Figure: aJ-102 block diagram. Labelled blocks include the Java Processor Execution Unit, Fixed-Point MAC, WCS (32 KB), Power Management Unit, 10/100 Ethernet Controller, AHB Controller, Unified I/D Cache (32 KB), Encryption Engine, Card Interface and Interrupt Controller]
Figure 2-10. Block Diagram of the aJ-102 [53]
Other implementations include the Imsys IM1101 Java microprocessor [57] and the Javelin Stamp module [58], both aimed at developing highly networked and embedded Java systems.
The advent of highly integrated circuits and networking applications has led to the development of Java-based processors, making application development in gaming, mobile phones and many other industries inherently simple, portable, multi-threaded and distributed. As JOP is both open source and the fastest hardware implementation of a JVM, this core could be taken forward for development.