1 INTRODUCTION

1.3 Research Contributions

The research contributions addressing the issues and challenges facing the integration of non-traditional execution platforms, the master/worker paradigm, and PDES are as follows:

• Master/worker architecture for PDES. I have developed an architecture to address the challenges and issues facing the implementation and execution of PDES codes across loosely coupled distributed computing infrastructures such as volunteer computing and desktop grid systems. This architecture delivers fully reproducible results in a metacomputing environment through the development of portable, scalable, load-balanced, fault-tolerant, idle-cycle-capturing services and protocols. I have proposed a set of fault-tolerance protocols for the master services to provide robust execution in the presence of failures. Additionally, I have developed extensions to the master/worker framework that provide an integrated and insulated execution environment for simultaneous PDES and task-parallel simulations. The mechanisms developed allow any number of PDES simulations and replications to be run concurrently with task-parallel simulations.

• Analysis of portability approaches and impact on performance. I have analyzed different approaches to portability, from web services to highly portable libraries, showing their strengths and weaknesses with regard to architecture independence and performance as they apply to a master/worker PDES architecture. I have addressed scalability concerns under a master/worker PDES architecture by developing protocols to distribute the set of services under the master portion of the paradigm, allowing dynamic allocation of storage resources as needed. An empirical study comparing a monolithic, universally portable system to a slightly less portable distributed architecture was performed, and the quantitative differences between the two approaches are presented. I have shown significant speedup by reducing excessive artificial overhead from a widely accessible web service approach and by providing simulation capability for large-scale PDES through the use of distributed master services without large sacrifices in portability.

• Performance evaluation of a master/worker PDES system. Utilizing both synthetic workloads and real-world applications, I have performed various empirical studies on master/worker PDES systems. I have characterized and evaluated key PDES properties such as lookahead, granularity, and computation-to-communication ratio for a master/worker environment. Understanding these characteristics makes it possible to classify which PDES applications are best suited to master/worker execution. Moreover, I underscore the need for a more relevant metric for master/worker systems by comparing and contrasting the amount of processor time spent in actual PDES computation with the overhead time associated with the master/worker environment. The proposed metrics provide a breakdown of each major component and a profile indicating what portion of the total processor time is dedicated to executing simulation application code as opposed to overhead. These metrics provide more useful and relevant information than the traditional speedup metrics typically found in PDES performance studies, and are applied to different performance tests under a master/worker PDES system. The results from these empirical studies show and validate the impact of key PDES properties on overall performance. The viability of a PDES system under a shared, loosely coupled computing resource is demonstrated. Most importantly, the performance studies show which PDES codes are most conducive to master/worker execution, leading to a clear classification of expected performance (e.g., excessive overhead or acceptable overhead) according to inherent model characteristics.

• Conservative execution optimizations. In order to reduce the intrinsic overheads involved in a master/worker PDES computation, I have proposed several optimizations that have been applied to, and evaluated on, the master/worker PDES architecture. First, the design of a caching mechanism for storing recent simulation states on the workers, along with various eviction policies, is discussed and incorporated. Second, scheduling policies for work units are described in which lookahead and other time information, along with runtime statistics, are exploited to better prioritize partitions of work to clients. Third, a mechanism for overlapping communication with computation is proposed to efficiently pipeline simulation state updates. Similarly, a variety of techniques for masking the communication costs associated with messages are designed and evaluated. Finally, a protocol for proactive message updating is described. Using synthetic workloads and a real-world application, I have shown that, together, these optimizations significantly reduce the performance gap between master/worker and traditional PDES systems.

• Optimistic execution mechanisms. Optimism on a master/worker paradigm across volatile computing platforms presents new challenges. Due to the centralized nature of metadata, state, and messages, traditional Time Warp concepts must be adapted to fit this paradigm. I have developed new techniques to allow optimistic executions to be performed under this metacomputing paradigm. Master/worker PDES operates on the principle of leasing execution windows for workers to process, so methods must be developed for determining the proper length of these windows, even under purely stochastic simulations with no lookahead defined a priori. I have proposed two new rollback mechanisms to deal effectively with window-based leases and messages delivered via proxy. Issues such as delayed rollbacks due to the lack of peer-to-peer connectivity, unique message identification across distributed master services, and causality linkages are analyzed. These protocols handle rollbacks on the master services while preserving the maximum amount of work already completed. Additionally, adaptive tuning of time window lengths and adaptive state-saving mechanisms are proposed and evaluated.
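As a rough illustration of the overhead-oriented metric described under the performance-evaluation contribution above, the following Python sketch computes the share of total processor time spent in simulation application code versus overhead. It is a minimal sketch under stated assumptions, not the instrumentation used in this work: the class name `WorkerTimeProfile`, the function `aggregate`, and the three components (`simulation`, `communication`, `idle`) are hypothetical, chosen only to show the shape of a per-component breakdown.

```python
from dataclasses import dataclass


@dataclass
class WorkerTimeProfile:
    """Processor time in seconds for one worker, split by activity.

    The component names are assumptions made for illustration; the text
    only states that the metric breaks time into major components.
    """
    simulation: float     # executing simulation application code
    communication: float  # exchanging state and messages with the master
    idle: float           # waiting for a work unit to be leased

    def total(self) -> float:
        return self.simulation + self.communication + self.idle

    def breakdown(self) -> dict:
        """Return the fraction of total processor time per component."""
        total = self.total()
        if total <= 0:
            raise ValueError("profile contains no recorded time")
        return {
            "simulation": self.simulation / total,
            "communication": self.communication / total,
            "idle": self.idle / total,
        }


def aggregate(profiles):
    """Sum per-worker profiles into a single system-wide profile."""
    return WorkerTimeProfile(
        simulation=sum(p.simulation for p in profiles),
        communication=sum(p.communication for p in profiles),
        idle=sum(p.idle for p in profiles),
    )


if __name__ == "__main__":
    # Illustrative numbers only; these are not measured results.
    workers = [
        WorkerTimeProfile(simulation=6.0, communication=3.0, idle=1.0),
        WorkerTimeProfile(simulation=2.0, communication=5.0, idle=3.0),
    ]
    for name, share in aggregate(workers).breakdown().items():
        print(f"{name}: {share:.0%}")
```

A profile of this kind reports how much of the processor time leased from shared resources does useful simulation work, which a single speedup figure does not reveal.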