Java Grande Update

http://www.javagrande.org

PTLIU Laboratory for Community Grids

Bryan Carpenter, Geoffrey Fox

Computer Science, Informatics, Physics Indiana University, Bloomington IN 47404

(Technology Officer, Anabas Corporation, San Jose)

http://grids.ucs.indiana.edu/ptliupages

(2)

Java Grande in a Nutshell

• Concept started in December 1996 with first meeting on Java for Science and Engineering
• Forum established in February 1998
• Multiple forum activities in numerics, message-passing and parallel/distributed systems
• Ongoing set of workshops sponsored by ACM
    - Bill Joy talked in 2000, Guy Steele in 2001
• Multiple useful Web Sites and papers/presentations
• JSR activities with probably insufficient momentum
• No institutional contact with Sun for 3 years
• No impressive support for Java on HPC machines with

(3)

Java Grande Concept

• Use of Java for “Performance” and “Usability” in:
    - High Performance Network Computing
    - Scientific and Engineering Computation
    - (Distributed) Modeling and Simulation
    - Parallel and Distributed Computing
    - Data Intensive Computing
    - HPCC
    - Computational Grids
• The above is the classic “small” technical computing area. There is a much larger Grande problem:
    - Communication and Computing Intensive Commercial

(4)

Java Grande Motivation I: Users

• We have rather different drivers from HPCC (parallel computing) and Enterprise Systems
• In Enterprise software, Java is well established but the architectures are new (J2EE and messages), so there are new performance and scaling issues (Enterprise systems are large, as in Grids/Autonomic computing)
• In HPCC, we failed to produce good computing environments in the HPCC Initiative, and there is a possibly serious gap between the field (use of Fortran/C/C++) and the next generation of potential Science and Engineering users (Java, C#, Python …)

(5)

Java Motivation II: Language

• The Java language has several good design features
    - secure, safe (wrt bugs), object-oriented, familiar (to C, C++ and even Fortran programmers)
• Java has a very good set of libraries covering everything from commerce, multimedia and images to math functions (under development at http://math.nist.gov/javanumerics)
• Java has the best available electronic and paper training resources
• Java has excellent integrated program development environments
• Java naturally integrated with network and universal

(6)

Java Grande Forum

• Group meets (through 2002) either at annual meeting or separately
• Forum coordinated by Fox
• Numerics Group led by Boisvert and Pozo
• The Concurrency and Applications (Benchmarks) Group led by Caromel and Gannon
    - MPI subgroup led by Getov
• Annual ACM-sponsored workshops were in the Bay Area just before JavaOne up to 2001
    - In 1999 merged with ISCOPE (Object Methods in Scientific Computing, e.g. C++) but JG dominates
    - 2002 held just before OOPSLA with 90 attendees and good quality papers

(7)

JG Workshop 2002 I

• KEYNOTE: Pratap Pattnaik, IBM, Autonomic Computing
• Session II Grid and Parallel Computing
    - The Ninf Portal: An Automatic Generation Tool for Computing Portals
    - JavaSymphony: New Directives to Control and Synchronize Locality, Parallelism, and Load Balancing for Cluster and GRID-Computing
    - Ibis: an Efficient Java-based Grid Programming Environment
    - Efficient, Flexible and Typed Group Communications for Java
    - JOPI: A Java Object-Passing Interface
• Session III Grid and Peer-to-peer Computing
    - Abstracting Remote Object Interaction in a Peer-2-Peer Environment
    - Advanced Eager Scheduling for Java-Based Adaptively Parallel Computing
    - A Scaleable Event Infrastructure for Peer to Peer Grids
• Session IV Java Compilation
    - Elimination of Java Array Bounds Checks in the Presence of Indirection
    - Simple and Effective Array Prefetching in Java

(8)

JG Workshop 2002 II

• Session V Object-based Computing
    - KEYNOTE: Alexander Stepanov, The Future of Abstraction
    - Generic Programming for High Performance Scientific Applications
• Session VI Object-based Computing and Applications
    - Higher-Order Functions and Partial Applications for a C++ Skeleton Library
    - Ravenscar-Java: A High Integrity Profile for Real-Time Java
    - Parsek: Object Oriented Particle in Cell. Implementation and Performance Issues
    - inAspect - Interfacing Java and VSIPL
• Session VII Node Java I
    - Open Runtime Platform: Flexibility with Performance Using Interfaces
    - Aggressive Object Combining
    - Run-time Evaluation of Opportunities for Object Inlining in Java
• Session IV Node Java II
    - Jeeg: A Programming Language for Concurrent Objects Synchronization
    - Specifying Java Thread Semantics Using a Uniform Memory Model
    - Immutability Specification and its Applications

(9)

Disappointing Comment

• I have not seen strong interest from HPCC users and HPCC purchasers in Java
    - Possibly a chicken-and-egg situation
• 2 years ago, Sun offered poor Java support on HPC
    - Not certain of the current situation
• IBM Research produced several interesting HPC compilers supporting, for example, high performance arrays
    - These were not, I think, offered on IBM HPC machines
• However people voting on this are not from the Internet generation and the “alternatives” are not good!
• However one of largest pure Java science applications is from

(10)

Types of Activity

• Java on Node
    - Compilers and Language issues
• Parallel Computing
    - Thread and Message-passing models
    - Very little academic work for any languages!
• Distributed Computing
    - RMI
    - Jini JXTA
    - Grid and Web Services
• High performance enterprise Java

(11)

Java on the Node

• Numerics subgroup of Java Grande Forum focused on “node issues”
    - Floating Point
    - Java Math libraries
    - Arrays – efficiency of >1D arrays and support of Fortran90 style array functions
    - Convenience and natural syntax – complex arithmetic notation and multi-type libraries
• Very good SCIMARK node kernel benchmark at http://math.nist.gov/javanumerics/
• Broader range of benchmarks at http://www.epcc.ed.ac.uk/javagrande/
• Typical compiler work (from IBM)

(12)

Scimark

• http://math.nist.gov/scimark2/
• FFT
• SOR
• Monte Carlo
• Sparse Matrix Multiply
• Dense LU
• Available as downloadable applet
• Today peak performance is Sun 1.4.2 VM on Pentium

(13)

Edinburgh Benchmark Set I

• http://www.epcc.ed.ac.uk/javagrande/index_1.html
• Sequential, multi-threaded, mpiJava, C versus Java
• Low-Level
    - Arith: execution of arithmetic operations
    - Assign: variable assignment
    - Cast: casting
    - Create: creating objects and arrays
    - Loop: Loop overheads
    - Math: execution of math library operations
    - Method: method invocation
    - Serial: Serialisation

(14)

Edinburgh Benchmark Set II

• Medium Size
    - Series: Fourier coefficient analysis
    - LUFact: LU Factorisation
    - SOR: Successive over-relaxation
    - HeapSort: Integer sorting
    - Crypt: IDEA encryption
    - FFT: FFT
    - Sparse: Sparse Matrix multiplication
• “Real” Application
    - Search: Alpha-beta pruned search
    - Euler: Computational Fluid Dynamics
    - MD: Molecular Dynamics simulation
    - MC: Monte Carlo simulation

(15)

Java

(16)

Java

(17)

Numerics I

• Initially focused on “Java floating rules” that guaranteed the same (bad) result on all processors
• strictfp: This has been a part of Java for some time now. It is a keyword that specifies that the original strict (slow) semantics for Java floating point should be followed.
    - The new default allows 15-bit exponents for anonymous (temporary) variables. This tiny change allows Java implementations on the x86 family of processors to run at (nearly) full speed.
    - Also in default mode the specification of elementary functions is relaxed to allow any result within one unit in the last place of the correctly rounded exact results. This allows more efficient algorithms to be used (including hardware sin/cos).
    - There is a separate java.lang.StrictMath library that has a specific
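The difference between strict and default floating point can be sketched in plain Java. This is only an illustration: the class and method names are hypothetical, and on Java 17 and later all arithmetic is strict again, so the strictfp modifier is redundant there and the compiler may warn about it.

```java
final class StrictFpSketch {
    // strictfp asks for the original strict semantics: every intermediate
    // result is rounded to the declared type. (Redundant on Java 17+.)
    static strictfp double scaledProduct(double a, double b, double c) {
        // Under strict semantics a * b is rounded to a double first, so an
        // overflow in the product is not rescued by the later division.
        return a * b / c;
    }

    // Gap between Math.sin and StrictMath.sin, measured in units in the
    // last place (ulps) of the StrictMath result.
    static double sinUlpGap(double x) {
        double strict = StrictMath.sin(x);
        return Math.abs(Math.sin(x) - strict) / Math.ulp(strict);
    }

    public static void main(String[] args) {
        System.out.println(scaledProduct(Double.MAX_VALUE, 2.0, 2.0));
        System.out.println(sinUlpGap(1.0));
    }
}
```

With an extended exponent range for temporaries, the first call could have returned a finite value; under strict semantics the product overflows before the division is applied.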

(18)

Some more on Numerics

• fastfp modifier: There was a JSR for this that was withdrawn.
    - Obvious goals include support for fused multiply-add.
    - Marc Snir was the lead and IBM could not find a replacement, so this is not being pursued.
    - At some point we'd like to resubmit. We are hoping that Joe Darcy would be the lead.
    - You can see info on fastfp at http://math.nist.gov/javanumerics/reports/jgfnwg-minutes-6-00.html
• Joe Darcy: Joe has the title of Java Floating-Point Czar (it actually says this on his business card).
    - Joe is working on floating-point issues within Sun and serves as our main technical contact now.

(19)

Numerics III

• http://math.nist.gov/javanumerics/reports/jgfnwg-minutes-11-02.html November 2 2002 Update
• True multidimensional arrays indexed using specialized notation. This is JSR83
• Operator overloading to support the easy expression of alternate arithmetics.
• Complex numbers that are as efficient as primitive types.
• A new floating-point mode (i.e., fastfp) that admits the use of fused multiply-add operations in Java, and possibly admits additional compiler optimizations, such as the use of associativity.
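To see why fused multiply-add needs language-level permission, the sketch below compares a separately rounded multiply and add with a fused one. It uses Math.fma, which was added to the standard library in Java 9 and is not the fastfp mode proposed here; the class name and the operand values are chosen only for illustration.

```java
final class FmaSketch {
    // Two roundings: the product is rounded to a double, then the sum is.
    static double separate(double a, double b, double c) {
        return a * b + c;
    }

    // One rounding: a * b + c is computed exactly and rounded once.
    static double fused(double a, double b, double c) {
        return Math.fma(a, b, c);
    }

    public static void main(String[] args) {
        double a = 1.0 + 0x1.0p-27;
        double b = 1.0 - 0x1.0p-27;
        double c = -1.0;
        // The exact product is 1 - 2^-54, which is not a double and rounds
        // to 1.0, so the separate form loses the small term entirely.
        System.out.println(separate(a, b, c));
        System.out.println(fused(a, b, c));
    }
}
```

The two results differ, which is why a compiler cannot substitute one form for the other under the default floating-point rules.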

(20)

Java and Parallelism

• Message passing Systems
    - mpiJava from Community Grids Lab “oldie but goodie”
    - Pure Java version MPJ planned but not implemented (well)
• OpenMP in Java
    - JOMP from Edinburgh has its version of JavaGrande benchmarks http://www.epcc.ed.ac.uk/computing/research_activities/JOMP/index_1.html
• Thread and RMI based libraries

(21)

HPJava

• Conceived as a language for parallel programming, especially suitable for massively parallel, distributed memory computers.
• Takes various ideas (hopefully the good ones) from High Performance Fortran: distributed array model, parallel constructs.
• But in many respects HPJava is a lower level parallel programming language than HPF (takes the best of MPI and HPF style programming models)
    - Explicitly SPMD, requiring parallel programmer to insert calls to collective communication libraries like MPI or Adlib (library developed originally to support general distributed memory parallel compilers)
    - More or less as a by-product, HPJava also has a useful

(22)

HPspmd Model

• HPJava was originally intended as a first demonstration of a parallel programming model we called the HPspmd model (Single Program Multiple Data).
    - Java was chosen as the base language for this demo (instead of Fortran 90 or C++) partly because of JavaGrande philosophy – we expected Java to be a more productive high performance computing environment
• Actually it took so long to finish the HPJava preprocessor that in the meantime Java has become comparable in speed with those languages.

(23)

An HPF-like Program in HPJava

Procs p = new Procs2(P, P) ;                          // Declare 2d group of processes
on(p) {                                               // Enclosed code executed by that group.
    Range x = new ExtBlockRange(N, p.dim(0), 1, 1) ;  // Distributed index ranges… in this
    Range y = new ExtBlockRange(N, p.dim(1), 1, 1) ;  // case extended with ghost regions.
    float [[-,-]] u = new float [[x, y]] ;            // A distributed array
    for(int iter = 0 ; iter < NITER ; iter++) {
        Adlib.writeHalo(u) ;                          // Communication – edge exchange
        overall(i = x for 1 : N - 2)                  // Distributed, parallel looping construct
            overall(j = y for 1 + (i` + iter) % 2 : N - 2 : 2)
                u [i, j] = 0.25 * (u [i - 1, j] + u [i + 1, j] + u [i, j - 1] + u [i, j + 1]) ;
    }
}
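For comparison, the same red-black relaxation can be written as ordinary sequential Java. This is a sketch with a hypothetical class name, assuming a square array held on one node; the distributed ranges, the process group and the Adlib.writeHalo edge exchange have no counterpart here.

```java
final class RedBlackRelaxation {
    // Sequential analogue of the two nested overall loops: on each
    // iteration only the points with (i + j + iter) odd are updated.
    static void relax(float[][] u, int niter) {
        int n = u.length;
        for (int iter = 0; iter < niter; iter++) {
            for (int i = 1; i <= n - 2; i++) {
                for (int j = 1 + (i + iter) % 2; j <= n - 2; j += 2) {
                    u[i][j] = 0.25f * (u[i - 1][j] + u[i + 1][j]
                                     + u[i][j - 1] + u[i][j + 1]);
                }
            }
        }
    }

    // Builds an n x n grid with boundary value 1 and interior value 0.
    static float[][] unitBoundaryGrid(int n) {
        float[][] u = new float[n][n];
        for (int k = 0; k < n; k++) {
            u[0][k] = u[n - 1][k] = u[k][0] = u[k][n - 1] = 1.0f;
        }
        return u;
    }

    public static void main(String[] args) {
        float[][] u = unitBoundaryGrid(4);
        relax(u, 100);
        System.out.println(u[1][1]);
    }
}
```

In the HPJava version each process runs these loops only over its own block of the index ranges, which is why the ghost regions must be refreshed before every sweep.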

(24)

HPJava vs HPF

• This HPJava program looks like HPF, but the programming model is one of multiple, interacting processes, or threads
    - “Loosely synchronous”, not HPF single-threaded semantics.
• We invoke the communication library to update the ghost regions in the array explicitly.
    - But because we have high-level collective libraries, this isn’t particularly onerous.
• Can “break out” of the collective mindset at any time,

(25)

Benefits of HPspmd Model

• Translators are much easier to implement than full parallel compilers. No compiler magic is needed, and they immediately inherit the features of the best “standard compilers”.
    - The current HPJava compiler is just a preprocessor converting to standard Java, using a simple translation scheme with essentially no optimization. But performance is not embarrassing (see later).
    - Of course later we can do optimizations, and (hand-coding suggests) improve performance significantly.
• Good (object-oriented) framework for developing specialized parallel libraries.
• HPspmd designed to have “ease of writing” of HPF but allow

(26)

HPJava Architecture

[Diagram: layers of the HPJava system. Labels: Full HPJava (Group, Range, on, overall,…); Sequential Java with Multiarrays (int[[*, *]]); Java Source-to-Source Translation (Translator); libraries Adlib, OOMPH¹, MPJ¹; mpjdev]

(27)

HPJava Preprocessor Features

• Input language is a strict extension of Java 2.
• Multi-arrays are translated into conventional Java 1D arrays
• Front-end implements all compile-time checks required by the Java Language Spec (currently testing against Jacks suite).
    - Goal: if the preprocessor accepts the source, it never outputs a program the javac back-end will reject.
• Carefully preserves line numbering, so run-time exception messages usually point accurately back into original HPJava source code – makes debugging HPJava “easy”.
• Version 1.0 released April 2003 at http://www.hpjava.org.
    - Full source of preprocessor + libraries are in public domain.
    - Good framework for other experiments with Java language
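The idea of mapping a multi-array onto a conventional 1D Java array can be sketched as follows. The class, its row-major layout and its method names are assumptions made for illustration; the translation scheme the preprocessor actually emits is not shown in these slides.

```java
final class FlatArray2D {
    private final int rows;
    private final int cols;
    private final float[] data;   // one contiguous block, row-major order

    FlatArray2D(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new float[rows * cols];
    }

    // Maps the index pair (i, j) to a position in the 1D backing array.
    int offset(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ")");
        }
        return i * cols + j;
    }

    float get(int i, int j) {
        return data[offset(i, j)];
    }

    void set(int i, int j, float value) {
        data[offset(i, j)] = value;
    }

    public static void main(String[] args) {
        FlatArray2D a = new FlatArray2D(3, 4);
        a.set(2, 1, 7.5f);
        System.out.println(a.offset(2, 1) + " -> " + a.get(2, 1));
    }
}
```

Unlike a Java float[][], which is an array of separately allocated rows, this layout keeps all elements in one block and needs a single bounds check per access.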

(28)

Libraries

• Adlib is a comprehensive library for collective operations on distributed arrays, implementing operations like reductions, shifts, edge exchange for stencil updates, etc.
    - Invoked like MPI, but higher level.
    - Originally implemented to support HPF translation (shpf, PCRC projects).
    - Originally C++, now Java, implemented on mpjdev portability layer.
• MPJ is a proposed Java binding of standard MPI
• OOMPH is an envisaged HPJava binding of MPI-level

(29)

Low-Level Messaging for HPC

• mpiJava is our own binding of MPI for Java. Implemented as native method wrappers for “real” MPI (MPICH, Sun HPC and IBM MPIs)
    - Several other groups developed similar APIs, but mpiJava is probably the most widely used today.
    - Pugh claims Java’s new I/O is very fast. Prototyped an nio-based MPJava, but not actively developing.
    - Distinguish from “MPI on the Grid” MPICH-G2
• MPJ was put forward as a unified “standard” by a small group (including Vladimir Getov, Tony Skjellum and Carpenter from Indiana) but activity appears to be dormant.
    - API is quite large and inherits some ugly features from MPI and mpiJava.
    - A smaller, more focused, OOMPH API might be more attractive.

(30)

Level of Interest in mpiJava?

• Uptake of our mpiJava software over ~five years:
• Being used

(31)

Example HPJava Benchmarks on IBM SP3

(32)

Distributed Computing

• Much of the work of Forum was in distributed computing
• It included several pure Java frameworks such as those from European groups (Bal, Caromel, Philippsen)
• Forum initially focused on fast RMI and a “Java Framework for Computing” or Java Computing Portals
• Most workers in this field now position research in a Grid context
    - RMI becomes GridRPC
    - Portals become Grid Computing Environments

(33)

Background on Indiana Community Grid Laboratory Research

Geoffrey Fox, Director

(34)

6 Activity Areas in CG Laboratory I

• HPJava: Parallel Programming with Java
    - MPI and HPF Style Programming in Java (multi-arrays)
    - http://www.hpjava.org
    - Build on this for “HPSearch” with Java bound to “Grid/Web/XPATH/Google” handles
    - Available Dec. 2002; mpiJava available for 3 years
• NaradaBrokering: Publish/Subscribe Distributed Event/Message System
    - http://www.naradabrokering.org
    - “MQSeries/JMS” P/S applied to Collaboration, Grid, P2P (JXTA)
    - Supports UDP, TCP/IP, Firewalls (actual transportuser call)

(35)

6 Activity Areas in CG Laboratory II

• Online Knowledge Center: DoD HPCC Support Portal
    - http://ptlportal.communitygrids.iu.edu/portal/
    - Portal, Database, XML Metadata Tools
    - Jetspeed and portlet architecture
    - http://www.xmlnuggets.org is “email group” interface for browsing multiple instances of a Schema (also XML based news groups)
    - Schema wizard gives general user interfaces for each Schema
• Gateway Computing Portal
    - DoD HPCMO, Geoscience, (Bioinformatics, particle physics) applications
    - Web Service based (originally CORBA)
    - Kerberos, SAML Security, GCE Shell – 70 functions
    - Integrate data and compute Grids

(36)

6 Activity Areas in CG Laboratory III

Components of an Education Grid

• Anabas provides base JMS based collaborative e-learning service (Fox co-founded 2 years ago)
• Collaboration as a Web Service
    - General XGSP specification of Collaborative session capturing H323 SIP JXTA
    - Audio-Video conferencing as a web service – Admire (Beihang), Access Grid, VOIP, Polycom, Desktop USB
    - Move all tools and shared applications (Word, PowerPoint)
    - General Scheme to make WS’s collaborative using NB
• Carousel HandHeld Collaborative Environments
    - iPAQ running Savaje Java OS linked to PC’s; adding cell-phone/PDA tandems
    - SVG as a Web Service demonstrated

(37)

HPJava: December 2003

• The HPJava project has been in hibernation (maintenance only) since the release of the HPJava Development Kit, version 1.0, April this year.
• The release wasn’t very aggressively advertised, partly because the project ran out of funding.
    - Rate of downloads similar to early days of mpiJava, which was quite aggressively advertised.
• Noticeable growth in downloads of mpiJava in the past

(38)

HPJava Software Status

• The 1.0 release was functionally complete and self-contained with good documentation, and a reasonably comprehensive test suite. Reliable, we think.
    - Highly compliant with standard Java. Strict extension of JLS 2nd ed. syntax. Produces standard class files. Uses standard JVMs to execute. Any standard Java code can be invoked.
• Performance suffers from a naïve translation scheme. It would be easy to improve the node code dramatically, with standard optimizations. Anybody have compiler experience?
• HPJava language specification is quite stable and well-defined.
    - Probably some semantic details should be changed to make optimization easier.

(39)

Java MPI Status

• mpiJava is popular in its field, and its popularity is apparently still increasing.
    - Foundation for communication in the HPJava language, but mostly used standalone at the moment.
• It is old, and the implementation needs a complete rewrite.
    - It is a relatively complex API where we tried to make every call a JNI wrapper. It is now obvious this is the wrong strategy.
• It should be rebuilt on top of a layer like mpjdev, which could be implemented using native MPI (or other native HPC platform) or on top of Java NIO (plus Jini?). This should make it much more maintainable.
• The API could also be improved.
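The layering argued for above can be sketched as a narrow byte-level device interface with the typed API built on top of it. All names below (MessageDevice, InMemoryDevice, IntComm) are hypothetical and are not the mpjdev or mpiJava APIs; the in-memory device simply stands in for a native MPI or NIO implementation.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical device layer: the only part that would need native code.
interface MessageDevice {
    void send(int dest, byte[] payload);
    byte[] recv(int self);
}

// Stand-in device that delivers messages through in-process queues.
final class InMemoryDevice implements MessageDevice {
    private final List<BlockingQueue<byte[]>> inboxes = new ArrayList<>();

    InMemoryDevice(int size) {
        for (int rank = 0; rank < size; rank++) {
            inboxes.add(new LinkedBlockingQueue<>());
        }
    }

    @Override
    public void send(int dest, byte[] payload) {
        inboxes.get(dest).add(payload.clone());
    }

    @Override
    public byte[] recv(int self) {
        try {
            return inboxes.get(self).take();   // blocks until a message arrives
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}

// Typed layer written in pure Java on top of whichever device is supplied.
final class IntComm {
    private final MessageDevice device;
    private final int rank;

    IntComm(MessageDevice device, int rank) {
        this.device = device;
        this.rank = rank;
    }

    void send(int dest, int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES);
        for (int v : values) {
            buf.putInt(v);
        }
        device.send(dest, buf.array());
    }

    int[] recv() {
        ByteBuffer buf = ByteBuffer.wrap(device.recv(rank));
        int[] values = new int[buf.remaining() / Integer.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getInt();
        }
        return values;
    }
}

final class DeviceLayerSketch {
    public static void main(String[] args) {
        MessageDevice device = new InMemoryDevice(2);
        new IntComm(device, 0).send(1, new int[] {1, 2, 3});
        System.out.println(new IntComm(device, 1).recv().length);
    }
}
```

Swapping the transport then means supplying another MessageDevice, while the typed layer stays untouched, which is the maintainability gain the rebuild is after.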

(40)

Related Systems I: UPC

• UPC is a new parallel version of the C language, supported by various groups and companies.
• Programming model is distributed shared memory: somewhere in between HPF and OpenMP.
    - c.f. HPJava is somewhere in between HPF and MPI.
• Adds shared type signatures for arrays, pointers.
• Block cyclic distribution formats like HPF, but one dimensional distribution only.
    - multidimensional arrays effectively flattened before mapping.
    - HPJava, like HPF, has true multidimensional distributions.
• Primitives in language for barriers, locks.
    - HPJava doesn’t have these: message + library based communication.
• HPJava, UPC both SPMD. After that the approaches and
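A one-dimensional block-cyclic distribution of the kind mentioned above reduces to two index formulas. The sketch below uses hypothetical names and assumes a flattened global index g, block size b and P processes; it does not reproduce any UPC or HPJava library call.

```java
final class BlockCyclic {
    // Process that owns global element g: blocks are dealt out round-robin.
    static int owner(int g, int blockSize, int procs) {
        return (g / blockSize) % procs;
    }

    // Position of global element g inside its owner's local storage:
    // completed rounds contribute whole blocks, plus the offset in the block.
    static int localIndex(int g, int blockSize, int procs) {
        return (g / (blockSize * procs)) * blockSize + g % blockSize;
    }

    public static void main(String[] args) {
        int blockSize = 2;
        int procs = 3;
        for (int g = 0; g < 12; g++) {
            System.out.println(g + " -> process " + owner(g, blockSize, procs)
                    + ", local index " + localIndex(g, blockSize, procs));
        }
    }
}
```

A true multidimensional distribution, as in HPF and HPJava, applies a mapping like this independently to each array dimension over a process grid instead of to one flattened index.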

(41)

Related Systems II: Titanium

• Parallel language from Berkeley, based on Java. Titanium Syntax = Java 1 + many new features – Java Threads
• Parallel programming model conceptually similar to UPC, but seems more complicated.
    - Variables can be independently global or local, shared or nonshared (or polyshared).
    - Discipline of single variables and methods, and compile time checks on valid barrier sequence.
    - Sophisticated model of multiarrays, based on Domain and Point concepts.
    - Many other language-level features…
• Not very compatible with Java: compiles to C++; doesn’t use a
