
Big Data, R, and HPC


A Survey of Advanced Computing with R

Drew Schmidt

April 16, 2015


About Me

@wrathematics

http://librestats.com

https://github.com/wrathematics

http://wrathematics.info

Introduction

1. Introduction
   XSEDE
   Big Data and Bio
   Why is my code slow?
2. Profiling and Benchmarking
3. A Hasty Introduction to Advanced Computing with R
4. Wrapup

XSEDE

Extreme Science and Engineering Discovery Environment

Follow-on NSF project to TeraGrid in 2012

Centers operate machines, and XSEDE provides seamless infrastructure for allocations, access, and training

Researchers propose resource use through XRAS

Supports thousands of scientists in fields such as:
  Chemistry
  Bioinformatics
  Materials Science
  Data Sciences

XSEDE Allocations

Want to use XSEDE resources to teach a class?
https://portal.xsede.org/allocations-overview#types-education

Just looking to try out a larger resource or a special resource your campus doesn’t have?
https://portal.xsede.org/allocations-overview#types-startup

See a Campus Champion
https://www.xsede.org/current-champions

Ready to scale up your research?
https://portal.xsede.org/allocations-overview#types-research

More helpful resources

xsede.org → User Services

Resources available at each Service Provider
User Guides describing memory, number of CPUs, file systems, etc.
Storage facilities
Software (Comprehensive Search)

Training: portal.xsede.org → Training
Course Calendar
On-line training
Certifications

Get face-to-face help from XSEDE experts at your institution; contact your local Campus Champions.

Extended Collaborative Support (formerly known as Advanced User Support (AUSS))


Big Data

Volume, Velocity, Variety

Volume — Sequencers

Velocity — sensors ???

Variety — fasta, csv, databases, images, unstructured, . . .

Complexity — complicated models

Common Computational Problems in Bio

p > n.

Many small tasks (workflow).

Parallelization often difficult.


Why is my code slow?

Bad code.

Abstraction.

Serial.


Bad Code

R isn’t very smart.

R is slow, but bad programmers are slower!

Bad parallel code may be slower than good serial code.


Abstraction

Never completely free!

Could cost a microsecond (not worth worrying about!) or much, much more . . .

But abstraction is A Good Thing™.

Have to find the right balance.


Serial

https://csgillespie.wordpress.com/2011/01/25/cpu-and-gpu-trends-over-time/


Profiling and Benchmarking

Why Profile?

Profiling R Code

Advanced R Profiling

Benchmarking


Performance and Accuracy

Sometimes π = 3.14 is (a) infinitely faster than the “correct” answer and (b) the difference between the “correct” and the “wrong” answer is meaningless. . . . The thing is, some specious value of “correctness” is often irrelevant because it doesn’t matter. While performance almost always matters. And I absolutely detest the fact that people so often dismiss performance concerns so readily.

— Linus Torvalds, August 8, 2008


Why Profile?

Because performance matters.

Bad practices scale up!

Your bottlenecks may surprise you.

Because R is dumb.

R users claim to be data people. . . so act like it!

Compilers often correct bad behavior. . .

A Really Dumb Loop

    int main() {
      int x, i;
      for (i=0; i<10; i++)
        x = 1;
      return 0;
    }

clang -O3 -S example.c

    main:
        .cfi_startproc
    # BB#0:
        xorl    %eax, %eax
        ret

clang -S example.c

    main:
        .cfi_startproc
    # BB#0:
        movl    $0, -4(%rsp)
        movl    $0, -12(%rsp)
    .LBB0_1:
        cmpl    $10, -12(%rsp)
        jge     .LBB0_4
    # BB#2:
        movl    $1, -8(%rsp)
    # BB#3:
        movl    -12(%rsp), %eax
        addl    $1, %eax
        movl    %eax, -12(%rsp)
        jmp     .LBB0_1
    .LBB0_4:
        movl    $0, %eax
        ret

R will not!

Dumb Loop

    for (i in 1:n) {
      tA <- t(A)
      Y <- tA %*% Q
      Q <- qr.Q(qr(Y))
      Y <- A %*% Q
      Q <- qr.Q(qr(Y))
    }

    Q

Better Loop

    tA <- t(A)

    for (i in 1:n) {
      Y <- tA %*% Q
      Q <- qr.Q(qr(Y))
      Y <- A %*% Q
      Q <- qr.Q(qr(Y))
    }

    Q
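The two loops can be compared directly. The sketch below wraps each one in a function and times it with system.time(). The matrix sizes, the starting matrix Q0, the iteration count, and the names dumb_loop and better_loop are assumptions made for illustration; the slides do not specify them.

```r
# Illustrative inputs only: the slides do not say how A, Q, or n are defined.
set.seed(1234)
A  <- matrix(rnorm(2000 * 200), nrow = 2000, ncol = 200)
Q0 <- qr.Q(qr(matrix(rnorm(2000 * 5), nrow = 2000, ncol = 5)))
n  <- 20

# Recomputes t(A) on every iteration, although A never changes.
dumb_loop <- function(A, Q, n) {
  for (i in 1:n) {
    tA <- t(A)
    Y <- tA %*% Q
    Q <- qr.Q(qr(Y))
    Y <- A %*% Q
    Q <- qr.Q(qr(Y))
  }
  Q
}

# Computes t(A) once, before the loop starts.
better_loop <- function(A, Q, n) {
  tA <- t(A)
  for (i in 1:n) {
    Y <- tA %*% Q
    Q <- qr.Q(qr(Y))
    Y <- A %*% Q
    Q <- qr.Q(qr(Y))
  }
  Q
}

system.time(q_dumb   <- dumb_loop(A, Q0, n))
system.time(q_better <- better_loop(A, Q0, n))

# Both versions perform the same arithmetic, so the results should agree.
all.equal(q_dumb, q_better)
```

How large the gap is depends on the matrix dimensions and on the BLAS library in use, so the timings are worth checking on your own data rather than assuming.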

Example from a Real R Package

Excerpt from Original function

    while (i <= N) {
      for (j in 1:i) {
        d.k <- as.matrix(x)[l==j, l==j]
        ...

Excerpt from Modified function

    x.mat <- as.matrix(x)

    while (i <= N) {
      for (j in 1:i) {
        d.k <- x.mat[l==j, l==j]
        ...

By changing just 1 line of code, performance of the main method improved by over 350%!


Some Thoughts

R is slow.

Bad programmers are slower.

R can’t fix bad programming.


Timings

Getting simple timings as a basic measure of performance is easy and valuable.

system.time(): timing blocks of code.
Rprof(): timing execution of R functions.
Rprofmem(): reporting memory allocation in R.
tracemem(): detecting when a copy of an R object is created.
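Of these tools, tracemem() is probably the least familiar, so here is a minimal sketch of how it can be used. It assumes an R build with memory profiling enabled, which the code checks with capabilities("profmem"). The variable names are hypothetical.

```r
x <- c(1, 2, 3)

# tracemem() is only available when R was built with memory profiling.
tracing <- capabilities("profmem")
if (tracing) tracemem(x)

y <- x        # no copy yet: x and y refer to the same memory
y[1] <- 10    # modifying y forces a copy; tracemem() reports it when tracing is on

if (tracing) untracemem(x)
```

When tracing is active, R prints a tracemem line at the moment the copy is made, which shows where the copy happens in the code.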

Performance Profiling Tools: system.time()

system.time() is a basic R utility for timing expressions.

    x <- matrix(rnorm(20000*750), nrow=20000, ncol=750)

    system.time(t(x) %*% x)
    #    user  system elapsed
    #   2.187   0.032   2.324

    system.time(crossprod(x))
    #    user  system elapsed
    #   1.009   0.003   1.019

    system.time(cov(x))
    #    user  system elapsed
    #   6.264   0.026   6.338

Put more complicated expressions inside of curly braces:

    x <- matrix(rnorm(20000*750), nrow=20000, ncol=750)

    system.time({
      y <- x+1
      z <- y*2
    })
    #    user  system elapsed
    #   0.057   0.032   0.089

Performance Profiling Tools: Rprof()

    Rprof(filename="Rprof.out", append=FALSE, interval=0.02,
          memory.profiling=FALSE, gc.profiling=FALSE,
          line.profiling=FALSE, numfiles=100L, bufsize=10000L)

    x <- matrix(rnorm(10000*250), nrow=10000, ncol=250)

    Rprof()
    invisible(prcomp(x))
    Rprof(NULL)

    summaryRprof()

    $by.self
                    self.time self.pct total.time total.pct
    "La.svd"             0.68    69.39       0.72     73.47
    "%*%"                0.12    12.24       0.12     12.24
    "aperm.default"      0.04     4.08       0.04      4.08
    "array"              0.04     4.08       0.04      4.08
    "matrix"             0.04     4.08       0.04      4.08
    "sweep"              0.02     2.04       0.10     10.20
    ### output truncated by presenter

    $by.total
                     total.time total.pct self.time self.pct
    "prcomp"               0.98    100.00      0.00     0.00
    "prcomp.default"       0.98    100.00      0.00     0.00
    "svd"                  0.76     77.55      0.00     0.00
    "La.svd"               0.72     73.47      0.68    69.39
    ### output truncated by presenter

    $sample.interval
    [1] 0.02

    $sampling.time
    [1] 0.98

    Rprof(interval=.99)
    invisible(prcomp(x))
    Rprof(NULL)

    summaryRprof()

    $by.self
    [1] self.time  self.pct   total.time total.pct
    <0 rows> (or 0-length row.names)

    $by.total
    [1] total.time total.pct  self.time  self.pct
    <0 rows> (or 0-length row.names)

    $sample.interval
    [1] 0.99

    $sampling.time
    [1] 0


Other Profiling Tools

perf, PAPI

fpmpi, mpiP, TAU

pbdPROF

pbdPAPI

Profiling MPI Codes with pbdPROF

1. Rebuild pbdR packages

    R CMD INSTALL pbdMPI_0.2-1.tar.gz \
      --configure-args="--enable-pbdPROF"

2. Run code

    mpirun -np 64 Rscript my_script.R

3. Analyze results

    library(pbdPROF)
    prof <- read.prof("output.mpiP")
    plot(prof, plot.type="messages2")

Profiling with pbdPAPI

Bindings for Performance Application Programming Interface (PAPI)

Gathers detailed hardware counter data.

High and low level interfaces

    Function               Description of Measurement
    system.flips()         Time, floating point instructions, and Mflips
    system.flops()         Time, floating point operations, and Mflops
    system.cache()         Cache misses, hits, accesses, and reads
    system.epc()           Events per cycle
    system.idle()          Idle cycles
    system.cpuormem()      CPU or RAM bound
    system.utilization()   CPU utilization


Benchmarking

R functions are complicated!

Symbol lookup, creating the abstract syntax tree, creating promises for arguments, argument checking, creating environments, . . .

Executing a second time can have dramatically different performance over the first execution.

Benchmarking several methods fairly requires some care.
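One way to take that care, using only base R, is to discard a warm-up call and then summarize repeated timings rather than trusting a single run. This is a rough sketch; the helper names time_once and bench, the matrix size, and the repetition count are assumptions chosen for illustration.

```r
f <- function(x) t(x) %*% x
g <- function(x) crossprod(x)

x <- matrix(rnorm(2000 * 100), nrow = 2000, ncol = 100)

# Elapsed wall-clock seconds for a single call of fun(x).
time_once <- function(fun, x) system.time(fun(x))[["elapsed"]]

# Run one untimed warm-up call, then report the median of `reps` timed calls.
bench <- function(fun, x, reps = 20) {
  time_once(fun, x)
  median(replicate(reps, time_once(fun, x)))
}

c(f = bench(f, x), g = bench(g, x))
```

The median is less sensitive than the mean to an occasional slow run caused by garbage collection or other activity on the machine.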

Benchmarking tools: rbenchmark

rbenchmark is a simple package that easily benchmarks different functions:

    x <- matrix(rnorm(10000*500), nrow=10000, ncol=500)

    f <- function(x) t(x) %*% x
    g <- function(x) crossprod(x)

    library(rbenchmark)
    benchmark(f(x), g(x), columns=c("test", "replications", "elapsed", "relative"))

    #   test replications elapsed relative
    # 1 f(x)          100  13.679    3.588
    # 2 g(x)          100   3.812    1.000

Benchmarking tools: microbenchmark

microbenchmark is a separate package with a slightly different philosophy:

    x <- matrix(rnorm(10000*500), nrow=10000, ncol=500)

    f <- function(x) t(x) %*% x
    g <- function(x) crossprod(x)

    library(microbenchmark)
    microbenchmark(f(x), g(x), unit="s")

    # Unit: seconds
    #  expr        min         lq       mean     median         uq        max neval
    #  f(x) 0.11418617 0.11647517 0.12258556 0.11754302 0.12058145 0.17292507   100
    #  g(x) 0.03542552 0.03613772 0.03884497 0.03668231 0.03740173 0.07478309   100

I generally prefer rbenchmark, but the built-in plots for microbenchmark are nice:

    bench <- microbenchmark(f(x), g(x), unit="s")

    boxplot(bench)

[Figure: boxplot of log(time) by expression, for f(x) and g(x)]


A Hasty Introduction to Advanced Computing with R

Free

Better Code

Compiled code

Parallelism


Types of Improvements

Free.

Better code.

Compiled code.

Parallelism.


Build R with a Better Compiler

Better compiler = Faster R

Not entirely painless.

Can cost $$$.

R Installation and Administration: http://cran.r-project.org/doc/manuals/R-admin.html

The Bytecode Compiler

    f <- function(n) for (i in 1:n) 2*(3+4)

    library(compiler)
    f_comp <- cmpfun(f)

    library(rbenchmark)

    n <- 100000
    benchmark(f(n), f_comp(n), columns=c("test", "replications", "elapsed",
                                         "relative"),
              order="relative")
    #        test replications elapsed relative
    # 2 f_comp(n)          100   2.604    1.000
    # 1      f(n)          100   2.845    1.093

Choice of BLAS Library

    set.seed(1234)
    m <- 2000
    n <- 2000
    x <- matrix(
      rnorm(m*n),
      m, n)

    object.size(x)

    library(rbenchmark)

    benchmark(x %*% x)
    benchmark(svd(x))

[Figure: Comparison of Different BLAS Implementations for Matrix-Matrix Multiplication and SVD. Panels: x%*%x on 2000x2000 matrix (~31 MiB); x%*%x on 4000x4000 matrix (~122 MiB); svd(x) on 1000x1000 matrix (~8 MiB); svd(x) on 2000x2000 matrix (~31 MiB). BLAS implementations compared: reference, atlas, openblas1, openblas2. Measured quantity: Average Wall Clock Run Time (10 Runs).]


Loops, Plys, and Vectorization

Loops are slow.

apply(), Reduce() are just for loops.

Map(), lapply(), sapply(), mapply() (and most other core ones) are not for loops.

Ply functions are not vectorized.

Vectorization is fastest, but consumes lots of memory.
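The three styles can be compared on a toy problem. The sketch below squares every element of a vector with an explicit loop, with sapply(), and with a vectorized expression. The function names and the input size are hypothetical, and the task is deliberately trivial so that the overhead of each style dominates.

```r
x <- runif(1e5)

# Explicit for loop with a preallocated result vector.
square_loop <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) out[i] <- x[i]^2
  out
}

# Ply version: one R-level function call per element.
square_ply <- function(x) sapply(x, function(xi) xi^2)

# Vectorized version: a single call that operates on the whole vector.
square_vec <- function(x) x^2

system.time(r_loop <- square_loop(x))
system.time(r_ply  <- square_ply(x))
system.time(r_vec  <- square_vec(x))

# All three approaches should produce the same values.
all.equal(r_loop, r_ply)
all.equal(r_loop, r_vec)
```

Relative timings vary with the R version and the work done per element, so it is worth running the comparison on a realistic task before rewriting code.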


Rcpp

What Rcpp is

R interface to compiled code.

Package ecosystem (Rcpp, RcppArmadillo, RcppEigen, . . . ).

Utilities to make writing C++ more convenient for R users.

A tool which requires C++ knowledge to effectively utilize.

What Rcpp is not

Magic.

Automatic R-to-C++ converter.

A way around having to learn C++.

As easy to use as R.

Quickly Getting Started

    code <- "
    #include <Rcpp.h>

    // [[Rcpp::export]]
    int plustwo(int n)
    {
      return n+2;
    }
    "

    library(Rcpp)
    sourceCpp(code=code)

    plustwo(1)
    # [1] 3


Parallelism

Serial Programming

Parallel Programming


Parallel Programming: In Theory


Parallel Programming: In Practice


Shared and Distributed Memory Machines

Shared Memory Machines
Thousands of cores
Nautilus, University of Tennessee: 1024 cores, 4 TB RAM

Distributed Memory Machines
Hundreds of thousands of cores
Titan, Oak Ridge National Lab: 299,008 cores, 584 TB RAM


Parallel Programming Packages for R

Shared Memory
Examples: parallel, snow, foreach, gputools, HiPLARM

Distributed
Examples: pbdR, Rmpi, RHadoop, RHIPE

CRAN HPC Task View
For more examples, see: http://cran.r-project.org/web/views/HighPerformanceComputing.html
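As a small taste of the first package in the list, the sketch below runs the same independent task serially with lapply() and then on two local worker processes with parLapply() from the parallel package, which ships with R. The function slow_square is a hypothetical stand-in for real work, and the worker count of 2 is an arbitrary choice.

```r
library(parallel)

# Hypothetical stand-in for an expensive task with no shared state.
slow_square <- function(i) {
  Sys.sleep(0.01)
  i^2
}

inputs <- 1:20

# Serial baseline.
system.time(serial <- lapply(inputs, slow_square))

# The same work spread across two local worker processes.
cl <- makeCluster(2)
system.time(par_res <- parLapply(cl, inputs, slow_square))
stopCluster(cl)

# The parallel result should match the serial one element for element.
identical(serial, par_res)
```

Starting the workers and shipping data to them has a cost of its own, so for very small tasks the parallel version can be slower than the serial one.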

[Figure: overview of parallel hardware and the software that targets it. Hardware shown: distributed memory (processors with cache and local memory, joined by an interconnection network), shared memory (cores with cache sharing one memory), and a GPU or MIC co-processor (GPU: Graphical Processing Unit; MIC: Many Integrated Core). Annotations: "Focus on who owns what data and what communication is needed"; "Focus on which tasks can be parallel"; "Same Task on Blocks of data". Technologies named: Sockets, MPI, Hadoop, OpenMP, Threads, fork, CUDA, OpenCL, OpenACC. Libraries named: ScaLAPACK, PBLAS, BLACS, PETSc, Trilinos, PLASMA, DPLASMA, MAGMA, CUBLAS, MKL, ACML, LibSci, LAPACK, BLAS. R interfaces and packages named: .C, .Call, Rcpp, OpenCL, inline, multicore (fork), snow, parallel (snow + multicore), Rmpi, pbdMPI, pbdDMAT, RHIPE, HiPLAR, HiPLARM, magma.]


pbdR Packages


Wrapup

Performance-Centered Development Model

1. Just get it working.
2. Profile vigorously.
3. Weigh your options.
   Improve R code? (lapply(), vectorization, a package, . . . )
   Incorporate C/C++?
   Go parallel?
   Some combination of these. . .
4. Don’t forget the free stuff (BLAS, bytecode compiler, . . . ).
5. Repeat steps 2 to 4 until performance is acceptable.


Where to Learn More

The Art of R Programming by Norm Matloff: http://nostarch.com/artofr.htm

The R Inferno by Patrick Burns: http://www.burns-stat.com/pages/Tutor/R_inferno.pdf

Using R for HPC 4 Hour Tutorial


Thanks so much for attending!

Questions?

Breakout Sessions

Big Data, R and HPC

Environmental/Habitat data

Physiological data

Population data

Vegetation surveys
