Large-Scale Machine Learning with the HeAT Framework

Academic year: 2021
(1)

Martin Siggel, Kai Krajsek, Philipp Knechtges, Markus Götz, Claudia Comito, Björn Hagemeier

DLR-SC, High-Performance Computing

(2)

Motivation: Helmholtz Analytics Framework (HAF)

• Joint project of all 6 Helmholtz centers to foster data analytics within Helmholtz.

• Scope: systematic development of domain-specific data analysis techniques in a co-design approach between domain scientists (people who have applications in mind) and information experts (people who provide the tools: us!).

(3)

Motivation: HAF Use Cases

• Aeronautics and Aerodynamics
• Structural Biology
• Research with Photons
• Earth System Modelling

(4)

Motivation: HAF Use Case Methods

• Clustering: k-means, mean shift clustering
• Uncertainty quantification: ensemble methods
• Dimension reduction: autoencoder, reduced-order models
• Feature learning: image descriptors, autoencoder
• Data assimilation: Kalman filter, 4D-Var, particle filter/smoother
• Classification/Regression: random forest, CNN, SVM
• Modelling: fiber tractography, point processes
• Optimization techniques: L-BFGS, simulated annealing
• Hyper-parameter optimization: evidence framework, grid search
• Interpolation: radial basis functions, Kriging

(5)

Greatest Common Denominator?

https://xkcd.com/1838/

Machine Learning = Lots of Data + Numerical Linear Algebra

Sounds like something HPC has been doing anyway.

(6)

What is HeAT?

• HeAT = Helmholtz Analytics Toolkit
• Developed within the Helmholtz Analytics Framework project
• The hot new machine learning / data analytics framework to come
• Developed in the open: available at https://github.com/helmholtz-analytics and https://pypi.org/project/heat

(7)
(8)

Big Data/Deep Learning Libraries

(9)

Which technology stack to use?

• We do not want to reinvent the wheel, and there are already plenty of machine learning frameworks available.

• Common denominator: all frameworks provide Python as a front-end language, so we also choose Python.

• Which framework to use as a basis?

(10)

Technology Benchmarks

• Testing several machine learning frameworks by implementing:
  • k-means
  • Self-Organizing Maps (SOM)
  • Artificial Neural Networks (ANN)
  • Support Vector Machines (SVM)

• Evaluation criteria:
  • Feature completeness
  • Ease of implementation

(11)

Evaluation

• Feature completeness: see the table below
• Ease of implementation and usage: PyTorch and MXNet
• Performance: PyTorch and MXNet
• Since we need MPI for distributed parallelism, we have chosen PyTorch as the basis for our machine learning toolkit: HeAT

Framework  | GPU | MPI | AD | LA | nD Tensors | SVD | Dist. tens.
-----------|-----|-----|----|----|------------|-----|------------
PyTorch    |  X  |  X  |  X |  X |     X      |  X  |
TensorFlow |  X  |     |  X |  X |     X      |  X  |      X
MXNet      |  X  |     |  X |  X |     X      |     |      X

(12)

Scope

• Facilitating the use cases of HAF in their work: k-means, SVM, NN
• Bringing HPC and machine learning / data analytics closer together

Design

• Ease of use
• Distributed parallelism (MPI)
• Pythonic, NumPy-like interface
• Automatic differentiation
• Tensor linear algebra
• Built on PyTorch
• GPU support

(13)

Data structure: nD tensor

Operations:
• Elementwise operations
• Slicing
• Matrix operations
• Reduction

(14)

Data structure: nD tensor

Operations:
• Elementwise operations
• Slicing
• Matrix operations
• Reduction
• Automatic differentiation

Runs on: CPU or GPU

(15)

Data structure: nD tensor

Operations:
• Elementwise operations
• Slicing
• Matrix operations
• Reduction
• Automatic differentiation

Runs on: CPU or GPU, distributed via MPI
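The operation classes listed above follow a NumPy-like interface; as an illustration of what each class means, here is a plain-NumPy sketch (NumPy stands in for the HeAT API here):

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)   # nD tensor (here 2x3)

elementwise = a + 1.0              # elementwise operation
sliced = a[:, 1]                   # slicing: second column
matmul = a @ a.T                   # matrix operation (2x2 result)
total = a.sum(axis=0)              # reduction along axis 0

print(matmul[0, 0])                # 0*0 + 1*1 + 2*2 = 5.0
```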

(16)

Data Distribution

A HeAT tensor is composed of local PyTorch tensors, one per server: Server#1 holds PyTorch Tensor#1, Server#2 holds PyTorch Tensor#2, Server#3 holds PyTorch Tensor#3.

Example with split=1: Server#1 holds [0, 1], Server#2 holds [2, 3], Server#3 holds [4, 5].

With split=0, the HeAT tensor is instead split along the first axis, again with one local PyTorch tensor per server.
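The split distribution in the example can be sketched as contiguous chunking along the split axis. This is a plain-Python illustration of the idea, not the actual HeAT implementation:

```python
import numpy as np

def local_chunk(data, n_procs, rank, split=0):
    """Return the contiguous chunk of `data` along axis `split`
    owned by process `rank` out of `n_procs` processes."""
    return np.array_split(data, n_procs, axis=split)[rank]

# The tensor [0..5] from the example, distributed over 3 processes:
data = np.arange(6)
chunks = [local_chunk(data, 3, r).tolist() for r in range(3)]
print(chunks)  # → [[0, 1], [2, 3], [4, 5]]
```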

(17)

What has been done so far?

• The core technology has been identified
• Implementation of a distributed parallel tensor class has begun
• Standard vector and matrix operations have been implemented
• Parallel data I/O via HDF5 and NetCDF

(18)

Example: k-means

• Core idea: k clusters around centroids

• Minimization of the within-cluster sum of squares:

  $\arg\min_{S} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$

Algorithm sketch:

1. Choose k centroids
2. For each point, calculate the distance to the centroids
3. Assign each point to the closest centroid
4. Estimate new centroids as the mean of the assigned points
5. Go to 2. until convergence
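The algorithm sketch above can be written down compactly. This is a minimal single-node NumPy sketch, not HeAT's implementation; the distributed version would operate on split tensors:

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Minimal k-means: returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # 1. Choose k centroids (here: k random data points)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(n_iter):
        # 2. Distance of each point to each centroid (broadcasting)
        dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        # 3. Assign each point to its closest centroid
        labels = dist.argmin(axis=1)
        # 4. New centroid = mean of the assigned points
        new_centroids = np.array([points[labels == i].mean(axis=0)
                                  for i in range(k)])
        # 5. Repeat until convergence
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Two well-separated blobs -> k-means recovers the two groups
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
centers, labels = kmeans(pts, k=2)
```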

(19)

Example: k-means

NumPy vs. HeAT

Step 2: for each point, calculate the distance to the centroids. (Side-by-side NumPy and HeAT code listings on the slide.)
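The slide's code listings did not survive extraction; a broadcasting-based distance computation of the kind the NumPy column presumably showed looks like this (the HeAT version is assumed to be analogous, given its NumPy-like interface):

```python
import numpy as np

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])   # (n, d)
centroids = np.array([[0.0, 0.0], [1.0, 1.0]])            # (k, d)

# Broadcast to shape (n, k, d), then reduce over the feature axis:
dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
print(dist.shape)  # one distance per (point, centroid) pair: (3, 2)
```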

 

(20)
(21)
(22)

Transparent development process

• GitHub for code review, issue tracking, sprint planning
• Travis for continuous integration
• Mattermost for discussions

(23)

What's next? Hierarchical Matrices

• We said that HPC people and codes are trained in numerical linear algebra ...
• ... but this mostly holds only for sparse matrices!
• In, e.g., SVMs, one has to solve a linear system with the kernel matrix.
• Its entries never become zero, but can be arbitrarily close to zero.
• Very unfortunate to have many near-zeros, but still O(n²) complexity for matrix-vector multiplication.
• Could one not partially approximate the matrix with low-rank matrices?

Figure taken from Steffen Börm's lecture notes "Numerical Methods for Non-Local Operators".
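As a toy illustration of the low-rank idea behind hierarchical matrices (a sketch of the principle, not the actual H-matrix construction): an off-diagonal block of a non-local kernel between two well-separated point clusters admits an accurate low-rank approximation, so a matrix-vector product with that block costs O(nr) instead of O(n²):

```python
import numpy as np

# Off-diagonal block of the non-local kernel 1/|x - y| between two
# well-separated point clusters (the "admissible" case for H-matrices)
x = np.linspace(0.0, 1.0, 200)
y = np.linspace(2.0, 3.0, 200)
K = 1.0 / np.abs(x[:, None] - y[None, :])  # dense (200, 200) block

# Truncated SVD: singular values of such admissible blocks decay
# exponentially, so a small rank r suffices
U, s, Vt = np.linalg.svd(K, full_matrices=False)
r = 10
U_r, s_r, Vt_r = U[:, :r], s[:r], Vt[:r, :]

v = np.ones(len(y))
exact = K @ v                        # O(n^2) per matrix-vector product
approx = U_r @ (s_r * (Vt_r @ v))    # O(n*r) per matrix-vector product

rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
```

In a full H-matrix, the near-diagonal blocks stay dense while all admissible off-diagonal blocks are stored in this factored form.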

(24)

Thanks for listening.

Questions?
