• No results found

Data Analytics with HPC and DevOps

N/A
N/A
Protected

Academic year: 2019

Share "Data Analytics with HPC and DevOps"

Copied!
87
0
0

Loading.... (view fulltext now)

Full text

(1)

Data Analytics with HPC and DevOps

PPAM 2015, 11th International Conference On Parallel Processing And Applied Mathematics Krakow, Poland, September 6-9, 2015

1 Geoffrey Fox, Judy Qiu, Gregor von Laszewski, Saliya Ekanayake,

Bingjing Zhang, Hyungro Lee, Fugang Wang, Abdul-Wahid Badi Sept 8 2015

[email protected]

http://www.infomall.org, http://spidal.org/ http://hpc-abds.org/kaleidoscope/

Department of Intelligent Systems Engineering

(2)

2

ISE Structure

The focus is on engineering of systems of small scale, often mobile devices that draw upon modern information technology techniques including intelligent systems, big data and user

interface design. The

foundation of these devices include sensor and detector technologies, signal processing, and information and control theory.

End to end Engineering

New faculty/Students Fall 2016 IU Bloomington is the only university among AAU’s 62 member

(3)

Abstract

• There is a huge amount of big data software that we want to

use and integrate with HPC systems

• Use Java and Python but face same challenges as large scale

simulations to get good performance

• We propose adoption of DevOps motivated scripts to support

hosting of applications on the many different infrastructures like

OpenStack, Docker, OpenNebula, Commercial clouds and HPC

supercomputers.

• Virtual Clusters can be used in clouds and Supercomputers and

seem a useful concept on which base approach

• Can also be thought of more generally as software defined

distributed systems

(4)

Big Data Software

(5)

Data Platforms

(6)

6

(7)

Functionality of 21 HPC-ABDS Layers

1) Message Protocols:

2) Distributed Coordination: 3) Security & Privacy:

4) Monitoring:

5) IaaS Management from HPC to hypervisors: 6) DevOps:

7) Interoperability: 8) File systems:

9) Cluster Resource Management: 10) Data Transport:

11) A) File management B) NoSQL

C) SQL

12) In-memory databases&caches / Object-relational mapping / Extraction Tools 13) Inter process communication Collectives, point-to-point, publish-subscribe, MPI: 14) A) Basic Programming model and runtime, SPMD, MapReduce:

B) Streaming:

15) A) High level Programming: B) Frameworks

16) Application and Analytics: 17) Workflow-Orchestration:

7

Here are 21 functionalities. (including 11, 14, 15 subparts)

4 Cross cutting at top

(8)
(9)

Java Grande

Revisited on 3 data analytics codes

Clustering

Multidimensional Scaling

Latent Dirichlet Allocation

all sophisticated algorithms

(10)

DA-MDS Scaling MPI + Habanero Java (22-88 nodes)

• TxP is # Threads x # MPI Processes on each Node

• As number of nodes increases, using threads not MPI becomes better • DA-MDS is “best general purpose” dimension reduction algorithm

• Juliet is a 96 24-core node Haswell + 32 36-core Haswell Infiniband Cluster • Use JNI +OpenMPI gives similar MPI performance for Java and C

10

All MPI on Node

(11)

DA-MDS Scaling MPI + Habanero Java (1 node)

• TxP is # Threads x # MPI Processes on each Node • On one node MPI better than threads

• DA-MDS is “best known” dimension reduction algorithm

• Juliet is a 96 24-core node Haswell + 32 36-core Haswell Infiniband Cluster • Use JNI +OpenMPI usually gives similar MPI performance for Java and C

11

24 way parallel Efficiency

(12)

12

(13)

Sometimes Java Allgather MPI performs poorly

13

TxPxN where T=1 is threads per node and P is MPI processes per node and N is number of nodes

Tempest is old Intel Cluster

Bind processes to 1 or multiple cores

(14)

Compared to C Allgather MPI performing

consistently

14

(15)

No classic nearest neighbor communication

All MPI collectives

15

All MPI on Node

(16)

No classic nearest neighbor communication

All MPI collectives (allgather/scatter)

16

All MPI on Node

(17)

No classic nearest neighbor communication

All MPI collectives (allgather/scatter)

17

All MPI on Node

(18)

DA-PWC Clustering on old Infiniband

cluster (FutureGrid India)

• Results averaged over TxP choices with full 8 way parallelism per node • Dominated by broadcast implemented as pipeline

(19)

Parallel LDA Latent

Dirichlet Allocation

• Java code running under Harp – Hadoop plus HPC plugin

• Corpus: 3,775,554 Wikipedia

documents, Vocabulary: 1 million words; Topics: 10k topics;

• BR II is Big Red II supercomputer with Cray Gemini interconnect • Juliet is Haswell Cluster with Intel

(switch) and Mellanox (node) infiniband

– Will get 128 node Juliet results

19

(20)

Parallel Sparse LDA

• Original LDA (orange) compared to LDA exploiting sparseness (blue) • Note data analytics making full use

of Infiniband (i.e. limited by communication!)

• Java code running under Harp – Hadoop plus HPC plugin

• Corpus: 3,775,554 Wikipedia

documents, Vocabulary: 1 million words; Topics: 10k topics;

• BR II is Big Red II supercomputer with Cray Gemini interconnect • Juliet is Haswell Cluster with Intel

(switch) and Mellanox (node) infiniband

20

(21)

Classification of Big Data Applications

(22)

Breadth of Big Data Problems

• Analysis of 51 Big Data use cases and current benchmark sets

led to 50 features (facets) that described important features

– Generalize Berkeley Dwarves to Big Data

• Online survey

http://hpc-abds.org/kaleidoscope/survey

for next

set of use cases

• Catalog 6 different architectures

• Note streaming data very important (80% use cases) as are

Map-Collective (50%) and Pleasingly Parallel (50%)

• Identify “complete set” of benchmarks

• Submitted to ISO Big Data standards process

(23)

51 Detailed Use Cases:

Contributed July-September 2013

Covers goals, data features such as 3 V’s, software, hardware

• http://bigdatawg.nist.gov/usecases.php

• https://bigdatacoursespring2014.appspot.com/course (Section 5)

Government Operation(4): National Archives and Records Administration, Census Bureau • Commercial(8): Finance in Cloud, Cloud Backup, Mendeley (Citations), Netflix, Web Search,

Digital Materials, Cargo shipping (as in UPS)

Defense(3): Sensors, Image surveillance, Situation Assessment

Healthcare and Life Sciences(10): Medical records, Graph and Probabilistic analysis, Pathology, Bioimaging, Genomics, Epidemiology, People Activity models, Biodiversity

Deep Learning and Social Media(6): Driving Car, Geolocate images/cameras, Twitter, Crowd Sourcing, Network Science, NIST benchmark datasets

The Ecosystem for Research(4): Metadata, Collaboration, Language Translation, Light source experiments

Astronomy and Physics(5): Sky Surveys including comparison to simulation, Large Hadron Collider at CERN, Belle Accelerator II in Japan

Earth, Environmental and Polar Science(10): Radar Scattering in Atmosphere, Earthquake, Ocean, Earth Observation, Ice sheet Radar scattering, Earth radar mapping, Climate simulation datasets, Atmospheric turbulence identification, Subsurface Biogeochemistry (microbes to

watersheds), AmeriFlux and FLUXNET gas sensors • Energy(1): Smart grid

23

26 Features for each use case Biased to science

(24)

Problem Architecture View Pleasingly Parallel Classic MapReduce Map-Collective Map Point-to-Point Shared Memory

Single Program Multiple Data Bulk Synchronous Parallel Fusion

Dataflow Agents Workflow

Geospatial Information System HPC Simulations

Internet of Things Metadata/Provenance

Shared / Dedicated / Transient / Permanent Archived/Batched/Streaming

HDFS/Lustre/GPFS Files/Objects

Enterprise Data Model SQL/NoSQL/NewSQL Pe rform anc eMe tri cs Fl ops pe rB yt e; Me m ory I/O Exe cut ion Envi ronm ent ;C ore libra rie s Vol um e Ve loc ity Va rie ty Ve ra

city Comm

uni cati on St ruc ture Da ta Abst ra ction Me tri c= M /Non-Me tri c= N O N 2 = NN / O(N) = N Re gul ar = R /Irre gul ar = I Dyna m ic = D /St atic = S Vi sua liza tion Gra ph Al gori thm s Line ar Al ge bra Ke rne ls Al ignm ent St re am ing Opt im iza tion Me thodol ogy Le arni ng Cla ssi fic ation Se arc h /Que ry /Inde x Ba se St atist ics Gl oba lAna lyt ics Loc al Ana lyt ics Mi cro-be nc hm arks Re com m enda tions

Data Source and Style View

Execution View

Processing View 2

3 4 6 7 8 9 10 11 12 10 9 8 7 6 5 4 3 2 1

1 2 3 4 5 6 7 8 9 10 12 14 9 8 7 5 4 3 2 1

14 13 12 11 10 6

13

Map Streaming 5

4 Ogre Views and

50 Facets Itera

(25)

6 Forms of

MapReduce

cover “all”

circumstances

Also an interesting software

(architecture) discussion

(26)

Benchmarks/Mini-apps spanning Facets

Look at NSF SPIDAL Project, NIST 51 use cases, Baru-Rabl review • Catalog facets of benchmarks and choose entries to cover “all facets” • Micro Benchmarks: SPEC, EnhancedDFSIO (HDFS), Terasort,

Wordcount, Grep, MPI, Basic Pub-Sub ….

SQL and NoSQL Data systems, Search, Recommenders: TPC (-C to x–HS for Hadoop), BigBench, Yahoo Cloud Serving, Berkeley Big Data, HiBench, BigDataBench, Cloudsuite, Linkbench

– includes MapReduce cases Search, Bayes, Random Forests, Collaborative Filtering

Spatial Query: select from image or earth data • Alignment: Biology as in BLAST

Streaming: Online classifiers, Cluster tweets, Robotics, Industrial Internet of Things, Astronomy; BGBenchmark.

Pleasingly parallel (Local Analytics): as in initial steps of LHC, Pathology, Bioimaging (differ in type of data analysis)

Global Analytics: Outlier, Clustering, LDA, SVM, Deep Learning, MDS, PageRank, Levenberg-Marquardt, Graph 500 entries

Workflow and Composite (analytics on xSQL) linking above

(27)

SDDSaaS

Software Defined Distributed Systems

as a Service

and Virtual Clusters

(28)

Supporting Evolving High Functionality ABDS

• Many software packages in HPC-ABDS. • Many possible infrastructures

• Would like to support and compare easily many software systems on different infrastructures

• Would like to reduce system admin costs

– e.g. OpenStack very expensive to deploy properly • Need to use Python and Java

– All we teach our students

– Dominant (together with R) in data science

• Formally characterize Big Data Ogres – extension of Berkeley dwarves – and benchmarks

• Should support convergence of HPC and Big Data

– Compare Spark, Hadoop, Giraph, Reef, Flink, Hama, MPI ….

• Use Automation (DevOps) but tools here are changing at least as fast as operational software

(29)

SDDSaaS Stack for HPC-ABDS

SaaS

PaaS

IaaS

NaaS

BMaaS

Orchestration

Mahout, MLlib, R

Hadoop, Giraph, Storm

Docker, OpenStack, Bare metal

OpenFlow

Just examples from 350 components

Cobbler

Abstract Interfaces removes tool dependency

IPython, Pegasus, Kepler, FlumeJava, Tez, Cascading

HPC-ABDS at 4 levels

(30)

30

Visualization Libraries

Mindmap of core

Benchmarks

(31)

Software for a Big Data Initiative

Functionality of ABDS and Performance of HPC

Workflow: Apache Crunch, Python or Kepler

Data Analytics: Mahout, R, ImageJ, Scalapack • High level Programming: Hive, Pig

Programming model:

Batch Parallel: Hadoop, Spark, Giraph, Harp, MPI; – Streaming: Storm, Kafka or RabbitMQ

In-memory: Memcached

Data Management: Hbase, MongoDB, MySQL • Distributed Coordination: Zookeeper

Cluster Management: Yarn, Mesos, Slurm

File Systems: HDFS, Object store (Swift),Lustre

DevOps: Cloudmesh, Chef, Ansible, Docker, Cobbler

IaaS: Amazon, Azure, OpenStack, Docker, SR-IOV • Monitoring: Inca, Ganglia, Nagios

31

(32)

Automation or

“Software Defined Distributed Systems”

• This means we specify Software (Application, Platform) in configuration file and/or scripts

• Specify Hardware Infrastructure in a similar way – Could be very specific or just ask for N nodes – Could be dynamic as in elastic clouds

– Could be distributed

• Specify Operating Environment (Linux HPC, OpenStack, Docker) • Virtual Cluster is Hardware + Operating environment

Grid is perhaps a distributed SDDS but only ask tools to deliver “possible grids” where specification consistent with actual hardware and administrative rules

– Allowing O/S level reprovisioning makes it easier than yesterday’s grids • Have tools that realize the deployment of application

– This capability is a subset of “system management” and includes DevOps • Have a set of needed functionalities and a set of tools from various commuinies

(33)

“Communities” partially satisfying SDDS

management requirements

IaaS: OpenStack

DevOps Tools: Docker and tools (Swarm, Kubernetes, Centurion, Shutit), Chef, Ansible, Cobbler, OpenStack Ironic, Heat, Sahara; AWS OpsWorks, • DevOps Standards: OpenTOSCA; Winery

Monitoring: Hashicorp Consul, (Ganglia, Nagios)

Cluster Control: Rocks, Marathon/Mesos, Docker Shipyard/citadel, CoreOS Fleet

Orchestration/Workflow Standards: BPEL

Orchestration Tools: Pegasus, Kepler, Crunch, Docker Compose, Spotify Helios

Data Integration and Management: Jitterbit, Talend

Platform As A Service: Heroku, Jelastic, Stackato, AWS Elastic Beanstalk, Dokku, dotCloud, OpenShift (Origin)

(34)

Functionalities needed in SDDS

Management/Configuration Systems

• Planning job -- identifying nodes/cores to use • Preparing image

• Booting machines

• Deploying images on cores

• Supporting parallel and distributed deployment

• Execution including Scheduling inside and across nodes • Monitoring

• Data Management

• Replication/failover/Elasticity/Bursting/Shifting • Orchestration/Workflow

• Discovery • Security

• Language to express systems of computers and software • Available Ontologies

• Available Scripts (thousands?)

(35)

Virtual Cluster Overview

(36)

Virtual Cluster

Definition:

A set of (virtual) resources that constitute a cluster

over which the user has full control. This includes virtual

compute, network and storage resources.

Variations:

Bare metal cluster:

A set of bare metal resources that can

be used to build a cluster

Virtual Platform Cluster:

In addition to a virtual cluster with

network, compute and disk resources a platform is deployed

over them to provide the platform to the user

(37)

Virtual Cluster Examples

• Early examples:

– FutureGrid bare metal provisioned compute resources

• Platform Examples:

– Hadoop virtual cluster (OpenStack Sahara)

– Slurm virtual cluster

– HPC-ABDS (e.g. Machine Learning) virtual cluster

• Future examples:

– SDSC Comet virtual cluster; NSF resource that will

offer virtual clusters based on KVM+Rocks+SR-IOV in

next 6 months

(38)

Virtual Cluster Ecosystem

(39)

Virtual Cluster Components

Access Interfaces

– Provides easy access by users and programmers: GUI, command line/shell, REST, API

Integration Services

– Provides integration of new services, platforms and complex workflows while exposing them in simple fashion to the user

Access and Cluster Services

– Simple access services allowing easy use by the users (monitoring, security, scheduling, management)

Platforms

– Integration of useful platforms as defined by user interest • Resources

– Virtual clusters need to be instantiated on real resources

• Includes: IaaS, Baremetal, Containers

• Concrete resources: SDSC Comet, FutureSystems, …

(40)

Virtual Platform Workflow

Successful

Reservation PlatformSelect

Prepare Deployment

Framework Workflow

(Devops)

Deploy

Return Successful

Platform Deployment

Execute

Experiment reservationTerminate

(41)

Virtual Platform Workflow

Successful

Reservation PlatformSelect

Prepare Deployment

Framework Workflow

(Devops)

Deploy

Return Successful

Platform Deployment

Execute

Experiment reservationTerminate

(42)

Tools To Create Virtual Clusters

(43)

Phases needed for Virtual Cluster Management

Baremetal

– Manage bare metal servers • Provisioning

– Provision an image on bare metal • Software

– Package management, software installation • Configuration

– Configure packages and software • State

– Report on the state of the install and services • Service Orchestration

– Coordinate multiple services • Application Workflow

– Coordinate the execution of an application including state and application experiment management

(44)

From Bare metal Provisioning



to Application Workflow

Baremetal Provisioning Software Configuration State Service

Orchestration ApplicationWorkflow

Nova Ironic

MaaS

Chef, Puppet, ansible, salt, … Juju

Packages

OS config OS state

Heat

Pegasus SLURM

Kepler

TripleO : deploys OpenStack

disk-mage-bulder

(45)

Some Comparison of DevOps Tools

Score Framework Open

Stack Language Effort Highlighted features

+++ Ansible x python low Low entry barrier, push model, agentless via ssh, deployment, configuration, orchestration, can deploy onto windows but does not run on windows.

+ Chef x Ruby High Cookbooks, Client server based, roles

++ Puppet x Puppet DSL

/ Ruby medium Declarative language, client-server based,

(---) Crowbar x Ruby Cent OS only, bare metal, focus on openstack, moved from Dell to SUSE

+++ Cobbler Python Medium - high Networked installations of clusters, provisioning, DNS, DHCP, package updates, power management, orchestration

+++ Docker Go very low Low entry barrier, Container management, Dockerfile

(--) Juju x Go low Manages services and applications

++ xcat Perl medium Diskless clusters, manage servers, setup of HPC stack, cloning of images

+++ Heat x Python medium Templates, relationship between resources, focuses on infrastructure

+ TripleO x Python high OpenStack focused, Install, upgrade OpenStack using OpenStack functionality

(+++) Foreman x Ruby,

puppet low REST, very nice documentation of REST apis

Puppet

Razor Ruby,puppet Inventory, dynamic image selection, policy based provisioning

+++ Salt x Python low Salt Cloud, dynamic bus for orchestration, remote execution and configuration management, faster than ansible via zeroMQ, ansible is in some aspects easier to use

(46)

PaaS as seen by Developers

Platform Languages Application staging Highlighted features Focus

Heroku Ruby, PHP, Node.js, Python, Java, Go, Closure, Scala

Source code

syncronization via git, addons

build, deliver, monitor and scale apps, data services, marketplace

Application development

Jelastic Java, PHP, Python,

Node.js, Ruby and .NET Source codesyncrhronization: git,

svn, bitbucket

PaaS and container based IaaS, Heterogeneous cloud support, plugin support for IDEs and builders such as maven, ant

Web server and

database development. Small number of

available stacks AWS Elastic

Beanstalk

Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker

Selection from

Webpage/REST API, CLI

deploying and scaling web

applications Apache, Nginx,Passenger, and IIS and

self developed services Dokku See heroku Source code

synchronisation via git Mini Heroku powered bydocker, docker Your own single-hostlocal Heroku, dotCloud Java, Node.js PHP,

Python, Ruby, (Go) Sold by Docker. Smallnumber of examples managed service forweb developers

Redhat Openshift

Via git automates the provisioning, management and scaling of applications

Aplication hosting in public cloud

Pivotal Cloud Foundry

Java, Node.js ,Ruby,

PHP, Python, Go Command line Integrates multipleclouds, develop and

manage applications Cloudify Java, Python, REST Command line, GUI,

REST open source TOSCA-basedcloud orchestration softwareplatform, can be installed

locally

open source, TOSCA, integrates with many cloud platforms Google App

Engine

Python, Java, PHP, Go Many useful services from

OAUTH to MapReduce run applications onGoogle’s infrastructure

(47)

Virtual Clusters Powered By

Containers

(48)

Hypervisor Models

Hardware

OS

Hypervisor

OS OS

Hardware

Hypervisor

OS OS

KVM, FreeBSD bhyve make appear as type 1

but run on OS

Type 1 - Baremetal

Type 2 – on OS

Oracle VM Server for SPARC and x86, Citrix XenServer, VMware ESX/ESXi ,

Microsoft Hyper-V 2008/2012. VMware Player, VirtualBoxVMware Workstation,

(49)

Hypervisor vs. Container

Hypervisor (Type 2)

Container

Server

Operating System

Container Engine

Libraries

App App App

Libraries

App App App

Server

Operating System

Hypervisor

Guest OS

Libraries

App App App

Guest OS

Libraries

App Aop App

(50)

Docker Components

• Docker Machine

– creates Docker hosts on computer, cloud providers, data center. Automatically creates hosts, installs Docker, configures docker client to talk to them. A

“machine” is the combination of a Docker host and a

configured client

• Docker Client

– Builds, pulls, and runs docker images and containers while using the docker daemon and the registry

• Docker Engine

– Builds and runs Containers with downloaded Images

• Docker Registry

– Provides storage and

distribution of Docker images e.g. Ubuntu, CentOS, nginx

• Docker Compose

– Connects multi containers

• Docker Swarm

– Manages containers on multiple hosts

• Docker Hub

– Public image repository

• Kitematic (GUI)

– OSX

(51)

CoreOS Components

• preconfigured OS with

popular tools for running

container

• Operating System

– CoreOS

• Container runtime

– rkt

• Discovery

– etcd

• Distributed System

– fleet

• Network

– flannel

Source: https://coreos.com/

(52)

Provisioning Comparison

• Containers do not

carry the OS

– Provisioning is much

faster

– Security may be an

issue

• By employing server server isolation we do not have this issue • Security will improve,

may come at cost of runtime

(53)

Cluster Scaling from Single entity -> Multiple -> Distributed

Service/

Tool Highlightedfeatures Provisioning SingleEntity MultipleEntities DistributedEntities DistributedScheduling

Rocks HPC, Hadoop

Docker Tools Uses Docker Dockermachine Dockerclient Docker-engine Dockerswarm

Helios Uses Docker

Centurion Uses Docker

Docker

Shipyard Composability

CoreOS Fleet CoreOS

Mesos Master-worker

Myriad

Mesos

framework for scaling YARN

YARN Next gen mapreduce scheduler Global, perapplication

Kubernetes

Manage cluster as single

container system

Borg Multiple cluster

Marathon Manage cluster

Omega Scheduler only

(54)

Container-powered Virtual Clusters (Tools)

Application Containers

– Docker, Rocket (rkt), runc (previously lmctfy) by OCI* (Open Container Initiative), Kurma, Jetpack

• Operating Systems and support for Containers – CoreOS, docker-machine

– requirements: kernel support (3.10+) and Application Containers installed i.e. Docker • Cluster management for containers

– Mesos, Kubernetes, Docker Swarm, Marathon/Mesos, Centurion, OpenShift, Shipyard, OpenShift Geard, Smartstack, Joyent sdf-docker, OpenStack Magnum

– requirements: job execution & scheduling, resource allocation, service discovery • Containers on IaaS

– Amazon ECS (EC2 Container Service supporting Docker), Joyent Triton • Service Discovery

– Zookeeper, etcd, consul, Doozer

– coordination service - registering master/workers • PaaS

– AWS OpsWorks, Elastic Beanstalk, Flynn, EngineYard, Heroku, Jelastic, Stackato (moved to HP Cloud from ActiveState)

Configuration Management

– Ansible, Chef, Puppet, Salt

(55)

Container-based Virtual Cluster Ecosystem

(56)

56

(57)

57

https://www.mindmeister.com/389671722/o pen-container-ecosystem-formerly-docker-ecosystem

(58)

58

https://www.mindmeiste r.com/389671722/open- container-ecosystem-

formerly-docker-ecosystem

(59)

59

https://www.mindmeister.com/389671

722/open-container-ecosystem-formerly-docker-ecosystem

(60)

60

https://www.mindmeister.com /389671722/open-container- ecosystem-formerly-docker-ecosystem

(61)

61

Open Container

Mindmap 5

https://www.mindmeister.c

om/389671722/open- container-ecosystem-

(62)

Roles in Container-powered VC

Application Containers

– Provides isolation of the host environment for the running container process by cgroups and namespaces from linux kernel, No hypervisor.

– Tools: Docker, Rocket (rkt), runc (lmctfy) – Use Case: Starting/running container images • Operating Systems

– Supports latest kernel version with minimal package extensions. Docker has small footprint and can be deployed easily on OS

Cluster Management

– Running container-based applications on complex cluster architectures needs additional supports for cluster management. Service Discovery, security, job execution and resource allocation are discussed.

Containers on IaaS

– During the transition between process and OS virtualization, running container tools on current IaaS platforms provides practices of running applications without

dependencies. • Service Discovery

– Each container application communicates with others to exchange data, configuration information, etc via overlay network. In a cluster environment, access information and membership information need to be gathered and managed in a quorum. Zookeeper, etcd, consul, and doozer offer cluster coordination services.

Configuration Management (CM)

– Uses configuration scripts (i.e. recipes, playbooks) to maintain and construct VC software packages on target machines

– Tools: Ansible, Chef, Puppet, Salt

– Use case: Installing/configuring Mesos on cluster nodes

(63)

Issues for Container-powered Virtual Clusters

Latest OS required (e.g. CentOS 7 and RHEL 7) – Docker runs on Linux Kernel 3.10+

– CentOS 6 and RHEL 6 are still most popular which have old kernels i.e. 2.6+ • Linux is only supported

– Currently no containers for Windows or other OS

– Microsoft Windows Server and Hyper-V Containers will be available

• Preview of Windows Server Containers

Container Image

– Size Issue: Delay on large images in starting apps (> 1GB), Increased network traffic while downloading and registering, Scientific application containers are typically big

– Purging old images: completed container images stay in host machines which consumes

storage. Unused images need to be cleaned up. One example is Docker Garbage Collection by Spotify

Security

– Insecure Image: image checksum flaws, docker 1.8 supports Docker Content Trust – Vulnerable hosts can affect data and applications on containers

Lack of available Application Images

– Still lots of application images need to be created, especially for scientific applications • File Systems

– Lots of Confusing issues (Lustre v. HDFS etc.) but not special to Containers

(64)

Cluster Management with Containers

Name Job Management (execution/ scheduling) Image management (planning /preparing/ booting) Configuration Service (Discovery, membership, quorum) Replication/Fai lover (Elasticity/Burs ting/Shifting)

Lang. What can be specified (extent of ontologies)

Apache Mesos Chronos,

two-level scheduling SchedulerDriver with ContainerInfo

Zookeeper Zookeeper, Slave Recovery

C++ Hadoop, spark, Kafka, elastic search

Google

Kubernetes kube-scheduler: Scheduling units (pods) with location affinity

Image Policy

with Docker SkyDNS Controllermanager Go Web applications

Apache Myriad (Mesos + YARN) Marathon, Chronos, Myriad Executor SchedulerDriv er with ContainerInfo, dcos

Zookeeper Zookeeper Java Spark, Cassandra, Storm, Hive, Pig, Impala, Drill [8]

Apache YARN Resource

Manager DockerContainer Executor

YARN Service Registry, Zookeeper

Resource

Manager Java Hadoop, MapReduce, Tez,Impala, Broadly used

Docker Swarm Swarm Filters

and Strategies Node agent Etcd, consul,zookeeper Etcd, consul,zookeeper Go Dokku, Docker Compose,Krane, Jenkins

Engine Yard

Deis (PaaS) Fleet Docker etcd etcd Python,Go Applications from HerokuBuildpacks, Dockerfiles, Docker Images

(65)

Cluster Management with Containers

Legend Description

Job Execution and

Scheduling Managing workload with priority, availability and dependency Image management

(planning/preparing/booting)

Downloading, powering up, and allocating images (VM image and container image) for applications

Configuration Service

(discovery, membership, quorum)

Storing cluster membership with key-value stores. service discovery, leader election

Replication/Failover

(elasticity/bursting/shifting) High Availability (HA), recovery against service failures Lang. development programming language

What can be specified

(extent of ontologies) Supported applications available in scripts

(66)

Container powered Virtual Clusters for

Big Data Analytics

• Better for reproducible research

• Issues on dealing with large amounts of data compatible with important for Data Software Stacks.

– Requires external file system integration such as HDFS, NFS for Big Data Software Stacks, or OpenStack Manila [10] (formerly cinder) to provide a distributed, shared file system

(67)

Comparison of Different Infrastructures

HPC is well understood for limited application scope; robust core services like security and scheduling

– Need to add DevOps to get good scripting coverage

• Hypervisors with management (OpenStack) are now well understood but

high system overhead as changes every 6 months and complex to deploy optimally.

– Management models for networking non trivial to scale – Performance overheads

– Won’t necessarily support custom networks

– Scripting good with Nova, Cloudinit, Heat, DevOps

• Containers (Docker) still maturing but fast in execution and installation. Security challenges especially at core level (better to assign nodes)

– Preferred choice if have full access to hardware and can chose – Scripting good with machine, Dockerfile, compose, swarm

(68)

Comparison in detail

• HPC

– Pro

• Well understood model • Hardened deployments • Established services

– Queuing, AAA

• Managed in research environments

– Con

• Adding software is complex • Requires devops to do it

right

• Build “master” OS • OS build for large

community

• Hypervisor

– Pro

• By now well understood • Looks similar to bare metal

– E.g. lots of software the same

• Customized Images

• Relative good security in shared mode

– Con

• Needs Openstack or similar, which is difficult to use

• Not suited for MPI

(69)

Comparison

• Container

– Pro

• Super easy to install • Fast

• Services can be containerized

• Application-based

Customized containers

– Con

• Evolving technology • Security challenges • One may run lots of

containers

• Evolving technologies for distributed container

management

• What to chose:

– Container

• If there is no pre installed system and you can start from scratch

• Have control over the hardware

– HPC

• If you need MPI • Performance

– Hypervisor

• If you have access to cloud

• Have many users • Do not totally saturate

(70)

Scripting the environment

• Container (Docker)

– Docker-machine

• provisioning

– Dockerfile

• Software install

– Docker compose

• Orchestration

– Swarm

• Multiple containers

• Relatively easy

• Hypervisor (Openstack)

– Nova

• Provisioning

– Cloudinit & DevOps

framework

• Software install

– Heat & DevOps

• Orchestration & multiple vms

(71)

Scripting environment

• HPC

– Shell & scripting

languages

– Build in workflow

• Many users do not exploit this

– Restricted to installed

software, and software

that can be put in user

space

(72)

Cloudmesh

(73)

CloudMesh SDDSaaS Architecture

• Cloudmesh is a open source http://cloudmesh.github.io toolkit:

– A software-defined distributed system encompassing virtualized and bare-metal infrastructure, networks, application, systems and platform software with a unifying goal of providing Computing as a Service.

– The creation of a tightly integrated mesh of services targeting multiple IaaS

frameworks

– The ability to federate a number of resources from academia and industry. This includes existing FutureSystems infrastructure, Amazon Web Services, Azure, HP Cloud, Karlsruhe using several IaaS frameworks

– The creation of an environment in which it becomes easier to experiment with platforms and software services while assisting with their deployment

and execution.

– The exposure of information to guide the efficient utilization of resources. (Monitoring)

– Support reproducible computing environments

– IPython-based workflow as an interoperable onramp

Cloudmesh exposes both hypervisor-based and bare-metal provisioning to users and administrators

• Access through command line, API, and Web interfaces.

(74)

Cloudmesh: from IaaS(NaaS) to Workflow (Orchestration)

Images

Dat

a

HPC-ABDS Software components defined in Ansible. Python (Cloudmesh)

controls deployment (virtual cluster) and execution (workflow)

(SaaS Orchestration)

Workflow

(IaaS Orchestration)

Virtual Cluster

Components

Infrastructure

• IPython

• Pegasus etc.

• Heat • Python

• Chef or Ansible

(Recipes/Playbooks)

• VMs, Docker,

Networks, Baremetal

(75)

Cloudmesh Functionality

User On-Ramp

Amazon, Azure, FutureSystems, Comet, XSEDE, ExoGeni, Other Science Clouds

Cloudmesh

Information

Services

CloudMetrics

Provisioning Management

Rain

Cloud Shifting

Cloud Bursting

Virtual Machine

Management

IaaS Abstraction

Experiment

Management

Shell

IPython

Accounting

Internal

External

(76)

Cloudmesh Components I

Cobbler: Python based provisioning of bare-metal or hypervisor-based systems

Apache Libcloud: Python library for interacting with many of the popular cloud service providers using a unified API. (One Interface To Rule

Them All)

Celery is an asynchronous task queue/job queue environment based on RabbitMQ or equivalent and written in Python

OpenStack Heat is a Python orchestration engine for common cloud environments managing the entire lifecycle of infrastructure and applications.

Docker (written in Go) is a tool to package an application and its dependencies in a virtual Linux container

OCCI is an Open Grid Forum cloud instance standard

Slurm is an open source C based job scheduler from HPC community with similar functionalities to OpenPBS

(77)

Cloudmesh Components II

Chef Ansible Puppet Salt are system configuration managers. Scripts are used to define system

Razor cloud bare metal provisioning from EMC/puppet

Juju from Ubuntu orchestrates services and their provisioning defined by

charms across multiple clouds

Xcat (Originally we used this) is a rather specialized (IBM) dynamic provisioning system

Foreman written in Ruby/Javascript is an open source project that helps system administrators manage servers throughout their lifecycle, from provisioning and configuration to orchestration and monitoring. Builds on Puppet or Chef

(78)

… Working with VMs in Cloudmesh

VMs

Panel with VM Table (HP) Search

(79)

Cloudmesh MOOC

Videos

(80)
(81)

Comet

• Two operational modes

– Traditional batch queuing system

– Virtual cluster based on time sliced reservations

• Virtual compute resources • Virtual disks

(82)

Layered Extensible Architecture

• Allows various access

modes

– GUI

– REST

– API

– Has API to also allow

2-tier execution on

target host

– Reservation backends

and resources can be

added

GUI

REST

Command Line & shell

API

API

Reservation Backend

(83)

Usage Example OSG on Comet

– Traditional HPC Workflow:

• Reserve number of nodes on batch queue • Install condor glidein

• Integrate batch resources into condor

• Use resources facilitated with glidins in condor scheduler • Adv: can utilize standard resources in HPC queues

– Virtualized Resource Workflow:

• Reserve virtual compute nodes on cluster • Install regular condor software in the nodes • Register the nodes with condor

• Use resources in condor scheduler • Adv:

– does not require glidein

(84)

Comet High Level Interface (Planned)

• Command line • Command shell

– Contains history of previous commands and actions • REST

(85)

Comet VC commands

(shell and commands)

• Based on simplified version of cloudmesh

– Allows one program that delivers a shell and executes commands also in a terminal

(86)

Command Shell

Ø cm comet

cm> cluster --start=… --end=..

--type=virtual

--name=myvirtualcluster cm> status --name=myvirtualcluster cm> delete --name=myvirtualcluster cm> history

(87)

Conclusions

§

Collected 51 use cases; useful although certainly incomplete

and biased (to research and against energy for example)

§

Improved (especially in security and privacy) and available as

online form

§

Identified 50 features called facets divided into 4 sets (views)

used to classify applications

§

Used to derive set of hardware architectures

§

Could discuss software (se papers)

§

Surveyed some benchmarks

§

Could be used to identify missing benchmarks

§

Noted streaming a dominant feature of use cases but not

common in benchmarks

§

Suggested integration of DevOps, Cluster Control, PaaS and

workflow to generate software defined systems

References

Related documents

Martin Grauer, MD, Head of DAAD Group Medical Programme Erlangen, Head of AGMAN Germany for Africa, Specialist in Internal Medicine, Infectiology, Tropical- and Travel

The Association of Energy Engineers (AEE), a nonprofit professional society of over 16,000 members, issued a survey to its members to determine the need for Energy Management Jobs,

"Martinez was fully aware of the proposed zone changes. Obviously, the reason for such notice is to apprise interested parties of the hearing so that they may attend and

The candidate, given a complete set of PPE and SCBA and given emergency decontamination equipment and a simulated victim exposed to a hazardous material, shall perform and

It appears that the impacts of such catchment forest regeneration will be significantly negative, resulting in a 49% decline in average (expected) net incomes from command

introduced in the existing three degree programmes by way of add-on skill oriented subjects during the first, second and third year of education as

Charges for surface water drainage and highway drainage for measured premises and car parks built from 1 April 2010 that do not have a water connection or a meter (see 4.5.7),

application/submission was made Name of the person/company making donation or gift Residential address or registered /official office address ABN if not an individual