Integration of Virtualized Workernodes in Batch Queueing Systems – The ViBatch Concept

Academic year: 2021

(1)

INSTITUT FÜR EXPERIMENTELLE KERNPHYSIK FAKULTÄT FÜR PHYSIK

(2)

(Computer) Virtualization

Sharing resources of one physical machine between independent Operating Systems (OS) in Virtual Machines (VM)

Virtual Machines are decoupled from the underlying hardware and (almost) arbitrary operating systems can be installed.

Different virtualization techniques provided by various vendors and open-source communities

[Figure: one physical host machine with a virtualization layer running three VMs: VM server 1 (Workernode, OS 1), VM server 2 (Proxy server, OS 2), VM server 3 (User portal, OS 3)]

(3)

Why Virtualization?

Offers independence from host systems and encapsulation of user interaction.

Enables use of special validated operating systems for high energy physics analysis

Enables use of Virtual Appliances, e.g. CernVM (see later)

Allows the dynamic partitioning of a shared HPC cluster:

- Grants different setups for different user groups

- No incompatibilities have to be considered

(4)

KVM - Kernel-based Virtual Machine

[Figure: KVM architecture. Linux kernel with the KVM kernel module; VM 1 (Debian) and VM 2 (SuSE) run next to normal user processes]

KVM is implemented as a Kernel module

Linux kernel is the virtual machine monitor

VMs run as normal processes

Supports native virtualization techniques AMD-V and Intel VT-x => Very good performance!
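
On a Linux host, the CPU flags show whether these extensions are available: `vmx` indicates Intel VT-x and `svm` indicates AMD-V. The following is a minimal sketch, assuming an x86 Linux host that exposes `/proc/cpuinfo`; the function name `hw_virt_support` is made up for illustration and is not part of KVM or ViBatch.

```shell
#!/usr/bin/env bash
# Print which hardware virtualization extension a cpuinfo-style file advertises.
# Usage: hw_virt_support [path]   (path defaults to /proc/cpuinfo)
# Note: hw_virt_support is a hypothetical helper name, not an existing tool.
hw_virt_support() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    local flags
    # Take the first "flags" line; tolerate files that have none.
    flags=$(grep -m1 '^flags' "$cpuinfo" || true)
    # Pad with spaces so that each flag can be matched as a whole word.
    case " ${flags} " in
        *" vmx "*) echo "Intel VT-x" ;;
        *" svm "*) echo "AMD-V" ;;
        *)         echo "none" ;;
    esac
}
```

Calling `hw_virt_support` without arguments inspects the local machine. A result of `none` can also mean that the extension is disabled in the firmware settings.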


libvirt

Interface to common VMMs/hypervisors such as KVM, Xen, VMware, UML

(Remote) management of virtual machines and storage

More information: http://libvirt.org
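
libvirt ships the `virsh` command-line client, which accepts a connection URI for remote hosts. As a small illustration of remote management, the sketch below only builds the command string and does not run it. The helper name `virsh_list_cmd` and the host name in the usage note are placeholders, and nothing here is taken from the ViBatch scripts.

```shell
#!/usr/bin/env bash
# Build (but do not execute) a virsh command that lists all VMs,
# running or not, on a remote KVM host reached over SSH.
# virsh_list_cmd is a hypothetical helper name used for illustration only.
virsh_list_cmd() {
    local host="$1"
    echo "virsh --connect qemu+ssh://${host}/system list --all"
}
```

For example, `virsh_list_cmd node01.example.org` prints the command for a host called `node01.example.org`. Running the printed command requires libvirt on both ends and SSH access to the host.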

(5)

Dynamic Virtualization Project at KIT:

HPC Cluster Models

Isolated Computing Cluster

Each group/institution has a separate cluster

- Administration overhead

- Cannot cover peak loads

Shared Computing Cluster

All groups share one cluster

- A setup compromise is not always possible

- Load balancing by fair-share

Dynamic Partitioned Cluster

Configure the cluster in real time with VMs

- Allows any software/OS configuration

- Virtualization layer hidden

- Load balancing by fair-share

(6)

Dynamic Virtualization Project at KIT:

ViBatch

Lightweight tool enabling virtualization of job environments

Can be integrated into arbitrary batch systems

Batch system is not aware of the virtualization – no code modification needed (only adapt configuration)

The virtual environment is determined per job solely by the queue the job is sent to:

qsub -q [normal_queue] job1.sh

qsub -q [virtual_queue] job1.sh

Job submission: only the queue changes!
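
One way to picture the queue-based selection is a small dispatch step that a prologue script could perform. This is only a sketch under assumptions: the variable `VIRTUAL_QUEUES`, the function `job_environment`, and the queue names are hypothetical and do not come from the ViBatch sources.

```shell
#!/usr/bin/env bash
# Hypothetical list of queues whose jobs should run inside a VM.
VIRTUAL_QUEUES="virtual_queue"

# Print "virtual" if the given queue is listed in VIRTUAL_QUEUES,
# otherwise print "native". The job script itself is never inspected.
job_environment() {
    local queue="$1" q
    for q in $VIRTUAL_QUEUES; do
        if [ "$q" = "$queue" ]; then
            echo "virtual"
            return 0
        fi
    done
    echo "native"
}
```

With this sketch, `job_environment virtual_queue` prints `virtual` and `job_environment normal_queue` prints `native`, which mirrors the idea that the user changes nothing but the queue name.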

(7)
(8)

ViBatch - Lightweight

Core components: just bash scripts (prologue, epilogue and remoteshell)

Additional scripts for (almost) automatic installation on arbitrary clusters

Cluster information and preferences in one config file

Logfiles enable debugging and workload statistics.

(9)

ViBatch - Virtual Appliances: CernVM-FS

Our VM image includes CernVM-FS, an HTTP-based remote file system developed as part of the CernVM software appliance

http://cernvm.cern.ch/portal

Provides LHC software installations (various VOs: CMS, ATLAS, ...), including the most common versions of the experiment software

We don't have to maintain our own installations!

A simple Squid HTTP proxy server does the caching
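
On the client side, pointing CernVM-FS at a Squid cache is a matter of two settings in the client configuration. The fragment below is a sketch, not the EKP configuration: `CVMFS_REPOSITORIES` and `CVMFS_HTTP_PROXY` are standard CernVM-FS client parameters, while the proxy host `squid.example.org` and port `3128` are placeholders.

```shell
# /etc/cvmfs/default.local (CernVM-FS client configuration, shell-style syntax)

# Repositories to make available; these two carry CMS and ATLAS software.
CVMFS_REPOSITORIES=cms.cern.ch,atlas.cern.ch

# Send all HTTP requests through the local Squid cache.
# squid.example.org:3128 is a placeholder, not the real EKP proxy.
CVMFS_HTTP_PROXY="http://squid.example.org:3128"
```

Because many worker nodes request the same files, a single caching proxy in front of them keeps most traffic inside the cluster.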

(10)

ViBatch in Operation at EKP, KIT

ViBatch has already been used at EKP for several HEP analyses:

Data skims for the Higgs TauTau analysis (see talk A. Burgmeier, T49.7)

Monte Carlo generation for studies in the Higgs search (C. Hackstein, T49.1)

Running on the EKP production cluster in parallel to native job submission

[Figure: Load of ViBatch (last 6 weeks); axis: # jobs]

Performance

Depends on KVM tuning and host setup

Currently investigated and tuned (KSM, ...)

[Figure: comparison of native and virtual runs for Monte Carlo Sim. (vbfnlo), CPU benchmark whetstone, and CMSSW physics analysis; overheads of +17 % and +12 % are indicated. CMSSW native: not available, SLE11 not binary]

(11)

ViBatch in Operation at EKP, KIT

Our setup – characteristics & problems

Shared Institutscluster IC1 at KIT

Workernodes (EKP): 200 (25)

CPU: 8 x 2.66 GHz Intel Xeon

Memory: 2 GB RAM per core

Disc space: 750 GB per node

Storage: 350 TB Lustre FS

Network: 40 Gbit/s InfiniBand

Memory consumption: ~2 GB RAM per VM

Currently no InfiniBand driver for our VMs => No native use of the Lustre file system possible

Storage mounted via NFS export

Compatibility problems between the kernel-space NFS daemon and the Lustre driver: unstable, a few nodes crashed

Currently solved by using a user-space NFS daemon

(12)

Conclusion and outlook

Extend operation to the whole cluster (200 nodes, 1600 VM slots)

Provide detailed documentation

Further simplify installation

Burst into cloud:

- Connect with ROCED (talk S. Riedel, T 77.3)
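
The figure of 1600 VM slots follows from the hardware numbers given earlier if one VM is assumed per CPU core; that one-VM-per-core mapping is an assumption made here, not something the slides state. A short sketch of the arithmetic:

```shell
#!/usr/bin/env bash
# Hardware numbers as listed for the IC1 cluster.
nodes=200
cores_per_node=8
ram_per_core_gb=2
# Approximate memory footprint of one VM (the slides quote ~2 GB).
ram_per_vm_gb=2

# Assumption: one VM slot per CPU core.
vm_slots=$(( nodes * cores_per_node ))
ram_per_node_gb=$(( cores_per_node * ram_per_core_gb ))
vms_per_node_by_ram=$(( ram_per_node_gb / ram_per_vm_gb ))

echo "VM slots in the whole cluster: ${vm_slots}"         # 1600
echo "RAM per node (GB): ${ram_per_node_gb}"              # 16
echo "VMs per node that fit in RAM: ${vms_per_node_by_ram}"  # 8
```

At roughly 2 GB per VM, eight VMs use about as much memory as a node provides, so the host system has little headroom. That would be consistent with the memory tuning (KSM) mentioned under performance.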

[Figure: Cloud + ViBatch]
