(1)

iCER Bioinformatics Support

Fall 2011

John B. Johnston

HPC Programmer

Institute for Cyber Enabled Research

(2)

Institute for Cyber Enabled

Research (iCER)

•Hardware (HPCC)

•Software and Support

•Education

•Consulting

(3)

iCER: What is it?

The Institute for Cyber Enabled Research (iCER) at Michigan State University (MSU) was established to coordinate and support a multidisciplinary resource for computation and computational sciences.

The Center's goal is to enhance MSU's national and international presence and competitive edge in disciplines and research thrusts that rely on

(4)

HPCC: What is it?

The HPCC provides computational hardware and support to MSU faculty, students and researchers.

The HPCC is contained within iCER, effectively representing the hardware, systems and software “arm” of iCER’s research support mission.

(5)

Bioinformatics Outreach

•HPCC hardware

•Software resources

•Help Desk

•Seminars

•One-on-one Consulting

•Limited on-site systems setup and configuration

•Programming and scripting assistance

•FREE!

(6)

HPCC Cluster Overview

•Linux operating system

•Primary interface is text based through Secure Shell (ssh)

•All machines in the main cluster are binary compatible (compile once, run anywhere)

•Each user has 50 GB of personal hard drive space.

–/mnt/home/username/

•Users have access to 33TB of scratch space.

–/mnt/scratch/username/

•A scheduler is used to manage jobs running on the cluster

•A submission script is used to tell the scheduler the resources required and how to run a job

•A Module system is used to manage the loading and unloading of software configurations
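
As a rough illustration of the last three bullets, here is a minimal Python sketch that renders the text of a submission script. It assumes PBS/Torque-style `#PBS` directives, which these slides do not confirm; the job name, the `blast` module name and the file names are all hypothetical placeholders. Check the HPCC wiki for the scheduler syntax actually in use.

```python
def render_submission_script(job_name, nodes, cores_per_node, walltime, modules, command):
    """Return the text of a submission script.

    Assumption: PBS/Torque-style directives; the slides do not name the scheduler.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",                         # job name shown in the queue
        f"#PBS -l nodes={nodes}:ppn={cores_per_node}",  # nodes and cores per node
        f"#PBS -l walltime={walltime}",                 # maximum run time, HH:MM:SS
        "",
        "# Run from the directory the job was submitted from.",
        'cd "$PBS_O_WORKDIR"',
        "",
    ]
    # Load each software configuration through the module system.
    lines += [f"module load {name}" for name in modules]
    lines += ["", command, ""]
    return "\n".join(lines)


# Hypothetical example: a 4-core legacy BLAST search on one node.
script = render_submission_script(
    job_name="blast_test",
    nodes=1,
    cores_per_node=4,
    walltime="01:00:00",
    modules=["blast"],  # hypothetical module name
    command="blastall -p blastn -d nt -i query.fasta -o hits.txt -a 4",
)
print(script)
```

Under the same assumption, the printed text would be saved to a file and handed to the scheduler's submit command.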

(7)

Gateway to the System

•Access to HPCC is primarily through the gateway machine:

–ssh [email protected]

–ssh [email protected]

–Access to all HPCC services uses MSU username and password.

•Once in, you can go to the user-oriented destination of choice.

(8)
(9)

Why the HPCC Cluster?

•Large data sets

•Lots of number crunching

•A need to run many simultaneous jobs with different data sets and/or configuration settings

•You need software you don’t have, or don’t want to (or can’t) set up

•Comprehensive readymade development environment that is actively administered

(10)

Linux? OH NOES!

•If you are a Linux pro, go ahead and take a short nap (you’ve got ~60 seconds)

•If you’re not, don’t worry! That’s why I get the (not so) big bucks.

•The Bioinformatics Help Desk is here to get you up and running.

(11)

Linux Support

•Client application selection

•Bring in your laptop (if you have one)

•Cookbook tutorials and cheat sheets (more on the way)

•One-on-one consultation

•Limited on-site support and training

•We also provide samba support for Windows and Mac boxes so you can map your HPCC account directory to your workstation

(12)

HPCC Online Resources

www.hpcc.msu.edu

– HPCC home

wiki.hpcc.msu.edu

– Public/Private Wiki

forums.hpcc.msu.edu

– User forums

rt.hpcc.msu.edu

– Help desk request tracking

(13)

Available Software

•Center Supported Development Software

–Intel compilers, openmp, openmpi, mvapich, totalview, mkl, pathscale, gnu...

•Center Supported Research Software

–Matlab, R, amber, blast, charmm, emboss...

•Customer Software (module use.cus)

–Clustalw, QuEST, MEME, Velvet, mpiBLAST, bowtie, AMOS, ABySS, MUMmer, HMMER, phylip, SAMTools…

–For a more up-to-date list, see the documentation wiki:

http://wiki.hpcc.msu.edu/

(14)

User Software

•50GB of initial user space provided

•Install your own in user space

•HPCC offers a rich build environment

•Quota increases can be made available

•Code installation and (modest) modification support is available through “moi”
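
Before installing software in user space, it can help to see how much of the 50GB is already used. The Python sketch below is one way to estimate that by summing file sizes. It treats 50GB as binary gigabytes and uses apparent file sizes, both of which are assumptions; the file system's own quota accounting may differ.

```python
import os

QUOTA_BYTES = 50 * 1024**3  # assumption: 50GB interpreted as binary gigabytes


def directory_usage_bytes(root):
    """Sum the apparent sizes of regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; ignore it in the estimate.
                pass
    return total


def quota_fraction(used_bytes, quota_bytes=QUOTA_BYTES):
    """Return the fraction of the quota in use (1.0 means full)."""
    return used_bytes / quota_bytes
```

For example, `quota_fraction(directory_usage_bytes("/mnt/home/username"))` gives a value between 0 and 1 while the directory is under quota.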

(15)

Virtual Machines

•Virtual “Servers” expressed in software

•Available for research labs/working groups

•Flavors currently available:

– Galaxy

– BLAST (web browser based)

– UCSC Genome Browser

(16)

Database Offerings

•db-01: Internal MySQL database node attached to the cluster. Hosts user datasets of modest size.

•BLAST database repository

•VM-based – UCSC for example

•Up to 1TB total user space for free, $250/yr. per TB thereafter
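
The pricing in the last bullet can be worked through with a small Python sketch. The slide gives only the 1TB free allowance and the $250 per TB per year rate; rounding a partial terabyte up to a whole one is an assumption made here for illustration, not a stated billing policy.

```python
import math

FREE_TB = 1                 # from the slide: first 1TB is free
RATE_PER_TB_PER_YEAR = 250  # from the slide: dollars per additional TB per year


def annual_storage_cost(total_tb):
    """Return the yearly cost in dollars for total_tb of user space.

    Assumption: any partial terabyte beyond the free allowance is billed
    as a whole terabyte.
    """
    billable_tb = max(0, math.ceil(total_tb - FREE_TB))
    return billable_tb * RATE_PER_TB_PER_YEAR


for size_tb in (0.5, 1, 2.5, 4):
    print(f"{size_tb} TB -> ${annual_storage_cost(size_tb)} per year")
```

Under that rounding assumption, 2.5TB has 1.5TB above the free allowance, which is billed as 2TB.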

(17)

Multiprocessor Apps

Many bioinformatics applications are beginning to appear in multiprocessor-capable versions.

Workload can be divided to allow each processor to complete part of the job in parallel, decreasing run time.

HPC provides accessibility to a large number of processing cores, memory, and disk space.
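
A minimal Python sketch of dividing a workload, assuming a toy task (counting G and C bases in a list of sequences) that stands in for a real analysis. The function names are hypothetical. The work is split into one chunk per worker, each chunk is processed in a separate process, and the partial results are combined.

```python
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous pieces of near-equal length."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def count_gc(sequences):
    """Count G and C bases across a list of sequences (one worker's share)."""
    return sum(1 for seq in sequences for base in seq.upper() if base in "GC")


def parallel_gc_count(sequences, workers=4):
    """Process each chunk in its own process and add up the partial counts."""
    chunks = split_into_chunks(sequences, workers)
    with ProcessPoolExecutor(max_workers=workers) as executor:
        return sum(executor.map(count_gc, chunks))
```

When run as a script, call `parallel_gc_count(reads, workers=2)` from inside an `if __name__ == "__main__":` guard so worker processes can import the module safely. For a task this small the process start-up cost outweighs any gain; the pattern pays off when each chunk takes real compute time.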

(18)

Some Examples

•Multithreaded BLAST – shared memory

•mpiBLAST – distributed memory

•Velvet Assembler – multithreaded, shared memory

•MAKER2 – MPI, distributed memory

(19)

Cluster Developer Nodes

•Developer Nodes are accessible from gateway and used for testing.

–ssh dev-amd05 – Same hardware as amd05

–ssh dev-intel07 – Same hardware as intel07

–ssh dev-intel10 – Same hardware as intel10

–ssh dev-amd09 – Same hardware as amd09

–ssh dev-gfx10 – Same hardware as gfx10

•We periodically have some test boxes:

–ssh dev-gfx08 – Nvidia Graphics Processing Node

–ssh dev-cell08 – PlayStation 3 Cell processor

–ssh dev-intel09 – 8 core Intel Xeon with 24GB of memory

•Jobs running on the developer nodes should be limited to two hours of walltime.

(20)
(21)

Steps in Using the HPCC

• Connect to HPCC

• Determine required software

• Transfer required input files and source code

• Compile programs (if needed)

• Test software/programs on a developer node

• Write a submission script

• Submit the job

(22)

A couple of examples

•Biological model – long running, many similar but not identical runs

•Multiprocessor BLAST searches

•Multiprocessor Velvet assembly

•Using the HPCC cluster produced more results in less time, with little or no active user management
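
The "many similar but not identical runs" case can be sketched in Python as a parameter sweep that builds one command line per combination of settings. The program name `./model` and its `--mutation-rate` and `--seed` flags are hypothetical; each generated command would then go into its own submission script.

```python
from itertools import product


def sweep_commands(program, parameter_grid):
    """Build one command line per combination of parameter values.

    parameter_grid maps a flag name to the list of values to try.
    """
    names = sorted(parameter_grid)  # fixed order so output is reproducible
    commands = []
    for values in product(*(parameter_grid[name] for name in names)):
        flags = " ".join(f"--{name} {value}" for name, value in zip(names, values))
        commands.append(f"{program} {flags}")
    return commands


# Hypothetical model with two mutation rates and three random seeds: 6 runs.
grid = {"mutation-rate": [0.01, 0.1], "seed": [1, 2, 3]}
commands = sweep_commands("./model", grid)
for command in commands:
    print(command)
```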

(23)

But I don’t need a “cluster”

•Tool selection, setup

•Scripting assistance

•Data “browsing”, sharing, group analysis

•Lab help or training

(24)

Scripting

•Customized, standardized, modify

•Python, Perl, or ?

•We have a growing “collection” available as a Git repository.

•Perhaps you don’t know anything about scripting; or maybe you do, but could use some help?

(25)

Tutorials

•Titus Brown's ANGUS-NGS tutorials, converted for using examples on HPC instead of Amazon

•Using UCSC for certain tasks

•mpiBLAST

•Velvet and Oases

(26)

Seminars and Education

•NextGen Bioinformatics Seminars

•wiki.hpcc.msu.edu/display/Bioinfo/NextGen+Bioinformatics+Seminars

•HPCC Mid-Morning Break

(27)

Setting up an account

•All account requests must come via a PI.

•Have your PI fill in the form at:

www.hpcc.msu.edu/request

•Once received, we will process your request and notify you when your account is ready.

(28)

Bioinformatics Contact

•John Johnston, HPC Programmer

–M-W, 1449 BPS, 884-2572

–Th-F, 505 BMB, 432-7177

[email protected]

•Ticket requests:

–https://rt.hpcc.msu.edu/index.html

–Please include “Bioinformatics Help” in the subject to more quickly route your request.

•wiki.hpcc.msu.edu/display/Bioinfo/Bioinformatics+Support+at+MSU

•www.hpcc.msu.edu

•wiki.hpcc.msu.edu

•forums.hpcc.msu.edu

•rt.hpcc.msu.edu

•mon.hpcc.msu.edu

–www.hpcc.msu.edu/request

–https://rt.hpcc.msu.edu/index.html
