Introduction to High-Performance Computing and the Supercomputing Institute
Carlos P Sosa
Cray, Inc. and Biomedical Informatics and Computational Biology
Agenda
• What is High-Performance Computing?
• Floating-Point Operations
• Cost per Gflops
• High-Performance Computing Historical Perspective
• Introduction to MSI facilities
• Show you how to access our systems
• Point you to where to go for help
• Brief introduction to Linux/UNIX and some useful commands
• Ensure that you are not overwhelmed
• Encourage you to ask questions of MSI staff to get what you need
What is High-Performance Computing?
High-performance computing (HPC) uses supercomputers to solve advanced computational problems.
Today, computer systems approaching the teraflops region are counted as HPC computers.
A teraflops is a measure of a computer's processor speed: one trillion (10^12) floating-point operations per second.
http://en.wikipedia.org/wiki/High-performance_computing
International System of Units
Floating-Point Operations
FLOPS (or flops or flop/s) stands for floating-point operations per second
• Measure of a computer's performance
• Floating-point is a method of representing real numbers
• The "S" stands for "second"; conservative speakers consider "FLOPS" to be both the singular and plural of the term
• FLOP (or flop) is used as an abbreviation for "FLoating-point OPeration"
• Flop count is a count of these operations in a given section of a computer program
• FLOPS is not an SI unit
4/1/2013 5
Why “Floating” Point
significant digits × base^exponent
The radix point (decimal point or, more commonly in computers, binary point) can "float", that is, be shifted left or right.
How to Calculate Mflops?
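The rate is the floating-point operation count divided by the elapsed time in seconds, scaled by 10^6. The following is a minimal sketch, assuming a hypothetical helper named mflops and an operation count obtained separately (for example, by counting the operations in a loop by hand). It is not part of any MSI tool.

```shell
# mflops FLOP_COUNT SECONDS: print millions of floating-point operations per second.
# The helper name and both arguments are illustrative assumptions.
mflops() {
    awk -v flop="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", flop / (secs * 1e6) }'
}

# Example: a dense n x n matrix multiply performs about 2*n^3 operations.
# For n = 1000 that is 2e9 operations; suppose the run took 4 seconds.
mflops 2000000000 4
```

The example run corresponds to 500 Mflops. Dividing by 10^9 instead of 10^6 would give Gflops.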
What’s a petaflop?
One quadrillion calculations per second!
If you multiplied two 14-digit numbers together once per second, it would take:
• 32 years to complete 1 billion calculations
• 32 thousand years to complete 1 trillion calculations
• 32 million years to complete 1 quadrillion calculations
32 years ago, Star Wars was released
32 thousand years ago, early cave paintings were completed
32 million years ago, the Alps were rising in Europe
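The "32 years" figure can be checked with a short calculation. This is a sketch, assuming one calculation per second and a 365.25-day year; the helper name years_for is hypothetical.

```shell
# years_for COUNT: years needed to perform COUNT calculations at one per second,
# using a 365.25-day year (31,557,600 seconds).
years_for() {
    awk -v n="$1" 'BEGIN { printf "%.1f\n", n / (365.25 * 24 * 3600) }'
}

years_for 1000000000        # one billion calculations
years_for 1000000000000     # one trillion calculations
```

One billion seconds comes to roughly 31.7 years, which the slide rounds to 32; each further factor of 1000 in the count scales the years by 1000.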
Binary versus Decimal
• 2^10 is very nearly equal to 1000, so computer professionals started using the SI prefix "kilo" to mean 1024
• Not everybody "knows" what a megabyte is
• For computer memory, most manufacturers use megabyte to mean 2^20 = 1 048 576 bytes
• Manufacturers of computer storage devices usually use the term to mean 1 000 000 bytes
• Local area networks have used megabit per second to mean 1 048 576 bit/s
• Telecommunications engineers use it to mean 10^6 bit/s
• As if two definitions of the megabyte were not enough, a third megabyte of 1 024 000 bytes is used to format the familiar 90 mm (3 1/2 inch), "1.44 MB" diskette
The confusion is real, as is the potential for incompatibility in standards and in implemented systems.
http://physics.nist.gov/cuu/Units/binary.html
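The three competing "megabyte" definitions can be compared with plain shell arithmetic. This is a small illustrative sketch; the variable names are arbitrary.

```shell
# Three byte counts that have all been called a "megabyte".
binary_mb=$((1024 * 1024))     # 2^20, used for computer memory
decimal_mb=$((1000 * 1000))    # 10^6, used by storage manufacturers
diskette_mb=$((1000 * 1024))   # mixed definition used for the "1.44 MB" diskette

echo "binary megabyte:   $binary_mb bytes"
echo "decimal megabyte:  $decimal_mb bytes"
echo "diskette megabyte: $diskette_mb bytes"

# Capacity of a "1.44 MB" diskette: 1440 units of 1024 bytes.
diskette_bytes=$((1440 * 1024))
echo "1.44 MB diskette:  $diskette_bytes bytes"
```

The diskette holds 1 474 560 bytes, which equals 1.44 only under the mixed 1 024 000-byte definition.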
Binary Notation
April 13 Cray Inc. 10
Historical Change in Cost per GFlops
HPC Historical Perspective
High-performance computers were introduced in the 1960s and were designed primarily by Seymour Cray at Control Data Corporation (CDC)
● Led the market into the 1970s
● Founded Cray Research
● Big irons dominated the market (1985–1990)
● 1980s: the decade of the minicomputer
● Mid-1990s: "supercomputer market crash"
Big Irons
1985 Cray-2
Supercomputers Peak Performance
http://www.reed-electronics.com/electronicnews/article/CA508575.html?industryid=21365
Why Should I Care About HPC?
http://pubs.acs.org/cen/science/87/8715sci3.html
April 13, 2009 “The Looming Petascale”
“Chemists gear up for a new generation of supercomputers”
“The new petascale computers will be 1,000 times faster than the terascale supercomputers of today, performing more than 1,000 trillion operations per second. And instead of machines with thousands of processors, petascale machines will have many hundreds of thousands that simultaneously process streams of information.”
“This technological sprint could be a great boon for chemists, allowing them to computationally explore the structure and behavior of bigger and more complex molecules.”
Towards More Realistic Systems
Enabling and Scaling Biomolecular Simulations of 100 Million Atoms on Petascale Machines with a Multicore-optimized Message-driven Runtime. Chao Mei, Yanhua Sun, Gengbin Zheng, Eric J. Bohm, Laxmikant V. Kale, James C. Phillips, Chris Harrison. SC ’11, November 12-18, 2011, Seattle, Washington
Introduction to the Minnesota Supercomputing Institute (MSI)
MSI At a Glance
• Manages approximately 400 software packages and 50 research databases.
• Main data center in Walter Library.
• MSI operates five laboratories on campus, mostly serving the Life Sciences.
• Serves more than 500 PI groups with 3000+ users.
• MSI is an academic unit of the University of Minnesota (under OVPR) with about 45 full time employees.
• Dedicated resources for serial workflows, database management, and cloud access.
[Figure labels: BSCL, MSI Offices, HPC Resources, Labs]
Eligibility
• Faculty members at the University of Minnesota
• University of Minnesota academic professionals
• Faculty researchers at other accredited institutions of post-secondary education in the state of Minnesota
• Some software may not be available to
MSI Resources
HPC Resources • Koronis • Itasca • Calhoun • Cascade • GPUT
Laboratories • BMSDL • BSCL • CGL • SDVL • LMVL
Software • Chemical and Physical Sciences • Engineering • Graphics and Visualization • Life Sciences • Development Tools
User Services • Consulting • Tutorials • Code Porting • Parallelization • Visualization
Access
• System access information is available at www.msi.umn.edu/access.html
• myMSI for additional resources
• For building access, contact [email protected] x6-0802
• Information can be found in the "General Information" page for each lab (www.msi.umn.edu/labs/)
Remote Access to MSI lab Linux systems
You cannot log in remotely to most machines at MSI directly
Two options to connect to MSI lab systems
● ssh login.msi.umn.edu
● NX (requires the user to download and install a client)
You cannot run software directly from the login or NX hosts
isub
HPC Resources
Koronis: SGI Altix
  1140 Intel Nehalem cores, 2.96 TB of memory
Itasca: Hewlett-Packard 3000BL
  8728 Intel Nehalem cores, 26 TB of memory
Calhoun: SGI Altix XE 1300
  1440 Intel Xeon Clovertown cores, 2.8 TB of memory
Cascade:
  8 Dell compute nodes, 32 Nvidia M2070 GPGPUs
GPUT:
Laboratories
Scientific Development and Visualization (SDVL) • Application development • Computational Physics • Workshops
Basic Sciences Computing (BSCL) • Structural Biology • Batch processing • Stereo Projection system
Computational Genetics (CGL) • Bioinformatics • Genomics • Proteomics
Biomedical Modeling, Simulation, and Design (BMSDL) • Drug Design • Molecular Modeling • Computational chemistry
LCSE-MSI Visualization (LMVL) • Virtual reality • Large screen • Remote visualization
Software
Approximately 400 software applications
www.msi.umn.edu/sw
Classified into three service levels
www.msi.umn.edu//resources/software/service-levels
● Primary
● Ancillary
● Minimal
Disk Resources
• Labs and HPC systems (other than Koronis) have shared home directories
• Disk space is allocated to a group
• 50 GB initial quota on Koronis.
• Every lab and HPC machine has access to at least 1 TB of scratch space
• On HPC systems your scratch quota is 50 GB
• You may request a quota increase if necessary.
• PIs may request project space for large data and group collaboration.
Tutorials/Workshops
Introductory
● Unix, Linux, remote computing, job submission, queue policy
Programming & Scientific Computation
● Code parallelization, programming languages, math libraries
Computational Physics
● Fluid dynamics, space physics, structural mechanics, material science
Computational Chemistry
● Quantum chemistry, classical molecular modeling, drug design, cheminformatics
Computational Biology
● Structural biology, computational genomics, proteomics,
Support
• MSI is a resource dedicated to enabling and improving the quality of computational research at the University of Minnesota
• Basic research support is made available to researchers at minimal or no cost
• MSI participates with researchers who are seeking funding for projects that require MSI consultants or developers
Projects
Internal Services
● Assist with long term development projects
● Wide range of project activities
● Tailored to specific research initiatives or development programs
● At cost
External Services
www.msi.umn.edu/services
Acknowledgement of MSI in publications / grant proposals is requested.
Integrate MSI Resources into Course Work
Class accounts
● Access to MSI on a semester basis for all students in a course
Customized workshops and guest lectures
● When our expertise overlaps with your interests, e.g., parallel programming, integration of computational assignments in a field
Reserve the LMVL
● Use the stereoscopic capabilities of the visualization lab for high impact structure and model visualization
MSI web pages
MSI home page ● www.msi.umn.edu
Software ● www.msi.umn.edu/sw
Password reset ● www.msi.umn.edu/password
Tutorials ● www.msi.umn.edu/tutorial
FAQ ● www.msi.umn.edu/support/faq.html
Why Modules?
• The Modules environment management package provides support for dynamic modification of the user environment via modulefiles
• Each modulefile contains all the information needed to configure the shell for a particular application
• Advantages:
  • You are not required to specify explicit paths for different executable versions or to set the $MANPATH and other environment variables manually.
  • All the information is embedded in the modulefile
  • Modules can be loaded and unloaded dynamically and atomically, in a clean fashion.
  • All popular shells are supported, including bash, ksh, zsh, sh, csh, and tcsh, as well as some scripting languages such as perl
• As a user, you can add and remove modulefiles from your current shell environment
• The environment changes performed by a modulefile can also be viewed by using the module command
Module Commands
Command           Description
module list       Lists modules currently loaded in a user’s environment
module avail      Lists all available modules on a system in condensed format
module avail -l   Lists all available modules on a system in long format
module display    Shows environment changes that will be made by loading a given module
module load       Loads a module
module unload     Unloads a module
module help       Shows help for a module
module swap       Swaps a currently loaded module for an unloaded module
What is Loaded Now?
cpsosa@silver:~> module list
Currently Loaded Modulefiles:
  1) local   3) user   5) torque   7) base
  2) vars    4) moab   6) suacct
Re-Initializing the Module Command
Modules software functionality is highly dependent upon the shell environment being used
Sometimes when switching between shells, modules must be re-initialized
For example, you might see an error such as the following:
$ module list
-bash: module: command not found
To fix this, just re-initialize your modules environment:
$ source $MODULESHOME/init/myshell
Where myshell is the name of the shell you are using and need to re-initialize
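The re-initialization step can be wrapped in a small guard so that it only runs when the module command is actually missing. This is a sketch, assuming MODULESHOME is set by the site's Modules installation; the function name reinit_modules is hypothetical, and the shell name you pass should match the shell you are running.

```shell
# reinit_modules [SHELL_NAME]: source the Modules init file for SHELL_NAME
# (default: bash) if the `module` command is not currently defined.
# The function name is an illustrative assumption.
reinit_modules() {
    shell_name=${1:-bash}
    if command -v module >/dev/null 2>&1; then
        return 0    # module is already available; nothing to do
    fi
    init_file="${MODULESHOME:-}/init/$shell_name"
    if [ -r "$init_file" ]; then
        . "$init_file"
    else
        echo "cannot read $init_file; is MODULESHOME set?" >&2
        return 1
    fi
}
```

After defining the function, running reinit_modules bash followed by module list should behave like the manual source command above.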
What is Available?
To see which modulefiles are available on your system, enter this command:
% module avail [string]
The module avail command produces an alphabetical listing of every modulefile in your module use path and has no option for "grepping." Therefore, it is usually more useful to give the command a string argument:
cpsosa@silver:~> module avail gcc
--- /soft/modules/modulefiles ---
gcc/4.3.3  gcc/4.4.0  gcc/4.4.3(default)
Loading and Unloading Modules
If a modulefile is not already loaded, use the module load command to load it.
% module load modulefile
This command loads the currently defined default version of the module, unless you specify otherwise
cpsosa@silver:~> module list
Currently Loaded Modulefiles:
  1) local   3) user   5) torque   7) base
  2) vars    4) moab   6) suacct
cpsosa@silver:~> module load dock/6.3
cpsosa@silver:~> module list
Currently Loaded Modulefiles:
  1) local   4) moab     7) base       10) ompi/1.2.9/xl
  2) vars    5) torque   8) xlc/10.1   11) dock/6.3
  3) user    6) suacct   9) xlf/12.1
cpsosa@silver:~> module unload dock/6.3
Module Swapping
Alternatively, you can use the module swap or module switch command to unload one module and load the comparable module
cpsosa@silver:~> module load gcc/4.3.3
cpsosa@silver:~> module list
Currently Loaded Modulefiles:
  1) local   3) user   5) torque   7) base
  2) vars    4) moab   6) suacct   8) gcc/4.3.3
cpsosa@silver:~> module swap gcc/4.4.0
cpsosa@silver:~> module list
Currently Loaded Modulefiles:
  1) local   3) user   5) torque   7) base
  2) vars    4) moab   6) suacct   8) gcc/4.4.0
Lab 1: Test Modules
1. $ ssh itasca
2. $ module list
3. Type: $ g09
4. Type: $ module load g09
5. Type: $ g09
Portable Batch System (PBS)
• The Portable Batch System (PBS) is a queuing system installed
for job batch processing
• It matches job requirements with available resources
• It ensures that machines are fully used and resources are
distributed among all users
• In contrast to the HPC system queues, the Lab system queues
do not require Service Units (SUs) in order for jobs to run
http://wiki.ccs.tulane.edu/index.php5/Portable_Batch_System_%28PBS%29
Lab Compute Nodes
Node Details
Nodes              Model                         CPU cores                       Memory
labh01             PowerEdge R900                24 Intel Xeon 2.67 GHz          128 GB
labh02             SunFire X4440                 16 AMD Opteron 8384 2.7 GHz     128 GB
labh03 - labh08    SunFire X4600                 32 AMD Opteron 8356 2.3 GHz     128 GB
mirror1-mirror16   Altix XE 310                  8 Intel Xeon 2.66 GHz           16 GB
lab001-lab064      Altix XE 310                  8 Intel Xeon 2.66 GHz           16 GB
laboc01-laboc04    LiquidCool Liquid Submerged   12 Intel Xeon X5690
How to Create a Job Submission Script
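1. Log in to lab.msi.umn.edu
2. Write a PBS batch script
3. Example:
   1. 1-hour job to run on a single processor of a single node
   2. 1 GB of memory
   3. Load any preferred modules
   4. Run any software that can operate in batch mode
#!/bin/bash -l
# Request 1 node, 1 processor per node, 1 GB of memory, and 1 hour of walltime
#PBS -l nodes=1:ppn=1,mem=1gb,walltime=01:00:00
# Send mail when the job aborts (a), begins (b), and ends (e)
#PBS -m abe
# Move to the working directory, load modules, and run the program
cd /home/msi/username/Testpbs
module load intel
./test < input.dat > output.dat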
How to Submit a Job
• Use the command qsub to submit a job to the queuing system
• qsub takes a job submission script that contains special commands telling PBS what resources are needed
• It also contains the commands necessary to run the submitted job. Example:
$ qsub script.pbs
To submit to a different queue:
$ qsub -q lab-long script.pbs
How to Check Job Status
• You can check job status using the showq command
$ showq -u username
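Submission and status checking can be combined in a small wrapper. The following is a sketch only: the function name submit_job is hypothetical, and it assumes qsub prints the new job's identifier on standard output, as PBS/Torque installations commonly do.

```shell
# submit_job SCRIPT [QUEUE]: submit SCRIPT with qsub and pass through whatever
# qsub prints (normally the job identifier). QUEUE is optional.
# The function name is an illustrative assumption.
submit_job() {
    script=$1
    queue=${2:-}
    if [ ! -r "$script" ]; then
        echo "cannot read job script: $script" >&2
        return 1
    fi
    if [ -n "$queue" ]; then
        qsub -q "$queue" "$script"
    else
        qsub "$script"
    fi
}
```

A possible use is jobid=$(submit_job script.pbs lab-long) && showq -u "$USER", which submits to the lab-long queue and then lists your jobs. Checking that the script is readable first avoids a round trip to the scheduler for a simple typo in the file name.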
Lab 2: submit a very simple script
• Go to the scratch file system: $ cd /scratch
• Create a directory: $ mkdir myaccount
• Move to that directory: $ cd myaccount
• Create a directory: $ mkdir test
• Move to that directory: $ cd test
• Copy file: $ cp ~cpsosa/gaussian/test/* .
• Submit your first script: $ qsub test.pbs
Example: Electronic Structure Calculation
Schrödinger Equation
[Equation: Schrödinger equation, with labels for the N-body wavefunction, the spatial positions and charges of the individual particles, and the energy of either the ground or an excited state of the system]
http://www.physics.uc.edu/~pkent/thesis/pkthnode12.html
Gaussian 09: an Electronic Structure Suite of Programs
Gaussian 09 is an electronic structure package capable of predicting many properties of atoms, molecules, and reactions, e.g.
• molecular energies and structures
• vibrational frequencies
• molecular orbitals
• and much more …
utilizing ab initio, density functional theory, semi-empirical, molecular mechanics, and hybrid methods.
Gaussian History
Gaussian70, Gaussian76, Gaussian77, Gaussian78, Gaussian80, Gaussian82, Gaussian83, Gaussian85, Gaussian86, Gaussian88, Gaussian90, Gaussian 92, Gaussian93, Gaussian 94, Gaussian95, Gaussian96, Gaussian 98, Gaussian 03, Gaussian 09
John Pople
Born: 31 October 1925, Burnham-on-Sea, Somerset, England
Died: March 15, 2004 (aged 78), Chicago, Illinois, United States
Nationality: England
Fields: Theoretical chemistry, Quantum chemistry, Computational chemistry
Alma mater: Cambridge University
Doctoral advisor: John Lennard-Jones
Input Format
% Resource management
# Route card
(blank line)
Title section
(blank line)
Molecular coordinates
(blank line)
Geometric variables
(blank line)
Other input options
(blank line)
Input File
%mem=32mb
#p hf/6-31g opt                   computational model (hf/6-31g), type of calculation (opt)

hf/6-31g optimization of water    title

0 1                               charge & multiplicity
o                                 structure definition (z-matrix)
h 1 oh
h 1 oh 2 aoh

oh=0.9                            variable values
aoh=104.0
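One way to create this input file from the command line is a shell here-document. The sketch below writes the water optimization input from this slide, without the annotations, to a file named h2o.com (the file name used in Lab 3).

```shell
# Write the water optimization input to h2o.com.
# The quoted 'EOF' keeps the shell from expanding anything inside the file.
cat > h2o.com <<'EOF'
%mem=32mb
#p hf/6-31g opt

hf/6-31g optimization of water

0 1
o
h 1 oh
h 1 oh 2 aoh

oh=0.9
aoh=104.0

EOF
```

The blank lines matter: they separate the route card, title, structure definition, and variable values, and the file ends with a blank line.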
Route Card
Description:
– specifies keywords and options
– always begins with a # character
– keywords can be specified in any order
– options are grouped in parentheses, ()
– keywords should not be repeated
– route section can be up to 5 lines long
– ended with a blank line
Syntax:
Type of Calculation
• single point energy and properties
• geometry optimization
• reaction path following/searching
• frequency
Computational Method
Level of theory:
– molecular mechanics
amber, dreiding, uff
– semi-empirical
am1, pm3, mndo, …
– density functional theory
b3lyp, mpwpw91, custom …
– ab initio
hf, mp2, ccsd, qcisd, …
– hybrid
Lab3: Run Gaussian09
1. Copy files: $ cp ~cpsosa/gaussian/g09/* .
2. Build your h2o.com input file
3. Build your h2o.pbs
4. Submit your h2o.pbs to the queue
5. Visualize results
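A possible h2o.pbs for step 3 is sketched below. It is an assumption-laden example, not the script distributed with the lab: the module name g09 is taken from Lab 1, the resource request is copied from the earlier single-processor example, and the output file name h2o.out is arbitrary.

```shell
# Write a job script for the water calculation to h2o.pbs.
# Assumptions: the module is named g09 (as in Lab 1) and the resource
# request matches the earlier single-processor example.
cat > h2o.pbs <<'EOF'
#!/bin/bash -l
#PBS -l nodes=1:ppn=1,mem=1gb,walltime=01:00:00
#PBS -m abe
# PBS_O_WORKDIR is the directory from which qsub was run.
cd "$PBS_O_WORKDIR"
module load g09
g09 < h2o.com > h2o.out
EOF
```

With h2o.com in the same directory, the job would then be submitted with qsub h2o.pbs.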
Lab 4: Visualize Results
1. Load the gaussian module if you have not already loaded it: $ module load gaussian
2. Run GaussView by typing: $ gview
Questions?
• MSI help desk is staffed Monday through Friday from 8:30AM to 7:00PM
• Walk-in help available in room 569 Walter
• Phone 612.626.0802
Thank You!
http://www.cray.com
http://r.umn.edu/academics-research/bicb/
http://www.msi.umn.edu
Many thanks to:
• Nancy Rowe for providing the MSI slides