An Introduction to High Performance Computing
in the Department
Ashley Ford & Chris Jewell
Department of Statistics, University of Warwick
Outline
1 Some Background
2 How is Buster used?
3 Software
What is Buster for?
Raw computing power:
Large datasets → lots of memory
Complex algorithms → fast processing
Batch processing
Ability to set your algorithm running, and get on with other work.
Interactive sessions
Manipulating data in real-time.
Cost effectiveness
High-powered centralised computing facility shared among users
System architecture
[Figure: system architecture diagram; labels shown: Internet, Frontend, Fileserver, Execution]
Specifications
Frontend Node
2 x 2.33GHz Intel E5410 Quad Core processors
8GB fully buffered RAM
Storage Cluster
Lustre™ high performance filesystem
17TB RAID storage
Execution Nodes (11 machines, 108 CPUs)
5 nodes: 3.16GHz Intel X5460 Quad Core x2, 16GB FBRAM
2 nodes: 2.93GHz Intel X5570 Quad Core x2, 16GB FBRAM
5 nodes: 2.80GHz Intel X5660 Six Core x2, 48GB FBRAM
What you get
Standard user accounts provide:
Username (same as ITS username) and password
1GB fault-tolerant home storage
Expandable if required
Backed up nightly to ITS central backup service
Default 150GB storage space
Fault tolerant, but NOT backed up!
Up to 800GB scratch space, shared with other users
Files deleted automatically after 14 days
Frontend node
[Figure: system architecture diagram; labels shown: Internet, Frontend, Fileserver]
Logging in
Login is provided over an SSH (secure shell) connection
Hostname: buster.stats.warwick.ac.uk
Provides password-protected access from anywhere on the internet.
Graphical forwarding enabled (e.g. R graphs, text editors)
Clients:
Linux / Mac OS X: native ssh client
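As a minimal sketch of logging in from a Linux or Mac OS X terminal, the lines below build the login command from the hostname above. The `-X` option is the standard OpenSSH flag that requests X11 (graphical) forwarding. The username `your_its_username` is a placeholder, not a real account.

```shell
# Placeholder (an assumption): replace with your own ITS username.
BUSTER_USER="your_its_username"
BUSTER_HOST="buster.stats.warwick.ac.uk"

# -X asks the ssh client to forward X11, so R graphs and
# graphical text editors started on Buster display locally.
LOGIN_CMD="ssh -X ${BUSTER_USER}@${BUSTER_HOST}"

# Print the command; type it at your own prompt to open the session.
echo "${LOGIN_CMD}"
```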
Accessing files
SSH
Encrypted file transfer from anywhere on the internet
Use scp (Linux/Mac OS X) or WinSCP (Windows)
Windows fileshare
Unencrypted, from within campus (still requires password, though)
Home: \\buster\<username>
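For the encrypted route, a hedged sketch of the two usual scp directions follows. The username and the local file `data.csv` are placeholders; the remote path `/storage/<username>/R-output.pdf` simply mirrors the example job script elsewhere in these slides and may not exist on your account.

```shell
# Placeholders (assumptions): replace with your own username and files.
BUSTER_USER="your_its_username"
BUSTER_HOST="buster.stats.warwick.ac.uk"

# Upload: copy a local file into your home directory on Buster
# (a remote path left empty after the colon means the home directory).
UPLOAD_CMD="scp data.csv ${BUSTER_USER}@${BUSTER_HOST}:"

# Download: copy a result file from Buster into the current local directory.
DOWNLOAD_CMD="scp ${BUSTER_USER}@${BUSTER_HOST}:/storage/${BUSTER_USER}/R-output.pdf ."

# Print both commands; run them from your own machine, not from Buster.
echo "${UPLOAD_CMD}"
echo "${DOWNLOAD_CMD}"
```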
The module system
Software packages available via the module command
Allows versioning of software packages
Checks for conflicts between packages
Available packages: module avail
Adding a module (default version): module add R
Adding a module (specific version): module add R/2.8.1
Displaying information: module display R
Submitting jobs
[Figure: system architecture diagram; labels shown: Internet, Frontend, Fileserver, Execution]
Grid Engine
Jobs managed by Grid Engine (Sun/Oracle/Open Grid Scheduler)
Interactive jobs
Batch jobs
Submit a job from the Frontend node, and Grid Engine sends it to a free slot on an execution node
Interactive jobs
Requested via the qlogin command on the frontend node
Short jobs only
e.g. running individual commands in R or interactive Python
Pros: interactivity, graphics, quick and simple to use.
Cons: you lose your job if your connection to Buster is interrupted; lower processor scheduling priority.
Batch Jobs
Submit via a job submission script from the frontend node with qsub.
Ideal for long jobs run in batch mode
Pros:
Allows requests for multiple processors
Provides the task array facility
Saves standard output and error streams to disk
Allows you to log out and get on with something else while your job runs
High-priority processor scheduling
Cons:
You have to write a job submission script
No interactivity/graphics
Batch Jobs
Example job script - /usr/sge/examples/jobs/r-example.sh
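#!/bin/bash
# Grid Engine options: shell to use, and files for standard output and error
#$ -S /bin/bash
#$ -o /storage/$USER/r-example.stdout
#$ -e /storage/$USER/r-example.stderr
# Resource requests: memory limit (h_vmem) and run-time limit (h_rt)
#$ -l h_vmem=500M,h_rt=0

# Set up the environment and load R through the module system
. /etc/profile
module add R

cd /storage/$USER

# Run R non-interactively, reading the commands up to the EOF marker
time R --vanilla << EOF
x <- runif(100)
pdf("R-output.pdf")
plot(x)
dev.off()
EOF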
Submitting Jobs
Simply:
$ qsub <path to script>
Queues
Queue       Maximum run time
veryshort   1 hour
short       12 hours
medium      24 hours
long        48 hours
unlimited   ∞
Batch Jobs
Advanced options
Task arrays - instruct Grid Engine to run many instances of your algorithm as numbered tasks
-t 1-N:5 (i.e. task numbers from 1 to N in steps of 5: 1, 6, 11, ...)
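A minimal sketch of a task-array job script, assuming the standard Grid Engine behaviour of setting the `SGE_TASK_ID` environment variable for each task and expanding `$TASK_ID` in output paths. The range `1-10`, the file names and the seed offset are illustrative assumptions. The task number defaults to 1 so that the script can also be tried outside the queue.

```shell
#!/bin/bash
#$ -S /bin/bash
# Run 10 instances of this script, numbered 1 to 10 (illustrative range).
#$ -t 1-10
# One output file per task; Grid Engine expands $TASK_ID in these paths.
#$ -o /storage/$USER/array-example.$TASK_ID.stdout
#$ -e /storage/$USER/array-example.$TASK_ID.stderr

# Grid Engine sets SGE_TASK_ID for each task in the array.
# Default to 1 so the script also runs when tested outside the queue.
TASK_ID="${SGE_TASK_ID:-1}"

# Hypothetical use of the task number: a distinct random seed per task.
SEED=$((1000 + TASK_ID))

echo "Task ${TASK_ID} would run with seed ${SEED}"
```

Submitted with qsub, each of the ten tasks sees a different task number, so a single script can cover ten simulation settings or input files.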
Parallel environments
For running parallel algorithms only!
Shared memory (smp) or distributed memory (mpi)
-pe <smp | mpi> n
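A hedged sketch of a shared-memory job script follows. It assumes the standard Grid Engine behaviour of setting `NSLOTS` to the number of slots granted, and it assumes (purely for illustration) that the program is multithreaded with OpenMP, which reads `OMP_NUM_THREADS`. The slot count of 4 is arbitrary.

```shell
#!/bin/bash
#$ -S /bin/bash
# Request 4 slots in the shared-memory (smp) parallel environment.
#$ -pe smp 4

# Grid Engine sets NSLOTS to the number of slots granted to the job.
# Default to 1 so the script also runs when tested outside the queue.
NCORES="${NSLOTS:-1}"

# Assumption: the program uses OpenMP threads; tell it how many to start
# so that it does not use more cores than were requested.
export OMP_NUM_THREADS="${NCORES}"

echo "Using ${OMP_NUM_THREADS} thread(s)"
```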
Monitoring jobs
Monitoring jobs:
qstat monitors job status
To see how busy the queue is: qstat -u \*
Killing jobs:
qdel deletes jobs
Requires the job number (use qstat)
To kill all your jobs at once: qdel -u <username>
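The commands above can be combined into a short submit-and-check sequence. This is a sketch only: it assumes the installed Grid Engine version supports `qsub -terse` (print just the job number) and `qstat -j` (detailed status of one job), and it reuses the example script path from the batch job slide. Outside Buster, where qsub is not installed, it only prints a message.

```shell
# Example job script named earlier in these slides.
JOB_SCRIPT="/usr/sge/examples/jobs/r-example.sh"

if command -v qsub >/dev/null 2>&1; then
    # -terse makes qsub print only the job number, which we capture.
    JOB_ID=$(qsub -terse "${JOB_SCRIPT}")
    echo "Submitted job ${JOB_ID}"

    # Show the detailed status of this one job.
    qstat -j "${JOB_ID}"

    # To remove the job again, uncomment the next line.
    # qdel "${JOB_ID}"
    STATUS="submitted"
else
    echo "Grid Engine commands not found; run this on the Buster frontend node."
    STATUS="skipped"
fi
```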
Available Software
Applications: R, Maple, Ox, GGobi, Scilab, Octave
Scripting languages: Python (+numpy), Perl, R, J
Libraries: GSL, ATLAS, LAPACK, Boost, GNU Multiprecision, SPRNG, JAGS, ACML
Compilers: GNU Compiler Collection, Sun Java 6 SE SDK
Buster Support
Using Buster:
1 Command man pages
2 Web documentation: Dept. Homepage → Intranet → Local IT Info → Cluster
3 Forum: New Forum
4 Sysops: Phil Harvey-Smith & Simon Parkes
Help on specific software:
1 Software package documentation
2 Web documentation (FAQs etc.)
3 Mailing lists
4 Google!
Buster Forum
http://forums.warwick.ac.uk/wf → Departments → Stats → Buster
A new forum has been set up to:
1 provide hints and tips
2 share solutions: if you spend a long time solving a problem, others might benefit from your answer
3 make requests for new or upgraded software