• No results found

Introduction to Scientific Computing

N/A
N/A
Protected

Academic year: 2021

Share "Introduction to Scientific Computing"

Copied!
77
0
0

Loading.... (view fulltext now)

Full text

(1)

Introduction to

Scientific Computing

“ what you need to learn now to decide what you need to learn next”

Bob Dowling

University Computing Service [email protected]

(2)

1. Why this course exists 2. Common concepts and

general good practice Coffee break

(3)

Why does this course exist?

Basics C/Fortran

Scripts

!

MPI

e-Science

grid computing

(4)

Any program can be copied

from one machine to another.

Common misconceptions

Every line in a program takes the same time to run.

Everything has to fit in the same program.

Programs with source code have to be built every time they're run.

(5)

Common concepts

and good practice

(6)

A simple program

A B C B D

Linear execution

A B C B D

Not constant speed

A B C B D

Repeated elements

(7)

A B C B D

Lines in a script

Chunks of machine code Calls to external libraries Calls to other programs

(8)

B B B B

B

“Parallel” program

A

C C C C

C

B B B B B D Much harder to program ingle nstruction ultiple ata S I M D

(9)

Machine architectures

MIPS

Opteron

i586

Sparc

Power G5

Xeon

i386

CPU

Pentium

RAM

Cache

L2

L1

L3

Pipeline

Memory

DMA

bus

I/O

Internal communication

(10)

Operating

system

support

System libraries Kernel Your program Maths library Kernel Science library C library Support libraries

(11)

“Floating point” problems

e.g. numerical simulations

Universal principles:

0.1 → 0.1000000000001 and worse…

“Program Design:

(12)

Other problems

e.g. sequence comparison text searching

^f.*x$ firebox fix

fusebox

“Pattern matching using Regular Expressions”

(13)

Split up the job

bash

input data

Python Fortran

output data

gnuplot

graphical Bits of the job

Different tools &

“Glue”

“Divide and conquer”

(14)

Choose a suitable

tool for each bit

FORTRAN

Java

C++

Perl

MATLAB

gnuplot

bash

SPSS

Excel

Python

What tools? How to pick

the right one? Pros & Cons

(15)

“Glue”

Splitting up

↑↓

Gluing together

1. Pipeline

2. Shell script 3. GUI

(16)

Pipeline

make_initial

Course: “Unix: Introduction to the Command Line Interface”

process_lots input_file input_file make_initial post_process | process_lots output_file > post_process output_file | < Redirection Piping

(17)

#!/bin/bash -e job="${1}"

if

while

post_process < "${job}.dat" fi

make_initial < "${job}.in" > "${job}.out" then

[ ! -f "${job}.dat" ]

done

process < "${job}.dat" > "$ {job}.new" do work_to_do "$ {job}.dat"

Shell

script

Course: “Simple shell scripting for scientists”

(18)

Graphical User Interface

User input Data output

(19)

“Lumps”

Splitting up

↑↓

Gluing together

“Objects” “Modules” “Functions” “Units”

(20)

Structured programming

(21)

Never repeat code

a_norm = 0.0

for i in range(0,100):

a_norm += a[i]*a[i]

b_norm = 0.0

for i in range(0,100):

b_norm += b[i]*b[i]

c_norm = 0.0

for i in range(0,100):

c_norm += c[i]*c[i] …

(22)

Structured code

a_norm = norm(a)

b_norm = norm(b)

c_norm = norm(c)

v_norm = 0.0

for i in range(0,100): v_norm += v[i]*v[i] return v_norm

Calling the function

Single instance of the code

def norm(v): Define a function called “norm()”

(23)

Structured code

Once!

Function

code

Test

function Debug

function Timefunction

Improve

function All good practice follows structuring.

(24)

Improved code

def norm(v):

a_norm = norm(a)

b_norm = norm(b)

c_norm = norm(c)

w = []

for i in range(0,100): w.append(v[i]*v[i]) w.sort()

v_norm = 0.0

w[i]

Improve function return v_normv_norm +=

(25)

for i in range(0,len(v)):

More flexible code

def norm(v):

a_norm = norm(a)

b_norm = norm(b)

c_norm = norm(c)

w = []

v_norm = 0.0

w[i]

Improve function return v_normv_norm +=

for i in range(0,

w.sort()w.append(v[i]*v[i])

):

(26)

More “Pythonic” code

def norm(v):

a_norm = norm(a)

b_norm = norm(b)

c_norm = norm(c)

w = [item*item for item in v] w.sort()

v_norm = 0.0

Make

function more

“pythonic” return v_norm

for item in w:

(27)

Best sort of code

a_norm = … b_norm = … c_norm = … import library .norm(a) library .norm(b) library .norm(c) library Get someone else to do all the work!

(28)

Libraries

Written by experts In every area

Learn what

libraries exist in your area

Use them

Save your effort for your research

(29)

An example library

Numerical Algorithms Group: “NAG

● Roots of equations

● Differential equations ● Interpolation

● Linear algebra ● Statistics

● Sorting

(30)

Unit testing

“Programming is like sex: one mistake and you're providing support for a lifetime.”

— Michael Sinz

Program split into “units”.

Test each unit individually.

Catch bugs earlier.

(31)

Debuggers

Step through running code. Examine state as you go.

Better to write debugging code in your program!

“Each new user of

a new system uncovers a new class of bugs.”

(32)

The optimiser

Finds short cuts

in your code. Leave it to the system!

“Premature optimisation is the root of all evil

(or at least most of it) in programming.”

— Donald Knuth

Structured code optimises better.

(33)

Algorithms

Size of input /

Required accuracy Time taken /

Memory used vs.

(34)

Example: Matrix multiplication

for(int i=0; i<N, i++) {

for(int j=0; j<P, j++) {

for(int k=0; k<Q, k++) {

a[i][j] += b[i][k]*c[k][j]

}

} }

for(int j=0; j<P, j++) {

for(int k=0; k<Q, k++) {

(35)

Example: Matrix multiplication

(

A11 A21

A12

A22

)(

B11 B21

B12

B22

)

(

C11 C21

C12

C22

)

=

M1=(A11+A22)(B11+B22) M2=(A21+A22)B11

M3=A11(B12‒B22) M4=A22(B21‒B11) M5=(A11+A12)B22

M6=(A21‒A11)(B11+B12) M7=(A12‒A22)(B21+B22)

C11=M1+M2‒M5+M7 C12=M3+M5

C21=M2+M4

(36)

UCS advice

[email protected]

Advice on… libraries

techniques algorithms

(37)

Coffee break

15 minutes Wrists

Spine Eyes

(38)

FORTRAN

Java

C++

Perl

MATLAB

gnuplot

bash

SPSS

Excel

Python

Choosing a language

?

What's … available? used already? suitable? best?

(39)

Classes of language

Interpreted Compiled

Shell

script Perl Python Java C,C++,Fortran

What the system sees

What you do

What

files get created

(40)

Shell scripting languages

#!/bin/bash job="${1}" …

Several scripting languages:

/bin/sh /bin/csh

/bin/bash

/bin/tcsh

/bin/ksh /bin/zsh

(41)

Shell script

Suitable for…

gluing programs together

“wrapping” programs small tasks

Easy to learn

Very widely used

Unsuitable for… performance-critical jobs

floating point GUIs

(42)

“Simple shell scripting for scientists”

UCS courses

“Unix: Introduction to the

(43)

“Further shell scripting”?

Python!

(44)

High power scripting languages

Python

Perl

#!/usr/bin/python import library

#!/usr/bin/perl use library;

… Call out to libraries

(45)

Perl

The “Swiss army knife” language

Suitable for…

text processing

data pre-/post-processing small tasks

CPAN: Comprehensive

Perl Archive Network Widely used

Bad first language Easy to write

unreadable code

“There's more than one way to do it.” Beware Perl geeks

(46)

Python

Suitable for…

text processing

data pre-/post-processing small & large tasks

Built-in comprehensive library of functions

Scientific Python library

Excellent first language

Easy to write

maintainable code The “Python way”

“Batteries included”

Code nesting style is “unique”

(47)

“Python: Introduction for absolute beginners”

“Python: Introduction for programmers”

“Python: Further Topics”

(48)

“Python: Interoperation with Fortran” “Python: Operating system access”

UCS courses

“Python: Regular expressions” “Python: Unit testing”

(49)

Spreadsheets

Microsoft Excel

OpenOffice.org calc Apple Numbers

(50)

Spreadsheets

Taught at school Taught badly at school!

Easy to tinker Easy to corrupt data

(51)

“Excel 2007: Introduction” “Excel 2007: Beginners”

“Excel 2007: Functions and Macros” “Excel 2007: Managing Data & Lists”

UCS courses

“Excel 2007: Analysing &

(52)

Specialist systems

Database → PostgreSQL, Access, … Graphs → gnuplot, ploticus, …

GUIs → Glade

Mathematics → Mathematica, MATLAB, Maple Statistics → SPSS, Stata

(53)

Drawing graphs

Manual vs. automatic gnuplot

ploticus

matplotlib

(54)

GUIs

Glade

Back end: C, C++, Perl, Python

(55)

Mathematical manipulation

(56)

Mathematical manipulation

(57)

Mathematical manipulation

Suitable for… “fiddling”

Small and medium numerical work

Graphical subsystem

Problems… additional “modules”

(58)

UCS courses

“Mathematica: Basics”

“Mathematica: Graphics”

“Mathematica: Linear algebra” “MATLAB: Basics”

“MATLAB: Linear algebra” “MATLAB: Graphics”

(59)

Statistics

(60)

UCS courses

“SPSS: Introduction for beginners” “SPSS: Beyond the basics”

“Stata: Introduction”

“Stata for regression analysis” “Regression analysis in R”

(61)

Compiled languages

No specialist system and scripts not

fast enough Compiled language

C

C++

Fortran Java

Library

requirement with no script interface

(62)

libc.so.6 compiling linking source code object files executable fubar.c main() pow() zap() snafu.c pow() zap() printf() fubar.o main() pow() zap() snafu.o pow() zap() printf() fubar main() pow() zap() printf() printf()

(63)

“Somebody else's code”

“Unix: Building, installing and running software”

(64)

You don't need to write the whole

program in a compiled language!

Python Fortran

f2py

(65)

Compiled languages

C

C++

Fortran Java

(66)

Fortran

The best for numerical work Excellent numerical libraries

Unsuitable for everything else Very different versions:

(67)

Fortran courses

“Fortran: Introduction to modern Fortran”

“Python: Interoperation with Fortran”

(68)

C

Excellent libraries

Superceded by C++

The best for Unix work

(69)

Very basic C course

“C: Introduction for Those New to Programming”

(70)

C++

Standard template library

Very hard to learn well

Extension of C Object oriented

(71)

Learning C++

“Thinking in C++, 2nd ed.”

Eckel, Bruce (2003)

(two volumes: 800 and 500 pages!)

“Programming: principles and practice using C++”

Stroustrup, Bjarne (2008)

(72)

B B B B

B

A

C C C C

C

B B B B

B

D

Parallel programming

“Parallel programming: Introduction to MPI”

Fortran, C, C++ MPI library

“Parallel programming: Options and design”

(73)

Java

Some poorly thought out libraries Multiple versions: Use >= 1.5

1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6

Object oriented

Good general purpose language

Much easier to learn and use than C++ Cross-platform

(74)

Java courses

Computer Lab Java courses

IA: “Object Oriented Programming”

(75)

Other UCS courses of interest

“vi editor: introduction”

“emacs editor: introduction”

“Programming concepts: introduction for absolute beginners”

“Program design: organising and structuring programming tasks”

(76)

Courses

All UCS courses

http://training.csx.cam.ac.uk/ucs/theme

Scientific computing “theme”

http://www-uxsup.cam.ac.uk/courses/

(77)

Contacts

UCS help desk

[email protected]

Scientific computing support

References

Related documents

This article will focus on the possible impacts of Brexit on financial services regulation in relation to three areas linked to executive remuneration: the

XXXIV Simposio de la Asociación Española de Economía – Valencia, Spain VIII Jornadas de Economía Laboral – Zaragoza, Spain. XII Encuentro de Economía Aplicada –

The purpose of this quantitative, nonexperimental, correlational study was to examine the relationship of demographic variables, sense of autonomy, and positive relationships

and issued a permanent injunction. § 360aaa for the relevant statute, as well as the Final Rule on the Dissemination of Information on Unapproved/New Uses for Marketed Drugs,

x Vehicles moving in heterogeneous traffic streams maintain variable lateral gaps and this behavior is explained using macroscopic traffic characteristic called area occupancy..

Electroencephalography, event-related potentials, psychophysiological recording and data analyses, polysomnography (sleep recording), impedance cardiography, pupillometry,

These demonstrate the theory developed in the thesis, allowing the principles behind the computation schemes of Chapter 3 to be physically realized, as well as the quasiparticles

is denoted d k. Graphical plots of d k ’s are shown in Fig. This type of graphical plots shows where the mismatch is the largest, which can help to interpret the relevancy of the