
9.10 Shell scripts

Finally, one can parallelize tasks at the level of the shell, even if the programs you write or run aren't themselves parallelized, using a tool like GNU Parallel (see below). Briefly, with GNU Parallel you can split a list of (embarrassingly parallel) tasks across multiple cores even if the program itself is serial in nature. See the GNU Parallel page below and the tutorial page for some examples. In our lab we use GNU Parallel to distribute subject-level brain imaging processing across multiple cores.

• GNU Parallel [121]

Exercises

E 9.1 Here is a serial version of a simple MATLAB program that iterates through a for loop, printing to the screen each time. Your task is to parallelize that for loop so that iterations are distributed among the available compute cores. Note that this is an "embarrassingly parallel" problem, i.e. each iteration of the for loop is completely independent of the others.

Also note that you will likely not see the loop iterations appear in sequential order once the loop is parallelized. When operations are parallelized, the order is essentially arbitrary from one run to another. This is why this method of parallelization is only appropriate for computations where each loop iteration is truly independent of the others.

    for i=1:10
        disp(sprintf('hello from iteration # %d', i))
    end

E 9.2 Parallel bootstrap

As we will find out in my statistics course next semester, we can use random resampling with replacement to simulate drawing samples from a population, and we can use this method to quantitatively assess the probability of the null hypothesis in the context of an experiment and/or some data.

Assume we have the following sample of data x:

x = [3 5 4 3 6 7 -1 2 -3 -4 2 -5 3 2 -1]

and a second sample, y:


The mean of the sample x is 1.533 and the mean of sample y is 2.933. Let's compute a statistic we'll call $d_{samp}$, which is the difference between means, $\bar{y} - \bar{x}$, which in this case is equal to $d_{samp} = 1.400$.

The null hypothesis is that the samples were taken from a single population with the same mean, and so $d_{pop} = 0.00$, and any departure from this in our sample statistic $d_{samp}$ is due to random sampling error.
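To make "resampling with replacement" concrete, here is a minimal sketch of drawing a single bootstrap resample from x in MATLAB. It is only an illustration of the resampling step, not a solution to the questions below; the variable names n, idx and xb are assumptions chosen for this example.

```matlab
% One bootstrap resample of x (illustrative sketch, not the full exercise).
x = [3 5 4 3 6 7 -1 2 -3 -4 2 -5 3 2 -1];
n = numel(x);

idx = randi(n, 1, n);    % n random indices in 1..n, drawn with replacement
xb  = x(idx);            % the resample: same size as x, values may repeat

fprintf('mean of x  = %.3f\n', mean(x));
fprintf('mean of xb = %.3f\n', mean(xb));
```

Because the indices are drawn with replacement, some values of x will typically appear more than once in xb and others not at all, so the mean of xb varies from one run to the next.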

Question 1

Use bootstrapping to compute the probability of the null hypothesis, with one million bootstrap (re)samples. Write your code in the usual serial fashion, using for-loop(s).

Question 2

Rewrite your solution to Question 1 so that the bootstrap is parallelized. In other words, execute the one million bootstrap iterations in parallel, with the work spread over your available compute cores.

Links

[97] http://en.wikipedia.org/wiki/Parallel_computing
[98] http://en.wikipedia.org/wiki/Multithreading_(computer_architecture)
[99] http://en.wikipedia.org/wiki/Symmetric_multiprocessing
[100] http://en.wikipedia.org/wiki/Hyper-threading
[101] http://en.wikipedia.org/wiki/Supercomputer
[102] http://en.wikipedia.org/wiki/Tianhe-2
[103] https://www.sharcnet.ca/my/front/
[104] http://aws.amazon.com/ec2/
[105] http://en.wikipedia.org/wiki/Oracle_Grid_Engine
[106] http://en.wikipedia.org/wiki/Cluster_(computing)
[107] http://en.wikipedia.org/wiki/SETI@Home
[108] http://en.wikipedia.org/wiki/Folding@home
[109] http://en.wikipedia.org/wiki/Botnet
[110] http://en.wikipedia.org/wiki/Distributed_denial-of-service_attack#Distributed_attack
[111] http://en.wikipedia.org/wiki/Grid_computing
[112] http://www.nvidia.com/object/tesla-workstations.html
[113] http://www.nvidia.com/object/cuda_home_new.html
[114] http://www.khronos.org/opencl/
[115] http://www.mathworks.com/discovery/matlab-gpu.html
[116] http://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units
[117] http://www.mathworks.com/products/parallel-computing/
[118] http://www.mathworks.com/help/distcomp/parfor.html
[119] http://www.mathworks.com/help/distcomp/getting-started-with-parfor.html
[120] http://www.mathworks.com/help/distcomp/examples/
[121] http://www.gnu.org/software/parallel/

10 Graphical displays of data

MATLAB comes with lots of built-in functionality for generating both 2-D and 3-D graphics. You will learn a collection of commands (mainly plot()) that will enable you to generate plots and configure their appearance with just about any degree of precision you want. The great advantage of generating graphics purely from scripted commands (as opposed, for example, to pointy-clicky approaches based on graphical user interfaces (GUIs)) is that you can easily and instantly re-generate graphics simply by re-running the script(s). In addition, making changes to your graphics becomes a simple matter of tweaking commands and scripts.
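As a small illustration of the scripted approach, the following sketch draws a labelled line plot. The data (a 3 Hz cosine) and the styling choices are arbitrary assumptions made for this example; the calls used (linspace, figure, plot, xlabel, ylabel, title, grid) are standard MATLAB graphics commands.

```matlab
% Scripted 2-D line plot: re-running this script regenerates the figure.
t = linspace(0, 1, 200);           % 200 evenly spaced points on [0, 1]
v = cos(2*pi*3*t);                 % example data (arbitrary choice)

figure;
plot(t, v, 'r-', 'LineWidth', 2);  % red solid line, 2 points wide
xlabel('t');
ylabel('v');
title('A scripted plot');
grid on;
```

Changing the appearance is then just a matter of editing a line, for example replacing 'r-' with 'b--' for a blue dashed line, and running the script again.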

Rather than writing notes that essentially duplicate the extensive online documentation included with MATLAB, I will instead point you to some initial starting locations where you can learn about generating graphics in MATLAB.

Graphics in MATLAB [123]

Common Graphics Functions in MATLAB [124] (a graphical menu showing different kinds of plots and how to generate them)

2-D and 3-D Plots [125]

The plot() command [126]

Creating 2-D Graphs and Customizing Lines [127]

Add Titles, Axis Labels and Legends to Graphs [128]

Figures with multiple axes using subplots [129]

3-D Surface and Mesh Plots [130]

Shaded Polygons [131]


Exercises

E 10.1 Line Plot

Let x be a vector:

[0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9]

and set y to be equal to sin(2*pi*x).

Generate a line plot that looks like this:

Smoother Line

Change Line Colour

Plot it again but make the line blue, and thick:

E 10.2 Bar Plot

Let y be a vector with the following values:


Generate the following bar plot:

Again with x labels

Plot it again but this time include labels on the x axis:

E 10.3 Multiline plot

    x = [1,2,3,4,5]
    y1 = [1,2,3,4,4]
    y2 = [1,5,6,8,10]
    y3 = [5,4,2,2,1]

Generate a multi-line plot like this:

Add a Legend


E 10.4 Scatterplot

Let x be a vector starting at 1 and ending at 100. Let y be a vector equal to (x * 0.15) + N where N is a vector of random values chosen from a gaussian distribution with mean 0.0 and standard deviation 1.0.

Generate a scatterplot of x vs y:

Let z be equal to ((x * 0.05) + 2) + N where N is a vector of random values chosen from a gaussian distribution with mean 0.0 and standard deviation 1.5.

Generate a scatterplot of y and z vs x, including a legend:

E 10.5 Subplots

Let x be a vector starting at 1 and ending at 100. Let y be a vector equal to (x * 0.15) + N where N is a vector of random values chosen from a gaussian distribution with mean 0.0 and standard deviation 1.0.

Let z be equal to ((x * 0.05) + 2) + N where N is a vector of random values chosen from a gaussian distribution with mean 0.0 and standard deviation 1.5.

Generate a figure with scatterplots of (x vs y) and (x vs z) in separate subplots, stacked vertically:


horizontally

Generate the plot again but stack the subplots horizontally:

more complex arrangements

Generate the plot again with 3 subplots: one on the top, occupying the full width of the figure, in which you plot (x vs y) and (x vs z) overlaid, and two on the bottom, in which you plot (x vs y) in one and (x vs z) in the other:

E 10.6 Histograms

Define a variable y that contains 1000 values drawn from a Gaussian distribution with mean 0.0 and standard deviation 1.0. Plot a histogram of y. For now, just use the default settings for whatever function you find that produces a histogram.


Number of bins

Replot the histogram using 25 bins:

Bin width

E 10.7 Line of best fit

Let x be a vector starting at 1 and ending at 100. Let y be a vector equal to (x * 0.15) + N where N is a vector of random values chosen from a gaussian distribution with mean 0.0 and standard deviation 1.0.

Generate a scatterplot of (x vs y) and overlay on the plot the line of best fit after fitting a linear model to the data: $\hat{y} = \beta_0 + \beta_1 x$.


Add a legend

Add the equation of the line

E 10.8 Surface plot

Define x and y each over a grid [-8,8] in steps of 0.5.

Define x = sin(r)/r


Links

[123] http://www.mathworks.com/help/matlab/graphics.html
[124] http://www.mathworks.com/help/matlab/creating_plots/types-of-matlab-plots.html#btjs9s4-1
[125] http://www.mathworks.com/help/matlab/2-and-3d-plots.html
[126] http://www.mathworks.com/help/matlab/ref/plot.html
[127] http://www.mathworks.com/help/matlab/creating_plots/using-high-level-plotting-functions.html
[128] http://www.mathworks.com/help/matlab/creating_plots/add-title-axis-labels-and-legend-to-graph.html
[129] http://www.mathworks.com/help/matlab/creating_plots/create-graph-with-subplots.html
[130] http://www.mathworks.com/help/matlab/surface-and-mesh-plots-1.html
[131] http://www.mathworks.com/help/matlab/geometry.html
[132] http://www.mathworks.com/help/matlab/graphics-objects.html

11 Signals, sampling & filtering

Whereas signals in nature (such as sound waves, magnetic fields, hand position, electromyograms (EMG), electroencephalograms (EEG), extra-cellular potentials, etc.) vary continuously, in science we often measure these signals by sampling them repeatedly over time, at some sampling frequency. The resulting collection of measurements is a discretized representation of the original continuous signal.
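As a rough sketch of what sampling means in practice, the code below evaluates a stand-in "continuous" signal at evenly spaced sample times. The 5 Hz sine, the 100 Hz sampling frequency, the 2 second duration and all variable names are assumptions chosen purely for illustration.

```matlab
% Sampling a continuous-time signal at a chosen sampling frequency.
sig = @(t) sin(2*pi*5*t);   % stand-in "continuous" signal (5 Hz sine, an assumption)

fs  = 100;                  % sampling frequency in Hz (samples per second)
dt  = 1/fs;                 % time between successive samples, in seconds
dur = 2;                    % duration of the recording, in seconds
ts  = (0:dur*fs) / fs;      % sample times k/fs for k = 0, 1, ..., dur*fs
xs  = sig(ts);              % discretized representation of the signal

fprintf('%d samples, %.3f s apart\n', numel(ts), dt);
```

Only the values in xs are retained; whatever the signal does between two successive sample times is not recorded.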

Before we get into sampling theory, however, we should first talk about how signals can be represented both in the time domain and in the frequency domain.

Jack Schaedler has a nice page explaining and visualizing many concepts discussed in this chapter:

Seeing circles, sines and signals [133]

11.1 Time domain representation of signals

This is how you are probably used to thinking about signals, namely how the magnitude of a signal varies over time. So, for example, a signal s containing a sinusoid with a period T of 0.5 seconds (a frequency of 2 Hz) and a peak-to-peak magnitude b of 2 volts is represented in the time domain t as:

$$ s(t) = \left( \frac{b}{2} \right) \sin(wt) \tag{11.1} $$

where

$$ w = \frac{2\pi}{T} \tag{11.2} $$

We can visualize the signal by plotting its magnitude as a function of time, as shown in Figure 11.1.

Figure 11.1: Time domain representation of a signal.
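A plot of this kind can be generated with a few lines of MATLAB. The sketch below evaluates Equations 11.1 and 11.2 with T = 0.5 s and b = 2 V; the 2 second time span, the 1 ms step and the axis labels are assumptions and may not match the figure exactly.

```matlab
% Time domain representation of s(t) = (b/2) * sin(w*t), with w = 2*pi/T.
T = 0.5;                 % period in seconds (frequency 1/T = 2 Hz)
b = 2;                   % peak-to-peak magnitude in volts
w = 2*pi/T;              % angular frequency in rad/s (Equation 11.2)

t = 0:0.001:2;           % time axis: 2 seconds in 1 ms steps (an assumption)
s = (b/2) * sin(w*t);    % Equation 11.1 evaluated at each time point

figure;
plot(t, s);
xlabel('Time (s)');
ylabel('Signal (V)');
```

With these values the signal swings between -1 V and +1 V and completes two full cycles per second.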