
2.2 Confirmatory Analyses

As we mentioned in Section 2.1.2, the ultimate goal of functional brain imaging is traditionally considered to be the creation of activation maps, that is, the detection of voxels that are active in response to a given stimulus. Consider a simple on/off block design experimental paradigm where a stimulus is presented only in odd-numbered blocks. The confirmatory⁴ approach treats the problem of analysis as an instance of classical statistical testing. The experimental condition is viewed as an independent variable and the brain response in each voxel is treated as the dependent variable. We observe the response both during treatment (on blocks) and control (off blocks) and aim to confirm that the effect size of treatment relative to control is statistically significant.

⁴ The approach is more commonly termed hypothesis-driven in the field, as opposed to a data-driven analysis. Here, following the terminology of Tukey (1977), we refer to it as confirmatory, in contrast to exploratory, simply because we find the more common terms slightly misleading.

Naturally, the null hypothesis assumes that the presence of the stimulus does not alter the response. Under the null hypothesis, the probability that the effect size is as large as or larger than the observed value constitutes the statistical significance of the effect, the so-called p-value. A small p-value indicates that the observation is unlikely under the null hypothesis, which may then be rejected. Thresholding the significance value at a desired level yields the detection results; thresholds of about $p = 10^{-4}$ are common in fMRI analysis. Piecing together the results of detection, or the significance values, across voxels in space provides us with the activation map for the stimulus of interest.

One of the main challenges for this simple framework stems from the fact that it treats the essentially multivariate dimension of space in a mass univariate fashion. Since the map is created from a large number of separate statistical tests, tens of thousands in a typical data set, the classical multiple comparison problem arises. If the noise properties of the data exactly match our assumptions, the significance threshold corresponds to the probability of type-I error, i.e., the probability that a non-active voxel is declared active. Among 50,000 samples from the null distribution, there are on average 5 voxels that pass the extremely conservative threshold of $p = 10^{-4}$. In other words, if the tests are independent, the probability that the collective result of our tests on such a sample is accurate, that is, free of false positives, will be minuscule: $(1 - 10^{-4})^{50000} \approx 0.007$. The two main approaches to confirmatory analysis, parametric and nonparametric tests, both provide techniques for dealing with the multiple comparison problem, as we shall describe next.
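The arithmetic behind these figures can be checked directly. The following is a minimal sketch that assumes, as the product form above implicitly does, that the voxel-wise tests are independent; the variable names are illustrative only.

```python
import math

n_voxels = 50_000   # number of voxel-wise tests quoted in the text
alpha = 1e-4        # per-voxel significance threshold

# Expected number of null voxels that pass the threshold by chance.
expected_false_positives = n_voxels * alpha

# Probability that no null voxel passes the threshold
# (assumption: the tests are statistically independent).
p_no_false_positive = (1.0 - alpha) ** n_voxels

# Probability of at least one false positive across the whole map.
p_any_false_positive = 1.0 - p_no_false_positive

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"P(no false positive):     {p_no_false_positive:.4f}")
print(f"P(any false positive):    {p_any_false_positive:.4f}")
```

For large numbers of tests the product is well approximated by $e^{-n\alpha}$, which is why the result is so close to $e^{-5}$. Real fMRI data are spatially correlated, so the independence assumption is only a rough guide.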


2.2.1 Statistical Parametric Mapping

Parametric tests explicitly formulate the null hypothesis for fMRI data. In the framework of the General Linear Model (GLM), Equation (2.3) can be used to derive a variety of different statistical tests for fMRI observations (Friston et al., 1995c). For instance, consider an experiment of visual object recognition where experimental conditions 1 and 2 correspond to presentations of face and object images, respectively. Say we are interested in detecting brain areas that demonstrate significantly larger responses to face images compared to images of objects. We may form a linear contrast $c^t \hat{b}_i$, with $c = [1, -1, 0, \ldots, 0]^t$, to compute the effect size in each voxel. Assuming white Gaussian noise $\epsilon_i$ in voxel $i$, under the null hypothesis that $c^t b_i = 0$, we can show (Friston et al., 2007):

$$\frac{c^t \hat{b}_i}{\sqrt{\frac{1}{T-D}\, \hat{\epsilon}_i^t \hat{\epsilon}_i \; \tilde{c}^t (A^t A)^{-1} \tilde{c}}} \sim \text{Student}(T-D), \tag{2.5}$$

where $T$ is the number of points in each time course, $\hat{\epsilon}_i$ is the residual, $D$ is the overall number of conditions and confounds (columns of $A$), $\text{Student}(T-D)$ is a Student's t-distribution with $T-D$ degrees of freedom, and $\tilde{c} \in \mathbb{R}^D$ is a zero-padded extension of the contrast vector $c$. This expression defines the t statistic and provides the grounds for the t-test, which is probably the most widely used test in fMRI analysis. We can now compute the significance of the face-object contrast in each voxel by computing the p-value for the observed summary statistic based on the null Student's t-distribution.

⁴ (continued) Any analysis, whether exploratory or confirmatory, is based on a hypothesis of some kind and also uses the data in some fashion to make the inference. The hypothesis-driven versus data-driven dichotomy creates the impression that the latter avoids making any hypotheses about the data. Yet, no inference is possible unless we assume some structure in the data.
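The t statistic of Equation (2.5) can be illustrated for a single voxel as follows. This is a minimal sketch under stated assumptions rather than the procedure of any particular software package: the block design, the effect sizes, the noise level, and the function name `contrast_t_test` are all hypothetical, and the data are simulated with white Gaussian noise. The test is one-sided, matching the question of whether the face response is larger than the object response.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical design: two condition regressors plus a constant confound.
T = 120
face = np.tile(np.r_[np.ones(10), np.zeros(20)], 4)
obj = np.tile(np.r_[np.zeros(15), np.ones(10), np.zeros(5)], 4)
A = np.column_stack([face, obj, np.ones(T)])

# Simulated voxel time course with a stronger response to faces (assumed values).
b_true = np.array([2.0, 0.5, 10.0])
y = A @ b_true + rng.normal(scale=1.0, size=T)


def contrast_t_test(y, A, c_tilde):
    """One-sided t-test of c_tilde^t b > 0 under white Gaussian noise."""
    n_time, n_cols = A.shape
    b_hat, *_ = np.linalg.lstsq(A, y, rcond=None)    # least-squares estimate
    resid = y - A @ b_hat                            # residual time course
    sigma2_hat = resid @ resid / (n_time - n_cols)   # noise variance estimate
    # Variance of the contrast estimate: sigma^2 * c^t (A^t A)^{-1} c
    var_contrast = sigma2_hat * (c_tilde @ np.linalg.solve(A.T @ A, c_tilde))
    t_stat = (c_tilde @ b_hat) / np.sqrt(var_contrast)
    p_value = stats.t.sf(t_stat, df=n_time - n_cols)  # upper-tail probability
    return t_stat, p_value


# Face-minus-object contrast, zero-padded over the confound column.
c_tilde = np.array([1.0, -1.0, 0.0])
t_stat, p_value = contrast_t_test(y, A, c_tilde)
print(f"t = {t_stat:.2f}, p = {p_value:.3g}")
```

In a whole-brain analysis the same computation is repeated for every voxel, which is what gives rise to the multiple comparison problem.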

The F-test is an alternative parametric test that has applications in the Analysis of Variance (ANOVA). This test aims to answer the question of whether a number of experimental conditions significantly contribute to the fMRI signal in a given voxel. For example, consider a case where we are interested in detecting voxels where any of the experimental conditions elicits a significant brain response. In other words, we would like to ask whether the first term on the right-hand side of Equation (2.3) is needed for explaining the observed variance in the signal at voxel $i$. The null hypothesis in this case corresponds to $b_i = 0$. Let $P = I_{T \times T} - A(A^t A)^{-1} A^t$ and $P_0 = I_{T \times T} - F(F^t F)^{-1} F^t$ be projection matrices onto subspaces orthogonal to the spaces of the full and reduced models, respectively. Once again, assuming white Gaussian noise, we find (Friston et al., 2007):

$$\frac{T-D}{S}\, \frac{y_i^t (P_0 - P)\, y_i}{y_i^t P\, y_i} \sim F(S, T-D), \tag{2.6}$$

where $S$ is the number of experimental conditions (regressors) and $F(S, T-D)$ denotes an F-distribution with degrees of freedom $S$ and $T-D$. The denominator of the F statistic is the residual of the entire model, $y_i^t P y_i$, while its numerator is the difference between the residuals of the reduced and full models. We can use the F-distribution to compute a p-value for the observed summary statistic in each voxel to test whether the variance in the response is due to any of the experimental conditions or not (omnibus F-test). More generally, an F-test can be used with any arbitrary partitioning of the design matrix $A$ into $G$ and $F$. Note that when an F-test is applied to only one regressor, it becomes equivalent to a two-sided t-test, since the F statistic is then the square of the t statistic (Friston et al., 2007).
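The F statistic of Equation (2.6) can be sketched in the same spirit. The code below is an illustration under assumptions, not a reference implementation: the design, the simulated data, and the helper names `residual_projector` and `f_test` are hypothetical. Following the verbal description above, the denominator uses the residual of the full model. The last part checks numerically that the F statistic for a single regressor equals the square of the corresponding t statistic.

```python
import numpy as np
from scipy import stats


def residual_projector(X):
    """Projector onto the orthogonal complement of the column space of X."""
    return np.eye(X.shape[0]) - X @ np.linalg.pinv(X)


def f_test(y, G, F):
    """F-test of the regressors in G against the reduced model F."""
    A = np.column_stack([G, F])       # full design matrix
    n_time, n_cols = A.shape
    S = G.shape[1]                    # number of tested regressors
    P = residual_projector(A)         # full-model residual projector
    P0 = residual_projector(F)        # reduced-model residual projector
    f_stat = (n_time - n_cols) / S * (y @ (P0 - P) @ y) / (y @ P @ y)
    p_value = stats.f.sf(f_stat, S, n_time - n_cols)
    return f_stat, p_value


# Hypothetical design and simulated voxel time course (assumed values).
rng = np.random.default_rng(1)
T = 120
face = np.tile(np.r_[np.ones(10), np.zeros(20)], 4)
obj = np.tile(np.r_[np.zeros(15), np.ones(10), np.zeros(5)], 4)
G = np.column_stack([face, obj])      # experimental conditions
F = np.ones((T, 1))                   # confound: constant baseline
y = G @ np.array([2.0, 0.5]) + 10.0 + rng.normal(size=T)

# Omnibus test: does any experimental condition contribute to the signal?
f_omnibus, p_omnibus = f_test(y, G, F)

# Single-regressor case: test the face regressor alone.
G1 = face[:, None]
F1 = np.column_stack([obj, np.ones(T)])
f_single, _ = f_test(y, G1, F1)

# Matching t statistic for the face coefficient in the same full model.
A1 = np.column_stack([G1, F1])
b_hat = np.linalg.lstsq(A1, y, rcond=None)[0]
resid = y - A1 @ b_hat
sigma2_hat = resid @ resid / (T - A1.shape[1])
t_single = b_hat[0] / np.sqrt(sigma2_hat * np.linalg.inv(A1.T @ A1)[0, 0])

print(f"omnibus F = {f_omnibus:.1f}, p = {p_omnibus:.3g}")
print(f"single-regressor F = {f_single:.3f}, t^2 = {t_single**2:.3f}")
```

Forming the $T \times T$ projectors explicitly keeps the code close to the notation of the text; for long time courses one would normally compute the two residual sums of squares from least-squares fits instead.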

Due to the multiple comparison problem, thresholding raw significance maps computed from the above voxel-wise tests results in a large number of false positives. The Bonferroni correction is a simple way to handle this problem by dividing the desired threshold on the false positive rate by the number of voxels tested. Since this correction may be overly conservative, a more moderate technique is to use false discovery rate (FDR) control (Benjamini and Hochberg, 1995) for adjusting the threshold on the significance maps (Genovese et al., 2002). Worsley et al. (1996b) proposed
