[Figure residue: panels labelled "Fixed Effects", "ESS F" and "Random Effects"; axis label "Parameter Estimate"]
4.5 Discussion
4.5.6 Levels of Inference Arising from Fixed and Random Effects Models
Worsley and colleagues (1992) first suggested the use of a ‘summary statistic’ approach to the analysis of functional neuroimaging data. However, the implementation used in the current study is that of Holmes and Friston (1998), who proposed random effects analyses for balanced designs in neuroimaging, employing a general linear framework to allow for the between-subject variance component in multi-subject designs. As discussed previously, the random effects analysis confers generality, but with a concomitant loss of sensitivity due to the inevitably low degrees of freedom. It was assumed that the inter-session residuals were Normally distributed, and this assumption was incorporated into all random-effects level models. However, examining figure 9 makes it clear that they do not necessarily conform to this distribution. The voxel in figure 9a, for example, has a right-skewed distribution. Indeed, if one asks a simpler question of this voxel (how often is α > 0?) and uses a simple sign test, the probability of getting 31/33 positive α’s is <7×10⁻⁸. Yet this voxel does not pass the random effects analysis used here.
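The sign-test arithmetic for the 31 of 33 case can be checked with a short calculation. The sketch below is illustrative only: it assumes a one-sided exact sign test (the text does not state whether the test was one- or two-sided), in which the number of positive estimates under the null hypothesis follows a Binomial(33, 0.5) distribution. The function name `sign_test_upper_tail` is hypothetical and not taken from the study.

```python
from math import comb


def sign_test_upper_tail(n_positive: int, n_total: int) -> float:
    """Exact one-sided sign test.

    Returns P(X >= n_positive) for X ~ Binomial(n_total, 0.5), i.e. the
    probability of seeing at least this many positive values if positive
    and negative values were equally likely.
    """
    # Count the sign patterns with at least n_positive positives ...
    tail_count = sum(comb(n_total, k) for k in range(n_positive, n_total + 1))
    # ... and divide by the total number of equally likely sign patterns.
    return tail_count / 2 ** n_total


# Assumption: one-sided alternative (estimates tend to be positive).
p_one_sided = sign_test_upper_tail(31, 33)
# A two-sided version simply doubles the tail, capped at 1.
p_two_sided = min(1.0, 2 * p_one_sided)

print(f"one-sided p = {p_one_sided:.3g}")
print(f"two-sided p = {p_two_sided:.3g}")
```

Under these assumptions the one-sided tail is 562/2^33, roughly 6.5×10⁻⁸. The sign test uses only the direction of each session's estimate, not its magnitude, which is why it is insensitive to the skew that troubles the Normal-theory random effects test.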
This is only a single voxel. However, it casts doubt on the assumption of Normality for the inter-session residuals, and it is clear that further investigation is needed into the distribution of between-session variance. The development of random-effects models that do not require prior assumptions about the distribution of residuals may be needed to address this issue.
The use of random effects models in the analysis of fMRI data is a recent addition to the canon of neuroimaging analysis methods. By contrast, it is established practice in experimental psychology to treat subjects as random factors. In fact, a growing number of experimental psychologists now argue that even stimuli should be treated as random factors (reviewed in Siemer, 1997), so that experimental results can be generalised to the population of stimuli.
These concerns are valid ones. Yet it is wise to learn from previous debates concerning these topics. Although he was primarily concerned with the generalisability of experimental stimuli, Clark’s (1973) initial proposal that multilevel modelling should be used more frequently highlighted an obvious problem. Treating a sample as random does not mean it has actually been selected randomly from a population. Although I could argue that, by using a random effects analysis in this study, it is possible to generalise the results to the subject as a putative population, there has been only a very limited sampling of the subject’s responses. All scanning sessions were collected over a two-month period, and session times were selected in a biased manner: near midday and near 6 pm. It is not elegant, however, to have to state that ‘the results generalise to the population of possible sessions sampled from the subject over a period of two months, using the resources available in the laboratory’. In practice, these caveats are usually accepted. Indeed, the use of random effects models to ensure the correct level of inference in multisubject fMRI analyses rarely addresses the other sources of systematic variation in the population to which investigators are generalising (usually male, Caucasian right-handers who respond to advertisements and financial reward). However, adopting a random effects model does afford some protection against inappropriate generalisation of results, as noted by Clark and colleagues (1976) in the reply to their critics.
Although I have shown that with an appropriate statistical model and a large sample of sessions it is possible to obtain robust results, a number of issues remain unanswered. In particular, I would hesitate before generalising the current results to other centres, subjects, or activation paradigms, as between-session variance may vary greatly depending on the context in which it is studied.
4.6 Conclusions
In this chapter I have described the results of an experiment designed to examine intersession variance in fMRI during the performance of simple visual, motor and cognitive tasks by a single subject. A number of interesting points are raised by the data.
First, when the data are analysed session by session, it is evident that binarised statistical maps, though convenient, are not a useful tool for the evaluation of intersession variability. When each multisession dataset was examined by paradigm, evidence of significant session-by-condition interactions was found. This result demonstrates that session context effects have a significant effect on fMRI data, and illustrates that a single session should be considered merely as a single sample of a subject’s responses to the experimental intervention employed. As a large number of sessions across all paradigms were studied, I then compared the results of analysing these data using either fixed- or random-effects linear models, the latter being a recent addition to neuroimaging analysis. Although random effects analyses are certainly useful, as they allow inference about experimental effects to be extended to the population from which the sessions were sampled, the random effects model used here may be invalid. The assumption of Normally distributed inter-session residuals was not supported by close examination of some of the data, and thus future work is required before random effects models can be used to their full potential. Finally, I acknowledge that identifying the origin and magnitude of the different sources of intersession variance in fMRI is crucial. The ability to differentiate between variability caused by the neurovascular signals that fMRI measures, and variability introduced by the means of measurement and analysis of these signals, is essential.
How do these results illuminate those of Chapter Three? While being of methodological interest in their own right, they illustrate that due care must be taken before fMRI will be able to fulfil its promise as a technique sensitive enough to use in longitudinal studies of human brain function. More importantly from my own perspective, they usefully demonstrate that all studies, even those with simple hypotheses, must acknowledge the intrinsic variability in the fMRI response - whatever its ultimate origin. However, even with the variability demonstrated here, many studies have produced robust and repeatable results using fMRI to study cognitive paradigms. In the next chapter I explore whether optimising both the stimulus equipment and stimulus protocols may lead to a better delineation of somatotopical detail.