4.1.4 Application to fMRI data

As an application to real data, we consider measurements from functional Magnetic Resonance Imaging (fMRI). Data from fMRI experiments are used to identify activated regions in the human brain. The data presented here were collected in an experiment in which a test person was exposed to a visual stimulus for a period of 30 seconds followed by 30 seconds of rest. During the alternating sequence of 4 phases of rest and 3 phases of stimulus, data were recorded at T = 70 time points, 3 seconds apart. At each time point, brain activity is measured on a three-dimensional grid of N pixels (or voxels).

Typically, the quality of such data suffers from several sources of random error during the recording process, e.g. due to movement of the test person. In addition, there is a systematic distortion of the original ON-OFF stimulus into the signal perceived by the brain. Therefore, extensive preprocessing is necessary, usually carried out in the form of a regression model. More precisely, the observation y_it for voxel i = 1, ..., N at time t = 1, ..., T is obtained by correcting the measurements for time trends and systematic transformations; see Gössl, Auer & Fahrmeir (2000) for a thorough discussion of this matter. After preprocessing, usually a spatial analysis

70 4. Further Topics in Clustering Partition Models

is performed in order to identify activated areas in the brain. Our focus is solely on the latter part.

We consider data for one time point and one horizontal layer of pixels. Therefore, we have a two-dimensional lattice with n = 2948 pixels. The preprocessed data were taken from Lang & Brezger (2003), who also give further details on the preprocessing step. Furthermore, they report results for a spatial analysis of these data based on two-dimensional Bayesian P-splines. The aim of the analysis is to detect activated regions of the brain and separate them from non-activated regions. Given the nature of the stimulus, activated pixels will mainly occur in the visual center of the brain. Although the data are discrete, a continuous model is reasonable due to the extremely large number of pixels. To speed up the analysis in our discrete model, we restrict the image to 1179 pixels located in the rear part of the brain, which contains the visual center.

We have analyzed data for three time points: t1 = 18, t2 = 38, and t3 = 58. These correspond to the first, second, and third period of stimulus, respectively. Detailed results are only reported for t3 = 58 since this seems to be the roughest data. The large number of pixels is close to the limit of what the CPM can handle. We have increased the burn-in and the lag to obtain acceptable autocorrelations, especially for the number of clusters k. All results were collected in a run with 102,000,000 iterations, including a burn-in of 2,000,000 and a lag of 20,000. Thus, posterior quantities are based on 5000 samples. We have used three different priors for the number of clusters k: uniform, geometric with parameter c = 0.02, and a rather informative Poisson prior with parameter µ = 30. The latter choice was based on a visual inspection of the data alone. Surprisingly, differences in the posterior median estimates were found to be small. This indicates a strong spatial structure in the data which is discovered under all three priors. However, there seems to be no objective justification for an informative Poisson prior. Therefore, we will present results for the uniform prior in detail.
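The sample count quoted above follows from simple bookkeeping. A minimal Python sketch of that arithmetic, assuming that every lag-th iteration after the burn-in is stored (the function name is illustrative and not taken from the original implementation):

```python
def retained_samples(total_iterations, burn_in, lag):
    """Number of draws kept after discarding the burn-in and
    storing every lag-th iteration of the remaining chain."""
    if lag <= 0:
        raise ValueError("lag must be positive")
    if burn_in > total_iterations:
        raise ValueError("burn-in cannot exceed the total number of iterations")
    return (total_iterations - burn_in) // lag


# Run settings reported in the text.
n_kept = retained_samples(total_iterations=102_000_000,
                          burn_in=2_000_000,
                          lag=20_000)
print(n_kept)  # 5000
```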

In Figure 4.5 the data and the posterior median estimates are displayed. Both show a strong spatial structure with rather extreme sudden changes. The estimates are plausible, with large areas of almost constant values around zero. In general, this result is desired since zero corresponds to non-activated regions. Areas with estimated levels above zero mainly coincide with the known location of the visual center in the human brain.

To produce such a clear structure, the partition model is limited to few viable partitions (compared to the enormous number of possible partitions). Therefore, one would expect low acceptance rates for the partition-changing moves. However, these were acceptable: about 9% for the birth and death moves, nearly 10% for a switch, and over 44% for a shift. Still, the data seem to support only few partitions, while unsuitable partitions are often rejected. Further analysis of the posterior distribution of the cluster centers confirms this assumption. Figure 4.6 displays the posterior probability of each pixel being selected as a cluster center. There are 785 out of 1179 pixels with a probability below 0.1, whereas only 31 pixels have a probability above 0.5. Note that the framed pixel in the center of the lattice indicates a pixel without observation.
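The per-pixel probabilities of being a cluster center can be estimated as relative frequencies over the stored samples. The following Python sketch illustrates this on a toy example. The representation of a sample as a set of pixel indices, the function name, and the toy values are all assumptions made for illustration; they are not the fMRI results.

```python
from collections import Counter


def center_probabilities(samples, n_pixels):
    """Relative frequency with which each pixel appears as a
    cluster center across the stored MCMC samples."""
    counts = Counter()
    for centers in samples:
        counts.update(set(centers))  # count each pixel at most once per sample
    return [counts[i] / len(samples) for i in range(n_pixels)]


# Hypothetical toy chain: 4 samples on a lattice of 6 pixels.
toy_samples = [{0, 3}, {0, 4}, {0, 3, 5}, {1, 3}]
probs = center_probabilities(toy_samples, n_pixels=6)

# Counts analogous to the thresholds used in the text.
n_low = sum(p < 0.1 for p in probs)
n_high = sum(p > 0.5 for p in probs)
print(probs, n_low, n_high)
```

Applied to the real chain, the same two counts would correspond to the 785 and 31 pixels reported in the text.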

4.1. Image processing 71

[Figure 4.5: two panels titled "data" (left) and "posterior median" (right); axes s (ticks 5 to 25) and t (ticks 10 to 50); value scale with ticks from -50 to 250.]

Figure 4.5: fMRI data for t3 = 58 (left) and posterior median estimates (right).

[Figure 4.6: map of posterior probabilities; scale marks at 0.2, 0.4, 0.6, and 0.8.]

Figure 4.6: Posterior distribution of the cluster centers.

Still, the algorithm discovers a clear spatial structure. While the expected number of clusters is 590 a priori, this number decreases considerably in the posterior, ranging from 80 to 185 with a median of 128. From Figure 4.5, it is apparent that for some pixels almost no smoothing is performed. For example, the large peak of about 215 in the data is only shrunk to about 209 in the posterior.
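The a priori value of 590 is consistent with a uniform prior on k = 1, ..., n with n = 1179 pixels. This support is an assumption here, since it is not restated in this section. A short Python check under that assumption (the function name is illustrative):

```python
def uniform_prior_mean(n_pixels):
    """Mean of a discrete uniform distribution on {1, ..., n_pixels}."""
    support = range(1, n_pixels + 1)
    return sum(support) / n_pixels


# Assumed support: one to 1179 clusters, i.e. at most one cluster per pixel.
prior_mean = uniform_prior_mean(1179)
print(prior_mean)  # 590.0
```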

For comparison, we will consider the results for the same data set obtained with the Bayesian P-spline approach (Lang & Brezger 2003). They propose two different models. Their basic model has a global smoothing parameter, i.e. the variance is assumed to be the same over the whole space. Alternatively, they modify the model and allow spatially varying variances to account for sudden changes in the data. Note that Lang & Brezger (2003) have analyzed the whole layer, while our analysis is only based on a fraction of the pixels. Yet, a comparison will give some insight into the performance of our model.

Although the coarse structure is roughly the same with the P-spline and the CPM approach, a closer comparison of the results reveals some obvious differences. Both P-spline models give very smooth estimates without sharp edges. Even for large areas of values around zero, the estimates are rather wavelike. Moreover, the data are shrunk much more than by the partition model. In particular, the large peak is estimated at 105 and 144 with global and adaptive variance, respectively.
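To put the three peak estimates on a common scale, one can express each as a relative shrinkage of the observed peak. The Python sketch below uses the values quoted in the text (about 215 observed; about 209, 105, and 144 estimated); since the first two are approximate, the percentages are only indicative. The function name and the dictionary labels are illustrative.

```python
def relative_shrinkage(observed, estimate):
    """Fraction by which an estimate is pulled below the observed value."""
    return (observed - estimate) / observed


peak = 215.0  # approximate observed peak
estimates = {
    "CPM": 209.0,                         # approximate posterior median
    "P-spline, global variance": 105.0,
    "P-spline, adaptive variance": 144.0,
}

shrinkage = {name: relative_shrinkage(peak, est) for name, est in estimates.items()}
for name, value in shrinkage.items():
    print(f"{name}: {value:.1%}")
```

Under these values the CPM shrinks the peak by roughly 3%, compared with roughly 51% and 33% for the two P-spline models.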

To summarize, the CPM provides a clearer structure in the estimates than the P-spline models. Extreme values are preserved while smaller changes in the surface are filtered out as noise. This indicates that the CPM prior performs spatially heterogeneous smoothing. Therefore, in the following section we will investigate the smoothing properties of the CPM prior in detail.