The PCA Based Projections Representation: Learning

3.3 Markov Random Field Based Classification

3.3.2 The PCA Based Projections Representation: Learning

Section 2.2.1 described a multivariate distribution representation that uses PCA to select a set of orthogonal vectors within that distribution. The representation is a concatenation of QFs estimated from samples projected onto each vector. This section uses this multivariate distribution representation to represent pixel intensities in compact neighborhoods. In this context, the learned vectors correspond to linear filters, as discussed in more detail below. I term the model built from these local features the PCA-MRF texture model.

PCA-MRF can be compared to the MR8 filter bank. PCA-MRF is computationally more complex but more straightforward. It is more complex because it must learn its filters during training. Also, PCA-MRF learns many filters: n² filters for an n×n neighborhood. Combined with QF-QDA, PCA-MRF leads to a more straightforward classification algorithm than classification using the MR8 filter bank, for several reasons. First, PCA-MRF uses small 3×3 and 5×5 pixel neighborhoods. This is in contrast to MR8, which uses filters computed from 49×49, 41×41, and 25×25 neighborhoods for the MR8-1, MR8-2, and MR8-3 filter banks, respectively. Second, PCA-MRF does not require a preselected filter bank. Third, PCA-MRF requires few normalization steps. The MR8 model uses images with a normalized intensity distribution, filters that are L1 normalized, and responses normalized by either Weber’s Law or the maximum achieved in training. PCA-MRF uses L2 normalized filters and unmodified responses.

At the end of this section, image normalization is discussed and QF-QDA classification using PCA-MRF is shown to not require such normalization. Later, the results section compares classification accuracy using both models.

PCA-MRF can also be compared to the Strong-MRF model. PCA-MRF is not restricted to pairwise pixel features like the Strong-MRF model. In the results section, the PCA-MRF features are shown to be more discriminative than the Strong-MRF features. This allows smaller neighborhoods to be used. However, learning these features increases training complexity. Also, PCA-MRF has a single parameter for QF bin count while the Strong-MRF model has two parameters.

The Learned Linear Filters

Section 2.2.1 presented a multivariate distribution representation based on multiple projections. For the representation, it was shown that PCA computes an ideal set of projection directions. The directions produce maximally uncorrelated coefficients, which minimizes information loss when QFs are independently built for each projection.
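The construction described above can be sketched in a few lines. This is a minimal illustration, not the thesis implementation: the synthetic data, the helper name `quantile_function`, and the choice of a bin-averaged discrete QF are all assumptions made for the example. It checks that coefficients along PCA directions are uncorrelated and then concatenates one QF per projection.

```python
import numpy as np

def quantile_function(samples, n_bins):
    """Assumed discrete QF: mean of the sorted samples in each of n_bins equal-mass bins."""
    ordered = np.sort(samples)
    return np.array([chunk.mean() for chunk in np.array_split(ordered, n_bins)])

rng = np.random.default_rng(0)
# Stand-in data: 1000 correlated 3-D samples (hypothetical, not from the thesis).
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
samples = rng.normal(size=(1000, 3)) @ mixing

# PCA: eigenvectors of the sample covariance, sorted by decreasing eigenvalue.
centered = samples - samples.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
order = np.argsort(eigvals)[::-1]
directions = eigvecs[:, order]            # columns are orthonormal directions

# Project every sample onto every direction.
coeffs = centered @ directions            # shape (1000, 3)

# The projected coefficients are uncorrelated: their covariance is diagonal.
coeff_cov = np.cov(coeffs, rowvar=False)
off_diagonal = coeff_cov - np.diag(np.diag(coeff_cov))

# One QF per projection, concatenated into a single feature vector.
n_bins = 4
feature = np.concatenate([quantile_function(coeffs[:, j], n_bins)
                          for j in range(coeffs.shape[1])])
```

With 3 directions and 4 values per QF, `feature` has 12 entries. The off-diagonal covariance entries vanish up to rounding, which is the decorrelation property that motivates building each QF independently.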

Linear projections have special meaning for distributions of intensities in a pixel neighborhood. Projection onto a vector is a linear function of a pixel neighborhood, an operation identical to applying a linear image filter. Therefore, a vector is an L2 normalized linear filter, and projecting all the samples of a distribution onto a vector is equivalent to image convolution. In this context, PCA can be considered an approach to learning a task-specific filter bank composed of minimally correlated, orthogonal linear filters.
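The equivalence between projection and filtering can be checked numerically. The sketch below uses a random stand-in image and a random L2 normalized direction, both hypothetical. Strictly speaking, sliding the unflipped kernel over the image is cross-correlation, and convolution in the signal-processing sense uses the flipped kernel, so the sketch assumes the unflipped convention.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(1)
image = rng.normal(size=(6, 7))        # stand-in image (hypothetical data)
n = 3                                  # neighborhood size

# A random direction in the 9-D neighborhood space, L2 normalized.
direction = rng.normal(size=n * n)
direction /= np.linalg.norm(direction)

# View 1: treat every n x n neighborhood as a sample and project it.
patches = sliding_window_view(image, (n, n))   # shape (4, 5, 3, 3)
samples = patches.reshape(-1, n * n)           # one row per neighborhood
projections = samples @ direction

# View 2: slide the direction, reshaped to an n x n kernel, over the image.
kernel = direction.reshape(n, n)
rows, cols = image.shape[0] - n + 1, image.shape[1] - n + 1
response = np.empty((rows, cols))
for r in range(rows):
    for c in range(cols):
        response[r, c] = np.sum(image[r:r + n, c:c + n] * kernel)
```

Both views produce the same 20 numbers, one per valid neighborhood position, so a QF estimated from the projections is a QF of the filter response.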

In this section, a common set of filters is learned across classes. The filters are computed using PCA on a random sample of 400 pixels from every training image, pooled across classes. Figure 3.9 shows all 9, 25, and 49 filters learned for 3×3, 5×5, and 7×7 neighborhoods, in order of decreasing eigenvalue across the rows. PCA seeks directions that best represent the samples. Such generative directions have goals similar to those of lossy image compression techniques. There is a strong resemblance between the learned directions and the basis of the discrete cosine transform (DCT), which is used in JPEG image compression. Figure 3.10 shows the DCT for an 8×8 image patch. Both methods generate orthogonal, nonlocal vectors; a local vector, by contrast, is constrained to a portion of the image patch.
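The filter learning step can be sketched as follows. This is an illustration under stated assumptions rather than the thesis code: the function names `sample_patches` and `learn_filters` are hypothetical, the training images are smoothed random noise standing in for CUReT images, and the mean patch is subtracted before PCA.

```python
import numpy as np

def sample_patches(image, n, count, rng):
    """Draw `count` random n x n neighborhoods from `image` as flattened rows."""
    max_r = image.shape[0] - n + 1
    max_c = image.shape[1] - n + 1
    rs = rng.integers(0, max_r, size=count)
    cs = rng.integers(0, max_c, size=count)
    return np.stack([image[r:r + n, c:c + n].ravel() for r, c in zip(rs, cs)])

def learn_filters(images, n, per_image=400, seed=0):
    """Pool patches over all images; return n*n PCA filters, largest eigenvalue first."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([sample_patches(img, n, per_image, rng) for img in images])
    centered = pooled - pooled.mean(axis=0)   # assumption: mean patch removed before PCA
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    filters = eigvecs[:, order].T.reshape(n * n, n, n)
    return filters, eigvals[order]

# Stand-in "training images": smoothed noise, purely illustrative.
rng = np.random.default_rng(42)
images = []
for _ in range(5):
    noise = rng.normal(size=(64, 64))
    smooth = (noise + np.roll(noise, 1, axis=0) + np.roll(noise, 1, axis=1)) / 3.0
    images.append((smooth - smooth.mean()) / smooth.std())

filters, eigvals = learn_filters(images, n=3)
flat = filters.reshape(9, 9)   # each row is one L2 normalized filter
```

For a 3×3 neighborhood this yields 9 filters that form an orthonormal basis, since they are eigenvectors of a symmetric covariance matrix. The filters learned from this synthetic data will not resemble those in Figure 3.9.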

The filter banks used for texture classification have distinctly different properties from the vectors found through PCA. Many of these differences arise because filter bank design focuses on discrimination while PCA focuses on generative vectors. Filter banks are typically spatially smooth and have locality about the center pixel. They are often selected to have certain invariances, such as rotational invariance. Filter banks are also often not constrained to have linear responses. The MR8 filter bank, for example, takes as a response the maximum of several filters. These properties of preselected filter banks generally tend to increase their discriminative power, especially when training sets are limited. Limited training sets could also prevent PCA from learning sufficiently general directions. The sensitivity of PCA-MRF to training set size is examined later in this section. Many of these desirable properties of preselected filter banks could be incorporated into a more complex, PCA based learning process.

Figure 3.9: All 9, 25, and 49 principal directions found by applying PCA to 3×3 (left), 5×5 (center), and 7×7 (right) image patches pooled across classes. Each set represents both a learned filter bank and an uncorrelated, orthogonal basis.

Figure 3.10: The discrete cosine transform for an 8×8 image patch. Image taken from [dct].

Table 3.4: Classification accuracy of QF-NN and QF-QDA using PCA-MRF with and without image normalization. The QF-QDA results demonstrate that 3×3 pixel neighborhoods are sufficient to discriminate the CUReT materials. QF-QDA is also insensitive to image normalization compared to QF-NN.

Classification accuracy (%)

Neighborhood        QF-NN               QF-QDA
Size             Raw     Norm.       Raw     Norm.
1x1             51.65    63.04      69.31    65.60
3x3             74.95    93.09      98.76    99.62
5x5             78.92    94.75      99.28    99.72

Results

Classification results using PCA-MRF are given in Table 3.4 and Figure 3.11. Unless otherwise specified, all results use 32 values per QF. I first discuss the results using normalized images in Table 3.4; the other results are discussed later in this section. QF-QDA achieves accuracies of 99.62% and 99.72% using 3×3 and 5×5 neighborhoods, respectively. Previous work has pointed out that small, compact neighborhoods specify the CUReT materials. However, these 3×3 results allow the stronger statement that the CUReT materials can be completely distinguished by simple 3×3 neighborhoods. As mentioned in Section 3.3.1, Varma & Zisserman report accuracies of 95.87% and 97.22% using their MRF model with 3×3 and 5×5 neighborhoods, respectively [VZ03]. Pietikainen et al. report an accuracy of 87% using LBPs constrained to a 3×3 neighborhood [PNMT04].

QF-QDA using PCA-MRF outperforms QF-QDA using MR8-1, MR8-2, and Strong-MRF, and it is equivalent to QF-QDA using MR8-3. This finding holds not only for the results in Table 3.4 based on a training set of size 46, but also for all training set sizes, as shown in Figure 3.11 (top).

[Figure 3.11 plots: classification accuracy (%) versus training set size per material (top) and versus number of values per QF (bottom), for MR8-3M, 5x5 PCA-MRF, and 3x3 PCA-MRF, each with QF-NN and QF-QDA.]

Figure 3.11: Accuracy of PCA-MRF compared to the MR8-3M filter bank for varying training set and QF sizes. The 3×3 and 5×5 PCA-MRF QF-QDA results are very similar to those using the hand-tuned, non-linear, MR8-3M filter bank.

Similar to the MR8 QF-QDA classifier, the PCA-MRF QF-QDA classifier is also compact. PCA-MRF QF-QDA achieves an accuracy of 99.04% using just 4 values per QF and a 3×3 neighborhood, for a compact 36-dimensional representation. Figure 3.11 (bottom) shows the sensitivity of PCA-MRF to QF size. QF-QDA results using PCA-MRF are very similar to MR8-3M results. The only exception is the failure of PCA-MRF when only one QF value is used. One QF bin is equivalent to the mean of each projection, which has been effectively normalized to zero for every image. Therefore, this failure is expected. The MR8 filter bank model avoids this problem by taking the absolute value of several of the filter responses (before taking their maximum).
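The one-value failure and the 36-dimensional feature length can be illustrated with a small sketch. It assumes a bin-averaged discrete QF, which is consistent with the statement that one QF bin equals the mean. The helper `quantile_function` and the response values are hypothetical.

```python
from statistics import fmean

def quantile_function(samples, n_bins):
    """Assumed discrete QF: mean of the sorted samples in each of n_bins equal-mass bins.

    Assumes len(samples) is a multiple of n_bins to keep the sketch short.
    """
    ordered = sorted(samples)
    size = len(ordered) // n_bins
    return [fmean(ordered[i * size:(i + 1) * size]) for i in range(n_bins)]

# Hypothetical filter responses for one image, already zero mean.
responses = [-1.5, -0.5, -0.25, 0.0, 0.25, 0.5, 0.5, 1.0]

one_bin = quantile_function(responses, 1)    # collapses to the sample mean
four_bins = quantile_function(responses, 4)  # keeps some of the distribution's shape

# Feature length for a 3x3 neighborhood: 9 filters, 4 values per QF.
n_filters, values_per_qf = 3 * 3, 4
feature_length = n_filters * values_per_qf
```

Here `one_bin` is `[0.0]` for any zero-mean response, so it carries no class information, while `four_bins` is `[-1.0, -0.125, 0.375, 0.75]` and `feature_length` is 36.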

Image Normalization

I also examine the dependence of classification using PCA-MRF on image normalization. All results discussed so far have used preprocessed images whose marginal intensity distributions have zero mean and unit standard deviation. Table 3.4 gives classification results with and without this normalization. The results show that the normalization is crucial for the QF-NN classifier. In contrast, with no normalization QF-QDA performs nearly as well for 3×3 and 5×5 neighborhoods and actually better for 1×1 neighborhoods. Since changes in mean and standard deviation are linear changes to QFs, this information can be useful for classification, as it is for 1×1 neighborhoods, or easily down-weighted in the covariance matrix when more discriminating texture information is available.
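The claim that mean and standard deviation act linearly on QFs can be verified directly. The sketch assumes a bin-averaged discrete QF and a positive gain. The intensities, `gain`, and `offset` are hypothetical values chosen for illustration.

```python
from statistics import fmean

def quantile_function(samples, n_bins):
    """Assumed discrete QF: mean of the sorted samples in each of n_bins equal-mass bins.

    Assumes len(samples) is a multiple of n_bins to keep the sketch short.
    """
    ordered = sorted(samples)
    size = len(ordered) // n_bins
    return [fmean(ordered[i * size:(i + 1) * size]) for i in range(n_bins)]

intensities = [12.0, 40.0, 7.0, 33.0, 25.0, 18.0, 51.0, 29.0]  # hypothetical pixel values
gain, offset = 2.0, 10.0   # hypothetical contrast and brightness change; gain > 0

qf_original = quantile_function(intensities, 4)
qf_changed = quantile_function([gain * v + offset for v in intensities], 4)

# Because a positive gain preserves the sort order, the QF transforms the same way.
qf_predicted = [gain * q + offset for q in qf_original]
```

`qf_changed` matches `qf_predicted` up to rounding. An offset adds a constant vector to the QF and a gain rescales it, which is the kind of variation a covariance-based classifier such as QDA can model or down-weight.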

The effect of this normalization can also be considered geometrically in the space of QFs. As mentioned in Section 2.1.4, all zero mean, unit standard deviation distributions exist on a hypersphere. Since all of the marginals in the various neighborhoods are approximately identically distributed, this normalization makes the concatenated QF vectors live in an approximately spherical space. This could confound the linear estimation performed by QDA. However, the results in Table 3.4 show that this normalization is useful, if possibly not ideal, for QF based representations.
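The hypersphere property can be seen in a small example. The sketch assumes a full-resolution QF with one value per sample (the sorted samples) and the population standard deviation. With coarser bin-averaged QFs the property would hold only approximately. The sample values are hypothetical.

```python
import math
from statistics import fmean, pstdev

def normalized_qf(samples):
    """Full-resolution QF (sorted samples) after zero-mean, unit-std normalization."""
    mu, sigma = fmean(samples), pstdev(samples)
    return sorted((v - mu) / sigma for v in samples)

# Two unrelated hypothetical samples of the same size.
a = normalized_qf([3.0, 9.0, 4.0, 15.0, 8.0, 1.0])
b = normalized_qf([0.2, 0.1, 0.7, 0.4, 0.9, 0.3])

# Zero mean and unit variance force sum(q^2) = n, so both norms equal sqrt(n).
norm_a = math.sqrt(sum(q * q for q in a))
norm_b = math.sqrt(sum(q * q for q in b))
```

Both norms equal the square root of 6, the sample count, so the two normalized QFs lie on the same sphere regardless of the original distributions.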

In document Compact appearance in object populations using quantile function based distribution families (Page 101-107)