Weight distribution in source domain - Framework for Transfer Learning: Maximization of Quadrat

3.5 Experiments

3.5.2 Weight distribution in source domain

In this section, we will closely analyze the weight assignment among source domain data. According to the proposed weighting scheme, source points residing in the neighborhood of target domain data are assigned with higher weights compared to other source points. This will cause source points that are not structurally similar with target ones to be placed at distant locations on the projected space. Hence the distance between each of these non-candidate points and any target point also increases through out the iterations as they share least similar structure with target domain data. We can analyze this phenomena by measuring the minimum of the distances between each source point and all the target points, that is,

Here dist(.) measures the distance between two points and dmis the minimum distance

among pair-wise distances between a source point and all target points. According to the proposed weighting scheme, dm(xs) and the weight of xs are inversely pro-

portional that is, the increase of dm(xs) will result in smaller instance weight and

vice versa. This behavior is illustrated in Figure 3.11 for four different sub-problems of Office+Caltech dataset. For each source sample, (dm(xs), w) pair is rendered

using scatter plot, where w is a sample weight. As expected according to the proposed weighting scheme of iterative WQMI-S method, source candidate points are mapped in close proximity of target points as they share underlying similarity with target ones. Hence they are assigned with higher weights than non-candidate source points, also the mean (red indicator in Figure 3.11) of associated dm distribution is

comparatively smaller than non-candidate ones. The opposite behavior hold for non- candidate source samples i.e. their associated weights are very small and the mean of dm distribution is comparatively higher than candidate ones. This proves the desired

phenomena offered by our weighting scheme.

Another illustration is provided in Figure 3.12 that also expresses the behavior of dm of a source sample vs. corresponding weight. 50 source samples are randomly

selected and constructed a distance vector with corresponding dm values. Distance

and weight vectors are placed side by side where a single stripe represents one sample in either vector. Higher value is mapped with dark color and lower value with light color in the colormap. From this figure, it is observed that when a source point is mapped closely with target points in WQMI-S subspace (i.e. smaller dm and light

stripe), corresponding instance weight is higher (dark stripe) and vice versa.

Finally, Figure 3.13 shows the final output subspace generated by the proposed WQMI-S framework and the current best work in the literature (TJM). Our goal

𝐴 → 𝑊 𝐴 → 𝐷

𝐶 → 𝑊 𝐶 → 𝐷

Figure 3.11: Scatter plots of dm vs. weight of source domain data for 4 different sub-problems. Source candidate

points are higher weighted than non-candidate ones. The mean of dmdistribution is idicated with red line which is

lower (higher) for source candidate (non-candidate) points.

was to learn a domain adaptive discriminative subspace where source and target domain data with same class label will be tightly clustered. The figure is generated using a popular high dimensional data visualization software, known as t-SNE [3]. The projected data generated from the learned subspace are transformed into two dimensional data using t-SNE and shown as a scatter plot in Figure 3.13. The two columns of the figure represent two different sub-problems A→D and A→W. In part (a) and (d) of Figure 3.13, source and target domain data distributions are visualized for two different sub-problems. We can see that target points are clustered with a well- separated margin along with source points that share similar underlying structure. In part (b) and (e) of the figure, we can see the same data distributions annotated with corresponding class labels, where source points are known a priori and target data are annotated with predicted labels. In part (c) and (f) of the figure, we see the same scenario generated using TJM approach. It is worth noting that projected data

(a) Distance vector Weight vector Low High Distance vector Weight vector Distance vector Weight vector (b) Distance vector Weight vector (c) (d) Low High

Figure 3.12: Illustration of distance vector constructed with dmvalues and corresponding weight vector of 50 randomly

selected source samples using colormap. Each sub-figure is representing one sub-problem. In each vector, a single stripe represents dm or weight of one sample. Higher value of dm is associated with lower value of corresponding

Table 3.3: Comparative results in terms of classification accuracy(%) of target data for 12 different sub-problems using the weighting scheme of source/target balancing. Each sub-problem is in the form of source → target, where C(Caltech-256), A(Amazon), W(Webcam) and D(DSLR) indicate four different domains.

Methods C → A C → W C → D A → C A → W A → D W → C W → A W → D D → C D → A D → W Avg Origfeat 23.70 25.76 25.48 26.00 29.83 25.48 19.86 22.96 59.24 26.27 28.5 63.39 31.37 PCA 36.95 32.54 38.22 34.73 35.59 27.39 26.36 29.35 77.07 29.65 32.05 75.63 39.65 QMI-H 55.95 49.49 45.86 42.12 42.71 37.58 30.37 35.8 80.89 35.71 38.31 61.02 46.32 QMI-S 57.72 55.93 48.41 41.76 46.44 38.85 30.72 36.74 83.44 38.38 42.48 77.63 49.88 WQMI-S(S≡T) 57.93 55.25 48.41 41.59 47.12 39.49 32.77 36.74 83.44 37.49 40.92 77.97 49.93

are more tightly clustered around class center in the WQMI-S subspace than TJM subspace resulting in better classification accuracy. The visualization of the scatter plot also provides a practical proof of the efficacy of our proposed iterative subspace learning approach based on maximization of WQMI-S.

In document Framework for Transfer Learning: Maximization of Quadratic Mutual Information to Create Discriminative Subspaces (Page 95-99)