Parameter Tuning - Experiments in Predicting Multiple Targets

7.2 Experiments in Predicting Multiple Targets

7.2.2 Parameter Tuning

We choose parameters for our graph kernel models using only the first target; the remaining 35 targets are used only in the experiments of Section 7.2.3. Tuning on a single target may produce suboptimal results for the other targets, although it should give a reasonable setting.

To measure success, we compare models by their average loss. The average loss is calculated by adding the absolute differences between each prediction of a model and the corresponding output. For each model, we give the average loss from 35 targets and a 5-fold cross-validation on the parameter tuning set.
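The loss described above can be sketched as a mean absolute error averaged over cross-validation folds. This is a minimal illustration, not the implementation used in these experiments: the function names, the contiguous (unshuffled) fold split and the `fit_predict` callback are all assumptions, and the sketch assumes that "average" means dividing the summed absolute differences by the number of predictions.

```python
from statistics import mean
from typing import Callable, List, Sequence


def average_absolute_loss(predictions: Sequence[float], outputs: Sequence[float]) -> float:
    """Sum the absolute differences and divide by the number of predictions."""
    if len(predictions) != len(outputs) or not outputs:
        raise ValueError("predictions and outputs must be non-empty and equal in length")
    return sum(abs(p - y) for p, y in zip(predictions, outputs)) / len(outputs)


def k_fold_indices(n: int, k: int = 5) -> List[List[int]]:
    """Split indices 0..n-1 into k contiguous folds (assumed split, no shuffling)."""
    folds = []
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_loss(
    xs: Sequence,
    ys: Sequence[float],
    fit_predict: Callable[[list, list, list], List[float]],
    k: int = 5,
) -> float:
    """Average the per-fold absolute loss over k folds.

    fit_predict(train_x, train_y, test_x) is a hypothetical placeholder for
    any model, for example an SVM trained with a graph kernel.
    """
    fold_losses = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        preds = fit_predict(
            [xs[i] for i in train_idx],
            [ys[i] for i in train_idx],
            [xs[i] for i in test_idx],
        )
        fold_losses.append(average_absolute_loss(preds, [ys[i] for i in test_idx]))
    return mean(fold_losses)
```

Because a lower value is better here, any model selection built on `cross_validated_loss` should minimise it, in contrast to the AUC-based selection of earlier chapters.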

A similar experimental methodology to Chapters 4 and 6 is used in our experiments. The procedure to choose parameters for graph kernels, colourings and representations is briefly summarized here. For the MG representation, we first fix the parameters of the four basic kernels (FC, FS, IM and IG) and then choose a colouring method (N, M1, M2, M3 or No-Colour). We only test No-Colour for the RG and TP representations. Next, using this colouring method, we test the FC and FS kernels with walks of length 1 to 15. We test the IM and IG kernels using 15 iterations of our dynamic programming equations with weights 0.01, 0.05, 0.1, 0.2, ..., 0.9.

We then add the soft-matching (SM), TP soft-matching (TPSM) and single-gap (1G) extensions. We add these extensions to the finite-length kernels by choosing a weight for the σsm and σ1g parameters from the values 0.01, 0.05, 0.1, 0.2, ..., 0.5. The TPSM extension is only added to the MG representation, as we only test soft-matching from atom labels to the 6 TP labels. If the atom labels do not match, the TP labels are used and weighted according to the following: 1/4, 1/4, 1/16, 1/16, 1/16, 1/16. In addition to this weighting scheme, the σsm parameter is used to choose the amount of weight given to the TPSM feature. For each parameter setting above we also chose the best SVM-C value from 10³, 10², 10¹, 10⁻¹, 10⁻² and 10⁻³.
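The search grids listed above can be written out explicitly, which makes the number of models trained per kernel easy to check. The sketch below only enumerates settings; the constant and function names are illustrative assumptions, and the pairing of every kernel parameter with every SVM-C value is one reading of "for each parameter setting above".

```python
from itertools import product
from typing import Iterator, Sequence, Tuple

# Values taken from the text; the names are hypothetical.
WALK_LENGTHS = list(range(1, 16))  # FC and FS: walks of length 1 to 15
INFINITE_KERNEL_WEIGHTS = [0.01, 0.05] + [round(0.1 * i, 1) for i in range(1, 10)]  # IM and IG
EXTENSION_WEIGHTS = [0.01, 0.05] + [round(0.1 * i, 1) for i in range(1, 6)]  # SM, TPSM and 1G
SVM_C_VALUES = [10.0 ** e for e in (3, 2, 1, -1, -2, -3)]


def settings(kernel_values: Sequence[float]) -> Iterator[Tuple[float, float]]:
    """Yield every (kernel parameter, SVM-C) pair to be cross-validated."""
    return product(kernel_values, SVM_C_VALUES)


if __name__ == "__main__":
    print(len(list(settings(WALK_LENGTHS))))             # settings per finite-length kernel
    print(len(list(settings(INFINITE_KERNEL_WEIGHTS))))  # settings per infinite-length kernel
    print(len(list(settings(EXTENSION_WEIGHTS))))        # settings per extension
```

Under this reading, each finite-length kernel requires 15 × 6 cross-validated settings per representation, each infinite-length kernel 11 × 6, and each extension 7 × 6.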

Colouring methods were chosen for each kernel and representation. For the MG representation, we fixed the kernel parameter for the FC, FS, IM and IG kernels and then chose the colouring among N, M1, M2, M3 and No-Colour. With this fixed setting, the FC and FS kernels obtained their best results with N-Colour, the IM kernel with M3-Colour and the IG kernel with N-Colour. For the RG and TP representations, we only tested our models with No-Colour, as this seemed a good choice from the experiments in previous chapters.
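The two-stage procedure (choose a colouring with the kernel parameter held fixed, then tune the kernel parameter with that colouring) can be sketched as follows. This is an illustration only: `evaluate` is a hypothetical callback returning a cross-validated average loss, the fixed walk length of 3 is an arbitrary assumption (the text does not state the fixed value), and `toy_loss` is an invented function, not experimental data.

```python
from typing import Callable, Sequence, Tuple


def tune_stagewise(
    evaluate: Callable[[str, int], float],
    colourings: Sequence[str],
    walk_lengths: Sequence[int],
    fixed_walk_length: int,
) -> Tuple[str, int]:
    """Pick a colouring first, then a walk length, minimising the loss at each stage."""
    # Stage 1: hold the kernel parameter fixed and pick the colouring with the lowest loss.
    best_colouring = min(colourings, key=lambda c: evaluate(c, fixed_walk_length))
    # Stage 2: keep that colouring and pick the walk length with the lowest loss.
    best_length = min(walk_lengths, key=lambda n: evaluate(best_colouring, n))
    return best_colouring, best_length


def toy_loss(colouring: str, walk_length: int) -> float:
    """Invented loss surface used only to exercise the function above."""
    penalty = {"N": 0.00, "M1": 0.02, "M2": 0.03, "M3": 0.01, "No-Colour": 0.04}[colouring]
    return 0.60 + penalty + 0.01 * abs(walk_length - 5)


if __name__ == "__main__":
    choice = tune_stagewise(
        toy_loss,
        ["N", "M1", "M2", "M3", "No-Colour"],
        range(1, 16),
        fixed_walk_length=3,  # assumed value for illustration
    )
    print(choice)
```

A stagewise search of this kind evaluates far fewer settings than a joint grid over colourings and walk lengths, at the cost of possibly missing a combination that is only good jointly.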

Next, using these colouring choices, we optimized parameters for the two finite-length kernels, FC and FS. Figure 7.1 gives the results of parameter tuning these kernels. Unlike Chapters 4 and 6, which use an AUC score, we report the average absolute loss from a 5-fold cross-validation, so the optimal parameter is the one with the smallest rather than the largest value.

A general trend appears for both kernels: the MG and RG representations find an optimal walk length between 3 and 5, while TP chooses the longest walk. This may occur because the RG graphs are smaller and the MG representation has structural information built into each vertex. For the FC kernel, the best walk lengths are 5, 5 and 15 for the MG, RG and TP representations respectively. For the FS kernel, they are 3, 4 and 15.
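The selection rule behind these choices differs from the AUC-based chapters only in its direction. A minimal sketch, assuming scores are held in a dictionary keyed by parameter value and that ties are broken in favour of the smaller parameter (the text does not specify a tie-breaking rule), with invented numbers in the usage lines:

```python
from typing import Dict


def best_parameter(scores: Dict[float, float], higher_is_better: bool = False) -> float:
    """Return the parameter with the best score; ties go to the smallest parameter."""
    ordered = sorted(scores)  # ascending parameters, so the first best score wins ties
    if higher_is_better:
        return max(ordered, key=lambda p: scores[p])  # e.g. AUC, as in Chapters 4 and 6
    return min(ordered, key=lambda p: scores[p])      # e.g. average absolute loss


if __name__ == "__main__":
    # Invented values for illustration only, not results from the experiments.
    print(best_parameter({3: 0.62, 4: 0.61, 5: 0.61}))
    print(best_parameter({3: 0.70, 4: 0.90, 5: 0.90}, higher_is_better=True))
```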

[Figure 7.1, panels (a) FC Parameter Tuning and (b) FS Parameter Tuning: average loss (y-axis, 0.6 to 0.7) against walk length (x-axis, 1 to 15) for the MG, RG and TP representations.]

Figure 7.1: Parameter tuning experiments for the FC kernel are given in 7.1(a) and the FS kernel in 7.1(b). As the average loss values for each model are quite similar, the scale is fixed between 0.6 and 0.7.

In Figure 7.2 we give results for optimizing parameters for the infinite-length kernels, IM and IG. In this figure we have fixed the scale of the average absolute loss between 0.6 and 0.7. We give these graphs to show how the parameters perform over a certain range, although only the parameter with the smallest average loss is chosen for full testing. The best parameter for the IM kernel was 0.01, the smallest value tested, for all three representations. The IG kernel similarly chose a small value, 0.05, for the RG representation, which seems sensible as the reduced graphs are quite small. With the IG kernel, the MG and TP representations chose the largest value, 0.9, which places more emphasis on longer walks.

[Figure 7.2, panels (a) IM Parameter Tuning and (b) IG Parameter Tuning: average loss (y-axis, 0.6 to 0.7) against the kernel weight σ_IM or σ_IG (x-axis, 0.01 to 0.9) for the MG, RG and TP representations.]

Figure 7.2: Parameter tuning experiments for the IM kernel are given in 7.2(a) and the IG kernel in 7.2(b). As the average loss values for each model are quite similar, the scale is fixed between 0.6 and 0.7.

Finally, we give results for optimizing the SM, TPSM and 1G extension parameters in Figure 7.3. We test these with the best FS kernel and graph colouring chosen in the initial parameter tuning experiments, over a range of values that weight each extension. For the MG representation, the best result was found with a small weight (0.01) for SM and 1G, although a large weight (0.5) was chosen for TPSM. The RG representation chose the largest weight (0.5) for both SM and 1G, suggesting that more flexibility in these smaller graphs is beneficial. The TP graphs chose the smallest weight (0.01) for SM and 1G, suggesting that these extensions are not very beneficial for these models.

[Figure 7.3, panels (a) SM, TPSM Parameter Tuning and (b) 1G Parameter Tuning: average loss (y-axis, 0.6 to 0.7) against the extension weight σ_SM or σ_1G (x-axis, 0.01 to 0.5); series MG-SM, MG-TPSM, RG-SM and TP-SM in (a), and MG-1G, RG-1G and TP-1G in (b).]

Figure 7.3: Parameter tuning for SM and TPSM is given in 7.3(a) and 1G is given in 7.3(b). As the average loss values for each model are quite similar, the scale is fixed between 0.6 and 0.7.
