

5.3.2 Simulation results

Network analysis and clustering

Tables 5.1 and 5.2 show the performance of the studied methods for network analysis and hierarchical clustering. WGCNA obtains a number of clusters closer to the truth than graphical lasso and the ridge-penalty approach. On average, WGCNA estimates $\hat{k} = 3$ and $\hat{k} = 5$ for $k = 4$ and $k = 8$ underlying modules, respectively. This slight underestimation of $k$ yields a large number of false positives (see Table 5.2). Focusing on the situation $k = 4$, and taking the group with the highest simulated within-module correlation as reference, Table 5.2 shows a false positive rate of 38.2% for WGCNA, mainly due to the incorrect assignment of features of the second cluster to the first one. In contrast, graphical lasso overestimates the number of simulated modules.
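To make the rates in Table 5.2 concrete, the following is a minimal sketch of how the TPR, FNR and FPR for one reference module could be computed from a true and an estimated labelling. The function name `module_recovery_rates` and the matching rule (the estimated cluster holding most of the reference members is compared with the reference module) are assumptions made for illustration; the text does not state how estimated clusters were matched to simulated modules.

```python
def module_recovery_rates(true_labels, est_labels, reference):
    """TPR, FNR and FPR for recovering one reference module.

    Assumption (not from the source): the estimated cluster matched to the
    reference module is the one containing most of its members.
    """
    members = [i for i, t in enumerate(true_labels) if t == reference]
    if not members or len(members) == len(true_labels):
        raise ValueError("reference module must be a non-empty proper subset")

    # Count how many reference members fall in each estimated cluster.
    counts = {}
    for i in members:
        counts[est_labels[i]] = counts.get(est_labels[i], 0) + 1
    matched = max(counts, key=counts.get)

    tp = counts[matched]
    # Variables outside the reference module that landed in the matched cluster.
    fp = sum(1 for t, e in zip(true_labels, est_labels)
             if e == matched and t != reference)
    n_pos = len(members)
    n_neg = len(true_labels) - n_pos
    return {"TPR": tp / n_pos, "FNR": 1 - tp / n_pos, "FPR": fp / n_neg}


# Toy labelling: four true modules, the estimate merges modules 1 and 2.
true_modules = [1, 1, 2, 2, 3, 3, 4, 4]
merged_estimate = ["a", "a", "a", "a", "b", "b", "c", "c"]
print(module_recovery_rates(true_modules, merged_estimate, reference=1))
```

Under this matching rule, absorbing an entire second module into the reference cluster leaves the TPR at 1 but pushes the FPR to one third when the four modules have equal size, which is the same order of magnitude as the WGCNA value reported above.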

The number of estimated modules is not affected by the number of underlying modules (for example, $\hat{k} = 14$ for both $k = 4$ and $k = 8$ with $n = 50$), but it increases with the number $p$ of simulated features. This is likely due to the reliance of graphical lasso on partial correlations instead of Pearson correlations. On closer inspection of the estimated modules, we observe that graphical lasso generates $\hat{k}$ groups that are subsets of the $k$ underlying simulated modules. In other words, graphical lasso does not group together features belonging to different underlying modules (whereas WGCNA does), and the estimated modules can be merged in such a way that the original $k$ modules are recovered. This translates into a very small false positive rate when taking any of the $k$ simulated modules as reference (see Table 5.2). Finally, the ridge-penalty approach is, in most cases, unable to identify any cluster when the numbers of features and subjects are small (see $p = 200$ and $n = 50$ in Table 5.1). For larger numbers of individuals and variables, the number of clusters is overestimated for the same reason as for graphical lasso, namely the reliance of this method on partial correlations.

| Variables | Method | 4 modules, $n = 50$ | 4 modules, $n = 100$ | 8 modules, $n = 50$ | 8 modules, $n = 100$ |
|---|---|---|---|---|---|
| 200 | WGCNA | 3.1 (2-5) | 3.0 (2-5) | 5.0 (3-8) | 5.0 (4-7) |
| 200 | Graphical lasso | 14.7 (9-21) | 17.0 (12-23) | 14.4 (9-21) | 17.6 (13-25) |
| 200 | Ridge penalty | 1.0 (1-3) | 1.3 (1-6) | 1.5 (1-8) | 9.8 (1-21) |
| 1000 | WGCNA | 3.1 (2-5) | 3.0 (2-5) | 5.6 (4-18) | 5.0 (4-11) |
| 1000 | Graphical lasso | 48.3 (40-86) | 76.5 (57-93) | 59.6 (39-81) | 77.5 (63-95) |
| 1000 | Ridge penalty | 10.2 (1-71) | 52.6 (3-72) | 13.1 (1-69) | 61.5 (6-81) |

Table 5.1: Simulation study. Average number of clusters obtained across cross-validation folds by WGCNA, graphical lasso, and ridge penalty. The minimum and maximum number of clusters identified are presented in brackets.

| Reference module | Method | TPR ($n = 50$) | FNR ($n = 50$) | FPR ($n = 50$) | TPR ($n = 100$) | FNR ($n = 100$) | FPR ($n = 100$) |
|---|---|---|---|---|---|---|---|
| Module 1 | WGCNA | .999 | .001 | .382 | .998 | .002 | .375 |
| Module 1 | Graphical lasso | .308 | .692 | .000 | .259 | .741 | .000 |
| Module 1 | Ridge penalty | .999 | .001 | .997 | .962 | .038 | .951 |
| Module 3 | WGCNA | .918 | .082 | .190 | .989 | .011 | .148 |
| Module 3 | Graphical lasso | .189 | .811 | .001 | .192 | .808 | .000 |
| Module 3 | Ridge penalty | .999 | .000 | .997 | .960 | .040 | .951 |

Table 5.2: Simulation study. Average (across 10 cross-validation folds and 500 replicates) true positive rate (TPR), false negative rate (FNR) and false positive rate (FPR) for WGCNA, graphical lasso and ridge penalization. Top part: Scenario a. Reference module: module 1 (corresponding to the first 50 variables in Fig 2, left panel, which present the highest level of correlation). Bottom part: Scenario b. Reference module: module 3 (corresponding to variables 100-150 in Fig 2, left panel).
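The observation that the graphical lasso modules are subsets of the simulated modules amounts to saying that the estimated partition is a refinement of the true one. A minimal sketch of such a check follows; the helper names `is_refinement` and `regroup` are hypothetical and are not taken from the original analysis.

```python
def is_refinement(true_labels, est_labels):
    """Return True if every estimated cluster lies inside a single true module."""
    owner = {}
    for t, e in zip(true_labels, est_labels):
        # setdefault records the first true module seen for cluster e.
        if owner.setdefault(e, t) != t:
            return False
    return True


def regroup(true_labels, est_labels):
    """Map each estimated cluster to the true module that contains it."""
    if not is_refinement(true_labels, est_labels):
        raise ValueError("estimated clusters mix several true modules")
    return {e: t for t, e in zip(true_labels, est_labels)}


true_modules = [1, 1, 1, 2, 2, 2]
split_estimate = ["a", "a", "b", "c", "c", "d"]   # over-split, never mixed
merged_estimate = ["a", "a", "a", "a", "b", "b"]  # cluster "a" mixes modules

print(is_refinement(true_modules, split_estimate))
print(is_refinement(true_modules, merged_estimate))
print(regroup(true_modules, split_estimate))
```

An over-split estimate passes the check and can be merged back into the simulated modules, whereas an estimate that absorbs part of a second module into the first, as described for WGCNA, does not.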

5.3 Simulation Study 73

Predictive ability

Table 5.3 and Table 5.4 show the results in terms of the predictive accuracy measure $Q^2$ for $p = 200$ and $n = 50$, and for $p = 1000$ and $n = 50$, respectively. Table A and Table B in S1 File show results for $n = 100$. Adaptive group ridge and group lasso perform similarly in most of the studied situations. These two methods outperform the other three-step approaches considered, and they are also the best-performing methods when the known grouping is used. Further, these approaches may outperform the commonly used regularized regression methods lasso, ridge and elastic net in terms of predictive ability. Specifically, group lasso relying on the grouping structure obtained from WGCNA and graphical lasso systematically outperforms ridge and lasso, and its predictive ability is similar to that of elastic net when $p = 200$. For $p = 1000$, the predictive ability of standard ridge, lasso and elastic net is lower, while the methods based on group lasso and adaptive group ridge behave much as they do for $p = 200$. Therefore, the gain of these new approaches appears to be larger when the number of predictors increases.
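For readers unfamiliar with $Q^2$, the sketch below computes it under the commonly used definition $Q^2 = 1 - \mathrm{PRESS}/\mathrm{TSS}$, where PRESS is the sum of squared errors of cross-validated predictions and TSS is the total sum of squares around the mean response. This definition is an assumption here, since this section does not restate the formula, and the function name `q2` is hypothetical.

```python
def q2(y_true, y_pred_cv):
    """Cross-validated predictive accuracy, assuming Q2 = 1 - PRESS / TSS.

    y_pred_cv must hold out-of-fold predictions: each value is predicted by a
    model that did not see the corresponding observation during fitting.
    """
    if len(y_true) != len(y_pred_cv) or len(y_true) < 2:
        raise ValueError("need two or more paired observations")
    mean_y = sum(y_true) / len(y_true)
    press = sum((y - yhat) ** 2 for y, yhat in zip(y_true, y_pred_cv))
    tss = sum((y - mean_y) ** 2 for y in y_true)
    if tss == 0:
        raise ValueError("response is constant; Q2 is undefined")
    return 1.0 - press / tss


y = [1.0, 2.0, 3.0, 4.0]
print(q2(y, y))                     # perfect out-of-fold predictions
print(q2(y, [2.5, 2.5, 2.5, 2.5]))  # predicting the overall mean
```

Under this definition a value of 1 indicates perfect prediction and a value of 0 indicates predictions no better than the mean response, which gives a scale for reading Tables 5.3 and 5.4.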

Compared to adaptive group ridge, group lasso was less sensitive to the chosen network method. Namely, in all scenarios adaptive group ridge performs poorly when the ridge-penalty approach (Ha and Sun, 2014) is used for network construction. The performance of group lasso is robust with respect to the network construction method in all the studied scenarios, and close to its performance when using the true underlying grouping structure. Sparse group lasso provides adequate results in terms of predictive ability when the clustering is known a priori, with $Q^2$ values only slightly lower than the corresponding values of adaptive group ridge and group lasso. However, when the grouping is estimated, its performance drops. The predictive ability appears to drop to $Q^2 < 0.1$ for scenario b, which is 8 times lower than the predictive ability obtained with a combination of graphical lasso and group lasso. The variable selection approach based on selecting hubs provides satisfactory results only when using the WGCNA method for network construction in scenario a.

| Network method | Regression method | 4 modules, a | 4 modules, b | 4 modules, c | 8 modules, a | 8 modules, b | 8 modules, c |
|---|---|---|---|---|---|---|---|
| A Priori | Sparse group lasso$_{0.5}$ | .79 (.01) | .51 (.06) | .65 (.02) | .75 (.02) | .71 (.02) | .69 (.03) |
| A Priori | Sparse group lasso$_{0.9}$ | .79 (.01) | .48 (.06) | .59 (.03) | .74 (.02) | .69 (.04) | .65 (.04) |
| A Priori | Sparse group lasso$_{0.1}$ | .79 (.01) | .53 (.06) | .66 (.02) | .75 (.02) | .72 (.02) | .70 (.03) |
| A Priori | Group lasso | .87 (.01) | .53 (.07) | .77 (.02) | .84 (.02) | .78 (.03) | .81 (.02) |
| A Priori | Group ridge | .94 (.01) | .43 (.08) | .69 (.07) | .90 (.02) | .73 (.06) | .85 (.03) |
| WGCNA | Hubs | .81 (.03) | .15 (.10) | .59 (.11) | .81 (.05) | .18 (.13) | .55 (.12) |
| WGCNA | Sparse group lasso$_{0.5}$ | .72 (.12) | .15 (.12) | .57 (.15) | .41 (.21) | .28 (.19) | .36 (.20) |
| WGCNA | Sparse group lasso$_{0.9}$ | .73 (.13) | .13 (.22) | .53 (.13) | .41 (.22) | .26 (.12) | .35 (.19) |
| WGCNA | Sparse group lasso$_{0.1}$ | .69 (.12) | .16 (.12) | .58 (.15) | .39 (.20) | .29 (.17) | .36 (.19) |
| WGCNA | Group lasso | .90 (.02) | .58 (.07) | .87 (.02) | .83 (.04) | .76 (.06) | .83 (.04) |
| WGCNA | Group ridge | .78 (.03) | .46 (.06) | .62 (.05) | .69 (.07) | .61 (.08) | .53 (.09) |
| Graphical lasso | Hubs | .52 (.20) | .26 (.15) | .51 (.18) | .52 (.22) | .45 (.20) | .51 (.22) |
| Graphical lasso | Sparse group lasso$_{0.5}$ | .69 (.13) | .08 (.06) | .45 (.16) | .31 (.21) | .22 (.15) | .27 (.18) |
| Graphical lasso | Sparse group lasso$_{0.9}$ | .68 (.13) | .06 (.05) | .42 (.16) | .32 (.21) | .19 (.15) | .26 (.17) |
| Graphical lasso | Sparse group lasso$_{0.1}$ | .69 (.13) | .08 (.06) | .46 (.16) | .31 (.21) | .24 (.15) | .28 (.18) |
| Graphical lasso | Group lasso | .92 (.01) | .54 (.08) | .87 (.03) | .86 (.03) | .76 (.06) | .86 (.03) |
| Graphical lasso | Group ridge | .93 (.02) | .46 (.08) | .61 (.06) | .85 (.08) | .71 (.06) | .70 (.11) |
| Ridge penalty | Hubs | .52 (.06) | .11 (.02) | .47 (.06) | .27 (.10) | .22 (.07) | .27 (.09) |
| Ridge penalty | Sparse group lasso$_{0.5}$ | .77 (.09) | .42 (.05) | .67 (.02) | .68 (.07) | .63 (.04) | .67 (.04) |
| Ridge penalty | Sparse group lasso$_{0.9}$ | .79 (.07) | .46 (.06) | .61 (.03) | .72 (.05) | .66 (.05) | .65 (.05) |
| Ridge penalty | Sparse group lasso$_{0.1}$ | .73 (.08) | .40 (.04) | .68 (.02) | .62 (.09) | .59 (.03) | .63 (.04) |
| Ridge penalty | Group lasso | .87 (.02) | .48 (.06) | .84 (.02) | .79 (.04) | .71 (.05) | .78 (.03) |
| Ridge penalty | Group ridge | .67 (.05) | .07 (.03) | .69 (.05) | .47 (.06) | .32 (.07) | .45 (.07) |
| Common | Lasso | .88 (.03) | .52 (.10) | .73 (.05) | .81 (.04) | .74 (.06) | .79 (.05) |
| Common | Ridge | .67 (.05) | .07 (.03) | .59 (.06) | .46 (.06) | .55 (.04) | .70 (.03) |
| Common | Elastic net | .96 (.04) | .74 (.26) | .79 (.20) | .87 (.02) | .81 (.04) | .89 (.02) |

Table 5.3: Simulation study. Results obtained in terms of average $Q^2$ (across 500 replicates) for scenarios a, b, c, $p = 200$ variables, $k = 4$ and $k = 8$ modules, and $n = 50$ individuals. Standard errors are given in brackets. The first column represents the method used to build the network. A Priori represents the situation where the true clustering of the predictors is known and no network analysis is performed.


| Network method | Regression method | 4 modules, a | 4 modules, b | 4 modules, c | 8 modules, a | 8 modules, b | 8 modules, c |
|---|---|---|---|---|---|---|---|
| A Priori | Sparse group lasso$_{0.5}$ | .80 (.002) | .64 (.02) | .63 (.03) | .77 (.016) | .69 (.036) | .69 (.030) |
| A Priori | Sparse group lasso$_{0.9}$ | .80 (.001) | .56 (.036) | .54 (.047) | .76 (.019) | .62 (.047) | .67 (.056) |
| A Priori | Sparse group lasso$_{0.1}$ | .80 (.002) | .66 (.026) | .66 (.032) | .77 (.016) | .70 (.033) | .72 (.025) |
| A Priori | Group lasso | .89 (.003) | .76 (.021) | .71 (.046) | .87 (.011) | .81 (.022) | .84 (.016) |
| A Priori | Group ridge | .97 (.011) | .65 (.076) | .55 (.083) | .95 (.018) | .87 (.033) | .78 (.065) |
| WGCNA | Hubs | .87 (.026) | .48 (.12) | .45 (.324) | .45 (.324) | .13 (.127) | .08 (.088) |
| WGCNA | Sparse group lasso$_{0.5}$ | .74 (.143) | .61 (.098) | .57 (.153) | .43 (.244) | .36 (.206) | .32 (.221) |
| WGCNA | Sparse group lasso$_{0.9}$ | .74 (.147) | .54 (.090) | .53 (.138) | .44 (.252) | .35 (.193) | .29 (.223) |
| WGCNA | Sparse group lasso$_{0.1}$ | .70 (.134) | .62 (.098) | .58 (.155) | .40 (.227) | .34 (.196) | .32 (.205) |
| WGCNA | Group lasso | .94 (.01) | .85 (.031) | .87 (.027) | .88 (.036) | .79 (.043) | .78 (.058) |
| WGCNA | Group ridge | .80 (.037) | .59 (.061) | .62 (.059) | .70 (.067) | .50 (.088) | .62 (.096) |
| Graphical lasso | Hubs | .52 (.054) | .55 (.054) | .21 (.039) | .42 (.059) | .46 (.063) | .43 (.050) |
| Graphical lasso | Sparse group lasso$_{0.5}$ | .79 (.032) | .54 (.110) | .12 (.08) | .46 (.251) | .32 (.202) | .34 (.185) |
| Graphical lasso | Sparse group lasso$_{0.9}$ | .79 (.030) | .49 (.122) | .09 (.075) | .46 (.249) | .30 (.191) | .30 (.195) |
| Graphical lasso | Sparse group lasso$_{0.1}$ | .79 (.030) | .56 (.111) | .13 (.083) | .46 (.254) | .32 (.208) | .37 (.180) |
| Graphical lasso | Group lasso | .96 (.01) | .81 (.039) | .61 (.084) | .93 (.023) | .83 (.044) | .82 (.054) |
| Graphical lasso | Group ridge | .96 (.02) | .61 (.062) | .59 (.075) | .81 (.127) | .66 (.106) | .75 (.069) |
| Ridge penalty | Hubs | .02 (.052) | .07 (.064) | .01 (.028) | .04 (.069) | .05 (.075) | .05 (.060) |
| Ridge penalty | Sparse group lasso$_{0.5}$ | .59 (.245) | .57 (.163) | .13 (.14) | .69 (.137) | .62 (.140) | .59 (.136) |
| Ridge penalty | Sparse group lasso$_{0.9}$ | .70 (.186) | .49 (.148) | .13 (.149) | .72 (.132) | .59 (.136) | .60 (.147) |
| Ridge penalty | Sparse group lasso$_{0.1}$ | .47 (.254) | .59 (.164) | .13 (.127) | .59 (.139) | .58 (.130) | .53 (.116) |
| Ridge penalty | Group lasso | .91 (.031) | .79 (.029) | .42 (.065) | .82 (.053) | .75 (.042) | .70 (.059) |
| Ridge penalty | Group ridge | .75 (.07) | .63 (.078) | .10 (.055) | .53 (.097) | .48 (.11) | .37 (.10) |
| Common | Lasso | .91 (.016) | .59 (.060) | .51 (.080) | .87 (.035) | .68 (.065) | .70 (.074) |
| Common | Ridge | .80 (.028) | .73 (.037) | .26 (.046) | .66 (.041) | .63 (.044) | .539 (.050) |
| Common | Elastic net | .92 (.015) | .54 (.089) | .60 (.057) | .87 (.032) | .69 (.067) | .68 (.065) |

Table 5.4: Simulation study. Results obtained in terms of average $Q^2$ (across 500 replicates) for scenarios a, b, c, $p = 1000$ variables, $k = 4$ and $k = 8$ modules, and $n = 50$ individuals. Standard errors are given in brackets. The first column represents the method used to build the network. A Priori represents the situation where the true clustering of the predictors is known and no network analysis is performed.

Variable selection

Finally, we investigated the variable selection properties of the best-performing (in terms of predictive ability) three-step procedures. Figs 5.3 and 5.4 show, for scenario a, $k = 4$, $p = 200$ and $n = 100$, the variable selection properties of adaptive group ridge and group lasso in combination with WGCNA and graphical lasso, respectively. In both cases, the performance of lasso and elastic net is also shown. For each method and each of the $p$ variables of $X$, a boxplot shows the distribution, over the $M = 500$ Monte Carlo trials, of the estimated regression coefficient averaged over the 10 cross-validation folds. The true simulated regression coefficients are also shown (red dots). Complete results for all scenarios are presented in the S2 File, Figure A to Figure R.

These results show that our three-step approaches perform well in terms of regression coefficient estimation and variable selection. The four investigated approaches, given by the combination of WGCNA and graphical lasso with adaptive group ridge and group lasso, clearly separate informative from non-informative variables. In contrast, lasso regression, especially in scenario a, performs very poorly. The mean coefficients estimated by lasso for all $p$ variables are close to zero, while the variability is very high for the features with non-zero effects, reflecting that lasso randomly selects a few of the informative variables and assigns a very large effect to them. To a lesser extent, the same phenomenon is also observed for elastic net. Even though its mean estimate for informative variables is larger and its variability is lower than for lasso, the overall performance of elastic net is inferior to that of our three-step methods, which include grouping information.

The top panel of Fig 5.3 shows that the combination of WGCNA and group lasso tends to overestimate the effect of the variables belonging to the second cluster of variables. This is due to the underestimation of the number of clusters by WGCNA and the joint penalization of group lasso. Interestingly, adaptive group ridge is less affected by this issue. When using graphical lasso as the network analysis method, the first informative group of variables is clearly separated from the rest, and the estimation is close to the theoretical one (Fig 5.4).
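The effect of joint penalization can be illustrated with the block soft-thresholding operator, which is the proximal step associated with the group lasso penalty: a group of coefficients is either shrunk as a whole or set to zero as a whole. The sketch below is illustrative only. It is not the solver used in the study, the function name `group_soft_threshold` and the numeric values are made up, and the threshold is taken as a single number although in practice it usually depends on the penalty parameter and the group size.

```python
import math


def group_soft_threshold(beta_group, threshold):
    """Block soft-thresholding: shrink a whole group towards zero.

    Returns all zeros when the Euclidean norm of the group is at most the
    threshold; otherwise rescales every coefficient by the same factor.
    """
    norm = math.sqrt(sum(b * b for b in beta_group))
    if norm <= threshold:
        return [0.0] * len(beta_group)
    scale = 1.0 - threshold / norm
    return [scale * b for b in beta_group]


informative = [3.0, 4.0]   # hypothetical coefficients with a real effect
weak = [0.3, 0.4]          # hypothetical coefficients that are mostly noise

# Correct grouping: the weak group is removed entirely.
print(group_soft_threshold(informative, 1.0))
print(group_soft_threshold(weak, 1.0))

# Merged grouping: the weak variables survive because the group norm is large.
print(group_soft_threshold(informative + weak, 1.0))
```

When the weak variables are placed in the same group as the informative ones, they are no longer zeroed out, which is consistent with the overestimation seen for the second cluster when WGCNA merges it with the first.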


Figure 5.3: Simulation study: Variable selection results with WGCNA. Variable selection results for scenario a, $k = 4$, $p = 200$, and $n = 100$. Box-plots of the absolute values of the estimated parameters for the 200 variables over the 500 simulated datasets are plotted. The red points represent the absolute average true values over the 500 datasets.

Figure 5.4: Simulation study: Variable selection results with graphical lasso. Variable selection results for scenario a, $k = 4$, $p = 200$, and $n = 100$. Box-plots of the absolute values of the estimated parameters for the 200 variables over the 500 simulated datasets are plotted. The red points represent the absolute average true values over the 500 datasets.