
When analyzing longitudinal data for complex diseases, careful modeling of temporal dependency can help reveal disease progression mechanisms and enable timely interventions that delay or prevent disease onset. It is an important direction for future research.

Several improvements to the longitudinal data analysis are possible. In our experiments, we did not identify any significant time-associated patterns. One possible explanation is that, in this specific dataset, the temporal dependency with respect to disease development is not as strong as the other interaction effects identified. Another is that the current model only groups rules whose dynamic features are observed at the same time points, whereas more general temporal dependency and continuity across neighboring time points may need to be considered. For example, the feature bmi_03 should be more strongly correlated with bmi_06 than with bmi_24. Such dependency relationships may need to be integrated into future rule-based discovery models. With the flexibility of the overlapped group LASSO, we can build a complex interaction network by adding more general dependency relationships, which may lead to more interesting patterns and association findings related to disease progression. We will study these potential models in future research.
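As a rough illustration of how neighboring time points could share groups, the following minimal sketch builds overlapping groups over a sequence of BMI features and evaluates the group term of an overlapped group LASSO penalty. It is not the model used in this work: the time grid (including bmi_12), the window size, the function names, and the square-root-of-group-size weights are all assumptions made for illustration.

```python
import math

def neighbor_groups(n_time_points, window=2):
    """Hypothetical helper: one group per run of `window` consecutive
    time points, so adjacent groups overlap in window - 1 indices."""
    return [list(range(i, i + window))
            for i in range(n_time_points - window + 1)]

def group_penalty(beta, groups):
    """Group term of an overlapped group LASSO penalty:
    sum over groups of sqrt(|g|) * ||beta_g||_2.
    The sqrt(|g|) weight is an assumed, commonly used choice."""
    total = 0.0
    for g in groups:
        norm = math.sqrt(sum(beta[j] ** 2 for j in g))
        total += math.sqrt(len(g)) * norm
    return total

# Illustrative time grid; only bmi_03, bmi_06 and bmi_24 appear in the text.
features = ["bmi_03", "bmi_06", "bmi_12", "bmi_24"]
groups = neighbor_groups(len(features), window=2)

# Show which features are tied together by each group.
named_groups = [[features[j] for j in g] for g in groups]
print(named_groups)

# Penalty for a coefficient vector that is nonzero only at early time points.
beta = [1.0, 1.0, 0.0, 0.0]
print(group_penalty(beta, groups))
```

With a window of two, bmi_03 shares a group with bmi_06 but never with bmi_24, so the penalty couples coefficients at neighboring time points more tightly than those at distant ones. A wider window, or groups that decay in weight with the time gap, would encode longer-range continuity.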


APPENDIX A

MISCELLANEOUS
