Risk prediction models for dementia constructed
by supervised principal component analysis using
miRNA expression data
Daichi Shigemizu
1,2,3,4, Shintaro Akiyama
1, Yuya Asanomi
1, Keith A. Boroevich
3, Alok Sharma
3,4,5,6,
Tatsuhiko Tsunoda
2,3,4, Kana Matsukuma
7, Makiko Ichikawa
7, Hiroko Sudo
7, Satoko Takizawa
7,
Takashi Sakurai
8,9, Kouichi Ozaki
1,3, Takahiro Ochiya
10,11& Shumpei Niida
1Alzheimer’s disease (AD) is the most common subtype of dementia, followed by Vascular Dementia (VaD), and Dementia with Lewy Bodies (DLB). Recently, microRNAs (miRNAs) have received a lot of attention as the novel biomarkers for dementia. Here, using serum miRNA expression of 1,601 Japanese individuals, we investigated potential miRNA bio-markers and constructed risk prediction models, based on a supervised principal component analysis (PCA) logistic regression method, according to the subtype of dementia. The final risk prediction model achieved a high accuracy of 0.873 on a validation cohort in AD, when using 78 miRNAs: Accuracy=0.836 with 86 miRNAs in VaD; Accuracy=0.825 with 110 miRNAs in DLB. To our knowledge, this is the first report applying miRNA-based risk prediction models to a dementia prospective cohort. Our study demonstrates our models to be effective in prospective disease risk prediction, and with further improvement may contribute to practical clinical use in dementia.
https://doi.org/10.1038/s42003-019-0324-7 OPEN
1Medical Genome Center, National Center for Geriatrics and Gerontology, Obu, Aichi 474-8511, Japan.2Department of Medical Science Mathematics,
Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo 113-8510, Japan.3RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.4CREST, JST, Tokyo 102-8666, Japan.5School of Engineering & Physics, University of the South Pacific, Suva, Fiji.
6Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, QLD 4111, Australia.7Toray Industries, Inc., Kamakura, Kanagawa 248-0036,
Japan.8The Center for Comprehensive Care and Research on Memory Disorders, National Center for Geriatrics and Gerontology, Obu, Aichi 474-8511, Japan.9Department of Cognitive and Behavioral Science, Nagoya University Graduate School of Medicine, Nagoya, Aichi 466-8550, Japan.
10Division of Molecular and Cellular Medicine, Fundamental Innovative Oncology Core Center, National Cancer Center Research Institute, Tokyo 104-0045,
Japan.11Institute of Medical Science, Tokyo Medical University, Tokyo 160-8402, Japan. Correspondence and requests for materials should be addressed to
D.S. (email:[email protected])
123456789
W
ith an increasingly aging global human population, the number of people with dementia is rapidly increasing, and is estimated to reach 75 million by 2030 and135 million by 2050, worldwide1. Since dementia is a clinical
syndrome that leads to difficulties in daily activities involving memory, language and behavior, this rapid increase raises a
substantial burden for medical care and public health systems2.
On the other hand, there is no current cure for this disease, and the available treatments are only able to postpone the
progres-sion3. Therefore, identification of new biomarkers for earlier
diagnosis and therapeutic intervention of the disease is promptly
required4.
The diagnosis of dementia is generally based on the patients’
cognitive function5. Alzheimer’s disease (AD) is the most
com-mon subtype of dementia, followed by vascular dementia (VaD),
and dementia with Lewy bodies (DLB)1. While recent studies
have showed that three proteins in the cerebrospinalfluid (CSF):
amyloid-beta 1–42 (Aβ142), total tau (T-tau) and phosphorylated
tau 181 (P-tau181), could be effective in characterizing AD6,7, it is
still challenging to use these CSF molecules as biomarkers in general physical examination for early diagnosis and therapeutic intervention due to the highly invasive collection process. In addition, new imaging-based techniques, including positron emission tomography scans for detection of amyloid-beta deposition or tau tracers, and the volumetric magnetic reso-nance imaging with determination of hippocampal or medial temporal lobe atrophy, are not suitable for initial screening due to
the high cost performance8–10. It has also been reported that
microRNAs play a key role in the control of glial cell development
in the central nervous system11. Therefore, the present study is
evaluated on the hypothesis that neurite and synapse destruction, associated with pathologic of dementia and other neurodegen-erative diseases, can be detected in vitro by quantitative analysis
of brain-enriched cell-free microRNA in the human blood5.
MicroRNAs (miRNAs) are approximately 22-nucleotide small non-coding RNAs, which have been shown to regulate gene expression by binding to complementary regions of messenger transcripts. The alteration of some miRNAs expression has recently been found in neurons of patients with AD and other
neurodegenerative diseases12–14, and hence miRNAs are expected
to be useful as easily accessible and non-invasive biomarkers15.
Here, we performed a comprehensive miRNA expression analysis using 1601 serum samples, composed of dementia patients and individuals with cognitive normal function (referred to as normal controls (NC)), in order to investigate new bio-markers for earlier diagnosis and therapeutic intervention and to construct risk prediction models using the biomarkers. We applied 10-fold cross-validation to a discovery cohort of 1092 individuals, separated from a validation cohort of 1089 indivi-duals. We performed a two-step procedure similar to those used
for risk prediction in several previous disease studies16–19. We
first selected effective miRNA biomarker candidates in the logistic regression risk prediction models. Using the pre-selected miRNAs
and the principal component scores (PC scores), we then con-structed risk prediction models based on a supervised principal component analysis (PCA) logistic regression method. Finally, we determined the optimal miRNA and PC score set though
cross-validation. Thisfinal risk prediction model, constructed based on
the entire discovery cohort, was evaluated with an independent validation cohort by the area under the receiver operating char-acteristic curve (AUC). We further evaluated the predictive ability
of our model using a prospective cohort. Our findings indicate
that the prediction models using serum miRNA expression data may be useful as biomarkers for dementia and contribute to the development of future therapeutic measurement for this common but serious disorder.
Results
Japanese samples. We divided 1601 Japanese individuals (1021 AD cases, 91 VaD cases, 169 DLB cases, 32 mild cognitive impairment (MCI), and 288 NC) into a discovery cohort of 786 individuals (511 AD cases, 46 VaD cases, 85 DLB cases and 144 NC) and a validation cohort of 783 individuals (510 AD cases, 45 VaD cases, 84 DLB cases, and 144 NC) (see Materials and methods). The separation was performed to result in a similar distribution in the age between the discovery and validation
cohorts for each disease (Table1).
Construction of risk prediction models. Our risk prediction models were constructed based on a supervised PCA logistic regression method. All approaches that we considered were
car-ried out on datasets of theppre-selected miRNAs (p≤2562). The
selection of miRNAs was carried out based on thez-value in the
logistic regression. Nine-tenths of entire training set was used for
the calculation of thez-values and tofit the model for each
cross-validation step. The adjusted model was evaluated using the remaining one-tenth of the training set. This process was repeated
10 times (10-fold cross-validation). The cutoff value Tof the z
-values was then raised from 0.1 to 5.0 at an interval 0.1. The
number of top PC scores used,m, was set from 1 to 10 (Fig.1).
On the basis of the average AUC, we investigated all
combina-tions of theT andm, and in AD, the combination of (T,m)=
(4.5, 10) achieved the highest AUC of 0.877 in the discovery
cohort. In VaD, a (T, m)=(4.0, 10) achieved an AUC=0.923,
and in DLB, a (T,m)=(3.4, 9) achieved an AUC=0.885 (Fig.2).
Final risk prediction models were constructed based on the
optimalTandmdetected in each disease using the entire training
set (discovery cohort). The adjusted models were then evaluated on the validation cohort, which was completely independent from the discovery cohort. As a result, 78 miRNAs out of 2562 were
employed for thefinal model construction in AD, which achieved
an AUC of 0.874 in the validation cohort (Fig.3a). Of the 78, two
miRNAs (MIMAT0004947 and MIMAT0022726) were
AD-specific miRNAs reported in previous studies20. The remaining
previously reported miRNAs did not show significantly better Table 1 Average age, sex and APOE information in the discovery and validation cohorts
Discovery cohort Validation cohort
Phenotype #Sample Age Sex (Male) APOEa #Sample Age Sex (Male) APOEa
AD 511 79.2 0.29 0.53 510 79.2 0.31 0.47
VaD 46 79.0 0.63 0.33 45 79.1 0.56 0.18
DLB 85 79.5 0.45 0.34 84 79.5 0.36 0.30
NC 144 71.7 0.49 0.22 144 71.8 0.56 0.15
APOEapolipoprotein E,ADAlzheimer’s disease,VaDvascular dementia,DLBdementia with Lewy bodies,NCnormal control aAPOE shows the average of the number of APOEε
outcome in logistic regression in the selection of miRNAs (Sup-plementary Data 1). A maximum average sensitivity and speci-ficity of the ROC curve was achieved at a sensitivity of 0.933 and specificity of 0.660 in AD. The accuracy showed 0.873 when the
prognostic index was 0.281 (Table2). In a similar way, 86
miR-NAs and 110 miRmiR-NAs were employed for our final model
con-struction in VaD (Fig. 3b) and DLB (Fig. 3c), which achieved
AUCs of 0.867 and 0.870 in the validation cohort, respectively. A maximum average sensitivity and specificity of the ROC curve was achieved at a sensitivity of 0.733 and specificity of 0.868 in VaD and at a sensitivity of 0.762 and specificity of 0.861 in DLB. The accuracies in VaD and DLB were 0.836 and 0.825 when the
prognostic index was−0.761 and 0.0392, respectively (Table2).
We also constructed risk prediction models based on the logistic regression method using only clinical information (age, sex and apolipoprotein E (APOE)) for the entire training data set. The adjusted models were then evaluated on the validation cohort. The AUCs achieved were 0.857 in AD, 0.813 in VaD and
0.827 in DLB. Ourfinding’s miRNAs contributed to an increase
of AUCs for the risk prediction models. We further compared our two-step method with a one-step penalized regression method, LASSO (least absolute shrinkage and selection operator). We constructed risk prediction models based on the LASSO method using all miRNAs in the entire training data set. The adjusted models were then evaluated on the validation cohort. The AUCs achieved were 0.898 in AD, 0.821 in VaD and 0.892 in DLB. For AD and DLB, the one-step penalized regression method showed similar AUCs to our two-step method, but the penalized method
showed a lower AUC in VaD than our method (LASSO=0.821,
our method=0.867).
Effective miRNAs and the functional gene annotations. The
number of miRNAs used for final risk prediction models were
78, 86 and 110 in AD, VaD and DLB, respectively (Supplemen-tary Data 2). We next examined the common and disease-specific miRNAs among these three diseases. A large number of miRNAs
were shared between VaD and DLB (32 miRNAs) and among
all three (31 miRNAs) (Fig.4a). AD possessed the most
disease-specific miRNAs compared to the other diseases (AD; 29/78=
0.371, DLB; 34/110=0.309, VaD; 18/86=0.209) (Fig. 4a).
Overall, miRNAs regulate the expression of thousands of protein-coding gene targets (mRNAs) at both post-transcriptional
and translational levels21–23. To determine the biological
significance of ourfindings (miRNAs), we examined microRNA
Target Prediction and Functional Study Database (miRDB24),
which can predict miRNA functional target genes. The 78 miRNAs in AD were predicted to target 1755 genes. In the similar way, the 86 miRNAs in VaD and 110 miRNAs in DLB were predicted to target 2017 and 2521 genes, respectively. Compared with miRNAs, a large number of mRNAs were shared among
three diseases (miRNAs: 31/162=0.191, mRNAs: 960/3370=
0.285) (Fig.4a, b).
Functional modules using co-expression network analysis. Since we detected several candidate gene targets in the three diseases, we next attempted to elucidate functional modules from the candidates. We focused on the occurrence of hub genes, which have relationships with many genes, through large-scale gene co-expression network analysis. The gene co-expression
information was gathered from the COXPRESdb database25(see
Materials and methods). Gene co-expression network
visualiza-tion was performed using Cytoscape software26. Three hub genes,
which co-expressed with >25 genes, were detected in the
func-tional modules (EXOC5,DDX3X and YTHDF3, Fig. 5).EXOC5
was associated with AD and VaD, and the remaining two genes were common among the three diseases. These three genes were also verified to express in brain tissue through the
Genotype-Tissue Expression (GTEx) project27,28.
Validation in a prospective cohort. We measured miRNA expression for 32 MCI subjects, which were obtained from the
PC1 PC2 miRNA1 miRNA2 miRNA3 miRNA1 miRNA3 2. PCA |z-value| > T
1. For each miRNA
3. m PC scores 4. Evaluation of T and m using cross-validation logit (Pi) = 1PC1 + ··· + mPCm, where PCi = l1x1 + ··· + lnxn 0 –1 0 t 1 1 x
Fig. 1Workflow of our risk prediction model construction with supervised principal component analysis (PCA) logistic regression method. We calculated
z-value in the logistic regression method for each microRNA (miRNA). The cutoff value of thez-value,Tand the number of miRNAs,n, was pre-selected (1). The PCA was performed using the pre-selected miRNAs (2). The risk prediction models were constructed based on the combination of the miRNAs andmPC scores (3). This optimal parameter set (T,m) was determined in the discovery cohort using 10-fold cross-validation (4)
prospective data, of which 10 subjects converted to AD after at least 6 months. We evaluated if our risk prediction model in AD could predict the converted subjects after 6 months. Prognosis indices (PIs) assigned to each subject were calculated by applying 78 miRNA expression values to our prediction model. A PI score
greater than 0.281 predicted the subject would convert to AD
(Table 2). As a result, all of the 10 converted subjects were
cor-rectly predicted by our model (sensitivity=1.0). Furthermore,
all 4 subjects predicted not to covert to AD did not actually
convert to AD (negative predictive value=1.0). The remaining
a b c
False positive rate False positive rate
True positive rate
False positive rate
True positive rate
0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 AD VaD DLB
Fig. 3The ROC curves of our risk prediction models in a validation cohort. Final risk prediction models were constructed based on the supervised principal component analysis (PCA) logistic regression method in each disease using the complete discovery cohort. The adjusted models were then evaluated on the validation cohort. Thefinal model construction in AD achieved an AUC of 0.874 in the validation cohort (a): AUC=0.867 in VaD (b); AUC=0.870 in DLB (c). Sensitivity and specificity were maximized at a sensitivity of 0.933 and specificity of 0.660 in AD (a): a sensitivity of 0.733 and specificity of 0.868 in VaD (b); and a sensitivity of 0.762 and specificity of 0.861 in DLB (c). ROC receiver operating characteristic, AUC area under the curve, AD Alzheimer’s disease, VaD vascular dementia, DLB dementia with Lewy bodies
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4 PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4 PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 |Z-value| AUC #PC score #PC score #PC score AD a b c VaD DLB PC10 0.8 0.81 0.82 0.82 0.83 0.84 0.85 0.86 0.87 0.88 0.88 0.79 0.8 0.81 0.83 0.84 0.86 0.87 0.88 0.9 0.91 0.92 0.84 0.84 0.85 0.85 0.86 0.86 0.86 0.87 0.87 0.87 0.88
Fig. 2Risk prediction models using 10-fold cross-validation on a discovery cohort. Thexandyaxes show the cutoff value of thez-value in the logistic regression (T) and the number of top principal component (PC) scores used (m) for the prediction models, respectively. In AD (a), a combination of (T,m)=(4.5, 10) achieved the highest AUC of 0.877 in the discovery cohort. In VaD (b), a (T,m)=(4.0, 10) achieved an AUC=0.923, and in DLB (c), a (T,m)=(3.4, 9) achieved an AUC=0.885. AUC area under the curve, AD Alzheimer’s disease, VaD vascular dementia, DLB dementia with Lewy bodies
18 subjects were predicted to convert to AD, but had not yet
converted (specificity=0.18) (Table 3). For further validation
of this discordance, we may have to follow-up with the subjects in the future and use the additional comprehensive information, including genetic data and/or whole transcriptome data, for fur-ther improvement for practical clinical use in dementia. Discussion
New biomarkers for early diagnosis and intervention have been
examined in many diseases29–31. The role of serum miRNAs has
recently been reviewed with emphasis on their impact on the
etiopathogenesis of sporadic AD32 and cancers21–23,33,34. For
example, dysregulated serum miRNAs, such as the down-regulation of miR-137, miR-181c, miR-9, and miR-29a/b in the
blood of AD patients, have been identified35–38. It has also been
reported that expressional differences between AD and other
dementia types were observed for some miRNAs39. However,
due to the small sample sizes in these previous reports, com-prehensive miRNA expression analysis had not been performed in AD or the other subtypes of dementia. Therefore, in this study we investigated biomarkers with respect to each subtype of dementia, using serum miRNAs and a larger sample size.
We first detected optimal parameters for risk prediction
models using cross-validation of a discovery cohort. The final
models were then constructed with the optimal parameters using
the complete discovery cohort. The adjusted models werefinally
evaluated on an independent validation cohort using AUC as the discriminative accuracy of these risk prediction models. In general, these risk prediction models on cross-validation of the discovery cohort achieved higher AUCs than the adjust models
on the validation cohort18,40. The difference of the AUCs is due
to overfitting of the model construction criterion. However, our risk models showed only small differences between discovery
and validation cohorts in the AUCs (AD=0.877 and 0.874,
VaD=0.923 and 0.867, and DLB=0.885 and 0.870). These
results imply the miRNAs used in our models were efficient to classify disease samples and non-disease samples, although additional replication studies are necessary in future work. We also constructed the risk prediction models using a larger
max-imum value of PC scores (m=50). In AD and DLB, the AUCs
of the final models were slightly increased in a validation
cohort in the combination of (T, m)=(3.6, 41) in AD and that
of (T, m)=(3.2, 12) in DLB, compared with those in m=10
(AD= +0.007, DLB= +0.002, Supplementary Table 1), but that
in VaD was considerably decreased in the combination of
(T,m)=(3.8, 13) (VaD=−0.015, Supplementary Table 1). For
all, a larger number of miRNAs were required for thefinal model
construction: m=50 (AD, VaD, DLB)=(171, 134, 143)
miR-NAs, m=10 (AD, VaD, DLB)=(78, 86, 110) miRNAs
(Sup-plementary Table 1). When considering prediction models with a low number of biomarkers, our approaches would be efficient for optimal risk prediction models. We further compared our final models using pre-selected miRNAs to those using all miR-NAs. Our models using pre-selected miRNAs had superior AUCs
to those using all miRNAs in all three diseases for bothm=10
andm=50 (Supplementary Table 2). Investigations using larger
sample sizes will lead to further improvement in the performance of risk prediction models.
The annotation of gene targets for miRNAs is critical for
functional characterization of ourfindings. We used miRDB24for
these functional gene annotation from miRNAs. A large number of genes associated with the dementia was detected. We further elucidated three functionally important modules (i.e. hub genes,
EXOC5, DDX3X and YTHDF3) through large-scale gene co-expression network analysis. These three genes were verified to
express in brain tissue through the GTEx project27,28. Jun et al.41
have reported that a single-nucleotide polymorphism (SNP)
in theEXOC5showed evidence for association with AD.DDX3X,
the DEAD (Asp-Glu-Ala-Asp) box helicase 3, X-linked, belongs to ATP-dependent RNA helicase, the activation of which is
associated with cancer in many tissues, including brain42–44.
Previous studies have reported that DDX3X expression level is
positively correlated with poor survival outcome in human
glioma45. Also, several studies have reported that YTHDF
protein could be associated with accumulation of m6A-modified
transcripts46, and this m6A mRNA modification is critical for
glioblastoma stem cell self-renewal and tumorigenesis47.
Fur-thermore, recent transcriptomic meta-analyses revealed that AD and glioblastoma patients had similar expression patterns in a
number of genes48. These observations support the existence
of molecular substrates that could partially account for direct
co-morbidity relationships49–51. These results suggest that the
three hub genes detected could not only play a key role in pathogenesis of dementia, but also contribute to discovery of novel drug targets.
The diagnosis of dementia is not always consistent with brain
pathological changes52,53. Also, elderly dementia patients often
have concomitant cerebrovascular disease pathologies as well as
other concomitant neurodegenerative disease pathologies54. We
proposed a methodology that finds the best risk prediction
model for each disease rather than a general model that could be applied to any data set. Our proposed models might be able to differentiate these complex neurological disorders. However, further refinement of this methodology will be required before its practical use in healthcare. One way may be to consider genetic variations, such as SNPs and insertions and deletions (indels) AD VaD DLB 31 13 5 29 32 18 34 AD VaD DLB 960 250 122 423 631 304 680 a b
Fig. 4Effective microRNAs (miRNAs) and genes used in risk prediction model.aThe number of miRNAs used forfinal risk prediction models were 78, 86 and 110 in AD, VaD and DLB, respectively. The pie-chart showed the common and disease-specific miRNAs among these three diseases.bUsing microRNA Target Prediction and Functional Study Database (miRDB), which can predict miRNA functional target genes, the 78 miRNAs in AD were predicted to target 1755 genes. The 86 miRNAs in VaD and 110 miRNAs in DLB were predicted to target 2017 and 2521 genes, respectively. The pie-chart showed the common and disease-specific target genes among these three diseases. AD Alzheimer’s disease, VaD vascular dementia, DLB dementia with Lewy bodies
Table 2 Accuracy estimation in three diseases using the validation cohort
Disease PI cutoff Accuracy Sensitivity Specificity AD 0.281 0.873 0.933 0.660 VaD −0.761 0.836 0.733 0.868 DLB 0.0392 0.825 0.762 0.861
PIprognostic index,ADAlzheimer’s disease,VaDvascular dementia,DLBdementia with Lewy bodies
and gene expressions. The development of next-generation sequencing technology has facilitated comprehensive analysis of these genetic and expression data. There is no doubt that these additional data would contribute to further improvement of our risk prediction models.
Materials and methods
Ethics statements. This study was approved by the ethics committee of the National Center for Geriatrics and Gerontology (NCGG). The design and per-formance of current study involving human subjects were clearly described in a research protocol. All participants were voluntary and completed informed consent in writing before registering to NCGG Biobank.
Clinical samples. All 1601 serum subjects and the associated clinical data were distributed from the NCGG Biobank, which collects human biomaterials and data for geriatrics research. Of them, 1021 subjects were patients with AD: 91 patients with VaD, 169 patients with DLB, 32 patients with MCI and 288 subjects were normal controls with normal cognitive function (NC). NCs who had subjective cognitive complaints, but normal cognition on the neu-ropsychological assessment, were categorized as normal controls. The AD and MCI subjects were diagnosed with a probable or possible AD based on the criteria of the National Institute on Aging Alzheimer’s Association workgroups55,56.
We used the probable ADs as AD subjects in this study. The VaD and DLB subjects were diagnosed based on the criteria of report of the NINDS-AIREN International Workshop57and fourth report of the DLB Consortium58,
respec-tively. The diagnosis of all subjects was conducted based on medical history, physical examination and diagnostic tests, neurological examination, neu-ropsychological tests and brain imaging with magnetic resonance imaging or computerized tomography by experts including neurologists, psychiatrists, geria-tricians or a neurosurgeon, all experts in dementia who are familiar with its diagnostic criteria. Comprehensive neuropsychological tests included Mini-Mental State Examination (MMSE), Alzheimer’s Disease Assessment Scale Cognitive Component Japanese version, Logical Memory I and II from the Wechsler Memory Scale–Revised, frontal assessment battery, Raven’s colored progressive matrices and Geriatric Depression Scale59. If necessary, dopamine transporter imaging and
metaiodobenzylguanidine myocardial scintigraphy were performed for the diag-nosis of DLB. Pathological tests and biomarkers in cerebrospinalfluid tests were not used for the diagnosis of dementia. For all of the subjects, the status of the APOEε4allele genotype (the major genetic risk factor with AD) and the MMSE
score were obtained. All subjects were >60 years in age. All NC subjects had a MMSE score of >23. MAP3K8 FOSL2 PFKFB3 MXD1 CREM DUSP2 EDC3 TUBG1 OGFOD1 AP2B1 PSME3 IRF2BP2 ACTR1A CNIH4 ENSA C20orf24 RHOB PDAP1 DOCK5 MAFF SH3BGRL KLF4 ATF3 KLF6 OPHN1 JOSD1 PLXDC1 NAMPT BTG2 FOS DUSP1 NR4A2 DDIT3 PPP1CA ZNF117 ATF7 CDKN1A BHLHE40 PLK3 ZNF138 VEGFA DSTN ITPRIP SYNGAP1 JUND POLR2E AP1S1 APH1A RERE SEPHS1 ARF6 ABRAXAS2 JMJD6 SLC25A46 IL13RA1 TMEM245 TMEM167A IMPAD1 PGGT1B REEP3 CNIH1 G3BP1 ZMYM5 NONO KPNA1 TM9SF2 CLIC4 NFYA MAPK14 ELMOD2 YTHDF3 AP5M1 HNRNPF SRPK2 CSNK1A1 SCOC RHOA SELENOT DAZAP2 ACTR10 PDE7A BRCC3 UBE4B CERS6 RABL3 CSNK2A1 ELAVL1 SSR3 AMMECR1 INTS6 FAR1 CELF1 ZMYND11 TMED2 TMEM30A OSGIN2 YWHAZ B3GNT2 SLC4A7 CPD SRP19 ZMAT3 ATP6AP2 SORT1PTP4A1 COMMD10 HACD3 BECN1 TFCP2 PTPN12 RNF7 SAR1B PTBP3 ERLIN1 SS18 API5 WAPL SLMAP C5orf22 UTRN TRAM1 SAR1A CSMD2 GOSR2 MYBL1 DDA1 LMNB2 DSCR3 RAB3IL1 WNT10B DNASE2 SLC22A23 CFL2 POLDIP3 MFN2 RBM14 BICD1 KCTD5 NFRKB PTBP1 DUSP3 RELA SLC35A2 ERC1 ZDHHC5 MDGA1 LAMP1 FKBP14 DNAJB4 INPP5A FKBP1A MAPK1IP1L ITM2B ZFAND3 UBE3B FAM20B SURF4 KPNA6 CRKL TMED5 KCTD20 UBE3C TMEM39A PHF8 OAZ2 SZRD1 PNPLA2 PEA15 CYB5R3 TAGLN2 PATZ1 CALU GOLT1B TRRAP SLC31A1 SPCS3 CASK CORO1C CISD2 UBE2J1 CCNYL1 ZNF468 ZNF254 ZNF765 RIT1 GPI ZNF107 SEC11A CELF3 TIPRL SMIM7 ARPC5 ACTG1 CRK HNRNPC OAZ1 SUMO2 RB1 SKP1 STAU1 PGK1 TIMM17A VDAC1 ZNF430 SUB1 PAIP2 HSPD1 YBX1 FXR1 CLIP3 MAP1A FGFR2 EIF3J CLSPN TMEM151B BTBD1 DPM1 PFDN4 UBE3A LDHA PPIF HSPA9 CACYBP M6PR COPS8 MFSD6 MAP2K1 TMEM183A SRP68 ZER1 RAVER1 SCAMP4 ARHGDIA ACTN4 CARM1 TFE3 PACS1 TAPT1 LRRC59 HDGF RPS6KA4 WDTC1 HCFC1 PA2G4 HMGA1 ZYX NPLOC4 STK40 KHSRP BCL2L1 SMARCD1 DISP3 GNAI2 TBC1D22B RAB11B SSBP3 FXR2 MAP3K3 MINK1 GIGYF1 PIK3R2 URM1 CD276 SLC27A4 PTPA PPDPF UBALD1 MAU2 ATXN7L3 AGPAT1 PITHD1 SLC8A3 CAPZB ATXN2L PMLCALR DAZAP1 MIS12 SCAMP5 CAMKK2 RAB35 EFR3BABCD1 AKIRIN2 ARF3 GFPT1 GTPBP2 TBC1D10B PRKACACIC PPP1R9B PIP5K1C AXIN1 ADPRHL2 ADAMTS10 PCYT2 PDE4A FAM84A SLC39A8 MEX3D DLG4 TPD52L3 CPEB3 TRIM8 GSE1 KCNC2 JAG1 SEMA6D SLC6A2PTPRA CACUL1SEH1L WDR43 METAP1 CREBL2 MOB1B SCO1 WDR3 PAK1IP1 LRRC57 IDH3A EIF2S2 SEZ6L2 POU2F2 CD3EAP GSK3A NDC1 DHX33 LTBR TMEM151A CDC6 GINS3 MCM10 CHEK1 FEN1 SEC24A PATL1 SEC23A HSPA13 NRAS KPNA4 ARFGEF1 SLAIN2 PHTF2 ATP6V1A MAPRE1 FOPNL RAB14 PAPOLA GTF2I FNDC3A NUCKS1 GMFB SLC39A9 ABHD17B LIN7C ACTR2 RAB6A PRPF4B EIF5TBL1XR1 CBFB OSER1 GABPB1 ZNF655 UBQLN1 TMEM33 EIF4G2 CSNK1G3 RNF38 SRPRA DDX3X BZW1 PTPN11 ARL8B FBXW11 NAA50 TOR1AIP1 CAPRIN1 MAPK1 VPS35 UFM1 SRSF1 MBNL1 ATL2 RNF6 SCYL2 STX16 GNA13 CTDSPL2 USP38 TRA2B ZNF706 SET VAPA CUL4B TRUB1 BRD4 HNRNPA0 CREB1 AGPS TM9SF3 UBE2K PAFAH1B1 MAP3K2 MFN1 KBTBD2 REST C2orf49 SRSF2 AEBP2 CMTR2 PNN SPOPL SYNCRIP PPTC7 BNIP2 PPP2R5E VIRMA RAB18 FGFR1OP2 FMR1 ATF1 FAM76B CBLL1 CEP120 ATF2 EXOC5 RAB28 TNKS2 BLZF1 ZNF644 EDEM3 UBR5 LARP4 THUMPD1 GABPA SNX13 CCPG1 MTMR6 SP3 UBR3 CHUK SPTY2D1 USP47 USP32 LENG8 CCDC43 TADA2B ARID1B CPSF7 HECTD4 SSFA2 SEC16A FAM177A1 FBXO45 KMT2D BMPR2 DCAF7 CAPN1 RPRD2 BICD2 BMT2 ZNF780A AP1G1 RPS6KB1 TATDN3 SMURF2 VEZF1 SAMD4B TXLNG FBXL5 ARID1A KANSL3 CPOX MAP4K5 SRP54 KIAA0232 SPOP PCGF3 CEP97 STK35 BNIP3L ZCCHC3 TSC1 RMND5A ZBTB41 SENP7 SELENOI FBXO28 TRIM23 SLC30A7 SYNJ1 STRN IKZF5 FAM126B SFPQ TNPO1 SPAG9 C5orf24 MIGA1 DNAJC3 ATF6 TOR1AIP2 MIER1 KDM6A YTHDF2 RHOT1 NCOA6 LNPK SPRTN HNRNPA2B1 COL4A3BP BROX ARIH1 UQCC1 ZBTB43 USP10 ZBTB6 STX6 SEL1L DCUN1D1 CDC73 FAM199X SOS2 MIER3 FAM76A MED13 RYBP COQ10B PUM1 TAF4 CCDC93 KIF2A SOCS5 WDR26 SMARCC2 KDM5A SRRM1 BAZ1A ZFR PCYOX1 KLHL12 USP9X PJA2 UBXN7 ABCB10 MED14 TADA1 TSNAX STRN3 TMEM87B SMAD5 KLHL28 ZYG11B FRS2 ZNF654 SEC22C SPIN1 INO80D PIK3CA RASL10B ADARB1 MEX3A LTBP3 UBE2J2 MLLT1 ELAVL3 RPRD1B SIRT5 ORAI2ARHGAP23 SPTBN4 PSMB2 PTMS SNRPF SLC35E2B VSX1 DCBLD2 TFRC ENY2 LANCL2 CDH15ACAA2 MGAT5B PCNP RANBP9 CGGBP1 ADAM10 EML4 AZIN1 NUDCD1 BBOF1 STX7 NACC1 RANBP1 PCMT1 GRSF1 MAPK6 UBE2G1 COPS2 MAP2K4 TBC1D23 CNBP UBE2D2 RHEB ARMC1 TCAIM KCMF1 UTP15 SLC35A5 ABCE1 ZNF267 EIF2S1 MARCH5 NCK1 ETF1 SERF2 PPP2CA QTRT2 RABEP1 SRSF3 ADO NAB1 TAB2 REEP5 ARFIP1 CNOT8 NAA30 CAB39 HIPK3 TRAPPC2 LARP1 CHD8 FAM222B TOB1 C6orf62 PIP4K2A CSNK1G2 MARK2 YIPF6 RAP2A RAB5B EDEM1 DENND4A PEX5 ATXN2 GPD2 DNAJB9 MAP3K11 PPP1R11 MGRN1 MLXIP FAM234A HRAS MPRIP NAA60 TRAFD1 DGKZ TLK2 MCFD2 GORASP2 LARP4B HNRNPU C6orf106 PPP3R1 MOCS3 ARCN1 SUPT6H PHF12 CNOT3 SDC3 PLXNA1 RNF146 DUSP11 CSDE1 FAM149B1 NIP7 PPP4R1 ARAP2 PARG EIF4E2 MTCH2 COQ3 MRPL51 CDK4 PRMT1 PDSS2 ORC4 PSMC4 VPS37A ERI1 FAM91A1 LSM6 BTF3L4 RTN4IP1 ARSK ASB1 CNOT7 FLT4 MAK16 ANAPC10 MED28 ERCC4 RAP2C ANKRD46 AGGF1 KRCC1 RNF141 MBLAC2 RP2 SEC63 ATP11C HSP90B1 LAMTOR3 PRKAA1 DCP2 PRKCI MYNN TFAM ZKSCAN8 COQ5 PHF20L1 ZNF763 COA3 COX14 PNKD EFR3A CNPY2 DYNLL1 KCTD9 THAP1 SLC35B3 PI4K2B NFYB NRBF2 YWHAQ BBS10 GOLPH3 ACSL4 UBE2W ZNF22 PEX3 CAPZA1 MYO9A DIP2B CAAP1 WDR7 PHF21A ASXL1 RALGPS2 CYLD ZFC3H1 KLHL24 NAA16 TPP2 PRDM10 ARIH2 SNRK MYCBP2 CBL CCDC88A ANKRD12 TAF5L RALBP1 RBM41 ELK4 COCH PAPD7 NSUN2 CDC42BPA USP34 GSPT1 ATF7IP MBTD1NAA25 MSANTD2 EBLN2 CDK13 STAG1 ITPKB RAP1B NF1 PPP6R3 BTG1 HAUS6 ORMDL3 RNF11 CTCF CCP110 ZC3H4 MIEN1 NFAT5 AMMECR1L RSBN1 LTN1 RAB11FIP2 SIKE1 SUCO STXBP3 SEC24B NAA15 ZNF207 FBXO33 ZNF148 PPM1A BTBD7 CUL3 ZNF12 GPBP1 CDK17 PANK3 PRPF40A EPG5 ZBTB21 PHAX KDM2A USP37 ESCO1 BAZ2A PRDM2 ZFP91 GAPVD1 BRD2 ATXN7L3B HIPK1 YPEL5 DESI2 UBE4A MARCH6 SMG1 WDR47 SPTBN1 DKC1 SIAH1 MLEC AGO3 LPGAT1 BRWD3 MED1 GPATCH8 CWC25 ITSN2 WDR37 JMY ATRX MDM4 UHRF2 YLPM1 SBNO1 ANKRD17 ATG14 HIF1AN KPNB1 TAOK1 PAN3 KMT2A SRSF5 BTAF1 UPF2 SCAF11 LUC7L3 ESF1 EZH1 KHDC4 ZRANB2 PAXBP1 AFF4 TBC1D15 BRWD1 POGZ RBM26 KAT6B ARGLU1 ING3 ZCCHC8 OGT SP4 SRSF6 UBE2D3 UBE2B DLG1 TRA2A STK4 CUL5 KIF5B ABHD13 RBM39 BIRC6 CPSF6 MYSM1 EEA1 EPM2AIP1 NR2C2 TNRC6A CHD1 XRN1 LCOR KMT2E ILF3 NR1D2 RICTOR ZNF292 ARID4B FBXO11 YTHDC1 SAMD8 AGO2 ARID4A CREBRF PIKFYVE NUFIP2 RIF1 USP42 SRSF11 RSRC2 ATXN7 UBA6 RALGAPB PPM1B MGA FNBP4 CNOT2 ASB7 PHC3 PHIP SREK1 BBX U2SURP RBMX PUM2 HNRNPA3 RNF111 SETX TARDBP RAPGEF6 PPP1R10 ELMSAN1 CREBBP WNK1 ASH1L KMT2C TMEM87A LUZP1 FMNL2 ATXN1L SEPT7 BCOR ZNF609 SNX27 AKAP13 EPC2 RCOR3 NIPBL WASL FAF2 HNRNPUL2 MACO1 AGO1 ELF1 PDPK1 EP300 SIN3A WASF2 CDK12 GCN1 PRR14L DDX6 TULP4 MSL2 TOB2 DYNC1LI2 PGAP3 HECTD1 CHD4 THRAP3 KAT6A DYRK1A USF3 BOD1L1 PHF20 PCDHGA10 MAPK8 NSF PCDHGA3 PCDHGC3 PCDHGA1 NT5C2 RALGAPA2 PNRC1 RALYL UBQLN4 DMC1 TMEM50B ZC3H7B UACA NUDT4 SOGA1 SPN FLT1 RREB1 ZNF202 PRR11 PCDHGA8 PGF ZNF440 MPZL1 ZXDC GGNBP2 PDE4C TACC1 ANKRD28 DLGAP4 OPA1 TASP1 SLF2 BTRC MSI2 PLEKHM3 RASGEF1B TSC22D2 HERC3 EIF4G3 CUL1 JARID2 LRCH1 AP3M1AKAP11 RALA FAM220A SUN1 KLHL15 SMNDC1 RLIM DPP8 STAG2 ACSL3 MITD1 OXR1 ZMYM4 KRIT1 SERINC3 CCNC ZKSCAN1 VGLL4 CUL4A STAM ANKRD13A TMEM161B PAPD4 TMEM106B CTBS CSTF3 VWA8 ZNF507 C6orf89 VPS36 C18orf25 WBP1L DPY19L3 NHLRC3 TIAL1 PHF6 PGRMC2 HDAC2 PCMTD1 DPY19L4 DOCK7 XRCC5 CCDC126 PIGA VAPB TMCC1 PPIP5K2 RNF138 CLCN3 ZFP62 STK3 ZNF302 ORC3 ZNF439 ZNF322 TNKS KMT5B TMEM260 PCNX1 SHPRH PPP1R15B GSK3B ZZZ3 TIA1 CCDC91 TMEM248 KIAA1468 ZNF189 RFX3 TAP1 PIAS2 FKTN RSBN1L AP4E1 SNX1 PSME4 NUTF2 RAP1A SOS1 TFDP2 ABI2 CLASP2 GIGYF2 KATNBL1 ZFAND6 RBM15 GXYLT1 RAB21 PDS5B FBXL3 NPAT ZBTB44 GPATCH2L PRKRA DENND1B USP24 CTNNB1 CHD6 CREBZF XPO1 ARID2 TRPM7 PURA STX17 DTX3L DCAF10 VPS4B SMAD2 SMARCAD1 VPS13A ZDHHC21 PDCD6IP TRAPPC8 IRF2 BIRC2 GPBP1L1 PPHLN1 GDAP2 ZNF518A MED13L SCAI RFX7 SENP6
Fig. 5Gene co-expression network analysis. Node size corresponds to the number of connected edges. The gene name is displayed for nodes with >25 edges. Node color corresponds to which diseases the gene is associated: AD (orange), VaD (blue), DLB (green), AD and VaD (pink), AD and DLB (yellow), VaD and DLB (purple), and all three (red). AD Alzheimer’s disease, VaD vascular dementia, DLB dementia with Lewy bodies
Table 3 Validation using the prospective cohort
Prospective cohort
Conversion MCI to AD MCI to MCI Total Prediction MCI to AD 10 18 28
MCI to MCI 0 4 4
Total 10 22 32
miRNA expression. Serum samples were isolated from whole blood following the standard operating procedure of NCGG Biobank. In brief, blood samples tubes were gently inverted a few times, put in an upright-position for at least 30 min to clot, and then centrifuged for 15 min at 3500 rpm at 4 °C. After centrifugation, serum was transferred to storage tubes containing 500μl per tube and immediately stored in−80 °C freezers. Total RNA was extracted from a 300μl serum sample using a 3D‐Gene RNA extraction reagent from a liquid sample kit (Toray Indus-tries, Inc.), as previously described34. Comprehensive miRNA expression analysis
was performed using a 3D‐Gene miRNA Labeling kit and a 3D‐Gene Human miRNA Oligo Chip (Toray Industries, Inc.), which was designed to detect 2562 miRNA sequences registered in miRBase release 21 (http://www.mirbase.org/).
The normalization of miRNA expression was performed by the following steps. Mean and standard deviation (SD) were calculated using a set of pre-selected negative control signals (background signals), the top and bottom 5% of which were removed. Signal values greater than mean+2 SD of the background signals were replaced using log2(signal–mean) and labeled effective signals. The remaining signal values were replaced by the minimum of the effective signals–0.1. Undetected signal values were replaced by the average signal of each miRNA signal. To normalize the signals across different microarrays, a set of pre-selected internal control miRNAs (miR-149-3p, miR-2861 and miR-4463), which had been stably detected in more than 500 serum samples, was used. Each miRNA signal value was standardized with the ratio of the average signal of the three internal control miRNA signals34.
Risk prediction model construction. We calculated thez-value corresponding to the miRNA in the logistic regression model in each disease (AD, VaD and DLB, Fig.1) in the following way:
logitð Þ ¼Pi α0þα1´Sexiþα2´Ageiþα3´APOEiþα4´miRNAi; Thez-value was the regression coefficient divided by its standard error. The cutoff value,T, of thez-value, andn, the number of miRNAs (n=1,…, 2562), was pre-selected (Fig.1). Next, the PCA was performed using the pre-selected miRNAs. The risk prediction models were constructed based on the combination of the miRNAs and PC scores as defined by Fig.1:
logitð Þ ¼Pi β1´PC1þ¼þβm´PCm;
where PCi=l1×x1+…+ln×xn, andxjis the normalized expression value of miRNAj. These calculations were iteratively performed for all combinations of cutoff values (T=0.1, 0.2,…, 5.0) and the top PC scores (m=1,…,10) (Fig.1). This optimal parameter set (T, m) was determined in the discovery cohort using 10-fold cross-validation. The regression method used in this study was conducted using theglmnetpackage in the statistical software R60.
Evaluation of risk prediction models. All data were strictly separated into the discovery cohort and validation cohort. An optimal parameter set (T, m) was detected using 10-fold cross-validation in the discovery cohort with respect to each disease (AD, VaD and DLB). Final models were constructed with the optimal parameter sets using the complete discovery cohort. The adjusted models were evaluated on an independent validation cohort. The receiver operator characteristic (ROC) curves61on the validation cohort and the AUC were used as the
dis-criminative accuracy of the risk prediction models. In order to further apply these final risk prediction models to prospective cohort data, we calculated prognostic index in each sample as defined by:
prognostic index¼X i
βi´PCi;
whereβis the estimated regression coefficient of each PC score using a supervised PCA logistic regression method in the discovery cohort. These optimal prognostic indices were determined using a maximum average sensitivity and specificity of the ROC curve in the discovery cohort.
Target gene annotation of miRNAs. The functional gene annotation of miRNAs was conducted using miRDB, which includes predicted gene targets regulated by a comprehensive 6709 miRNAs24. All the gene targets have a prediction score in the
range between 0 and 100 assigned by MirTarget V3, with a higher score repre-senting more statistical confidence in the prediction result. Only gene targets with the score of >90 were used as functional gene annotation for our analysis.
Gene co-expression network analysis. COXPRESdb25provides gene
co-expression relationships for 11 animal species (human, mouse, rat, monkey, dog, chicken, zebrafish,fly, nematode, budding yeast andfission yeast). For all gene pairs, Pearson’s correlation coefficients were calculated, and these values were transferred to the Mutual Rank (MR) value62, which is the geometric average of
asymmetric ranks in co-expressed gene lists. In this study, gene pairs with a MR < 20 and Pearson’s correlation coefficients > 0.4 in human were used as co-expression genes. The gene co-expression network was generated using Cytoscape v3.5.126.
Code availability. We used open source program languages R (version 3.4.1), Ruby (version 2.4.0) and Python (version 3.5.1) to analyze data and create plots. Code is available upon request from the corresponding authors.
Reporting summary. Further information on experimental design is available in the Nature Research Reporting Summary linked to this article.
Data availability
All microarray data (2562 miRNAs) of this study are publicly available through the Gene Expression Omnibus (GEO) database at the National Center for Biotechnology Infor-mation (NCBI) and accessible through GEO series accession numberGSE120584. Datasets generated during the current study are available from the corresponding author on reasonable request.
Received: 21 May 2018 Accepted: 24 January 2019
References
1. Robinson, L., Tang, E. & Taylor, J. P. Dementia: timely diagnosis and early intervention.BMJ350, h3029 (2015).
2. Haan, M. N. & Wallace, R. Can dementia be prevented? Brain aging in a population-based context.Annu. Rev. Public Health25, 1–24 (2004). 3. Kim, D. H. et al. Genetic markers for diagnosis and pathogenesis of
Alzheimer’s disease.Gene545, 185–193 (2014).
4. Spires-Jones, T. L. & Hyman, B. T. The intersection of amyloid beta and tau at synapses in Alzheimer’s disease.Neuron82, 756–771 (2014). 5. Sheinerman, K. S. et al. Plasma microRNA biomarkers for detection of mild
cognitive impairment.Aging4, 590–605 (2012).
6. Fagan, A. M. et al. Comparison of analytical platforms for cerebrospinalfluid measures of beta-amyloid 1–42, total tau, and p-tau181 for identifying Alzheimer disease amyloid plaque pathology.Arch. Neurol.68, 1137–1144 (2011). 7. De Meyer, G. et al. Diagnosis-independent Alzheimer disease biomarker
signature in cognitively normal elderly people.Arch. Neurol.67, 949–956 (2010).
8. Mistur, R. et al. Current challenges for the early detection of Alzheimer’s disease: brain imaging and CSF studies.J. Clin. Neurol.5, 153–166 (2009). 9. Miller, G. Alzheimer’s biomarker initiative hits its stride.Science326, 386–389
(2009).
10. Schmand, B., Eikelenboom, P., van Gool, W. A. & Alzheimer’s Disease Neuroimaging Initiative. Value of neuropsychological tests, neuroimaging, and biomarkers for diagnosing Alzheimer’s disease in younger and older age cohorts.J. Am. Geriatr. Soc.59, 1705–1710 (2011).
11. Zheng, K., Li, H., Huang, H. & Qiu, M. MicroRNAs and glial cell development.Neuroscientist18, 114–118 (2012).
12. Satoh, J. MicroRNAs and their therapeutic potential for human diseases: aberrant microRNA expression in Alzheimer’s disease brains.J. Pharmacol. Sci.114, 269–275 (2010).
13. Cogswell, J. P. et al. Identification of miRNA changes in Alzheimer’s disease brain and CSF yields putative biomarkers and insights into disease pathways. J. Alzheimers Dis.14, 27–41 (2008).
14. Tacutu, R., Budovsky, A., Yanai, H. & Fraifeld, V. E. Molecular links between cellular senescence, longevity and age-related diseases - a systems biology perspective.Aging3, 1178–1191 (2011).
15. Femminella, G. D., Ferrara, N. & Rengo, G. The emerging role of microRNAs in Alzheimer’s disease.Front. Physiol.6, 40 (2015).
16. Bair, E. & Tibshirani, R. Semi-supervised methods to predict patient survival from gene expression data.PLoS Biol.2, E108 (2004).
17. Kooperberg, C., LeBlanc, M. & Obenchain, V. Risk prediction using genome-wide association studies.Genet. Epidemiol.34, 643–652 (2010).
18. Shigemizu, D. et al. The construction of risk prediction models using GWAS data and its application to a type 2 diabetes prospective cohort.PLoS One9, e92549 (2014).
19. Liang, Y. et al. Cancer survival analysis using semi-supervised learning method based on Cox and AFT models with L1/2 regularization.BMC Med. Genom.9, 11 (2016).
20. Wu, H. Z. et al. Circulating microRNAs as biomarkers of Alzheimer’s disease: a systematic review.J. Alzheimers Dis.49, 755–766 (2016).
21. Heneghan, H. M., Miller, N., Kelly, R., Newell, J. & Kerin, M. J. Systemic miRNA-195 differentiates breast cancer from other malignancies and is a potential biomarker for detecting noninvasive and early stage disease. Oncologist15, 673–682 (2010).
22. Asaga, S. et al. Direct serum assay for microRNA-21 concentrations in early and advanced breast cancer.Clin. Chem.57, 84–91 (2011).
23. Roth, C. et al. Circulating microRNAs as blood-based markers for patients with primary and metastatic breast cancer.Breast Cancer Res.12, R90 (2010).
24. Wong, N. & Wang, X. miRDB: an online resource for microRNA target prediction and functional annotations.Nucleic Acids Res.43, D146–D152 (2015).
25. Okamura, Y. et al. COXPRESdb in 2015: coexpression database for animal species by DNA-microarray and RNAseq-based expression data with multiple quality assessment systems.Nucleic Acids Res.43, D82–D86 (2015). 26. Shannon, P. et al. Cytoscape: a software environment for integrated models of
biomolecular interaction networks.Genome Res.13, 2498–2504 (2003). 27. Consortium, G. T. The Genotype-Tissue Expression (GTEx) project.Nat.
Genet.45, 580–585 (2013).
28. Consortium, G. T. Human genomics. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans.Science348, 648–660 (2015).
29. Fang, C. et al. Serum microRNAs are promising novel biomarkers for diffuse large B cell lymphoma.Ann. Hematol.91, 553–559 (2012).
30. Mitchell, P. S. et al. Circulating microRNAs as stable blood-based markers for cancer detection.Proc. Natl. Acad. Sci. USA105, 10513–10518 (2008). 31. Mizuno, H. et al. Identification of muscle-specific microRNAs in serum of
muscular dystrophy animal models: promising novel blood-based markers for muscular dystrophy.PLoS ONE6, e18388 (2011).
32. Maes, O. C., Chertkow, H. M., Wang, E. & Schipper, H. M. MicroRNA: implications for Alzheimer disease and other human CNS disorders.Curr. Genom.10, 154–168 (2009).
33. Zhu, W., Qin, W., Atasoy, U. & Sauter, E. R. Circulating microRNAs in breast cancer and healthy subjects.BMC Res. Notes2, 89 (2009).
34. Shimomura, A. et al. Novel combination of serum microRNA for detecting breast cancer in the early stage.Cancer Sci.107, 326–334 (2016).
35. Geekiyanage, H., Jicha, G. A., Nelson, P. T. & Chan, C. Blood serum miRNA: non-invasive biomarkers for Alzheimer’s disease.Exp. Neurol.235, 491–496 (2012).
36. Bekris, L. M. et al. MicroRNA in Alzheimer’s disease: an exploratory study in brain, cerebrospinalfluid and plasma.Biomarkers18, 455–466 (2013). 37. Kumar, P. et al. Circulating miRNA biomarkers for Alzheimer’s disease.
PLoS One8, e69807 (2013).
38. Kiko, T. et al. MicroRNAs in plasma and cerebrospinalfluid as potential markers for Alzheimer’s disease.J. Alzheimers Dis.39, 253–259 (2014). 39. Sorensen, S. S., Nygaard, A. B. & Christensen, T. miRNA expression profiles in
cerebrospinalfluid and blood of patients with Alzheimer’s disease and other types of dementia - an exploratory study.Transl. Neurodegener.5, 6 (2016). 40. Shigemizu, D. et al. The prediction models for postoperative overall survival
and disease-free survival in patients with breast cancer.Cancer Med.6, 1627–1638 (2017).
41. Jun, G. et al. Genome-wide scan suggested novel Alzheimer’s disease susceptibility genes by factoring influence of APOE.J. Alzheimer’s Assoc.7, S187 (2011).
42. Bol, G. M. et al. Expression of the RNA helicase DDX3 and the hypoxia response in breast cancer.PLoS One8, e63548 (2013).
43. Wu, D. W. et al. DDX3 loss by p53 inactivation promotes tumor malignancy via the MDM2/Slug/E-cadherin pathway and poor patient outcome in non-small-cell lung cancer.Oncogene33, 1515–1526 (2014).
44. Sun, M., Song, L., Zhou, T., Gillespie, G. Y. & Jope, R. S. The role of DDX3 in regulating Snail.Biochim. Biophys. Acta1813, 438–447 (2011).
45. Hueng, D. Y. et al. DDX3X biomarker correlates with poor survival in human gliomas.Int. J. Mol. Sci.16, 15578–15591 (2015).
46. Shi, H. et al. YTHDF3 facilitates translation and decay of N(6)-methyladenosine-modified RNA.Cell Res.27, 315–328 (2017). 47. Cui, Q. et al. m(6)A RNA methylation regulates the self-renewal and
tumorigenesis of glioblastoma stem cells.Cell Rep.18, 2622–2634 (2017). 48. Sanchez-Valle, J. et al. A molecular hypothesis to explain direct and inverse
co-morbidities between Alzheimer’s disease, glioblastoma and lung cancer.Sci. Rep.7, 4474 (2017).
49. Lehrer, S. Glioblastoma and dementia may share a common cause.Med. Hypotheses75, 67–68 (2010).
50. Driver, J. A. et al. Inverse association between cancer and Alzheimer’s disease: results from the Framingham Heart Study.BMJ344, e1442 (2012). 51. Musicco, M. et al. Inverse occurrence of cancer and Alzheimer disease: a
population-based incidence study.Neurology81, 322–328 (2013). 52. Kosunen, O. et al. Diagnostic accuracy of Alzheimer’s disease: a
neuropathological study.Acta Neuropathol.91, 185–193 (1996). 53. Dubois, B. et al. Research criteria for the diagnosis of Alzheimer’s disease:
revising the NINCDS-ADRDA criteria.Lancet Neurol.6, 734–746 (2007).
54. Kapasi, A., DeCarli, C. & Schneider, J. A. Impact of multiple pathologies on the threshold for clinically overt dementia.Acta Neuropathol.134, 171–186 (2017).
55. McKhann, G. M. et al. The diagnosis of dementia due to Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement.7, 263–269 (2011).
56. Albert, M. S. et al. The diagnosis of mild cognitive impairment due to Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Aging-Alzheimer’s disease.Alzheimers Dement.7, 270–279 (2011).
57. Roman, G. C. et al. Vascular dementia: diagnostic criteria for research studies. Report of the NINDS-AIREN International Workshop.Neurology43, 250–260 (1993).
58. McKeith, I. G. et al. Diagnosis and management of dementia with Lewy bodies: fourth consensus report of the DLB Consortium.Neurology89, 88–100 (2017).
59. Kawai, Y. et al. Neuropsychological differentiation between Alzheimer’s disease and dementia with Lewy bodies in a memory clinic.Psychogeriatrics 13, 157–163 (2013).
60. R Development Core Team.R: A Language and Environment for Statistical Computing(R Foundation for Statistical Computing, Vienna, 2009). 61. Sing, T., Sander, O., Beerenwinkel, N. & Lengauer, T. ROCR: visualizing
classifier performance in R.Bioinformatics21, 3940–3941 (2005).
62. Obayashi, T. & Kinoshita, K. Rank of correlation coefficient as a comparable measure for biological significance of gene coexpression.DNA Res.16, 249–260 (2009).
Acknowledgements
We thank NCGG Biobank for providing the study materials, clinical information and technical support. This study was supported by the“Development of Diagnostic Tech-nology for Detection of miRNA in Body Fluids”grant from the Japan Agency for Medical Research and Development and New Energy and Industrial Technology Development Organization (to S.N., Grant Number JP17ae0101013).
Author contributions
D.S. and S.A. developed the method and performed the analyses; K.M., M.I., H.S. and S.T. performed the experiments of miRNA expression; Y.A., K.A.B., A.S. and T.T. pro-vided the technical assistance; T.S. contributed to data acquisition and the analyses; D.S., T. O., K.O. and S.N. wrote the manuscript and organized this work. All authors con-tributed to and approved thefinal manuscript.
Additional information
Supplementary informationaccompanies this paper at
https://doi.org/10.1038/s42003-019-0324-7.
Competing interests:The authors declare no competing interests.
Reprints and permissioninformation is available online athttp://npg.nature.com/ reprintsandpermissions/
Publisher’s note:Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visithttp://creativecommons.org/
licenses/by/4.0/.