Statistical issues in the analysis of microarray data

Daniel Gerhard

Institute of Biostatistics Leibniz University of Hannover

ESNATS Summerschool, Zermatt

D. Gerhard (LUH) Analysis of microarray data 23. Sep 09 1 / 30


Table of Contents

1 Outline

2 Experimental design

3 Statistical modelling

4 Hypotheses testing

5 Gene set enrichment analysis

6 Classification


Outline

The focus is on:

Single-channel microarrays
  - One sample per array
  - Gene expression values for thousands of oligonucleotides

Identifying genes that are differentially expressed due to a treatment

Finding significantly differentially expressed genes with a given error probability

(Predicting a treatment level given the gene expression data)


Controlled experiments

Independent replications

Multiple sources of variability are present:
  - sample, array, and environmental variability, ...

Account for this variability in the experimental design through replication:
  - of arrays, samples, multiple time points, ...

Randomisation

Needed to separate treatment effects from other factors that might influence gene expression
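As a minimal sketch of randomisation in practice, the following Python snippet assigns samples to treatments at random and also shuffles the order in which the arrays are processed. The function name `randomise`, the sample identifiers, and the treatment labels are hypothetical and not taken from the slides.

```python
import random


def randomise(sample_ids, treatments=("Control", "Treatment"), seed=42):
    """Randomly assign samples to treatments and draw a random run order.

    Illustrative only: the names and the balanced allocation scheme
    are assumptions, not part of the original slides.
    """
    rng = random.Random(seed)

    # Shuffle the samples, then deal them out to the treatments in turn.
    # This gives a balanced design when the number of samples is a
    # multiple of the number of treatments.
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    assignment = {
        sample: treatments[i % len(treatments)]
        for i, sample in enumerate(shuffled)
    }

    # Independently shuffle the processing order, so that batch or
    # day effects are not confounded with the treatment.
    run_order = list(sample_ids)
    rng.shuffle(run_order)
    return assignment, run_order


samples = [f"S{i}" for i in range(1, 9)]
assignment, run_order = randomise(samples)
```

Fixing the seed makes the allocation reproducible, which helps when the design has to be documented.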


Experimental design

Planning an experiment

Multiple arrays per sample? This enables estimating array variability, but a large amount of RNA is needed.

More complex designs require a larger number of arrays and samples.

Measure covariates that are not directly of interest but might influence gene expression.

Simple classic design

2 treatments (control/treatment), multiple arrays/samples per treatment


Data structure

          Treatment A                   Treatment B                   ...
          Array 1   Array 2   Array 3   Array 4   Array 5   Array 6   ...
Gene 1    y_11      y_12      y_13      y_14      y_15      y_16      ...
Gene 2    y_21      y_22      y_23      y_24      y_25      y_26      ...
Gene 3    y_31      y_32      y_33      y_34      y_35      y_36      ...
...       ...       ...       ...       ...       ...       ...       ...
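One way to hold this layout in code is a genes-by-arrays matrix together with a vector of treatment labels, one per array. The sketch below uses NumPy with made-up values. The split of three arrays per treatment and the helper `group_means` are assumptions for illustration.

```python
import numpy as np

# Hypothetical expression values: 3 genes (rows) x 6 arrays (columns),
# so y[g, j] corresponds to y_gj in the layout above.
y = np.arange(1, 19, dtype=float).reshape(3, 6)

# One treatment label per array (assumed: 3 arrays per treatment).
treatment = np.array(["A", "A", "A", "B", "B", "B"])


def group_means(y, treatment):
    """Return the per-gene mean expression for each treatment level."""
    return {
        str(level): y[:, treatment == level].mean(axis=1)
        for level in np.unique(treatment)
    }


means = group_means(y, treatment)
# Per-gene difference in mean expression between treatments B and A.
diff = means["B"] - means["A"]
```

Keeping the labels in a separate vector means that every column-wise operation, such as a per-gene comparison, can select arrays by treatment with a boolean mask.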


Data example

Generating artificial data:
  - 2 treatments (A, B)
  - 20 arrays per treatment
  - 5000 genes per array
  - Normally distributed residuals and array effects
    (within-array sd = 1; between-array sd = 0.5)
  - 100 genes show an effect (δ = ±2)
  - 2^x transformation
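A data set with these properties could be generated roughly as follows. This is a sketch under an assumed additive model on the log2 scale: baseline + array effect + treatment effect + residual, followed by the 2^x back-transformation. The slide does not give a baseline level, which genes carry the effect, how the signs are split, or which treatment is shifted. The baseline of 8, the first 100 genes, random signs, and a shift in treatment B are therefore assumptions.

```python
import numpy as np


def simulate_microarray(n_genes=5000, n_per_group=20, n_effect=100,
                        delta=2.0, sd_within=1.0, sd_between=0.5,
                        baseline=8.0, seed=1):
    """Simulate single-channel expression data for two treatments.

    Assumed model (log2 scale):
        y_gj = baseline + a_j + effect_g * [array j is in B] + e_gj
    with a_j ~ N(0, sd_between^2) and e_gj ~ N(0, sd_within^2).
    """
    rng = np.random.default_rng(seed)
    n_arrays = 2 * n_per_group
    treatment = np.repeat(["A", "B"], n_per_group)

    # Between-array variability: one random effect per array.
    array_effect = rng.normal(0.0, sd_between, size=n_arrays)

    # Within-array variability: one residual per gene and array.
    residual = rng.normal(0.0, sd_within, size=(n_genes, n_arrays))

    # Assumption: the first n_effect genes are shifted by +delta or
    # -delta (sign chosen at random); all other genes have no effect.
    effect = np.zeros(n_genes)
    effect[:n_effect] = delta * rng.choice([-1.0, 1.0], size=n_effect)

    # Assumption: the shift applies to arrays under treatment B only.
    is_b = (treatment == "B").astype(float)
    log2_expr = (baseline + array_effect[None, :]
                 + np.outer(effect, is_b) + residual)

    # 2^x transformation back to the raw intensity scale.
    return 2.0 ** log2_expr, treatment, effect


expr, treatment, effect = simulate_microarray()
```

Taking `np.log2(expr)` recovers the additive scale on which the residuals and array effects are normally distributed.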
