NEXT GENERATION SEQUENCING
DNA
SANGER SEQUENCING
Flowcell
Genomic DNA
NEXT GENERATION SEQUENCING
The 4 nucleotides, labeled with fluorochromes and blocked at the 3’ end,
are added simultaneously
Sequencing primer
Labeled and blocked nucleotides
IMAGE ACQUISITION
FLUOROPHORE REMOVAL
REMOVAL OF THE 3’ BLOCK
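The cycle above (incorporation of one blocked nucleotide, imaging, removal of the fluorophore, removal of the 3’ block) can be sketched as a toy loop. This is only an illustration: the dye colours in `DYE` and the function name are assumptions, and no chemistry or error model is included.

```python
# Toy model of sequencing by synthesis (illustrative only).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Hypothetical dye assignment: one fluorophore colour per base.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}


def sequence_by_synthesis(template, cycles):
    """Return the read and the colour imaged at each cycle.

    `template` is given in the order in which the polymerase meets it.
    """
    read, colours = [], []
    for position in range(min(cycles, len(template))):
        # 1. All four labeled, 3'-blocked nucleotides are offered at once;
        #    only the complementary one is incorporated, and its 3' block
        #    prevents a second incorporation in the same cycle.
        base = COMPLEMENT[template[position]]
        # 2. Image acquisition: the colour identifies the incorporated base.
        colours.append(DYE[base])
        read.append(base)
        # 3. Fluorophore removal and 4. removal of the 3' block are implicit:
        #    the loop simply moves on to the next template position.
    return "".join(read), colours


read, colours = sequence_by_synthesis("GTCA", cycles=3)
print(read, colours)
```

One base is read per cycle, so the read length equals the number of cycles run.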
2 FLOWCELLS * 8 Lanes/Flowcell * 250 * 10^6 Clusters/Lane * 2 x 125bp/Cluster (Paired-end)
= 2 * 8 * 250 * 10^6 * 2 * 125bp
= 1 000 000 000 000 bp = 1 Tb (Terabase)
HiSeq2500
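The throughput arithmetic can be checked in a few lines. The figures are the ones quoted on the slide; the variable names are only illustrative.

```python
# Per-run throughput from the slide's figures.
flowcells = 2
lanes_per_flowcell = 8
clusters_per_lane = 250 * 10**6
reads_per_cluster = 2        # paired-end: each cluster is read from both ends
read_length_bp = 125

total_bp = (flowcells * lanes_per_flowcell * clusters_per_lane
            * reads_per_cluster * read_length_bp)
print(f"{total_bp:,} bp = {total_bp / 10**12:g} Tb")
```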
Throughput
SANGER SEQ vs. NGS
Allele #1    CAGCGACAGCAGCATTGGGAC
Allele #2    CAGCGACAGCGGCATTGGGAC
NGS Read #1  CAGCGACAGCGGCATTGGGAC
NGS Read #2  CAGCGACAGCAGCATTGGGAC
NGS Read #3  CAGCGACAGCAGCATTGGGAC
NGS Read #4  CAGCGACAGCAGCATTGGGAC
NGS Read #5  CAGCGACAGCGGCATTGGGAC
Coverage = 5
THROUGHPUT
COST PER BASE
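As a small illustration of the SANGER SEQ vs. NGS example, the sketch below counts how many of the five NGS reads support each allele at the position where the two alleles differ. It assumes every read spans that position, as drawn on the slide.

```python
from collections import Counter

allele_1 = "CAGCGACAGCAGCATTGGGAC"
allele_2 = "CAGCGACAGCGGCATTGGGAC"
reads = {
    "NGS Read #1": "CAGCGACAGCGGCATTGGGAC",
    "NGS Read #2": "CAGCGACAGCAGCATTGGGAC",
    "NGS Read #3": "CAGCGACAGCAGCATTGGGAC",
    "NGS Read #4": "CAGCGACAGCAGCATTGGGAC",
    "NGS Read #5": "CAGCGACAGCGGCATTGGGAC",
}

# 0-based position at which the two alleles differ.
variant_pos = next(i for i, (a, b) in enumerate(zip(allele_1, allele_2)) if a != b)

# Coverage = number of reads overlapping the position (all of them here).
coverage = len(reads)
base_counts = Counter(seq[variant_pos] for seq in reads.values())

print(variant_pos, coverage, dict(base_counts))
```

With NGS each read comes from a single molecule, so the two alleles appear as separate counts instead of overlapping peaks in one trace.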
DNA
  GENOMIC DNA SEQUENCING
  RESEQUENCING
  DE NOVO SEQUENCING
  ChIP-Seq
  ULTRADEEP SEQUENCING
  METHYL-SEQ
  WHOLE-EXOME SEQUENCING
RNA
  TRANSCRIPTOME SEQUENCING (RNA-SEQ)
  TAG SEQUENCING (DITAG)
  mRNA SEQUENCING
  MICRO-RNA STUDIES
WHOLE-GENOME, WHOLE-EXOME AND ULTRADEEP-SEQUENCING
[Figure: COVERAGE, WHOLE-GENOME panel]
ULTRADEEP SEQUENCING: WHEN?
[Figure: ULTRADEEP-SEQ, coverage]
SINGLE NUCLEOTIDE POLYMORPHISM
Reference: ..ACTGAATTGCTGATTGTCAAGTCTGCTAGCG..
[Figure: aligned reads for the CASE SAMPLE and the CONTROL SAMPLE, with the VARIANT position marked (labels: C, T)]
VARIANT CALLING
MUTATION, SEQUENCING ERROR OR SNP?
VarScan 2 (http://massgenomics.org/varscan)
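One way to approach the mutation / sequencing error / SNP question is to compare read counts between case and control. The sketch below is not VarScan 2's implementation: the function names and the thresholds (`min_vaf`, `germline_vaf`, `alpha`) are assumptions chosen for illustration, and the comparison uses a hand-written one-sided Fisher's exact test.

```python
from math import comb


def fisher_exact_greater(case_var, case_ref, ctrl_var, ctrl_ref):
    """One-sided Fisher's exact test: P(variant reads in case >= observed)."""
    n_case = case_var + case_ref
    n_var = case_var + ctrl_var
    total = n_case + ctrl_var + ctrl_ref
    upper = min(n_case, n_var)
    tail = sum(comb(n_var, k) * comb(total - n_var, n_case - k)
               for k in range(case_var, upper + 1))
    return tail / comb(total, n_case)


def classify(case_var, case_ref, ctrl_var, ctrl_ref,
             min_vaf=0.05, germline_vaf=0.25, alpha=0.01):
    """Label a candidate variant; assumes non-zero depth in both samples."""
    case_vaf = case_var / (case_var + case_ref)
    ctrl_vaf = ctrl_var / (ctrl_var + ctrl_ref)
    if case_vaf < min_vaf:
        return "likely sequencing error"   # too few supporting reads
    if ctrl_vaf >= germline_vaf:
        return "germline SNP"              # also present in the control
    p = fisher_exact_greater(case_var, case_ref, ctrl_var, ctrl_ref)
    return "somatic mutation" if p < alpha else "uncertain"


print(classify(20, 20, 0, 40))    # variant only in the case sample
print(classify(20, 20, 18, 22))   # variant in both samples
print(classify(1, 99, 0, 100))    # a single supporting read
```

Real callers also weigh base quality, strand bias and mapping quality, none of which are modelled here.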
[Figure: CONTROL vs. CASE reads at the candidate variant position]
COMPARATIVE EXONIC QUANTIFICATION ANALYZER
Piazza R. et al., PLoS One. 2013 Oct 4;8(10):e74825
Wilcoxon Signed-Rank test
Statistical module
Wilcoxon Signed-Rank test
Test statistic W:

$$W = \sum_{i=1}^{N_r} \left[ \operatorname{sgn}(x_{case,i} - x_{control,i}) \cdot R_i \right]$$

As the sample size increases ($N_r > 10$), the distribution of W converges to a Gaussian
distribution, so a Z-score can be used.
Estimating the error function of the normal distribution of W using the Abramowitz and Stegun approximation (equation 7.1.26):

$$\operatorname{erf}(x) \approx 1 - \left(a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4 + a_5 t^5\right) e^{-x^2}$$
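A minimal sketch of the statistical module's steps, under stated assumptions: W is taken as the sum of signed ranks of the case minus control differences, zero differences are dropped, ties get average ranks, and the Z-score uses the usual normal approximation without continuity correction. The constants are those of Abramowitz and Stegun 7.1.26, with t = 1 / (1 + p x). Function names are illustrative, not taken from the tool.

```python
import math


def signed_rank_w(case, control):
    """Return (W, N_r), with W = sum(sgn(case_i - control_i) * R_i)."""
    diffs = [c - k for c, k in zip(case, control) if c != k]  # drop zeros
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the absolute differences are tied.
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[start]])):
            end += 1
        average_rank = (start + end) / 2 + 1   # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    w = sum(math.copysign(1.0, d) * r for d, r in zip(diffs, ranks))
    return w, len(diffs)


def erf_as(x):
    """Abramowitz and Stegun 7.1.26 (absolute error about 1.5e-7)."""
    sign = -1.0 if x < 0 else 1.0              # erf is an odd function
    x = abs(x)
    p = 0.3275911
    a = (0.254829592, -0.284496736, 1.421413741, -1.453152027, 1.061405429)
    t = 1.0 / (1.0 + p * x)
    poly = sum(coef * t ** (n + 1) for n, coef in enumerate(a))
    return sign * (1.0 - poly * math.exp(-x * x))


def wilcoxon_z_test(case, control):
    """Return (W, Z, two-sided p) using the normal approximation."""
    w, n_r = signed_rank_w(case, control)
    if n_r == 0:
        return 0.0, 0.0, 1.0
    sigma = math.sqrt(n_r * (n_r + 1) * (2 * n_r + 1) / 6)
    z = w / sigma
    return w, z, 1.0 - erf_as(abs(z) / math.sqrt(2))


control = [10] * 12
case = [10 + i for i in range(1, 13)]          # every exon higher in the case
print(wilcoxon_z_test(case, control))
```

The approximation is only meant for the large-sample regime stated above (N_r > 10); for smaller samples exact tables would be needed.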
[Figure: CML-BC PATIENT: CML001BC; Chr9; Log2; CDKN2A (p16)]
[Figure: CML-BC PATIENT: CML004BC; Chr17; p53]
ANALYSIS OF ONCOGENIC FUSION PRODUCTS
FRAGMENTATION
RNA-seq – DRIVER FUSION TRANSCRIPTS IDENTIFICATION
Bridge reads
Junction reads
76bp 76bp
CCDS / REFFLAT
BCR ex14 ABL ex2
[Flowchart labels: ALIGNMENT TO HUMAN GENOME (SAM, BAM); EXOME BUILDER; EXOME DATASET; ABNORMAL PAIRS SCANNER; ABNORMAL PAIRS; HALF-MAPPED PAIRS; PUTATIVE TRANSLOCATIONS SET (PTS); PREFILTERING ALGORITHM (Genome Mapping Quality, Homology Filter, N Filter, Threshold Filter, Read Quality); FILTERED HALF-MAPPED PAIRS; FILTERED PTS]
[Flowchart labels: FILTERED PTS; FILTERED HALF-MAPPED PAIRS; JUNCTION FINDER; ALIGNMENT ALGORITHM; JUNCTION READ; JUNCTIONS LIST. Example: BCR (Ex12, Ex13, Ex14) and ABL (Ex2, Ex3, Ex4), JUNCTION Ex14 / Ex2; unknown partner shown as BCR / ???]
[Flowchart labels: JUNCTION READ; FRAME ALGORITHM; RECIPROCAL TRANSLOCATION ALGORITHM; DIRECTION ALGORITHM. Example: BCR and ABL drawn 5’ to 3’]
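The first stage of such a pipeline, scanning for abnormal pairs and applying a threshold filter, can be sketched as follows. This is a simplified illustration and not the algorithm behind the flowchart: the input format, the function name and the `min_pairs` threshold are all assumptions.

```python
from collections import Counter


def scan_abnormal_pairs(pairs, min_pairs=2):
    """Split read pairs into a putative translocation set and half-mapped pairs.

    `pairs` holds one (gene_of_mate_1, gene_of_mate_2) tuple per read pair;
    None marks a mate that did not map (hypothetical input format).
    """
    bridge_counts = Counter()
    half_mapped = []
    for gene_1, gene_2 in pairs:
        if gene_1 is None and gene_2 is None:
            continue                                  # nothing mapped
        if gene_1 is None or gene_2 is None:
            half_mapped.append((gene_1, gene_2))      # may hide a junction read
        elif gene_1 != gene_2:
            # Abnormal pair: the two mates map to different genes.
            bridge_counts[tuple(sorted((gene_1, gene_2)))] += 1
    # Threshold filter: keep gene pairs supported by enough read pairs.
    pts = {genes: n for genes, n in bridge_counts.items() if n >= min_pairs}
    return pts, half_mapped


pairs = [("BCR", "ABL"), ("ABL", "BCR"), ("BCR", "BCR"),
         ("BCR", None), ("TP53", "MYC")]
pts, half_mapped = scan_abnormal_pairs(pairs)
print(pts, half_mapped)
```

The half-mapped pairs are kept because an unmapped mate may span the junction itself, which is what a junction-finding step would then try to align.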
Fusion                 Rearrangement
AML1-ETO               t(8;21)
BCR-ABL1 p190          t(9;22)
BCR-ABL1 p210 e13a2    t(9;22)
BCR-ABL1 p210 e14a2    t(9;22)
CBFB-MYH11             inv(16)
CEP110-FGFR1           t(8;9)
EWSR1-ERG              t(21;22)
MLL-MLLT1              t(11;19)
LOW EXPRESSION
HIGH EXPRESSION
RNA-SEQ GOES DIGITAL
RNA-Seq
EXON READ
RPKM = READS PER KILOBASE PER MILLION MAPPED READS
TPM = TRANSCRIPTS PER MILLION
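Both units can be computed from per-gene read counts and transcript lengths. The sketch below uses the standard definitions; the gene names and numbers are made up for illustration.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts.values())
    return {gene: counts[gene] * 1e9 / (lengths_bp[gene] * total_reads)
            for gene in counts}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalise first, then scale to 1e6."""
    rate = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    rate_sum = sum(rate.values())
    return {gene: r * 1e6 / rate_sum for gene, r in rate.items()}


# Hypothetical example: gene_B is three times longer and gets three times the reads.
counts = {"gene_A": 100, "gene_B": 300}
lengths_bp = {"gene_A": 1000, "gene_B": 3000}

print(rpkm(counts, lengths_bp))
print(tpm(counts, lengths_bp))
```

TPM values always sum to one million within a sample, which makes proportions comparable across samples; RPKM values do not have a fixed sum.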
TOPHAT (http://tophat.cbcb.umd.edu/)
DNA
  GENOMIC DNA SEQUENCING
  RESEQUENCING
  DE NOVO SEQUENCING
  ChIP-Seq
  DEEP SEQUENCING
  METHYL-SEQ
  WHOLE-EXOME SEQUENCING
RNA
  TRANSCRIPTOME SEQUENCING (RNA-SEQ)
  TAG SEQUENCING (DITAG)
  mRNA SEQUENCING
  MICRO-RNA STUDIES