Next Generation Sequencing Core Facility
Max Planck Institute for Molecular Genetics
Berlin, Germany
Dr. Bernd Timmermann
Overview of Next Generation Sequencing platform technologies
Outline
1. Technologies
• Illumina
• Roche / 454
2. Projects and Applications
• Whole Genome Re-sequencing
• Sequence Capture
• Amplicon Sequencing
3. Outlook
Max Planck Society
• 80 institutes and research facilities
• 20,435 people
• Budget: 1,400 million euro in 2010
Max Planck Institute for Molecular Genetics
Development of Sequencing Throughput
[Figure: throughput per system (kilobases/day, logarithmic axis from 10 to 1,000,000,000) by year, 1980 to 2010. Gel-based systems are followed by capillary sequencing (first and second generation capillary sequencers) and next generation sequencing (microwell pyrosequencing, short-read sequencers).]
Modified after MR Stratton et al., Nature 458, 719-724 (2009)
Development of Sequencing Technologies
• Human Genome Project: 96 sequences in parallel
• 1000 Genomes Project: 3.2 billion sequences per run
Sequencing Capacities at the MPI-MG
• 7 x Illumina
• 3 x SOLiD
• 5 x Roche GS
• 3 x Capillary Systems
May 23rd 2012, Budapest
IT Infrastructure
• Long Read Technologies and Short Read Technologies: data volumes in the GB to TB range
• 25 x 32 (64) Compute Server with 128 (512) GB RAM
• 4 petabyte Storage Capacity
Technologies
Genome Sequencer FLX
• de novo Sequencing
• Metagenome Analyses
• Amplicon Sequencing
• Full length Transcriptome Analyses
• Sequencing of target regions
HiSeq 2000 / SOLiD
• ChipSeq
• MeDipSeq
• miRNA
• RNAseq
• Sequencing of target regions
• Whole genome resequencing
Principle Illumina Sequencing
Library Preparation
Cluster Generation
• Attachment of single molecules to surface
• Amplification to form clusters
Sequencing by Synthesis (SBS)
[Figure: strands 5' G T C A G T C A G T C A G T 3' and 5' C A G T C A T C A C C T A G C G T A during base incorporation]
• Cycle 1: Add sequencing reagents; first base incorporated; remove unincorporated bases; detect signal
• Cycle 2-n: Add sequencing reagents and repeat
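The cycle structure of sequencing by synthesis can be sketched as a loop. This is a toy model only: the function name and the complement table are illustrative, the template is listed in the order in which bases are read, and strand orientation, cluster chemistry and imaging are all ignored.

```python
# Toy model of sequencing by synthesis: one base is incorporated and imaged per cycle.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template, cycles):
    """Return the bases detected over `cycles` cycles for one template strand.

    `template` lists the template bases in the order in which they are read
    (a simplification that ignores strand orientation).
    """
    detected = []
    for position in range(min(cycles, len(template))):
        # Add sequencing reagents: one labelled base pairs with the template.
        incorporated = COMPLEMENT[template[position]]
        # Remove unincorporated bases, then detect the signal of this cycle.
        detected.append(incorporated)
    return "".join(detected)


read = sequence_by_synthesis("GTCAGTCAGTCAGT", cycles=6)
print(read)
```

The number of cycles sets the read length, which is why longer reads mean longer run times.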
Conversion of image data to DNA sequences
Reference sequence
....CGAGCGAATGAAGTCGGGAGTCGTAATGAGCCCGTAATCCCGTTAGTA....
Sequence Reads
    CGAGCGAATGAAGTCGGGAGTCCTAATGAGCCCGTA
     GAGCGAATGAAGTCGGGAGTCCTAATGAGCCCGTAA
        CGAATGAAGTCGGGAGTCCTAATGAGCCCGTAATCC
            TGAAGTCGGGAGTCCTAATGAGCCCGTAATCCCGTT
                 TCGGGAGTCCTAATGAGCCCGTAATCCCGTTAGTA
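The reference sequence and the five reads on this slide can be used for a small worked example of how overlapping reads reveal a variant. The sketch below places each read at the ungapped offset with the fewest mismatches and reports positions where the reads agree on a base that differs from the reference. It is an illustration, not the facility's pipeline; production aligners handle gaps, base qualities and whole genomes.

```python
from collections import Counter

REFERENCE = "CGAGCGAATGAAGTCGGGAGTCGTAATGAGCCCGTAATCCCGTTAGTA"
READS = [
    "CGAGCGAATGAAGTCGGGAGTCCTAATGAGCCCGTA",
    "GAGCGAATGAAGTCGGGAGTCCTAATGAGCCCGTAA",
    "CGAATGAAGTCGGGAGTCCTAATGAGCCCGTAATCC",
    "TGAAGTCGGGAGTCCTAATGAGCCCGTAATCCCGTT",
    "TCGGGAGTCCTAATGAGCCCGTAATCCCGTTAGTA",
]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_offset(read, reference):
    """Ungapped placement of the read with the fewest mismatches."""
    offsets = range(len(reference) - len(read) + 1)
    return min(offsets, key=lambda o: hamming(read, reference[o:o + len(read)]))


def call_mismatches(reads, reference):
    """Map 0-based position -> (reference base, read base, supporting reads)."""
    pileup = {}
    for read in reads:
        start = best_offset(read, reference)
        for i, base in enumerate(read):
            pileup.setdefault(start + i, Counter())[base] += 1
    variants = {}
    for pos, counts in sorted(pileup.items()):
        base, support = counts.most_common(1)[0]
        if base != reference[pos]:
            variants[pos] = (reference[pos], base, support)
    return variants


variants = call_mismatches(READS, REFERENCE)
print(variants)
```

All five reads cover the same position and carry a C where the reference has a G, which is the kind of consistent signal that separates a real variant from a sequencing error in a single read.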
Facts Illumina Sequencing (HiSeq 2000)
Input Material: ~ 1-3 µg DNA (shotgun sequencing); ~ 10 ng (ChipSeq sequencing)
Library Preparation: ~ 1.5 days
Cluster Generation: ~ 1 day
Run Time / Read Length: Single read ~ 2 days (36 b); Paired End ~ 10 days (2 x 100 b)
Data Processing: ~ 1 day
Output: Paired End ~ 500 Gb
Reads: up to 4800 million
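The output figure follows from read count times read length, and dividing it by a genome size gives the mean coverage of one run. The sketch below assumes that the 4800 million reads count both mates of each pair, and that a haploid human genome is roughly 3.1 Gb; both values are assumptions for illustration and do not come from the slide.

```python
def run_yield_gb(reads_millions, read_length_bases):
    """Sequenced bases per run in gigabases."""
    return reads_millions * 1e6 * read_length_bases / 1e9


def mean_coverage(yield_gb, genome_size_gb):
    """Average number of times each genome position is sequenced."""
    return yield_gb / genome_size_gb


HUMAN_GENOME_GB = 3.1  # assumption: approximate haploid human genome size

paired_end_yield = run_yield_gb(4800, 100)      # 4800 million reads of 100 b
coverage = mean_coverage(500, HUMAN_GENOME_GB)  # using the quoted ~500 Gb

print(f"yield: {paired_end_yield:.0f} Gb, coverage: {coverage:.0f}x")
```

Under these assumptions the computed yield of 480 Gb is consistent with the quoted ~500 Gb, and one paired-end run corresponds to well over 100-fold coverage of a single human genome, so a run is normally shared between several samples.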
454 Sequencing Instrument
1. Genome is loaded into a PicoTiter™ plate
2. Load PicoTiter™ plate into instrument
3. Load reagents in a single rack
4. Sequencing
Principle 454 Sequencing
Library Preparation
Emulsion PCR
Emulsion Breaking
Depositing DNA Beads into the PicoTiter™ Plate
Pyrosequencing
Facts 454 Sequencing
Input Material: ~ 0.5 µg DNA
Library Preparation: ~ 4 hours
Emulsion PCR: ~ 1 day
Run Time: 20 hours
Data Processing: ~ 10 hours
Output: Titanium+ 700 - 1000 Mb
Reads: Titanium+ 1,000,000 - 1,600,000
Read length: 700 - 800 bases
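The 454 figures can be cross-checked in the same way, and the step durations add up to a turnaround time. This is a minimal sketch using only the numbers listed above; treating the steps as strictly sequential with no waiting time between them is an assumption.

```python
from datetime import timedelta

READS = (1_000_000, 1_600_000)   # reads per run, lower and upper figure
READ_LENGTH = (700, 800)         # bases, lower and upper figure

low_mb = READS[0] * READ_LENGTH[0] / 1e6
high_mb = READS[1] * READ_LENGTH[1] / 1e6

# Assumption: steps run back to back without idle time.
steps = {
    "library preparation": timedelta(hours=4),
    "emulsion PCR": timedelta(days=1),
    "run": timedelta(hours=20),
    "data processing": timedelta(hours=10),
}
turnaround = sum(steps.values(), timedelta())

print(f"{low_mb:.0f} - {high_mb:.0f} Mb, turnaround {turnaround.total_seconds() / 3600:.0f} h")
```

The lower bound reproduces the quoted 700 Mb. The product of the two upper figures exceeds the quoted 1000 Mb, which suggests that the maximum read count and the maximum read length are not reached in the same run.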
Sequencing Pipeline
• Library Preparation
• Library Quantification
• Bead Enrichment
• Sequencing
Projects and Applications
Goals
• A public database of essentially all SNPs and detectable CNVs with allele frequency >1% in each of multiple human population samples
• Pioneer and evaluate methods for:
• Generating data from next-generation sequencing platforms
• Exchanging and combining data and analytical methods
• Discovering and genotyping SNPs and CNVs from nextgen data
• Imputation with and from next generation sequencing data
• 454, Illumina and AB SOLiD platforms
• Academic genome centers in US, UK, Germany, China and platform companies
(Nature 2010, Science 2010 and Nature 2011)
OncoTrack, “Methods for systematic next generation oncology biomarker development”, is an international consortium of over 60 scientists that has launched one of Europe’s largest collaborative academic-industry research projects to develop and assess novel approaches for the identification of new markers for colon cancer.
[Figure: cell lines and tissues analyzed at the level of DNA (methylation, mutations), RNA (mRNA, miRNA) and protein, feeding into sequencing and bioinformatics]
RNAseq expression profiling
• Total RNA isolation
• Quality control
• Small RNA depletion
• dsDNA generation using random hexamers
• Illumina library preparation
• Massively parallel sequencing
Mapping
Sequence Capture
• GWAS / Candidate Genes: 0.5 - 5 Mb; 385 k Array (NimbleGen) or in-solution enrichment
• Whole Exome: 35 Mb; 2.1 million Array (NimbleGen) or in-solution enrichment
Targeted Resequencing: Project outline
Work-flow
• Identification of patients
• Sample preparation
• Sequence capture
• Next-Gen sequencing
• “Bioinformatics”
• Follow-up sequencing
• Functional characterization
Principle of sequence capture
• DNA Preparation: genomic DNA, fragments (200-500 bp), ligation of adapters (figure labels A1, SP1, A2)
• Enrichment of target regions: hybridization, selection with streptavidin beads, amplification and quantification
• Sequencing
Cleft lip with or without cleft palate (CL/P)
Cooperation with M. Nöthen and E. Mangold
Epidemiology of nonsyndromic CL/P
• Prevalence among live births: ~ 1 : 1,000
• Risk for siblings: 1 : 20 - 1 : 25
• λs (sibling recurrence risk ratio): 40 - 50
Mangold E. et al. (2010), Nature Genetics
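The λs range follows directly from the two risk figures: it is the risk for siblings of affected individuals divided by the population prevalence. A short check, using exact fractions (the function name is illustrative):

```python
from fractions import Fraction


def sibling_recurrence_risk_ratio(sibling_risk, population_prevalence):
    """lambda_s = risk for siblings of affected individuals / population risk."""
    return sibling_risk / population_prevalence


prevalence = Fraction(1, 1000)
lambda_low = sibling_recurrence_risk_ratio(Fraction(1, 25), prevalence)
lambda_high = sibling_recurrence_risk_ratio(Fraction(1, 20), prevalence)

print(lambda_low, lambda_high)
```

A sibling risk of 1 : 25 to 1 : 20 against a prevalence of 1 : 1,000 gives λs between 40 and 50, matching the values on the slide.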
Cleft lip with or without cleft palate (CL/P)
Resequencing as follow up of GWAS
• 3 loci on chr 8 (640 kb), 10 (161 kb) and 17 (340 kb) in 20 affected individuals
• MID tagging and pooling of 10 samples
• Enrichment using the 2.1M NimbleGen array
• Sequencing on a Roche GS FLX system
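MID tagging lets several samples share one capture and one sequencing run, because each read starts with a sample-specific tag that is used afterwards to sort the reads. A minimal sketch of that sorting step follows; the tag sequences, tag length and sample names are hypothetical placeholders, not the MIDs used in this project, and real demultiplexers also tolerate sequencing errors within the tag.

```python
from collections import defaultdict

# Hypothetical 10-base MID tags and sample names, for illustration only.
MIDS = {"AACCGGTTAA": "sample_01", "TTGGCCAATT": "sample_02"}
MID_LENGTH = 10


def demultiplex(reads, mids, mid_length=MID_LENGTH):
    """Group pooled reads by their leading MID tag and strip the tag."""
    bins = defaultdict(list)
    for read in reads:
        tag = read[:mid_length]
        if tag in mids:
            bins[mids[tag]].append(read[mid_length:])
        else:
            bins["unassigned"].append(read)  # keep the full read for inspection
    return dict(bins)


pooled = [
    "AACCGGTTAA" + "TTGACC",
    "TTGGCCAATT" + "GGATCA",
    "AACCGGTTAA" + "CCATGA",
    "NNNNNNNNNN" + "AAAA",
]
by_sample = demultiplex(pooled, MIDS)
print(by_sample)
```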
Mapping
Cleft lip with or without cleft palate (CL/P)
Preliminary Results
• 6,726 unique variants (> 10 x coverage)
• 3,783 variants not listed in dbSNP (hg19)
• 4 coding variants
• Detection of structural variations not yet finished
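The two filters behind these counts, a minimum coverage and a lookup against known variants, can be sketched as follows. The record layout and the toy variants are hypothetical; in practice the known set would be loaded from a dbSNP release for hg19. The last lines only restate the reported counts as a fraction.

```python
MIN_COVERAGE = 10  # threshold taken from the slide (> 10 x coverage)


def filter_variants(variants, known, min_coverage=MIN_COVERAGE):
    """Return (sufficiently covered variants, those of them not in `known`)."""
    covered = [v for v in variants if v["coverage"] > min_coverage]
    novel = [v for v in covered if (v["chrom"], v["pos"], v["alt"]) not in known]
    return covered, novel


# Hypothetical toy records, for illustration only.
variants = [
    {"chrom": "chr8", "pos": 1000, "alt": "T", "coverage": 35},
    {"chrom": "chr10", "pos": 2000, "alt": "G", "coverage": 8},
    {"chrom": "chr17", "pos": 3000, "alt": "A", "coverage": 22},
]
known = {("chr8", 1000, "T")}

covered, novel = filter_variants(variants, known)

# Fraction of the reported variants that are not listed in dbSNP.
reported_novel_fraction = 3783 / 6726
print(len(covered), len(novel), f"{reported_novel_fraction:.1%}")
```

With the reported numbers, about 56% of the variants passing the coverage filter are not listed in dbSNP.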
Mutation detection pipeline quality
Concordance with Affymetrix array “Genome-Wide Human SNP Array 6.0”
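Concordance of this kind is usually the share of sites genotyped by both methods at which the two genotypes agree. A minimal sketch under that assumption; the site identifiers and genotypes are made up, and the slide does not state how the facility defines or computes the measure.

```python
def genotype_concordance(ngs_calls, array_calls):
    """Fraction of shared sites where both methods report the same genotype."""
    shared = ngs_calls.keys() & array_calls.keys()
    if not shared:
        raise ValueError("no sites genotyped by both methods")
    matches = sum(
        1 for site in shared
        # Compare alleles regardless of their order ("AG" equals "GA").
        if sorted(ngs_calls[site]) == sorted(array_calls[site])
    )
    return matches / len(shared)


# Hypothetical genotypes, for illustration only.
ngs = {"site_a": "AG", "site_b": "CC", "site_c": "GT", "site_d": "AA", "site_e": "TT"}
array = {"site_a": "GA", "site_b": "CC", "site_c": "GG", "site_d": "AA"}

concordance = genotype_concordance(ngs, array)
print(concordance)
```

Sites called by only one method (here `site_e`) are left out of the denominator, so the measure reflects agreement rather than call rate.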
Amplicon Sequencing
Aim
Detection and quantification of new and known variants
Method
Amplification and sequencing of target regions
Multiple alignments of sequences against a reference
[Figure: alignment of patient sequences against the reference]
Amplicon Sequencing
[Figure: amplicon layout with A-primer (21 bp) and B-primer (21 bp), keys, MIDs and the sequence of interest; locus-specific PCR amplification, followed by emPCR amplification and sequencing]
• Long reads required to sequence through the locus-specific primer, enable haplotyping over longer distances
• 100s to 1000s of amplicon clones sequenced simultaneously
Amplicon Sequencing: IRON Study
IRON: Interlaboratory Robustness of NGS
Hematology Focus Group
Results
• For each amplicon, the median coverage reached was 713-fold, ranging from 553-fold to 878-fold
• A total of 92 variants (44 distinct mutations and 10 SNPs) were observed
• In comparison to data available from Sanger sequencing, 454 amplicon deep-sequencing detected all mutations and SNPs that were previously known
• We here confirm in a multicenter analysis that amplicon-based deep-sequencing is technically feasible, achieves a high concordance across multiple laboratories, and therefore allows a broad and in-depth molecular characterization of hematological malignancies.
Kohlmann et al. (2011), Leukemia
Sensitivity of mutation detection as a function of tumor cell content
Querings et al. (2011), PLoS ONE
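Why tumor cell content limits sensitivity can be illustrated with a simple binomial model. The sketch assumes a heterozygous mutation in diploid tumor cells, no copy number changes and no sequencing errors, so the expected variant read fraction is half the tumor cell content. The coverage and the minimum number of variant reads are illustrative values. This is not the model used by Querings et al.

```python
from math import comb


def expected_allele_fraction(tumor_content):
    """Assumption: heterozygous mutation, diploid cells, none in normal cells."""
    return tumor_content / 2


def detection_probability(coverage, tumor_content, min_variant_reads):
    """P(at least `min_variant_reads` variant reads) under a binomial model."""
    p = expected_allele_fraction(tumor_content)
    p_below = sum(
        comb(coverage, k) * p**k * (1 - p) ** (coverage - k)
        for k in range(min_variant_reads)
    )
    return 1 - p_below


# Illustrative settings: 500-fold coverage, at least 5 variant reads required.
for content in (0.01, 0.05, 0.10, 0.50):
    prob = detection_probability(500, content, 5)
    print(f"tumor content {content:.0%}: detection probability {prob:.3f}")
```

Under these assumptions detection is nearly certain at 10% tumor cell content but unlikely at 1%, so low-purity samples call for deeper coverage or a lower read threshold.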
Outlook
• Establishment of small scale NGS systems
• Analysis of complete genomes
• Personalized medicine
Acknowledgments
Hans Lehrach
Bernhard Herrmann
Hilger Ropers
Martin Vingron
Michal Schweiger
Martin Kerick
Markus Ralser
Sequencing Facility:
Ilona Hauenschild
Sonia Paturej
Tina Moser
Ina Lehmann
Norbert Merges
Daniela Roth
Sabrina Rau
Heiner Kuhl
Sven Klages
Martin Werber
Thanks