CHAPTER 7: MUTATION SCREENING APPROACH 2: NGS SOLiD SEQUENCING OF
7.2.3 Preparation of LR-PCR products for next generation sequencing
7.2.3.2 SOLiD4 barcoded fragment library preparation and sequencing
Sequencing libraries were prepared by the service provider (Life Technologies, Melbourne, Australia) as the first step in adapting the samples (amplicons in two 96-well plates) for sequencing by oligonucleotide ligation and detection (SOLiD, Life Technologies). The samples comprised fourteen amplicons from each of the eight sheep, which were pooled in equimolar ratios into two pools of seven non-overlapping amplicons (Table 7.4). Each pool represents a library, so there were two libraries for each animal.
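Equimolar pooling means that each amplicon contributes the same number of molecules, so longer amplicons need proportionally more mass. The following is a minimal sketch of that arithmetic, assuming double-stranded DNA at roughly 660 g/mol per base pair. The amplicon names, lengths and concentrations are hypothetical and are not values from this study.

```python
# Average molar mass of one base pair of double-stranded DNA (g/mol); a common approximation.
BP_MOLAR_MASS = 660.0


def nanomolar(conc_ng_per_ul, length_bp):
    """Convert a dsDNA concentration in ng/uL to nmol/L for a fragment of length_bp."""
    return conc_ng_per_ul * 1e6 / (BP_MOLAR_MASS * length_bp)


def equimolar_volumes(amplicons, fmol_each):
    """Return the volume (uL) of each amplicon that contains fmol_each femtomoles."""
    volumes = {}
    for name, (conc_ng_per_ul, length_bp) in amplicons.items():
        # 1 nmol/L equals 1 fmol/uL, so volume = amount / concentration.
        volumes[name] = fmol_each / nanomolar(conc_ng_per_ul, length_bp)
    return volumes


# Hypothetical amplicons: name -> (concentration in ng/uL, length in bp)
example = {"amp_A": (66.0, 10000), "amp_B": (33.0, 5000)}
volumes = equimolar_volumes(example, fmol_each=100.0)
```

In this example both amplicons have the same molar concentration even though their mass concentrations differ twofold, so equal volumes are pooled.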
Barcoded fragment libraries were used in this study. Each library is tagged with a barcode, a unique sequence of 5 to 10 bases on one of the adaptors, to enable multiplexed sequencing in which multiple samples are run simultaneously in a single sequencing run.
The barcoded fragment libraries were prepared from the amplicon pools using the SOLiD4 fragment library barcoding kit (Applied Biosystems, USA) and the SOLiD fragment library construction kit (Applied Biosystems, USA). The preparation workflow and a schematic diagram of a typical barcoded fragment are shown in Figures 7.2 and 7.3, respectively.
Figure 7.2: Preparation workflow for the barcoded fragment libraries (adapted from Applied Biosystems SOLiD4 system library preparation guide, April 2010).
The construction of the barcoded fragment libraries is summarised as follows:
1. Each pool of seven amplicons was sheared into small DNA fragments with a mean fragment size of 165 bp and a fragment size range of 150 to 180 bp, using the Covaris S2 System.
2. DNA ends of the sheared DNA fragments were end-repaired with End Polishing Enzymes 1 and 2 and purified with the PureLink PCR purification kit (Applied Biosystems, USA).
3. The ends of the end-repaired DNA fragments in each pool of fragmented amplicons were ligated to adaptors (a Multiplex P1 Adaptor and a Multiplex P2 Adaptor) and purified with the PureLink purification kit (Applied Biosystems, USA). The Multiplex P2 adaptors included unique barcode sequences (16 barcodes in total, one for each of the two pools of seven amplicons for each of the eight sheep), with each barcode subsequently generating a single fragment library to be multiplexed in the same sequencing run. Sequences of the Multiplex P1 and P2 adaptors used for barcoded fragment library construction are listed in Table 7.3. An example of the pool-barcode arrangement is as follows: for animal CPW156, barcode 1 was ligated to pool 1 (amplicons 1,3,5,7,9,11,13) and barcode 9 was ligated to pool 2 (amplicons 2,4,6,8,10,12,14). The barcode assignments for the remaining seven animals are shown in Table 7.4.
4. After adaptor ligation and purification, the DNA underwent nick translation and was amplified for nine cycles using primers specific to the Multiplex P1 and Multiplex P2 adaptors (Multiplex library PCR-1: CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT and Multiplex library PCR-2: CTGCCCCGGGTTCCTCATTCT). After amplification, the library was purified with a PureLink PCR purification kit (Applied Biosystems, USA).
5. The library was quantitated using the SOLiD Library TaqMan quantitation kit (PN 4449639), according to the manufacturer’s instructions described in ‘SOLiD4 system library quantitation with the SOLiD library TaqMan quantitation kit’.
6. The barcoded fragment libraries were pooled in equal molar amounts.
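The primer specificity described in step 4 can be checked directly against the adaptor sequences listed in Table 7.3. The sketch below is illustrative only: it tests for exact string matches between the primers and the adaptor strands and says nothing about annealing conditions. The variable names are not from the kit documentation.

```python
# Sequences copied from step 4 and Table 7.3 (all written 5'>3').
P1_ADAPTOR_23 = "CCTCTCTATGGGCAGTCGGTGAT"  # 23-base strand of the Multiplex P1 Adaptor
P2_BARCODE_001 = "CTGCCCCGGGTTCCTCATTCTCTGTGTAAGAGGCTGCTGTACGGCCAAGGCG"  # 52-base strand
PCR_1 = "CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT"  # Multiplex library PCR-1
PCR_2 = "CTGCCCCGGGTTCCTCATTCT"  # Multiplex library PCR-2

# The 3' end of PCR-1 is identical to the 23-base P1 adaptor strand.
pcr1_matches_p1 = PCR_1.endswith(P1_ADAPTOR_23)

# PCR-2 is identical to the 5' end of the 52-base P2 adaptor strand,
# upstream of the barcode, so it is shared by all barcoded adaptors.
pcr2_matches_p2 = P2_BARCODE_001.startswith(PCR_2)
```

Because PCR-2 lies outside the barcode, the same primer pair amplifies every barcoded library.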
Once the barcoded fragment libraries were constructed, templated bead preparation was performed following the manufacturer’s instructions (Applied Biosystems SOLiD system templated bead preparation guide, March 2010). In summary, this involved clonal amplification of each library template on SOLiD P1 DNA beads by emulsion PCR, enrichment of the templated beads and deposition of the beads onto a sequencing slide. The beads were then sequenced on the SOLiD Analyzer on a single segment of a four-quadrant slide, with a 50 bp read length expected for the SOLiD4 system. An overview of templated bead preparation and the sequencing chemistry is available at: http://www.appliedbiosystems.com/absite/us/en/home/applications-technologies/SOLiD-next-generation-sequencing/next-generation-systems/SOLiD-sequencing-chemistry.html.
Figure 7.3: Schematic diagram of a typical SOLiD4™ barcoded fragment after library construction (adapted from Applied Biosystems SOLiD4 system library preparation guide, April 2010).
Table 7.3: Multiplex P1 and P2 adaptors used for barcoded fragment library construction. Each adaptor consists of two sequences to form a double strand as shown in Fig 7.1. The bold sequence flanked by the internal adaptor sequence on the 5’ end and the P2 adaptor sequence on the 3’ end is the barcode sequence
Adaptor and barcode                      Sequence 5'>3'                                         Length
Multiplex P1 Adaptor                     ATCACCGACTGCCCATAGAGAGGTT                              25
                                         CCTCTCTATGGGCAGTCGGTGAT                                23
Multiplex P2 Adaptor with Barcode-001    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTGTGTAAGAGGCTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-002    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTAGGGAGTGGTCTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-003    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTATAGGTTATACTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-004    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTGGATGCGGTCCTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-005    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTGTGGTGTAAGCTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-006    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTGCGAGGGACACTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-007    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTGGGTTATGCCCTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-008    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTGAGCGAGGATCTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-009    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTAGGTTGCGACCTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-010    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTGCGGTAAGCTCTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-011    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTGTGCGACACGCTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-012    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTAAGAGGAAAACTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-013    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTGCGGTAAGGCCTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-014    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTGTGCGGCAGACTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-015    CGCCTTGGCCGTACAGCAG                                    19
                                         CTGCCCCGGGTTCCTCATTCTCTGAGTTGAATGCTGCTGTACGGCCAAGGCG   52
Multiplex P2 Adaptor with Barcode-016    CGCCTTGGCCGTACAGCAG                                    19
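The structure of the 52-base P2 strands in Table 7.3 can be made explicit with a short script. This is a sketch under two assumptions inferred from the table rows rather than stated in the kit documentation: that every 52-base strand starts with the same 23-base segment, and that it ends with the reverse complement of the 19-base strand, leaving a 10-base barcode in between. Only the first three barcodes are included, and the function and variable names are illustrative.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


P2_SHORT_STRAND = "CGCCTTGGCCGTACAGCAG"          # 19-base strand, identical for all barcodes
P2_PREFIX = "CTGCCCCGGGTTCCTCATTCTCT"            # shared 5' segment of the 52-base strand (assumed)
P2_SUFFIX = reverse_complement(P2_SHORT_STRAND)  # segment that pairs with the 19-base strand


def extract_barcode(long_strand):
    """Return the bases between the shared prefix and suffix of a 52-base P2 strand."""
    if not (long_strand.startswith(P2_PREFIX) and long_strand.endswith(P2_SUFFIX)):
        raise ValueError("unexpected flanking sequence")
    return long_strand[len(P2_PREFIX):len(long_strand) - len(P2_SUFFIX)]


# 52-base strands copied from Table 7.3.
long_strands = {
    "Barcode-001": "CTGCCCCGGGTTCCTCATTCTCTGTGTAAGAGGCTGCTGTACGGCCAAGGCG",
    "Barcode-002": "CTGCCCCGGGTTCCTCATTCTCTAGGGAGTGGTCTGCTGTACGGCCAAGGCG",
    "Barcode-003": "CTGCCCCGGGTTCCTCATTCTCTATAGGTTATACTGCTGTACGGCCAAGGCG",
}
barcodes = {name: extract_barcode(seq) for name, seq in long_strands.items()}
```

The same check confirms that the two strands of each adaptor are complementary over the 19-base region, which is consistent with the caption's description of a double-stranded adaptor.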
Table 7.4: Amplicon pool-barcode arrangements used in the SOLiD barcoded fragment library preparations. Each box represents an individual sheep and its two barcoded pools of seven equimolar non-overlapping amplicons.
Sheep#               Barcoded pool               Amplicons in pool
CPW156 (SH_N)        Barcode-001 (sheep1_1)*     1ar, 2ir, 3dr, 4ar, 5ir, C1cr, C3cr
                     Barcode-009 (sheep1_9)*     1hr, 2jr, 3ir, 4ir, 5fr, C2ar, C4dr
SH1022/07 (SH_A1)    Barcode-002 (sheep2_2)*     1ar, 2ir, 3dr, 4ar, 5ir, C1cr, C3cr
                     Barcode-010 (sheep2_10)*    1hr, 2jr, 3ir, 4ir, 5fr, C2ar, C4dr
SH1032/08 (SH_C1)    Barcode-003 (sheep3_3)*     1ar, 2ir, 3dr, 4ar, 5ir, C1cr, C3cr
                     Barcode-011 (sheep3_11)*    1hr, 2jr, 3ir, 4ir, 5fr, C2ar, C4dr
SH1033/08 (SH_A2)    Barcode-004 (sheep4_4)*     1ar, 2ir, 3dr, 4ar, 5ir, C1cr, C3cr
                     Barcode-012 (sheep4_12)*    1hr, 2jr, 3ir, 4ir, 5fr, C2ar, C4dr
SH1038/08 (SH_C2)    Barcode-005 (sheep5_5)*     1ar, 2ir, 3dr, 4ar, 5ir, C1cr, C3cr
                     Barcode-013 (sheep5_13)*    1hr, 2jr, 3ir, 4ir, 5fr, C2ar, C4dr
SH1039/08 (SH_A3)    Barcode-006 (sheep6_6)*     1ar, 2ir, 3dr, 4ar, 5ir, C1cr, C3cr
                     Barcode-014 (sheep6_14)*    1hr, 2jr, 3ir, 4ir, 5fr, C2ar, C4dr
L06 (M_N)            Barcode-007 (sheep7_7)*     1ar, 2ir, 3dr, 4ar, 5ir, C1cr, C3cr
                     Barcode-015 (sheep7_15)*    1hr, 2jr, 3ir, 4ir, 5fr, C2ar, C4dr
L07 (M_A)            Barcode-008 (sheep8_8)*     1ar, 2ir, 3dr, 4ar, 5ir, C1cr, C3cr
                     Barcode-016 (sheep8_16)*    1hr, 2jr, 3ir, 4ir, 5fr, C2ar, C4dr
# Parentheses following each animal’s ID indicate the breed and disease status of the sheep. CPW = Coopworth, M = Merino, SH = South Hampshire sheep; A = affected, C = carrier, N = normal
* ID used in the GBrowse genome browser (see Figure 7.6)
7.2.4 Bioinformatic analysis
Initial bioinformatic analysis and technical support for the SOLiD sequence data were provided by Drs John Davis (Applied Biosystems, Mulgrave, Melbourne) and Matthew Hobbs (University of Sydney). This involved aligning reads to the provided ovine reference sequence, identifying polymorphisms (SNPs and small indels) and visualising the annotated data in the Generic Genome Browser (GBrowse version 2.38), which is part of the Generic Model Organism Database (GMOD) suite of genome analysis software tools (Stein et al., 2002).
Samples were labelled according to their respective amplicon pools and barcodes (Table 7.4): for example, the sheep 1 (CPW156) samples were labelled ‘sheep1_1’, corresponding to the sheep 1 pool with barcode 001, and ‘sheep1_9’, corresponding to the sheep 1 pool with barcode 009.
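The labelling scheme follows a simple pattern in Table 7.4: sheep n carries barcode n on its first pool and barcode n + 8 on its second. A minimal sketch of that rule is given below; the function name and the `n_sheep` parameter are illustrative and were not part of the original analysis.

```python
def library_labels(sheep_number, n_sheep=8):
    """Return (barcode, browser ID) pairs for the two amplicon pools of one sheep."""
    labels = []
    for pool in (1, 2):
        # Pool 1 uses barcodes 1..n_sheep; pool 2 uses the next n_sheep barcodes.
        barcode = sheep_number + (pool - 1) * n_sheep
        labels.append((f"Barcode-{barcode:03d}", f"sheep{sheep_number}_{barcode}"))
    return labels


# Labels for all eight sheep, keyed by sheep number.
all_labels = {n: library_labels(n) for n in range(1, 9)}
```

For sheep 1 this yields Barcode-001 with ID sheep1_1 and Barcode-009 with ID sheep1_9, matching the example in the text.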
Preliminary bioinformatic analysis was performed by separating reads according to their respective barcodes and aligning them to the ovine reference sequence generated in Chapter 5, using the SOLiD system analysis pipeline tool. The alignment output was in BAM (binary alignment map) format, a file format that stores large numbers of nucleotide sequence alignments. Two BAM files were generated for each of the eight sheep, one per barcoded pool. Also obtained were FASTA format files of the nucleotide sequences to which reads had been mapped.
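The barcode separation step can be pictured as a simple lookup. The sketch below is not the SOLiD pipeline itself: it assumes barcodes are available in base space and in the same orientation as in Table 7.3, whereas SOLiD reports colour-space calls and the pipeline handles decoding internally. The read names are hypothetical, and only the two barcodes of sheep 1 are included.

```python
from collections import defaultdict


def demultiplex(tagged_reads, barcode_to_library):
    """Group reads by library; reads with an unknown barcode go under 'unassigned'."""
    bins = defaultdict(list)
    for barcode, read in tagged_reads:
        bins[barcode_to_library.get(barcode, "unassigned")].append(read)
    return dict(bins)


# Barcode sequences taken from the 52-base strands for Barcode-001 and Barcode-009.
barcode_to_library = {"GTGTAAGAGG": "sheep1_1", "AGGTTGCGAC": "sheep1_9"}

# Hypothetical (barcode, read name) pairs.
tagged_reads = [
    ("GTGTAAGAGG", "read_a"),
    ("AGGTTGCGAC", "read_b"),
    ("GTGTAAGAGG", "read_c"),
    ("NNNNNNNNNN", "read_d"),
]
libraries = demultiplex(tagged_reads, barcode_to_library)
```

Each resulting group corresponds to one library and therefore to one of the two BAM files produced per sheep after alignment.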