3 MATERIAL & METHODS
3.16 HPV16 INTEGRATION ANALYSIS
A cohort of HPV-positive OPSCC (n=43) and control cell lines (CaSki and SiHa) was analysed to determine whether viral integration into the host genome had occurred. Integration status was assessed by direct PCR-based analysis of E2 gene integrity.
Additionally, a pilot series of OPSCC drawn from the above cohort (n=9) and the control cell lines (CaSki and SiHa) was further interrogated using a recently described target enrichment technique coupled with massively parallel sequencing.
E2 Gene Integrity Analysis
To determine the integrity of the HPV16 E2 gene, the previously modified and optimised approach described by Collins et al. was employed182. The technique utilised sets of overlapping sequence-specific primers for the E2 gene (Figure 10). Determination of integration state relies on the assumption that integration occurs exclusively within the E2 gene, so that failure to amplify one or more components of the E2 gene implies integration. Conversely, detection of all components of the E2 gene by PCR amplification reflects a presumed episomal viral state.
Figure 10: Schematic Diagram of HPV16 E2 Integrity Overlapping Primer Analysis
Location of primer sets detailed with respect to the E2 gene. Nucleotide numbers are according to the whole HPV16 genome. (Modified from Collins et al182)
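The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the component labels P1 to P5 follow the naming used in Table 23, while the function name and the data layout are assumptions made for this example and are not part of the published method.

```python
# Minimal sketch of the E2 integrity decision rule (illustrative only).
E2_COMPONENTS = ("P1", "P2", "P3", "P4", "P5")  # labels assumed from Table 23


def classify_e2_status(amplified):
    """Return (status, failed_components) for one sample.

    `amplified` maps each component label to True if a PCR product
    was seen on the gel and False if amplification failed.
    """
    not_assayed = [c for c in E2_COMPONENTS if c not in amplified]
    if not_assayed:
        raise ValueError(f"no result recorded for: {not_assayed}")
    failed = [c for c in E2_COMPONENTS if not amplified[c]]
    status = "presumed integrated" if failed else "presumed episomal"
    return status, failed


# Hypothetical gel results: every component amplified vs. loss of component 2.
all_present = {c: True for c in E2_COMPONENTS}
p2_lost = {**all_present, "P2": False}
print(classify_e2_status(all_present))
print(classify_e2_status(p2_lost))
```

Because the rule depends only on amplification failure, a sample carrying both episomal and integrated genomes would be reported as presumed episomal under this sketch.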
Briefly, 60 ng of DNA from each case was amplified using Hotstart Mastermix with 0.4 µmol/L of the appropriate primer set. Thermal cycling conditions are detailed in Table 22 and Table 23; the only alteration from the conditions described by Collins et al. is a reduction in the number of cycles from 60 to 40.
Step              Temperature (°C)   Time      No. of cycles
Taq activation    95                 5 min
Denaturation      95                 30 sec    40
Annealing         57                 60 sec    40
Extension         72                 120 sec   40
Final extension   72                 10 min
Table 22: Thermal Cycling Conditions for HPV16 E2 Whole Gene
Step              Temperature (°C)   Time      No. of cycles
Taq activation    95                 5 min
Denaturation      95                 30 sec    40
Annealing         54                 60 sec    40
Extension         72                 120 sec   40
Final extension   72                 10 min
Table 23: Thermal Cycling Conditions for HPV16 E2 Component Parts (P1 – P5)
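As a worked check on programme length, the hold times in Tables 22 and 23 can be summed as below. The function name is hypothetical, and the figure excludes ramping between temperatures, so the real run time on any instrument would be somewhat longer.

```python
# Sum of programmed hold times for a three-step PCR (ramp times excluded).
def hold_time_minutes(initial_s, cycle_steps_s, n_cycles, final_s):
    """Total hold time in minutes: initial hold + cycled steps + final hold."""
    return (initial_s + n_cycles * sum(cycle_steps_s) + final_s) / 60


# Hold times taken from Tables 22 and 23 (identical in both tables).
CYCLE_STEPS_S = (30, 60, 120)  # denaturation, annealing, extension

as_run = hold_time_minutes(5 * 60, CYCLE_STEPS_S, 40, 10 * 60)
original = hold_time_minutes(5 * 60, CYCLE_STEPS_S, 60, 10 * 60)
print(as_run, original)
```

With these hold times the 40-cycle programme comes to 155 minutes, against 225 minutes for the original 60 cycles.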
Following thermal cycling, PCR products were run on a 2% agarose gel and visualised under UV light on a UVP VisionWorks LS instrument to demonstrate product presence (or absence). Controls included DNA samples from the CaSki and SiHa cell lines. CaSki has previously been demonstrated to contain complete head-to-tail viral gene integrants193 (hence a positive control for all primer pairs), and SiHa a solitary integrant with loss of the E2 gene194 (an integration-positive control with expected failure of the primer set 2 amplicon). The negative controls were DNA derived from the known HPV16-negative cell line HBEC-3KT and DNA from the HPV-negative OPSCC (sample No. 11).
Next Generation Sequencing (NGS) Analysis
Prior to commencement of the project, options for both target sequence acquisition and sequencing were discussed with the third-party organisation chosen to undertake sample preparation and sequencing: the Centre for Genomics Research, University of Liverpool, Liverpool, UK.
Given the previous success of Depledge et al.195, target capture and library preparation were undertaken using the previously validated SureSelectXT Target Enrichment System (Agilent, Santa Clara, CA, USA) for extraction of the sequences of interest and generation of an Illumina Paired-End Sequencing Library (Figure 11). The sequencing platform best suited to the specifics of the project was likewise selected on the guidance of the third-party collaborator and in keeping with project goals. Paired-end sequencing of all target sequences was completed using the HiSeq 2000 (Illumina, San Diego, CA, USA)195.
Figure 11: Overall sequencing sample preparation workflow
(Modified from Agilent SureSelect XT Protocol) * indicates correlation with hybridisation workflow (Figure 12).
Figure 12: Sample Hybridisation Schematic
(Modified from Agilent SureSelect XT Protocol) * indicates input point of prepared and purified sample libraries.
Cases were selected for analysis on the basis of adequate available DNA (3 µg) and sample quality as assessed by Nanodrop analysis (A260/280 and A260/230 ratios).
Sample preparation, hybridisation and sequencing were outsourced to a third-party organisation: the Centre for Genomics Research, University of Liverpool, Liverpool, UK. The workflow for sample preparation is represented graphically in Figure 11, and a simplified representation of the target sequence hybridisation is shown in Figure 12.
Briefly, the protocol entailed shearing 3 µg of gDNA from each of the 9 HPV16-positive OPSCC samples and the 2 HPV16-positive cell lines (CaSki and SiHa) to a target size of 300 bp using the Covaris 300 programme. The sheared and size-selected DNA was analysed on a DNA 1000 chip. Samples were compared with optimal DNA shearing profiles to ensure accurate shearing before proceeding to hybridisation (Figure 13).
Figure 13: Optimal DNA shearing profile from Agilent 2100 Bioanalyzer electropherogram (12k chip) Target fragment size 300bp. Peaks at extreme left (15bp) and extreme right of profile (1500bp) represent reference control fragments.
Following confirmation of satisfactory shearing profiles, samples underwent end repair, non-templated addition of 3’-A, adaptor ligation, hybridisation, enrichment PCR and related sample purification steps according to the SureSelect Illumina Paired-End Sequencing Library protocol (version 1.2, May 2011). The SureSelect capture library, or “baits”, was customised for the HPV16 genome and the human RNase P gene as follows. Overlapping 120-mer RNA baits giving 5x coverage of the entire HPV16 genome (NCBI Reference Sequence: NC_001526.2) were designed with the Agilent eArray software and then synthesised by Agilent Technologies. Bait design paid additional attention to the circular nature of the genome to ensure 5x coverage at the extremes of the linearised sequence, resulting in a total of 335 baits for the HPV16 genome.
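The idea of tiling a circular genome so that coverage is maintained across the ends of the linearised sequence can be illustrated with a short sketch. This is not the eArray design algorithm, whose details are not given here; the function name, the fixed step of bait length divided by coverage, and the toy sequence are all assumptions for illustration, and the sketch is not expected to reproduce the bait count reported above.

```python
# Illustrative tiling of a circular sequence with overlapping baits.
def tile_circular(genome, bait_len=120, coverage=5):
    """Return (start, bait_sequence) tuples covering a circular genome.

    Baits start every bait_len // coverage bases and wrap past the end
    of the linearised sequence, so the first and last bases are covered
    as deeply as the middle.
    """
    if bait_len % coverage != 0:
        raise ValueError("bait_len must be a multiple of coverage")
    if len(genome) < bait_len:
        raise ValueError("genome shorter than one bait")
    step = bait_len // coverage
    # Append the first bait_len - 1 bases so slices can run past the end.
    extended = genome + genome[: bait_len - 1]
    return [(s, extended[s : s + bait_len]) for s in range(0, len(genome), step)]


# Toy 240-base circular sequence (made up for the example).
toy = "".join("ACGT"[(i * i) % 4] for i in range(240))
baits = tile_circular(toy)

# Count how many baits cover each base of the circle.
depth = [0] * len(toy)
for start, seq in baits:
    for offset in range(len(seq)):
        depth[(start + offset) % len(toy)] += 1
print(len(baits), min(depth), max(depth))
```

When the genome length is not a multiple of the step, the final bait sits closer to the first one, so depth never falls below the requested coverage and is slightly higher near the origin.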
Additionally, baits were designed and synthesised for the host RNase P gene and multiplexed with the HPV16 baits. As before, coverage was 5x for the 341 bp RNase P gene (NCBI Reference Sequence: NC_000014.8). Inclusion of this gene was intended to serve two purposes: firstly, to allow direct validation of the sequencing method against previously determined quantitative PCR results for each sample and, secondly, to allow calculation of relative HPV viral load between samples, with RNase P reads acting as the normaliser for input DNA.
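One simple way to express the normalisation described above is as a ratio of HPV16 reads to RNase P reads, compared between samples. The sketch below assumes that approach; the third party's actual calculation is not described in this section, the function names are hypothetical, and the read counts are invented purely to show the arithmetic.

```python
# Illustrative normalisation of HPV16 read counts to RNase P read counts.
def hpv_to_rnasep_ratio(hpv_reads, rnasep_reads):
    """HPV16 reads per RNase P read for a single sample."""
    if rnasep_reads <= 0:
        raise ValueError("RNase P read count must be positive")
    return hpv_reads / rnasep_reads


def relative_viral_load(sample, reference):
    """Fold difference in normalised HPV16 signal between two samples.

    Each argument is a (hpv_reads, rnasep_reads) tuple.
    """
    return hpv_to_rnasep_ratio(*sample) / hpv_to_rnasep_ratio(*reference)


# Invented read counts, for illustration only.
sample_a = (50_000, 250)
sample_b = (10_000, 500)
print(relative_viral_load(sample_a, sample_b))
```

Because the same bait set is used for every sample, target length cancels out in a between-sample comparison, which is why the sketch works on raw read counts rather than per-base depth.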
Sequencing was performed on the Illumina HiSeq platform in accordance with the manufacturer's standard protocols. Raw data management and bioinformatic analysis were provided by the third party. Bioinformatic outputs were agreed in advance with the third party to ensure that specific research targets and data were both realistically achievable and delivered in a form that allowed interpretation in keeping with the project aims. Specific reporting features were paired-end read origin (host or viral), mapping position, viral-host read analysis with specific interpretation of chimeric reads to report viral and host genomic break point/insertion locations, and relative viral load.
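The reporting of read origin and chimeric pairs can be pictured with the following sketch, which sorts paired-end reads by where each mate maps. It is a simplified illustration and not the pipeline used by the third party: the reference name "HPV16", the Mate record, the function names and the example coordinates are all assumptions made for this example.

```python
# Illustrative classification of read pairs by mapping origin.
from collections import Counter
from typing import NamedTuple, Optional

VIRAL_REF = "HPV16"  # hypothetical name of the viral reference sequence


class Mate(NamedTuple):
    ref: Optional[str]  # reference the mate maps to, None if unmapped
    pos: Optional[int]  # mapping position on that reference


def classify_pair(mate1, mate2):
    """Label a read pair as viral, host, viral-host (chimeric) or unmapped."""
    if mate1.ref is None or mate2.ref is None:
        return "unmapped"
    viral1 = mate1.ref == VIRAL_REF
    viral2 = mate2.ref == VIRAL_REF
    if viral1 and viral2:
        return "viral"
    if not viral1 and not viral2:
        return "host"
    return "viral-host"


def candidate_junctions(pairs):
    """List (host_ref, host_pos, viral_pos) for each viral-host pair."""
    junctions = []
    for m1, m2 in pairs:
        if classify_pair(m1, m2) == "viral-host":
            viral, host = (m1, m2) if m1.ref == VIRAL_REF else (m2, m1)
            junctions.append((host.ref, host.pos, viral.pos))
    return junctions


# Invented alignments, for illustration only.
pairs = [
    (Mate("HPV16", 3100), Mate("HPV16", 3350)),
    (Mate("chr3", 1_000_500), Mate("chr3", 1_000_780)),
    (Mate("HPV16", 3400), Mate("chr3", 1_000_900)),
    (Mate("chr3", 1_001_000), Mate(None, None)),
]
counts = Counter(classify_pair(a, b) for a, b in pairs)
print(counts)
print(candidate_junctions(pairs))
```

A viral-host pair of this kind only brackets a junction; placing the break point to the base would additionally need reads that span the junction itself.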