B. Methods to Analyze DNA Methylation Patterns
B.3 Pyrosequencing
The methylation of each individual CpG in a targeted DNA sequence can be quantified using pyrosequencing, which was first described as an alternative to Sanger sequencing. In 1986, Pål Nyrén developed a way to follow in real time the release of pyrophosphate by DNA polymerase during the incorporation of nucleotides (152). Briefly, pyrosequencing uses a cascade of enzymatic reactions, triggered by the addition of a base by DNA polymerase, to produce light that is detected by a camera and displayed in a graph called a pyrogram.
Prior to pyrosequencing, a PCR is set up with primers that amplify a larger DNA region than the one to be sequenced. The region to be sequenced is called the target region and is located within the PCR amplicon. One of the PCR primers carries a biotin molecule at its 5’-end, so that as amplification progresses the PCR product consists of double-stranded molecules with one biotin-labeled strand. The amplicons are mixed with streptavidin-coated beads and captured by vacuum on filter probes. The captured DNA is then washed in successive solutions, including a denaturing solution that separates the double-stranded DNA, releasing the non-labeled strand and retaining the biotin-labeled one. The beads carrying the single-stranded DNA are then released into a solution containing a third primer, called the sequencing primer. After the sequencing primer anneals to the single-stranded DNA, the mixture is loaded into the pyrosequencing instrument and sequencing begins (153).
The instrument holds a sequencing plate and a cartridge that contains the 4 nucleotides, an enzyme mixture and a substrate mixture in separate wells. Applying pressure to the cartridge dispenses the contents of each of its wells individually onto the plate. First, the enzyme mixture and the substrate mixture are dispensed into all wells of the plate. The nucleotides are then added to each well one at a time, in a specified order. If the added nucleotide is complementary to the next base of the DNA strand being sequenced, the DNA polymerase incorporates it and releases pyrophosphate, as in any reaction catalyzed by DNA polymerase. Next, the enzyme ATP sulfurylase converts adenosine 5’-phosphosulfate to ATP in the presence of the released pyrophosphate. Using this ATP, luciferase catalyzes the oxidation of luciferin to oxyluciferin, emitting light. The amount of light generated is proportional to the amount of ATP; it is detected by a charge-coupled device (CCD) and displayed as a peak on a pyrogram (153). Before the next nucleotide is added to the mixture, apyrase degrades any remaining ATP and unincorporated nucleotides. The height of each peak in the pyrogram is proportional to the number of nucleotides incorporated in that dispensation. For example, for the sequence ‘GCAGGCCT’, the pyrogram would be similar to the one shown in Figure 3.1.
Figure 3.1 – Pyrogram depicting the relative peak height for all the nucleotides dispensed when sequencing the fragment ‘GCAGGCCT’. The pyrogram only shows peaks when the dispensed nucleotide is in the sequence. The relative peak heights for a position with two repeated nucleotides (‘GG’ and ‘CC’) are double the height of peaks in positions that contain only single nucleotides. Adapted from (154).
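The relationship between dispensation order and peak height can be illustrated with a short simulation. The following is a minimal sketch, not the instrument's algorithm: it assumes an idealized reaction in which every dispensed nucleotide is fully incorporated, and it takes the sequence as read, that is, the bases added to the growing strand. The function name `simulate_pyrogram` and the dispensation order used in the example are assumptions made for illustration.

```python
def simulate_pyrogram(sequence, dispensation_order):
    """Return a list of (nucleotide, relative_peak_height) per dispensation.

    `sequence` is the sequence to analyze as it is read (the bases added to
    the growing strand). Idealized model: a dispensed nucleotide is
    incorporated as many times as it occurs consecutively at the current
    position, and the peak height equals that count.
    """
    peaks = []
    position = 0
    for nucleotide in dispensation_order:
        incorporated = 0
        # Extend through a homopolymer run such as 'GG' or 'CC'.
        while position < len(sequence) and sequence[position] == nucleotide:
            incorporated += 1
            position += 1
        peaks.append((nucleotide, incorporated))
    return peaks


# Hypothetical dispensation order: a leading 'T' that does not match the
# first base, followed by the order that reads 'GCAGGCCT'.
for nucleotide, height in simulate_pyrogram("GCAGGCCT", "TGCAGCT"):
    print(nucleotide, height)
```

In this sketch the leading ‘T’ gives no peak, the single bases give a height of 1, and the ‘GG’ and ‘CC’ runs give a height of 2, which matches the pattern described for Figure 3.1.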
For DNA methylation analysis, the instrument dispenses both a T and a C at each CpG and measures the light emitted after each dispensation. Since the light is proportional to the amount of each nucleotide incorporated, quantification is possible. In the Qiagen Pyrosequencing® method, multiple assays with different PCR products and sequencing primers can be run on the same plate. The PyroMark® software used to design each run allows each plate to be customized by dragging and dropping assays into the plate layout (155). The user inputs the sequence to analyze and the software creates the dispensation order. If the user also inputs the ‘sequence to analyze prior to bisulfite conversion’, which corresponds to the genomic DNA sequence, the software highlights possible locations for bisulfite controls in red.
The peak heights provide information about the relative amounts of cytosine and thymine at each CpG. Because bisulfite treatment does not modify methylated cytosines, a high peak when a cytosine is dispensed corresponds to a high methylation percentage. In the example shown in Figure 3.2, the first CpG shows 2% methylation because the peak height for ‘C’ is 2% of the total peak height for a single nucleotide. The PyroMark® software determines the peak height for the individual non-variable bases, such as the first ‘G’ shown in the pyrogram (Figure 3.2, position 4). At the variable positions (CpGs), the sum of the ‘C’ and ‘T’ peaks should be equivalent to the single-nucleotide height determined from the remaining sequence.
Figure 3.2 – Pyrogram showing the percent methylation in 5 CpGs (highlighted in blue). The peaks corresponding to ‘C’ are very low compared to those for ‘T’, so the methylation level is below 10% for all CpGs. The pyrogram shows relative light units on the y-axis and position on the target sequence on the x-axis. The ‘E’ and ‘S’ at the beginning of the x-axis correspond to the dispensation of the enzyme and substrate mixtures, respectively. The ‘C’ highlighted in red shows the bisulfite control. The sequence to analyze is shown on top of the pyrogram, where ‘Y’ corresponds to a CpG.
The percent methylation is then calculated from the relative peak height of ‘C’ versus the expected peak height for a single nucleotide. To confirm that the correct PCR product is being sequenced, rather than a random sequence that annealed to the sequencing primer, the instrument dispenses negative controls, that is, nucleotides that are not complementary to the strand being sequenced and therefore should not produce a peak when dispensed. One example of a negative control is the first ‘G’ dispensed after the substrate (S) in Figure 3.2. The sequence to analyze (Figure 3.2, on top, labeled ‘A2’) starts with a CpG (labeled ‘Y’ in the sequence to analyze), so either a ‘C’ or a ‘T’ should be dispensed first. Instead, the instrument first dispensed a ‘G’ to confirm that the sequencing primer annealed where expected. Comparing the sequence to analyze with the nucleotides dispensed on the x-axis shows which dispensations are negative controls.
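The percent methylation calculation and the negative control check described above can be sketched as follows. This is an illustration under stated assumptions, not the PyroMark® software's implementation: the single-nucleotide height is approximated by the sum of the ‘C’ and ‘T’ peaks at the CpG, the function names are hypothetical, and the 5% tolerance for background signal at a negative control is an arbitrary placeholder rather than a documented threshold.

```python
def percent_methylation(c_peak, t_peak):
    """Percent methylation at one CpG from its 'C' and 'T' peak heights.

    Assumption: the 'C' and 'T' peaks together equal the height expected
    for a single nucleotide, so C / (C + T) is the methylated fraction.
    """
    total = c_peak + t_peak
    if total <= 0:
        raise ValueError("no signal at this CpG position")
    return 100.0 * c_peak / total


def negative_controls_pass(control_peaks, single_base_height, tolerance=0.05):
    """Return True if every negative control shows essentially no peak.

    `tolerance` is a hypothetical cut-off: the fraction of the
    single-nucleotide height still accepted as background.
    """
    limit = tolerance * single_base_height
    return all(peak <= limit for peak in control_peaks)


# Illustrative peak heights in relative light units (not measured data).
print(percent_methylation(c_peak=2.0, t_peak=98.0))
print(negative_controls_pass([0.0, 1.0], single_base_height=100.0))
```

With these illustrative values the CpG is 2% methylated, as in the first CpG of Figure 3.2, and both negative controls stay below the assumed background limit.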
The bisulfite control is shown in the image as a dispensed ‘C’ at position 11. The PyroMark® software assumes that all ‘Cs’ not followed by ‘Gs’ in the genomic DNA sequence are unmethylated and will therefore be converted to ‘Ts’. As can be seen in the sequence to analyze in Figure 3.2, position 11 is followed by a string of 2 ‘Ts’, which result from bisulfite modification of unmethylated ‘Cs’. If bisulfite modification had not been completely successful, at least one of those ‘Ts’ would still be a ‘C’ and a peak would appear on the pyrogram. Dispensing a ‘C’ before or after any ‘T’ that results from bisulfite conversion therefore serves as a bisulfite control for pyrosequencing. When the ‘sequence to analyze prior to bisulfite conversion’ is included in the assay design in the PyroMark® software, the software highlights in red the positions in the sequence where the user can add bisulfite controls. The user can also modify the nucleotide dispensation order to include as many bisulfite controls or negative controls as desired, within the time available to sequence each well, since every additional dispensation lengthens the run for that well (155).
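The assumption behind the bisulfite control can be made concrete with an in silico conversion. The sketch below is an assumption-laden illustration and does not reproduce the PyroMark® software: it treats the input as the strand that is read, writes each CpG cytosine as ‘Y’ (C or T) as in the sequence to analyze, converts every other ‘C’ to ‘T’, and reports the positions of those converted cytosines as candidate bisulfite control sites. The function names and the example sequence are hypothetical.

```python
def is_cpg_cytosine(genomic, index):
    """True if the base at `index` is a 'C' immediately followed by a 'G'."""
    return (
        genomic[index] == "C"
        and index + 1 < len(genomic)
        and genomic[index + 1] == "G"
    )


def bisulfite_convert(genomic):
    """Sequence to analyze after bisulfite conversion.

    CpG cytosines become 'Y' (variable position); every other 'C' is
    assumed unmethylated and becomes 'T'.
    """
    genomic = genomic.upper()
    converted = []
    for index, base in enumerate(genomic):
        if base != "C":
            converted.append(base)
        elif is_cpg_cytosine(genomic, index):
            converted.append("Y")
        else:
            converted.append("T")
    return "".join(converted)


def bisulfite_control_sites(genomic):
    """0-based positions of non-CpG cytosines, where a 'C' dispensation
    should give no peak if conversion was complete."""
    genomic = genomic.upper()
    return [
        index
        for index, base in enumerate(genomic)
        if base == "C" and not is_cpg_cytosine(genomic, index)
    ]


# Hypothetical genomic sequence, for illustration only.
print(bisulfite_convert("CGATCCAGCG"))
print(bisulfite_control_sites("CGATCCAGCG"))
```

In this example the two CpGs become ‘Y’, while the cytosines at positions 4 and 5 become ‘Ts’ and are the sites where a dispensed ‘C’ would act as a bisulfite control.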
Pyrosequencing was first used for body fluid identification in forensic applications in 2012, as reported by Madi et al. (156). In their experimental setup, DNA was extracted from several body fluid samples and methylation was quantified at 4 separate genome locations. The authors selected the CpGs in each location that show methylation differences between body fluids. Multiple samples of each body fluid, from different donors, were run with each primer pair, and the percent methylation of each CpG at the 4 genome locations was recorded. Thus, each body fluid has an expected mean percent methylation for each genome location and a corresponding standard deviation.
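The summary statistics described above, a mean percent methylation and a standard deviation per body fluid, can be computed as in the following sketch. The function name, the sample labels and the numbers are placeholders chosen only to show the calculation; they are not values from Madi et al. (156).

```python
from statistics import mean, stdev


def summarize_methylation(measurements):
    """Map each body fluid to (mean, sample standard deviation).

    `measurements` maps a body fluid label to the percent methylation
    values recorded for one CpG across donors. At least two values per
    fluid are needed for the sample standard deviation.
    """
    return {
        fluid: (mean(values), stdev(values))
        for fluid, values in measurements.items()
    }


# Placeholder values for illustration only (not experimental data).
example = {
    "fluid_A": [10.0, 20.0, 30.0],
    "fluid_B": [80.0, 90.0],
}
for fluid, (average, spread) in summarize_methylation(example).items():
    print(fluid, round(average, 1), round(spread, 1))
```

The sample standard deviation is used here because the donors are a sample of the wider population; this choice is an assumption, since the text does not state which estimator was applied.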
A limitation to implementing this method in forensic laboratories is that it requires new equipment and training of personnel, which may be a deterrent for some laboratories. However, the great advantage of pyrosequencing is its capacity for high throughput and its ability to be automated (153). The forensic analyst can be confident about which CpGs are methylated, since the pyrogram reveals problems in the sequencing process. Furthermore, even though the method has not yet been designed for multiplex PCR, it may be possible to amplify multiple targets with several primer pairs in a single PCR and then sequence them separately, depending on which sequencing primer is added to each well.