a) Selection of Markers
Short tandem repeat (STR) markers used in forensic genotyping are selected on the basis of their power of discrimination and their ability to be multiplexed. Once a suitable set of STR markers has been selected, the primer binding sites must be chosen carefully to ensure accurate amplification of the multiplex and electrophoretic separation.
When selecting markers for multiplexed systems, it is also important to make sure that the target regions are far enough apart to avoid mispriming and overlap between regions. Because markers can be multiplexed by both size and fluorescent dye, a size difference of about 10 bp is generally enough to avoid overlap between adjacent markers. The markers used for forensic analysis have been well documented in terms of size, location and frequency. However, some applications require these loci to be redesigned for improved detection and analysis. This is done primarily by primer redesign rather than by selecting different markers. Reference sequences can be obtained from the STRBase (NIST) or GenBank (NCBI) websites. Once a sequence is obtained, it can be imported into primer design software to identify useful and thermodynamically favorable primer sites.89
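The spacing guideline above can be sketched as a simple check. This is a minimal illustration, assuming each marker is described by a dye label and an expected amplicon size range; the function name, the tuple layout and the marker values are hypothetical and not taken from any kit. It only compares neighbouring markers within a dye channel.

```python
from collections import defaultdict


def find_tight_spacing(markers, min_gap_bp=10):
    """Report adjacent markers in the same dye channel that are too close.

    markers: list of (name, dye, min_size_bp, max_size_bp) tuples
             (hypothetical layout chosen for this sketch).
    min_gap_bp: the roughly 10 bp separation mentioned in the text.
    """
    by_dye = defaultdict(list)
    for name, dye, low, high in markers:
        by_dye[dye].append((low, high, name))

    problems = []
    for dye, items in by_dye.items():
        items.sort()  # order markers by the start of their size range
        for (_, high1, name1), (low2, _, name2) in zip(items, items[1:]):
            gap = low2 - high1
            if gap < min_gap_bp:
                problems.append((dye, name1, name2, gap))
    return problems


# Hypothetical size ranges, for illustration only.
example_markers = [
    ("locus_A", "blue", 100, 150),
    ("locus_B", "blue", 155, 200),
    ("locus_C", "green", 100, 150),
    ("locus_D", "blue", 215, 260),
]
print(find_tight_spacing(example_markers))
```

Markers that share a size range but carry different dyes are not flagged, which reflects the idea of multiplexing by both size and dye.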
b) Primer Design
The ability to successfully amplify a sample for multiplex STR analysis is highly dependent on the primers used during PCR. The primers control the location of the target sequence and provide the initial point of elongation where the polymerase can attach and begin adding new nucleotides. During PCR, a forward and a reverse primer are required for each target region to permit binding to both the sense and antisense strands. The primer should be specific for its target and amplify with high efficiency, without the formation of artifacts. Primer design is the first step in the optimization of PCR, and the parameters discussed below should be carefully analyzed and taken into consideration. With each primer added to a multiplexed PCR reaction, the interactions between the individual primers become more complex and proper design becomes more important.90
i) Primer Length
The ideal primer length for a multiplex kit ranges from 18 to 25 bp. The length of the primer controls the specificity of binding, the hybridization stability and the cost.54 Generally, the longer the primer sequence, the more unique it is. For each nucleotide added to the primer, the probability of finding that sequence in a random genetic sequence drops by a factor of 4. Therefore, the chance of finding a random primer sequence of 20 nucleotides is (1/4)^20, or about 9.1 × 10^-13. Longer primers are more specific and permit lower annealing temperatures, which in some cases can improve sensitivity.
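The arithmetic behind this estimate can be reproduced with a short sketch. It assumes the same simplification as the text, namely a random sequence in which the four bases are equally likely and independent; the function name is illustrative only.

```python
def random_match_probability(primer_length: int) -> float:
    """Probability that a specific sequence of the given length occurs at
    one position of a random sequence with four equally likely bases."""
    if primer_length < 0:
        raise ValueError("primer_length must be non-negative")
    return 0.25 ** primer_length


# Compare the bounds of the suggested 18-25 bp range and the 20-mer example.
for length in (18, 20, 25):
    print(f"{length} nt: {random_match_probability(length):.2e}")
```

Real genomes are not random sequences, so this figure is only a rough guide to uniqueness.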
On the other hand, the longer the primer, the greater the chance that it binds to itself and forms secondary structures. Long primers also take more time to break away from the primer-template complex, requiring extension times that increase with each added nucleotide.90
ii) Primer Melting Temperature (Tm)
The primer melting temperature is the temperature at which half of the DNA duplex dissociates and becomes single stranded; it characterizes the stability of the duplex. Generally, a range of 55-60°C is best. All primers in a multiplex should have similar Tm values so that their optimal annealing temperatures are close and all primers are stable over the same temperature range. If primer melting temperatures vary greatly within a multiplex, the chance of mispriming is high and non-specific products may be amplified. If the melting temperature is too low, even more non-specific products are observed, and at temperatures that are too high, allele peaks can be reduced or drop out completely.
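As a minimal sketch of this consistency check, the snippet below flags primers whose Tm falls outside the 55-60°C range and reports the spread across the set. The primer names and Tm values are hypothetical, and the Tm values are assumed to come from primer design software.

```python
def tm_report(primer_tms, low=55.0, high=60.0):
    """Return (primers outside the [low, high] range, Tm spread in deg C).

    primer_tms: dict mapping a primer name to its melting temperature.
    """
    if not primer_tms:
        raise ValueError("at least one primer Tm is required")
    out_of_range = {
        name: tm for name, tm in primer_tms.items() if not low <= tm <= high
    }
    spread = max(primer_tms.values()) - min(primer_tms.values())
    return out_of_range, spread


# Hypothetical Tm values for three primers in one multiplex.
example_tms = {"locus_A_F": 57.0, "locus_A_R": 58.5, "locus_B_F": 62.0}
outliers, spread = tm_report(example_tms)
print(outliers, spread)
```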
Most primer design software calculates the primer melting temperatures using the nearest neighbor method when suggesting primers for multiplex reactions. The melting temperature is calculated using Equation 4:
$$T_m(^\circ\mathrm{C}) = \frac{\Delta H}{\Delta S + R\ln(c/4)} - 273.15\,^\circ\mathrm{C} + 16.6\log_{10}[\mathrm{salt}]$$
Equation 4: Primer Melting Temperature
Where Tm is the primer melting temperature, ΔH and ΔS are the enthalpy and entropy of helix formation, R is the molar gas constant (1.987 cal/°C mol), c is the DNA primer concentration in solution, 273.15 is the Kelvin to degree Celsius conversion and [salt] is the concentration of salts present.91-94
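Equation 4 can be evaluated directly once ΔH and ΔS are known. The sketch below assumes ΔH in cal/mol, ΔS in cal/(K mol) to match the stated value of R, and both concentrations in mol/L. The thermodynamic values in the example are placeholders chosen for illustration, not measured nearest-neighbor sums for any real primer.

```python
import math

R = 1.987  # molar gas constant in cal/(K mol), as given in the text


def melting_temperature(delta_h, delta_s, primer_conc, salt_conc):
    """Evaluate Equation 4 and return Tm in degrees Celsius.

    delta_h: enthalpy of helix formation, cal/mol (assumed unit)
    delta_s: entropy of helix formation, cal/(K mol) (assumed unit)
    primer_conc: primer concentration c, mol/L (assumed unit)
    salt_conc: salt concentration, mol/L (assumed unit)
    """
    if primer_conc <= 0 or salt_conc <= 0:
        raise ValueError("concentrations must be positive")
    tm_kelvin = delta_h / (delta_s + R * math.log(primer_conc / 4))
    return tm_kelvin - 273.15 + 16.6 * math.log10(salt_conc)


# Placeholder inputs: 0.5 uM primer in 50 mM salt.
tm = melting_temperature(
    delta_h=-150_000.0, delta_s=-400.0, primer_conc=0.5e-6, salt_conc=0.05
)
print(f"Tm = {tm:.1f} C")
```

In practice, primer design software obtains ΔH and ΔS by summing nearest-neighbor parameters over the primer sequence before applying this formula.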
iii) Primer annealing temperature (Ta)
The primer annealing temperature is the optimal temperature at which the primer will bind to the DNA template. The annealing temperature is generally a few degrees less than the lowest primer pair Tm. The optimal Ta can be calculated from Equation 5.95
Equation 5: Primer annealing Temperature
At low Ta, primers may bind to multiple sequences other than the target sequence, resulting in non-specific products. At high Ta, primers do not bind as easily and hence the amount of product may be reduced. Thus, the Ta must be optimized to ensure a specific product with high yield and no artifacts.
iv) GC content
When designing primers, the general rule is that the GC content should be between 40% and 60%. This allows for stronger annealing to the template, as each GC pair has three hydrogen bonds. The GC content directly affects the melting temperature, which is also critical to specificity as described above. When designing a multiplex, all primers should have similar GC percentages.90 If the GC content of the primers differs greatly, primer length can be adjusted to bring the binding temperatures and conditions closer together.
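The GC guideline is easy to check programmatically. The following sketch computes the GC percentage of a primer and tests it against the 40-60% range. The example sequences are arbitrary strings used for illustration, not real STR primers.

```python
def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer sequence is empty")
    gc_count = sum(1 for base in seq if base in ("G", "C"))
    return 100.0 * gc_count / len(seq)


def gc_in_range(primer: str, low: float = 40.0, high: float = 60.0) -> bool:
    """True if the GC content falls inside the recommended range."""
    return low <= gc_percent(primer) <= high


# Arbitrary example sequences.
for seq in ("ATGCATGCATGCATGCATGC", "ATATATATATATATATATGC"):
    print(seq, gc_percent(seq), gc_in_range(seq))
```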
v) Primer efficiency
Primer efficiency is the ability of the primer to bind specifically to its target region, with little false priming or secondary structure formation. During primer design, a number of precautions are taken to ensure a high efficiency of binding. For example, since elongation during PCR starts at the 3’ end of the primer, a GC clamp can be incorporated there to strengthen binding and promote specificity. However, no more than 3 G’s or C’s should be present within the last 5 bases at the 3’ end of the primer.
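The 3’ end rule can be expressed as a short check. This sketch counts G and C bases in the last five positions of a primer written 5’ to 3’ and applies the limit of three; the function names and example sequences are illustrative only.

```python
def three_prime_gc_count(primer: str, window: int = 5) -> int:
    """Number of G or C bases in the last `window` bases (the 3' end),
    assuming the primer is written 5' to 3'."""
    tail = primer.upper()[-window:]
    return sum(1 for base in tail if base in ("G", "C"))


def passes_three_prime_rule(primer: str, max_gc: int = 3) -> bool:
    """True if the 3' end has no more than `max_gc` G/C bases."""
    return three_prime_gc_count(primer) <= max_gc


# Arbitrary example sequences.
print(passes_three_prime_rule("ATGCATATACG"))  # 3' end is ATACG
print(passes_three_prime_rule("ATATATGCGCG"))  # 3' end is GCGCG
```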
vi) Secondary structures
When primers in an STR multiplex are not designed properly, secondary structures can form through inter- or intramolecular interactions. These interactions reduce the efficiency of primer binding. The three types of secondary structures formed are hairpins, self-dimers and cross-dimers. The stability of these structures depends on the primer sequence and its free energy of interaction with the nearest nucleotides.93 Most primer design software performs these calculations from the enthalpies and entropies of the nearest nucleotides. If the free energy is greater than 0, the secondary structure is too unstable to interfere with the reaction. Structures with free energies less than 0 can form spontaneously and greatly reduce the efficiency of the primer.
Hairpin secondary structures are formed as a result of intramolecular interactions within the primer sequence, causing the primer to fold onto itself. These hairpin loops can form with as few as 3 nucleotides. The 3’ end of the primer is the most important. If the hairpin is at the 3’ end, a ΔG of about -2 kcal/mol or more (that is, less negative) is acceptable. If the hairpin is internal, a ΔG of about -3 kcal/mol or more is satisfactory.96, 97
Self-dimers are formed by intermolecular attractions between two copies of the same primer. This means that the primer contains regions complementary to itself, and if the dimers form more readily than the primer binds to its target, the amount of product is greatly reduced and artifacts may be observed (Figure 16). If the self-dimer occurs at the 3’ end, a ΔG of about -5 kcal/mol is acceptable. If the self-dimer occurs elsewhere along the primer sequence, a ΔG of about -6 kcal/mol is satisfactory.96, 97
Cross-dimers are formed by intermolecular attractions between two different primers. If two or more primers have complementary sequences, especially in multiplex systems where the number of primers can exceed 30, they can form dimers with each other and greatly reduce the amplification efficiency (Figure 16). Appropriate software should be used to ensure primer compatibility when designing a system. Whether the cross-dimer occurs at the 3’ end or elsewhere along the primer sequence, the same ΔG values as for self-dimers are acceptable.96, 97
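The ΔG guidelines for hairpins, self-dimers and cross-dimers can be collected into a small lookup. This is a sketch under one stated assumption: each quoted value is treated as the most negative ΔG that is still tolerated. The ΔG values themselves are assumed to be reported by primer design software and are not computed here; the structure and position labels are hypothetical names for this example.

```python
# Most negative tolerated delta-G in kcal/mol, taken from the guidelines above.
# Treating each value as a lower bound is an assumption of this sketch.
DELTA_G_LIMITS = {
    ("hairpin", "3prime"): -2.0,
    ("hairpin", "internal"): -3.0,
    ("self_dimer", "3prime"): -5.0,
    ("self_dimer", "internal"): -6.0,
    ("cross_dimer", "3prime"): -5.0,    # same limits as self-dimers
    ("cross_dimer", "internal"): -6.0,
}


def is_tolerated(structure: str, position: str, delta_g: float) -> bool:
    """True if a predicted secondary structure is weak enough to accept."""
    limit = DELTA_G_LIMITS[(structure, position)]
    return delta_g >= limit


# Example delta-G values as they might be reported by design software.
print(is_tolerated("hairpin", "3prime", -1.5))
print(is_tolerated("self_dimer", "internal", -7.2))
```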
Figure 16: (Left) a primer self-dimer formed as a result of high ΔG values. (Right) a cross dimer formed between two different primers in a multiplex.
A basic local alignment search tool (BLAST) search can be performed on the primers against the human genome to check for any similarities that may be present. This can be done using the National Center for Biotechnology Information (NCBI) website. The more unique the primer sequence and the fewer secondary structures that are present, the more efficient and specific the amplification will be.
vii) Primer concentration
To ensure maximal binding, a relatively high concentration of primers is used. If the concentration of primer in the PCR master mix is too low, sensitivity is reduced and amplicons can be lost. If the primer concentration is too high, there is a greater chance of primer dimer formation or non-specific binding, which can result in undesired products.
Most PCR reactions have a primer concentration between 0.2 and 1 μM. The primer concentration can also be a useful tool when designing STR multiplexes, as different loci may amplify at different rates, leading to peak height imbalance. Adjusting the primer concentration of each locus is commonly used to maintain peak balance across the loci in the multiplex.