4 Mitochondrial DNA and Y chromosome diversity
4.3 Mitochondrial DNA variation
4.3.2 Coding region
Nine polymorphisms that affect restriction enzyme recognition sites throughout the human mtDNA coding region were typed: 1715 Dde I, 4577 Nla III, 7025 Alu I, 8251 Ava II, 10394 Dde I, 10397 Alu I, 12305 Hinf I, 13704 Mva I and 15606 Alu I. In combination, these nine polymorphisms can be used to characterise the European haplogroup (lineage) to which each mtDNA sample belongs (Finnila et al., 2001). The haplogroup of 99/141 samples (70.2%) could be characterised; the remainder of the samples fell into two categories. Certain fragments could not be amplified (and therefore typed) from some samples, whereas some samples could be typed at all nine polymorphic sites but their haplogroup was not among the European ones under study. The haplogroup of these samples could therefore not be defined using these polymorphisms. The distribution of haplogroups in the 99 samples that could be characterised is shown in Table 4-6.
Haplogroup   Defining polymorphisms (in coding region)              Total
H            7025 Alu I (-), 10394 Dde I (-)                        41
V            4577 Nla III (-), 10394 Dde I (-)                      4
U            12305 Hinf I (+), 10394 Dde I (-)                      21
K            12305 Hinf I (+), 10394 Dde I (+)                      9
T            15606 Alu I (+), 10394 Dde I (-)                       7
J            13704 Mva I (-), 10394 Dde I (+)                       9
I            1715 Dde I (-), 8251 Ava II (+), 10394 Dde I (+)       1
X            1715 Dde I (-), 10394 Dde I (-)                        4
Z            10397 Alu I (+), 10394 Dde I (+)                       3
TOTAL                                                               99
Table 4-6 Distribution of mtDNA coding region haplogroups
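The assignment logic implied by Table 4-6 can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used in this study: the rule table is transcribed from Table 4-6, while the function name `classify`, the labels "incomplete" and "undefined", and the example profile are assumptions introduced here. A sample is assigned only if all nine sites were typed and exactly one haplogroup rule matches, mirroring the two categories of uncharacterised samples described above.

```python
# Defining site states transcribed from Table 4-6.
# True = restriction site present (+), False = site absent (-).
HAPLOGROUP_RULES = {
    "H": {"7025 AluI": False, "10394 DdeI": False},
    "V": {"4577 NlaIII": False, "10394 DdeI": False},
    "U": {"12305 HinfI": True, "10394 DdeI": False},
    "K": {"12305 HinfI": True, "10394 DdeI": True},
    "T": {"15606 AluI": True, "10394 DdeI": False},
    "J": {"13704 MvaI": False, "10394 DdeI": True},
    "I": {"1715 DdeI": False, "8251 AvaII": True, "10394 DdeI": True},
    "X": {"1715 DdeI": False, "10394 DdeI": False},
    "Z": {"10397 AluI": True, "10394 DdeI": True},
}

# The nine typed sites are exactly those mentioned in the rules.
SITES = sorted({site for rule in HAPLOGROUP_RULES.values() for site in rule})


def classify(profile):
    """Return a haplogroup label for a dict mapping site -> True/False/None.

    None (or a missing key) means the fragment could not be amplified or typed.
    The labels "incomplete" and "undefined" are hypothetical names for the two
    categories of uncharacterised samples.
    """
    if any(profile.get(site) is None for site in SITES):
        return "incomplete"
    matches = [
        hg
        for hg, rule in HAPLOGROUP_RULES.items()
        if all(profile[site] == state for site, state in rule.items())
    ]
    return matches[0] if len(matches) == 1 else "undefined"


# Hypothetical example profile (not a sample from this study).
example = {
    "1715 DdeI": True, "4577 NlaIII": True, "7025 AluI": False,
    "8251 AvaII": False, "10394 DdeI": False, "10397 AluI": False,
    "12305 HinfI": False, "13704 MvaI": True, "15606 AluI": False,
}
print(classify(example))
```

Requiring exactly one matching rule is a deliberately conservative choice: a profile that satisfies two rules at once is reported as "undefined" rather than being forced into one haplogroup.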
Of those samples for which the coding-region haplogroup could be defined, 41/99 (41.4%) belonged to haplogroup H and 21/99 (21.2%) belonged to haplogroup U. These figures agree closely with those determined by analysing the variability of HVR-1 in the same samples (see section 4.3.1), where 41.4% of samples were assigned to haplogroup H and 21.6% to haplogroup U. These two haplogroups are the most frequent in the samples under study, and also in the European population in general (Richards et al., 2000; Torroni et al., 1998). Nine samples (9.1%) belong to each of haplogroups J and K, seven (7.1%) to haplogroup T and between one and four samples to each of the remaining haplogroups (I, V, X and Z). The haplogroup definition agrees in 48 of 82 (58.5%) samples for which both HVR-1 and coding region haplogroups could be determined. Reasons for the relatively low agreement rate between HVR-1 and coding-region RFLP data may include homoplasy (recurrent or back-mutation) at key polymorphic sites, incompleteness of RFLP data, PCR contamination of mtDNA fragments or occurrence of haplogroups that cannot be typed by this RFLP system.
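The proportions quoted above follow directly from the counts in Table 4-6. As a quick check, a minimal sketch (the helper name `percent` is an assumption introduced here) recomputes them, together with the typing success rate and the HVR-1 agreement rate:

```python
# Haplogroup counts transcribed from Table 4-6.
counts = {"H": 41, "V": 4, "U": 21, "K": 9, "T": 7, "J": 9, "I": 1, "X": 4, "Z": 3}
total = sum(counts.values())


def percent(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


frequencies = {hg: percent(n, total) for hg, n in counts.items()}
typed_rate = percent(total, 141)   # samples with a defined haplogroup
agreement_rate = percent(48, 82)   # HVR-1 vs coding-region agreement

for hg, freq in sorted(frequencies.items(), key=lambda item: -item[1]):
    print(f"{hg}: {counts[hg]}/{total} ({freq}%)")
print(f"typed: {typed_rate}%, agreement: {agreement_rate}%")
```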
Figure 4-c Reduced median network of nine mtDNA coding region RFLP sites, drawn using the program Network 2.0. Major haplogroups are indicated. Reticulations indicate homoplasy, i.e. multiple occurrences of the same polymorphism. Samples that do not fall in typical haplogroups are marked.
Figure 4-c shows a reduced median network drawn using the data from the nine polymorphic sites analysed in the mtDNA coding region. Most of the haplogroups can be distinguished from each other, but the network contains several homoplasies where the same polymorphism (probably 10394 Dde I, which is known to be a homoplasic site) appears to have occurred in separate lineages. Four out of 23 nodes (17.4%) are empty, five are occupied by a single sample and the remainder contain two samples or more.
If only the nodes containing two or more samples are considered, the network becomes considerably less complicated (see Figure 4-d). This network is far simpler than that portrayed in Figure 4-c, and is more likely to be accurate. By excluding those nodes that contain only a single sample and are therefore less reliable, three of the empty nodes and many of the apparent homoplasies disappear. In Figure 4-d, the uncharacterised samples (including many of the African samples) cluster together, and this cluster may represent an African outgroup. This is discussed further in section 4.3.3.
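The filtering step can be illustrated with a short sketch. It only shows the exclusion of sparsely populated nodes; it does not recompute the reduced median network, which was done in Network 2.0. The node names and sample identifiers below are hypothetical, and only the node counts (23 nodes in total, four empty, five singletons) come from the text.

```python
def prune(nodes, min_samples=2):
    """Keep only the nodes that hold at least min_samples samples."""
    return {name: samples for name, samples in nodes.items()
            if len(samples) >= min_samples}


# Hypothetical node contents, for illustration only.
nodes = {
    "n1": ["s1", "s2", "s3"],
    "n2": ["s4"],          # singleton: excluded
    "n3": [],              # empty (inferred) node: excluded
    "n4": ["s5", "s6"],
}
kept = prune(nodes)
print(sorted(kept))

# Node occupancy reported for Figure 4-c.
total_nodes, empty_nodes, singleton_nodes = 23, 4, 5
multi_sample_nodes = total_nodes - empty_nodes - singleton_nodes
empty_share = round(100 * empty_nodes / total_nodes, 1)
print(multi_sample_nodes, empty_share)
```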
Figure 4-d Reduced median network of nine mtDNA coding region RFLP sites, based on Figure 4-c but only showing nodes that contain two or more samples. Major Eurasian haplogroups are indicated. A cluster of non-Jewish African samples, outlined in red, may indicate a possible African outgroup.
4.3.2.1 9-bp deletion in sample CG at positions 8272-8280
During electrophoresis following a PCR batch to amplify mtDNA fragments containing the 8251 Ava II restriction polymorphism, the band amplified from sample CG appeared to run slightly further than those from other samples (see Figure 4-e).
Figure 4-e Gel photograph showing the size difference in the fragment amplified for 8251 Ava II analysis from sample CG. Lanes contain samples and negative controls; the predicted amplicon size is 261 bp. After PCR amplification, electrophoresis was performed in 1% agarose gel at 100 V for approximately 20 min. The band for sample CG (indicated) appears to have run further than those for the other samples, suggesting that it may contain a deletion.
A portion of each amplicon was digested with the restriction enzyme Ava II, and the remainder of the amplicons from some samples was sent for DNA sequencing to determine whether sample CG contained a deletion in this fragment. Figure 4-f depicts an alignment of partial DNA sequences from this fragment, and clearly shows that sample CG lacks one of the two tandem copies of the sequence CCCCCTCTA, a 9-bp deletion at positions 8272-8280.
Figure 4-f Alignment showing the 9-bp deletion in the mtDNA coding region of sample CG. The repeated nucleotides CCCCCTCTA are deleted at positions 8272-8280 in this sample.
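The comparison made by eye in the alignment can be sketched as a count of tandem copies of the 9-bp motif. This is a hedged illustration rather than the analysis performed here: the flanking sequences are transcribed from the alignment in Figure 4-f and should be checked against the reference sequence before any real use, and the function names are assumptions. The sketch also shows the expected amplicon size if one copy is lost from the predicted 261-bp fragment.

```python
MOTIF = "CCCCCTCTA"

# Flanking sequence as read from the alignment in Figure 4-f (an assumption;
# verify against the reference sequence before relying on it).
FLANK_5 = "TTACCCTATAGCA"
FLANK_3 = "GAGCCCACTGTAAAGCTAACTTAGCATT"

reference = FLANK_5 + MOTIF * 2 + FLANK_3   # two tandem copies
deleted = FLANK_5 + MOTIF + FLANK_3         # one copy, as seen in sample CG


def max_tandem_copies(seq, motif=MOTIF):
    """Largest number of back-to-back copies of motif found anywhere in seq."""
    best = 0
    start = seq.find(motif)
    while start != -1:
        copies, pos = 0, start
        while seq.startswith(motif, pos):
            copies += 1
            pos += len(motif)
        best = max(best, copies)
        start = seq.find(motif, start + 1)
    return best


def has_9bp_deletion(seq):
    """True if seq carries fewer tandem copies of the motif than the reference."""
    return max_tandem_copies(seq) < max_tandem_copies(reference)


predicted_amplicon = 261                         # bp, from Figure 4-e
deleted_amplicon = predicted_amplicon - len(MOTIF)

print(max_tandem_copies(reference), max_tandem_copies(deleted))
print(has_9bp_deletion(deleted), deleted_amplicon)
```

A 9-bp difference on a fragment of about 260 bp is small, which is consistent with the band from sample CG running only slightly further on the gel.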
This deletion occurs in the COII/tRNA(Lys) intergenic region of mtDNA and has previously been reported in several ethnic groups, including populations from South-East Asia (Melton et al., 1995), native Amerindians (Torroni & Wallace, 1995) and sub-Saharan Africa (Soodyall et al., 1996). It is therefore thought that this deletion may have arisen several times independently, possibly due to this region's genetic instability (Thomas et al., 1998b). Sample CG is of African origin, and clusters closely with sample AG (also of African origin) in both KSHV and mtDNA analyses. In this fragment, sample CG carries the 9-bp deletion and also a G→A polymorphism at position 8206 (detected in the DNA sequencing analysis), which together distinguish it from sample AG, even though these two samples are identical in their mtDNA coding region RFLP pattern.