Polymorphic information content (PIC) is equal to:
3.3.3 Distribution of Microsatellite Sequences
It is estimated that the human genome contains about 500,000 microsatellites of all types, interspersed at an average distance of 6 kilobases of DNA sequence (Beckman and Weber, 1992). Seventy-six percent of them are (A)n, (AC)n, (AAAN)n (N = G, T or C) or (AG)n, in decreasing order of abundance. Approximately 40% of all microsatellites in the rat and mouse genomes are of the
(AC)n type, which is twice the frequency of the same microsatellite in humans (Beckman and Weber, 1992; Love et al., 1990). Humans and mice have an estimated 50,000 and 100,000 (AC)n copies per haploid genome, respectively (Hamada and Kakunaga, 1982). Estimates of the distribution of (AC)n repeats suggest that in humans, the rat and the mouse they occur on average every 30, 21 and 18 kb, respectively (Stallings et al., 1991). The data on the physical distribution of
(AC)n repeats in humans originate from a study of the proportion of human chromosome-specific cosmid clones (36 kb average size) containing one or more (AC)n repeats, and from database searches (Stallings et al., 1991). In the study by Stallings and colleagues, the cosmid clones were gridded onto nylon filters and hybridised to a (TG)n oligonucleotide. Also included on these filters were 528 genomic mouse-DNA clones. About 12% of the human clones that were scored as negative by grid hybridisation did in fact contain a repeat when their DNA was analysed by Southern blotting. The mouse clones were not subjected to Southern blot analysis, but grid hybridisation indicated that about 80% of the cosmids were positive. If we assume that the detection error for grid hybridisation was of the order of 10%, then nearly 90% (475 clones) of the mouse clones would be expected to contain at least one (AC)n repeat. The mouse clones (528 in total) represented 1.9 megabases of DNA sequence, so the average physical distance for at least one (AC)n repeat to occur is 40 kb (1.9 Mb / 1.7 Mb × 36 kb = 40 kb), assuming that the clones in this set do not overlap. Comparison of this figure with the frequency of (AC)n repeats in the database (one every 18 kb) suggests that, on average, one mouse cosmid clone could contain at least two (AC)n repeats. This is not to say that no chromosome or chromosomal region would be expected to have a low microsatellite content, since there appears to be significant depletion of variable
(CA)n markers on mouse chromosomes 10 and X (Dietrich et al., 1994) and on human chromosomes 9 and 19 (Weissenbach, 1993). Some autosomal regions in the mouse microsatellite map show large genetic gaps, probably reflecting the presence of recombinational 'hotspots' rather than physical clustering of (AC)n repeats.
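The cosmid-based estimate of (AC)n spacing in the mouse, described earlier in this section, can be checked with a few lines of arithmetic. The following is a minimal sketch, not the calculation used by Stallings and colleagues: it works from the clone counts quoted in the text, takes the 10% detection error as the assumption stated there, and the function and variable names are illustrative only.

```python
# Figures quoted in the text; names are illustrative, not from the source.
TOTAL_CLONES = 528         # mouse cosmid clones on the grids
CLONE_SIZE_KB = 36.0       # average cosmid insert size
OBSERVED_POSITIVE = 0.80   # fraction positive by grid hybridisation
ASSUMED_MISSED = 0.10      # assumed fraction missed by grid hybridisation
DATABASE_SPACING_KB = 18.0 # mouse (AC)n spacing from database searches


def mean_spacing_kb(total_clones, positive_fraction, clone_size_kb):
    """Average distance between clones carrying at least one repeat.

    Assumes the clones do not overlap, as the text does.
    """
    positive_clones = total_clones * positive_fraction
    return total_clones / positive_clones * clone_size_kb


corrected_fraction = OBSERVED_POSITIVE + ASSUMED_MISSED
positive_clones = round(TOTAL_CLONES * corrected_fraction)
cosmid_spacing_kb = mean_spacing_kb(TOTAL_CLONES, corrected_fraction, CLONE_SIZE_KB)

# If repeats really occur every 18 kb, each positive clone must on
# average carry more than one of them.
repeats_per_positive_clone = cosmid_spacing_kb / DATABASE_SPACING_KB

print(f"positive clones: {positive_clones}")
print(f"spacing from cosmids: {cosmid_spacing_kb:.1f} kb")
print(f"repeats per positive clone: {repeats_per_positive_clone:.1f}")
```

The ratio of total to positive clones (528/475) is the same ratio as the 1.9 Mb / 1.7 Mb quoted in the text, so the spacing comes out at about 40 kb either way, and dividing by the database figure gives a little over two repeats per positive clone.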
The microsatellite content of the mouse X chromosome appears to be about 50% of what would be expected from its cytogenetic length (Dietrich et al., 1994). Also, the degree of allelic variation among laboratory strains for X-chromosome microsatellites is 33%, compared with about 50% for autosomal ones (except for MMU10, with 36% variability). The reason why microsatellites on the X chromosome are less variable is not clear. In humans, most new mutations at the Lesch-Nyhan disease locus are of paternal origin (Francke et al., 1976) and the human X chromosome is less variable, as shown by a lower rate of RFLP detection (Hofker et al., 1986). It is thought that spermatocytes acquire more mutations than oocytes because more cell divisions are involved in gametic maturation in males than in females (Hofker et al., 1986). As females have two X chromosomes and males only one, an X chromosome passes through oogenesis in 2/3 of cases and through spermatogenesis in only 1/3. This could ultimately lead to proportionally lower polymorphism on the X chromosome than on the autosomes (Francke et al., 1976). The lack of recombination over most of the X chromosome in male meiosis could contribute to the low rate of RFLP detection, but it is a less likely explanation for the low level of variation at microsatellites on the mouse X chromosome. The most puzzling observation is that human X-chromosome microsatellites are as polymorphic as autosomal ones (Weissenbach, 1993). The mouse X chromosome is one of the best-mapped chromosomes for other genetic markers, and significant progress has been made in establishing physical maps for certain regions (Brown et al., 1993; Brown, 1994). These data will help establish the true microsatellite content of the X chromosome and perhaps shed some light on the underlying causes of the low variability of its microsatellites.
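The transmission argument above (an X chromosome spends 2/3 of its time in females and 1/3 in males, whereas an autosome spends half in each) can be turned into a rough expectation for the X-to-autosome mutation ratio. The sketch below is illustrative only: the male-to-female mutation-rate ratio, here called alpha, is a hypothetical parameter not given in the text, and the values tried are arbitrary.

```python
def x_to_autosome_mutation_ratio(alpha):
    """Expected X/autosome mutation-rate ratio.

    alpha is the (hypothetical) ratio of the male to the female
    germline mutation rate; the female rate is set to 1.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    # X chromosome: 2/3 of transmissions via oogenesis, 1/3 via spermatogenesis.
    x_rate = (2.0 / 3.0) * 1.0 + (1.0 / 3.0) * alpha
    # Autosome: half of transmissions through each sex.
    autosome_rate = 0.5 * 1.0 + 0.5 * alpha
    return x_rate / autosome_rate


# Arbitrary illustrative values of alpha, not estimates from the literature.
for alpha in (1.0, 2.0, 5.0, 10.0):
    ratio = x_to_autosome_mutation_ratio(alpha)
    print(f"alpha = {alpha:>4}: X/autosome ratio = {ratio:.3f}")
```

With equal mutation rates in the two sexes the ratio is 1, and it falls towards a lower limit of 2/3 as the male rate increasingly exceeds the female rate. Under this simple model, male-biased mutation alone therefore cannot reduce X-linked mutation input below two thirds of the autosomal value.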
In the human genome, the subtelomeric regions appear to contain gaps on the microsatellite genetic map (Weissenbach et al., 1992; Gyapay et al., 1994). Subtelomeric regions frequently contain a subclass of R-bands called T-bands, which harbour the most GC-rich fraction of the genome, known as isochore H3 (Saccone, 1992; reviewed by Bernardi, 1993). Genomic libraries enriched for isochore H3 DNA contained the same proportion of microsatellite clones as total human genomic DNA libraries (cited in Weissenbach, 1993). The genetic distribution of (AC)n markers derived from isochores is reported to be quite distinct from that of the other markers (cited in Gyapay et al., 1994). The genetic distances of subtelomeric regions are known to be amplified in man (Murray et al., 1994); this could explain the genetic gaps observed in these regions. Furthermore, GC-rich regions may be less amenable to amplification by PCR, and this might create a bias by discarding sequences that fail or are difficult to amplify.
The distribution of microsatellites has been considerably conserved between closely related species, such as the mouse and rat or humans and chimpanzees, so that primers designed for a microsatellite locus in one species can often be used to amplify the homologous locus of a related species (Stallings et al., 1991; Moore et al., 1991; Deka et al., 1994). The use of conserved microsatellites will help advance comparative maps and could provide a powerful tool for constructing framework genetic maps across related species, so that comparative data can be collated more accurately. Comparisons of the position of microsatellites have revealed that even in species as diverse as mice and humans, the position of some repeat runs has been conserved. Six out of 20 microsatellites found in man and mouse evolved in homologous positions (Stallings et al., 1991; Stallings, 1995). Such observations would suggest that some microsatellite sites are of very ancient origin. However, this conclusion rests on the premise that the conserved microsatellites pre-existed in an ancestral species and did not evolve at these sites by chance. On the other hand, such comparisons might indicate that certain DNA segments are more prone to the formation of tandem repeats than others, possibly through the action of cis-acting regulatory sequences.