positional candidate gene approaches' Collins, F.S (1995)
1.2.2 Physical maps of the human genome
Physical maps consist of ordered, overlapping cloned fragments of genomic DNA covering each chromosome. The different genetic maps of the human genome that have been assembled so far all represent the same concept: sets of linked polymorphic markers (linkage groups) corresponding to different chromosomes. However, a variety of different types of physical map are possible. In a sense, the first physical map of the human genome was obtained when cytogenetic banding techniques were used not only to distinguish the different chromosomes, but also to discriminate between different sub-chromosomal regions (Bernardi, 1993). Although the resolution of this map is coarse (an average-sized chromosome band in a 550-band preparation contains ~6 Mb of DNA), it has been very useful as a framework for ordering the locations of human DNA sequences by chromosome in situ hybridisation techniques (Trask, 1991; van Ommen et al., 1995).
Other maps have been obtained by mapping natural chromosome breakpoints, using panels of somatic cell hybrids containing fragments derived from translocation and deletion chromosomes (Abrams et al., 1995), or by mapping artificial breakpoints using radiation hybrids (RH) (Walter and Goodfellow, 1993). However, the resolution achieved by such hybrid cell panels, that is, the average distance between neighbouring breakpoints, can be limited for large parts of the genome. As a result, higher resolution physical maps are desirable. Clearly, the physical map that provides the highest possible resolution, that of single base pairs, is the ultimate map: the complete nucleotide sequence of the genome. As this will not be achieved for some time, attention has been focused on constructing physical maps of intermediate resolution. Comprehensive RH maps and rare-cutter restriction maps have been achieved thus far for only a few human chromosomes. One example is chromosome 21, where a NotI restriction map has been published for the entire long arm (Ichikawa et al., 1993). In addition, much of the current mapping effort is aimed at mapping coding DNA sequences, thereby producing comprehensive transcription maps.
A major intermediate goal of the Human Genome Project is to construct a complete contig map of the DNA of each of the 24 different types of human chromosome. This means relating different DNA clones to define a series of partially overlapping DNA molecules covering the entire length of a chromosome. Overlaps between the DNA segments of different clones can be identified by a variety of procedures: 1) chromosome walking, which establishes clone contigs from fixed starting points (Anand et al., 1991; Little et al., 1992); 2) repetitive DNA fingerprinting, which characterises each clone in terms of the pattern of restriction fragments detected by two human repetitive sequence probes, an approach that has been applied to the whole human genome, largely on the basis of repetitive DNA fingerprinting of YACs (Bellanne-Chantelot et al., 1992); and 3) sequence-tagged site (STS) content mapping (Green and Olson, 1990; Cole et al., 1992), which has been carried out by PCR-based screening with genetically mapped microsatellite markers; YACs identified as containing such markers were referred to as 'genetically anchored YACs'. In this multilevel mapping approach, physical map construction is directly based on integration with the genetic map.
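The logic of STS content mapping can be illustrated with a minimal sketch: clones that test positive for the same STS are assumed to overlap, and clones linked by such overlaps are grouped into one contig. The function name `build_contigs`, the clone names and the marker names below are hypothetical and chosen only for illustration; real map construction must also order the markers and deal with false positives and chimeric clones, which this sketch ignores.

```python
from collections import defaultdict


def build_contigs(sts_content):
    """Group clones into contigs, treating any shared STS as evidence of overlap."""
    # Index the screening results by marker: STS -> clones that contain it.
    clones_by_sts = defaultdict(set)
    for clone, markers in sts_content.items():
        for sts in markers:
            clones_by_sts[sts].add(clone)

    # Union-find structure: every clone starts as its own contig.
    parent = {clone: clone for clone in sts_content}

    def find(clone):
        # Follow parent links to the representative clone, shortening the path.
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    # Merge all clones that share a given STS.
    for clones in clones_by_sts.values():
        ordered = sorted(clones)
        for other in ordered[1:]:
            parent[find(other)] = find(ordered[0])

    # Collect the clones belonging to each representative.
    contigs = defaultdict(list)
    for clone in sts_content:
        contigs[find(clone)].append(clone)
    return sorted(sorted(members) for members in contigs.values())


# Hypothetical PCR screening results: clone name -> STSs detected in that clone.
screen = {
    "yacA": {"sts1", "sts2"},
    "yacB": {"sts2", "sts3"},
    "yacC": {"sts3", "sts4"},
    "yacD": {"sts7"},
}
contigs = build_contigs(screen)
print(contigs)
```

In this toy data set yacA, yacB and yacC are chained together through sts2 and sts3, whereas yacD shares no marker with any other clone and remains a singleton.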
Given the large size of mammalian genomes, physical mapping of the entire human genome requires the use of clones with large DNA inserts, of the order of 1 megabase (Mb). Yeast artificial chromosomes (YACs) are currently the only cloning system capable of propagating such large DNA fragments and hence have been particularly useful in generating such contigs. However, YACs suffer from high rates of chimerism and rearrangement (Larionov et al., 1994) and thus are unsuitable for genomic sequencing (Chumakov et al., 1995). STS-based maps sidestep this problem by having a sufficiently high density of landmarks that physical coverage of any region can be rapidly regenerated by PCR-based screening of clones appropriate for sequencing, such as cosmids, bacteriophage P1 clones, bacterial artificial chromosomes (BACs) and P1-derived artificial chromosomes (PACs) (reviewed in Monaco and Larin, 1994). Ultimately, high resolution maps based on cosmid contigs will provide a suitable framework for sequencing whole chromosomes. Significant contig maps for individual human chromosomes were first obtained for chromosome 21 (Chumakov et al., 1992) and the Y chromosome (Foote et al., 1992). Reasonably comprehensive YAC contig maps have been published for chromosomes 3 and 22 (Gemmill et al., 1995; Collins et al., 1995), as have integrated maps of chromosomes 16 and 19, which include high resolution cosmid contigs (Little, 1995).
The first generation physical map of the human genome was constructed by exhaustive screening of the CEPH YAC library, which contains 33,000 YACs with an average insert size of 0.9 Mb, representing 10 haploid genome equivalents (Cohen et al., 1993). Overlaps between YAC clones were identified using three methods: repetitive DNA fingerprinting, STS content mapping and Alu-PCR probe hybridisation (Nelson et al., 1989). Whilst this physical map was far from complete, with poor coverage of some chromosomes, it provided a framework for the scientific community to build upon in order to produce maps of all the chromosomes. This detailed mapping information, which was made widely available by electronic access through the Internet, was used by various researchers working on specific chromosomes, or often sub-chromosomal regions, of interest. An updated YAC contig map has since been published, covering about 75% of the human genome and consisting of 225 contigs with an average size of 10 Mb (Chumakov et al., 1995).
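The library depth and map coverage quoted above can be checked with simple arithmetic. The sketch below assumes a haploid genome size of about 3,000 Mb, a figure that is not stated in the text; the function names are hypothetical.

```python
# Assumption (not stated in the text): haploid human genome of ~3,000 Mb.
GENOME_MB = 3000.0


def genome_equivalents(n_clones, mean_insert_mb, genome_mb=GENOME_MB):
    """Total cloned DNA in the library, expressed as multiples of the genome size."""
    return n_clones * mean_insert_mb / genome_mb


def covered_fraction(n_contigs, mean_contig_mb, genome_mb=GENOME_MB):
    """Fraction of the genome spanned by contigs of a given mean size."""
    return n_contigs * mean_contig_mb / genome_mb


# CEPH library: 33,000 YACs with a 0.9 Mb average insert.
library_depth = genome_equivalents(33_000, 0.9)
# Updated YAC contig map: 225 contigs averaging 10 Mb.
updated_cover = covered_fraction(225, 10.0)

print(f"library depth: {library_depth:.1f} genome equivalents")
print(f"contig coverage: {updated_cover:.0%}")
```

Under this assumption the library holds roughly 9.9 genome equivalents, consistent with the quoted figure of 10, and 225 contigs of 10 Mb span 2,250 Mb, or 75% of the genome.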
By providing a common language for physical mapping projects, the use of STSs allowed incorporation of any type of physical mapping data into the evolving map. A physical map of the human genome has now been constructed based on 15,086 sequence-tagged sites (STSs), with an average spacing of 199 kilobases (Hudson et al., 1995). This involved assembly of a radiation hybrid map of the human genome comprising 6,193 loci and incorporated a genetic linkage map of the human genome comprising 5,264 loci. This, combined with the results of STS-content screening of 10,850 loci against a yeast artificial chromosome library, produced an integrated map, anchored by the radiation hybrid and genetic maps. This map represented an early step in an international project to generate a transcript map of the human genome, with more than 3,235 expressed sequences localised. Recently, a map of 30,181 human gene-based markers was assembled and integrated with the current genetic map (Schuler et al., 1996) by radiation hybrid mapping. This new gene map (Deloukas et al., 1998) consisted of data from 41,664 STSs based on 3' untranslated regions of cDNAs. It contained nearly twice as many genes as the previous release, including most genes that encode proteins of known function.
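The average marker spacing quoted for the STS-based map follows from dividing the genome length by the number of markers. The sketch below again assumes a genome of about 3,000 Mb (3,000,000 kb), which is not stated in the text, and assumes evenly spread markers, which real maps only approximate; the function name is hypothetical.

```python
# Assumption (not stated in the text): haploid human genome of ~3,000,000 kb.
GENOME_KB = 3_000_000


def average_spacing_kb(n_markers, genome_kb=GENOME_KB):
    """Mean distance between neighbouring markers if they were spread evenly."""
    return genome_kb / n_markers


# STS-based physical map: 15,086 STSs.
sts_spacing = average_spacing_kb(15_086)
print(f"average STS spacing: {sts_spacing:.0f} kb")
```

With 15,086 STSs this gives a spacing of about 199 kb, matching the reported value.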