The partial intron/exon structure of MB1 was determined by comparing the complete MB1 cDNA sequence with the genomic sequence contained in contigs 5, 6 and 7 (figure 6.6). This sequence analysis revealed three exons (exons 1, 2 and 3) of 33 bp, 306 bp and 283 bp, respectively, separated by two introns of >510 bp and >2.2 kb. As the genomic sequence of MB1 is spread across three different contigs, it is not possible to determine the exact sizes of the introns. However, it is likely that the gaps in the genomic sequence between contigs 5 and 6, and between contigs 6 and 7, are less than 200 bp (S. Beck, personal communication). Hence, it is probable that intron 1 is 510-710 bp in size and intron 2 is 2220-2420 bp. Therefore, MB1 spans approximately 4 kb of genomic sequence. Both introns 1 and 2 are rich in Alu repetitive sequences, which could explain why it has been difficult to obtain complete sequence coverage of MB1 at the genomic level (S. Beck, personal communication). Analysis of the MB1 cDNA sequence revealed that the cDNA contains a 5' untranslated region of 121 bp.

[Figure 6.6: schematic showing exons E1-E6 of human LMP7, human LMP2 and mouse delta, and exons E1-E3 of human MB1; scale bar 500 bp.]
Figure 6.6. Comparison of the genomic organisation of human LMP2, LMP7 and mouse delta genes with the human MB1 gene. All transcripts are shown in a 5' to 3' orientation.
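The intron size ranges and the overall gene span quoted above follow from simple arithmetic on the reported exon sizes, the minimum sequenced intron lengths and the assumed upper limit of 200 bp for each contig gap. The following minimal sketch reproduces that calculation. It assumes that each intron contains exactly one unsequenced gap and that ">2.2 kb" corresponds to a minimum of 2220 bp, as implied by the quoted range; the function and variable names are illustrative only.

```python
# Exon sizes (bp) reported for MB1 exons 1, 2 and 3.
EXONS_BP = [33, 306, 283]

# Minimum sequenced lengths (bp) of introns 1 and 2.
# 2220 bp is taken from the range quoted in the text (assumption for ">2.2 kb").
INTRON_MIN_BP = [510, 2220]

# Assumed upper bound (bp) on each unsequenced gap between contigs.
MAX_GAP_BP = 200


def intron_ranges(minimums, max_gap):
    """Return (low, high) size bounds for each intron, one gap per intron."""
    return [(m, m + max_gap) for m in minimums]


def gene_span(exons, introns):
    """Return (low, high) bounds on the genomic span covered by exons and introns."""
    exon_total = sum(exons)
    low = exon_total + sum(lo for lo, _ in introns)
    high = exon_total + sum(hi for _, hi in introns)
    return low, high


ranges = intron_ranges(INTRON_MIN_BP, MAX_GAP_BP)
span = gene_span(EXONS_BP, ranges)

print(ranges)  # [(510, 710), (2220, 2420)]
print(span)    # (3352, 3752)
```

Under these assumptions the three exons and two introns cover roughly 3.4-3.8 kb, which is in line with the estimate of approximately 4 kb.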
Comparison of the genomic organisation of MB1 with that of LMP2, LMP7 and mouse delta showed that MB1 differs significantly from the other proteasome genes at the genomic level (figure 6.6). The genomic organisation of exons and introns in human LMP2, human LMP7 and mouse delta is consistent between the three genes: each is made up of six exons and five introns, although the sizes and locations of the introns vary. Human LMP2 is closely related to human delta, showing 59% identity at the amino acid level. Comparison of the intron/exon organisation of the LMP2 gene with that of mouse delta supports this close relationship. The human delta gene is most likely to have a similar exon/intron organisation to that of LMP2, since the equivalent mouse gene has this pattern. The human MB1 and LMP7 genes are also closely related, showing 67% identity at the amino acid level. It would be expected that this relationship would also be echoed in the genomic organisation of the two genes. However, this is not the case. An MB1-related sequence is thought to have duplicated to produce LMP7 (Belich et al., 1994), which was subsequently brought together with LMP2 and the TAPs in the MHC. Since the intron/exon organisation of MB1 and LMP7 differs so widely, it is likely either that MB1 and LMP7 have been separated for a long evolutionary time or that the altered gene structures have functional or regulatory consequences.
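The identity values quoted above (59% and 67%) are percentages of identical residues between aligned protein sequences. The text does not state how they were calculated, so the sketch below shows only one common convention: identical residues divided by the number of aligned columns in which neither sequence has a gap. The function name and the two short sequences are hypothetical and are not taken from MB1, LMP7, LMP2 or delta.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment, for illustration only.
seq_x = "MKT-AYLG"
seq_y = "MRTWAYIG"

print(round(percent_identity(seq_x, seq_y), 1))  # 71.4
```

Other denominators (for example, the length of the shorter sequence or the full alignment length including gaps) give slightly different values, so identity figures are only directly comparable when the same convention is used.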
6.3. Conclusions
Analysis of the molecular anatomy of genes and gene families has been used to provide answers concerning the evolution of such genes. MHC class I and class II molecules, for example, show clear homology to immunoglobulin domains (Orr et al., 1979) and therefore belong to the immunoglobulin superfamily (Williams and Barclay, 1989). Exactly when MHC molecules first evolved remains unclear; however, similar genes have been reported in amphibians, fish, reptiles and birds. This implies that they were present prior to vertebrate evolution, which occurred about 400 million years ago (Kaufman et al., 1990). The primordial gene for MHC antigens is thought to have encoded a single molecule similar to a class II gene which associated to form heterodimers. Duplication and divergence led to a multigene family encoding heterodimers like present-day class II antigens. Eventually a gene coding for a chain with three extracellular domains was formed and associated with β2-m, giving rise to a class I-like molecule. Duplication and divergence of this gene led to the multiple class I loci in the MHC (Kaufman et al., 1984; Hughes and Nei, 1993).
Nucleotide and amino acid analysis of class II genes shows that all the α-chain genes are equally diverged from one another, suggesting that they arose by duplication at roughly the same time (Auffray et al., 1984). Similar studies performed on β-chain sequences suggest that DPB1, DQB1 and DRB are equally related and that DOB may have diverged prior to the events that gave rise to the other β-chain loci (Tonnelle et al., 1985). More recent duplication events are responsible for the highly related genes DQA1 and DQA2, DQB1 and DQB2, DPA1 and DPA2, and DPB1 and DPB2.
The TAP and LMP genes in the MHC class II region represent a gene cluster whose protein products are involved in the class I antigen processing pathway. Belich et al. (1994) have proposed a model to explain the genetic origins of the LMP genes, discussed in section 6.1.4. This chapter has described a preliminary study aiming to determine how the TAP/LMP gene cluster came to reside in the class II region.
Cosmids encompassing the LMP7-related gene, MB1, were isolated and used to identify novel cDNAs in close proximity to the MB1 gene. Three cDNAs were isolated and mapped by hybridisation to cosmid blots. The four cDNAs (including MB1) were all localised to a 12 kb BglII restriction fragment, their close proximity hinting at the presence of a gene cluster. Genomic sequencing of the 12 kb BglII restriction fragment was carried out and showed that one of the clones was composed of Alu repeat sequences. The genomic sequence and partial cDNA sequences of the remaining two cDNA clones were used to search the EMBL and Swiss-Prot databases but no significant matches were found. The cDNA clones mapping close to MB1 are, therefore, unrelated to the TAP transporters. Genomic sequencing also revealed that the 5' end of MB1 is located 81 bp from a BglII restriction enzyme site. If the 12 kb BglII fragment, encoding the MB1 gene and the cDNA clones M11, M14 and M15, is located at one end of cosmid MB4, it is still possible that TAP-related sequences reside close to MB1. Further restriction enzyme mapping and sequencing of cosmid MB4 will determine if this is the case.
A tight cluster of genes that are apparently unrelated by function has been discovered on chromosome 16q22.1 (Larsen et al., 1993). The genes for a protein serine kinase (PSKH1), the previously cloned lecithin:cholesterol acyltransferase (LCAT), a protein of unknown function and the proteasome subunit MECL-1 are located in a 12 kb genomic region which includes a CpG island. The expression of MECL-1 is inducible with γ-IFN and is reciprocal to that of the proteasome subunit Z (Tanaka, 1995). It has been suggested that γ-IFN may induce the replacement of subunit Z by MECL-1 (along with the replacement of delta and MB1 by LMP2 and LMP7, respectively), producing proteasomes that are more appropriate for antigen processing through the class I pathway (Tanaka, 1995). The cluster of genes on chromosome 16 includes both genes with restricted tissue expression patterns and widely expressed genes. The tight clustering of these genes suggests that they may be subject to reciprocal transcriptional regulation (Larsen et al., 1993). Further genomic sequence analysis of the MB4 cosmid, on chromosome 14, will determine if MB1 exists in such a gene cluster.
Comparison of the MB1 cDNA sequence with the genomic sequence enabled determination of the MB1 intron/exon organisation. MB1 is composed of three exons which are 33, 306 and 283 bp in size. Introns 1 and 2 are >510 bp and >2.2 kb in size, respectively. This organisation is very different from that of LMP7 and other proteasome genes. Gene duplication, a major mechanism in evolution, is responsible for the existence of all present-day genes (Ohno, 1970). For example, the chicken ovomucoid gene is thought to have evolved by triplication of an ancestral protein-coding DNA segment split by one intron. It is now composed of seven introns and eight exons (Breathnach and Chambon, 1981). There is also evidence that the conalbumin gene, composed of sixteen introns and seventeen exons, evolved by duplication from an ancestral gene with seven or eight exons (reviewed in Breathnach and Chambon, 1981). By analogy, MB1 probably duplicated to produce LMP7, which is composed of six exons and five introns. From these data it is impossible to comment on when the duplication event occurred.
Figure 6.6 shows the intron/exon structure of mouse delta, which is similar to that of human LMP2. Since human and mouse class II genes are related both in function and genomic organisation, it is to be expected that human delta will have a similar molecular anatomy to that of mouse delta and hence human LMP2. This is in contrast to the relationship between human MB1 and LMP7, whose genomic organisations differ. The reasons for this are unclear. However, determination of the genomic organisation of human delta may help in understanding the evolution of the proteasome genes.