Japan and†Primate Research Institute, Kyoto University, Inuyama, Aichi 484-8506, Japan Manuscript received November 5, 2003
Accepted for publication November 16, 2003
ABSTRACT
New World monkeys (NWMs) occupy a critical phylogenetic position in elucidating the evolutionary process of major histocompatibility complex (MHC) class I genes in primates. From three subfamilies of Aotinae, Cebinae, and Atelinae, the 5⬘-flanking regions of 18 class I genes are obtained and phylogenetically examined in terms ofAlu/LINEinsertion elements as well as the nucleotide substitutions. Two pairs of genes from Aotinae and Atelinae are clearly orthologous tohuman leukocyte antigen(HLA)- Eand- Fgenes. Of the remaining 14 genes, 8 belong to the distinct group B, together withHLA-Band -C, to the exclusion of all otherHLAclass I genes. These NWM genes are classified into four groups, designated asNWM-B1, -B2,-B3, and-B4. Of these,NWM-B2is orthologous toHLA-B/C. Also, orthologous relationships of NWM-B1,-B2, and-B3exist among different families of Cebidae and Atelidae, which is in sharp contrast to the genus-specific gene organization within the subfamily Callitrichinae. The other six genes belong to the distinct group G. However, a clade of these NWM genes is almost equally related toHLA-A, -J,-G, and -K, and there is no evidence for their orthologous relationships toHLA-G. It is argued that class I genes in simian primates duplicated extensively in their common ancestral lineage and that subsequent evolution in descendant species has been facilitated mainly by independent loss of genes.
G
LYCOPROTEINS encoded by genes in the major Table 1 lists class I homologs ofhuman leukocyte antigen (HLA) genes of simian primates. Note that not all of histocompatibility complex (MHC) region inver-these homologs are orthologous toHLAclass I genes. tebrates trigger the acquired immune system by
present-Although there is the 1:1 orthology among humans ing non-self-peptides to T cells (Klein1987;Kleinand
and great apes ofAlocus, there is no 1:1 orthologous
Horˇejsˇı´ 1997). The number of bona fide MHC genes
relationship betweenHLA-AandPatr-AL(Adamset al. seems to have been optimized by dual functions of MHC
2001). In Old World monkeys (OWMs), there are many molecules: T-cell restriction by self-peptides and
presen-A-like genes in addition to the so-calledAGgene (
Boy-tation of non-self-peptides (Takahata 1995; Celada
sonet al. 1996, 1997). Since theseA-related genes dupli-andSeiden1996;Wegneret al.2003).MHCgenes are
cated specifically in the lineage of OWMs (Boysonet divided into classes I and II with respect to their
struc-al. 1996, 1997), they are orthologous to hominoidA, but ture and function. Each class is further subdivided into
they are not 1:1 orthologs toHLA-A. In more distantly classical and nonclassical according to the pattern of
related New World monkeys (NWMs), no A-like genes gene expression or the extent of polymorphism (Klein
have been found. 1987). Classical class I genes are ubiquitously expressed
For the B and C loci, the 1:1 orthology is demon-and generally exhibit high degrees of polymorphism.
strated within hominoids. HLA-C orthologs are found Nonclassical class I genes are expressed mainly in
re-in African apes and orangutans (Chenet al. 1992;Adams
stricted tissues or organs and exhibit relatively low
ex-et al. 1999) and this is consistent with the suggestion tents of polymorphism. Some nonclassical MHC
mole-that the duplication of HLA-B and -C occurred ⬎40 cules function as a ligand to a natural killer cell receptor
million years ago (MYA; Kulski et al. 1997; but see (NK receptor) and send a signal to prevent the cell lysis
Piontkivska and Nei 2003). Since this estimate pre-by the NK cell (Braudet al.1998;Leeet al. 1998;
Leib-dates the divergence between hominoids and OWMs
son1998).
(Martin1993), it is likely that the ancestor once pos-sessed the HLA-Cortholog. Thus, theHLA-Cortholog seems to have been lost in gibbons and OWMs. On the other hand, orangutans and OWMs experienced gene
Sequence data from this article have been deposited with the
EMBL/GenBank Data Libraries under accession nos. AB113090– duplication of theBlocus independently, so that these AB113112 and AB113202–AB113205.
genes within each species as a whole are orthologous
1Corresponding author:Department of Biosystems Science, Graduate
toHLA-B.Cadavidet al.(1997) found twoB-like genes
University for Advanced Studies, Hayama, Kanagawa 240-0193, Japan.
E-mail: [email protected] in spider monkeys (Ateles belzebuth) and saki monkeys
TABLE 1
Evolutionary relationships among nonhuman primate class I genes with 10HLAloci
Species or subfamily Classical genes Nonclassical genes Pseudogenes
Homo sapiens A B C G E F H J K L
Pan troglodytesa A B C G E F H J K L
AL
Gorilla gorilla A B C G H
Pongo pygmaeusb A B1 C G E
B2
Hylobates lar A B
Macaca mulattac A1 B G E F
A2 I
AG
Cebinaed (1)
Aotinae (2)
Callitrichinae (1–3) E F
Pitheciinae B (3)
Atelinae B (2)
aPart-ALis polymorphic in terms of presence or absence and is paralogous toHLA-A.
bPopy-C is polymorphic in terms of presence or absence. Popy-B1 and -B2 are duplicates specific to the
orangutan, making them group orthologs toHLA-B.
cMamu-G is a pseudogene orthologous to HLA-G. There are multiple Bloci in M. mulatta. However, the
phylogenetic analysis does not classify sequences into well-defined groups and the number of loci is still unknown.Mamu-Iis different fromBloci and emerged specifically in OWMs.Mamu-A1,-A2, and-AGduplicated in OWMs and group orthologs ofHLA-A.
dThe number of knownG-related loci is given in parentheses because of the uncertainty of the orthology
toHLA-G. The orthologous relationships of theB-related genes in Pitheciinae and Atelinae are not known.
(Pithecia pithecia). However, theseB-like genes in NWMs ments. We use this information as well as the nucleotide substitutions to study the evolutionary relationships of are paraphyletic with respect toHLA-B/C(Cadavidet al.
1997) and their orthologous relationships are not dem- class I genes in simian primates. onstrated.
As for the nonclassical class I loci,E andF are well
MATERIALS AND METHODS conserved and their 1:1 orthologies are largely
estab-lished among primates (Otting and Bontrop 1993; HLAclass I sequences:We retrieved genomic sequences of
Knappet al. 1998). By contrast, the orthologous relation- 10HLAloci:HLA-A(GenBank accession nos. AP000519 and AP000520), HLA-B(AP000507), HLA-C(AP000508),HLA-E
ships of G can be seen only between hominoids and
(AP000514), HLA-F (AP000521), HLA-G (AP000521),
HLA-OWMs (Boyson et al. 1997; Castro et al. 2000). In
H/54 (AP000520),HLA-J/59 (AP000519),HLA-K/70 (AP00-NWMs,G-like genes have been reported in several sub- 0520), andHLA-L/92(AP000516). The length of the 5⬘- and families. However, the orthology of these G-like genes 3⬘-flanking regions isⵑ10 kb each and that of the coding to HLA-G has not been established. Furthermore, in region isⵑ3 kb.
Samples:We used the owl monkey (Aotus trivirgatus), the
the subfamily Callitrichinae (cotton-top tamarins and
tufted capuchin (Cebus apella), and the spider monkey (A.
marmosets),G-like genes have duplicated in
genus-spe-belzebuth) as representatives of subfamilies of Aotinae, Cebinae, cific manners and may function as classical class I in and Atelinae, respectively. We also used the rhesus monkey the absence of classicalHLAorthologs (Watkinset al. (Macaca mulatta) because its MHC is best studied among 1990;CadavidandWatkins1997;Cadavidet al. 1997, OWMs. We purchased the rhesus monkey genomic DNA from CLONTECH (Palo Alto, CA) and prepared the NWM genomic 1999). In other subfamilies, there are few studies about
DNA from a 2-ml blood sample of each individual kept at the the origin and evolution ofMHCclass I genes (Adams
Primate Research Institute by using a blood and cell culture andParham2001). DNA kit (QIAGEN, Chatsworth, CA).
To understand the evolutionary dynamics of primate PCR and sequencing:To amplify the 5⬘-flanking and coding MHCgenes, it is essential to establish the orthologous sequences of class I loci, we designed PCR primers specific to certain groups of HLA loci: pAluE, 5⬘-GACCCTGTCTCTC relationships of various MHC loci, particularly B- and
TAAACAACAGCA-3⬘; pAluB, 5⬘-AGGCATCCTAAYCAGTG G-related genes. To this end, we design PCR primers to
ele-Figure1.—NJ trees of 10 HLAloci based on thep -dis-tances in (A) the 5⬘-flanking region (1105 bp), (B) in-trons (1559 bp), (C) exons 2–3 (542 bp), and (D) ex-ons 4–8 (462 bp). For the 5⬘-flanking region,HLA-Eis excluded. Only bootstrap values⬎65% are shown.
(available upon request). The amplified PCR fragments were RESULTS ⵑ3 kb and were purified [PCR purification kit (QIAGEN),
HLA loci and their flanking regions:In general, the
S.N.A.P. gel purification kit (Invitrogen, San Diego)]. Purified
fragments were cloned (TOPO cloning kit; Invitrogen). To topological relationships among class I genes strongly avoid sequencing errors, we sequenced three or more clones depend on the region and the length of sequences used for each PCR fragment in both directions by about six sets of in the phylogenetic analysis (Parhamet al. 1989, 1995). sequence primers. We performed sequencing reactions by
We compared the 5⬘- and 3⬘-flanking sequences of the using the dye terminator cycle sequencing method [DNA
se-10HLAloci in addition to their coding sequences. How-quence kit (ABI, Columbia, MD)] and the DNA seHow-quencer
(ABI377; ABI). According to the MHC designation system ever, because the alignment in the 3⬘-flanking region (Kleinet al.1990;BontropandKlein1997),MHCgenes in is difficult to make, particularly forHLA-E and -F, we M. mulatta,A. trivirgatus,C. apella, andA. belzebuthare subse- excluded the 3⬘-flanking region from further analysis. quently prefixed byMamu,Aotr,Ceap, andAtbe, respectively.
We analyzed the 5⬘-flanking region, introns, exons 2–3,
Sequence analysis:To identify homologous regions between
and exons 4–8 separately and evaluated the reliability of pairs of DNA sequences, we used Dotter (Sonnhammerand
Durbin1995) and aligned homologous regions by Clustal W the resulting phylogenetic trees in terms of the topology (Thompsonet al.1994). We further modified the alignment and the bootstrap value (BV). As expected from exten-manually wherever necessary. We constructed phylogenetic
sive nonsynonymous substitutions in exons 2–3 driven trees by the neighbor-joining (NJ) method (SaitouandNei
by balancing selection (HughesandNei1988;
Taka-1987) implemented in PHYLIP, version 3.572 (Felsenstein
1993). We used the p-distances (the observed numbers of hata1995), the tree topology in these exons appears nucleotide differences per site) to determine the topological to be much affected by homoplasy and significantly dif-relationships, as recommended bySaitou andNei (1987),
ferent from that in other regions. In addition, the phylo-and the d-distances (the estimated numbers of nucleotide
genetic relationships are least reliable in terms of BVs substitutions;Kimura1980) to estimate the divergence times
Figure 2.—Insertion sites of Alu/ LINEelements, indels, and PCR primer positions in four groups ofHLAclass I genes. The three PCR primers of pAluE, pAluB, and pAluG are designed to in-clude group-specific Alu sequences. Hatched rectangles indicate exons 1 and 2; open rectangles,Alus; stippled rectan-gles,L1/L2elements; thick arrows, PCR primer positions; thin arrows within rect-angles, the 5⬘–3⬘direction of an inser-tion element; vertical lines across differ-ent loci, the same Aluelements; open and solid triangles above each line, inser-tions and deleinser-tions, respectively; and the number above a triangle, the size of an indel.
regions show a more or less similar tree topology, and FLAM_Cis a subfamily of the left arm of theAlu mono-mer and is thought to have disseminated earliest (Jurka
the BVs are satisfactorily high for the 5⬘-flanking region
and introns (Figure 1). Importantly, the 5⬘-flanking re- andMilosavljevic1991;KapitonovandJurka1996). HLA-B,-C, and-Lshare anFLAM_Cby which they are gion contains a number of nucleotide insertions and
deletions (indels), as well as Alu,L1, andL2insertion grouped in B. In addition to thisFLAM_C, there is an AluJo that can discriminateHLA-B/CfromHLA-Lwithin elements. It turns out that these insertion elements and
indels are phylogenetically informative, so that we subse- group B. This AluJo is inserted in a 6- to 9-kb region upstream from exon 1 and is not depicted in Figure 2. quently focus on the 5⬘-flanking region.
Among phylogenetically informative insertion ele- There is anAluJb that can define a single clade of group A and group G. These groups can be distinguished from ments (Figure 2), there are four Alus, which allow us
to classify the 10 HLAloci into four groups: B (HLA-B, each other by anAluY specific to group A and by another AluJo specific to group G. Group E orHLA-Eis highly -C, and-L), A (HLA-Aand-H), G (HLA-G,-J,-K,and-F),
and E (HLA-E). Alus are primate-specific short inter- unusual in that fiveAlus are specifically inserted in the 5⬘-flanking region. The PCR primers inmaterials and
spersed elements, and on the basis of diagnostic sites
Alus are classified into four subfamilies:FLAMorFRAM, methods are designed to include these insertion ele-ments and we use them as phylogenetic markers to AluJ,AluS, andAluY (JurkaandMilosavljevic1991;
BatzerandDeininger2002). Since these subfamilies identify class I genes in OWMs and NWMs.
Classification ofMamuclass I genes:When the PCR
disseminated in different evolutionary times (Willard
et al. 1987;JurkaandMilosavljevic1991;Kapitonov primer set was applied to M. mulatta, it yielded five different 5⬘-flanking sequences each of 3–4 kb in length. and Jurka 1996; Batzer and Deininger 2002), we
Figure3.—Five different configu-rations ofAlu/LINEs inserted in the 5⬘-flanking region of Mamu class I genes. Symbols are the same as those in Figure 2.
ately classified into three groups: one into group E, one by insertion elements and indels are also supported by BLAST search. Using each of the 5⬘-flanking sequences into group B, and three into group A or G (Figure 3).
We could identify the group E sequence as Mamu-E ofMamuclass I genes as a query, we obtained the corre-spondingHLAorPatrortholog as the best hit. Similarly, (Boyson et al. 1995) from the presence of four Alus
and one L1, of which the insertion sites are identical BLAST search of an ⵑ150-bp exon 1–2 sequence showed that the Mamu-E and -F sequences are 99% to those ofHLA-E. For the other singleMamusequence
in group B, we found that the insertion sites of one identical to those in the database, respectively, and that theMamu-Asequence is 88% identical to theMamu-A*11 FLAM_Cand threeL1s are identical to those inHLA-L
(Figures 2 and 3). Therefore this sequence is designated sequence. However, theMamu-Kand-Lexon sequences turn out to be⬎90% identical toMamu-B*08, suggesting asMamu-L. To distinguish whether the three sequences
belong to group A or G, the group-A-specific AluY is either some confusion in locus identification or loose linkage between the 5⬘-flanking region and exons 1–2. not a useful marker since it is found mainly in hominoids
and rarely in OWMs or NWMs (BatzerandDeininger Classification of NWM class I genes:The same PCR primer set as forM. mulattayielded 18 NWM sequences: 2002). Instead, we distinguished group G from group
A by the presence or absence of the group-G-specific 7 fromA. trivirgatus, 5 fromC. apella, and 6 fromA. belzebuth (Table 2). Since the insertion sites of group-specificAlus AluJo. In other words, if thisAluJo is present, a sequence
is identified as a member of group G, and if not, it is are identical to those inHLAgenes (Figure 4), it is possi-ble to classify the 18 sequences into three groups: E, B, of group A. In this way, two of the threeMamusequences
are classified into group G and one into group A. The and G. Two sequences are classified into group E byL1 andAluSg, which are shared withHLA-E; 8 into group latter is designated asMamu-A(Milleret al. 1991) and
this designation is also supported by BLAST search (see B by FLAM_C, which are shared with HLA-B, -C, and -L; and the remaining 8 into group G byAluJo andL2, below).
Furthermore, according to indels that are shared with which are shared withHLA-G,-J,-K, and-F. The apparent lack of group A is due to either PCR primer mismatches HLA-Kor -F, the twoMamusequences in group G are
identified asMamu-Kand-F.Mamu-Kshares a single 95- or its true absence in the NWM genome.
The two group E sequences from A. trivirgatus and bp insertion withHLA-K, whileMamu-F (Otting and
Bontrop1993) shares nine indels (one 2-bp, one 6-bp, A. belzebuth(Table 2) are unambiguously designated as Aotr-E andAtbe-Eby insertion elements (Figures 2 and and one 120-bp insertion, and two 1-bp, two 3-bp, and
two 4-bp deletions) withHLA-F. Although our failure 4), which is consistent with the finding of conservation ofEgenes in simian primates (Knappet al.1998). Within to detectMamu-Band-G(Boysonet al. 1995, 1996) is
probably owed to insufficient screening of PCR prod- group B sequences, there are four different constella-tions (B1–B4) ofAlu/LINEs. In addition, there are eight ucts, we did newly identifyMamu-Land-K in addition
TABLE 2 1990; Knappet al. 1998) as well as group B (Cadavid
et al.1997) and group G sequences in NWMs (Watkins
Classification ofHLAloci and nonhuman primateMHCgenes
et al.1990). However, this method did not unambigu-ously classify all the amplified sequences into loci. For Groupsa
this reason and to make quantitative arguments, we
Species E B G A carried out the phylogenetic analysis of class I sequences
spanning from the 5⬘-flanking region to exon 2.
H. sapiensb HLA-E HLA-B HLA-F HLA-A
HLA-C HLA-G HLA-H Phylogenetic analysis of NWM class I genes: In the
HLA-L HLA-J phylogenetic analysis of the 5⬘-flanking region, we ex-HLA-K cluded all group E sequences because no reliable align-M. mulattac Mamu-E Mamu-L Mamu-F Mamu-A
ment of these is feasible with other group sequences.
(Mamu) Mamu-K
The analysis of the remaining sequences shows that
A. trivirgatusc Aotr-E Aotr-B1 Aotr-F
group B and G sequences form two distinct clades with (Aotr) Aotr-B2 Aotr-G1
100% BVs. The common node of group B separates the
Aotr-B3 Aotr-G2
C. apellac Ceap-B3 Ceap-G1d L sequences from all other B-like sequences and that
(Ceap) Ceap-B4 Ceap-G1*d
of group G separates the F sequences from the rest
Ceap-G2 within this group (data not shown, but see Figure 5). A. belzebuthc Atbe-E Atbe-B1 Atbe-F
However, the evolutionary relationships among NWM (Atbe) Atbe-B2 Atbe-G2
sequences within group B or G are not completely
un-Atbe-B3
ambiguous. Therefore, we further analyzed group B and aClassification based onAlus andLINEs in the 5⬘-flanking
G separately to increase the number of nucleotide sites
region. that can be aligned. Figure 5A shows the NJ tree of 12
bTheHLAsequences were retrieved from GenBank.
sequences in group B and Figure 5B shows the NJ tree cTheMHCsymbol is followed by a four-letter abbreviation
of 14 group G and 3 group A sequences (HLA-A, -H, system (Kleinet al. 1990;BontropandKlein1997).
dCeap-G1andCeap-G1* are presumably allelic. andMamu-A).
In group B (Figure 5A), the eight NWM sequences form a single cluster together withHLA-Band-C(100% 1.5-kb region upstream from exon 2. Most of these con- BV) to the exclusion ofHLA-LandMamu-L. Apart from sistently support the clustering betweenAotr-B1andAtbe- theHLA-B/Clineage, there are four distinct clades of B1, betweenAotr-B2 and Atbe-B2, and between Aotr-B3 the NWM sequences: B1 is [Aotr-B1,Atbe-B1], B2 is [ Aotr-and Ceap-B3(Figure 4). Since these three pairs occur B2,Atbe-B2], B3 is [Aotr-B3,Ceap-B3, Atbe-B3], and B4 is between different subfamilies or families, it is likely that [Ceap-B4]. Both B1 and B2 clades are supported by they are orthologous. However, some exceptional indels 100% BVs, while the B3 clade is supported by 83% BV. do not support the sister relationships of Atbe-B3 and These results are consistent with our previous assump-Aotr-B3/Ceap-B3. Ceap-B4 is also exceptional in that it tions thatAtbe-B3is orthologous toAotr-B3andCeap-B3
has a number of unique indels. and thatCeap-B4is an independent locus. Importantly,
Among eight group G sequences (Table 2), Atbe-F Figure 5A suggests that the sequences in the NWM B2 andAotr-Fdiffer from the rest of group G sequences in clade (hereafter designated asNWM-B2for the locus) that a diagnostic AluJo splits into two parts by a 1-kb are more closely related toHLA-B/Cthan those in the insertion sequence of unknown origin (Figure 4). These NWM B1, B3, and B4 clades (NWM-B1, -B3, and -B4, two genes are orthologous to HLA-F and Mamu-F respectively). It is possible that some NWM loci in group (OttingandBontrop1993), as supported also by the B are orthologous toHLA-B(Cadavidet al.1997), since presence of eight indels (one 2-bp insertion and two HLA-Band-Cduplicated in the stem lineage leading to 1-bp, two 3-bp, two 4-bp, and one 6-bp deletions), which OWMs and HLA-B-related genes are more prevalent are uniquely shared among these four sequences (Fig- thanHLA-Cin primates. Although the clade of NWM-ure 4). Among the remaining six sequences in group B2andHLA-B/Cis only weakly supported by BVs, it does G, there are four informative indels, all of which are suggest that there is an ortholog of HLA-Bin NWMs. shared only amongCeap-G1,Ceap-G1*, andAotr-G1.Ceap- Furthermore, the NWM-B1, -B2, and -B3 genes are G1* is unique in possessingAluSc. However, sinceCeap- shared between different families of Cebidae and Atel-G1andCeap-G1* are sampled from a single individual idae.
Figure4.—Nine different con-figurations ofAlu/LINEs inserted in the 5⬘-flanking region of NWM class I genes. A dagger above an indel symbol shows that the indel is not shared among the se-quences at a given locus. The dashed line in group E indicates that sequencing was not done. Symbols are the same as those in Figure 2.
supported by deletions and are regarded as represent- the 5⬘-flanking region, we sequenced the entire coding regions of four genes ofA. trivirgatus (Aotr-G2, Aotr-F, ing different loci (NWM-G1and-G2). LikeNWM-B1,
-B2, and -B3, NWM-G1and -G2 are shared among two Aotr-B1, and Aotr-B2). These genes individually repre-sent members of major clades in Figure 5. The phyloge-or three different subfamilies. It should be noted that
the total clade of G1 and G2 is supported by 99% BV netic analysis of these coding sequences with nineHLA (excludingHLA-E) loci reveals that the tree topology is to the exclusion ofHLA-A,-G,-J,-K, and-F, suggesting
that all of these loci are paralogous to each. Ceap-G1 almost the same as that for the 5⬘-flanking region. The topology based on the 5⬘-flanking region is also in good and Ceap-G1* are more closely related to each other
than to any other sequences and are clustered in a single agreement with that based on introns (data not shown), suggesting that phylogenetic signals in the 5⬘-flanking clade (Figure 5B). Thep-distance (3.2%) between
Ceap-G1 and Ceap-G1* is relatively large, but it is within a region and introns are not shuffled by recombination. However, it is substantially different from the topology range of the observed allelic diversity at MHC loci
(Satta1993). based on exons 2–3.
It is generally accepted that exon 4–8 sequences rep-To examine to what extent the phylogenetic tree
Figure5.—NJ trees of primate group B and G sequences based on thep-distances in the 5⬘ -flank-ing region. Group E sequences are excluded since the number of nu-cleotide sites that can be com-pared becomes substantially small. The phylogenetic analysis is car-ried out separately for groups B and G. The number of nucleotide sites compared is (A) 1021 bp for group B sequences and (B) 1743 bp for group G sequences. In total, 12 sequences are in group B and 17 sequences are in group G. At a node, the bootstrap values⬎65% are given. An open diamond indi-cates gene duplication and the root of the group B or G tree is assumed to be the same as that placed when all sequences are used. As in text and Table 2, the MHCsymbol is followed by a four-letter abbreviation system (Klein et al. 1990; Bontrop and Klein 1997).
region (Parham et al. 1989, 1995). We compared the Class I loci in the simian primate ancestor:The pair-wisep-distances among four NWM paralogousB-related phylogenetic tree for exons 2–3 or exons 4–8 with that
for the 5⬘-flanking region. Exon 4–8 sequences make loci (B1–B4 in Figure 5A) range from 11.8 to 14.3%. Similarly, the distances betweenL(HLA-LandMamu-L) HLA-B/C,Aotr-B1, andAotr-B2cluster with 97% BV and
makeHLA-FandAotr-Fcluster with 86% BV, although and four NWMB-related loci range from 15.2 to 19.3%. Thesep-distances are still⬎10%, a value averaged over the evolutionary relationships within group B and G
sequences cannot be resolved (data not shown). On the 20 pairs of intron sequences between humans and NWMs (O’hUiginet al. 2002), suggesting that the diver-other hand, exon 2–3 sequences show more
compli-cated phylogenetic relationships than do exon 4–8 se- gence of these loci is prior to the divergence of these species. Thus, there were at least five groupBloci (four quences. In particular, group B and G sequences do
not form two distinct clades. These analyses have sug- NWMB-related andL loci) before the present simian primates began to differentiate.
gested that neither exons 2–3 nor exons 4–8 contain
likely, but uncertain. Subsequently, proto-Band -Ggenes expanded to generate five group B and five group G genes, respectively, and there were at least eight gene duplications during phase I. On the other hand, phase II corresponds to the time period from the divergence of humans and NWMs to the present. During this phase, there were only one duplication forHLA-B and -Cin the lineage leading to humans and one for G1 andG2in the stem lineage of NWMs. The number beside an arrow stands for the minimum number of gene duplications.
that these loci also had already differentiated in an early between humans and NWMs, or even the emergence of prosimians (Martin1993).
stage of primate evolution. This ancient origin ofHLA-A
is supported by the presence of relatively oldAluJo, but Recently, Flu¨ gge et al. (2002) and Go et al. (2003) extensively examined prosimian class I genes, and their it contradicts the absence ofHLA-A-related sequences
in NWMs and the recent work by Piontkivska and exon sequences have revealed the monophyletic rela-tionships with respect to simian class I genes. It would
Nei(2003) who estimated the emergence ofHLA-Aas
recently as 35 MYA. On the other hand, thep-distances be interesting to see if Alu and LINEmarkers can be consistent with this observation. We foundAluJ,FLAM, between NWM-G1 and -G2 genes range from 9.5 to
11.5%. It is not easy to judge whether they duplicated andFRAMin the 5⬘-flanking region of group B, G, and E sequences. Since these Alus were dispersed in the before or after the emergence of NWMs. To be
conserva-tive in the following discussion, however, we assume that primate genome ⵑ80 MYA (Batzer and Deininger
2002), it is likely that they were integrated into the NWM-G1and-G2loci duplicated shortly after the split
of humans and NWMs. Thus, in addition toF(Figure prosimian genome as well. Our molecular clock analysis has also consistently provided the ancient differentia-5B), there were at least six paralogous loci (fourHLA,
G1/G2, andFloci) in groups A and G in the stem line- tion of group B and G sequences. Therefore, identifica-tion of prosimian class I loci by this method is feasible age of simian primates.
and even legitimate to fully understand the evolution of primate class I loci.
DISCUSSION
Oldest class I locus in primates:In this study, we have shown that most of primate class I loci diverged from
Divergence times of class I loci:To date the sequence
each other before the split between humans and NWMs. divergences within group B or G, we constructed NJ
But which primate class I locus diverged first? Shiina
trees on the basis of thed-distances rather than on the
et al. (1999) proposed the hypothesis that theHLA re-p-distances and examined the molecular clock
hypothe-gion was shaped by successive duplications of a block sis by the two-cluster test (Takezakiet al. 1995). The test
encompassing a singleHLAlocus. They concluded that reveals that four group G sequences (HLA-F,Mamu-F,
a block containing HLA-F is the oldest, which is also Aotr-F, andAtbe-F) and three group B sequences (
Atbe-supported by the recent work byPiontkivskaandNei
B2,Aotr-B2, andCeap-B4) have evolved significantly (P⬍
(2003). However,Shiinaet al. (1999) excludedHLA-E 0.01) faster than other sequences within each group.
from their consideration, because the gene content of However, once we removed these sequences, no rate
the E block is totally different from that of other blocks heterogeneity remained. Thus, we compute the average
(Kulskiet al. 2000). On the other hand,Piontkivska
height (the averaged-distances from the tips to the latest
andNei(2003) used exon 2–4 sequences in their phylo-common node) of groups B and G as 9.1⫾0.5% and
genetic analysis. Although they carefully examined the 7.3 ⫾ 0.3%, respectively. If we assume that the silent
molecular clock hypothesis among variousMHCgenes, nucleotide substitution rate at primate MHC loci is
the phylogeny has low BVs at critical nodes and appears 10⫺9/site/year (Satta et al. 1993), the emergence of
to be influenced by homoplasy. groups B and G can be dated as ⵑ90 and 70 MYA,
reveals thatHLA-Ediverged first when dog and pig se- tion. Thus, the gene organization was basically re-molded by extensive duplication during the early stage quences are used as outgroups (data not shown). The
first divergence of HLA-E from otherHLA loci is sup- of primate evolution and the present repertoire has been shaped mainly by loss of loci. Such a tempo and ported by insertion elements in both 5⬘- and 3⬘-flanking
regions ofHLA-E. In the 5⬘-flanking region,HLA-Edoes mode of evolution in primate class I loci surely awaits not share any insertion elements with other HLA and further scrutiny to determine its biological significance. among them isFRAM, the oldestAlu. Similarly in the This work is supported in part by a grant (no. 12304046) from 3⬘-flanking region ofHLA-E, no insertion elements are the Japan Society for the Promotion of Science and in part by the Cooperation Research Program of Primate Research Institute, Kyoto
shared with otherHLA. It is thus concluded thatHLA- E
University.
diverged first and has long evolved in isolation from the others.
Gene duplication rate in primate class I loci:In Figure
6, we hypothesize contraction and expansion models of LITERATURE CITED
group B and G loci in humans and NWMs, respectively.
Adams, E. J., and P. Parham, 2001 Species-specific evolution of
In group B, there are oneLand at least fourB-related MHC class I genes in the higher primates. Immunol. Rev.183: loci in the ancestral species of humans and NWMs. As 41–64.
Adams, E. J., G. ThomsonandP. Parham, 1999 Evidence for an
discussed, the NWM-B2 locus is likely orthologous to
HLA-C-like locus in the orangutanPongo pygmaeus.
Immunogenet-HLA-Band this locus is retained in both humans and ics49:865–871.
NWMs. However, theNWM-B1,-B3, and-B4loci are lost Adams, E. J., S. L. Cooper and P. Parham, 2001 A novel, non-classicalMHCclass I molecule specific to the common
chimpan-in humans andHLA-Bduplicated to produceHLA-Cin
zee. J. Immunol.167:3858–3869.
the stem lineage of hominoids and OWMs (a diamond Adkins, R. M., E. L. Gelke, D. RoweandR. L. Honeycutt, 2001 in Figure 5A). By contrast, all fourBloci are retained Molecular phylogeny and divergence time estimates for major rodent groups: evidence from multiple genes. Mol. Biol. Evol.
butL may be lost in NWMs. In group G, there are at
18:777–791.
least five loci (HLA-G, -J, -K, -F, and NWM-G1/G2) in Batzer, M. A., andP. L. Deininger, 2002 Alurepeats and human addition to HLA-A in the stem lineage of simian pri- genomic diversity. Nat. Rev. Genet.3:370–379.
Bontrop, R. E., andJ. Klein, 1997 Nomenclature for the MHCs
mates (Figure 5B). While theFlocus is retained in both
and alleles of different nonhuman primate species, pp. 552–555
humans and NWMs, three orthologs ofHLA-G,-J, and inMolecular Biology and Evolution of Blood Group and MHC Antigens -Kmay become extinct in NWMs. Similarly, the ortholog in Primates, edited by A.Blancher, J.Kleinand W. W.Socha.
Springer-Verlag, Berlin.
ofNWM-G1/G2 is lost in humans. Subsequently, there
Boyson, J. E., S. N. McAdam, A. Gallimore, T. G. Golos, X. Liuet is only one duplication to generateNWM-G1 and -G2
al., 1995 TheMHCclassElocus in macaques is polymorphic
loci (a diamond in Figure 5B). If we assume 45–55 and is conserved between macaques and humans.
Immunogenet-ics41:59–68.
million years of the divergence between humans and
Boyson, J. E., C. Shufflebotham, L. F. Cadavid, J. A. Urvater, NWMs (Martin1993;KumarandHedges1998;
Taka-L. A. Knappet al., 1996 The MHC class I genes of the rhesus
hata 2001), two gene duplications generating HLA- monkey. Different evolutionary histories ofMHCclass I and II B/C and NWM-G1/G2 took place in the descendant genes in primates. J. Immunol.156:4656–4665.
Boyson, J. E., K. K. Iwanaga, T. G. GolosandD. I. Watkins, 1997
lineages. In addition, HLA-H duplicated from HLA-A
Identification of the rhesus monkeyHLA-Gortholog:Mamu-Gis
after the divergence of human lineage from OWMs. a pseudogene. J. Immunol.157:5428–5437.
Consequently, the duplication rate isⵑ3/(55 ⫻ 2) ⫽ Braud, V. M., D. S. J. Allan, C. A. O’Callaghan, K. So¨ derstro¨ m, A. D’Andrea et al., 1998 HLA-E binds to natural killer cell
0.027 to 3/(45 ⫻ 2) ⫽ 0.033/genome/million years
receptors CD49/NKG2A, B and C. Nature391:795–799.
during this later stage of primate evolution. Cadavid, L. F., and D. I. Watkins, 1997 MHC class I genes in For the duplication rate in the early stage of primate nonhuman primates, pp. 339–357 inMolecular Biology and Evolu-tion of Blood Group and MHC Antigens in Primates, edited by A.
evolution, we note that there is no orthologous
relation-Blancher, J.Kleinand W. W.Socha. Springer-Verlag, Berlin.
ship betweenH-2andHLAclass I genes and therefore Cadavid, L. F., C. Shufflebotham, F. J. Ruiz, M. Yeager, A. L. assume that there is only one common ancestral class Hugheset al., 1997 Evolutionary instability of the major histo-compatibility complex class I loci in New World primates. Proc.
I gene when primates diverged 80–100 MYA (Kumar
Natl. Acad. Sci. USA94:14536–14541.
and Hedges 1998; Adkins et al. 2001). The ancestral Cadavid, L. F., B. E. MejiaandD. I. Watkins, 1999 MHC class I gene might have diverged into an E-like locus and an genes in a New World primate, the cotton-top tamarin (Saguinus oedipus), have evolved by an active process of loci turnover.
Immu-ancestral gene of group A, B, and G loci, and this
ances-nogenetics49:196–205.
tor may have later expanded to generate one group A,
Castro, M. J., P. Morales, J. Martinez-Laso, L. Allende, R. Rojo
-five group B, and -five group G loci. Then, at least eight Amigoet al., 2000 Lack of MHC-G4 and soluble (G5, G6) iso-forms in the higher primates,Pongidae.Hum. Immunol.61:1164–
duplications must have taken place during the period of
1168.
25–55 million years between the divergence of primates
Celada, F., andP. E. Seiden, 1996 Affinity maturation and
hypermu-and rodents (80–100 MYA) hypermu-and the divergence of hu- tation in a simulation of the humoral immune response. Eur. J.
Immunol.26:1350–1358.
mans and NWMs (45–55 MYA). The gene duplication
Chen, Z. W., S. N. McAdam, A. L. Hughes, A. L. Dogon, N. L. Letvin rate then becomes 8/55 ⫽ 0.15 to 8/25 ⫽ 0.32 per
et al., 1992 Molecular cloning of orangutan and gibbon MHC
genome per million years. This rate is nearly 10 times class I cDNA. TheHLA-Aand -Bloci diverged over 30 million
years ago. J. Immunol.148:2547–2554.
evolu-HLA-A,B,Cpolymorphism. Immunol. Rev.143:141–180. tion at major histocompatibility complex class I loci reveal
over-Piontkivska, H., andM. Nei, 2003 Birth-and-death evolution in dominant selection. Nature335:167–170.
primateMHCclass I genes: divergence time estimates. Mol. Biol.
Jurka, J., andA. Milosavljevic, 1991 Reconstruction and analysis
Evol.20:601–609. of humanAlugenes. J. Mol. Evol.32:105–121.
Saitou, N., andM. Nei, 1987 The neighbor-joining method: a new
Kapitonov, V., andJ. Jurka, 1996 The age of Alu subfamilies. J.
method for reconstructing phylogenetic trees. Mol. Biol. Evol. Mol. Evol.42:59–65.
4:406–425.
Kimura, M., 1980 A simple method for estimating evolutionary rates Satta, Y., 1993 Balancing selection at HLA loci, pp. 129–149 in of base substitutions through comparative studies of nucleotide Mechanisms of Molecular Evolution, edited by N.Takahataand sequences. J. Mol. Evol.16:111–120. A. G.Clark. Sinauer Associates, Sunderland, MA.
Klein, J., 1987 The Natural History of the Major Histocompatibility. Black- Satta, Y., C.O’hUigin, N.Takahataand J.Klein, 1993 The synon-well, Oxford. ymous substitution rate at the primateMhcloci. Proc. Natl. Acad.
Klein, J., andV. Horˇejsˇı´, 1997 Immunology. Blackwell, Oxford. Sci. USA90:7480–7484.
Klein, J., R. E. Bontrop, R. L. Dawkins, H. A. Erlich, U. B. Gyllens- Shiina, T., G. Tamiya, A. Oka, N. Takishima, T. Yamagataet al., tenet al., 1990 Nomenclature for the major histocompatibility 1999 Molecular dynamics of MHC genesis unraveled by se-complexes of different species: a proposal. Immunogenetics31: quence analysis of the 1,796,938-bp HLAclass I region. Proc. 217–219. Natl. Acad. Sci. USA96:13282–13287.
Knapp, L. A., L. F. CadavidandD. I. Watkins, 1998 TheMHC-E Sonnhammer, E. L., andR. Durbin, 1995 A dot-matrix program locus is the most well conserved of all known primate class I with dynamic threshold control suited for genomic DNA and histocompatibility genes. J. Immunol.160:189–196. protein sequence analysis. Gene167:GC1–GC10.
Kulski, J. K., S. Gaudieri, M. Bellgard, L. Balmer, K. Gileset al., Takahata, N., 1995 MHC diversity and selection. Immunol. Rev.
143:225–247. 1997 The evolution ofMHCdiversity by segmental duplication
Takahata, N., 2001 Molecular phylogeny and demographic history and transposition of retroelements. J. Mol. Evol.45:599–609.
of humans, pp. 299–305 in Humanity From African Naissance to
Kulski, J. K., S. GaudieriandR. L. Dawkins, 2000 UsingAlu J
Coming Millenia, edited by P. V.Tobias, M. A.Taath, J.Moggi -elements as molecular clocks to trace the evolutionary
relation-Cecchiand G. A.Doyle. Firenze University Press, Johannesburg, ships between duplicatedHLAclass I genomic segments. J. Mol.
South Africa. Evol.50:510–519.
Takezaki, N., A. RzhetskyandM. Nei, 1995 Phylogenetic test of
Kulski, J. K., P. Martinez, N. Longman-Jacobsen, W. Wang, J.
Wil-the molecular clock and linearized tree. Mol. Biol. Evol. 12:
liamsonet al., 2001 The association betweenHLA-Aalleles and
823–833. anAludimorphism nearHLA-G.J. Mol. Evol.53:114–123.
Thompson, J. D., D. G. HigginsandT. J. Gibson, 1994 CLUSTAL
Kumar, S., andS. B. Hedges, 1998 A molecular timescale for
verte-W: improving the sensitivity of progressive multiple sequence brate evolution. Nature392:917–920.
alignment through sequence weighting, position-specific gap
Lee, N., M. Liano, M. Carretero, A. Ishitani, F. Navarroet al.,
penalties and weight matrix choice. Nucleic Acids Res.22:4673– 1998 HLA-E is a major ligand for the natural killer inhibitory
4680. receptor CD49/NKGA2. Proc. Natl. Acad. Sci. USA 95:5199–
Watkins, D. I., Z. W. Chen, A. L. Hughes, M. G. Evans, T. F. Tedder
5204.
et al., 1990 Evolution of the MHC class I genes of a New World
Leibson, P. J., 1998 Cytotoxic lymphocyte recognition of HLA-E: primate from ancestral homologues of human non-classical utilizing a nonclassical window to peer into classicalMHC.Immu- genes. Nature346:60–63.
nity9:289–294. Wegner, K. M., M. Kalbe, J. Kurtz, T. B. H. ReuschandM. Milinski,
Martin, R. D., 1993 Primate origins: plugging the gaps. Nature363: 2003 Parasite selection for immunogenetic optimality. Science
223–234. 301:1343.
Miller, M. D., H. Yamamoto, A. L. Hughes, D. I. Watkinsand Willard, C., H. T. NguyenandC. W. Schmid, 1987 Existence of
N. L. Letvin, 1991 Definition of an epitope and MHC class I at least three distinctAlusubfamilies. J. Mol. Evol.26:180–186. molecule recognized by gag-specific cytotoxic T lymphocytes in