Duplication-Dependent CG Suppression of the Seed Storage
Protein Genes of Maize
Gertrud Lund,*,1 Massimiliano Lauria,* Per Guldberg† and Silvio Zaina‡
*Plant Biochemistry Laboratory, Department of Plant Biology, The Royal Veterinary and Agricultural University, DK-1871 Frederiksberg C, Denmark, †Institute of Cancer Biology, Danish Cancer Society, DK-2100 Copenhagen, Denmark and ‡Experimental Cardiovascular Research, Wallenberg Laboratory, Department of Medicine, University of Lund, 205 02 Malmø, Sweden
Manuscript received February 28, 2003
Accepted for publication June 13, 2003
ABSTRACT
This study investigates the prevalence of CG and CNG suppression in single- vs. multicopy DNA regions of the maize genome. The analysis includes the single- and multicopy seed storage proteins (zeins), the miniature inverted-repeat transposable elements (MITEs), and long terminal repeat (LTR) retrotransposons. Zein genes are clustered on specific chromosomal regions, whereas MITEs and LTRs are dispersed in the genome. The multicopy zein genes are CG suppressed and exhibit large variations in CG suppression. The variation observed correlates with the extent of duplication each zein gene has undergone, indicating that gene duplication results in an increased turnover of cytosine residues. Alignment of individual zein genes confirms this observation and demonstrates that CG depletion results primarily from polarized C:T and G:A transition mutations from a less to a more extensively duplicated gene. In addition, transition mutations occur primarily in a CG or CNG context, suggesting that CG suppression may result from deamination of methylated cytosine residues. Duplication-dependent CG depletion is likely to occur at other loci, as duplicated MITEs and LTR elements, or elements inserted into duplicated gene regions, also exhibit CG depletion.
1Corresponding author: Plant Biochemistry Laboratory, Department of Plant Biology, Thorvaldsensvej 40, DK-1871 Frederiksberg C, Denmark. E-mail: [email protected]
In many organisms, nuclear DNA is methylated at cytosine residues, resulting in 5-methylcytosine (5mC). In plants, symmetrical 5′-CpG-3′ (CG) and 5′-CpNpG-3′ (CNG) are the most frequent targets of cytosine methylation, whereas in mammals 90% of methylation is restricted to the CG dinucleotide (Gruenbaum et al. 1981, 1982). However, the degree and ratio of CG and CNG methylation can vary considerably between plant species (Jeddeloh and Richards 1996; Kovarik et al. 1997). For example, in maize, ~28% of cytosine residues are methylated compared to only 6% in Arabidopsis (Leutwiler et al. 1984; Matassi et al. 1992; Montero et al. 1992). Furthermore, analysis of the rRNA genes from maize has shown that the external cytosine is twofold less methylated compared to the internal cytosine residue of the 5′-CCG-3′ sequence (Kovarik et al. 1997), indicating that CG methylation occurs more frequently than CNG methylation.
CG suppression, or depletion, refers to the underrepresentation of the CG dinucleotide compared to an estimated value based on the G + C content of the sequence investigated. CG suppression is especially evident in the mammalian genome, where the frequency of the CG dinucleotide can be up to fivefold lower than the expected value (Bird 1980). In contrast, both monocot and dicot plant genes are only slightly CG depleted (an average of 75–80% of expected values), and CNG suppression is lacking or less severe than CG depletion (McClelland 1983; Gardiner-Garden et al. 1992; Ashikawa 2001).
The most commonly explained mechanism of CG depletion relates to the tendency of 5mC to undergo spontaneous deamination to thymidine, resulting in C:T or G:A transition mutations (Couloundre et al. 1978). Interestingly, the mutability of CG has been shown to be one of the most important causes of germline point mutations in human genetic diseases and is a frequent occurrence in somatic mutations leading to cancer (Cooper and Krawczak 1989; Jones et al. 1992; Holstein et al. 1994). In addition to the mutability of 5mC, recent evidence has shown that cytosine deamination also contributes to CG suppression (Fryxell and Zuckerkandl 2000).
Although the majority of methylation in plants is associated with repetitive DNA sequences such as transposons, duplicated gene regions can also be methylated (Bianchi and Viotti 1988; Flavell et al. 1988; Bennetzen et al. 1994; Flavell 1994; Bender and Fink 1995; Ronchi et al. 1995; Rabinowicz et al. 1999). In Neurospora crassa, duplicated sequences are efficiently targeted by methylation, and a large number of C:T transition mutations are introduced following duplication [hence the name repeat-induced point mutation (RIP;
Cambereri et al. 1989; Selker 1990)]. Similarly, in Ascobolus immersus a process referred to as methylation induced premeotically (MIP) results in de novo methylation of a DNA sequence upon duplication (Goyon and Faugeron 1989). The observed consequences of de novo methylation in RIP and MIP include gene inactivation and a reduction in the frequency of recombination (Barry et al. 1993; Rountree and Selker 1997; Maloisel and Rossignol 1998). Similar roles of duplication-induced DNA methylation have been proposed to occur in plants (Flavell 1994; Bender 1998).
In mammals, duplicated genes are more CG suppressed compared to single-copy genes. This observation has led to the suggestion that duplicated genes have a history of methylation and subsequent mutation of methylated residues (Kricker et al. 1992). Likewise, in plants, the multigene families of 5S rRNA genes from Arabidopsis and rRNA genes from maize show elevated levels of transition mutations that are consistent with deamination of 5mC, in particular the nonfunctional members of these gene families (Edward et al. 1996; Matieu et al. 2002). However, neither study can confirm if CG loss results from spontaneous deamination of methylated residues over time or whether depletion is the consequence of an active mechanism linked to duplication. We have analyzed the CG dinucleotide and CNG trinucleotide content of the large zein gene family, which encodes the seed storage proteins of maize. Due to differences in gene copy number of each subfamily, the zein genes provide an ideal model system to analyze the effect of gene duplication on CG suppression. In addition, the highly abundant LTR-retrotransposons and MITEs have also been analyzed for evidence of CG suppression.
The zeins constitute 50–60% of total endosperm protein and can be divided into two major fractions, zein-1 and zein-2, on the basis of solubility characteristics (Esen 1986). The zein-1 fraction consists of the 19- and 22-kD polypeptides that are encoded by a large gene family, the α-zeins. On the basis of DNA sequence identity and hybridization characteristics, this family can be further divided into four subfamilies, z1A, z1B, z1C, and z1D (Heindecker and Messing 1986; Rubenstein and Geraghty 1986). Genes belonging to the z1A, z1B, and z1D subfamilies encode the 19-kD zeins, whereas the 22-kD zeins are encoded mainly by the z1C subfamily. The 19-kD genes have an estimated copy number of 56 per haploid genome, whereas the 22-kD zeins are presumed to be present in 15 copies per haploid genome (Hagen and Rubenstein 1981; Wilson and Larkins 1984). However, the exact copy number of the α-zein genes can show considerable variation among different inbred lines (Llaca and Messing 1998; Song and Messing 2002). In addition, within the 19-kD zein gene family, the z1A subfamily has the highest copy number, followed by z1B and z1C (Wilson and [...] and Messing 2002). In contrast, the 10-, 15-, 16-, and 27-kD proteins that represent the zein-2 fraction are encoded by one or two genes that show limited sequence similarity to the α-zeins (Prat et al. 1985; Kirihara et al. 1988; Swarup et al. 1995).
The majority of 22-kD zein genes form a dense gene cluster, ~168 kb in size, on chromosome 4 of maize (Llaca and Messing 1998), whereas the 19-kD zein genes are distributed on five unlinked genomic locations on maize chromosomes 1, 4, and 7 (Soave et al. 1981, 1982; Wilson et al. 1989; Woo et al. 2001; Song and Messing 2002). Phylogenetic analysis of the α-zein genes has revealed that the 19- and 22-kD zein genes share a common ancestor (Song and Messing 2002). Given that only the 22-kD zein genes have been identified in Coix lacryma-jobi, an ancestor of maize (Leite et al. 1990), it is probable that in maize the 19-kD zein genes have derived from the 22-kD zein genes. Interestingly, it has been estimated that the amplification of the α-zein gene family in maize occurred within the last 3–4 million years (Song et al. 2001; Song and Messing 2002).
In contrast to the clustered zein genes, MITEs and LTR elements are dispersed in the genome. MITEs are frequently associated with promoter and 3′ regulatory regions of genes, whereas the larger LTR-transposons are typically found in intergenic regions (Kumar and Bennetzen 1999). The copy numbers of MITEs and LTR elements range from 3000 to 10,000 copies and a few to 50,000 copies, respectively (Bureau and Wessler 1992, 1994; SanMiguel et al. 1996; Zhang et al. 2000). Similar to the amplification of the zein gene family, a majority of LTR-retrotransposons have colonized the maize genome during the last 5 million years (SanMiguel et al. 1998).
Our analysis shows that duplicated zein genes are CG suppressed and that the degree of suppression correlates with the copy number of each subfamily. Likewise, within the 19- and 22-kD zein gene families the extent of CG depletion correlates with the number of duplications each individual gene has undergone. Despite their high copy number, most MITEs and LTR elements are not CG suppressed, except when duplicated or located in a duplicated DNA sequence. This suggests that the process leading to CG depletion is activated upon duplication or is a consequence of the duplication process itself. We discuss the possible role of duplication-dependent CG depletion in the evolution of the GC-poor isochores in which the zein genes are located.
MATERIALS AND METHODS
Sequences: The di- and trinucleotide composition of the coding region of 32 zein genes belonging to the zein-1 fraction and 6 genes belonging to the zein-2 fraction was analyzed. All 22-kD zein genes were derived from the inbred line BSSS53
[...] from the B73 inbred line (af546187, af546188; af546189, af546190; Song et al. 2001; Song and Messing 2002). Only full-length genomic and cDNA clones, including genes with in-frame stop codons, were analyzed. Clones in which the open reading frame was disrupted by insertions or deletions were omitted from the analysis. One exception was Z492M16-5, which was included as it is expressed despite a deletion in the open reading frame. The correct open reading frame was determined by the use of GenBank's annotations, and nucleotide compositions were generated using the Genetics Computer Group (GCG) analysis software package. Sequences of MITEs and LTR-retrotransposons were extracted from gene sequences according to the annotations of the authors (Bureau and Wessler 1992, 1994; SanMiguel et al. 1996; Tikhonov et al. 1999; Zhang et al. 2000). Only the LTR region and primer-binding site of the LTR-retrotransposon were analyzed.
CG analysis: To measure the extent of repression of a given di- or trinucleotide, a score ρ was calculated by the formula ρ = O/E, where O and E denote the observed and expected counts, respectively. Overall expected counts for di- and trinucleotides were calculated by multiplying the observed counts of each nucleotide and dividing the product by the total number of nucleotides found in the sequence. Position-dependent expected counts were calculated assuming the absence of any codon bias. The positions of the 5′ and 3′ di- or trinucleotides relative to codon triplets are indicated by roman numerals; e.g., I-II denotes a dinucleotide including the first two nucleotides of a codon triplet. For the CG dinucleotide, position-dependent expected counts were calculated as follows: I-II = 2/3 of arginine-specifying codons; II-III = the sum of 1/6 of serine-, 1/4 of proline-, threonine-, and alanine-specifying codons; III-I = NNC × GNN/T; N represents any nucleotide and T represents the total number of triplets in the sequence. In the case of the CNG trinucleotide, position-dependent expected counts were calculated as follows: I-III = the sum of 1/6 of arginine- and leucine-, 1/4 of proline-, and 1/2 of glutamine-specifying codons; II-I = NCN × GNN/T; and III-II = NNC × NGN/T.
Alignment between members of the α-zein gene family: Pairwise alignments were conducted of individual expressed members of the 19- and 22-kD zein genes by GAP analysis (GCG analysis software package). For each alignment, the percentage of C:T and G:A transition mutations was compared to the total number of single-base-pair point mutations. To establish if transition mutations occur in a polarized fashion, i.e., from a younger to an older duplication, or from a gene that has undergone fewer to one that has undergone more duplications, transition mutations of each gene were counted in individual alignments. Importantly, only transition mutations occurring in a CG or CNG context were considered.
Statistical analysis: Differences between O and E values were tested using the chi-square analysis. To test for differences in ρ-values between groups, the Mann-Whitney U-test was employed. All statistical tests were performed using the STATISTICA software package for Macintosh (StatSoft, Tulsa, OK).
Bisulfite analysis: Genomic DNA was extracted from young leaf tissue of the inbred lines W64A and W22 using the DNeasy kit (QIAGEN, Valencia, CA). The W64A and W22 inbred lines contain the Tourist element located in the 5′ flanking region of the single-copy or duplicated 27-kD zein gene, respectively (Das and Messing 1987). Between 1 and 2 µg of DNA were treated with bisulfite as described by Zeschnigk et al. (1997). For PCR analysis 1/10 vol of bisulfite-treated DNA was employed in a standard PCR reaction. The primer pairs employed for amplification of the Tourist element from bisulfite-treated DNA were as follows: W64A, 5′-TAGGTATAT[...]TTTACATACCAATACATAA-3′; W22, 5′-GGGTATATAATTAGTGTAATTTAATATATG-3′ and 5′-ATTCTTAAAACTTTACATACCAATACATAA-3′. The resulting PCR products were cloned by TOPO cloning and 16 individual clones were sequenced at MWG Biotech (Ebersberg, Germany). To confirm the previously published MITE sequences, the Tourist element was also amplified from genomic DNA employing the following primer pairs: W64A, 5′-CCTTGGTTGTTGGCTCATAAT-3′ and 5′-CAGATGAGTATGATCTCGGCA-3′; W22, 5′-ATAAGTGTTCTGGATATTGGTTGTT-3′ and 5′-TCAGATGAGTATGATCTCGCA-3′. These primers were also tested on bisulfite-treated DNA and failed to give a product of the expected size. To ensure that the observed patterns of methylation did not result from incomplete strand separation during the bisulfite reaction, the Tourist element from the W64A inbred was cloned, bisulfite treated, and amplified with the bisulfite primers. This element contains only one cytosine that can be methylated by the Escherichia coli dcm methylase (i.e., the internal cytosine residue of the CCWGG sequence). As expected, sequence analysis of 10 independent clones showed that this cytosine remained unmodified. All the remaining cytosine residues of the Tourist element had undergone modification to thymidine.
RESULTS
CG and CNG analysis of the zein genes: Table 1 shows ρ-values (observed/expected) of CG dinucleotides and CNG trinucleotides of genes belonging to the zein-1 and zein-2 fractions. All 19-kD zein genes analyzed were isolated from the B73 inbred line, whereas the 22-kD zein genes were derived from the BSSS53 inbred line (Song et al. 2001; Song and Messing 2002). A ρ-average was calculated of both zein fractions, of the 19- and 22-kD zein genes, and of each of the three 19-kD zein subfamilies, z1A, z1B, and z1D. The zein-1 fraction, representing the multicopy α-zeins, has a CG ρ-average of 0.40 (P < 0.001), whereas the ρ-average of single-copy zein genes of the zein-2 fraction is 0.75 (P < 0.001). Indeed, the zein-1 fraction is more suppressed than the zein-2 fraction (P < 0.001). Furthermore, the average GC content of the zein-1 fraction is 48% compared to 66% of the zein-2 fraction (results not shown), indicating that CG suppression is accompanied by a decrease in G + C (GC) content. In contrast, none of the zein fractions are suppressed at the CNG trinucleotide.
It can also be observed that the degree of suppression varies between subfamilies of the zein-1 fraction. The more abundant 19-kD zein genes are more CG suppressed than the less abundant 22-kD zein genes (0.34 and 0.49, respectively; P < 0.001) and, likewise, within the 19-kD gene family the degree of suppression is associated with the copy number of each subfamily. The z1A subfamily, which has the highest copy number, is more suppressed compared to the less abundant z1B and z1C subfamilies (P < 0.009 and P < 0.027, respectively). We also analyzed 18 19-kD genes from different genetic backgrounds and found no differences in the average CG and CNG scores (results not shown).
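The overall observed/expected (O/E) score described under "CG analysis" in MATERIALS AND METHODS can be illustrated with a short Python sketch. This is not the GCG-based procedure used in the study; the function names are hypothetical. It assumes that the expected count for CG is (number of C × number of G) / sequence length, and, as a further assumption of this sketch, that the same expectation applies to CNG because the middle nucleotide is unconstrained.

```python
from collections import Counter


def observed_cg(seq):
    """Count CG dinucleotides (overlapping positions are all inspected)."""
    return sum(1 for i in range(len(seq) - 1)
               if seq[i] == "C" and seq[i + 1] == "G")


def observed_cng(seq):
    """Count CNG trinucleotides, where N is any nucleotide."""
    return sum(1 for i in range(len(seq) - 2)
               if seq[i] == "C" and seq[i + 2] == "G")


def expected_c_g(seq):
    """Expected count from base composition: nC * nG / total length.

    Assumption of this sketch: the same value is used for CG and CNG.
    """
    counts = Counter(seq)
    return counts["C"] * counts["G"] / len(seq)


def oe_score(observed, expected):
    """Observed/expected ratio; None when the expectation is zero."""
    if expected == 0:
        return None
    return observed / expected


if __name__ == "__main__":
    toy = "CGCGTTAACCGG"  # toy sequence, not a zein gene
    e = expected_c_g(toy)
    print(oe_score(observed_cg(toy), e), oe_score(observed_cng(toy), e))
```

A score below 1 indicates that the motif is rarer than base composition alone would predict, which is the sense in which a sequence is called CG suppressed.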
[...] differs considerably, which could potentially influence overall CG and CNG scores. For example, the α-zeins are particularly rich in glutamine, leucine, proline, and alanine, whereas the single-copy γ-zeins have a high content of methionine. To address this problem, CG and CNG frequencies were analyzed in a position-dependent context (Table 2). Position-dependent frequencies correct for differences in amino acid content but not for amino acid codon bias. Essentially, position-dependent frequencies of the zein genes largely reflect overall CG and CNG frequencies. The multicopy α-zeins are suppressed at positions II-III and III-I (Table 2; ρ = 0.41 and 0.55, respectively; P < 0.001); in contrast, the single-copy zein genes are not CG suppressed at any position. This implies that the low overall CG score observed for the 10-kD zein gene, m23537 (Table 1), is related to the amino acid content of this gene. Within the zein-1 fraction, the 19-kD zein genes are CG suppressed at positions II-III and III-I (P < 0.001). The z1A subfamily is more suppressed than the z1B subfamily at position III-I, whereas the opposite is true of position II-III (P < 0.002 and P < 0.025, respectively). The 22-kD zein genes are also CG suppressed at position II-III (ρ = 0.52; P < 0.001), but to a lesser degree than the 19-kD zein genes (ρ = 0.33; P < 0.001). The CNG trinucleotide is suppressed only at position I-III of the zein-1 fraction (ρ = 0.60; P < 0.009; results not shown) and, again, the 19-kD zein genes are more suppressed compared to the 22-kD zein genes at this position (P < 0.001).
A recent analysis of the B73 and BSSS53 inbred lines has shown that only a relatively small number of α-zein genes are expressed (Song et al. 2001; Woo et al. 2001). Most of the nonexpressed genes contain in-frame stop codons or insertion/deletions (Spena et al. 1983; Liu and Rubenstein 1992; Llaca and Messing 1998; Song et al. 2001). Analysis of the average overall CG content of seven expressed vs. seven nonexpressed 22-kD genes (marked with a superscript b in Table 1) showed no significant differences (0.51 vs. 0.47, respectively; P = [...] highly expressed genes are more CG suppressed compared to low expressing genes.
To understand if CG suppression is a general effect of a specific chromosomal region, the CG content of intergenic regions of the 22-kD zein gene cluster located on chromosome 4S was analyzed. The lengths of the 22-kD zein intergenic regions varied from 2517 to 14,438 bp. The CG ρ-average of the intergenic regions was 0.68, which is significantly higher than the ρ-average of 0.50 of the 22-kD genes (P < 0.001).
CG analysis of MITEs and LTR-retrotransposons: The observed copy number variation in CG suppression prompted us to investigate whether the high-copy-number LTR-retrotransposons and MITEs exhibited similar behaviors. Three MITE families (Tourist, Stowaway, and Heartbreaker) and three groups of LTR-retrotransposons (Ty1-copia, Ty3-gypsy, and an unclassified group), differing in element copy number between and within each group, were analyzed for evidence of CG suppression. The ρ-value and G + C content were calculated for MITEs and LTR elements in different sequence contexts (Tables 3 and 4, respectively). Despite the high copy number of MITEs and LTR elements, no association was found between the degree of suppression and element copy number. Most transposons were not, or were only slightly, CG suppressed. For example, the ρ-average of the Tourist and Heartbreaker families was 0.66 and 0.60 (P < 0.041 and P < 0.018), respectively, whereas no suppression was observed of the Stowaway family or LTR elements. However, large differences in CG suppression were observed of both MITEs and LTR elements (ρ ranging from 0.16 to 1.46 and 0.26 to 1.12, respectively). We found that the copy number of the insertion site could largely explain the variation in ρ; i.e., elements inserted into multicopy gene regions were more CG suppressed than elements inserted into single-copy genes. This is nicely illustrated by the ρ-values of Stowaway found 3′ of the single-copy 10- and 27-kD zein genes and the multicopy 22-kD zein genes (0.66, 0.96, and 0.25, respectively; Table 3B; Bureau and Wessler 1994). Two ρ-values of Tourist and Stowaway elements
TABLE 3
CG scores of MITE families
Acc. no.    Gene            Location   Copy no.   Bp    CG O  CG E    CG ρ   %G+C
                            of IS      of IS
A. Tourist (10,000 copies)
x17556^a    Adh-1Cm         3′         1          126   10    6.86    1.46   47
x04049^b    Adh-1S          3′         1          137   1     6.26    0.16   43
x07940^a    Bz-McC          3′         1          136   1     2.24    0.45   26
S48688^a    Wx-B2           Exon       1          128   2     7.27    0.28   48
x53514^a    27-kD zein      5′         1          132   7     5.49    1.27   41
x56118^b    27-kD zein      5′         2          131   5     5.72    0.87   42
j05212      Oleosin, KD18   3′         3–4        142   1     4.75    0.21   37
x15406      Pseudo-Gpa1     5′         ~10        130   0     5.17    NC     40
x15407      Pseudo-Gpa2     5′         ~10        137   3     5.12    0.59   39
Tourist average:                                                      0.66*  40
B. Stowaway (copy number not reported)
z11879^a    P-gene          Intron     1          80    4     3.20    1.25   40
m23537^a    10-kD zein      3′         1          153   2     3.07    0.66   28
x53514^a    27-kD zein      3′         1          154   2     2.08    0.96   23
x56118^b    27-kD zein      3′         2          163   0     1.72    NC     21
x73152^a    Gpc4            Intron     4          157   4     3.92    1.02   32
x61085^b    22-kD zein      3′         ~15        156   1     3.98    0.25   32
Stowaway average:                                                     0.83 NS 29
C. Heartbreaker (3,000–4,000 copies)
af203730    NA              NA         1          314   9     14.54   0.62   44
af203733    NA              NA         1–2        314   9     14.01   0.64   43
af203731    NA              NA         2          314   9     14.36   0.63   44
af203729    NA              NA         2–4        313   10    15.11   0.66   45
af203732    NA              NA         >10        314   6     13.75   0.44   43
Heartbreaker average:                                                 0.60*  44
IS, insertion site; NA, not available; Adh-1S, alcohol dehydrogenase 1S allele; Adh-1Cm, alcohol dehydrogenase 1Cm allele; Bz-McC, UDPglucose flavonoid glucosyl transferase; Wx, ADP glucose glucosyl-transferase; Gpa1 and Gpa2, pseudogene of glyceraldehyde-3-phosphate dehydrogenase; P, anthocyanin gene; Gpc4, glyceraldehyde-3-phosphatase; other symbols are as in Table 1.
^a Insertion into single-copy sequence.
^b Insertion in clustered multicopy gene region or duplicated element.
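The family averages in Table 3 can be recomputed from the printed values with a few lines of Python. This is a worked check rather than the authors' procedure; the data structure and function names are hypothetical, and it assumes that "NC" entries are left out of the family averages, which is the reading that reproduces the printed 0.66, 0.83, and 0.60.

```python
from statistics import mean

# (accession, observed CG, expected CG, printed O/E score); None stands for "NC".
TABLE_3 = {
    "Tourist": [
        ("x17556", 10, 6.86, 1.46), ("x04049", 1, 6.26, 0.16),
        ("x07940", 1, 2.24, 0.45), ("S48688", 2, 7.27, 0.28),
        ("x53514", 7, 5.49, 1.27), ("x56118", 5, 5.72, 0.87),
        ("j05212", 1, 4.75, 0.21), ("x15406", 0, 5.17, None),
        ("x15407", 3, 5.12, 0.59),
    ],
    "Stowaway": [
        ("z11879", 4, 3.20, 1.25), ("m23537", 2, 3.07, 0.66),
        ("x53514", 2, 2.08, 0.96), ("x56118", 0, 1.72, None),
        ("x73152", 4, 3.92, 1.02), ("x61085", 1, 3.98, 0.25),
    ],
    "Heartbreaker": [
        ("af203730", 9, 14.54, 0.62), ("af203733", 9, 14.01, 0.64),
        ("af203731", 9, 14.36, 0.63), ("af203729", 10, 15.11, 0.66),
        ("af203732", 6, 13.75, 0.44),
    ],
}


def family_average(rows):
    """Mean of the printed scores, skipping NC entries (assumption)."""
    return mean(score for *_, score in rows if score is not None)


def largest_rounding_gap(rows):
    """Largest difference between O/E recomputed from the row and the printed score."""
    return max(abs(o / e - score) for _, o, e, score in rows if score is not None)


if __name__ == "__main__":
    for family, rows in TABLE_3.items():
        print(family, round(family_average(rows), 2), largest_rounding_gap(rows))
```

Every printed score agrees with O/E recomputed from the same row to within 0.01, which suggests the columns were read correctly from the flattened table.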
[...] whereas the tandemly duplicated element is severely CG suppressed (ρ = 0.16). Again, CG suppression of the duplicated element is accompanied by a 4% decrease in G + C content compared to the single-copy insertion.
LTR regions also exhibit severe CG suppression upon duplication or if located in a duplicated DNA region (Table 4). For example, CG suppression is observed of x58700, a Hopscotch-like transposon inserted in the promoter region of a multicopy 19-kD zein gene (White et al. 1994), and of u68406, a tandem duplication of an element, Kake-1 (SanMiguel et al. 1996).
An average ρ-value was calculated of elements inserted into single- vs. multicopy gene regions or tandemly duplicated elements of selected data points (Tables 3 and 4; labeled with superscript a and b, respectively). The criterion for selection of data points was knowledge of copy number of both element and insertion sequence. In addition, only clustered multicopy genes were [...] family, was analyzed as a single-copy insertion (Russell and Sachs 1991), and some elements inserted in multicopy sequences were not included as it is unknown whether these genes are clustered or dispersed (e.g., Gpa pseudogenes and oleosins). CG suppression was not observed of LTR elements and MITEs located in single-copy genic regions (ρ = 0.94), whereas tandemly duplicated elements or gene sequences were suppressed (ρ = 0.33; P < 0.020). Furthermore, the ρ-average of MITEs and LTR elements inserted in single-copy gene regions was higher than the average value of tandemly duplicated elements and elements inserted into duplicated DNA regions (P < 0.004). Simply analyzing ρ-values of MITEs and LTR elements located in single- vs. multicopy gene regions produced results identical to the selected data set, the latter being less significant (P < 0.009).
CG depletion as a function of time or gene duplication:
TABLE 4
CG values of LTR-retrotransposons
Acc. no.     Element      Copy no.   Gene         Location of IS   IS copy no.   Bp     CG O  CG E     CG ρ   %GC
Ty1-copia group
u12626^a     Hopscotch    2–6        wx-K         Exon 12          1             231    11    11.22    0.98   44
x58700^b     Hopscotch    2–6        19-kD zein   5′               ~56           147    2     4.97     0.40   43
af082134^a   Stonor       30–40      wx-Stonor    Intron5/exon6    1             560    37    38.02    0.97   52
u68401       Fourf        ~100       334B7.4      Exon 1           NA            1162   64    63.94    1.00   47
u68410       Victim       ~100                    Intergenic       NA            100    1     3.84     0.26   40
u68408       Opie-2       >30,000    334B7.4      3′               NA            1271   69    80.22    0.86   50
af090447     Prem-2       NA                      Intergenic       NA            1424   82    104.09   0.79   54
u68405       Ji-3         50,000                  Intergenic       NA            1176   58    72.62    0.80   50
Average:                                                                                               0.76 NS 48
Ty3-gypsy group
af015269^a   Magellan     4–8        Pl           Exon 1           1             336    19    19.54    0.97   49
U68409       Reina        <10                     Intergenic       NA            323    15    20.43    0.73   51
U68404       Huck-2       ~100       334B7.4      Exon 1           NA            1644   162   170.66   0.95   65
U68403       Grande-zm1   >1300                   Intergenic       NA            645    39    43.00    0.90   52
U68402       Cinful       20,000                  Intergenic       NA            605    20    23.00    0.86   39
af090447     Zeon-1       20,000                  Intergenic       NA            669    25    23.93    1.04   38
U11059^a     Zeon-1       20,000     27-kD zein   5′               1             649    21    21.40    0.98   37
af090447     NA           NA                      Intergenic       NA            669    137   157.52   1.04   44
Average:                                                                                               0.93 NS 47
Nonclassified group
U68407       Milt         ~100       334B7.4      3′               NA            742    75    66.47    1.12   60
U68406^b     Kake-1       ~100                    Intergenic       NA            182    2     6.53     0.27   40
Average:                                                                                               0.70 NS 50
Symbols are as in Tables 1 and 3.
^a Insertion into single-copy sequence.
^b Insertion in clustered multicopy gene region or duplicated element.
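The comparison of single-copy insertions (superscript a) with duplicated elements or insertions in clustered multicopy regions (superscript b) can be retraced from the labeled rows of Tables 3 and 4. The Python sketch below only averages the printed CG O/E scores; it does not reproduce the Mann-Whitney U-test, and the names are hypothetical. How the single "NC" entry (observed count of zero) was handled is not stated in the text, so both readings are shown as assumptions.

```python
from statistics import mean

# Printed CG O/E scores of rows labeled "a" (single-copy insertion sites)
# in Tables 3 and 4, in table order.
SINGLE_COPY = [1.46, 0.45, 0.28, 1.27, 1.25, 0.66, 0.96, 1.02,
               0.98, 0.97, 0.97, 0.98]

# Rows labeled "b" (duplicated element or clustered multicopy region).
# None stands for the "NC" entry (Stowaway 3' of x56118, observed CG = 0).
DUPLICATED = [0.16, 0.87, None, 0.25, 0.40, 0.27]


def group_mean(scores, nc_as_zero=False):
    """Average the scores; NC is either dropped or counted as 0 (both are assumptions)."""
    if nc_as_zero:
        values = [0.0 if s is None else s for s in scores]
    else:
        values = [s for s in scores if s is not None]
    return mean(values)


if __name__ == "__main__":
    print(group_mean(SINGLE_COPY))                  # single-copy group
    print(group_mean(DUPLICATED))                   # NC dropped
    print(group_mean(DUPLICATED, nc_as_zero=True))  # NC counted as 0
```

Under these assumptions the single-copy group averages 0.9375, in line with the reported 0.94. The duplicated group averages 0.39 when NC is dropped and 0.325 when NC is counted as zero; the latter is closer to the reported 0.33, but which treatment was actually used cannot be confirmed from the text.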
[...] reflect the number of years a sequence has been methylated. Table 5A shows a comparison between ρ, the estimated time of insertion, and the number of duplications of the seven expressed 22-kD zein genes clustered on chromosome 4S. Similarly, the estimated times of retrotransposon insertion at the Adh1-F locus have been compared to CG frequencies (Table 5B). Neither CG nor CNG suppression correlated with time of insertion of the 22-kD zein genes or of the LTR-retrotransposons. However, the 22-kD zein genes showed an inverse correlation between CG content and the number of duplications each gene has undergone (r = 0.85; P < 0.014). This was found to be specific to the CG dinucleotide. Similarly, analysis of the expressed 19-kD zein genes isolated from the B73 cluster also showed that the degree of CG suppression correlated with the extent of duplication (r = 0.52; P < 0.05; n = 14; results not shown).
Sequence divergence of duplicated DNA sequences: To identify the fate of CG dinucleotides, pairwise [...] number of C:T and G:A transition mutations and the total number of transition mutations occurring in a CG or CNG context were counted and compared to the total number of point mutations (results not shown). In addition, transition mutations occurring in a CG or CNG context were counted for each gene in the pairwise alignments. Transition mutations were the most common point mutations observed, representing on average 61% of total point mutations, and the vast majority occurred in a CG or CNG context (average 74%). In 7/21 alignments performed, it was possible to distinguish if CG depletion resulted from the time or extent of duplication, whereas the remaining alignments were noninformative. The informative alignments included zp22/D87 and azs22;8, azs22;10 and azs22;4, azs22;10 and azs22;14, azs22;12 and zp22/6, zp22/6 and azs22;14, azs22;12 and azs22;10, and zp22/6 and azs22;14. Four of these alignments showed that transition mutations occurring in CG context were consistent with CG suppression resulting from duplication, whereas none were
TABLE 5
CG scores of 22-kD zein genes and LTR regions compared to insertion time and duplication
A. 22-kD gene   Insertion (MYR)^a   No. of duplications   CG     CNG
azs22;12        0.6                 4                     0.58   1.09
azs22;10        0.6                 6                     0.49   1.10
zp22/6          0.6                 6                     0.50   1.11
azs22;14        1.4                 6                     0.57   0.98
azs22;4         1.4                 6                     0.52   1.11
zp22/D87        1.6                 8                     0.44   1.18
azs22;8         2.3                 7                     0.45   1.08
B. LTR          Insertion (MYR)^b   CG
Cinful          0.26                0.90
Grande-Zm1      0.12                0.90
Opie-1          0.18                0.90
Fourf           1.39                1.00
Milt            1.56                1.10
Ji-3            1.86                0.80
Reina           2.08                0.73
Huck-1          2.26                0.90
Victim          2.42                0.30
Zeon-1          2.75                1.00
^a Insertion/duplication as estimated by Song et al. (2001).
^b Insertion time as estimated by SanMiguel et al. (1998).
[...] were observed from azs22;12 to zp22/6, whereas only 7 were observed in the opposite direction. This is in agreement with duplication-dependent CG suppression, as zp22/6 has undergone a larger number of duplications compared to azs22;12 (six vs. four, respectively). If CG suppression were the result of time-dependent deamination of cytosine residues, an equal number of polarized transition mutations would have been expected given the fact that these genes were both amplified 0.6 MYA. In contrast, no association between transition mutations occurring at the CNG trinucleotide and duplication was found for the informative alignments. Interestingly, 5 of the informative alignments did show a time-dependent decrease in CNG content. For the noninformative alignments, the majority of transition mutations observed at CG dinucleotides (12/14) also occurred in a polarized fashion from a less to a more duplicated gene (or from a younger to an older duplication). However, only 6/14 alignments showed a similar behavior at the CNG trinucleotide. These results support the prior observation that only CG suppression correlates with the extent of duplication.
Alignments were also performed between genes belonging to the 19-kD zein subfamilies, z1A, z1B, and z1D, which have undergone an average of eight, six, [...] and Z448F14-4) were aligned to all the expressed genes of z1B and z1C. In 16/18 alignments, a higher number of transition mutations at CG dinucleotides were observed from the less duplicated z1B and z1C genes to the more extensively duplicated z1A genes. The same could be concluded by alignments of two z1B genes (Z492M16-1 and Z492M16-4) to the two expressed z1D genes. Of the 18 alignments performed, transition mutations represented on average 57% of total single-base-pair mutations and, of these, 48% occurred in a CG or CNG context.
Similar conclusions were found by aligning the duplicated 27-kD zein gene, x56118, to the single-copy 27-kD zein gene, zc2 (x53514). These genes exhibit 97% sequence similarity. Only nine point mutations were observed, six being C:T or G:A transition mutations. Again, transition mutations occurred in a polarized fashion from the single-copy to the duplicated gene (four vs. two, respectively). However, only transition mutations observed from the zc2 to the x56118 allele were in a CG context (3/4). This pattern of mutation was also mirrored by alignment of the Tourist and Stowaway elements located in the 5′ and 3′ flanking region of the single-copy and duplicated 27-kD zein gene, respectively. Finally, alignment of the single-copy and duplicated Tourist element 3′ of the Adh1 locus confirmed that C:T and G:A transition mutations are polarized (i.e., occur from the single to the duplicated MITE sequence).
The degree of sequence identity between individual MITEs varies between 46 and 88% (Bureau and Wessler 1992). Interestingly, the average degree of similarity between duplicated MITEs and MITEs inserted in duplicated gene regions is higher than the average degree of similarity of MITEs inserted in single-copy regions (66 vs. 63%, respectively; P = 0.0431). This supports our observations that linked duplicated sequences evolve more similarly compared to single or dispersed, multicopy sequences.
Methylation status of single- vs. multicopy genic regions: C:T and G:A transition mutations are the presumed products resulting from deamination of methylated cytosines. Therefore, a possible explanation of the enhanced turnover of CG dinucleotides might be that duplicated sequences exhibit qualitative or quantitative differences in methylation compared to single-copy genes. To this end, we analyzed the methylation status of the Tourist element located in the single-copy or tandem duplication of the 27-kD zein gene (zc2 and x56118) by bisulfite sequencing. These elements were chosen because they exhibit small differences in CG and G + C content (see Table 3). The results showed that in both sequence contexts the MITE was hypermethylated at all CG dinucleotides, in addition to a proportion of methylated cytosines in a nonsymmetrical sequence
Figure 1.—Methylation state of a Tourist element in the single-copy or duplicated 27-kD zein gene. Bisulfite sequencing of a Tourist element in the 5′ region of the single or duplicated 27-kD zein gene. M, methylation of symmetrical CG and CNG sequences; M, methylation of nonsymmetrical sequences. The methylation state of 16 independent clones is indicated, and each letter represents two observations.
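As an illustration of how clone data of the kind summarized in Figure 1 can be tallied, the following is a minimal sketch and not the authors' actual procedure. It assumes gap-free, top-strand clone sequences aligned to an unconverted reference, and it relies on the standard bisulfite logic that an unmethylated cytosine is read as T whereas a methylated cytosine remains C. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def cytosine_context(ref: str, i: int) -> str:
    """Classify the cytosine at ref[i] as 'CG', 'CNG', or 'asymmetric'.

    Cytosines too close to the 3' end to be classified fall into
    'asymmetric' (a simplification made for this sketch).
    """
    nxt = ref[i + 1] if i + 1 < len(ref) else ""
    nxt2 = ref[i + 2] if i + 2 < len(ref) else ""
    if nxt == "G":
        return "CG"
    if nxt2 == "G":
        return "CNG"
    return "asymmetric"


def methylation_by_context(ref: str, clones: List[str]) -> Dict[str, float]:
    """Fraction of cytosine observations that remained C, per context."""
    methylated: Counter = Counter()
    total: Counter = Counter()
    for i, base in enumerate(ref):
        if base != "C":
            continue
        ctx = cytosine_context(ref, i)
        for clone in clones:
            if clone[i] == "C":    # protected from conversion: methylated
                methylated[ctx] += 1
                total[ctx] += 1
            elif clone[i] == "T":  # converted by bisulfite: unmethylated
                total[ctx] += 1
            # any other base is treated as a sequencing error and skipped
    return {ctx: methylated[ctx] / total[ctx] for ctx in total}


# Hypothetical toy data: one CG, one CNG, and one asymmetric cytosine.
ref = "ACGTCAGTCTTA"
clones = ["ACGTCAGTTTTA", "ACGTTAGTTTTA", "ATGTTAGTCTTA", "ACGTCAGTTTTA"]
fractions = methylation_by_context(ref, clones)
print(fractions)
```

Comparing such per-context fractions between the element in the single-copy gene and the element in the duplicated gene gives the kind of relative reduction in symmetrical and asymmetrical methylation reported in the text.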
ated compared to 25/30 cytosine residues of the element inserted in the tandem duplication. In addition, quantitative analysis of 16 independent clones indicated that the MITE located in the duplicated 27-kD zein gene showed 19 and 37% reduction in symmetrical and asymmetrical methylation, respectively, compared to the element located in the single-copy gene.

DISCUSSION

Our results show that during a short evolutionary time span, individual α-zein genes have accumulated large variations in CG content. In particular, within the 19- and 22-kD zein gene families, extensively duplicated genes are more CG depleted compared to less duplicated genes. In addition, the 19-kD zein genes are more suppressed than the 22-kD zein genes, indicating that the 19-kD genes have undergone more duplications. This is supported by the fact that the 19-kD genes are more abundant compared to the 22-kD genes (Hagen and Rubenstein 1981; Wilson and Larkins 1984; Heindecker and Messing 1986).

Duplication-dependent depletion of the α-zein genes results largely from C:T and G:A transition mutations in a CG or CNG context. Similarly, elevated levels of transition mutations have also been identified in other multigene families in plants such as the GAPA and rDNA gene families of maize and the 5S RNA genes from Arabidopsis (Quigley et al. 1989; Edward et al. 1996; Matieu et al. 2002). For these gene families, a higher level of C:T (or G:A) transition mutations was observed in the nontranscribed genes, and it was argued that CG depletion resulted from relaxation of selective constraints at the transcriptional level (Quigley et al. 1989; Matieu et al. 2002). This explanation is, however, inadequate for
the α-zein gene family, as the most highly expressed genes are the most CG depleted. We speculate that the high expression levels of the most CG-suppressed α-zeins are caused by the lack of CG dinucleotides available for methylation, thus relieving methylation-mediated transcriptional repression. Indeed, the fact that α-zein genes are methylated in both the coding and noncoding regions and exhibit an inverse relation between CG methylation and expression lends support to this idea (Bianchi and Viotti 1988; Lund et al. 1995; Sturaro and Viotti 2001). In addition, differences in the extent of methylation could explain the large variations in expression levels of individual zein genes (Woo et al. 2001; Song and Messing 2002).

The in vivo rate of deamination of methylated cytosine residues in plants is not known, whereas in mammals the estimated half-life of a cytosine residue is between 24 and 60 million years (Yang et al. 1996). However, the average plant gene is less depleted in CG compared to the average mammalian gene (average CG score is 0.68 and 0.22, respectively), perhaps indicative of a decreased mutability of CG dinucleotides in plants (Gardiner-Garden and Frommer 1987; Gardiner-Garden et al. 1992). On an evolutionary time scale, many transposons represent recent insertions in the maize genome. For example, a large number of LTR-retrotransposons that map to the Adh1-F locus have inserted during the last 5 million years (SanMiguel et al. 1998). Although these LTR regions exhibit a twofold increase in transition mutations compared to nonmethylated intronic regions (SanMiguel et al. 1998), we found no correlation between CG suppression and insertion time of these sequences. Given that spontaneous deamination of 5mC is a very slow process, this suggests that the observed low CG values of duplicated elements or of elements inserted into duplicated gene regions also result from enhanced turnover of CG as a result of gene duplication.

If duplicated sequences are methylated compared to their single-copy counterparts, an increase in methylation-related deaminations might be expected, subsequently resulting in a reduction in CG content. However, bisulfite analysis of a MITE inserted in the single-copy or duplicated 27-kD zein gene locus showed that the element was methylated in both sequence contexts. In addition, as most transposons are hypermethylated (Rabinowicz et al. 1999; Tompa et al. 2002), it is probable that the majority of LTR regions analyzed in this study are methylated. Indeed, the observation that LTR regions of retrotransposons that map to the Adh1-F locus exhibit a twofold increase in transition mutations compared to nonmethylated intronic regions strongly suggests that these single-copy insertions are methylated (SanMiguel et al. 1998). However, despite indirect evidence that these elements are methylated, most elements were not CG suppressed. Likewise, although both single-copy and multicopy zeins have been shown to be hypermethylated in plant tissue (Bianchi and Viotti 1988; Lund et al. 1995), only suppression of the multicopy zeins was observed. Taken together, the data suggest that methylation status per se is not sufficient to explain differences in CG suppression. However, duplicated sequences may exhibit a specific methylation pattern that alters the mutability of 5mC compared to a single-copy methylated sequence. The Tourist element inserted in the 5′ region of the single-copy 27-kD zein gene showed a quantitative difference in methylation compared to the same element inserted in the duplicated 27-kD zein gene. The significance of these findings in relation to CG suppression can only be speculated. However, in rodents neither CG density nor the methylation status could explain the observed mutation frequencies of CG dinucleotides of a transgene (Skopek et al. 1998; Monroe et al. 2001). Likewise, the methylation status of the expressed and nonexpressed 5S RNA genes in Arabidopsis failed to account for the elevated levels of C:T and G:A transition mutations observed in the nonexpressed genes (Matieu et al. 2002).

An alternative explanation for the observed differences in CG content is that selective pressures differ between chromosomal regions, which could result in different mutation rates of CG dinucleotides. For example, this might explain why the 22-kD zein genes, which map to chromosome 4S, are suppressed, whereas no suppression of DNA sequences that map to the Adh1-F chromosomal region was observed. However, analysis of intergenic and LTR regions of retrotransposons, located in the 340-kb 22-kD zein gene cluster, showed that CG suppression was localized to zein gene regions and was not a common feature of this chromosomal region. That CG suppression is independent of chromosomal content is also supported by the fact that many sequences analyzed exhibit large variations in CG despite being located in identical genomic regions, e.g., Tourist and Stowaway elements located in the noncoding regions of the single-copy or duplicated 27-kD zein gene or the single or duplicated Tourist element located in the 3′ region of the Adh1 locus.

A study of 101 maize genes has shown that 40% of codon-usage variation is due to a bias toward G or C vs. A or U ending codons (Fennoy and Bailey-Serres 1993). The bias toward C or G in the third position (GC3) is larger for the single-copy zeins, whereas the α-zeins have low GC3 values. We found that CG dinucleotides of the α-zeins were suppressed at positions II-III and III-I, whereas the single-copy genes showed an excess of CG at these positions. This is interesting as CG suppression at position III-I seems to be specific to methylating species, whereas nonmethylating species show an excess of CG at this position (Schorderet and Gartler 1992). Therefore, we argue that the differences in the GC3 bias between the single- and multicopy α-zein genes result from
extensive gene duplication and are not due to a bias in codon usage between the single- and multicopy zein genes.

Mammalian and plant genomes are made up of large regions of relatively homogeneous base composition known as isochores (Bernardi 2000), and the debate is ongoing as to whether this mosaic structure is caused by mutation bias, natural selection, or biased gene conversion (reviewed by Eyre-Walker and Hurst 2001). Of particular interest in this context is the finding that cytosine deamination plays a primary role in the evolution of isochores (Fryxell and Zuckerkandl 2000). In maize, most genes are confined to isochores with a narrow GC range, with the exception of the α-zeins and ribosomal genes that are located in GC-poor and GC-rich fractions, respectively (Carels et al. 1995). We have previously argued that the low GC3 content of the α-zeins could be explained by duplication-dependent CG depletion. Given that the GC content and, in particular, the GC3 content of a gene is highly correlated to the overall GC content of the isochore in which it is located (Bernardi et al. 1985; Clay et al. 1996; Eyre-Walker and Hurst 2001), duplication-dependent CG loss may, in part, explain the evolution of this particular GC-poor isochore.

We have shown that duplicated zein genes, LTR elements, and MITEs undergo specific changes in nucleotide sequence. These changes have been observed in CG dinucleotides and result in C:T and G:A transition mutations and a net reduction in GC content. Such a process is reminiscent of RIP in N. crassa, where duplications are de novo methylated and riddled with point mutations (Selker 1990). Unfortunately, this study cannot discern if transition mutations have occurred immediately upon duplication or result from subsequent mitoses. This question has been addressed in Arabidopsis by sequence analysis of multicopy insertions of transgenes after three sexual generations (Mittelstein-Scheid et al. 1994). None of the transition mutations characteristic of RIP were found, and it was argued that if RIP occurs in plants it occurs at a much lower frequency. However, this experiment does not necessarily exclude the possibility of a RIP-like mechanism in plants. Indeed, if transition mutations are linked to the duplication process, merely inserting a multicopy locus into the Arabidopsis genome would, expectedly, fail to recover any transition mutations.

From the detailed analysis of the 22-kD zein genes it is clear that the rate of CG depletion of duplicated sequences is enhanced compared to the average deamination of methylated residues of dispersed multicopy sequences. Further analysis will reveal if this mechanism is common to other duplicated gene families. Suppression of MITEs and LTR elements inserted in duplicated gene regions indicates that this might be the case. Interestingly, rapid responses to genome-wide duplication have been reported in wheat and Arabidopsis allotetraploids. These changes include nonrandom alterations in methylation and gene silencing caused by methylation or gene loss (Kashkush et al. 2002; Madlung et al. 2002). Although it is unclear whether the observed effects result from chromosome doubling or from the hybridization of different genomes, it suggests that specific mechanisms are activated in plants in response to DNA amplification that presumably function to maintain genome stability.

The authors thank Mik Noordeweir for assisting in part of the sequence analysis, Angelo Viotti and Vincenzo Rossi for critical reading of the manuscript, and E. Linton for estimates of insertion time of the 22-kD zein genes. This work was supported by a grant from the Danish National Research Foundation.

LITERATURE CITED

Ashikawa, I., 2001 Gene-associated CpG islands in plants as revealed by analyses of genomic sequences. Plant J. 26:617–625.
Barry, C., G. Faugeron and J. L. Rossignol, 1993 Methylation induced premeiotically in Ascobolus: coextension with DNA repeat lengths and effect on transcription elongation. Proc. Natl. Acad. Sci. USA 90:4557–4561.
Bender, J., 1998 Cytosine methylation of repeated sequences in eukaryotes; the role of DNA pairing. Trends Biochem. Sci. 23:252–256.
Bender, J., and G. R. Fink, 1995 Epigenetic control of an endogenous gene family is revealed by a novel blue fluorescent mutant of Arabidopsis. Cell 83:725–734.
Bennetzen, J. L., K. Schrick, P. S. Springer, W. E. Brown and P. SanMiguel, 1994 Active maize genes are unmodified and flanked by diverse classes of modified, highly repetitive DNA. Genome 37:565–576.
Bernardi, G., 2000 Isochores and the evolutionary genomics of vertebrates. Gene 241:3–17.
Bernardi, G., B. Olofsson, J. Filipski, M. Zerial, J. Salinas et al., 1985 The mosaic genome of warm blooded vertebrates. Science 228:953–958.
Bianchi, M. W., and A. Viotti, 1988 DNA methylation and tissue-specific transcription of the storage protein genes of maize. Plant Mol. Biol. 11:203–214.
Bird, A. P., 1980 DNA methylation and the frequency of CpG in animal DNA. Nucleic Acids Res. 8:1499–1504.
Bureau, T. E., and S. R. Wessler, 1992 Tourist: a large family of small inverted repeat elements frequently associated with maize genes. Plant Cell 4:1283–1294.
Bureau, T. E., and S. R. Wessler, 1994 Stowaway: a new family of inverted repeat elements associated with the genes of monocotyledonous and dicotyledonous plants. Plant Cell 6:907–916.
Cambereri, E. B., B. C. Jensen, E. Schabtach and E. Selker, 1989 Repeat-induced G-C to A-T mutations in Neurospora. Science 244:1571–1575.
Carels, N., A. Barakat and G. Bernardi, 1995 The gene distribution of the maize genome. Proc. Natl. Acad. Sci. USA 92:11057–11060.
Clay, O., S. Caccio, Z. Zoubak, D. Mouchiroud and G. Bernardi, 1996 Human coding and noncoding DNA: compositional correlations. Mol. Phylogenet. Evol. 5:2–12.
Cooper, D. N., and M. Krawczak, 1989 Cytosine methylation and the fate of CpG dinucleotides in vertebrate genomes. Hum. Genet. 83:181–188.
Couloundre, C., J. H. Miller, P. J. Farabaugh and W. Gilbert, 1978 Molecular basis of base substitution hotspots in Escherichia coli. Nature 274:775–780.
Das, O. P., and J. Messing, 1987 Allelic variation and differential expression of the 27-kDa zein locus in maize. Mol. Cell. Biol. 7:4490–4497.
Das, O. P., K. Ward, S. Ray and J. Messing, 1991 Sequence variation between alleles reveals two types of copy correction at the 27 kDa zein locus of maize. Genomics 11:849–856.
Edward, S., I. V. Bukler and T. P. Holtsford, 1996 Zea mays ribosomal repeat evolution and substitution patterns. Mol. Biol. Evol. 14:623–632.
Esen, A., 1986 Separation of alcohol-soluble proteins (zeins) from maize into three fractions by differential solubility. Plant Physiol. 80:623–627.
Eyre-Walker, A., and L. D. Hurst, 2001 The evolution of isochores. Nat. Genet. Rev. 2:549–555.
Fennoy, S. L., and J. Bailey-Serres, 1993 Synonymous codon usage in Zea mays L. nuclear genes is varied by levels of C-ending and G-ending codons. Nucleic Acids Res. 23:5294–5300.
Flavell, R. B., 1994 Inactivation of gene expression in plants as a consequence of specific sequence duplication. Proc. Natl. Acad. Sci. USA 91:3490–3496.
Flavell, R. B., M. O'Dell and W. F. Thompson, 1988 Regulation of cytosine methylation in ribosomal DNA and nucleolus organizer expression in wheat. J. Mol. Biol. 204:523–534.
Fryxell, K. J., and E. Zuckerkandl, 2000 Cytosine deamination plays a primary role in the evolution of mammalian isochores. Mol. Biol. Evol. 17:1371–1383.
Gardiner-Garden, M., and M. Frommer, 1987 CpG islands in vertebrate genomes. J. Mol. Biol. 196:261–282.
Gardiner-Garden, M., J. A. Sved and M. Frommer, 1992 Methylation sites in angiosperm genes. J. Mol. Evol. 34:219–230.
Goyon, C., and G. Faugeron, 1989 Targeted transformation of Ascobolus immersus and de novo methylation of the resulting duplicated DNA sequences. Mol. Cell. Biol. 9:2818–2827.
Gruenbaum, Y., T. Naveh-Many, H. Cedar and A. Razin, 1981 Sequence specificity of methylation in higher plant DNA. Nature 292:860–862.
Gruenbaum, Y., H. Cedar and A. Razin, 1982 Substrate and sequence specificity of a eukaryotic DNA methylase. Nature 295:620–622.
Hagen, G., and I. Rubenstein, 1981 Complex organization of zein genes in maize. Gene 13:239–249.
Heindecker, G., and J. Messing, 1986 Structural analysis of plant genes. Annu. Rev. Plant Physiol. 37:439–466.
Holstein, M., M. S. Greenblatt, K. Rice, T. Soussi, R. Fuchs et al., 1994 Database of p53 gene somatic mutations in human tumors and cell lines. Nucleic Acids Res. 22:3551–3555.
Jeddeloh, J. A., and E. J. Richards, 1996 mCCG methylation in angiosperms. Plant J. 9:579–586.
Jones, P. A., W. M. Rideout, J. C. Shen, C. H. Spruck and Y. C. Tsai, 1992 Methylation, mutation and cancer. Bioessays 14:33–36.
Kashkush, K., M. Feldman and A. A. Levy, 2002 Gene loss, silencing and activation in a newly synthesized wheat allotetraploid. Genetics 160:1651–1659.
Kirihara, J., J. B. Petri and J. Messing, 1988 Isolation and sequence of a gene encoding a methionine rich 10-kDa zein protein from maize. Gene 71:359–370.
Kovarik, A., R. Matyasek, A. Leitch, B. Gazdova, J. Fulnecek et al., 1997 Variability in CpNpG methylation in higher plant genomes. Gene 201:25–33.
Kricker, M. C., J. W. Drake and M. Radman, 1992 Duplication-targeted DNA methylation and mutagenesis in the evolution of eukaryotic chromosomes. Proc. Natl. Acad. Sci. USA 89:1075–1079.
Kumar, A., and J. L. Bennetzen, 1999 Plant retrotransposons. Annu. Rev. Genet. 33:479–532.
Leite, A., L. M. M. Ottoboni, M. L. P. N. Targon, M. J. Silva, S. R. Turcinelli et al., 1990 Phylogenetic relationship of zein and coixins as determined by immunological cross-reactivity and Southern blot analysis. Plant Mol. Biol. 14:743–751.
Leutwiler, L. S., B. R. Hough-Evans and E. M. Meyerowitz, 1984 The DNA of Arabidopsis thaliana. Mol. Gen. Genet. 194:15–23.
Liu, C. N., and I. Rubenstein, 1992 Molecular characterization of two types of 22 kilodalton α-zein genes in a gene cluster in maize. Mol. Gen. Genet. 234:244–253.
Llaca, V., and J. Messing, 1998 Amplicons of maize genes are conserved within genic but expanded and constricted in intergenic regions. Plant J. 15:211–220.
Lund, G., P. Ciceri and A. Viotti, 1995 Maternal-specific demethylation and expression of specific alleles of zein genes in the endosperm of Zea mays L. Plant J. 8:571–581.
Madlung, A., R. W. Masuelli, B. Watson, S. H. Reynolds, J. Davidson et al., 2002 Remodeling of DNA methylation and phenotypic and transcriptional changes in synthetic Arabidopsis allotetraploids. Plant Physiol. 129:733–746.
Maloisel, L., and J. L. Rossignol, 1998 Suppression of crossing-over by DNA methylation in Ascobolus. Genes Dev. 12:1381–1389.
Matassi, G., R. Melis, K. C. Kuo, G. Macaya, C. W. Gehrke et al., 1992 Large-scale methylation patterns in the nuclear genomes of plants. Gene 122:239–245.
Matieu, O., Y. Yukawa, M. Sugiura, G. Pikard and S. Tourmente, 2002 5S rRNA genes expression is not inhibited by DNA methylation in Arabidopsis. Plant J. 29:313–323.
McClelland, M., 1983 The frequency and distribution of methylatable DNA sequences in leguminous plant protein coding genes. J. Mol. Evol. 19:346–354.
Mittelstein-Scheid, O., K. Afsar and J. Paszkowski, 1994 Gene inactivation on Arabidopsis thaliana is not accompanied by an accumulation of repeat-induced point mutations. Mol. Gen. Genet. 244:325–330.
Monroe, J. J., M. G. Manjanatha and T. R. Skopek, 2001 Extent of CpG methylation is not proportional to the in vivo spontaneous mutation frequency at transgenic loci in Big Blue rodents. Mutat. Res. 476:1–11.
Montero, L. M., J. Filipski, P. Gil, J. Capel, J. M. Martinez-Zapater et al., 1992 The distribution of 5-methylcytosine in the nuclear genome of plants. Nucleic Acids Res. 20:3207–3210.
Prat, S., J. Cortadas, P. Puigdomenech and J. Palau, 1985 Multiple variability in the sequence of a family of maize endosperm proteins. Nucleic Acids Res. 13:1493–1504.
Quigley, F., H. Brinkmann, W. F. Martin and R. Cerff, 1989 Strong functional GC pressure in a light regulated maize gene encoding subunit GAPA of chloroplast glyceraldehyde-3-phosphate dehydrogenase: implications for the evolution of GAPA pseudogenes. J. Mol. Evol. 29:412–421.
Rabinowicz, P. D., K. Schutz, N. Dedhia, C. Yordan, L. D. Parnell et al., 1999 Differential methylation of genes and retrotransposons facilitates shotgun sequencing of the maize genome. Nat. Genet. 23:305–308.
Reina, M., P. Guillen, I. Ponte, A. Boronat and J. Palau, 1990 Sequence analysis of a genomic clone encoding a Zc2 protein from Zea mays W64A. Nucleic Acids Res. 18:6425.
Ronchi, A. K., K. Petroni and C. Tonelli, 1995 The reduced expression of endogenous duplications (REED) in the maize R gene family is mediated by DNA methylation. EMBO J. 14:5318–5328.
Rountree, M. R., and E. U. Selker, 1997 DNA methylation inhibits elongation but not initiation of transcription in Neurospora crassa. Genes Dev. 11:2383–2395.
Rubenstein, I., and D. E. Geraghty, 1986 The genetic organization of zeins, pp. 297–315 in Advances in Cereal Science and Technology, edited by Y. Pomeranz. American Association of Cereal Chemists, St. Paul.
Russell, D. A., and M. M. Sachs, 1991 The maize cytosolic glyceraldehyde 3-phosphate dehydrogenase gene family: organ specific expression and genetic analysis. Mol. Gen. Genet. 229:219–228.
SanMiguel, P., A. Tikhonov, Y. K. Jin, N. Motchoulskaia, D. Zakharov et al., 1996 Nested retrotransposons in the intergenic regions of the maize genome. Science 274:737–738.
SanMiguel, P., B. S. Gaut, A. Tikhonov, Y. Nakajima and J. L. Bennetzen, 1998 The paleontology of intergene retrotransposons of maize. Nat. Genet. 20:43–45.
Schorderet, D. F., and S. M. Gartler, 1992 Analysis of CpG suppression in methylated and non-methylated species. Proc. Natl. Acad. Sci. USA 89:957–961.
Selker, E. U., 1990 Premeiotic instability of repeated sequences in Neurospora crassa. Annu. Rev. Genet. 24:579–613.
Skopek, T., D. Marino, K. Kort, J. Miller, M. Trumbauer et al., 1998 Effect of target gene CpG content on spontaneous mutation in transgenic mice. Mutat. Res. 400:77–88.
Soave, C., R. Reggiani, N. Difonzo and F. Salamini, 1981 Clustering of genes for 20 kd zein subunits in the short arm of maize chromosome 7. Genetics 97:363–377.
Soave, C., R. Reggiani, N. Difonzo and F. Salamini, 1982 Genes for zein subunits on maize chromosome 4. Biochem. Genet. 20:1027–1038.
Song, R., and J. Messing, 2002 Contiguous genomic DNA sequence comprising the 19-kD gene family from maize. Plant Physiol. 130:1626–1635.
Song, R., V. Llaca, E. Linton and J. Messing, 2001 Sequence, regulation, and evolution of the maize 22-kD α-zein gene family. Genome Res. 11:1817–1825.
Spena, A., A. Viotti and V. Pirotta, 1983 Two adjacent genomic zein sequences: structure, organization and tissue-specific restriction pattern. J. Mol. Biol. 169:799–811.
Sturaro, M., and A. Viotti, 2001 Methylation of the Opaque2 box in zein genes is parent-dependent and affects O2 DNA binding activity in vitro. Plant. Mol. Biol. 46:549–560.
Swarup, S., M. C. P. Timmermans, S. Chaudhuri and J. Messing, 1995 Determinants of the high-methionine trait in wild and exotic germplasm may have escaped selection during early cultivation of maize. Plant J. 8:35–40.
Tikhonov, A. P., P. J. SanMiguel, Y. Nakajima, N. M. Gorenstein and J. L. Bennetzen, 1999 Colinearity and its exceptions in orthologous adh regions of maize and sorghum. Proc. Natl. Acad. Sci. USA 96:7409–7414.
White, S. E., L. F. Habera and S. R. Wessler, 1994 Retrotransposons in the flanking regions of normal plant genes: a role for copia-like elements in the evolution of gene structure and expression. Proc. Natl. Acad. Sci. USA 91:11792–11796.
Wilson, C. M., G. F. Sprague and T. C. Nelson, 1989 Linkage among zein genes determined by isoelectrical focusing. Theor. Appl. Genet. 77:217–226.
Wilson, D. R., and B. A. Larkins, 1984 Zein gene organization in maize and related grasses. J. Mol. Evol. 20:330–340.
Woo, Y. M., D. W. Hu, B. A. Larkins and R. Jung, 2001 Genomics analysis of genes expressed in maize endosperm identifies novel seed proteins and clarifies patterns of zein gene expression. Plant Cell 13:2297–2317.
Yang, A. S., M. L. Gonzalgo, J. Zingg, R. P. Miller, J. Buckley et al., 1996 The rate of CpG mutation in Alu repetitive elements within the p53 tumor suppressor gene in the primate germline. J. Mol. Biol. 258:240–250.
Zeschnigk, M., C. Lich, K. Buiting, W. Doerfler and B. Horsthemke, 1997 A single-tube PCR test for the diagnosis of Angelman and Prader-Willi syndrome based on allelic methylation differences at the SNRPN locus. Eur. J. Hum. Genet. 5:94–98.
Zhang, Q., J. Arbuckle and S. R. Wessler, 2000 Recent, extensive, and preferential insertion of members of the miniature inverted-repeat transposable element family Heartbreaker into genic regions of maize. Proc. Natl. Acad. Sci. USA 97:1160–1165.
Tompa, R., C. M. McCallum, J. Delrow, J. G. Henikoff, B. van Steensel et al., 2002 Genome-wide profiling of DNA methylation reveals transposon targets of CHROMOMETHYLASE3. Curr.