• No results found

Nucleotide sequence of the 5' noncoding region and part of the gag gene of Rous sarcoma virus.

N/A
N/A
Protected

Academic year: 2019

Share "Nucleotide sequence of the 5' noncoding region and part of the gag gene of Rous sarcoma virus."

Copied!
7
0
0

Loading.... (view fulltext now)

Full text

(1)

JOURNAL 1982, 535-541 0022-538X/82/0535-07$02.00/0

Nucleotide

Sequence

of the

5'

Noncoding

Region

and Part of

the gag

Gene of Rous Sarcoma Virus

RONALDSWANSTROM,* HAROLDE.VARMUS,ANDJ. MICHAEL BISHOP

Department of Microbiology and Immunology, UniversityofCalifornia,San Francisco, California 94143

Received3 August1981/Accepted28September1981

Several functions ofthe retrovirus genome involve structural features in the

vicinity of its 5' terminus. In anefforttofurtherelucidatetherelationship between

structure andfunction inretrovirusRNA, we havedetermined the sequence of the

first 1,010nucleotides at the 5' endofthe genomeof Roussarcomavirusbyusing

theMaxam-Gilbert method to sequence suitable domains in cloned Roussarcoma

virus DNA. The results (i) locate the initiation codon for the gag gene of Rous

sarcomavirus 372nucleotidesfrom the 5' end of viralRNA;(ii) demonstrate that

thiscodonispreceded bythree methionine codons thatareapparentlynotused in

translation; (iii) sustain previous conclusions that the principal site to which

ribosomes bind on the Roussarcomavirus genome in vitro doesnotcontain the

initiation codon for gag; (iv) permitdeduction of the amino acid sequence of a

viralstructuralprotein, p19; (v) confirmtheamino-terminal sequence ofPr76505;

and (vi) substantiate the identification ofa splice donor site described in the

accompanying manuscript (Hackettetal.,J. Virol., 41:527-534, 1982).

The genomesof retroviruses assume multiple

roles during the viral life cycle, servingas

tem-plate forthe synthesis of viralDNAbyreverse

transcriptase, messenger for the synthesis of

several viral gene products, precursor for the

genesis of subgenomic mRNA's, andvehicle for

thetransmission of viralgenesfromonehost cell

to another (1). Each of these roles involves

structuralfeatures locatedinthevicinity of the

5' terminus ofthe viral genome: the initiation

site for reverse transcription (41); a nucleotide

sequence(R)repeatedatthe 5'and3'terminiof

viralRNAthat isrequiredfor chainpropagation

byreversetranscriptase(3a);aribosomebinding

siteandAUG codonrequiredtoinitiate

transla-tion (29, 45); a splice donor site used in the

genesis of subgenomicmRNA's (5, 8, 9, 11,25,

28, 30, 38, 46); and a nucleotide sequence of

uncertain size that is apparently necessary for

incorporation of viralRNAintomaturing virions

(23, 34).Tofacilitate the full elucidation ofthese

functionally important features, wehave

deter-minedthesequenceof1,010nucleotidesatthe 5'

endof the genome of the subgroup A

Schmidt-RuppinstrainofRous sarcoma virus(RSV). Our

experimentalapproachexploitedtheavailability

of molecularly cloned DNA derived from the

genome of subgroup A Schmidt-Ruppin RSV

(7).The results locate the initiation codon for the

gag geneofRSV;demonstrate that thiscodon is

precededby threeAUGcodons that are

appar-entlynotusedintranslation, permitdeduction of

the amino acid sequence ofa viral structural

protein (p19),andidentifyanucleotidesequence

thatis likelytobe the splicedonor sitemapped

intheaccompanying manuscript (13).

MATERALSANDMETHODS

Allanalyses were performed on DNA derived from themolecular cloningofthe genome of SR-ARSV. The viral genome was first cloned into the phage vectorAgtWESXB (2, 21) asdescribedpreviously(7). TheSRA-2cloneof viralDNA was used in thisstudy

(7).Suitable restriction fragmentswerethen subcloned into theplasmidvector pBR322 by conventional pro-cedures(3). Two subclones were used for the present work(Fig. 1). One (pPvuDG)contained thePvuII-D

and-GfragmentsfromRSV DNA, joined at theSacI

site to regenerate their normal orientation compared with the Sacd permuted SRA-2 clone (R. Parker,

personal communication); the other plasmid used (pBam C) contained the 1.35-kilobase-pair BamHI-C

fragmentof SRA-2DNA,which extends from position

525ontheviralgenome (the RNA capping site isused asposition1) to a pointwell within gag (7). Viral DNA

inthe subclones was sequenced by the procedure of Maxam and Gilbert (24). For this purpose, we pre-pared restriction fragments with 5' overhangs that permit end labeling either by repair synthesis with reverse transcriptase (39) or by T4 polynucleotide

kinase(24).

RESULTS

Strategy for determining the nucleotide

se-quence ofviralDNA. The RNA genome of RSV

is transcribed into DNA during the early hours

ofinfection(42). The first stable product of viral

535

on November 10, 2019 by guest

http://jvi.asm.org/

(2)

DNAsynthesis is alinear d

sive with the viral genol

addition, terminal redund

threedomains: U3, encodi

viral RNA; R, a small re(

eachend ofthe viral RNA

the 5' end of the viral I

domains arearranged in th4

to give the long termina

linear duplexes are then

circularmolecules of two

ing onecopy of theLTR se

contain two copies (17, 1

previously reported the is

cloning of thecircular DN.

study we have used the

DNA which contains tw(

sequence (7, 39).

Twosubclones of viral I

used for sequence deter

pPvu DG, a clone of the

R U3 ,U5

Capf

(-)PB

BstEII Sac

Pvu 11

EcoRI BamHI

Xho Hinfl Sau3AI Hpa11

A A

A

pPvuDG

FIG. 1. Strategy for sequ

molecular subclones pPvu I

mapped with restriction endc

age sites useful in sequenci

Maxamand Gilbert (24). Th4 325nucleotides,permittingtl tinuous sequence. The doma

represented inthesequenced topofthedrawing. The dom explained in thetext.The 5'e denoted by the label Cap,

tRNAT`P by (-)PB. The pc

approximated from the data manuscript; the location of previous reports (35,44) and Hunter (personal communic the direction of sequencing restriction sites.

Luplex that is coexten- mentsthatspans the fused 3' and 5' ends of the

me, but that has, in RSV genome and that includes ca. 250

nucleo-lancies composed of tides from the 3' end of the RSVgenome, both

edatthe 3' endofthe copiesof the LTR sequence, andca.750

nucleo-dundancy encoded at tides from the 5' end of the viral RNA (7, 39);

i;

and U5, encodedat and(ii) pBam C, which overlaps with the

right-RNA (17, 32). These ward domain of pPvu DG and extends from

eorder 5'-U3-R-U5-3' position525ontheRSV genome to a pointwell

1 repeat (LTR). The within gag (7). Restriction mapping of these

formed into closed subclones revealed thecleavage sites illustrated

sorts-those contain- in Fig. 1 and led to the sequencing strategy

-quence and those that summarized there. To assure accuracy, each

[8, 32, 39). We have region of viral DNAwaseither sequenced

inde-olation and molecular pendentlyfrommorethanonerestriction siteor

As ofRSV (7). Inthis sequencedrepeatedly from the samesite.

SRA-2 clone of viral Nucleotide sequence of the DNA. Figure 2

D copies of the LTR illustrates the sequence of 1,010nucleotides of

viral DNA from the recombinant clones and

DNAin pBR322 were identifies thelocation of allrecognizable

restric-mination (Fig. 1): (i) tionsites. Figure 3 presents the same sequence

PvuII-D and -G frag- as the plus strand of viral RNA. The sequence as

illustratedinboth figures begins at the 5'

termi-nus of the subgroup A Schmidt-Ruppin RSV

genome and extends in the 3' direction. Because

p19 plO? the 3' and 5' ends of the viral genome are fused

in thecloned DNA (see above), it wasnecessary

to deduce the location of the 5' terminus of the

RNA. This was easily done by reference to

previous analyses of other strains of RSV (12, A

15, 19, 37,

40)

and

by

our own studies

using

an

adaptation of the chain-terminator sequencing

technique of Sanger and colleagues (31, 40) to

determine the sequence of therunoff DNA

prod-uct synthesized from the 5' terminus of

sub-group A Schmidt-Ruppin RSV RNA (data not

shown).

Functional aspects of the sequence. Several

* - - * important landmarks could be located on the

sequence.(i)Theterminallyredundantsequence

known as R has been characterized previously

11111111] and occupies positions 1

through

21

(3a).

(ii)

A

Liizzm

:11 sequence of 18 nucleotides

(positions

102

pBamC

c~~

through

119)

is

base-paired

with the 3' stem

region of tRNATIP to provide a primer for

re-encing viral DNA. The verse

transcriptase

(4,

10,

39);

as aconsequence,

DO and pBam C were viralDNA synthesis initiates at position 101 on

Dng

by the technique of the template (15, 37). (iii) Previous work has

e twoclones overlap by

identified

a

ribosome

binding site that includes

heconstruction of a con- the AUG atposition41 and

neighboring

nucleo-ins of the viral genome tides, positions 9through 53 (6); paradoxically,

DNA are depicted at the this AUG apparently does not serve for the

nains U3, R, and US are initiation of translation (see below). (iv) The

-nd of theviral genomeis beginning of the gag gene was identified by

the site of binding for searching for an AUG followed by an extensive

rported

in

9heapsresent

open

reading

frame

(Fig.

4) and

by

reference to

plO is based largely on the previously determined amino acid sequence

unpublished data of E. at the amino terminus of the gag gene product

-ation). Arrows indicate (27). A suitable AUG was found at position 372

employed at individual (Fig. 4)andwasfollowedbythepredictedamino

acid sequence (see

below).

(v) In the

accompa-*

on November 10, 2019 by guest

http://jvi.asm.org/

[image:2.490.57.250.313.511.2]
(3)

SEQUENCE

Hph I HgiAI EcoRIl HaeIII Asu I Hinfl HgiAl

GCCATTTGACCATTCACCACATTGGTGTGCACCTGGGTTGATGGCCGGACCGTTGATTCCCTGACGACTACGAGCACCTG

I ~~~~IHpa11Ava 11

20 40 60 80

BstEII Hae III Mbo I

CATGAAGCAGAAGGCTTCATTTGGTGACCCCGACGTGATAGTTAGGGAATAGTGGTCGGCCACAGACGGCGTGGCGATCC

HphI III

100 120 140 160

Mnl I Taq I EcoRIl Mnl I Bbv I Fnu4HI Mnl I AluI

TGTCTCCATCCGTCTCGTCTATCGGGAGGCGAGTTCGATGACCCTGGTGGAGGGGGCTGCGGCTTAGGGAGGCAGAAGCT

180ltsu 200200 Fnu4HI Dde I

~~~~~~~~~~~~~~~~~~~240

Rsa I Mnl IHgiAI A/u I EcoRII Asu I Asu I Dde I Mbo11

GAGTACCGTCGGAGGGAGCTCCAGGGCCCGGAGCGACTGACCCCTGCCGAGAACTCAGAGGGTCGTCGOAAGACGGAGAG

Dde I Sac HaeIIIHpa1 I2 Mnl I 3

260 280 n/I320

EcoRIl Hae III Mbo I HphI

TGAGCCCGACGACCACCCCAGGCACGTCTTTGGTCGGCCTGCGGATCAAGCATGGAAGCCGTCATTAAGGTGA1TTCGTC

360 380 400

Tha I Dde I Asu I

CGCGTGTAAAACCTATTGCGGGAAAATCTCTCCTTCTAAGAAGGAAATAGGGGCCATGTTGTCCCTGTTACAAAAGGAAG

I ~~~~~~~~~~HaeIII

420 440 460 480

Mnl I Hpa11 Asu IEcoRlI Xho 11 Fnu4HIH/haI

GGTTGCTTATGTCTCCCTCAGATTTATATTCTCCGGGGTCCTGGGATCCCATCACTGCGGCGCTCTCCCAGCGGGCAATG

Dde I Ava 11 BamHIMbo I HaeI1 560

Rsa I EcoRII Fnu4HI Xho I Mnl I

GTACTTGGAAAATCGGGAGAGTTAAAAACCTGGGGATTGGTTTTGGGGGCATTGAAGGCGGCTCGAGAGGAACAGGTTAC

I ~~~~~~~~~~IAvaITaq II

580 600 620 640

Dde I Mnl IMnl I EcoRIIAsuIHpa11 TaqI A/uI

ATCTGAGCAAGCAAAGTTTTGGTTGGGATTAGGGGGAGGGAGGGTCTCTCCCCCAGGTCCGGAGTGCATCGAGAAACCAG

660 680 Ava1172

Bbv I Mbo 11 Hae 11 Hha I Mnl I CTACGGAGCGGCGAATCGACAAAGGGGAGGAGGTGGGAGAAACAACTGTGCAGCGAGATGCGAAGATGGcGCCAGAGGAA

Hinfl MnlI Fnu4HI AcyI 8

740 760 780 800

Fnu4HI Pvu 11 BbvI Hha I

GCGGCCACACCTAAAACCGTTGGCACATCCTGCTATCATTGCGGAACAGCTGTTGGCTGCAATTGCGCCACCGCCACAGC

Hae III Alu I Fnu4HI I8

820 840 880

Mnl I Hae IIIMnlI EcoRII Fnu4HI Asu IEcoRII

CTCGGCCCCTCCTCCCCCTTATGTGGGGAGTGGTTTGTATCCTTCCCTGGCGGGGGTGGGAGAGCAGCAGGGCCAGGGAG

920 940 Bbv I HaeIII 960

AvaI Fnu4HI MnlI EcoRil Tha I ATAACACGTCTCGGGGGCGGAGCAGCCAAGGGAGGAGCCAGGGCACGCGG

BbvI 1010

FIG. 2. Nucleotide sequenceof cloned viral DNA. Thesequenceisnumbered from the5' end of the viral genome and depicts the same polarity as the genome ("positive strand"). Identifiable sites of cleavage by restrictionendonucleaseswerelocated by computer-assisted search. Identifiable regions within thesequence, as

described in thetext,are asfollows:R, 1-21; US, 22-101; (-)PB, 102-119;startofgagandp19, 372; end of p19,

902.

nying manuscript (13), a splice donor site has

beenmappedtothevicinity of position 390;an excellentfacsimile of thecanonical splice donor

sequence (22) occupies positions 387 through

395 (see below; Fig. 5).

Although the sequence given here is more

thanampletoaccommodate theentirety of p19,

wecannotpresently identify the carboxy

termi-nus of the protein with certainty. Since p19 is

but one of five proteins encoded by gag in a

singlepolyprotein (with the probable order p19-plO-p27-p12-pI5;seereferences 35 and 44;

per-sonalcommunication from E. Hunter), the

car-boxy terminus of p19 is not demarcated by a

termination codon. On the basis ofincomplete datadescribingamino acidsequencesin p19and

other gag proteins, however, we suggest that p19mayterminate with amino acid residue 177

(tyrosine) in Fig. 3 (see below).

DISCUSSION

Authentication of the nucleotidesequence. We presenthere the sequenceof1,010 nucleotides,

beginningatthe5' terminus of theRSVgenome

andextending through the p19 domain of thegag gene. We have anumber ofreasons tobelieve that our identification and sequencing of the

viral DNA have notbeen affected by anylarge errors. First, the subclone pPvu DG includes

bothcopies of the LTRsequenceand displaysat

least one of the functional activities of this domain-the ability to direct the initiation of

Fnu4HI Taq I Mnl I

Asu I Mnl I

9VW

VOL.41, 1982

on November 10, 2019 by guest

http://jvi.asm.org/

[image:3.490.47.442.62.445.2]
(4)

SWANSTROM, VARMUS, BISHOP

GCCAUUUGACCAUUCACCACAUUGGUGUGCACCUGGGUUGAUGGCCGGACCGUUGAUUCCCUGACGACUA

CGAGCACCUGCAUGAAGCAGAAGGCUUCAUUUGGUGACCCCGACGUGAUAGUUAGGGAAUAGUGUUCUGUCACAGACGUG GUGGCGAUCCUGUCUCCAUCCGUCUCGUCUAUCGGGAGGCGAGUUCGAUGACCCUGGUGGAGGGGGCUGCGGCUUAGGGA

GGCAGAAGCUGAGUACCGUCGGAGGGAGCUCCAGGGCCCGGAGCGACUGACCCCUGCCGAGAACUCAGAGGGUCGUCGGA

met glu ala AGACGGAGAGUGAGCCCGACGACCACCCCAGGCACGUCUUUGGUCGGCCUGCGGAUCAAGC AUG GAA GCC

10 20

val ile lys val ile ser ser ala cys Iys thr tyr cys gly lys ile ser pro ser lys GUC AUU AAG GUG AUU UCG UCC GCG UGU AAA ACC UAU UGC GGG AAA AUC UCU CCU UCU AAG

30 40

Iys glu ile gly ala met leu ser leu leu gin Iys glu gly leu leu met ser pro ser AAG GAA AUA GGG GCC AUG UUG UCC CUG UUA CAA AAG GAA GGG UUG CUU AUG UCU CCC UCA

50 60

asp leu tyr ser pro gly ser trp asp pro ile thr ala ala leu ser gin arg ala met

GAU UUA UAU UCU CCG GGG UCC UGG GAU CCC AUC ACU GCG GCG CUC UCC CAG CGG GCA AUG

70 80

val leu gly lys ser gly glu leu lys thr trp gly leu val leu gly ala leu lys ala GUA CUU GGA AAA UCG GGA GAG UUA AAA ACC UGG GGA UUG GUU UUG GGG GCA UUG AAG GCG

90 100

ala arg glu glu gin val thr ser glu gin ala Iys phe trp leu gly leu gly gly gly GCU CGA GAG GAA CAG GUU ACA UCU GAG CAA GCA AAG UUU UGG UUG GGA UUA GGG GGA GGG

110 120

arg val ser pro pro gly pro glu cys ile glu lys pro ala thr glu arg arg ile asp

AGG GUC UCU CCC CCA GGU CCG GAG UGC AUC GAG AAA CCA GCU ACG GAG CGG CGA AUC GAC

130 140

lys gly glu glu val gly glu thr thr val gin arg asp ala lys met ala pro glu glu

AAA GGG GAG GAG GUG GGA GAA ACA ACU GUG CAG CGA GAU GCG AAG AUG GCG CCA GAG GAA

150 160

ala ala thr pro lys thr val gly thr ser cys tyr his cys gly thr ala val gly cys

GCG GCC ACA CCU AAA ACC GUU GGC ACA UCC UGC UAU CAU UGC GGA ACA GCU GUU GGC UGC

170 180

asn cys ala thr ala thr ala ser ala pro pro pro pro tyr val gly ser gly leu tyr

AAU UGC GCC ACC GCC ACA GCC UCG GCC CCU CCU CCC CCU UAU GUG GGG AGU GGU UUG UAU

190 200

pro ser leu ala gly val gly glu gin gin gly gin gly asp asn thr ser arg glV arg

CCU UCC CUG GCG GGG GUG GGA GAG CAG CAG GGC CAG GGA GAU AAC ACG UCU CGG GGG CGG 70

150

230 310

380

440

500

560

620

680

740

800

860

920

980 210

ser ser gin gly arg ser gin gly thr arg

AGC AGC CAA GGG AGG AGC CAG GGC ACG CGG

FIG. 3. Proposedaminoacidsequence for thep19domain of RSVgag. The sequence presented in Fig. 2 is hererewrittenasviralRNA,beginningatthe 5'terminus of the RSV genome. Theunderliningdenotes the siteof

bindingfortRNAT1p.Anopenreadingframe that may encodep19wasidentified as described in the text and in thelegendtoFig.4and usedtodeduceaproposedamino acid sequence for thep19domain of gag.

RNA synthesis both in vitro (W. DeLorbe,

personal communication) and in vivo (7; P.

Luciw,personalcommunication). Second, both subclones have been mapped extensively with restriction endonucleases (7; Fig. 1). The se-quence illustrated in Fig. 2 contains all of the

sites predicted by these previous analyses. Third,ourpresentsequenceof theU5domain is

in agreement with previous results obtained from viralDNAsynthesizedinvitro(15, 37, 40). Fourth, thesequence correctly locates and rep-resentsthepreviouslycharacterized binding site for tRNATrP (4, 10). Fifth, the viral DNA was

derived fromareplication-competent virus, and

the clonedDNAisinfectious (7). Sixth, wecan

deduce fromoursequence thepreviously

deter-mined amino acid sequence from the amino

terminus of the gag gene (27); the sequence

containsasingleopenreading frame inthe gag

region.

Locating gag proteins on the nucleotide se-quence. The initial product oftranslation from

gag appears to be Pr76gag (29, 45), a

76,000-dalton protein whose amino-terminal sequence

has been determined to be:

met-glu-ala-val-ile-lys-val-x-x-ala-x-lys (27). A virtually identical

on November 10, 2019 by guest

http://jvi.asm.org/

(5)

NUCLEOTIDE SEQUENCE OF RSV 539

1 7 39 41

r. 54 o 62

4. 82

'H 83

S

o 105

"I 116

a 119 X 123 o' 130

0

U) 198 o 199

z 225

240

278

321

Reading FramE

2 3

OP 372

OP 386

AUG 391

OP 407

OP 437

AUG 448

OP 456

OP 489

OP 558

AM 583

AM 613

AM 644

AUG 670

OP 778

AM 786

OP 812

OP 901

OP 962

FIG. 4. Analysis ofreading frar

initiation andtermination codons we of the threereading frames; frame first nucleotide at the 5' terminus of 1 3), frame 2 beginswith the second,a

with the third. AUG, Methionine-; codon; AM, amber termination co ochreterminationcodon (UAA); OP codon (UGA). The AUG codon N

initiates thegaggene inframe 3 is

sequence is encoded by nucleot

to 410 in RSV RNA(Fig. 3), the

cy being the presence of thr

between valandala rather than t

unexplained discrepancy, we p

virtualidentity suffices to locate

the gag gene on thenucleotide

ThecleavageofPr76gaggives

proteins whose order within

precursor isprobably: p19-p10-l

44; personal communication fr

The carboxy terminus ofp19 is

by atermination codon, but we

todeduce theapproximatelocat

nus to be position 177 by usi

findings ofE. Hunter(personal c

(i) On the basis of carboxypept

p19ends intyrosine. Giventhen

ofp19,thetyrosine residueatpi

or 183 might represent the ca

(Fig. 3). (ii) The tyrosine at resi

to lie withinplO (E. Hunter, per

cation). (iii) Carboxypeptidase

fromp19 and thenfails toprog

the protein (E. Hunter, persoi

tion). The sequence before the

due 155 should be fully suscepi

sis. The proline residue tha

tyrosine at position 177 would

releaseof thetyrosine (14), mak

the likely carboxy terminus of

identification of the carboxy

tern

the exactlocalization of plO await further

analy-1 2 3 sis of amino acid sequence in the isolated

pro-AUG teins.

oc Ribosome

binding

and theinitiationof

transla-OP tion from

gag.

Translation from the mRNA's of

oc

eucaryotic

organisms

generally

initiates

only

in

OC the

vicinity

ofthe 5' terminusofthe RNA

(M.

AUG Kozak, in A. Shatkin, ed., Current Topics in

AUG Microbiology and Immunology, in press).

Sher-AUG man and Stewart

(36)

and Kozak

(20)

have

OP proposed that ribosomes inevitably bind at or

OP near the 5' end ofeucaryotic mRNA's and then

AUG "travel" to the first AUG downstream, where

AUG translation initiates. Previousfindingswith RSV

oc were in accord with these views: translation of

AUG the intact RSV genome initiates

only

at gag, the

genelocated closest to the 5' end of the genome

nes. All potential (29, 45); and a strong ribosome binding site has

-relocated ineach been identified that includes the first AUG

1begins with the downstream from the 5' end of RSV RNA (6).

thesequence

(F.ig

Paradoxically, however, the identified ribosome

otfenti beinitiation

binding

site doesnot

represent

thesiteof

initia-pton

(UAG);

oCn

tionfortranslationfromgag.This

paradox

was

',opaltermination apparent from a comparison of previous

analy-which we deduce sesof the nucleotide sequencein theU5domain

underlined. (15, 37) and the amino-terminal sequence of

Pr76rar

(27), and the paradox isfully manifest in

ourpresent data. Theinitiation codon for gag is

ideresidues 372 preceded by three AUG codons, eachofwhich

only discrepan- is followed shortly and in frame by one or more

ee amino acids termination codons (Fig.4). Ittherefore appears

wo.Despite this that the site to which ribosomesinitiallybind in

resume that the vitro at the 5' end of theRSV genomedoes not

thebeginning of of itself dictate theposition at whichtranslation

sequence. starts. These findings add another example to

risetofiveviral thegrowing list of eucaryotic mRNA's which do

the polyprotein not initiate translation at the first methionine

p27-p12-p15 (35, codon downstream from the 5' end ofthe RNA

om E. Hunter). (26; Kozak, in press).

notdemarcated Kozak has recently compiled the nucleotide

have been able sequences that adjoin initiation codons in

eu-Lionofthetermi- caryotic mRNA's and has found that adenosine

ing unpublished usually occurs in the third position upstream

Iommunication). from the AUG (83% ofavailable examples), and

idase treatment, guanosine occurs at the first position

down-nolecularweight stream from the AUG (63% of the available

osition155, 177, examples) (Kozak, inpress). The significance of

Lrboxy terminus

idue 183 appears rsonal

communi-cleavestyrosine

ress further into

nal

communica-tyrosineat

resi-tible to

hydroly-Lt precedes the

allow only the

ingthistyrosine

rp19. Definitive

minusofp19and

372 389

AUG GAA GCC GUC AUU AAG GUG AUU 5' end of gag

AAGUAG splice donor

CAG

GUA AGU consensus sequence

FIG. 5. Identification of a potential splice donor site inRSVRNA. Data described in theaccompanying manuscript (13) locate a splice donor site in the vicinity of residue 390 of the RSV genome. The sequences adjoining this position are here aligned withthe"consensus sequence" for splice donor sites, compiled from a large number of viral and cellular mRNA's(22).

VOL.41, 1982

on November 10, 2019 by guest

http://jvi.asm.org/

[image:5.490.46.241.61.238.2]
(6)

these findings is not known, but the same

fea-tures do occur in the sequences adjoining the

initiation codon for gag of RSV (Fig. 3). One

possibledistinctionbetween theAUG codons in

the 5'noncoding region of RSV and the initiation

codon of gag is that theformer are preceded by

a U at position -3 from the AUG. Of the 56

cellular mRNA sequences reviewed by Kozak (in press), none has a pyrimidine at -3 from the

initiationcodon. Twoof these RNAs have AUG

codons in the 5' noncodingregion;in each case

theAUG ispreceded by a C at position -3 from

the AUG. The presenceof a purine or a

pyrimi-dine at position -3from an AUG codon may be

one of thesignals acellnormally uses to

distin-guish between correct and incorrect initiation sites. Further examples are needed to clarify this correlation.

Splicing in the genesis of RSV mRNA's. The subgenomic mRNA's of RSV are formed by

splicingashortnucleotide sequence from the 5'

endoftheviral genome toatleasttwopositions

within the genome (5, 20a, 25, 38, 46). In the

accompanying manuscript(13), wehave located

the "donor" sitefor splicing approximately 18

nucleotides downstream from the beginning of

thegag gene.Thispositioniscontained withina

nucleotidesequence thatdisplaysclose

similar-ity to the previously enunciated "consensus

sequence" for splice donor sites in eucaryotic

mRNA's (Fig. 5). According tothe consensus

sequence, the position of the splice would be

located between residues 389 and 390 of RSV

RNA (Fig. 5), a deduction which is in exact

accordwiththe resultsobtained by mappingthe

splice donor site with S1 nuclease(13) andwhich

adds credence to those results.

If we havecorrectly located thesplice donor

site in RSVRNA, theinitiation codon for gag is

spliced onto the subgenomic mRNA's of the

virus. Mightthis codon be used for initiation of

translation in its transposed positions? The

questioncannotbe answered withassurance as

yet because the splice acceptor sitesin theenv

and src subgenomic mRNA's have not been

located,although indirect evidence suggeststhat

theinitiationcodon for the srccoding regionlies

outside of the spliced leader sequence (13). In

contrast toRSV,thesplicedonor site for theenv

mRNAof mouse mammarytumor virusappears

tolieupstream from the startofgag;there are

no AUG codons in the leader sequence (J.

Majors, personalcommunication).

Mapping 5' noncoding regions of avian

retro-viruses.Thesequence analysisof the 5' terminus

of RSVhasprovidedapotentiallyusefulbattery

ofrestrictionenzyme sitesthatcouldbeapplied

to rapidly mapping other avian retrovirus

iso-lates.Inparticular,a BstEIIsite(position 106)is

present within the sequence of the tRNATrp

binding site [labeled (-)PB in Fig. 1], and a

BamHI site (position 525) and an XhoI site

(position 625) are present within the coding

region of p19 (Fig. 1 and 2). All three sites have

been documentedby nucleotide sequence

analy-sis in avianerythroblastosis virus (M.Privalsky,

personalcommunication), and they map to

anal-ogous positions in the endogenous viral locus,

evl(16). The BamHI and XhoI sites are present

inthe DNA of RSV Prague strain, RAV-O, and

MC29 viruses (32, 33, 43). These three

restric-tion enzyme sites willprobably be reliable

mark-ers for quickly identifying the position of the

tRNATrP binding site and mapping the start of

the gag sequences in otherisolates.

ACKNOWLEDGMENTS

We thank Bill DeLorbe and Richard Parker for making clonedRSV DNAs available and JohnMajors, Martin Pri-valsky, DennisSchwartz, and Eric Hunter for communicating results before publication. We also thank Bertha Cook forhelp in preparing the manuscript. The computer analysisof the sequence wasdone with theassistance of Peter Czernilofsky and Hugo Martinez.

R.S. was supportedbyPublic Health Service Training grant 1T32 CA 09043 from the National Institutes of Health. The research wassupported by Public Health Service grants CA 12705and CA 19287 from the National Institutes of Health and

by American Cancer Society grant MV48G to J.M.B. and H.E.V.

LITERATURECITED

1. Bishop, J. M.1978.Retroviruses. Annu. Rev. Biochem. 47:35-88.

2. Blattner, F. R., A. E. Blechl, K.Kenniston-Thompson, H. E.Faber, J. E.Richards, J. L. Sightom, P. W. Tucker, and 0. Smithies.1978.Cloninghuman fetalglobinand mouse a-typeglobin DNA: Preparation and screening of shotgun preparations. Science 202:1279-1284.

3.BoHvar,F., andK.Backman.1979. Plasmids of Escheri-chia coli asCloningvectors.MethodsEnzymol. 68:245-266.

3a. Coffin,J. M.1979.Structure,replication, and recombi-nation of retrovirusgenomes: someunifyinghypotheses. J. Gen. Virol. 42:1-26.

4. Cordell,B., E.Stavnezer, R. Friedrich,J.M.Bishop, and H. M. Goodman. 1976. The nucleotide sequence that binds primer forDNA synthesis to the avian sarcoma

virusgenome. J.Virol. 19:548-558.

5. Cordell,B., S. R. Weiss, H. E.Varmuw,and J. M. Bishop. 1978. At least 104nucleotidesaretransposed from the 5' terminus of theavian sarcoma virus genome to the 5' terminiofsmallerviral mRNAs. Cell15:79-91.

6. Darlix,J.-L., P.-F. Spahr, P. A.Bromley, and J.-C. Jaton. 1979. Invitro,the major ribosome binding site on Rous

sarcoma virus RNA does not contain the nucleotide sequencecoding for the N-terminal amino acids of the gag gene product. J. Virol. 29:597-611.

7. DeLorbe, W. J., P. A. Luciw, H. M. Goodman, H. E. Varmus, andJ.M.Bishop. 1980. Molecular cloning and characterization of avian sarcoma virus circular DNA molecules. J. Virol.36:50-61.

8. Donoghue, D. J.,E. Rothenberg, N.Hopkins, D. Balti-more, and P. A.Sharp. 1978. Heteroduplex analysis of the nonhomology region between Moloney MuLV and the dual host rangederivative HIX virus.Cell14:959-970. 9. Donoghue, D. J., P. A. Sharp, and R. A. Weinberg. 1979.

An MSV-specific subgenomic mRNA in MSV

trans-formed G8-124 cells.Cell17:53-64.

10. Eiden,J.J.,K.Quade, andJ.L.Nichols. 1976. Interaction

on November 10, 2019 by guest

http://jvi.asm.org/

(7)

NUCLEOTIDE SEQUENCE OF RSV 541

oftryptophantransfer RNAwith Rous sarcoma virus 35S RNA. Nature(London)259:245-247.

11. FaBer,D.V.,J. Rommelaere, and N. Hopkins. 1978. Large

Tioligonucleotidesof Moloney leukemia virus missing in anenvgenerecombinant,HIX,arepresent on an intracel-lular 21SMoloneyvirus RNA species. Proc. Natl. Acad. Sci.U.S.A.75:2964-2968.

12. Furuichi, Y., A. J. Shatkin, E. Stavnezer, andJ. M.

Bishop. 1975. Blocked, methylated 5' terminal sequence in avian sarcomavirusRNA. Nature(London) 257:618-620.

13. Hackett,P.H.,R.Swanstrom, H. E.Varmus, and J. M.

BIhop. 1982. The leader sequence of the subgenomic

mRNA's ofRous sarcoma virus is approximately 390

nucleotides. J.Virol.41:527-534.

14. Hartsuck, J.A., and W. N. Lipscome. 1971. Carboxypepti-dase A. In P. D.Boyer (ed.),Theenzymes, vol. 3, p. 1-56. AcademicPress, Inc.,New York.

15. Haseltine, W.A., A. M. Maxam, and W. Gilbert. 1977. Roussarcomavirus genome isterminally redundant: the 5'sequence. Proc. Natl. Acad. Sci. U.S.A.74:989-993. 16.Hlishinuma,F.,P.J. DeBona, S.Astrin, and A. M. Skalka.

1981.Nucleotidesequenceof acceptor site and termini of

integrated avian endogenous provirus evl: integration createsas6bprepeatof host DNA.Cell23:155-164. 17. Hsu,T.W., J.L.Sabran, G. E. Mark, R. V.Guntaka, and

J. M.Taylor.1978.Analysisofunintegrated avian RNA

tumorvirus double-strandedDNAintermediates.J. Virol.

28:810-818.

18.Ju, G., and A. M. Skalka. 1980. Nucleotide sequence

analysis of the long terminal repeat (LTR) of avian

retroviruses: structuralsimilaritieswithtransposable ele-ments.Cell22:376-386.

19. Keith,J.,and H.Fraenkel-Conrat. 1975. Identificationof the 5'end of Rous sarcomavirus RNA. Proc. Natl. Acad. Sci. U.S.A. 72:3347-3350.

20. Kozak, M. 1978. How do eucaryotic ribosomes select initiationregionsin messenger RNA? Cell15:1109-1123. 20a. Krzyzek,R.A.,M.S.Collett,A.F.Lau, M. L. Perdue, J.

P. Leis, and A.J. Faras.1978. Evidence for splicingof aviansarcomavirus5'-terminalgenomicsequences onto

viral-specific RNA in infected cells. Proc. Natl. Acad.

Sci. U.S.A.75:1284-1288.

21. Leder, P.,D.Tiemeler,andL.Enquist. 1977. EK2

deriva-tives ofbacteriophage lambda useful in the cloning of DNA fromhigher organisms: theXgtWES system. Sci-ence196:175-177.

22. Lerner,M.R., J.A.Boyle,S. M.Mount, S. L.Wolin,and

J. A. Steitz. 1980. Are snRNP's involvedin splicing?

Nature(London)283:220-224.

23. Linial, M., E. Medeiros,and W.S. Hayward. 1978. An avianoncovirusmutant(SE21Q1b) deficientingenomic

RNA: biological and biochemical characterization. Cell 15:1371-1381.

24. Maxam, A.,and W.Gilbert. 1980.Sequencingend-labeled DNA with base-specific chemical cleavages. Methods

Enzymol.65:499-560.

25. Mellon,P.,and P. H.Duesberg.1977.Subgenomic, cellu-lar Rous sarcomavirus RNAs contain oligonucleotides

from the 3' half and the 5' terminus of virion RNA. Nature

(London)270:631-634.

26. Mulligan, R. C., andP. Berg. 1981. Factorsgoverning expressionofabacterialgene in mammaliancells. Mol.

Cell. Biol.1:449-459.

27. Palmiter,R.D., J. Gagnon,V.M.Vogt, S.Ripley, and R.

N. Elsenman. 1978. The NH2-terminal sequence of the avian oncovirusgagprecursorpolyprotein(Pr769af).

Vi-rology91:423-433.

28. Panet,A., M.Gorecki, S. Bratosin, and Y.Aloni. 1978. Electron microscopic evidencefor splicing ofMoloney murine leukemia virus RNAs.NucleicAcids Res.

5:3219-3236.

29. Pawson, T.,G. S.Martin,andA. E.Smith.1976.Cell-free

translation of virionRNAfromnondefective and transfor-mation-defective Roussarcomaviruses. J.Virol.

19:950-957.

30. Rothenberg, E., D. J. Donoghue,andD. Baltimore.1978.

Analysisof a 5' leader sequenceonmurineleukemiavirus 21S RNA:Heteroduplexmappingwithlongreverse tran-scriptase products.Cell 13:435-451.

31. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA

sequencingwithchain-terminating inhibitors. Proc. Natl. Acad. Sci.U.S.A. 74:5463-5467.

32. Shank,P.R., S. H. Hughes, H.-J. Kung, J.E.Majors,N. Quintrell, R. V. Guntaka, J. M. Bishop, and H. E. Varmus.1978.Mapping unintegrated avian sarcoma virus DNA: terminiof linearDNAbear 300nucleotidespresent

once or twice in two species of circular DNA. Cell 15:1383-1395.

33. Shank, P. R., S. H. Hughes, and H. E. Varmus. 1981. Restrictionendonucleasemapping ofthe DNA of Rous-associatedvirus 0 reveals extensive homology in

struc-tureand sequencewith aviansarcomavirusDNA.

Virolo-gy 108:177-188.

34. Shank, P. R., and M. Linial. 1980. Avian oncornavirus

mutant(SE21Q1b) deficient ingenomic RNA: character-ization ofadeletion in theprovirus. J. Virol.36:450-456.

35. Shealy, D. J., A. G. Mosser, and R. R. Rueckert. 1980. Novelp19-related protein inRous-associated virus type 61:implications for aviangaggene order. J. Virol. 34:431-437.

36. Sherman, F., and J. W. Stewart. 1975. The useof iso-1-cytochrome c mutantsofyeastfor elucidatingthe

nucleo-tidesequences thatgoverninitiation oftranslation,p. 175-191. InG. Bernardiand F. Gros (ed.), Organization and

expressionof theeukaryotic genome:biochemical

mecha-nism of differentiation in prokaryotes and eukaryotes,

Proceedings ofthe 10thFEBS meeting. American Else-vier, New York.

37. Shine, J., A. P. Czernllofsky, R. Friedrich, H. M. Good-man, and J.M.Bishop.1977. Nucleotide sequenceatthe 5' terminus of the avian sarcoma virus genome. Proc.

Natl.Acad. Sci. U.S.A. 74:1473-1477.

38. Stoltzfus, C. M. and L. K. Kuhnert. 1979.Evidence for the

identityof shared 5'-terminal sequences between genome RNA and subgenomic mRNA's of B77 avian sarcoma

virus. J. Virol. 32:536-545.

39. Swanstrom, R., W. J. DeLorbe, J. M. Bishop, and H. E. Varmus. 1981. Sequencing of cloned unintegrated avian sarcoma virus DNA: Viral DNA contains direct and inverted repeats similar to those in transposableelements. Proc. Natl.Acad. Sci. U.S.A. 78:124-128.

40. Swanstrom, R., H. E. Varmus, and J.M. Bishop. 1981. The terminal redundancyoftheretrovirus genome facili-tates chain elongation by reverse transcriptase. J. Biol.

Chem. 256:1115-1121.

41. Taylor, J. M. 1977. An analysis of the role of tRNA

speciesasprimers forthetranscriptioninto DNA of RNA tumor virus genomes. Biochim. Biophys. Acta 473:57-71. 42. Varmus, H. E., S.Heasley, H.-J. Kung, H. Oppermann, V. C. Smith, J. M.Bishop,and P. R. Shank. 1978.Kinetics of synthesis, structure and purification of avian sarcoma virus-specific DNA made in the cytoplasm of acutely infectedcells.J. Mol. Biol. 120:55-82.

43. Vennstrom,B., C.Moscovici,H. M.Goodman, and J. M. Bishop. 1981. Molecular cloning of the avian myelocyto-matosis virus genome, and recovery of infectious virus by transfection of chicken cells. J. Virol. 39:625-631. 44. Vogt, V. M., R. Eisenman, and H. Dlggelmann. 1975.

Generationof avian myeloblastosisvirusstructural pro-teins by proteolytic cleavage of a precursor polypeptide. J. Mol. Biol.96:471-493.

45. vonderHelm,K., and P. H.Duesberg.1975. Translation of Rous sarcoma virus RNA in cell free systems from Ascites Krebs II cells. Proc. Natl. Acad. Sci. U.S.A. 72:614-618.

46.Weiss,S. R., H. E. Varmus, and J. M. Bishop. 1977. The size and genetic composition of virus-specific RNAs in the cytoplasm of cells producing avian sarcoma-leukosis vi-ruses. Cell 12:983-992.

VOL.41,1982

on November 10, 2019 by guest

http://jvi.asm.org/

Figure

FIG.molecular
FIG. 2.genomerestrictiondescribed Nucleotide sequence of cloned viral DNA. The sequence is numbered from the 5' end of the viral and depicts the same polarity as the genome ("positive strand")
FIG. Analysisnes.initiation 4. of reading frar and termination codons we-re

References

Related documents