• No results found

Evolution of the nucleoprotein gene of influenza A virus.

N/A
N/A
Protected

Academic year: 2019

Share "Evolution of the nucleoprotein gene of influenza A virus."

Copied!
11
0
0

Loading.... (view fulltext now)

Full text

(1)

0022-538X/90/041487-11$02.00/0

Copyright © 1990,American Society for Microbiology

Evolution of the Nucleoprotein Gene of Influenza

A

Virus

OWENT. GORMAN,* WILLIAM J. BEAN, YOSHIHIRO KAWAOKA,AND ROBERTG. WEBSTER Department ofVirology and Molecular Biology, St. Jude Children'sResearch Hospital, 332 North Lauderdale,

P.O. Box318, Memphis, Tennessee38101 Received13 September1989/Accepted 19 December 1989

Nucleotidesequencesof 24 nucleoprotein(NP)genesisolated fromawiderangeof hosts, geographic regions, and influenzaAvirus serotypes and 18published NPgene sequences wereanalyzedtodetermineevolutionary relationships. The phylogeny of NP genes was determined by a maximum-parsimony analysis of nucleotide

sequences. Phylogenetic analysis showed that NPgeneshaveevolved into five host-specific lineages, including (i) Equine/Prague/56 (EQPR56), (ii) recent equine strains, (iii) classic swine (HlNl swine, e.g., A/Swine/

Iowa/15/30) and human strains, (iv) gull H13 viruses, and (v) avian strains (including North American, Australian, and Old World subgroups). These NP lineages match the five RNA hybridizationgroupsidentified

by W. J. Bean (Virology 133:438 442, 1984). Maximum nucleotide differencesamongthe NPswas18.5%, but maximum amino acid differences reached only 10.8%, reflecting theconservative natureof the NP protein. Evolutionary ratesvariedamonglineages; the human lineage showed the highestrate(2.54 nucleotide changes peryear),followed by the Old World avian lineage (2.17 changesperyear) and therecentequine lineage (1.22 changesperyear). The per-nucleotideratesof human and avian NPgeneevolution (1.62x 1o-3to1.39x 10-3 changes peryear) are lower than thatreported for human NS genes (2.0 x 10-3 changes per year; D. A. Buonagurio, S. Nakada, J. D. Parvin, M. Krystal, P. Palese, and W. M. Fitch, Science 232:980-982, 1986). Ofthe five NP lineages, the human lineage showed thegreatestevolutionatthe amino acid level;over aperiod

of 50 years, human NPs have accumulated 39 amino acid changes. In contrast, the avian lineage showed remarkableconservatism; over the same period, avian NP proteins changed by 0to 10 amino acids. The specificity of the H13 NP in gulls and its distinct evolutionary separation from the classic avian lineagesuggests

thatH13NPsmayhavealarge degree of adaptationtogulls. Thepresence of avian and human NPs insome

swineisolates demonstrates the susceptibility of swinetodifferent virusstrains andsupportsthehypothesis that swinemay serve asintermediatesfor the introduction of avian influenza virusgenesinto the human virusgene

pool. EQPR56 is relatively distantly related toall other NPlineages, which suggeststhat this NP is rooted closesttothe ancestorof all contemporary NPs.On the basisof estimation of evolutionaryratesfromnucleotide branch distances, current NPlineages are at least 100 yearsold, and the EQPR56 NP is much older. The ubiquitous nature of hemagglutinin and neuraminidase gene subtypes in the avian gene pool has been interpretedasevidence ofanavian origin for influenza A virus. Our resultsareconsistent with this hypothesis.

We found that(i) older nonavian NPgenes are moreavianlike thanrecentNPs, (ii)recentNPgenesin the five lineagesare moredistant from NPs in other lineages than from themostprimitive avian NPs, (iii) nonavian NPs sharemoreprimitive avian characters than derived characters, and (iv) Old World avian NP proteinsappear

tobe inevolutionary stasis while nonavian NPsareevolving rapidlyawayfromavianlike ancestral NPproteins. These results suggestindependent evolution of NPs from anavianancestor.

InfluenzaAviruses infectavarietyof avian and mamma-lian hosts, including humans, pigs, horses, poultry, wild waterfowl, shorebirds, whales, and seals (for a review, see Hinshaw and Webster [11]). Migratory waterfowl and hu-mans have been recognized as major host reservoirs for influenza viruses (11). Recently, Kawaoka et al. (14) pre-sented evidence suggesting that shorebirds and gulls com-prise anadditional genepoolfor influenza viruses.

The results from a number of studies suggest that the nucleoprotein(NP) gene may be involved in the maintenance ofhost-specificgenepools.Scholtisseketal.(24), Tianetal. (28), and Snyderetal. (26) have implicatedthe NP gene in determining host range by showing that reassortants with

experimentallysubstituted NP genes have reducedviability. This suggests that NPsareadaptedto specific hosts. Schol-tissek et al. (25) have shown by RNA hybridization that human NP genescanbereadily distinguishedfrom avian and equine NPs. Also using RNA hybridization techniques, Bean (1) identified five host-specific NP groups: equine I

* Correspondingauthor.

(Equine/Prague/56

[EQPR56]), equine II (recent equine), swineplus human, H13 gull, and avian. On the basisofsix nucleotide sequences, Buckler-White and Murphy (4) iden-tified human and avian NPsasdistinct classes. Gammelinet al. (9) examined 16 NP gene sequences and supported the divisionofNPsinto human and avian classes. Collectively, these studies suggest that NPs can be divided into a mini-mumoftwohost-specific classes.

Because the NP gene codes for an internal

protein,

it is probably not subjected to strong selection pressure by the hostimmunesystem.

Instead,

evolutioninthe NP gene may reflect

host-specific

adaptation.

The

potential

host-specific

nature ofthe NP gene makes it a good candidate for the

investigation of evolutionary relationships of influenza A viruses. Inthestudyreported here,wepresent theresults of an

analysis

of24influenzaAvirusNPgene sequencesand 18 other NP sequences available from the literature. These NPs representawidevariety of influenza virus serotypes,

hosts,

and geographic regions. From our

analysis

we propose evolutionary

relationships

among NPgenesand address the issueofhost

specificity.

1487

on November 10, 2019 by guest

http://jvi.asm.org/

(2)
[image:2.612.62.556.89.489.2]

TABLE 1. Influenza virus strains used in phylogeneticanalyses

Strain Abbreviation Sourceorreference

A/Wilson-Smith/33 (HlNl) A/Swine/Iowa/15/30(HlNl) A/Swine/Tennessee/24/77 (HlNl) A/Swine/Netherlands/12/85(HlNl) A/Equine/London/1416/73 (H7N7) A/Equine/Kentucky/2/86 (H3N8) A/Equine/Tennessee/5/86 (H3N8) A/Gull/Astrakhan/227/84 (H13N6) A/Gull/Maryland/704/77 (H13N6) A/Gull/Maryland/1824/78 (H13N9) A/Gull/Maryland/1815/79(H13N6) A/Gull/Minnnesota/1845/80 (H13N6) A/Gull/Massachusetts/26/80 (H13N6) A/Duck/New Zealand/31/76 (H4N6) A/Grey Teal/Australia/2/79 (H4N4) A/Duck/Czeckoslovakia/56 (H4N6) A/Duck/Ukraine/2/60 (HllN8) A/Budgerigar/Hokkaido/1/77 (H4N6) A/Ruddy Turnstone/NJ/47/85 (H4N6) A/Whale/Maine/328/84 (H13N2) A/Chicken/Pennsylvania/1/83(H5N2) A/Turkey/Minnesota/833/80 (H4N2) A/Mallard/Astrakhan/244/82 (H?N6) A/Tern/South Africa/61 (H5N3) A/FPV/Rostock/34/Giessen(H7N1) A/PR/8/34(HlNl)

A/NT/60/68(H3N2) A/Hong Kong/5/83 (H3N2) A/Udorn/307/72(H3N2) A/Mallard/NY/6750/78(H2N2) A/Chicken/Germany N/49(H1ON7) A/Mink/Sweden/84 (H1ON4) A/Parrot/Ulster/73 (H7N1) A/Duck/Hong Kong/7/75(H3N2) A/Duck/Bavaria/2/77 (HlNl) A/Swine/1976/31 (HlNl)

A/Swine/Hong Kong/127/82(H3N2) A/Swine/Hong Kong/6/76(H3N2) A/Swine/Germany/2/81 (HlNl) A/Equine/Miami/63 (H3N8) A/Equine/Prague/1/56 (H7N7) B/Lee/40

WS33 SWIA30 SWTN77 SWNED85 EQLON73 EQKY86 EQTN86 GULAST84 GULMD77 GULMD78 GULMD79 GULMN80 GULMA80 DKNZ76 GTAUS79 DKCZ56 DKUK60 BDGRHO77 RTNJ85 WHALEM84 CKPENN83 TYMN80 MLDAST82 TERNSA61 FPV34 PR8-34 NT60-68 HK83 UDORN72 MLRDNY78 CKGER49 MINKSW84 PARU73 DKHK75 DKBAV77 SW31 SWHK82 SWHK76 SWGER81 EQMI63 EQPR56 B-Lee4O

This report This report This report This report This report This report This report This report This report This report This report This report This report This report This report This report Thisreport This report This report This report This report Thisreport This report This report

Mandler and Scholtissek (19) Winter and Fields (29) Huddleston and Brownlee (12) Gammelin etal. (9)

Buckler-White andMurphy (4) Buckler-White andMurphy(4) Reinhardt and Scholtissek(22) Reinhardt and Scholtissek (22) Steuleretal. (27)

Gammelin et al.(9) Gammelinetal.(9) Gammelinetal.(9) Gammelin et al.(9) Gammelinetal.(9) Gammelinetal.(9) Gammelinetal.(9) Gammelinetal.(9) Briedis and Tobin(3)

MATERIALS AND METHODS

Virus strains. A total of24 viral isolates (Table 1) were selectedfrom therepositoryatSt.JudeChildren'sResearch Hospital, Memphis, Tenn. The isolates were selected to represent a wide spectrum of geographic locations, hosts, and datesof isolation to complement18genesequencesfrom theliterature and data bank sources (Table 1).

Molecularcloning ofthe NPgenes. Viruses weregrownin 11-day-old embryonated eggs and RNA was extracted as described by Bean et al. (2). Full-length segment 5 genes werecloned by the procedure of Jones et al. (13) and Winter etal. (30)for other influenza A virus genes. To summarize, a 12-mer oligodeoxynucleotide primer (5' AGCAAAAGC AGG) was phosphorylated with T4 polynucleotide kinase and first-strand cDNA was synthesized from viral RNA template by using avian myeloblastosis virus reverse tran-scriptase. Second-strand synthesis was done with a phos-phorylated 13-meroligodeoxynucleotide primer (5' AGTAG AAACAAGC) and Escherichia coli DNA polymerase I (Klenow fragment). Full-length double-stranded cDNA was blunt-end ligated into the PvuII site of vector pATX (a derivative ofpAT 153, courtesy ofClaytonNaeve,

Molecu-lar Resource Center, St. Jude Children's Research Hospi-tal).

Nucleotide sequence determination. The nucleotide se-quencesofNPcDNAsweredeterminedby usingthe dideox-ynucleotide-chain termination method (23). Oligodeoxynu-cleotide primers (synthesized by the Molecular Resource Center) were annealed to double-stranded template DNA denaturedwith NaOHasdescribedbyChen andSeeburg(6) and extended with modified T7 DNA polymerase (Seque-nase; U.S. Biochemical, Cleveland, Ohio). The reaction products were separated on6% polyacrylamide-7 M urea 0.4-mmgels. Ambiguities were resolvedby sequencingthe complementarystrand ortheoriginalviral RNAwith avian myeloblastosis virus reverse transcriptase. For all virus strains, atleastone-third of theNPgenewas sequencedon both strands. Two or more clones were available and

se-quenced for 15 of the 24 virus isolates. The sequences of oligonucleotidesusedasprimersareavailable upon request. Sequence analysis. The Intelligenetics (Palo Alto, Calif.) software package was used for analysis and translation of nucleotide sequence data. The B/Lee/40 NP nucleotide

on November 10, 2019 by guest

http://jvi.asm.org/

(3)

EQPR56(H7N7)

1943 EQMI83 (H3NN) EOLON73(H7N7) - _

EQKY8O(H3N8) 1

1860- EQTN88 (H3N8)

1670? SWHK82(H3N2) -WTN77(H1N1)

1897? 8W1A30(HINI) 8W31(HINI)

W833 (HINI)

PRS-34(HINI) 1933 rNT6O48(H3N2)

140- 1960 K78(H3N2) U

1860? LJDORN72(H3N2)

1969it8 W83(H3N2)

GULAST84 (HU3NG) GIULMD77 (HU3NS)

GULMD78(H13N9)

.lLD79 (H13N6)

OULMNso(H13N8)

1U182? WHALEM84(Ht3N2)_(

12UUASO (Ht3N8)

MLRDNY78 (H2N2) RTNJ85 (H4N8)

TYMN80(H4N2)

1899? CKPENN83 (H5N2)

)0 , DKNZ78(H4N8)

GTAUS79(H4N4) Changes FPV34(H7N1)

925 -CKOER49(HtON7) s

DKCZ5 (H4N6)

DKUK8O(HIN8S)

TERNSA61(H5N3)

1"9 PARU73(H7N41)

1950 DKBAV77(HtNl)

BDGRHO77(H4NB)

_INKSW84(HION4)

1958 SWOERBi (HINI)

SWNED85(H1Nl)

1961 DKHK75 (H3N2)

19 MLDAST82(H?NO)

EOPR5S (H7N7)

E0M163 (H3N8) EQLON73(H7N7)

rEQKYS6(H3NS)

_LEOTN8 (H (HN)3

[image:3.612.67.558.81.364.2]

,NSWHK82(H3N2)

|SWTN77(HIN1)

SWLA30(H13N)14

8W31(H1N1) D7

_W3

HINI W833(H1Nl)

PR8-34(H1N1)

SNTSO-68((H3N2)

SWHK78(H3N2)

( WON7(H3N2) J

-G<ULAST84(HU3NG) HK83(H3N2) O-GLMD77(Ht3NS)

OULMD78 (H13N9)

OTAUS79(HI"1)79 1

G

(ULMNHN (HU3N A)

W ( ) WHALEM84(Ht3N2) _

GULMAO(H13N8)

MLRDNY78 (H2N2)

RTNJ8S(H4N8)

*TYMN80(H4N2)

CKPENN83(H5N2)

rDKNZ70 (144N8)

lGTAUS79(H4N4) 10

FPV34(H7Nl) AioAi

r DKCZ58 (H4N8)Amoci

DKUK30(HnN8)

TERN8A81(H5N3)_

PARU73(H7N1)

i

_

|.DKBAV77(HINI)

ODt3RHO77(114NG)

|MINKSW54(141044)

SWG3ER81(H1Nl)

|-C SWNED85(H1N1)i

DKHK75(H3N2)

1. DAST82(H7NS)

FIG. 1. Phylogenetictreesfor influenzaAvirus NPgenes. (Left) NucleotidetreerootedtoB/Lee/40NP.Sequencesof 42 NPgenes(24

from this study, 18 from published sources) were analyzed with PAUP (D. Swofford, Illinois Natural History Survey), which uses a

maximum-parsimony algorithm. The tree represents the first of six equal-length trees generated by PAUP with MULPARS, SWAP=

GLOBAL, and HOLD=10 options. The arrowindicates the direction of the B/Lee/40NPfrom the rootnode.With the exclusion of the B/Lee/40outgroup,the numberofvariable characters represented is 641, the totaltreelengthis 2,289 nucleotide changes,andthe consistency

index(proportion ofchanges duetoforward mutations) is 0.389. Horizontal distance is proportionaltothe minimumnumberof nucleotide

differencestojoinnodes and NPgene sequences. Vertical linesarefor spacing branches and labels.Datesfor hypotheticalancestornodes

werederived by dividing branch distance by evolutionary rate estimates(see Fig. 5). Question marks denote date estimates beyond the

immediateancestornode of the earliestisolates. (Right) Amino acidtree.Thetreewasgenerated by using the topology option ofPAUPto makepredicted amino acidsequencesconformtothetopology of the nucleotide tree(left). The number of variable charactersrepresented

is141,thetotaltreelengthis275amino acid changes, and consistency is0.680. Strain abbreviations usedin thesetreesarelisted in Table 1.

sequence(1,841 bases [3]) wasaligned with theinfluenzaA virus NPsbyuseof the Needleman-Wunsch pairwise

align-mentalgorithm. Alignmentwasaccomplished by eight dele-tions (276 bases) in the first half of thegene. Phylogenetic analysis ofsequence data was performed with the PAUP software package version 2.4.1 (David Swofford, Illinois Natural History Survey, Champaign, Ill.). PAUP employs the maximum-parsimony method to generate phylogenetic trees.Theshortest(mostparsimonious)treeswerefound by implementing the MULPARS, SWAP=GLOBAL, and HOLD=10options of PAUP.ThePRINTDoption of PAUP provided difference matrices forsequencedata. Treelength is measured in steps which are equivalent to nucleotide changes for nucleotide trees and amino acid changes for amino acid trees. The total tree length is the sum of all branchlengths.

RESULTS

Analysisofnucleotidesequences.Eachofthe 24clonedNP

genescomprised 1,565nucleotides andasingleopenreading

framewhich spanned positions 46 through 1539 and coded for a polypeptide of 498 amino acids. No insertions or deletionswereobserved inanyof thesequences.Nucleotide

sequencesof the 24cloned NPgenesarenotpresented here butareavailable fromGenBank(accessionnumbersM30746

to M30769).

Aphylogenetic analysis of41 influenzaA virus NP nucle-otide sequences(fromthe 24clonedgenes and 17published sequences)ispresentedas anevolutionarytreerootedtoan alignedB/Lee/40 sequence(Fig. 1, left). With the exclusion

of the B/Lee/40 sequence, PAUP found six equal-length treeswhichwere2,289steps(nucleotide changes) long. The next-shortest trees were 11 steps longer. The six equal-length trees varied in combinations oftwo possible place-mentsof GULMD79 and MINKSW84 with adjacent termi-nalbranches. We chose that tree (tree 1) witha branching orderfor GULMD79 and MINKSW84 that was most

con-gruentwith the sequence of isolation dates (Fig. 1, left). Whenconductingaphylogenetic analysis, it isdesirableto

use an outgroup (a closely related taxon that shares a common ancestor) to serve as a reference or root for the

groupof interest(inourcase,influenzaA virusNPs). Ifthe outgroup is closely related it can more accurately resolve differences betweenlineageswithin thegroup ofinterest. If theoutgroupistoodistantly related,itsvalueas areference is diminished. Initially, weincludedtwoalignedinfluenzaB

4-10

Nuclootide

on November 10, 2019 by guest

http://jvi.asm.org/

(4)

virusNPsequences (B/Lee/40 and B/Singapore/79 [18]) and analigned influenzaCvirus NPsequence(C/Calif/78 [20])as

outgroup sequences in ouranalysis. Of the three outgroup NPs, the NP ofB/Lee/40wasthe closesttoinfluenza A virus NPs andwaschosenastheoutgroup.The influenza C virus NPwaseliminated fromconsideration because itwasmuch moredistantfrominfluenzaA virus NPs.Structural similar-itiesfound between influenza A and B virus hemagglutinins (17) suggest a close evolutionary relationship between the twoviruses andsupportthechoice ofinfluenza B virus NP astheoutgroup. The NP of B/Lee/40 is separated fromthe rootof influenza A virus NPs bya minimum of 965 nucleo-tide changes (this does not include the eight deletions totaling 276nucleotides made in the B/Lee/40NP sequence

toalign it with influenzaAvirus NPs). At1,053steps,the NP ofEQPR56 is the closesttothat ofB/Lee/40, and the next closestNP, that ofMLRDNY78, is 1,191 steps away.

The phylogenetictree showsthat NPgeneshave evolved

intofivedivergent, host-specific lineages rootedatthedeep forks ofthe tree (Fig. 1, left). The first lineage is that of EQPR56 NP, which forms a sister group to all other influ-enza A virus NPs. The relative closeness to the B/Lee/40 root and the distance to other NPs indicates that EQPR56 has the mostprimitive NP;theclosestNPs inotherlineages are 314 to 343 steps away, and contemporary avian and human NPsare morethan 400stepsaway. Thenextfork in the tree represents a separation of H13 gull and avian lineages fromrecent equine and swine plushumanlineages; theclosest NPs between thetwosides of thetree (EQMI63 and MLRDNY78) are separated by 311 steps. The closest NPs between the equine and swine plus human lineages (EQMI63 and SW31)areseparated by 270steps.The NPsof the H13 gull lineage are distinctfrom the other avian NPs; theclosest ofthese(GULAST84andMLRDNY78) is

sepa-rated by 244 steps. With the exception of one NP derived fromawhale isolate, all ofourH13 NPswere derived from gullisolates.

The recent equine NP lineage contains examples of vi-ruses with H3N8and H7N serotypes. Although the recent equine EQLON73 strainshares thesameserotype(H7N7)as theEQPR56 strain, theirNPsarequitedistinct. Bean(1)has suggested that recent equine viruses of the H7N7 subtype maybereassortantsthat havereplacedtheEQPR56NPwith therecentequineNP. The failuretoisolateviruses contain-ingEQPR56-like NPs since 1956 suggeststhat this lineage

maybeextinct.

Theunion ofclassicHlNlswine and human NPlineages suggests that the human NPs are derived from a swine

ancestor or atleast thatthe two lineages share a common ancestor. Inthe classic swine lineage, SWHK82 is a reas-sortant containing avian or early human H3N2 surface

proteingenes(e.g., Aichi/68 [16])andaclassicHlNlswine NP. In the humanlineage the swine isolate SWHK76 is a typical H3N2 human virus, but thetreesuggests that itsNP was derived from humanNPs of the 1960s.

The avian lineage is a heterogeneous group; NPs were isolatedfrom wildand domesticducks, shorebirds, turkeys, chickens,parrots,budgerigars, mink,andswine.Tosimplify discussionwewillhereafter refertothis lineageasavianand

to the others (including H13 gull) as nonavian. Within the avian lineage, readily identifiable subgroups aredefined by geographic region: the North American lineage, which is closest to the ancestral avian NP (e.g., MLRDNY78); the Australia-NewZealand lineage (e.g., DKNZ76); andtheOld Worldlineage, with FPV34as itsmost primitive member.

OverallNPs,themaximumabsolute number of nucleotide

differences were found between the recent (most

derived)

NPs ofthe swine plus human, recent

equine,

and H13

gull

lineages (257to289differencesor16.4to

18.5%; Fig.

2).The oldest NPsofthese lineageswerelessdivergent (231 to252 differencesor14.8to

16.1%)

andweremoresimilartoolder,

more primitive avian NPs, e.g., FPV34 (197 to 222

differ-ences or 12.6to

14.2%).

In contrast to absolute

differences,

branch distances shown in the

phylogenetic

tree

(Fig.

1,

left)

account for reversemutations and thusprovidemoreaccurateestimates of

evolutionary

distance. It is

readily

apparentthat themost recentNPsattheterminal branchesof nonavian

lineages

are themost

divergent

NPs(theyare

separated

fromeach other byadistanceof416to535steps)but allareclosertotheroot

oftheavian

lineage

(263to382 steps;rootdated 1899 in

Fig.

1, left). Similarly, the oldest NPs ofthe nonavian

lineages

are closer tothe avian root(201 to 249 steps) than to each other(270to 376 steps).

Overall,

nonavianNPs share more

primitive(avian)characters thanderived

(nonavian)

charac-ters. This patternof

relationship

canbe

explained by

diver-gent evolutionofnonavianNP genesfrom a commonavian

origin.

The same type ofpattern appears within the avian lineage; e.g., the most

divergent

avian NPs, CKPENN83 and

BDGRHO77,

are

separated by

abranch distance of 311 stepsbuttheyare

only

111and 200 steps,

respectively,

from the avian root.

Relationships

among avian NP branches suggest evolution from acommon ancestor.

A low

consistency

index of 0.389 and the much

higher

number of nucleotide

changes

predicted by

the

phylogenetic

tree in

comparison

with the absolute differences

(Fig.

1 versus Fig. 2) indicates thatreverse mutationsare common in the evolution ofthe NPgene

(the

consistency

index isa measure of the

proportion

of nucleotide

changes

due to

forward mutations in

phylogenetic trees).

These reversals are

probably

relatedto constraints onNPamino

acids;

that is, silent mutations at the second and third

positions

of nucleotide codonsare saturated insome

lineages

and rever-salsareselected for inordertomaintain ancestral amino acid sequences.

Analysis ofamino acidsequences.Toevaluate the

product

of nucleotide evolution and the

host-specific

nature ofNP genes, we translated the open

reading

frames of the 41 influenzaAvirusnucleotidesequences.Theabsolute

differ-encesin deduced amino acid sequencesweremuchless than

expected,

given

the

large

number ofnucleotide

differences,

which indicates that theNP

protein

is

conserved,

especially

withinavian NPs

(Fig. 2).

Withinmammalian

lineages,

NPs show four toeight nucleotide

changes

for each amino acid changebetween

isolates,

but in the H13

lineage

therangeis 4to15

changes

and in the avian

lineage

the range is6to>40 changes. The most

divergent

amino acid sequences arethe human HK83 and H13 GULMN80 NPs

(54

differences or

10.8%).

Thepattern of differencesis the same asthe nucle-otide

data;

thegreatestdifferences occuramongmammalian and H13gull

lineages,

and theoldest NPsare moresimilarto avian NPs.

Since amino acid evolution is determined

by

nucleotide

evolution,

weused the

topology

option

of PAUPtogenerate anamino acid sequencetreethatconforms tothe

branching

patternofthe nucleotide tree (Fig. 1,

right).

This

approach

allows direct

comparison

of the nucleotide and amino acid treesfor differences in the effect of

genetic

changesamong the lineages. The tree shows that at the

protein

level the variousNPlineages aredivergent and evolvingat different rates. The avian

lineage

is the most

primitive;

several avian NPs(TYMN80, DKBV77, MLDAST82)are

separated

from

on November 10, 2019 by guest

http://jvi.asm.org/

(5)

EP EQ SW

11 2 3 4

_5

6 7 8

_9

DIFFERENCES INNUCLEOTIDESEQUJENCES

HumI N13 NA ANZ OW

10 11 12 13 14 151 16 17 18 19 20 21 22123 24 25 26127 28129 30 31 32 33 34 35 36 37 38 39 40 41

2. EOPR56 0262267 272 272 258 264 232231260 251262 267270275256265267 269 261 272 277 238 238 234238222 225 218 219 226 223 213 219 234 214 221 208 220 218 212 4

EQLN3

351KY3862666

402024

22526

5320

0325

4 2927 6 5 552925 5 1 3 3

22292171

262322 2 2827 2 1 ,5 EQTN86 4022 5 0 02.68 264 250 249250 250 263 260 269 278 257 250 256 255 259 256260228224 225 234 226 236 237232233 219 224234242 240234 228 235 229 217 6 5WNK82 37 3436 4040 067 181 182207 209 227 233 235 248 273 272 267 267 269 270 270 251 239 244255 230 254 243 233 243 240 232 234 234 229 245 233 235 237 248 7 SWTN77 40 35 39 43 43 8 0 153 150 180 185 212 222 216 235 269 274 275 275 276 273 279 250244249256 237246249233 248246234249 248244252242242 245 248 8 SWIA30 32 25 29 33 33 24 25 0 27113 125 164 174 172 196 230 231 241 241 242 236 245 210 202 202 224 197 202 206 202 203 208 194 210 214 209 213 200 210 202 204 9 SW431 33 25 30 33 33 23 26 7 01J14 125 154 166 164 192 227 232 242 242 245 237 248 206 19920i219 202 204 206 199 201 209 196 213 216 210 213 200 M 20520-4

10 WS533 42 37 40 44 44 38 38 25 26 0 60 127 142136 165 252 252 257 257 260 253 265 217 213 212 236 214 227 222 213 227 224 210 221 229 221 222 215 224 215 219 11 PR834 41 41 44464638 3928 29 15 0127144 139 166 262 251 262 262 265 260 274 225 227 221 245 216 229 220 213 225 223 208 221 232 222 221 211 220 207 217 12 NT60-68 48 45 45 47474544441 42 29 30 030 49 78264264273272 275 271 283 245 242246248237 253 249244246240234 240 246244235229 236 241 240 13 SWdNK76 44 41 43 45 45 42 43 39 40 29 30 7 0 63 92 264 266 274 274 277 273 283 241 245 247 245 235 251 247 246 243 238 240 240 249 246 234 234 240 243 241 14 UDORN72 48 45 45 4747464741 42 3232 10 11 0 49 265 264 270 270 273 266 278 254 253 253 267 246257 253 249 257 251 251 253 260 255 250 245 251 255 252 15 NK83 50 49 4951 51 5049 45 4638 38 15 18 11 0 271 277282 282284278-289270 266 268 276 2552691266259 269263 257 258 260 265 264 257 263 267 259 16 GQJLAST84 37 30 32 37 37 39 41 27 28 41 43 50 46 48 52 0143 141 140 151 139 154 215 216 203 29196 227 199 210 215 207 196 211 219 211 213202 217 214 213 17 GOJLND77 39 28 30 35 35 38 41 26 27 39 43 48 44 48 52 13 0 71 69 82 73 90 224 207 201 220209218 197 196 209 203 195 213 214 214 213 199 199 200 211 18 OJLNLD78 40 29 35 40 40 38 41 27 28 40 44 48 43 48 52 16 7 0 4 20 23 35 225223 211 230 217 232 200 202 211 204 200 220 229 220 214 202 209 206 216 19 GUL14D79 3928 34 39 39 374026 2739 43 474247 51 16 6 1 0 20 19 35 223 219 209230 213232 196 198209202 196216 225 216210 198205202 214 20 GQJLNN80 41 30 36 41 41 39 42 28 29 41 45 50 45 50 54 18 8 5 4 0 31 49 229 224 217 233215 230 202 203 214 203 203 219 232 223 215 201 210 205 217 21 GUJLMA8 39 28 34 39 39 374026 2739 4348 4348 52 16 6 3 2 2 039224219 210231 217 233198 200 209 205198 218225 216 218202 210 203 217

22 WNALEI48 382733 38 383639 25 26384247 4247 51 17 7 4 3 302229 223211 234225 236209 210217 208206 220 229220223206214 209 21 23 MLRDNY78 27 16 23 28 28 27 28 13 14 27 31 39 34 39 43 19 17 17 16 18 16 15 0 92 82 1061i54167 147 140 146 150 128 152 1601481i46145 154 146138 24 RTNJ85 27 18 23 28 28 26 27 14 15 27 31 38 34 38 42 21 19 20 19 21 19 18 6 0 49 91 155 162 142133 149 147 132143153 136 154 145158143 144 25 TYMN8O 25 14 21 26 26 25 26 11 12 25 29 37 33 37 41 17 15 16 15 17 15 14 2 4 0 82 1531571127125 138137 121 140 150135140 137 150 131137 26 CKPENN83 30 19 24 29 29 32 31 18 19 32 36 40 38 40 40 21 21 22 21 23 21 20 9 11 7 0 178 179 156154 155 157 146 173 176 170168163 172 163 163 27 DKNZ76 23 16 21 26 26 23 26 11 12 25 29 37 33 37 41 18 16 17 16 18 16 15 4 6 2 9 01181113128135 135 132 146 153 147 148 143 151138140 28 GTAUJS79 29122 26 31 31 29 30 18 19 29 33 40 36 40 44 28 26 27 26 28 26 25 13 14 11 18 11 0 138 149 158 147 152 154 161 155163152 158 156 150 29 FPV34 20

30 CKGER49 27

31 DKCZ56 26 32 DKUKC60 27 33 TERNSA61 26 34 PARUJ73 26 35 BDGRHO77 30 36 DKBAV77 24

37 MINKSW84 29 38 SWGER81 26 39 SWNED85 27

40 DKHK75 26

41 MLDAST82 24

17 26 31 31

19 24 29 29

19 26 31 31 17 22 27 27 19 22 27 27 18 23 28 28

22 30 35 35

16 21 26 26 19 24 29 29 19 25 30 30 20 26 31 31 18 23 28 28

16 21 26 26

28 29 12 13 26 25 12 14 28 29 15 16 26 27 13 14 26 25 13 14 25 26 12 13 30 29 18 19 23 24 10 11 26 27 15 16

27 28 14 15 28 29 17 18 25 26 12 13 23 24 10 11

26 30 40 36 26 30 38 34 28 32 41 37 26 30 38 34 26 30 37 35 25 29 36 32 32 36 42 40 23 27 36 32 28 32 40 36 26 30 39 35 27 31 36 32 25 29 38 34 23 27 36 32

38 38 41 38 39 36 44 36 41 39 36 38 36 42 42 45 42 41 40 46 40 45 43 40 42 40

20 20 21 20 20 19 21 19 20 19 21 19 23 21 22 21 23 21 20 22 18 21 20 22 20 19 22 20 21 20 22 20 19 21 19 20 19 21 19 18 25 24 24 23 25 23 22 19 17 18 17 19 17 16 23 21 22 21 23 21 20 21 19 20 19 21 19 20 24 22 23 22 24 22 23 21 19 20 19 21 19 18

19 17 18 17 19 17 16

7 7 a 7 7 6 1 1 4 9 a ; 11 6 4

9 5 10 7 5 12 10 613 7 5 12 7 5 10 6 4 11 13 914

4 29

9 7 12

8 6 13 11 916

6 4 11

42 9 14 0 13 8 14 9 13 6 13 8 12 7 16 12 10 5 15 10 14 9 15 12 12 7 10 5

86 95 91 87103121 0 84 79 76 96112 9 065 5692110 6 9 05681101 6 7 6 0 67 84 5 8 5 5 0 61

12 13 12 10 11 0 3 63 3 29 8 11 8 8 7 14

7 10 7 7 4 13 10 13 10 10 7 16

5 8 5 5 4 11

3 63 32 9

99115 108 119 108113 95 90 93 102 95108 97 98 94 89 90 85 87 81 96 76 91 69 82 69 86 60 79 39 76 58 86 62 80 68 88 91 114 92107 0 80 68 95 72 87

5 0 78 95 70 85

4 9 0 33 58 70 7 12 5 0 74 83 2 7 6 9 0 61 0 54 72 0

5 5 6 5 5 4 9 2 7 6 9 4 2

DIFFERENCES INAMINOACIDSEQUENCES

FIG. 2. Differencematrix of nucleotide and amino acid sequencesfor 41 influenza A virus NP genes. Group divisions correspond to

lineagesinFig.1asfollows:EP,EQPR56; EQ,recentequine;SW,classicHlNlswine; HUM, human; H13,H13gull;NA,NorthAmerican

avian; ANZ,Australia and New Zealand avian; OW, OldWorld avian. Strain abbreviations andsourcesarelisted in Table 1.

therootofinfluenza A virus NPsbyfewer thaneightamino acid differences, and other avian NPs are nearly as close. Evolution in the avianlineageisvirtuallystatic;forexample, althoughisolated 48yearsapart, FPV34differsbyonlyfour amino acid changes from the recent MLDAST82 isolate. Other lineages show significant evolution at the protein

level, especially human NPs. Nonavian NPs at terminal

branches differ from each other by 47 to 75 amino acid changes, but each is a minimum of 18 steps closer to the avianroot(22to53steps tothe node above MLRDNY78 in Fig.1,right). Similarly,the oldestnonavian NPs differby29

to 41 amino acidchanges but are at least 10 stepscloser to

the avian root(18 to25 steps). As with the nucleotide tree,

thispattern ofevolution away fromavianlike NPs suggests

an avianorigin for all NPs.

A comparison of amino acid sequences among the five major NP lineages shows suites of amino acids unique to

each lineage (Fig. 3). Todetect patterns ofunique(derived)

amino acids among the five lineages, we used the most primitiveNPsequence,TYMN80(theNPclosesttotheroot

of thetree), asabaseline. This choice is critical; ifweused

a derived human NP forabaseline, only twogroups would

beevident, human and nonhuman NPs.ThenonhumanNPs

would appear to form a group because they have retained primitiveavian amino acids that were lostindependently in thehuman NPs. In thissituation and nonhuman NPs would be artificially united by primitive rather than derived

char-acter states.

Despite more than 300 nucleotide changes among avian NPgenes, only two amino acids show a pattern correlated with a group: North American NPs possess unique amino acidsatsites 105 and 450. Mandler and Scholtissek(19)have

reportedatemperature-sensitive mutant of FPV34 in which

an amino acidchange occursat position 332.This mnutation is located within a 24-residue conserved region for the 41 NPs we analyzed (Fig. 3 and 4). These findings emphasize the conservednatureof the NPproteinamongavian strains. The swinea'ndmink isolates within the avian groupand the whale isolate within the H13 gull group show no u'nusual

patterns ofamino acid substitutions, nor convergence with

NPs of the mammalianlineages.

Amino acid changes that are correlated with groups are

summarized inFig.4.Incomparisonwith the avianNPs, the human NPs possess the most fixed amino acid differences

(39 in current human NPs) and H13 gull NPs possess the fewest(15in'currentH13NPs).EQPR56,recentequine,and classic swine NPs have 22, 26, and 19 fixed amino acid

differences, respectively. Most of these differences are

unique to each lineage, in'dicating thatevolution of NPs in each lineage is divergent and independent. For example,

even though human and classic swine appear to share a common ancestor onthe basis of the nucleotide phylogeny (Fig. 1,left),theoldest isolates shareonlytwouniqueamino acids. Sharedderived residues acquiredatalaterdate must

be due to convergence or parallelism. A comparison of accumulated amino acid differences over the past 50 years

indicates that evolutionat theproteinlevel isabout twiceas

fast in the humanlineage (39 versus 19 residues).

Estimation ofevoluationary rates. Within the mammalian andavianlineages,theoldest virus isolates aremorethan 50

yearsold. Thelong periodoverwhichisolateswereobtained provided anopportunity toestimate long-term evolutionary rates for NP genes in different lineages. The evolutionary ratewasestimatedbyplottingtheyearof isolation foravirus VIRUJS STRAIN

11

II

---I

1---7 7 B 7 7 6 4 9 B 6 4

on November 10, 2019 by guest

http://jvi.asm.org/

[image:5.612.61.555.79.362.2]
(6)

1. 25.50.75.100.125

EQPR56 P K N K N E I

EQM163 V H M

EQLON73 V N H M H

EQKY86 V N H K M H M

EQTN86 V N H K M H M

SWHK8 D I V I M V

SWTN77 D I V M V

SWIA30 I I I

SW31 T I I I S

WS33 TK D K I L V M

PR8-34 D K I L VN M

NT60-68 D K ID L K V M V G

SWHK76 D K IN L K V M V

UDORN72 D K ID L R K V R M V

HK83 D K ID L R R K V R M V

WLAST84 N R R E V

WULMD77 D N N T R V

WGLMD78 N N T R V L

WJLMD79 N N T R V L

WJLMN8O ND N T R V L

WJMA8O N N T R V L

WHALEM84 N N T R V L

TYRN0 MASQGTKRSYEQMETGGERQNATEIRASVGRMVGGIGRFYIQMCTELKLSDYEGRLIQNSITIERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRRDGKWVRELILYDKEEIRRIWRQ

RTNJ85 M

MLRDNY78 V

CKPENN83 H R

DKNZ76

GTAUS79 S F I C

FPV4 S R

CKGER49 M T

DKCZ56

DKUK60 D S M

TERNSA61 M

PARU73 K M

DKBAV77 M

BDGRHO77 I E

MINKSW84 H M

SWGER81 G M

SWNED85 G KK M

DKHK75 M

MLDAST82 M

126...150...175...200...225...250

EOPR56 I N K

EQNI63 T Q

EQLON73 M T G

EQKY86 M T A G

EQTN86 M T A G

SWHK I N K

SWTN77 I IA SG

SWIA30 T M V

SW31 M K

WS33 D M V

PR8-34 D M V K K D

NT60-68 D M T K S

SWHK76 D T K S

UDORN72 D M T K G

HK83 D R M T I K S

GQJLAST84 V S

GULJD77 V

GJLMD78 V

GIJLMD79 V

GJLMN80 H V D

GJLMA80 H V

WHALEM84 V

TYMN80 GEDATAGLTHLMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQRAMMDQVRESRNPGN

RTNJ85 L

MLRDNY78

CKPENN83 I I

DKNZ76 GTAUS79 FPV34

CKGER49 I

DKCZ56

DKUK60 TERNSA61 PARU73

DKBAV77

BDGRHO77 I T

MINKSW84 SWGER81 SWNED85

DKHK75 V

MLDAST82

FIG. 3. Predictedaminoacidsequencesofthe 41influenzaAvirus NP genes.AminoacidsequenceofTYMN80is written infull,and for other sequences only differences from this baseline areshown. NPgroups recognizedas lineagesin Fig. 1 are

separated

bylines in the followingorder(toptobottom):EQPR56, recentequine,classicHlNlswine, human,H13gull,NorthAmericanavian,Australia and New Zealandavian, andOld World avian. Strainsrepresentedarelisted in Table 1.

against the branch distance to the ancestor node of the lineage;theslopeof the

regression

line for the

plotted points

equalsthe numberof nucleotide

changes

per year

(Fig.

5).

In the human lineage, WS33 and PR8-34 are closest to the ancestornode

(Fig.

1)butareoutliers

(Fig.

5); they

appearto

have toomany derived characters for theirage. This situa-tionmaybe theresult of continuous

passaging

of the viruses before the adventof

deep

freezers.

Regression

of the

remain-ing

datum

points

yielded

a

slope

of 2.54nucleotide

changes

per year

(1.62

x

1O'

changes

pernucleotideper

year).

The

on November 10, 2019 by guest

http://jvi.asm.org/

[image:6.612.72.557.68.629.2]
(7)

251.275 .300. 325. 350. 375

EQPR56 I V K I K I I

EQMI63 T I K N K A

EQLON73 K K I K N K I A I

EQKY86 K K I K LN K I A I

EQTN86 K K I K LN K I A I

SWHK8 H K G KK I K V

SWTN77 H K G KK K VA

SWIA30 K

SW31 H K

W533 F P Y K K E

PR8-34 F T P Y L K K K E

NT60-68 P K K Y N L K S K DA E

SWHK76 P K K Y N L K S K D E

UDORN72 P K K Y N LL K S K D E

HK83 S P S K K Y LL KA K DA E

GCULATK84 A H AtL

GULMD77 A L N

WJLMD78 A N L N

WJLMD79 A L N

GJLMN80 A L N

WLMA80 A L N

WHALEM84 A I L N

TYMN AEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGYDFEREGYSLVGIDPFRLLONSQVFSLIRPNENPAHKSQLVWMACHSAAFEDLRVSSFIRGTRVVPRGQLSTRGVQIASNENMETMD

RTNJ85

MLRDNY78 P

CKPENN83 DAI

DKNZJ6 I

GTAUS79 K M S K K

CKGER49

DKCZ56 TTV F

DKUK60

TERNSA61 K F A

PARU73

DKBAV77

BDGRHO77 L A

MINKSW84 A

SWGER81 V

SWNED85 V K

DKHK75

MLDAST82

376. 400. 425. 450. 475. 498

EQPR56 N K K T S K D K

EQM163 C S L

EQLON73 S S F S

EQKY86 S R S F S

EQTN86 S R S F S

SWHKB2 K V I N S K L N

SWTN77 K V I N S K L N

SWIA30 V I S S

SW31 I S S

W533 S I D P L S AS

PR8-34 I D T V S AS

NT60-68 A DKP A G K EM A

SWHK76 A DKP GK EM

UDORN72 A DKS A G K E R

HK83 A KS A G K E R

GUJLAST84 K A K FS

WLMD77 K V S F S

GULMD78 K V S P S

GWLMD79 K V S P S

GJLMN80 K V P S

GULMA80 K V P S

WHALEM84 V P S

TYMN8O SSTLELRSRYWAIRTRSGGNTNQQRASAGQISVQPTFSVQRNLPFERATIMAAFTGNTEGRTSDMRTEI IRMMENARPEDVSFQGRGVFELSDEKATNPIVPSFDMSNEGSYFFGDNAEEYDN

RTNJ85 S S

MLRDNY78 CKPENN83

DKNZ76 S

GTAUS79 T S

FPV34 S S L

CKGER49 S

DKCZ56 K S

DKUK60 S S

TERNSA61 S

PARU73 S V

DKBAV77 S

BDGRHO77 N N S

MINKSW84 N S L N

SWGER81 K S V

SWNED85 K K S

DKHK75 R S

MLDAST82 S

FIG. 3-Continued. estimated evolutionaryrate for theOld World avian lineage

was slightly lower (2.17 changes per year; 1.39 x 10-3 changes per nucleotide per year) and lower still for the recent equine lineage (1.22 changes per year; 0.78 x 10-3

changes per nucleotide per year). Our values are smaller than the estimates reported for equine hemagglutinin (HA)

genes (2.8 x 10-3 substitutions per site per year [7]) and

human nonstructural

(NS)

genes

(2

x

1iO

substitutions per site per year

[5]).

The accuracy of these estimates is

im-proved by

a

longer

period

overwhich isolatesare

obtained,

the inclusion ofmore

isolates,

and an absenceof

large

time gaps between isolates. Basedonthese

criteria,

ourestimate of

evolutionary

rateismost accuratefor theOldWorldavian

lineage.

Because of the short time span over which North

on November 10, 2019 by guest

http://jvi.asm.org/

[image:7.612.64.552.78.642.2]
(8)

get#z

U z z

hE

z

960>k

) >- > >

69)

0 0

90

0 0

CLP/eZ

z tLh zz

////- I- I-*F I- I

///esv

a a a ;q a a 9st > > > >>

PS,.ssInIU L L LU Cest U L a. a. a. 05) 0 U40* 5CCtco CCDCW C

-"

I- -I-.

*i-t -

-=~~~~~LC Us z z z z

O

oseF

F5LCO

COKE

o

sLC

0 U

E

o

UCLULUWELUW

<

Gee.S >> > > >

CoGa 0 0 0 0

16CZ ZZKKE EZ

0

PPGO~~~~~SI

00NOE a

1.caa 3R a 0 a

0~~~~~

0-e1-GLLLIE-UI-~

C))LOe.e>>>

3.

LLteM-

- -U-

-CLoKExEa

L0->->-- -9na< U' <

9CL -'* *

04

COLGL

i C

OLLz- - '- -

-9LL- - - -

-o~~~~~~~~~~LtIL- *

--)--LOLE 00000

99' ' <'U - --

-8s- - -*- >

OSE2E0 0 0

PGCC CC CDU CC (C

LCKM E Eo K co Eo

GEl- i- - i- U

.-LEZZ U z z z

0.

0 0 0 0 0

(LU

on November 10, 2019 by guest

http://jvi.asm.org/

(9)

140 0 human b-2.54 ° avian b-2.17 120

x equine b-1.22

100 /

80 /

60

40

40

20

1933 1943 1953 1963

year ofisolation

35

:: r 0 human b-0.57 ° avianb--0.01

25- x equineb-0.33

20

-

15-10<

W0

5-l

1 i Li

1933 1943 1953 1963

yearof isolation

FIG. 5. EvolutionaryratesforNPgenesandr

nucleotideevolutionaryrate. (Bottom) NPamino

rate.Evolutionaryrateis estimated byregression

against branch distance from the common anc4 nucleotide and amino acid phylogenetic trees (F

statistic b provides rateestimates. The common

the human lineage corresponds to the node c

nucleotide tree(Fig. 1,left).Thecommonancestc

equine NPsandforOldWorld avian NPsaredal

respectively. Analogouscommonancestornodes

treeareundated. The1933and1934humanisolat

arenotincluded in regression of human data.

American avian andH13gull isolates weret,

calculate evolutionary rates for these lin human lineage, accuracy could be improv

sequences for the 1934 to 1968 gap; we be

increase the total branch length of the I estimate ofthe evolutionary rate.

The estimation of evolutionary rates for

quences was similar (Fig. 5). For the humar

and avian lineages the rates were 0.57,

amino acidchangesper year, respectively. obtained similar estimates for evolutiona nucleotidelevel for human andavianlineage

tobenooverallevolution atthe aminoacid World avian lineage for more than 50 yes

evolutionattheprotein level for therecent

about halfthatforhuman NPs.

Estimation of dates for ancestral NPs. Est tionary rates were used tocalculate datesi

ancestor nodes ofthe phylogenetic tree (F

was accomplished by dividing the branch

tanceby the evolutionaryrate,yielding a distance in years. If the tree branching and distances are correct, and if

Cn

evolutionary rates are accurate and constant within a lin-eage, thenestimates of dates for ancestornodescalculated from derivative branches should converge. Using this

ap-,i'

4 proach, we calculated datesfor ancestor nodes for the recent equine, human, and Old World avianlineages (Fig. 1, left). The estimates of dates for ancestor nodes were unusually concordantin these lineages. The recent equine and human lineages predicted a date of 1860 to 1870 for the common ancestor of recent equine and human plus swine lineages. I The combination of recent equine, human, and Old World avian lineages converges on the mid-19th century period of 1840 to 1860 for the

origin

of a common ancestor. The 1973 1983 EQPR56 NP appears to have branched from an earlier ancestor. Using the relatively high Old World avian evolu-tionary rate, weconservatively estimated a pre-1800 datefor the root node from which EQPR56 and all other NPs o descend. From this analysis, we conclude that the present

NP lineages are more than acentury in age.

The only branch of the mammalian lineages that does not o conform with the dates shown in Fig. 1 is that containing SWHK82 and SWTN77. Either the evolutionary rate is much reduced in this branch or the branch is misplaced;

moreclassic HlNl swinesequences between 1931 and 1977

X/

mayresolve this problem. The estimated common ancestor ° dates forthesemammalian lineages suggest thatHlNlswine NPs and recent equine NPs may have been present in the o c late 19th century. The mammalian isolates MINKSW84,

'y±Zyl

J

SWGER81,

and SWNED85 appear to be evolving more

1973 1983 slowlythan other NPs in the Old World avianlineage; their branch distance

points

are belowthe

regression

line of

Fig.

proteins. (Top) NP 5. The date of the common ancestor node for the swine and acid evolutionary avian isolates suggests that avian

HlNl

NPswerepresentin of year of isolation swine atleast 10 years before the dateofvirus isolation. At estor node of the the nucleotide level, the SWHK76 isolate in the human rig. 1). Regression lineage (Fig. 1) appears to be evolving at the same rate as

ancestor nodefor other human isolates but at the amino acid

level

appears to lated 1933 in the beevolving more

slowly

(Fig. 5). These

observations

suggest

rtnodes

for recet1 that human and avian NPsevolve

differently

inaswine host. ted 1942and 1925,

,of the amino acid Within the Old World avianlineage, the branchcontaining es (WS33, PR8-34) BDGRHO77 appears to be evolving faster than other con-temporary NPs (e.g., DKHK75 and MLDAST82); their branch distance points are well above the regression line of aken, wedid not

Fig.

5.

By

using

the

evolutionary

rate estimate for the Old ieages. For the World avian lineage, the projected dates for the common ted by obtaining ancestors are 1912 for the Australia-New Zealand NPs lieve this would (GTAUS79, DKNZ76), 1899 for the North American avian lineage and the NPs, and 1882 for the H13 gull lineage (Fig. 1, left). The dates of these

hypothetical

ancestors do not coincide with ramino acid se- the tree

position

or dates of isolation for the NPs in these i,recentequine, three descendant branches. The discontinuity of dates in ).33, and -0.01 these other

lineages

relative to the OldWorld avian

lineage

Even though we may be the result of slower evolutionary rates or missing

try rates at the data, i.e., a lack of older isolates to accurately root these

zs,there appears branches. For the North American and Australia-New level in the Old Zealand NP lineages the discontinuity is correlated with ars. The rate of geographic separation, but for the H13 gull lineage the equine lineage is discontinuity is correlated with a different

separation,

the limited host range of H13 viruses. Despite these

discrepan-timates of evolu- cies,theanalysis shows that the

evolutionary

relationship

of forhypothetical- these other avian lineages to the Old World

lineage

is

quite

ig. 1, left). This old; i.e., common ancestors must be dated well before the h-internodal dis- earliest Old World avianisolate (FPV34).

b r a n

c

h

d i

s

t a n

c

e

b r a n

c

h d

s

t a n

c

e

on November 10, 2019 by guest

http://jvi.asm.org/

[image:9.612.60.296.74.407.2]
(10)

DISCUSSION

Host specificity and NP geneevolution. Phylogenetic trees represent hypotheses about evolutionary relationships among taxa. The results of ourphylogenetic analysis identify the fivehost-specific NP RNA hybridization groups of Bean (1) as distinct lineages of NP genes. Furthermore, our analysis of amino acid sequences shows that these lineages are distinct at the protein level as well. Earlier studies have implicated the NP protein in determining the host range for influenza viruses (e.g., Scholtissek et al. [24]). This suggests that the NP protein may be involved in the evolution and maintenance of distinct gene pools for influenza A viruses. If this is true, then NP genes should fall into host-specific lineages, as we have demonstrated. Second, there must be adaptive functional differences at the protein level among host-specific NPs. This is suggested by the distinct amino acid differences we found among the lineages. Understand-ing of these differences must await detailed descriptions of the structure and function of the NP protein.

H13 NPs represent an unusual example of hostspecificity. Apparently influenza A viruses with H13hemagglutinins are always coupled with a specific NP, and with the exceptionof onewhale isolate, H13 NPs have notbeen foundoutside of gulls (10, 14). Viruses of other serotypes (Hi,H2, H4,Hll) bearingclassic avian NPs have been isolated from gulls (1, 14). The specificity of the H13 NP suggests that it ishighly adapted to gulls, but the basis for this adaptation is un-known.

Adaptation and rates of NP evolution. If an NP protein is well adapted to a host we might expect little, if any, evolution at the protein level to occur, but if an NP geneis introduced into a new host we might expect the NPprotein to evolve rapidly. Mutations at the nucleotide level should fuel theevolutionary changes. In our analysis, human and Old World avian lineages have similar estimated rates of evolution at the nucleotide level, but at the amino acidlevel the human lineage is evolving faster than all other lineages and the OldWorld avian lineage appears not to be evolving at all. The evolutionary stasis of the Old World avian NP protein suggests that it is being maintained at an adaptive peak; the protein has not changedsignificantly for more than 50 years despite more than 300 nucleotide changes in the lineage. Also, there are fewer than 10 amino acid changes in many avian NP proteins relative to the root of the NP lineage, which isestimated to be more than a century in age. In contrast, NP proteins in other lineages continue to change, especially in the human lineage. This pattern is congruent with the hypothesis of an avian ancestor for all NPs and the hypothesis that the protein has not yet fully adapted to these new hosts. The basis for differences in evolution between avian and human NPs may be related to changes in host and tissue tropism (from gut endothelium in birds torespiratory epithelium in humans).

Reassortment and host fidelity of NP genes. If NPs are adapted to specific hosts, they should show host fidelity. Thisappears to be the case for most recent equine, H13 gull, and human NPs. Swine, however, readily accept viruses containing NP genes typical of human, avian, and classic swine lineages. The appearance of swine isolates in avian and mammalian lineages lends support to the mixing vessel hypothesis of Scholtissek et al. (24). The susceptibility of swine to avianHlNl viruses and the ability ofhumans and swine to pass virus back and forth may have relevance to the 1918influenza A virus pandemic; perhaps swine served as an intermediate step for the introduction of a new avianlike

virusintothehumanpopulation(asproposedby Scholtissek

et al. [24]). Our analysis indicates thatclassic swine genes and genes introduced into the swine gene pool may be evolving slowly; this suggests that swine can serve as a stable reservoir for a

variety

of

swine,

human, and avian genes.Thus, swine appeartoefficientlymaintainavarietyof genes which are available for production ofnew reassor-tants.

Commonoriginforclassic swine and human NPs.Theclose relationship showninourphylogenetic analysisbetweenold humanisolates (PR8-34andWS33)and contemporary swine isolates (SWIA30 and SW31) compared with more recent human isolatesstrongly suggestsacommonoriginfor human andswineNPs. The estimated datefor theancestornode for these two groups is 1914, which is earlier than the 1918 pandemic of HlNl swinelike flu (Fig. 1). Assuming a com-mon ancestorforhuman and swine influenza viruses, Naka-jima et al. (21) found similar results with the M and NS1 genes.

Surprisingly,

the current swine viruses containing HlNl-type NPs (SWTN77, SWHK82) do not appearto be direct descendants of SWIA30 or SW31 but do share a

pre-1900 common ancestor. Perhaps the SWIA30 lineage

was derived from the 1918 pandemic virus but is now

extinct. Thisquestioncanberesolved bysequencing classic swine NPs isolated between 1931 and 1977.

Geographic separation and NP gene evolution. The evolu-tion ofNPgenelineagesappears tohave been influencedby geographic separation ofwild host populations. For exam-ple, theclassic avian lineage is divided into NorthAmerican, Australia-New Zealand, and Old World lineages (Fig. 1, left). The H13 lineage can be divided into Eurasian (GULAST84) and NorthAmerican sublineages. For each of these geographic discontinuities, there appears to be a mismatch in the estimated date for the common ancestor and the datesof isolation for NPs in the descendant side lineages. Forexample, the Australia-New Zealand lineage containing the DKNZ76 NP appears tobe older than FPV34, and NPs in thislineage appear to be evolving at less than half the rate of Old World avian NPs (Fig. 1, left). Donis et al. (8) have shown similar patterns of lineage divergence and reduced evolutionary rates correlated with geographic separation among H4hemagglutinin genes.

Estimated agesof NP lineages.The time frame proposed for theevolution of the current diversity of NP lineages exceeds a century. This estimate is reasonable given the more than 50-year span over which viruses have been isolated from avian, swine, andhuman lineages. Ourestimates for the age of NP lineages are likely to be conservative; the saturation of nucleotide sites with silent mutations and consequential undetected changes in oldbranches may result in smaller age estimates for older common ancestors. Thus we propose that the nonavian NPnucleotidelineages were well differentiated from the avian lineage by the end of the 19thcentury andthat the EQPR56 lineage as well differentiated in the early 19th century.

Avian origin for influenza A virus NPs. Our phylogenetic analysis makes no assumptions about relationships among influenza A virus NPs. Independent evidence supporting the hypothesis of an avian (waterfowl) origin for influenza A viruses includes theproliferation and maintenance of numer-ous hemagglutinin and neuraminidase subtypes among mi-gratory waterfowl populations, asymptomatic infection indi-cating coadaptation of virus and host, and a propensity for dissemination of viruses to other hosts because of migratory habits and the production of prodigious amounts of virus (11). In support of thishypothesis, most of the newinfluenza

on November 10, 2019 by guest

http://jvi.asm.org/

(11)

virus genes that have appeared in mammalian gene pools

overthepast30yearshavebeenshownultimatelytohavean avian origin (e.g., Hinshaw and Webster [11], Kida et al. [16], and Kawaokaetal. [15]). Theresults ofour

evolution-aryanalysis areconsistent with the hypothesis ofan avian

origin for all NPs. Oursummaryevidence for this hypothesis includes the following: (i) older nonavian NPgenesare more avianlike than recent NPs, (ii) recent NPgenes in the five lineages are more distant from NPs in other lineages than from themostprimitive avian NPs, (iii)nonavian NPs share moreprimitiveavian characters thanderived characters, and (iv) OldWorld avian NP proteinsappearto bein evolution-ary stasis while nonavian NPs are evolving rapidly away fromavianlikeancestral NPproteins. These results support the hypothesis of independent evolution of NPs from an

avian ancestor.

ACKNOWLEDGMENTS

We thank Evelyn Stigger, Raphael Onwuzuruigbo, and Daniel Channell for technical assistance, Clayton Naeve and the St. Jude Children's Research Hospital MolecularResource Centerfor

prep-aration ofoligonucleotides, Patricia Eddy and the St. Jude Chil-dren'sResearch Hospital MolecularBiology Computer Facility for computersupport, and David Swofford forassistance with PAUP software.

Thisworkwassupported by Public Health Serviceresearchgrant

AI-08831 and Public Health Service contract AI-52586 from the National Institute ofAllergyandInfectious Diseases, Cancer Center Support(CORE)grantCA-21765, National ResearchService Award 5T32-CA09346tothesenior author,andAmericanLebanese Syrian AssociatedCharities.

LITERATURE CITED

1. Bean, W. J.1984.Correlation of influenzaAvirusnucleoprotein

geneswith hostspecies.Virology 133:438 442.

2. Bean, W. J., G. Sriram, and R.G.Webster. 1980. Electropho-reticanalysis ofiodine-labeled influenzavirus RNAsegments.

Anal. Biochem.102:228-232.

3. Briedis, D. J., and M. Tobin. 1984. Influenza B genome:

complete nucleotide sequence of the influenza B/Lee/40 virus

genomeRNAsegment5 encodingthenucleoproteinand

com-parison with the B/Singapore/222n9 nucleoprotein. Virology 133:448-455.

4. Buckler-White, A. J., and R. R. Murphy. 1986. Nucleotide sequenceanalysis of thenucleoproteingeneofanavianand a

human virus strain identifies two classes of nucleoproteins. Virology155:345-355.

5. Buonagurio, D. A., S. Nakada, J. D. Parvin, M. Krystal, P. Palese, and W. M. Fitch. 1986. Evolution of human influenza A virusesover50years:rapid and uniformrateof change inthe NSgene.Science 232:980-982.

6. Chen, E. Y., and P. H. Seeburg.1985. Supercoil sequencing: a

fast and simple method for sequencing plasmid DNA. DNA 4:165-170.

7. Daniels, R. S.,J. J.Skehel,and D.C.Wiley. 1985. Aminoacid

sequences ofhaemagglutinins of influenza viruses of the H3 subtype isolatedfrom horses. J. Gen. Virol.66:457-464. 8. Donis,R.O.,W.J. Bean, Y. Kawaoka, and R. G. Webster. 1989.

Distinctlineages of influenza virus H4hemagglutiningenes in differentregionsof the world. Virology 169:408-417.

9. Gammelin, M., J. Mandler, and C. Scholtissek. 1989. Two subtypes ofnucleoproteins (NP) of influenza A viruses.

Virol-ogy170:71-80.

10. Hinshaw, V.S., W. J. Bean, J. Geraci, P. Fiorelli, G. Early, and R. G. Webster. 1986. Characterization of two influenza A viruses fromapilotwhale. J. Virol. 58:655-656.

11. Hinshaw, V. S., and R. G. Webster. 1982.Thenatural history of influenzaAviruses,p. 79-104.In A.S. Beare (ed.),Basic and appliedinfluenza research. CRCPress, Inc., Boca Raton, Fla. 12. Huddleston,J. A., and G. G. Brownlee. 1982. The sequenceof

the nucleoprotein gene of human influenza A virus, strain A/NT/60/68.NucleicAcidsRes. 10:1029-1038.

13. Jones, K. L., J. A. Huddleston, and G. G. Brownlee. 1983. The sequenceofRNA segment 1of influenza virusA/NT/60/68 and its comparison with the corresponding segment of strains A/ PR/8/34 andA/WSN/33. Nucleic AcidsRes. 11:1555-1566. 14. Kawaoka, Y., T. M. Chambers, W. L. Sladen, and R. G.

Webster. 1988. Is the gene pool of influenza viruses in shore-birds and gulls different from that in wild ducks? Virology 163:247-250.

15. Kawaoka, Y., S. Krauss, and R. G. Webster. 1989. Avian to humantransmission ofthe PB1 gene ofinfluenzaAviruses in the 1957and 1968pandemics. J.Virol. 63:4603-4608.

16. Kida, H., K. F.Shortridge,andR.G.Webster. 1988. Origin of thehemagglutiningeneofH3N2influenzavirusesfrom pigs in China. Virology 162:160-166.

17. Krystal, M., R. M.Elliot,E. W. Benz, Jr., J. F. Young, and P. Palese.1982.Evolution of influenzaAandBviruses: conserva-tion of structural features in the hemagglutinin genes. Proc. Natl. Acad. Sci. USA79:4800-4804.

18. Londo, D. R., A. R. Davis, and D. P. Nayak. 1983. Complete nucleotide sequence ofthe nucleoprotein gene of influenzaB virus.J.Virol. 47:642-648.

19. Mandler, J., and C. Scholtissek. 1989. Localisation of the temperature-sensitivedefect in thenucleoprotein ofaninfluenza A/FPV/Rostock/34 virus. VirusRes. 12:113-122.

20. Nakada, S., R. S. Creager, M. Krystal, and P. Palese. 1984. Complete nucleotide sequenceof the influenzaC/Californial78 virusnucleoproteingene. VirusRes. 1:433-441.

21. Nakajima, K., E. Nobusawa, and S. Nakajima. 1984. Gcnetic relatedness between A/swine/Iowall5/30 (HlNl) and human influenza viruses. Virology 139:194-198.

22. Reinhardt, U., and C. Scholtissek. 1988. Comparison ofthe nucleoprotein genesofachicken and amink influenzaAH10 virus. Arch. Virol. 103:139-145.

23. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA sequenc-ing with chain-terminatsequenc-ing inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.

24. Scholtissek, C., H. Burger, 0. Kistner, and K. F. Shortridge. 1985. Thenucleoproteinasapossible major factor in determin-ing host specificity of influenza H3N2 viruses. Virology 147: 287-294.

25. Scholtissek, C., I. Koennecke, and R. Rott. 1978. Host range recombinants of fowl plague (influenza A) virus. Virology 147:287-294.

26. Snyder, M. H., A. J. Buckler-White, W. T. London, E. L. Tierney, and B. R. Murphy. 1987. The avian influenza virus nucleoprotein gene and a specific constellation of avian and human virus polymerase genes each specify attenuation of avian-human influenzaA/pintail/79reassortantviruses for mon-keys.J.Virol.61:2857-2863.

27. Steuler, H., B. Schroder, H. Burger, and C. Scholtissek. 1985. Sequence of the nucleoprotein gene of influenza A/parrot/ Ulster/73.VirusRes.3:35-40.

28. Tian, S. F., A. J.Buckler-White, W. J. London, L. J. Recle, R. M. Chanock, and B. R. Murphy. 1985. Nucleoprotein and membrane protein genes are associated with restriction of replication of influenzaA/mallard/NY/78 virus and its reassor-tantsinsquirrel monkey respiratorytract.Virus Res.1:625-630. 29. Winter, G., and S. Fields. 1981. The structure of the gene encodingthenucleoprotein ofhumaninfluenzavirusAIPR/8/34.

Virology 114:423-428.

30. Winter,G.,S.Fields,M.J.Gait,andG.G.Brownlee.1981.The use ofsynthetic oligodeoxynucleotide primers in cloning and sequencing segment 8 ofinfluenzavirus (AIPR18/34). Nucleic Acids Res. 9:237-245.

on November 10, 2019 by guest

http://jvi.asm.org/

Figure

TABLE 1. Influenza virus strains used in phylogenetic analyses
FIG.1.fromGLOBAL,differencesmaximum-parsimonyindexwereimmediatemakeisB/Lee/40 141, Phylogenetic trees for influenza A virus NP genes
FIG.2.lineagesavian;Difference matrix of nucleotide and amino acid sequences for 41 influenza A virus NP genes
FIG. 3.followingotherZealand Predicted amino acid sequences of the 41 influenza A virus NP genes
+3

References

Related documents

demon’s player may choose to gain the rote quality on the roll. Alternately, the player may choose to reroll all dice on a roll after the dice are rolled, but he must accept

Understanding that native Arabic speakers transfer their L1 reading strategy to English, and that this results in less sensitivity to phonological information during reading,

As compared to their Jewish counterparts, Bedouin Arab adolescents reported significantly higher levels of individual hope, collective hope, and state anger, as well as

O objetivo deste trabalho foi avaliar a força de músculos ventilatórios e de mão em pacientes na lista de espera para o transplante de fígado e associá-los à

(I) Finding an $s-t$ path of label $\alpha$ in a group-labeled graph for a given element $\alpha.$.. $(\Pi$ $)$ Finding an $s-t$ path whose label

Voor de proactieve component zullen de lidstaten, veel meer dan voorheen, gebruik kunnen maken van de gegevens over een groot aantal stoffen die de industrie aanlevert bij

• Excellent mechanical stability and corrosion protection • Excellent oxidation stability, helping extend grease life • Excellent extreme pressure, anti-rust and water

(Right triangles are congruent if any two sides are congruent because of the Pythagorean relationship, which guarantees that the third sides are also congruent.) This shows that