Regions of RNA secondary structure play essential roles in the replication cycle of HIV-1. The SHAPE technique has been applied to determine the RNA secondary structure of the full-length HIV-1NL4-3 genome, and this analysis revealed many elements of RNA structure (172), only a fraction of which had been studied previously. One way to assess the importance of these structures is to determine the extent to which they are conserved over evolutionary time and the extent to which they are maintained after mRNA splicing.
The second chapter describes the application of SHAPE technology to develop a secondary structure model for the genomic RNA of a second primate lentivirus, simian immunodeficiency virus (SIVmac239), which shares 50% sequence identity at the nucleotide level with HIV-1. In both genomes, approximately 60% of the nucleotides within the coding region (8,738 nucleotides) are paired. However, only about half of these paired nucleotides are paired in both sequences, and only 58 base pairs form with the same pairing partner in the coding region of both sequences. Thus, on average, the RNA secondary structure is evolving at a much faster rate than the sequence. Some structures are conserved between HIV-1 and SIVmac239, including structures in the 5' untranslated region (5' UTR), the Rev responsive element (RRE), a pseudoknot that sequesters the 5' polyadenylation sequence, the polypurine tracts (PPT and cPPT) at which plus-strand synthesis begins, and the stem-loop structure that includes the first splice acceptor site. Structure at the Gag-Pro-Pol frameshift site is maintained, but in a significantly altered form. As with all lentiviruses, the HIV-1 and SIVmac239 genomes are adenosine-rich and cytidine-poor. Approximately two-thirds of the cytidines, uridines, and guanosines are base-paired, whereas only one-third of the adenosines are base-paired, which concentrates adenosines in single-stranded regions (55% of the unpaired nucleotides). Thus the base composition of the structured regions differs markedly from that of both the unpaired regions and the genome as a whole. Structures containing at least as many adenosines as guanosines had higher SHAPE reactivity and were not conserved between the two genomes. By contrast, structures in which guanosines were more abundant than adenosines had lower SHAPE reactivity and were maintained, although they are still undergoing significant evolution. This leads to the conclusion that much of the secondary structure reflects pairing in a state that allows the RNA to form and re-form interactions as the sequence evolves. However, regions of the structure that perform necessary functions within the viral replication cycle appear to have a high guanosine content, which stabilizes these structures and allows them to remain intact even as the sequence evolves.
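The composition figures quoted for the second chapter (the fraction of each nucleotide that is base-paired, and the share of each nucleotide among unpaired positions) can be tallied directly from a sequence and its secondary structure. The following is a minimal sketch, assuming the structure is available in dot-bracket notation; the function name `pairing_stats` and the toy sequence are hypothetical illustrations and are not taken from the thesis or its analysis pipeline.

```python
from collections import Counter


def pairing_stats(seq, dot_bracket):
    """Return (fraction paired per nucleotide, share of each nucleotide
    among unpaired positions) for a sequence and its dot-bracket structure.

    Assumption: '.' marks an unpaired position and any other symbol
    (including pseudoknot brackets such as '[' and ']') marks a paired one.
    """
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")

    total = Counter()   # occurrences of each nucleotide
    paired = Counter()  # occurrences of each nucleotide at paired positions
    for nt, symbol in zip(seq.upper(), dot_bracket):
        total[nt] += 1
        if symbol != ".":
            paired[nt] += 1

    frac_paired = {nt: paired[nt] / total[nt] for nt in total}

    n_unpaired = sum(total[nt] - paired[nt] for nt in total)
    unpaired_share = {
        nt: (total[nt] - paired[nt]) / n_unpaired if n_unpaired else 0.0
        for nt in total
    }
    return frac_paired, unpaired_share


# Toy hairpin (hypothetical): a 4-bp stem with an adenosine-rich loop and tail.
toy_seq = "GGACAAAAGUCCAA"
toy_structure = "((((....)))).."
frac_paired, unpaired_share = pairing_stats(toy_seq, toy_structure)
for nt in sorted(frac_paired):
    print(nt, round(frac_paired[nt], 2), round(unpaired_share[nt], 2))
```

Applied to a full genome model, the first dictionary corresponds to statements such as "one-third of adenosines are base-paired" and the second to "55% of the unpaired nucleotides" being adenosines.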
The work in the third chapter examines the regulation of splicing by RNA secondary structure in HIV-1NL4-3 mRNA transcripts. I evaluate the importance of an evolutionarily conserved stem-loop structure whose pairing interactions at the base of the stem are conserved between the HIV-1NL4-3 and SIVmac239 genomic RNA structures. Mutations that disrupt this pairing interaction while leaving the surrounding ESE sequences and the encoded amino acid sequence intact were introduced into the HIV-1NL4-3 genome, creating the mutant virus SLSA1m. In a virus coculture assay, the wild-type virus outcompeted SLSA1m by a small margin. Separately, the mutant and wild-type viruses were passaged in cells, and the mRNA profiles from these cells showed a difference in splicing pattern. Taken together, these data indicate that disruption of this stem structure reduces the fitness of SLSA1m and changes splice site usage. To examine other splicing regulatory features in the context of entire fully spliced and partially spliced transcripts, I performed SHAPE analysis on in vitro transcribed RNAs representing the most abundant spliced mRNA species for all of the viral proteins. These structures maintain known motifs around splice sites SD1, SA2, SA3, and SA7, but with slightly altered conformations, emphasizing the importance of analyzing these structures and pairing interactions in a whole-molecule context. I also observed maintenance of some previously unreported structures around the known SRE sequence at SA4c/a/b, SA5, and SD4, implying a role for RNA structure in the regulation of splicing in this region. Many of these known and newly identified structures are preserved even after splicing events excise large regions of sequence; however, some structures are altered by the initial splicing events. This leads to the conclusion that most RNA regulatory structures affecting HIV-1 splicing are formed through local interactions, which makes them robust to large sequence changes or deletions and keeps them intact in the mRNA after the initial splicing event, whereas some structures are altered in ways that modify the occurrence of downstream splicing events.
In the fourth chapter, I summarize the results of my thesis work and discuss areas where further research is needed.
CHAPTER 2
COMPARISON OF SIV AND HIV-1 GENOMIC RNA STRUCTURES REVEALS IMPACT OF SEQUENCE EVOLUTION ON CONSERVED AND NON-CONSERVED STRUCTURAL MOTIFS¹