3 Materials and Methods
3.22 Molecular Modelling
Hydrogen atoms for both the protein and the RNA were added to the PDB files using XLEAP. Na+ ions were also added to give an overall neutral charge (all protein/RNA systems modelled otherwise carried a net negative charge due to the RNA phosphate backbone). TIP3P water was used as an explicit solvent, and was added in a truncated octahedron geometry to a distance of 9 Å around the protein. Energy minimisation was carried out in three stages using the SANDER module of the AMBER program. The first stage minimised the water molecules only, by applying a restraint mask with a force constant of 200 to all other atoms in the system. The second stage minimised the water and the Na+ ions, by restricting the restraint mask to the atoms of the protein and RNA only. The third stage minimised the entire system, with no restraint mask. Each stage comprised 50 cycles of steepest descent minimisation, followed by up to 5000 cycles of conjugate gradient minimisation.
Molecular dynamics simulations consisted of an initial 10 ps step at 100 K, followed by a temperature ramp to 300 K over a further 10 ps with a restraint mask of weight 200 applied to the protein and nucleic acid. The force constant of the restraint mask was then reduced in seven stages (100, 50, 25, 10, 5, 2, 1) of 10 ps each. The final step of the molecular dynamics simulation was run at 300 K with no restraint mask for a minimum of 1 ns. The resulting models were viewed using PyMOL.
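The staged protocol described above can be summarised as a small data structure. The following Python sketch is illustrative only and is not the set of input files used in this work. The `&cntrl` keywords (`imin`, `ncyc`, `maxcyc`, `ntr`, `restraint_wt`, `restraintmask`) are standard SANDER options, but the mask strings (which assume the residue names `WAT` and `Na+`), the reading of "50 steepest descent cycles plus up to 5000 conjugate gradient cycles" as `ncyc=50, maxcyc=5050`, the application of the weight-200 restraint to the initial 100 K step, and all Python names are assumptions.

```python
from typing import List, Optional, Tuple

# Restraint masks are ASSUMPTIONS: residue names depend on the force field.
MIN_STAGES: List[Tuple[str, Optional[str]]] = [
    ("water only", "!:WAT"),          # restrain everything except water
    ("water and ions", "!:WAT,Na+"),  # restrain protein and RNA only
    ("whole system", None),           # no restraint mask
]


def minimisation_input(title: str, mask: Optional[str],
                       weight: float = 200.0) -> str:
    """Render a SANDER-style minimisation input for one stage."""
    lines = [
        title,
        " &cntrl",
        # ncyc steepest descent cycles first, conjugate gradient afterwards;
        # maxcyc is the total, so 50 + 5000 = 5050 (assumed reading).
        "  imin=1, ncyc=50, maxcyc=5050,",
    ]
    if mask is not None:
        lines.append(
            f"  ntr=1, restraint_wt={weight}, restraintmask='{mask}',"
        )
    lines.append(" /")
    return "\n".join(lines)


def md_schedule(production_ps: float = 1000.0
                ) -> List[Tuple[float, float, float, float]]:
    """Return (start K, end K, length in ps, restraint weight) per MD stage."""
    stages = [
        (100.0, 100.0, 10.0, 200.0),  # initial step at 100 K (weight assumed)
        (100.0, 300.0, 10.0, 200.0),  # temperature ramp to 300 K
    ]
    for weight in (100.0, 50.0, 25.0, 10.0, 5.0, 2.0, 1.0):
        stages.append((300.0, 300.0, 10.0, weight))
    stages.append((300.0, 300.0, production_ps, 0.0))  # unrestrained run
    return stages


if __name__ == "__main__":
    for title, mask in MIN_STAGES:
        print(minimisation_input(title, mask))
    for stage in md_schedule():
        print(stage)
```

Under these assumptions the restrained equilibration accounts for 90 ps (two 10 ps steps plus seven 10 ps restraint stages) before the unrestrained production run begins.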
Three crystal structures were used as starting points for constructing models: the structure of RRM3 in complex with the RNA sequence UGUGUG by Tsuda et al. (PDB ID: 2RQC), the structure of two RRM1 proteins in complex with the sequence GUUGUUUUGUUU by Teplova et al. (PDB ID: 3NNH), and the structure of the two N-terminal RRMs with RRM2 bound to the RNA sequence GUUGUUUUGUUU, also by Teplova et al. (PDB ID: 3NMR).
The first model to be constructed was of the two N-terminal domains (the t187 construct) in complex with the RNA sequence UGUUUUGU. The RNA in the structure of bound RRM1 (3NNH) was truncated to the sequence UUGU, leaving a single RRM1 protein bound to a UGU site. The second RRM1 protein was removed. The residues of the remaining RRM1 protein were then superimposed onto the RRM1 section of the protein in structure 3NMR, minimising RMSD for the protein. Replacing the atom coordinates of the RRM1 section of the protein in structure 3NMR with those from the superimposed RRM1 from structure 3NNH combined the two structures, resulting in a single protein of residues 14 – 187 of CELF1, with a fragment of RNA containing a UGU(U) site bound to each domain. Residues 1 – 13 are not included in any crystal structure, and since they were not believed to be involved in RNA binding they were omitted. The RNA molecule bound to RRM2 in this structure was truncated to the core UGUU sequence in contact with the protein.
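The superposition step can be illustrated with the Kabsch algorithm, a standard least-squares method for finding the rigid-body rotation and translation that minimise the RMSD between two sets of equivalent atoms. The software used for superposition is not named in this section, so the sketch below is a stand-in rather than the procedure actually followed. It assumes NumPy is available and that the two coordinate arrays list equivalent atoms in the same order; all function names are hypothetical.

```python
import numpy as np


def kabsch_superpose(mobile: np.ndarray, target: np.ndarray):
    """Return (R, t) such that mobile @ R.T + t best fits target.

    Both arrays have shape (N, 3) and list equivalent atoms in the same order.
    """
    mob_centre = mobile.mean(axis=0)
    tgt_centre = target.mean(axis=0)
    P = mobile - mob_centre
    Q = target - tgt_centre
    H = P.T @ Q                          # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = -1.0 if np.linalg.det(Vt.T @ U.T) < 0 else 1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_centre - R @ mob_centre
    return R, t


def apply_transform(coords: np.ndarray, R: np.ndarray,
                    t: np.ndarray) -> np.ndarray:
    """Apply the rotation and translation to every row of coords."""
    return coords @ R.T + t


def rmsd(a: np.ndarray, b: np.ndarray) -> float:
    """Root-mean-square deviation between two (N, 3) coordinate sets."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))
```

In the context described above, the transformation would be fitted on the RRM1 atoms common to 3NNH and 3NMR and then applied to every atom of the 3NNH fragment, including its bound RNA, so that the RNA is carried along with the protein.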
The RNA fragments UGUU (bound to RRM2) and UUGU (bound to RRM1) were then connected to form the RNA UGUUUUGU. Due to the orientation of the two domains in the original 3NMR structure, this required the introduction of an implausibly long bond connecting the two RNA fragments in the initial structure. The energy minimisation procedure detailed earlier allowed the protein domains and RNA to shift until this bond had relaxed to a normal length. Molecular dynamics simulations were then conducted from this starting point, using the temperature ramps specified earlier.
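Whether the connecting bond has relaxed can be checked by measuring the O3’–P distance across the junction before and after minimisation. The sketch below is a hypothetical helper, not part of the original workflow. The reference length of about 1.6 Å is an approximate textbook value for a phosphodiester O3’–P bond, and the 0.2 Å tolerance is an arbitrary illustrative choice.

```python
import math
from typing import Sequence

# Approximate O3'-P phosphodiester bond length in Angstrom (textbook value).
TYPICAL_O3P_P = 1.6


def linkage_length(o3_prime: Sequence[float],
                   phosphorus: Sequence[float]) -> float:
    """Distance between the O3' of one nucleotide and the P of the next."""
    return math.dist(o3_prime, phosphorus)


def is_stretched(length: float, tolerance: float = 0.2) -> bool:
    """True if the linkage is longer than the typical value plus tolerance.

    The tolerance is an ASSUMPTION chosen for illustration only.
    """
    return length > TYPICAL_O3P_P + tolerance


if __name__ == "__main__":
    # Hypothetical coordinates for the two atoms either side of the junction.
    gap = linkage_length((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
    print(f"O3'-P distance: {gap:.2f} A, stretched: {is_stretched(gap)}")
```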
This structure served as the N-terminal fragment for later models containing all three RRMs of the protein. Since no structural data exists for the RRM2 – RRM3 linker region, a section of the RRM123 protein consisting of residues 186 – 216 was constructed in the program XLEAP, and then energy minimised in AMBER. This linker was then attached to the N-terminal fragment by superimposing residues 186 and 187 of the linker onto their counterparts in RRM2 of the N-terminal fragment (minimising RMSD). The C-terminal fragment was similarly attached by superimposing residues 215 and 216 of the RRM3 structure onto the corresponding residues in the RRM2 – RRM3 linker. Combining the atom coordinates from the PDB files resulted in a complete RRM123 protein with two separate RNA fragments, UGUUUUGU bound to the N-terminal domains and UGUGUG to RRM3. This model was energy minimised, and underwent molecular dynamics simulations in AMBER for 1 ns at 300 K.
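Combining the coordinates of already superimposed fragments amounts to concatenating their PDB `ATOM` records while keeping only one copy of each overlapping residue. The sketch below shows one possible way to do this and is not the procedure actually used. It relies on the fixed-column PDB layout (atom serial in columns 7 to 11, chain ID in column 22, residue number in columns 23 to 26), and it assumes that the fragments are supplied in sequence order, that the copy from the earlier fragment is the one to keep, and that the protein and RNA carry different chain IDs. The function name is hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def merge_fragments(fragments: Iterable[List[str]]) -> List[str]:
    """Concatenate ATOM/HETATM records from several fragments.

    A residue already supplied by an earlier fragment is skipped, and
    atom serial numbers are renumbered from 1.
    """
    seen: Set[Tuple[str, int]] = set()
    merged: List[str] = []
    serial = 0
    for records in fragments:
        new_here: Set[Tuple[str, int]] = set()
        for line in records:
            if not line.startswith(("ATOM", "HETATM")):
                continue
            key = (line[21], int(line[22:26]))  # chain ID, residue number
            if key in seen:
                continue  # overlap residue: keep the earlier copy (assumed)
            new_here.add(key)
            serial += 1
            merged.append(f"{line[:6]}{serial:5d}{line[11:]}")
        # Register residues only after the fragment is finished, so that
        # every atom of a newly added residue is retained.
        seen |= new_here
    return merged
```

For the model described above, the fragments would be passed in the order N-terminal fragment, linker, RRM3 structure, so that residues 186 and 187 come from the N-terminal fragment and residues 215 and 216 from the linker.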
To produce the complex with the EDEN-2U/4U RNA, the fragment bound to RRM3 was truncated to the UGU site. A section of RNA consisting of the sequence UUUUUU was constructed in XLEAP, energy minimised in AMBER, and its 3’ uracil superimposed onto the 5’ uracil of this UGU site. The 5’ uracil of this RNA spacer was then superimposed onto the 3’ U of the N-terminal bound UGUUUUGU fragment, which necessitated the introduction of implausibly long bonds into the RNA backbone to make this connection. This structure, corresponding to RRM123 in complex with EDEN-2U/4U with the domains arranged in the order 2 – 1 – 3, was energy minimised over five steps, with a restraint mask applied to the protein and RNA. The force constant of this restraint mask was reduced stepwise (200, 100, 50, 10 and 1), allowing all bonds to relax slowly to normal lengths. The structure was then subjected to molecular dynamics simulations in AMBER for 2 ns. A model of the same complex, but with the RRMs in the order 3 – 2 – 1, was also constructed by the same method.
Models of RRM123 in complex with the EDEN-2U/HL RNA sequence were also constructed. The RNA hairpin (consisting of the sequence UCCCGAGGACGGGU folded to form hydrogen bonds between the bases in italics) was constructed in XLEAP, and energy minimised. The 5’ and 3’ uracils were then superimposed onto U9 and U12 respectively of EDEN-2U/4U in the complex of this RNA sequence with RRM123. U10 and U11 of the EDEN-2U/4U sequence were then deleted, leaving an overall RNA sequence matching EDEN-2U/HL. The model then underwent energy minimisation and a 2 ns molecular dynamics simulation.