1.3 Proteins: Modelling and Applications

1.3.7 Applications

We end this section with a discussion of protein model applications and the results derived from their study. A final application, that of protein structure prediction, is deferred until later.

From the earliest lattice models, computational simulation has been used to study the process of protein folding, the characterization of folding pathways and the dynamics and thermodynamics of globular proteins.

Recent work by the Shaw group has shown the viability of using MD with AA models to fold small proteins and probe protein folding pathways to atomic-level accuracy (51), although special-purpose hardware was required. However, it is important to remember that AA protein force fields have been optimized for proteins in their native state, and it is unknown how well they model unfolded chains (129). For example, researchers have studied the accuracy of AA force fields when modelling small peptides such as alanine dipeptide (130), trialanine (131) and the five-residue peptide Met-Enkephalin (132); these are small systems whose peptide bonds are believed to behave reasonably similarly to those found in unfolded proteins. Results have been compared with other AA force fields, QM calculations or experimental work, with mixed outcomes: good agreement between different methods and force fields is often found (131), but this is not always the case (130).

Alongside protein folding studies, researchers have used protein models to study the behaviour of proteins unfolding. For example, Dudko et al. have used a Gō model to study the behaviour of proteins unfolding by the application of force (90). They show how the choice of reaction co-ordinate is crucial in understanding the energetic barriers to unfolding. These simulations can be directly compared to experiments in which single proteins are pulled apart by the application of force (133). Protein unfolding is a much faster process than folding and hence is also accessible to AA models. For example, Li and Daggett characterized the (unfolding) transition state of chymotrypsin inhibitor 2 (84), and their later work compared simulations and experiments of the unfolding of the Engrailed Homeodomain protein (85).

The thermodynamics of proteins has also been extensively studied with protein models. The heat capacity of proteins can be measured experimentally, so its calculation in silico is highly desirable, although it is very challenging to compute. For example, Yeh et al. calculate the heat capacity for an SH3 domain, starting the simulation from the crystal structure (134), and Lee and Olson calculate the heat capacity for the Trp-cage (135). The position of the peak in the heat capacity curve corresponds to the temperature at which the protein unfolds, and alternatives to the heat capacity, such as the proportion of native contacts (Q) and even the radius of gyration (Rg), are often reported. It is also common to present free energy landscapes projected onto suitably chosen reaction co-ordinates. For example, Shea et al. present the free energy surface as a function of Q and Rg for an SH3 domain (136), and Zhou studies the effect of explicit and implicit water on the free energy surface of a β-hairpin using Rg and the number of H-bonds as the reaction co-ordinates (137).
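The quantities discussed above can be illustrated with a minimal Python sketch. It is not taken from any of the cited studies: the function names are hypothetical, Q is computed with a simple distance cutoff (one of several definitions in use), all sites are given equal weight in Rg, the heat capacity uses the standard canonical-ensemble fluctuation formula, and the choice of kcal/mol units for the Boltzmann constant is an assumption.

```python
import math
from collections import Counter

# Boltzmann constant in kcal/(mol K); the unit system is an assumption of this sketch.
K_B = 0.0019872041


def radius_of_gyration(coords):
    """Rg of equally weighted sites: RMS distance from their centroid."""
    n = len(coords)
    centroid = [sum(c[i] for c in coords) / n for i in range(3)]
    sq = sum(sum((c[i] - centroid[i]) ** 2 for i in range(3)) for c in coords)
    return math.sqrt(sq / n)


def fraction_native_contacts(coords, native_pairs, cutoff):
    """Q: fraction of native pairs (i, j) whose separation is within `cutoff`."""
    formed = sum(
        1 for i, j in native_pairs if math.dist(coords[i], coords[j]) <= cutoff
    )
    return formed / len(native_pairs)


def heat_capacity(energies, temperature, k_b=K_B):
    """C_v = (<E^2> - <E>^2) / (k_B T^2) from energies sampled at fixed T."""
    n = len(energies)
    mean = sum(energies) / n
    variance = sum((e - mean) ** 2 for e in energies) / n
    return variance / (k_b * temperature ** 2)


def free_energy_profile(values, temperature, bin_width, k_b=K_B):
    """F(x) = -k_B T ln P(x), with P(x) estimated from a histogram of samples.

    Returns a dict mapping the lower edge of each occupied bin to F.
    """
    counts = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    return {
        b * bin_width: -k_b * temperature * math.log(c / total)
        for b, c in sorted(counts.items())
    }
```

In practice one would evaluate `heat_capacity` from simulations at a series of temperatures and locate the peak, and pass per-frame values of Q or Rg to `free_energy_profile`. A two-dimensional surface in (Q, Rg) follows the same idea with a two-dimensional histogram.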

The potential energy landscapes and folding funnels of proteins have also been studied and visualised using disconnectivity graphs and, later, scaled disconnectivity graphs (138, 139). Koga and Takada have used protein models to perform in silico mutation analysis, mimicking experiments in order to study the mechanism of the rotary motor F1-ATPase (140).

Protein models have also been used to study peptide aggregation, one of the processes known to be heavily involved in diseases such as Alzheimer’s. Nguyen and Hall studied the sensitivity of fibrillization to temperature and peptide concentration (96). Their simulations provided evidence for the nucleated fibrillization hypothesis: an ordered nucleus is formed from a small amorphous aggregate, and this is then followed by rapid fibril formation. Fawzi and coworkers use a CG model of the Alzheimer’s Aβ1−40 peptide in order to study the propensity of different protofibril seeds to form full fibrils, their patterns of growth and their level of stability (141).

Since the 1970s there has been success using protein models to study protein-protein docking (142), that is, building models of proteins known to interact and using the models to predict the interaction site and relative orientations of the molecules. Initial models treated the molecules as rigid bodies but, with the increase in computational power, more recent models allow for flexible docking. There has also been success combining models with experimental results, improving our understanding of protein docking (143). See the special review issue of Proteins: Structure, Function and Bioinformatics (144) for details of recent progress in this field.

Another application of protein models is in protein design. Using computational techniques, Kuhlman et al. successfully designed a 93-residue protein, Top7, which was shown experimentally to be stable and to have the designed tertiary structure (145). The Mayo lab have been at the forefront of computational protein design (146, 147) and they have incorporated protein models and computational design into directed evolution experimental pipelines. Directed evolution is an experimental approach to designing proteins. For example, to improve binding affinity to a specific ligand, a library of sequences is taken and random mutations are performed, and the sequences with the highest affinity are kept for the next iteration. Protein models can be used as a filter for desirable sequences before in vitro experiments (148).

Finally, one of the main criticisms of CG models is that they may not be sufficiently accurate and therefore that the conclusions drawn from the models may not be relevant to real systems. It is therefore important to compare results to experiment (97, 149) or, failing that, to all-atom models of the same system (107, 150). However, where this comparison is not shown or is very general, the results can only provide insights into possible or qualitative behaviours of the system. Due to the increasing amount of experimental data, the increase in computing resources and the maturing nature of the field, recent CG modelling work is, generally, more able to compare to experimental or AA work than it was in the past.

In recent years, CG models have improved our understanding of biophysical systems. As computational power increases, it will become possible to study larger systems using AA models (51), and for CG models to maintain their utility, new models will have to be developed which study larger, even mesoscale, systems (151, 152). Further work is also required to improve the transferability of CG models and the inference of their parameters. Novel ways of using CG models, for example by running hybrid CG-AA models (153), may also be an interesting line of enquiry.