
To answer biological questions using computational methods, it is important to represent the enzyme reaction mechanism quantitatively. The information available in MACiE is very useful for representing enzyme reactions not only at the level of the overall reaction but also at the mechanistic level. One approach is to represent the data as a fingerprint [20]; another is to use more sophisticated bioinformatics or statistical methods to estimate the similarities between reactions [20,33].
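
As a minimal sketch of the fingerprint idea, the snippet below stores each reaction as the set of features that are switched on and compares two reactions with the Tanimoto (Jaccard) coefficient. The feature labels and the choice of the Tanimoto coefficient are assumptions made for illustration; they are not taken from MACiE or from the methods of [20,33].

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints stored as sets of 'on' features."""
    if not fp_a and not fp_b:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical mechanistic features for two reactions (illustrative labels only).
reaction_1 = {"proton_transfer", "nucleophilic_addition", "bond_cleavage:C-N"}
reaction_2 = {"proton_transfer", "nucleophilic_addition", "bond_cleavage:C-O"}

# Two shared features out of four distinct ones gives 0.5.
similarity = tanimoto(reaction_1, reaction_2)
print(similarity)
```

A set-based fingerprint keeps the example short; a fixed-length bit vector would give the same coefficient.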

To investigate the enzyme reactions of the hydrolase family (EC 3.b.c.d), Oliver Sacher [34] represented the data by combining active-site and physicochemical effects on the chemical reaction. That work showed that the physicochemical properties overall compare well with the EC system. Enzyme reactions can also be represented by the elementary steps of the reaction, such as bond breakage, in addition to their physicochemical effects [35,36].

Again, using MOLMAP descriptors that define the difference between the products and the reactants, taken from the KEGG database [35], an investigation was carried out to automate the assignment of EC numbers using the Random Forest decision-making algorithm.
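
The idea of a descriptor defined as the difference between products and reactants can be sketched as follows. This is not the MOLMAP procedure: simple bond-type counts stand in for the real descriptors, and the molecules and counts are a hand-written toy example (amide hydrolysis, with C-C and C-H bonds ignored), all of which are assumptions for illustration.

```python
from collections import Counter


def bond_counts(molecules):
    """Sum bond-type counts over a list of molecules (each a dict: bond type -> count)."""
    total = Counter()
    for mol in molecules:
        total.update(mol)  # Counter.update adds counts rather than replacing them
    return total


def reaction_descriptor(reactants, products):
    """Products minus reactants, keeping only the bond types whose count changes."""
    r, p = bond_counts(reactants), bond_counts(products)
    keys = sorted(set(r) | set(p))
    return {k: p[k] - r[k] for k in keys if p[k] != r[k]}


# Toy example: acetamide + water -> acetic acid + ammonia.
acetamide = {"C-N": 1, "C=O": 1, "N-H": 2}
water = {"O-H": 2}
acetic_acid = {"C-O": 1, "C=O": 1, "O-H": 1}
ammonia = {"N-H": 3}

descriptor = reaction_descriptor([acetamide, water], [acetic_acid, ammonia])
print(descriptor)
```

Difference vectors of this kind, computed for many reactions, are the sort of input that a classifier such as Random Forest could be trained on to predict EC numbers; the training step is not shown here.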

Such information is essential for the annotation and association of the EC classification system to newly discovered proteins. However, one should also consider the possibility that enzymes with very similar overall reactions (the same EC number) can have quite different mechanistic steps to effect the reaction [20]. For example, MACiE records six different mechanisms for β-lactamase (M002 - Class A; M0015, M0016 & M0258 - Class B; M0210 - Class D; M0257 - Class C), all of which catalyse the same hydrolase reaction (EC 3.5.2.6). Another well-studied example is the mechanistically diverse enolase superfamily [32,37].

2.2 Evolution of Enzyme Function

Enzymes are very adaptive by nature. Through the course of evolution, the scaffold of enzymes has improved their functional, and specifically catalytic, efficiency. The main driving force for these evolutionary advances in enzymes is the requirement of natural selection to improve molecular and cellular function, increasingly optimising both catalytic capability and regulation.

The ancestral proteins are thought to have had very broad specificity and to have performed multiple functions. Over time, these enzymes evolved exquisite catalytic properties to effect the reaction [10]. Broadly speaking, during evolution some enzyme superfamilies exhibit divergent catalytic activity, whereas others retain common mechanistic features, such as a cofactor, mechanistic step or strategy, as a starting point from which to evolve a new function. At some point, these features are shared during evolution to improve the quality of function.

These evolutionary strategies can be grouped into two scenarios, distinguished by which property, chemistry or substrate, is retained as enzyme function evolves. The scenario in which enzymes share similar catalytic residues but perform dissimilar catalysis is called the ‘chemistry-driven scenario’. Conversely, in the ‘substrate-driven scenario’, different catalytic residues are recruited to yield the same required product [38,39]. Pairs of enzymes in tryptophan and histidine biosynthesis provide two examples of substrate-driven evolution. An increasing understanding of chemical mechanism and of the role of active-site features will continue to enrich our understanding of molecular evolution.

Such scenarios have suggested the possibility of proteins sharing a common function with completely different structures. One striking example is the Ser-His-Asp catalytic triad [40], which is found in a number of folds that have no significant sequence or structural similarity. Another example is the functional convergence found in antifreeze proteins (AFPs, also known as thermal hysteresis proteins) [41]: they have dissimilar sequences in plants and fish but perform the same function, producing a difference between the freezing and melting points by depressing the non-equilibrium freezing point.

This phenomenon is quite common and often occurs to preserve the overall function of the protein [42]. Understanding how enzyme function evolved is vital for gaining insight into annotation, function prediction, and protein engineering [38, 43]. It is also one of the most intriguing problems in molecular biology: understanding the vast diversity of protein function.

2.2.1 More Definitions Suggesting Evolution Strategy

Evolutionary evidence supports the idea that the computational representation of enzyme function should include the structural elements which deliver catalytic ability. This is especially so in cases where enzymes perform different overall functions by utilising similar mechanistic steps. Such understanding will aid our ability to predict the function of newly sequenced enzymes and support efforts to engineer new functions into existing enzymes.

As mentioned in the previous section, one reason for the false assignment of function to a novel enzyme is the existence of mechanistically diverse enzymes. A mechanistically diverse enzyme superfamily [44–46] is a set of enzymes that utilise common mechanistic attributes, such as mechanistic steps, to catalyse different reactions. An example of this scenario is the pair phosphoglucomutase (MACiE: M0194; EC 5.4.2.8) and phosphonoacetaldehyde hydrolase (MACiE: M0181; EC 3.11.1.1). Another well-studied example is the pentein superfamily (CATH 2.60.40.1700) [47], a group of functionally diverse proteins that share a β/α structural fold and that includes enzymes that modify guanidines. The enzymes in this superfamily participate in diverse biological roles, including gene regulation, translation and signalling. Assigning structure and function to penteins is difficult because of the low sequence similarity between members of this superfamily.

Another definition that is preferred for classification of enzyme function is ‘functionally distinct enzymes’. ‘Functionally distinct enzymes’ are groups of divergently evolved enzymes which perform different overall reactions and for which no common mechanistic steps are found to complete the reaction [45, 48].
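
The two definitions can be contrasted in a short sketch. It assumes that both enzymes are already known to belong to the same divergently evolved superfamily, that the overall reaction is summarised by the EC number, and that each mechanism is available as a set of step labels. The function name and the step labels are invented placeholders, not MACiE annotations.

```python
def classify_pair(ec_a: str, ec_b: str, steps_a: set, steps_b: set) -> str:
    """Label a pair of homologous enzymes using the two definitions in the text."""
    if ec_a == ec_b:
        return "same overall reaction"
    if steps_a & steps_b:
        # Different overall reactions, but at least one mechanistic step in common.
        return "mechanistically diverse"
    # Different overall reactions and no mechanistic step in common.
    return "functionally distinct"


# EC numbers as quoted in the text; the step sets are hypothetical placeholders.
label_shared = classify_pair("5.4.2.8", "3.11.1.1", {"step_1", "step_2"}, {"step_2", "step_3"})
label_none = classify_pair("5.4.2.8", "3.11.1.1", {"step_1"}, {"step_4"})
print(label_shared, "/", label_none)
```

Comparing exact step labels is the simplest possible test for a shared step; in practice a similarity threshold on step descriptions might be more appropriate.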

2.2.2 Biostatistics to Study Evolution

Understanding how enzyme function adapts is still a challenging task in molecular biology. To understand evolutionary trends in proteins, the biological data can be represented quantitatively. The quantification of the relationships between various genomic and molecular variables is termed ‘quantitative evolutionary genomics’ [49]. Quantitative evolutionary genomics has helped to clarify the dependency between structure and function [50]. For example, a log-normal distribution describes the global trend of evolutionary rates between orthologous genes [49], and a power-law-like distribution [49, 51] describes membership in paralogous gene families. A power-law-like distribution indicates that a few parts occur many times while most occur infrequently.
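
The power-law-like behaviour can be illustrated with a small sketch that counts how many families have each size and estimates the slope of the count against the size on a log-log scale. The family sizes are synthetic, constructed so that the count falls as the inverse square of the size, and an ordinary least-squares fit is used only as a crude illustration; neither is taken from [49, 51].

```python
import math
from collections import Counter


def size_distribution(family_sizes):
    """Map each family size k to the number of families that have that size."""
    return Counter(family_sizes)


def loglog_slope(distribution):
    """Least-squares slope of log(count) against log(size); needs two or more distinct sizes."""
    xs = [math.log(k) for k in distribution]
    ys = [math.log(n) for n in distribution.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Synthetic data: 64 families of size 1, 16 of size 2, 4 of size 4, 1 of size 8.
sizes = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
slope = loglog_slope(size_distribution(sizes))
print(round(slope, 6))
```

A slope close to -2 for this toy input reflects the way the data were built: many small families and very few large ones.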