BRENDA Analysis - Impact Analysis Evaluation

5.6 Impact Analysis Evaluation

5.6.2 BRENDA Analysis

We believe that the reference BRENDA data is incomplete and therefore cannot clearly show the performance of our system. Thus, we manually annotated 40 full-text PubMed documents and assessed the quality of our manually annotated data against the reference BRENDA data. Additionally, we evaluated the BRENDA data against our manually annotated corpus to demonstrate that the BRENDA data is incomplete.

Manual against reference BRENDA Data Evaluation. We assessed our 40 manually annotated documents against the reference BRENDA data; the results are summarized in Table 34. In this task, the gold standard is the BRENDA data and the test data is our manually annotated corpus.

Table 34: Manual Evaluation against BRENDA data

Precision  Recall  F-Measure
71.6%      88.9%   79.3%

As discussed earlier in Section 5.6.3, impact mentions in tables are

them. Furthermore, some erroneous entries exist in the database. These cases are reflected in the Recall section of Table 34. A precision of 71.6% against the manual annotations shows that there are mentions of impacts missing in the database.
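As a quick consistency check, the F-Measure in Table 34 can be reproduced from the reported precision and recall. This assumes the balanced F-measure (the harmonic mean of precision P and recall R), which is not stated explicitly in this section:

\[
F = \frac{2PR}{P + R} = \frac{2 \times 0.716 \times 0.889}{0.716 + 0.889} = \frac{1.273}{1.605} \approx 0.793
\]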

BRENDA against Manual Data Evaluation. The BRENDA data is also compared against the manually annotated documents; the results are shown in Table 35. In this task, our manually annotated corpus is considered as a gold standard and the BRENDA data is the test data.

Table 35: BRENDA Data Evaluation against Manual Annotations

Precision  Recall  F-Measure
79.7%      61.6%   69.5%

The recall of 61.6% against our manually annotated data indicates that many mentions of impacts are missing from the BRENDA data, and our system can help to complete the BRENDA data. The precision reflects the erroneous entries in the BRENDA data and the entries from the graphical diagrams that we do not manually annotate.
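The two evaluations in Tables 34 and 35 differ only in which data set plays the role of gold standard. The following minimal sketch illustrates that role swap under a simplifying assumption of exact-match set comparison. The record shape `(document id, mutation, impact)`, the function name `precision_recall_f1`, and all toy entries are hypothetical; the matching criteria actually used for the reported figures are not specified in this section and were presumably more lenient than exact matching.

```python
from typing import Set, Tuple

# Hypothetical record shape: (document id, mutation, impacted property)
Annotation = Tuple[str, str, str]


def precision_recall_f1(gold: Set[Annotation],
                        test: Set[Annotation]) -> Tuple[float, float, float]:
    """Score `test` against `gold` using exact-match set overlap."""
    true_positives = len(gold & test)
    precision = true_positives / len(test) if test else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Invented toy data, not taken from BRENDA or from the annotated corpus.
manual = {
    ("doc1", "A10G", "activity"),
    ("doc1", "K20R", "stability"),
    ("doc2", "D37N", "pH optimum"),
}
brenda = {
    ("doc1", "A10G", "activity"),
    ("doc2", "C38S", "activity"),
}

# Setting of Table 34: BRENDA is the gold standard, the manual corpus is the test data.
p1, r1, f1_a = precision_recall_f1(gold=brenda, test=manual)
# Setting of Table 35: the manual corpus is the gold standard, BRENDA is the test data.
p2, r2, f1_b = precision_recall_f1(gold=manual, test=brenda)

print(f"manual vs. BRENDA: P={p1:.3f} R={r1:.3f} F={f1_a:.3f}")
print(f"BRENDA vs. manual: P={p2:.3f} R={r2:.3f} F={f1_b:.3f}")
```

Under exact matching, swapping the roles simply exchanges precision and recall, which is why entries missing from BRENDA lower precision in the first setting and recall in the second.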

5.6.3 Impact Grounding

The results generated by the system can be shown to be valuable by comparing them to a publicly available gold standard. As shown earlier in the preparation of the BRENDA data (see Section 5.1.3), each mutation is retrieved together with the grounded impact. We use this data to test our grounding task. First, the system's output of all detected impacts with their associated mutations is collected; then, using the queried BRENDA data, we test whether an impact is detected for a specific mutation.
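The two-step test described above can be sketched as follows. This is only an illustration under stated assumptions: the record shape `(document id, mutation, impact)`, the helper names `index_impacts` and `check_grounding`, and the toy records are all hypothetical and do not reflect the system's actual data structures.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Key = Tuple[str, str]          # (document id, mutation)
Record = Tuple[str, str, str]  # (document id, mutation, impact)


def index_impacts(records: Iterable[Record]) -> Dict[Key, Set[str]]:
    """Step 1: collect all detected impacts per (document, mutation) pair."""
    index: Dict[Key, Set[str]] = defaultdict(set)
    for doc_id, mutation, impact in records:
        index[(doc_id, mutation)].add(impact)
    return dict(index)


def check_grounding(system: Iterable[Record],
                    reference: Iterable[Record]) -> Tuple[List[Key], List[Key]]:
    """Step 2: for each reference mutation, test whether any impact was detected."""
    system_index = index_impacts(system)
    reference_keys = sorted({(doc_id, mutation) for doc_id, mutation, _ in reference})
    hits = [key for key in reference_keys if system_index.get(key)]
    misses = [key for key in reference_keys if not system_index.get(key)]
    return hits, misses


# Invented toy records for illustration only.
system_output = [
    ("doc1", "A10G", "decreased activity"),
    ("doc1", "K20R", "increased stability"),
]
brenda_reference = [
    ("doc1", "A10G", "activity"),
    ("doc2", "C38S", "activity"),
]

hits, misses = check_grounding(system_output, brenda_reference)
print("grounded:", hits)
print("not grounded:", misses)
```

The sketch only checks that some impact is attached to each reference mutation; comparing the impact itself would need an additional matching step.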

In the following sections, we assess the performance of our system in grounding impacts against the BRENDA data and our manually annotated corpus.

A true positive link represents a correctly identified association between a correctly detected impact and its mutation. Conversely, a true negative is the correct impact associated with a false mutation. Since the grounding of impacts is highly dependent on the mutations detected in the text, we check the effectiveness of the impact extraction with respect to the results of the two mutation detection systems, Mutation Miner and MutationFinder.

System vs. BRENDA Results. We investigated the performance of the mutation-impact grounding against the curated BRENDA data, and summarize the results of Mutation Miner and MutationFinder in Table 36.

Table 36: Mutation-Impact Relation Evaluation against the reference BRENDA data

Mutation tagger  #Documents  Precision  Recall  F-Measure
Mutation Miner   100         57.5%      84.2%   68.3%
MutationFinder   100         57%        82.5%   67.4%

System vs. Manual Results. The evaluation of the semantic assignment of impacts to mutations has also been performed on the Xylanase corpus (Table 37). Here, the mutations are extracted by Mutation Miner.

Table 37: Impact Grounding Evaluation on the Xylanase corpus - Mutation Miner

Impact Grounding Evaluation - Mutation Miner
Accuracy  75.7%

The performance of our system on our manually annotated corpus of 40 documents is assessed and the results are summarized in Table 38.

Table 38: Impact Grounding Evaluation on 40 manually annotated documents

Impact Grounding Evaluation - Mutation Miner
Accuracy  71.7%

Impact Grounding Evaluation - MutationFinder

Discussion. As mentioned earlier, graphical diagrams are converted to indistinct textual blocks, and, since we do not analyse tables, mutations whose impacts are reported in tables are not grounded correctly by our system; this is reflected in the recall of our results in Table 36. For example, in the documents with PMIDs 12702265 and 15152005, the mutations are listed in tables together with their impacts on kinetic properties. There are also mutation mentions that are not detected by Mutation Miner or MutationFinder, such as K5S/K6S.

Some erroneous impacts are reported in the BRENDA data, such as H154F in document PMID 12205101 and N275E in PMID 10544015, where the reported mutations do not exist in the document.

False negatives of the impact grounding are mainly due to the use of pronominal and nominal references. For example, PMID 10955993 uses the nominal reference “all four R277 mutant proteins” to express the impact “The rate of reactivity toward oxygen was unaffected.”

Also, consider this example:

Mutation of the residues Ala-200, Leu-203 or Gly-204 decreases all kinetic parameters significantly, suggesting that these amino acids are essential for the binding of the pyrophosphate moiety of the coenzyme. [a]

[a] Excerpt from “The three zinc-containing alcohol dehydrogenases from baker's yeast, Saccharomyces cerevisiae”, PMID: 12702265

The mutations, Gly204Ala and Ala200 :Ala201Leu, are introduced earlier in the article in tables (Ala200 refers to an insertion mutation).

Expressing mutations in natural language also adds more complexity to the grounding task:

. . . the introduction of an additional mutation of F241L to this Q137M mutant again converts it into the cold-sensitive form [16]. [a]

[a] Excerpt from “Structural determinant for cold inactivation of rodent L-xylulose reductase”, PMID: 12890481

In the above example, “an additional mutation of F241L to this Q137M mutant” expresses a mutation series, F241L/Q137M, where our system fails

Document PMID 8519804 reports on the impacts of 14 mutations:

Within this collection, 14 mutants had single amino-acid changes that were divided into 4 groups: (a) amino-acid changes associated with proposed ligands to Zn2+; (b) a substitution of one of several conserved glycine residues; (c) mutations at the substrate or coenzyme binding site; (d) alterations that resulted in a change of charge near the active site. [a]

[a] Excerpt from “Functional analysis of E. coli threonine dehydrogenase by means of mutant isolation and characterization”, PMID: 8519804

Our system detects the impacts of the 14 mutants, but the BRENDA data reports only the mutation C38S, which does not even occur in the document. All these cases result in false positives, which explains our system's low precision on the BRENDA data.

In our manually annotated corpora, some impacts cannot be grounded to a specific mutation, for example:

Moreover, disulfide bonds have usually been introduced by substituting more than two amino acid residues per monomer. These mutants generally have multivalent disulfide-bond formations resulting in decreased flexibility of the quaternary structure. [a]

[a] Excerpt from “Stabilization of quaternary structure of water-soluble quinoprotein glucose dehydrogenase”, PMID: 12746550

In the above example, “substituting more than two amino acid residues per monomer” refers to ambiguous mutations that affect the flexibility of the quaternary structure; the sentence still expresses an impact, but one that cannot be grounded. Such cases are counted as erroneous groundings in our evaluation.

Ambiguous mentions of impacts are another source of false positives. Consider the following example:

Six mutant proteins (H86A/E/F/K/Q/W) were produced, purified and characterized. The six mutations reduced the affinity of XlnA towards xylan without having any major effect on the catalytic constant. All these mutations also lowered the pKa of the acid-base catalyst by 0.46-1.94 pH units. The mutations decreased the enzyme stability at 60°C by up to 95% and the transition temperature by 2.2-5.8°C. Unfolding of the protein with guanidine hydrochloride (Gdn-HCl) showed that five out of six mutations decreased the concentration required to denature 50% of the XlnA, confirming the importance of H86 for the stability of the enzyme. [a]

[a] Excerpt from “Site-directed mutagenesis study of a conserved residue in family 10 glycanases: histidine 86 of xylanase A from Streptomyces lividans”, PMID: 9681873

The excerpt states that “five out of six mutations decreased the concentration”; however, it does not specify which five mutations are meant.

Some mutations are not detected by the external mutation tagging systems. Consider the following example:

Asp37 of xylanase C was replaced with asparagine and other residues by site-directed mutagenesis. Analyses of the wild-type and mutant enzymes showed that Asp37 is important for high enzyme activity at low pH. In the case of the asparagine mutant, the optimum pH shifted to 5.0 and the maximum specific activity decreased to about 15% of that of the wild-type enzyme. [a]

[a] Excerpt from “Crystallographic and mutational analyses of an extremely acidophilic and acid-stable xylanase: biased distribution of acidic residues and importance of Asp37 for catalysis at low pH”, PMID: 9930661

The above mutation, D37N, was not detected by Mutation Miner or MutationFinder. Thus, our system grounded the impacts “the optimum pH shifted to 5.0” and “the maximum specific activity decreased to about 15%” incorrectly.

Mutation Miner does not check that the wild-type residue differs from the mutant residue and therefore detects some erroneous mutations, resulting in faulty grounding.
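A post-processing filter for this case could look like the following minimal sketch. It is not part of Mutation Miner or MutationFinder; the function names and the assumption that mentions are already normalized to the single-letter form (for example D37N) are hypothetical.

```python
import re
from typing import List

# Assumes mentions normalized to <wild-type residue><position><mutant residue>,
# using single-letter codes for the 20 standard amino acids (e.g. "D37N").
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
POINT_MUTATION = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


def is_plausible_substitution(mention: str) -> bool:
    """Accept a mention only if it parses and the two residues differ."""
    match = POINT_MUTATION.match(mention)
    return bool(match) and match.group(1) != match.group(3)


def filter_mutations(mentions: List[str]) -> List[str]:
    """Drop unparseable mentions and those with identical residues."""
    return [m for m in mentions if is_plausible_substitution(m)]


# "A200A" is an invented example of a mention with identical residues.
candidates = ["D37N", "A200A", "F241L", "Q137M"]
print(filter_mutations(candidates))
```

Applying such a check before grounding would keep impacts from being attached to mentions that cannot denote a substitution.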

In document Automated Extraction of Protein Mutation Impacts from the Biomedical Literature (Page 115-121)