

6.1 Systems biology and computational methodologies

Matrix Laboratory (MATLAB) software was the base platform for the computational methodologies used here and is therefore briefly described. All analyses, including statistical analysis, were undertaken using Minitab, GraphPad software and Excel functions.

6.1.1 MATLAB

MATLAB is a high-performance language and software environment for scientific computing. It integrates numerical computation, visualisation and programming, with formalisms expressed in mathematical notation using MATLAB commands. Toolboxes are comprehensive collections of MATLAB functions (M-files), which can use the logical Boolean operators AND, OR and NOT. Data are handled as numerical matrices: a scalar is a single number represented by a 1 x 1 matrix, and a vector is a one-dimensional array of n numbers represented by an n x 1 column vector or a 1 x n row vector. MATLAB (v.8.30) was downloaded from MathWorks Inc., available at (https://www.mathworks.co.uk/downloads/web_downloads), and used for all software extension platforms, CellNetAnalyzer and Cytoscape, both described below and in section 4, literature review.
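
The matrix shapes and Boolean operators described above can be illustrated with a minimal sketch. It is written in Python rather than MATLAB, uses nested lists as stand-ins for matrices, and the helper names (shape, logical_and, logical_or, logical_not) are hypothetical; it is not part of the analysis pipeline used in this work.

```python
def shape(matrix):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(matrix), len(matrix[0]))

# A scalar is a 1 x 1 matrix; vectors are n x 1 (column) or 1 x n (row).
scalar = [[7]]
column_vector = [[1], [0], [1]]
row_vector = [[1, 0, 1]]

def logical_and(a, b):
    """Element-wise AND of two equal-length Boolean vectors."""
    return [x and y for x, y in zip(a, b)]

def logical_or(a, b):
    """Element-wise OR of two equal-length Boolean vectors."""
    return [x or y for x, y in zip(a, b)]

def logical_not(a):
    """Element-wise NOT of a Boolean vector."""
    return [not x for x in a]

a = [True, False, True]
b = [True, True, False]
print(shape(scalar), shape(column_vector), shape(row_vector))
print(logical_and(a, b), logical_or(a, b), logical_not(a))
```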

6.1.2 Generation of all p53 models

6.1.2.1 Retrieval of protein-protein interaction evidence using the STRING database

Tian et al. (2013) previously conducted extensive screening of over 30 protein-protein interaction databases, identifying the Search Tool for the Retrieval of Interacting Genes (STRING) database as the most suitable for model generation (Tian K, PhD Thesis, University of Manchester, 2013). The STRING database is a protein-protein interaction database that integrates extensive information on protein links and interactions from four main sources, described in detail in section 4 of this report. An example screenshot of STRING interaction evidence between p53 and MDM2 is shown in image 6.

Image 6 STRING interaction evidence for p53 – MDM2 derived from the four main sources

Screenshot example of interaction evidence derived from STRING. The confidence assigned by STRING is shown as the combined score (0.999).

A total of 81,000 interactions were extracted from STRING using ULTRAEDIT, a text editor programme capable of handling large datasets, available at (http://www.ultraedit.com/products). Only interactions with a confidence score over 0.7, which STRING classes as high confidence, were considered. Key identifiers assigned by STRING were used to extract the target data. For example, the STRING identifier of p53 is ‘ENSP00000269305’. This identifier was entered into ULTRAEDIT and the data were filtered to retain only interactions that involve p53 directly. To close and complete all models, evidence was also extracted from STRING for interactions between the other nodes. For example, consider two interactions in the p53 model that link directly to p53: p53 activates CDKN1A and p53 activates PRKG1. These direct interactions are referred to as the first layer of the model. Interactions between the other nodes (for example, PRKG1 activates CDKN1A) are referred to as the second layer of the model. This was performed for all nodes in the p53 models using the same confidence score of over 0.7. The nature of each interaction (post-translational modification, activation, inhibition or binding) was also extracted. Activation and inhibition were the key interactions considered. These refer to the state of gene transcription (whether a gene is transcribed or not transcribed, respectively), or to whether a protein is activated or inactivated by other means such as protein degradation, subcellular localisation or post-translational modification.
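
The filtering described above was performed by hand in ULTRAEDIT. As a rough illustration only, the same logic can be sketched in Python. The four-column layout, the sample rows, the scores and the placeholder names GENE_X and GENE_Y are assumptions made for this sketch. Real STRING export files are laid out differently and use STRING identifiers throughout, whereas gene symbols are used here for the partner nodes to keep the example readable.

```python
import csv
import io

P53 = "ENSP00000269305"  # STRING identifier of p53, as given in the text
MIN_SCORE = 0.7          # high-confidence threshold used in the text

# Hypothetical sample data: source, target, action, combined score.
SAMPLE = (
    "source\ttarget\taction\tscore\n"
    "ENSP00000269305\tCDKN1A\tactivation\t0.999\n"
    "ENSP00000269305\tPRKG1\tactivation\t0.812\n"
    "PRKG1\tCDKN1A\tactivation\t0.745\n"
    "PRKG1\tGENE_X\tbinding\t0.910\n"
    "ENSP00000269305\tGENE_Y\tinhibition\t0.450\n"
)

def load_interactions(text):
    """Parse tab-delimited rows into (source, target, action, score) tuples."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [(r["source"], r["target"], r["action"], float(r["score"]))
            for r in reader]

def build_layers(rows, hub=P53, min_score=MIN_SCORE):
    """Split confident interactions into first and second layers."""
    confident = [r for r in rows if r[3] > min_score]
    # First layer: interactions that involve the hub (p53) directly.
    first = [r for r in confident if hub in (r[0], r[1])]
    # Nodes already in the model, excluding the hub itself.
    nodes = {n for r in first for n in (r[0], r[1])} - {hub}
    # Second layer: interactions between two nodes already in the model.
    second = [r for r in confident if r[0] in nodes and r[1] in nodes]
    return first, second

first_layer, second_layer = build_layers(load_interactions(SAMPLE))
print(len(first_layer), len(second_layer))
```

In this sketch the low-scoring GENE_Y row is dropped by the threshold, and the PRKG1 to GENE_X row is dropped because GENE_X is not itself a direct partner of p53.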

Because STRING reported several incorrect interactions (discussed in section 8, final discussion), all extracted interactions were additionally manually curated against literature evidence for model accuracy. This was undertaken initially using the STRING text mining tool, which links to PUBMED. For example, image 6.1 depicts the STRING text mining evidence between p53 and MDM2, linking to the PUBMED database. In addition to the STRING text mining tool, manual curation of interactions was performed using other literature sources: Google, Google Scholar and additional PUBMED searches. All interactions in all p53 models were further validated by the supervisor to ensure model accuracy. Each confirmed interaction was supported by at least one scientific publication, and some by more.

Image 6.1 Predicted evidence from STRING text mining application for p53-MDM2

In the above example, text mining evidence shows that p53 induces (activates) MDM2 with corresponding link to the PUBMED source.

6.1.2.2 Construction of all p53 models

All p53 models were generated by expansion of the PKT206 model generated by Tian et al. (2013). The Boolean PKT206 model was also constructed using the STRING database. Briefly, PKT206 comprises 783 interactions of inhibition, activation or ambivalent factor amongst 203 internal nodes. An input of DNA damage and two outputs, cellular senescence and apoptosis, were also considered in PKT206. For consistency, we used the same methodology as Tian et al. (2013) and extracted all protein information from the STRING database as described above. All interactions confirmed by manual curation and validation were incorporated into the relevant models (PMH260, PMH302 and Meso-PMH61). The Meso-PMH61 model was constructed by integrating nodes representing genes considered important in malignant mesothelioma, obtained from Melaiu et al. (2015). For consistency, we followed the same principle (6.1.2.1) and input all 119 genes into the STRING (v9.1) database to extract information on mesothelioma genes that interact with genes in the PMH302 model.

6.1.2.3 Addition of biological outputs and input to the p53 models

Three additional biological outputs were initially integrated into PMH260: angiogenesis, cell cycle arrest and DNA repair. An additional input, hypoxia, was further included in the PMH302 model. Depending on the biological processes that they regulate, all nodes within the relevant models were linked to these three additional outputs and one input by an edge function of inhibition, activation or ambivalent factor. This was primarily undertaken using the Gene Ontology (GO) database, available at (http://geneontology.org/). The GO database provides a list of defined gene terms covering biological processes, molecular functions and cellular components. Additional literature searches were also undertaken using PUBMED, Google and Google Scholar to confirm these interactions for model accuracy. As PKT206 did not consider the three additional outputs (DNA repair, angiogenesis and cell cycle arrest), the PKT206 internal nodes (n = 203) were also linked to their relevant outputs. All additional nodes included to generate the three larger models were also linked to the PKT206 outputs of apoptosis and cellular senescence and to the input of DNA damage.
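
One possible way to hold such a model in code is as a list of signed edges, with the inputs and outputs kept as named nodes. The Python sketch below is illustrative only. The numeric encoding (activation = 1, inhibition = -1, ambivalent = 0) and the helper names are assumptions, and apart from "p53 activates CDKN1A" the edges are placeholders chosen for the example rather than entries taken from the curated models.

```python
# Assumed numeric encoding of the three edge functions named in the text.
SIGN = {"activation": 1, "inhibition": -1, "ambivalent": 0}

INPUTS = {"DNA damage", "hypoxia"}
OUTPUTS = {"apoptosis", "cellular senescence", "angiogenesis",
           "cell cycle arrest", "DNA repair"}

# Illustrative edges only: (source, target, edge function).
EDGES = [
    ("DNA damage", "p53", "activation"),
    ("p53", "CDKN1A", "activation"),
    ("CDKN1A", "cell cycle arrest", "activation"),
    ("p53", "angiogenesis", "inhibition"),
]

def regulators(edges, target):
    """Map each node acting on `target` to the sign of its edge."""
    return {src: SIGN[kind] for src, tgt, kind in edges if tgt == target}

def unlinked_outputs(edges, outputs=OUTPUTS):
    """List outputs that no node has been linked to yet."""
    linked = {tgt for _, tgt, _ in edges}
    return sorted(outputs - linked)

print(regulators(EDGES, "cell cycle arrest"))
print(unlinked_outputs(EDGES))
```

A check such as unlinked_outputs is one simple way to confirm that every output has received at least one incoming edge after new nodes are added.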

6.1.3 Cytoscape

Cytoscape, available at (http://www.cytoscape.org), is an open source platform used with MATLAB for viewing and constructing interaction networks and biological pathways. Cytoscape allows these networks to be integrated with annotations, gene expression and other state data analyses. Cytoscape (v.3.1) was utilised for the p53 interactome models and served two main purposes: visualisation of all p53 models, and functional analysis of PKT206 by application of STSFA, described in 6.4.

6.1.3.1 Visualisation of networks

For visualisation of the p53 models, tab-delimited files were constructed in accordance with Kline et al. (2007). A new network was declared and the relevant files were imported into Cytoscape. Source interaction, interaction type and target interaction were subsequently declared, generating the visual networks for all models. For example, the source interaction may be defined as p53, the target interaction as MDM2 and the interaction type as activation. The network / model style may be modified according to the user’s requirements using the VizMapper tool.
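
A tab-delimited network file of this kind could be produced as in the Python sketch below. The column names (source, interaction, target), the column order and the function name are assumptions for illustration. Cytoscape lets the user assign the source, interaction type and target columns at import, so the exact header text is not prescribed here.

```python
import csv
import io

# Example edges taken from the interactions named in the text.
edges = [
    ("p53", "activation", "MDM2"),
    ("p53", "activation", "CDKN1A"),
    ("p53", "activation", "PRKG1"),
    ("PRKG1", "activation", "CDKN1A"),
]

def to_tab_delimited(rows):
    """Return the edges as tab-delimited text, one interaction per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["source", "interaction", "target"])  # assumed header
    writer.writerows(rows)
    return buffer.getvalue()

table = to_tab_delimited(edges)
print(table)
```

The resulting text can be saved to a file and imported into Cytoscape as a new network, with the three columns assigned during import.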
