7.1 Introduction
7.2.1 Label-free quantitative proteomics analysis
Label-free quantitative proteomics was employed in this project to study the aucubin
responses in Staphylococcus aureus. Although 2D-PAGE has significant disadvantages,
such as low loading capacity, difficulty in separating hydrophobic proteins, and low reproducibility, the initial aim of the study was to apply two-dimensional gel electrophoresis to quantify only the differentially expressed proteins in the bacteria and to detect the most abundant proteins in control and treated samples. This approach was chosen to stay within the budget assigned to the project, as MS analysis was too expensive for total protein quantitation. Unfortunately, after many trials with 2D gel electrophoresis, reproducibility could not be achieved: the gels showed extensive vertical and some horizontal streaking, even when the 2D cleanup kit was used. An example of the 2D-PAGE gels produced in this study is available in the appendices. To overcome this problem, total bacterial protein quantification was employed instead. 1D SDS-PAGE was applied for protein separation (Figure 7.1). Bands were excised with a fresh, clean scalpel blade and transferred into a fresh siliconised Eppendorf tube, followed by in-gel trypsin digestion (see chapter 3/section 3.2.4.5).
Figure 7.1: Image of 1D SDS-PAGE separation of Staphylococcus aureus (MSSA) total
proteins in two conditions: untreated cells and cells treated with aucubin.
Protein samples were solubilised using 2x Final Sample Buffer, then heated at 95 ºC for 10 min and allowed to cool to room temperature prior to loading. 20 μl of protein marker was used. Separation was performed using 12 % polyacrylamide resolving gels and a 4 % stacking gel. An SE600 Standard Dual Cooled Vertical Unit (GE Healthcare) was run at 50 V for 15 min, followed by 200 V. The gel was stained with Coomassie Brilliant Blue R-250 (Bio-Rad) and destained with destaining solution. The gel was scanned using a GS-800 Calibrated Densitometer (Bio-Rad) and analysed with Quantity One image analysis software.
A standard shotgun (bottom-up) proteomics technique was used to analyse the samples. The peptide mixture resulting from tryptic digestion was processed on an ultra-high-pressure liquid chromatography (uHPLC) system coupled with mass spectrometry (MS) (BioMicS, University of Sheffield, UK) (see chapter 3/section 3.2.4.6). A data-dependent acquisition (DDA) approach was applied in this study. The MS2 spectra produced were further analysed using Progenesis QI for proteomics software (v 4.1, with a trial license generously provided by Nonlinear Dynamics, Waters, UK). The raw data files were loaded into the software, followed by automatic alignment using the default parameters as recommended by the developer
(Di Luca et al., 2015). The software processed the raw data where each sample run was
subjected to alignment based on the LC retention time of each sample. This corrects for any drift in retention time, giving an adjusted retention time for all runs in the analysis. The sample run that yielded the most features (i.e. peptide ions) was used as the
reference run, to which the retention times of all other runs were aligned and against which peak intensities were normalised. Fractions were recombined at the end of the analysis (Figure 7.2).
Figure 7.2: Graphical views of the treated sample alignment used in Progenesis QI for
proteomics software. (A) Ion intensity map, and (B) total ion chromatogram showing the automatic alignment (green) overlaid on the reference chromatogram (magenta).
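The reference-run selection and intensity normalisation described above can be illustrated with a minimal sketch. It is not the Progenesis QI algorithm, whose internals are not described here; it assumes a single scalar factor per run, taken as the median intensity ratio to the reference over shared features. The run names, feature identifiers and intensities are hypothetical.

```python
from statistics import median

def pick_reference(runs):
    """Return the name of the run with the most detected features (peptide ions)."""
    return max(runs, key=lambda name: len(runs[name]))

def normalisation_factor(run, reference):
    """Median intensity ratio (reference / run) over features seen in both runs."""
    ratios = [reference[f] / run[f] for f in run if f in reference and run[f] > 0]
    return median(ratios)

def normalise(runs):
    """Scale every run onto the reference run with one scalar factor per run."""
    ref_name = pick_reference(runs)
    reference = runs[ref_name]
    normalised = {}
    for name, run in runs.items():
        factor = normalisation_factor(run, reference)
        normalised[name] = {f: i * factor for f, i in run.items()}
    return ref_name, normalised

# Hypothetical feature intensities, keyed by an illustrative feature id.
runs = {
    "control_1": {"f1": 100.0, "f2": 200.0, "f3": 400.0},
    "treated_1": {"f1": 50.0, "f2": 100.0},
}
ref_name, normalised = normalise(runs)
```

A median ratio is used in this sketch because it is insensitive to the small number of features that genuinely change between conditions.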
The resulting spectra were searched with the Mascot search engine (Matrix Science) against the Swiss-Prot database for peak identification, followed by protein identification and quantification. Two conditions of the total bacterial proteome were analysed: an untreated group and a treated group. Principal component analysis (PCA) was performed in Progenesis QI for proteomics software in order to observe the variation between sample groups (Figure 7.3).
Figure 7.3: Biplot of the principal component analysis (PCA) of the MSSA proteome data
sets. Control group on the left (purple), and aucubin-treated group on the right (blue).
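As an illustration of what a PCA of this kind computes, the sketch below extracts the first principal component of a small samples-by-proteins matrix by power iteration, using only the Python standard library. It is a simplified stand-in and not the Progenesis QI implementation; the abundance values are hypothetical, with the first three rows standing for control runs and the last three for treated runs.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (loading vector, sample scores) for PC1 via power iteration."""
    n, m = len(rows), len(rows[0])
    # Centre each column (protein) on its mean across samples.
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centred = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Asymmetric start vector, normalised to unit length.
    v = [float(j + 1) for j in range(m)]
    length = math.sqrt(sum(x * x for x in v))
    v = [x / length for x in v]
    for _ in range(iterations):
        # Multiply by X^T X without forming the covariance matrix.
        scores = [sum(x * w for x, w in zip(row, v)) for row in centred]
        new_v = [sum(scores[i] * centred[i][j] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in new_v))
        if norm == 0.0:
            break  # no variance along the current direction
        v = [x / norm for x in new_v]
    scores = [sum(x * w for x, w in zip(row, v)) for row in centred]
    return v, scores

# Hypothetical normalised abundances: rows are runs, columns are proteins.
abundances = [
    [10.0, 1.0, 5.0], [11.0, 1.2, 5.1], [10.5, 0.9, 4.9],   # control
    [10.0, 4.0, 5.0], [11.0, 4.2, 5.1], [10.5, 3.9, 4.9],   # treated
]
loadings, pc1_scores = first_principal_component(abundances)
```

When the two groups differ mainly in a subset of proteins, as in this toy matrix, their PC1 scores fall on opposite sides of zero, which corresponds to the left/right separation seen in a biplot.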
In this study, 827 and 795 quantifiable proteins were identified in Staphylococcus aureus
(MSSA) in the treated and control groups, respectively. Among these proteins, 743 were common to both groups, 84 proteins (10.15 %) were unique to aucubin-treated MSSA, and 52 proteins (6.54 %) were unique to the control group (Figure 7.4).
Figure 7.4: Venn diagram of proteome comparison on control sample of Staphylococcus aureus
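The counts in the Venn diagram follow directly from the two group totals and the shared set. The short sketch below reproduces that arithmetic; it assumes that each percentage is expressed relative to the total of its own group, which is inferred from the reported figures rather than stated in the text.

```python
def venn_counts(total_treated, total_control, shared):
    """Derive the unique and combined protein counts for a two-set Venn diagram."""
    unique_treated = total_treated - shared
    unique_control = total_control - shared
    return {
        "unique_treated": unique_treated,
        "unique_control": unique_control,
        "shared": shared,
        "union": unique_treated + unique_control + shared,
        # Assumption: each percentage is relative to its own group's total.
        "pct_unique_treated": 100.0 * unique_treated / total_treated,
        "pct_unique_control": 100.0 * unique_control / total_control,
    }

# Values reported in the text: 827 treated, 795 control, 743 shared.
counts = venn_counts(total_treated=827, total_control=795, shared=743)
```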
Aligned runs were analysed to create a data set containing information about all the signals identified in all sample runs. In order to compare relative protein levels between the control
and treated Staphylococcus aureus proteomes, the relative abundance of each protein was calculated
after normalisation using Progenesis QI software. This was done by averaging the integrated intensities of the most abundant peptides for each protein across all of their peptide ions. In this study, all 1622 proteins were used to normalise the samples. Data were filtered
according to a p-value ≤ 0.05 and a fold change greater than 2-fold in either direction, in order to rely on
proteins that were confidently identified. A cut-off of at least three unique peptides per protein was applied to ensure that only high-quality data contributed to the analysis. Using these criteria, only 74 proteins were differentially expressed in aucubin-treated MSSA compared with untreated cells, 47 of which were significantly upregulated and 26 significantly downregulated (the full list of up/downregulated proteins is available in the appendices).
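The three filtering criteria can be expressed as a simple predicate. The sketch below is illustrative only: the record layout, field names and accessions are hypothetical, the fold change is assumed to be a treated/control ratio, and the fold-change threshold is treated as strict (greater than 2-fold) while the p-value threshold is inclusive.

```python
from dataclasses import dataclass

@dataclass
class ProteinResult:
    accession: str        # hypothetical identifier
    p_value: float
    fold_change: float    # assumed treated / control ratio, always > 0
    unique_peptides: int

def is_differential(p, max_p=0.05, min_fold=2.0, min_peptides=3):
    """Apply the p-value, fold-change and unique-peptide cut-offs together."""
    changed = p.fold_change > min_fold or p.fold_change < 1.0 / min_fold
    return p.p_value <= max_p and changed and p.unique_peptides >= min_peptides

def direction(p):
    """Label a protein that passed the filter as up- or downregulated."""
    return "up" if p.fold_change > 1.0 else "down"

# Hypothetical results for five proteins.
results = [
    ProteinResult("PROT_A", 0.01, 3.00, 5),   # passes, upregulated
    ProteinResult("PROT_B", 0.04, 0.25, 4),   # passes, downregulated
    ProteinResult("PROT_C", 0.20, 5.00, 6),   # fails the p-value cut-off
    ProteinResult("PROT_D", 0.01, 4.00, 2),   # fails the peptide cut-off
    ProteinResult("PROT_E", 0.01, 1.50, 8),   # fails the fold-change cut-off
]
selected = {p.accession: direction(p) for p in results if is_differential(p)}
```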
Furthermore, in order to obtain a general figure of the distribution of the differentially
expressed proteins in the aucubin-treated Staphylococcus aureus (MSSA) cells, enrichment
gene ontology (GO) analysis was undertaken through PANTHER (Protein Analysis Through
Evolutionary Relationships) classification system (Huaiyu et al., 2015), using the list of the 74
differentially expressed proteins. Gene Ontology is an annotation database in which standardised terms are grouped under three main ontologies: cellular component, biological process and molecular function. The ontology terms are assigned to individual proteins in collaboration with numerous databases. The Gene Ontology annotation database can readily be used to identify overrepresentation within a protein set and so obtain initial insights into the sample
characterisation (Burge et al., 2012). The analysis classified the differentially expressed proteins
into several categories (Figure 7.5).
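Overrepresentation of a GO term in a protein list is commonly assessed with a hypergeometric (one-sided Fisher) test. The sketch below shows that calculation under stated assumptions; it is not necessarily the statistic applied by PANTHER in this study, and the background and annotation counts in the usage example are hypothetical.

```python
from math import comb

def enrichment_p_value(hits, list_size, annotated, background):
    """P(X >= hits) when list_size proteins are drawn from a background of
    `background` proteins, `annotated` of which carry the GO term."""
    upper = min(list_size, annotated)
    total = comb(background, list_size)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, list_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total

# Hypothetical example: 74 differentially expressed proteins, 40 of which
# carry a given term, against a background of 1500 proteins with 500 annotated.
p = enrichment_p_value(hits=40, list_size=74, annotated=500, background=1500)
```

A small p-value indicates that the term occurs in the list more often than expected by chance given its frequency in the background.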
According to the protein and gene annotation database available in the PANTHER classification system, the molecular function classification (Figure 7.5 A) showed that the majority of the identified up/downregulated proteins (59.6 %) were categorised as enzyme-related proteins with catalytic activity. Such proteins are involved in all aspects of cell metabolism, including the digestion of large nutrient molecules (such as proteins, carbohydrates, and fats), the conservation and transformation of chemical energy, and the construction of cellular macromolecules from smaller precursors (Stephen & Hammes, 2003).
Figure 7.5: Gene ontology (GO) of the differentially expressed proteins in MSSA. 74 proteins
were inputted into the PANTHER protein classification software, in order to determine the GO terms for (A) molecular function, (B) biological processes and (C) sub-cellular localisation. Pie charts are representations of the percentage of proteins within each classification.
Examples of these enzymes that were overexpressed in aucubin-treated MSSA are: hydrolases (enzymes catalysing the hydrolysis of a variety of bonds, such as esters, glycosides, or peptides), dehydrogenases (enzymes that oxidise a substrate by transferring hydrogen to an acceptor that is either NAD/NADP or a flavin enzyme), glycosyltransferases (enzymes that catalyse the transfer of a sugar (monosaccharide) unit from a sugar nucleotide derivative to a sugar or amino acid acceptor), transaminases (a subclass of enzymes that catalyse the transfer of an amino group from a donor (generally an amino acid) to an acceptor (generally a 2-keto acid)), and ligases (a class of enzymes that catalyse the formation of a bond between two substrate molecules, coupled with the hydrolysis of a pyrophosphate bond in ATP or a similar energy donor). This high percentage (59.6 %) of catalytic enzymes suggests that aucubin
might be involved in one or more of the metabolic processes (cycles) in Staphylococcus aureus
(MSSA) cells. Binding proteins have a total share of 21.3 %, which includes nucleic acid binding proteins such as ribosomal protein, isomerase, and G-protein. Translation regulator activity proteins represent 6.4 %, whilst structural molecule and transporter proteins each represent 4.3 % of all 74 differentially expressed proteins. Antioxidant activity and receptor activity proteins each represent only 2.1 %.
Looking at the biological process classification (Figure 7.5 B), the majority of differentially expressed proteins (67.4 %) were related to bacterial metabolic processes. A total of 19.6 % were related to cellular processes; an example is aminoacyl-tRNA synthetase (aaRS), an enzyme that attaches the appropriate amino acid
onto its tRNA to form an aminoacyl-tRNA. Biological regulation and localisation each represent 13 %
of the differentially expressed proteins. The cellular component classification (Figure 7.5 C) showed that the differentially expressed proteins fall into only two categories: 63.6 % were classified as macromolecular proteins and 36.4 % as cell part proteins. The general distribution of total bacterial
proteins in the aucubin-treated and untreated Staphylococcus aureus (MSSA) samples
is illustrated in Figures 7.6 and 7.7. Enrichment gene ontology (GO) analysis was undertaken through the PANTHER classification system and revealed that the majority of identified proteins in both groups were involved in the metabolic processes of the cells and were mainly localised in the cytoplasm. Other protein groups were distributed across all other parts of the cell, with different functions.
Figure 7.6: Gene ontology (GO) of total MSSA proteome in untreated sample. 795 proteins
were inputted into the PANTHER protein classification software, in order to determine the GO terms for (A) molecular function, (B) biological processes and (C) sub-cellular localisation. Pie charts are representations of the percentage of proteins within each classification.
Figure 7.7: Gene ontology (GO) of total MSSA proteome in treated sample. 827 proteins were
inputted into the PANTHER protein classification software, in order to determine the GO terms for (A) molecular function, (B) biological processes and (C) sub-cellular localisation. Pie charts are representations of the percentage of proteins within each classification.