Analytical proteomics - General introduction

Chapter I General introduction

I.1. Proteomics

I.1.3. Analytical proteomics

The complex nature of the proteome demands the use of different analytical technologies to obtain a global picture of the cellular state [18]. Most proteomics workflows use MS for protein profiling through a bottom-up approach. However, MS analysis of complex biological samples generates data that are very difficult to interpret, because of the enormous number of proteins present at different abundance levels, with a dynamic range greater than 5 orders of magnitude, and in different forms with diverse post-translational modifications [19]. Therefore, reducing the complexity of these protein mixtures is crucial to obtain good and reliable MS results. This is generally achieved through fractionation and separation techniques such as two-dimensional gel electrophoresis (2DE) and liquid chromatography (LC) [19-24].

I.1.3.1. Two-dimensional gel electrophoresis

Electrophoresis is defined as the movement of charged molecules in an electric field towards the oppositely charged electrode. Because of their varying charges and masses, different molecules move with different velocities and become separated into single fractions [25].

In the early years of proteomics, most studies relied on 2DE to separate proteins from complex samples [22, 26]. The technology itself, however, had been developed years earlier by three scientists working independently. In 1975, O’Farrell reported a sensitive, high-resolution technique for the analysis of complex biological samples, which he used to resolve a large number of proteins (1100) from a complex E. coli sample by combining isoelectric focusing in the first dimension with sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) in the second dimension [27]. In the same year, Klose reported protein mapping of mouse tissues with high resolution and reproducibility using a similar methodology [28], and Scheele used slab gel isoelectric focusing and gradient SDS-PAGE to characterize proteins secreted by the guinea pig pancreas [29]. The fundamental concept of this technique is the separation of complex mixtures based on two independent protein characteristics: the isoelectric point (pI) and the molecular mass (MW).

Proteins are first separated by isoelectric focusing (IEF) according to their isoelectric point, i.e. the pH at which their net charge is zero. In the second dimension, proteins are separated by SDS-PAGE according to their molecular mass [27-29].
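The two separation coordinates can be estimated from a protein sequence. The following is a minimal Python sketch, not part of the cited methods: it approximates the pI by bisection on a Henderson-Hasselbalch net-charge model and the MW from average residue masses. The pKa values are illustrative assumptions (published sets differ), the function names are hypothetical, and the model ignores post-translational modifications and structural effects.

```python
# Illustrative pKa values for ionisable groups (an assumption; published sets differ).
POSITIVE_PKA = {"Nterm": 8.0, "K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"Cterm": 3.1, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}

# Average residue masses in Da (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def net_charge(sequence, ph):
    """Net charge of an unmodified chain at a given pH (simple model)."""
    groups = ["Nterm", "Cterm"] + list(sequence)
    charge = 0.0
    for group in groups:
        if group in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[group]))
        elif group in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[group] - ph))
    return charge


def isoelectric_point(sequence):
    """pH at which the modelled net charge is zero, found by bisection."""
    low, high = 0.0, 14.0
    for _ in range(60):
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0


def average_mass(sequence):
    """Average molecular mass in Da of an unmodified chain."""
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


if __name__ == "__main__":
    peptide = "ACDEFGHIK"  # arbitrary example sequence
    print(round(isoelectric_point(peptide), 2), round(average_mass(peptide), 1))
```

Under these assumptions, each protein maps to one (pI, MW) pair, which corresponds to its approximate horizontal and vertical position on a 2D gel.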

In the first procedures developed for two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), the IEF was performed in polyacrylamide tube gels with pH gradients produced by amphoteric compounds in liquid form, the carrier ampholytes (CA). When an electric field is applied to these small ionisable molecules, they migrate according to their isoelectric point and a pH gradient is generated [30]. However, the lack of IEF reproducibility due to (i) the instability of the pH gradient; (ii) the batch-to-batch variability of the CA mixtures; and (iii) the technical difficulties in transferring the proteins from the tube gel to the second-dimension SDS-PAGE gel hampered the exchange of 2D-PAGE data between different laboratories and the widespread use of this technique [11, 19]. Some of these technical limitations were later overcome by the introduction of immobilized pH gradients (IPG) for IEF [31], which allowed higher resolution in the separation process, higher loading and buffering capacity, and improved reproducibility of the 2D maps [32]. The original IPGs used acidic or basic buffering groups (Immobiline) covalently linked to the polyacrylamide gel to generate the desired pH gradient, between pH 3 and 10 [31]. Currently, in most 2D-PAGE experiments the IEF is performed with commercially available IPG strips of variable lengths and with different pH ranges [33].

With the current 2D-PAGE methodologies, thousands of proteins can be resolved and visualized at the same time on a single 2D gel, providing a global view of the proteome at a particular time. Important information on the molecular mass and isoelectric point of the proteins can be obtained, and relative quantitation can be performed by gel comparison. In addition, proteins with post-translational modifications can be easily detected, and image analysis can be used to select specific protein spots for MS analysis. Last but not least, this is a relatively low-cost technology accessible to most proteomics laboratories [19, 33]. Nevertheless, this separation tool has some important limitations: (i) it is labor intensive; (ii) it has low throughput and reproducibility issues; (iii) most hydrophobic and membrane proteins are difficult to analyze by IEF; (iv) co-migrating spots can produce incorrect quantitative measurements; and (v) low-abundance proteins are under-represented and often cannot be detected because of the limited detection range of the staining reagents used for protein visualization [19, 33-35]. Despite these limitations, 2D-PAGE is still an essential tool for protein separation in many proteomics studies and projects [36].

The introduction of narrow-range IPGs and sample pre-fractionation techniques improved 2D-PAGE resolution and the ability to detect low-abundance proteins [33, 37]. While the wide-range IPGs (pH 3-10) provide a general overview of the proteome, narrow-range IPGs generally cover 1 pH unit per strip (e.g. pH 4-5; pH 5-6) and can substantially increase the resolution of the 2D-PAGE separations [30, 37, 38]. Sample pre-fractionation techniques are also used to improve the representation of low-abundance proteins in 2D-PAGE through the simplification of crude samples. There are different pre-fractionation methods available and their application depends on the type of sample [1, 36, 39, 40]. Subcellular fractionation [41-43], differential solubilization [44-47], chromatography techniques [48-51], continuous free-flow electrophoresis (FFE) [52-54], laser capture microdissection (LCM) [55-57], the Rotofor system [58, 59], high-abundance protein depletion [60, 61] and the Protein Equalizer™ Technology [62, 63] (disclaimer: specific company, product and equipment names are given to provide useful information; their mention does not imply recommendation or endorsement by the author) are only a few examples among the different pre-fractionation methods available. Nevertheless, it must be stressed that the pre-fractionation approaches have some reproducibility issues and may introduce additional variability into the analysis [37].

I.1.3.2. Liquid chromatography separation – shotgun proteomics

Proteomics methodologies based on liquid chromatography separations, also known as “shotgun” proteomics, were developed as an alternative to the established 2D-PAGE approach, which has important limitations in the analysis of membrane proteins, proteins with extreme pIs and low-abundance proteins [64-67]. The general procedure in shotgun proteomics is based on the enzymatic digestion of a complex mixture of proteins into a mixture of peptides, which is then separated by high-performance liquid chromatography (HPLC) and analyzed by tandem mass spectrometry (MS/MS) [68, 69]. However, the peptide mixture formed after protein digestion is so complex that it cannot be completely resolved in a single chromatography run before the MS analysis. This limitation causes reproducibility problems and reduces the number of identified proteins, because the large number of co-eluting peptides enters the mass spectrometer at a rate exceeding the rate of the MS/MS analysis [65, 68]. This problem was overcome by the introduction of multidimensional liquid chromatography (MDLC) [64-66, 68-71]. MDLC separations rely on coupling two or more chromatographic separation dimensions to increase (i) the peak capacity, i.e. the separation resolving power; (ii) the dynamic concentration range and the sensitivity to detect low-abundance proteins; and (iii) sample throughput [65, 70]. There are many MDLC approaches, with different combinations of chromatographic techniques for peptide separation, but most procedures use reversed-phase (RP) liquid chromatography as the last separation dimension. This is due to its efficiency in resolving complex mixtures of peptides and to the compatibility of the solvent systems used for peptide elution with electrospray ionization for mass spectrometry [64, 65].
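The digestion step of the general shotgun procedure can be illustrated in silico. The following minimal Python sketch assumes trypsin as the enzyme, which the text above does not specify, and applies the commonly used simplified rule of cleaving after lysine (K) or arginine (R) unless the next residue is proline (P). The function name is hypothetical, and missed cleavages are not modelled.

```python
def tryptic_digest(protein, min_length=1):
    """Split a protein sequence into peptides using a simplified trypsin rule.

    Cleaves C-terminal to K or R, except when the following residue is P.
    Peptides shorter than min_length are discarded.
    """
    peptides = []
    start = 0
    for index, residue in enumerate(protein):
        next_residue = protein[index + 1] if index + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:index + 1])
            start = index + 1
    if start < len(protein):
        # Remaining C-terminal peptide that does not end in K or R.
        peptides.append(protein[start:])
    return [peptide for peptide in peptides if len(peptide) >= min_length]


if __name__ == "__main__":
    # Arbitrary example sequence: the K followed by P is not cleaved.
    print(tryptic_digest("AAKPGRMMK"))  # ['AAKPGR', 'MMK']
```

Even this simplified rule shows why the peptide mixture is far more complex than the protein mixture: every protein yields several peptides, all of which enter the chromatographic separation.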

One of the most common MDLC approaches is the Multidimensional Protein Identification Technology (MudPIT), developed in the late 1990s by the Yates group [72-74]. This approach was successfully used to study the yeast proteome, allowing the identification of 1484 proteins in a single run [73]. Briefly, this technology is based on a two-dimensional chromatographic separation in a single biphasic microcapillary column packed with a strong cation-exchange (SCX) resin and a C18 reversed phase. In the first dimension the separation is based on the electrostatic interactions between the differently charged peptides and the SCX resin, and in the second dimension the separation occurs due to the hydrophobic interactions between the peptides and the RP packing material. First, the acidified peptide mixture is loaded onto the SCX phase and eluted with buffers of increasing salt concentration onto the RP part of the column. Then, the salts are washed off with a specific buffer, and the peptides are eluted with a reversed-phase gradient buffer directly into the MS analyzer. The end of the capillary column is tapered and is also used as the electrospray needle for peptide ionization. Finally, the MS/MS spectra obtained for the different peptides are matched by bioinformatic algorithms against mass spectrometry databases derived from the genome of the organism being studied [73, 74]. As an alternative to the described biphasic columns, triphasic columns were introduced for MudPIT analysis [75]. These columns, packed with an extra RP material prior to the SCX phase, allow the online analysis of samples with a high salt concentration without previous offline desalting.
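The step-wise logic of the biphasic separation described above can be sketched as a toy model in Python. It is only an illustration under strong assumptions: each peptide is given a hypothetical salt concentration at which it leaves the SCX phase and a hypothetical hydrophobicity score that sets its order in the RP gradient. The names `Peptide` and `mudpit_elution_order`, and all numerical values, are invented for the example and do not come from the cited works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peptide:
    sequence: str
    salt_to_elute_mM: float  # hypothetical salt level that displaces it from SCX
    hydrophobicity: float    # hypothetical score; higher elutes later from RP


def mudpit_elution_order(peptides, salt_steps_mM):
    """Return (elution order, peptides left on the SCX phase).

    For each salt step, in increasing order, the peptides displaced from the
    SCX phase move to the RP phase and then elute by increasing hydrophobicity.
    """
    remaining = list(peptides)
    order = []
    for step, salt in enumerate(sorted(salt_steps_mM), start=1):
        released = [p for p in remaining if p.salt_to_elute_mM <= salt]
        remaining = [p for p in remaining if p.salt_to_elute_mM > salt]
        for peptide in sorted(released, key=lambda p: p.hydrophobicity):
            order.append((step, peptide.sequence))
    return order, remaining


if __name__ == "__main__":
    sample = [
        Peptide("PEPTIDEA", salt_to_elute_mM=10, hydrophobicity=5.0),
        Peptide("PEPTIDEB", salt_to_elute_mM=10, hydrophobicity=2.0),
        Peptide("PEPTIDEC", salt_to_elute_mM=100, hydrophobicity=1.0),
    ]
    order, left_on_column = mudpit_elution_order(sample, salt_steps_mM=[50, 150])
    print(order)
```

In this model, two peptides released by the same salt step are still separated in the second dimension if their hydrophobicity differs, which is the reason for coupling the two phases.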

According to Wolters et al., the MudPIT technology has a dynamic range of 10 000:1 and is suitable for the analysis of a great variety of proteins: low-abundance proteins, proteins with extreme pI or MW, hydrophobic proteins and membrane proteins [74]. Nevertheless, MudPIT also has some limitations: (i) the processing time of a biological sample can be as long as 30 h; (ii) the experimental costs associated with the analysis time and the high-grade solvents used; (iii) column clogging due to sample impurities; and (iv) peptide co-elution, which hampers correct protein identification and decreases reproducibility [69, 76, 77]. Even so, MudPIT is a valuable and important tool for proteomics studies and has been used in different types of work: proteome analysis of different bacteria and plants, membrane protein studies and quantitative experiments [76].

MDLC was developed to overcome some important limitations of the 2D-PAGE methodology. However, these technologies are not competitors; instead, they are complementary tools which provide different but important information to understand complex proteome systems. Table I.1 summarizes the main differences, the advantages and the limitations of the MudPIT and 2D-PAGE methodologies.

In document Development of new methodologies in sample treatment for proteomics workflow based on enzymatic probe sonication technology and mass spectrometry (Page 39-42)