Protein digest separation using gradient UHP-RPLC-MS

CHAPTER 3: Off-line LC x UHPLC-MS separations of E coli proteins

4.2 Experimental methods

4.2.6 Protein digest separation using gradient UHP-RPLC-MS

Like the intact protein separations, the separations of protein digests were also performed using gradient ultra-high pressure reversed-phase liquid chromatography (UHP-RPLC). A second preloaded gradient UHPLC pump, identical to the one described previously, was used to inject the sample and generate the gradient. The column differed slightly from the one used for intact protein separations. It was a 50 μm ID capillary, 30 cm in length, packed with 1.5 μm bridged-ethyl hybrid silica particles modified with a C18 stationary phase. The particles have 145 Å pores, which are suited for use with smaller molecules such as peptides. The column was operated at a pressure of approximately 1,600 bar (23,000 psi); the flow rate through the column was around 150 nL/min. Since the UHPLC pump was operated at 4 μL/min, the split ratio was approximately 26:1.


followed by a 5 minute hold at 50%B, and then returned to initial conditions in 1 minute. The CapLC autosampler loaded 1 μL of sample onto the gradient storage loop for each run. Based on the split ratio, the volume of sample injected onto the column was approximately 40 nL.
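
The split-flow arithmetic quoted in this section can be checked with a short sketch. The numerical values are taken from the text; the function and variable names are illustrative only, and the split ratio is assumed here to mean pump flow divided by column flow.

```python
# Values quoted in the text; all names are illustrative.
PUMP_FLOW_NL_MIN = 4000.0    # UHPLC pump flow rate (4 uL/min)
COLUMN_FLOW_NL_MIN = 150.0   # approximate flow rate through the column
LOADED_VOLUME_NL = 1000.0    # 1 uL loaded onto the gradient storage loop


def split_ratio(pump_flow, column_flow):
    """Pump flow divided by column flow (assumed definition of the ratio)."""
    return pump_flow / column_flow


def injected_volume(loaded_volume, ratio):
    """Portion of the loaded sample that is carried onto the column."""
    return loaded_volume / ratio


ratio = split_ratio(PUMP_FLOW_NL_MIN, COLUMN_FLOW_NL_MIN)
on_column_nl = injected_volume(LOADED_VOLUME_NL, ratio)
print(f"split ratio ~ {ratio:.1f}:1, on-column volume ~ {on_column_nl:.1f} nL")
```

Under this definition the ratio is about 26.7:1 and the on-column volume about 37.5 nL, consistent with the approximate figures of 26:1 and 40 nL given above.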

4.2.6.2 MS instrumentation and run conditions

The outlet of the reversed-phase capillary column was coupled to a fused silica nano-electrospray emitter using the same Waters Universal Nanoflow Sprayer described in section 4.2.5.2. Due to the reduced flow rate as compared to the column used for protein separations, it was necessary to use a pulled-tip emitter as opposed to the blunt-ended NanoEase emitter. Thus, a fused silica Pico-Tip emitter was purchased from New Objective (Woburn, MA). This nano-electrospray emitter has an inner diameter of 20 μm, which tapers to 10 μm at the tip. It does not have an electrically conductive coating, since the voltage is still applied through the metal junction of the sprayer.

On-line positive ion mode electrospray time-of-flight MS and MS/MS were performed using a Waters Q-TOF micro. All mass spectra were acquired using a capillary voltage of +2000 V, a sample cone voltage of +30 V, and an extraction cone voltage of +2 V, with the source temperature set at 100 °C. For the first set of runs of the protein digest fractions, peptide mass fingerprinting (PMF) was used. Therefore, the instrument was operated in TOF-only mode, with the quadrupole set to allow all ions to pass to the TOF mass analyzer. Mass spectra were acquired at a frequency of 1.67 Hz over a mass range from 450-1600 Da for the duration of the runs.

The protein digest fractions were separated and analyzed a second time, using a data-directed analysis (DDA) method to obtain MS/MS spectra of the peptides as they eluted from the column. The workflow of the DDA experiment is as follows. Initially, the mass spectrometer runs a survey scan over the mass range from 450-1600 Da, in which the quadrupole is set to pass all ions to the TOF analyzer and the collision energy is set at its default low-energy value (7 V) so that ions are not fragmented. If no peaks above a threshold of 25 counts/second are detected, the instrument continues running survey scans at a frequency of 0.91 Hz. When the intensity of a peak exceeds this threshold, the instrument switches to MS/MS mode. This sets the quadrupole to allow only ions with an m/z that matches the above-threshold peak to pass to the TOF analyzer. The collision energy is increased in order to induce fragmentation, and the resulting fragment ions between 200-1800 Da are detected. In order to obtain good quality MS/MS spectra, the collision energy is stepped through four different voltages, each with a scan time lasting 1.1 seconds, including a 0.1 second inter-scan delay time. The precise values of the collision energy depend on the m/z of the precursor ion, and are set using a collision energy profile table (shown in Table 4-1). After all MS/MS scans are complete (4.4 seconds after initiating the first scan), the instrument returns to performing survey scans, and continues to do so until another peak exceeds the threshold. The instrument is set to be able to trigger MS/MS on two precursor ions simultaneously. This allows data to be obtained on two co-eluting peptides. In order to avoid acquiring redundant data, the instrument is programmed not to switch to MS/MS mode on any peaks already analyzed within the last 60 seconds. To prevent MS/MS from being triggered on isotope peaks of the same component, a mass range of +/- 2.3 Da from the main peak is also excluded for the same period of time.
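
The precursor selection rules described above can be summarized in a minimal sketch. The threshold, exclusion window, exclusion time, and scan timing are the values quoted in the text; the class, its method names, the example peaks, and the choice to consider the most intense peaks first are assumptions made for illustration and do not represent the instrument's actual control software.

```python
from dataclasses import dataclass, field

THRESHOLD_CPS = 25.0        # survey-scan trigger threshold (counts/second)
EXCLUSION_WINDOW_DA = 2.3   # +/- window excluded around an analyzed precursor
EXCLUSION_TIME_S = 60.0     # time for which an analyzed precursor is excluded
MAX_PRECURSORS = 2          # precursors that may be selected at once
N_COLLISION_STEPS = 4       # collision energy steps per precursor
STEP_TIME_S = 1.1           # scan time per step, incl. 0.1 s inter-scan delay


@dataclass
class PrecursorSelector:
    # (m/z, selection time) pairs for recently analyzed precursors
    recent: list = field(default_factory=list)

    def _excluded(self, mz, now):
        return any(
            abs(mz - old_mz) <= EXCLUSION_WINDOW_DA and now - t < EXCLUSION_TIME_S
            for old_mz, t in self.recent
        )

    def select(self, survey_peaks, now):
        """Return up to MAX_PRECURSORS m/z values to fragment.

        survey_peaks is a list of (m/z, intensity in counts/second) tuples.
        """
        # Drop exclusions that have expired.
        self.recent = [(m, t) for m, t in self.recent if now - t < EXCLUSION_TIME_S]
        chosen = []
        # Most intense peaks are considered first (an assumption of this sketch).
        for mz, intensity in sorted(survey_peaks, key=lambda p: -p[1]):
            if len(chosen) == MAX_PRECURSORS:
                break
            if intensity <= THRESHOLD_CPS or self._excluded(mz, now):
                continue
            chosen.append(mz)
            self.recent.append((mz, now))
        return chosen


msms_time_s = N_COLLISION_STEPS * STEP_TIME_S  # duration of one MS/MS cycle

selector = PrecursorSelector()
# Hypothetical survey scans: 653.3 stands in for an isotope peak of 652.3.
first = selector.select([(652.3, 80.0), (653.3, 40.0), (810.4, 30.0), (500.2, 10.0)], now=0.0)
second = selector.select([(652.3, 90.0), (810.4, 35.0), (901.5, 28.0)], now=10.0)
third = selector.select([(652.3, 50.0)], now=61.0)
print(first, second, third, msms_time_s)
```

In this example the peak at 653.3 is skipped because it falls within 2.3 Da of the precursor just selected, the two analyzed precursors are ignored 10 seconds later, and 652.3 becomes eligible again once 60 seconds have elapsed.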

4.2.7 Data analysis


peptide database searching. Then, it was possible to consider the data from the experiment as a whole, by attempting to match intact protein masses with protein identities.

4.2.7.1 Intact protein data workup

Data workup for the intact protein separations was performed using the methods described in Section 3.2.7. Maximum entropy de-convolution was performed on all mass spectra within the LC-MS chromatograms using AutoME. Parameters for the de-convolution were set as follows: the time segment width was 6 seconds, the output spectrum resolution was 1 Da, the range of masses in the output spectrum was 5,000-80,000 Da, and the maximum number of iterations was 50. 2D chromatograms and protein peak lists were both generated from the AutoME-processed data.

4.2.7.2 Protein digest data workup

LC-MS data from the runs of the trypsin-digested fractions were collected and analyzed using two different methods: peptide mass fingerprinting (PMF) and data directed analysis (DDA) MS/MS. A different data workup method is necessary for each of these methods. In order to identify proteins from the PMF data, peak lists were generated which contain the mass and charge of all components in each chromatogram above a user-defined threshold. Although this can be performed manually, typically an add-on package to MassLynx 4.0 called ProteinLynx was used to generate the peak lists. ProteinLynx uses an algorithm which detects peaks, recognizes their charge state via a de-convolution routine known as MaxEnt3, and records the information in a peak list file. Once the lists were generated, the web-based database search program Aldente (http://www.expasy.org/tools/aldente/) was used to match the masses with those of peptides that would be expected from tryptic digests of known E. coli proteins listed in the Swiss-Prot protein knowledgebase (http://us.expasy.org/sprot/). Numerous parameters can be specified to define the criteria which Aldente uses to perform the search. A detailed list of the search parameters used for this experiment is provided in Table 4-2. Based on the output of the search, a list of proteins possibly present in the sample was produced. Aldente assigns each protein a score which indicates the probability that the identification is correct.
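
The principle behind peptide mass fingerprinting can be illustrated with a minimal sketch: a protein sequence is digested in silico, the monoisotopic mass of each peptide is calculated, and observed masses are matched within a tolerance. This is not Aldente's algorithm and it performs no scoring; the sequence, the function names, and the 50 ppm tolerance are hypothetical, and missed cleavages and modifications are ignored.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the terminal H and OH
PROTON = 1.00728   # mass of the charge-carrying proton


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def neutral_mass(mz, charge):
    """Convert an observed m/z and charge state to a neutral mass."""
    return (mz - PROTON) * charge


def match_masses(observed, peptides, tol_ppm=50.0):
    """Pair each observed neutral mass with theoretical peptides within tol_ppm."""
    hits = []
    for mass in observed:
        for pep in peptides:
            theoretical = peptide_mass(pep)
            if abs(mass - theoretical) / theoretical * 1e6 <= tol_ppm:
                hits.append((mass, pep))
    return hits


peptides = tryptic_digest("AGKRPLVKGGR")           # hypothetical sequence
observed = [neutral_mass(306.71321, 2), 288.1550]  # hypothetical peak list
hits = match_masses(observed, peptides)
print(peptides, hits)
```

A search program repeats this comparison for every protein in the database and then ranks the candidate proteins, which is the step that the score reported by Aldente reflects.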

The runs performed using DDA MS/MS were analyzed using ProteinLynx Global Server 2 (PLGS2, Waters Corp.), which is a software package that automates processing of LC-MS data for proteomics. Unlike Aldente, which requires the user to generate a peak list, PLGS2 allows LC-MS/MS data acquired in MassLynx to be loaded directly into the program. Generally, data analysis using PLGS2 is performed with two components: the data preparation tool and workflow designer. The data preparation tool automates noise reduction, de-isotoping, and centroiding of the mass spectra acquired during the runs. In essence, this generates data equivalent to the peak list used by Aldente. The workflow designer allows automation of databank search queries using the processed mass spectra. A variety of parameters must be specified; they are shown in Table 4-3. As with Aldente, data were searched against the Swiss-Prot E. coli database, and the end result was a list of proteins which were identified as being present in the sample. Each protein is assigned a probability score, along with other statistics.

4.2.7.3 Comparison of intact protein and peptide data

The intact protein and peptide data analyses generated two lists: one with intact protein masses, and the other with the names of proteins identified and a predicted mass for each protein. It was then necessary to correlate one list with the other and thereby associate intact protein masses with a protein identity. In some cases, this was as simple as comparing


was apparent, but the difference between the observed mass of a protein and a peptide-predicted mass was within several hundred Daltons, the mass difference was checked against the Delta Mass database of common post-translational modifications (http://www.abrf.org/index.cfm/dm.home). If a logical match was found, it was recorded as a possible PTM of the identified protein.
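
The mass-difference check can be sketched as a simple lookup. The table below holds average mass shifts for a few well-known modifications and is only an illustrative subset, not the contents of the Delta Mass database; the function name, the example masses, and the 1 Da tolerance (chosen to mirror the 1 Da output resolution of the de-convoluted spectra) are assumptions.

```python
# Average mass shifts (Da) for a few well-known modifications.
# Illustrative subset only; not taken from the Delta Mass database.
COMMON_SHIFTS = {
    "methylation": 14.03,
    "oxidation": 16.00,
    "acetylation": 42.04,
    "phosphorylation": 79.98,
    "loss of N-terminal Met": -131.19,
}


def candidate_ptms(observed_mass, predicted_mass, tolerance_da=1.0):
    """List modifications whose mass shift explains observed - predicted."""
    delta = observed_mass - predicted_mass
    return [
        name
        for name, shift in COMMON_SHIFTS.items()
        if abs(delta - shift) <= tolerance_da
    ]


# Hypothetical intact masses compared with a sequence-predicted mass.
print(candidate_ptms(20042.0, 20000.0))
print(candidate_ptms(19869.0, 20000.0))
print(candidate_ptms(20300.0, 20000.0))
```

A returned name is only a candidate: several modifications can share similar mass shifts at this tolerance, so each match still has to be judged for plausibility, as described above.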
