Mass spectrometry based proteomics
Zsuzsanna Darula Institute of Biochemistry
November 25, 2015
„Practice-oriented, student-friendly modernization of the biomedical education for strengthening the international competitiveness of the rural Hungarian universities”
Proteomics: large‐scale study of proteins
Proteome: the complete set of proteins expressed by a cell / tissue / organism at a given time
Separation / fractionation of protein samples 1.: gel electrophoresis
1D: SDS‐PAGE (size separation, limited resolution, low‐complexity samples)
2D: IEF + SDS‐PAGE
isoelectric point + size
resolving power: a few thousand proteins/gel
limitations: membrane proteins
highly acidic / basic proteins
very small/ large proteins
16‐BAC/SDS – PAGE
size‐separation in both dimensions, limited resolution
“Native gel” + SDS‐PAGE
analysis of protein complexes: isolation of complex in 1st dimension,
Separation / fractionation of protein samples
Type | Abbr. | Principle of separation
Size exclusion chromatography | SEC | Differences in size and shape
Ion exchange chromatography | IEC | Electrostatic interactions (pKa, pKb)
Normal phase chromatography | NPC | Polar interactions
Hydrophilic interaction chromatography | HILIC | Polar interactions
Reversed phase chromatography | RPC | Dispersive interactions
Hydrophobic interaction chromatography | HIC | Dispersive interactions
Affinity chromatography | AC | Specific interactions
Identification of separated proteins 1. Edman sequencing
→ requires pure protein
→ requires free protein N‐terminus (acetylation?)
→ ~30 (max 50‐60) AA can be determined
→ sample requirements: ≥10 pmol
→ slow (~1 h / AA)
→ (near‐)isobaric AAs can be distinguished (I/L; Q/K)
[Figure: Edman degradation cycle. The reagent phenylisothiocyanate (PITC) couples to the free N‐terminus of the peptide at pH 8; treatment with H+ and heat releases the first amino acid unit as a phenylthiohydantoin amino acid‐reagent product, which is identified by HPLC; the shortened peptide (AA 2 to AA 5) enters the next cycle]
Identification of separated proteins 2.
→ sensitive
→ specific
→ (semi)quantitative
→ requires antibody
→ foreknowledge of protein of interest is required
→ expensive
Identification of separated proteins 3. Mass Spectrometry (MS)
→ Sensitive (fmol/amol range)
→ Quick
→ No antibody or external standards required
→ Amenable to mixtures, blocked peptides, modified peptides (PTM analysis)
Mass Spectrometry (MS)
Determination of m/z value of gas‐phase ions (mass‐to‐charge ratio, m: mass, z: charge)
[Figure: block diagram of a mass spectrometer: sample → ION SOURCE → MASS ANALYZER → DETECTOR → signal → spectrum (Int. vs. m/z); VACUUM SYSTEM]
Ion source
generation of gas‐phase ions
“soft” ionization techniques
MALDI: matrix assisted laser desorption ionization, singly charged ions
ESI: electrospray ionization, multiply charged ions
Mass analyzer
separation of ions according to m/z
defines performance of the mass spectrometer
Sensitivity
Resolution
Mass accuracy (absolute / relative (ppm!!!))
Linear dynamic range
Speed
Mass range
analyzers used in proteomics: Quadrupole, Ion trap, flight tube (TOF), FT‐ICR,
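The distinction between absolute and relative mass accuracy can be made concrete with a few lines of Python. This is a minimal sketch, assuming the usual definition of relative error, (measured − theoretical) / theoretical × 10^6; the function names and the example values are illustrative and do not come from the slides.

```python
def mass_error_ppm(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def absolute_error(theoretical_mz: float, ppm: float) -> float:
    """Absolute m/z error that corresponds to a given ppm error."""
    return theoretical_mz * ppm / 1e6


if __name__ == "__main__":
    # Hypothetical example: the same absolute error (0.005) at two m/z values
    # gives very different relative errors.
    for mz in (500.0, 2000.0):
        print(mz, round(mass_error_ppm(mz + 0.005, mz), 2), "ppm")
```

The same absolute deviation is a four times larger relative error at m/z 500 than at m/z 2000, which is why accuracy is normally quoted in ppm.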
[Figure: peak shapes at resolution R=500, R=1000, R=2000 and R=10000]
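What the R values in the figure mean for an isotope cluster can be estimated as follows. This is a rough sketch, assuming the common definition R = m/Δm with Δm taken as the peak width at half maximum (the slides do not state which definition is used) and a simple "width smaller than spacing" criterion; the m/z value 1567.69 is the monoisotopic example used on the isotope slide, and the function names are illustrative.

```python
def peak_width(mz: float, resolution: float) -> float:
    """Peak width (FWHM) at a given m/z, assuming R = m / delta_m."""
    return mz / resolution


def isotopes_resolved(mz: float, charge: int, resolution: float) -> bool:
    """Rough criterion: isotope peaks (spacing about 1/z) count as resolved
    when the peak width is smaller than their spacing."""
    return peak_width(mz, resolution) < 1.0 / charge


if __name__ == "__main__":
    for r in (500, 1000, 2000, 10000):
        print(r, round(peak_width(1567.69, r), 3), isotopes_resolved(1567.69, 1, r))
```

Under these assumptions a singly charged ion near m/z 1568 needs R of roughly 2000 or more before its isotope peaks separate.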
Element | Mass number | Natural occurrence %
C | 12, 13 | 99, 1
H | 1, 2 | 99.99, 0.01
O | 16, 17, 18 | 99.76, 0.04, 0.2
N | 14, 15 | 99.6, 0.4
S | 32, 33, 34, 36 | 94.93, 0.76, 4.29, 0.02
[Figure: isotope cluster at m/z 1567 to 1571 (intensity 0 to 3000). First peak, 1567.69: monoisotopic mass, only 12C, 1H, 14N, 16O (and 32S). Second peak: 1 x 13C (and …). Third peak: 2 x 13C (and …)]
Monoisotopic peak is ALWAYS the first in the isotope cluster
but not necessarily the most abundant!
[Figure: isotope clusters at m/z 300, 1800, 3000 and 5000]
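Why the monoisotopic peak stops being the tallest one for larger molecules can be illustrated with a binomial model. This is a simplified sketch that considers carbon only, with the 1% natural occurrence of 13C taken from the table above; H, N, O and S isotopes are ignored, and the carbon counts in the example are arbitrary.

```python
from math import comb

P_13C = 0.01  # natural occurrence of 13C, rounded as in the isotope table


def carbon_isotope_pattern(n_carbons: int, n_peaks: int = 4) -> list[float]:
    """Probability of finding 0, 1, 2, ... 13C atoms among n_carbons (binomial)."""
    return [
        comb(n_carbons, k) * P_13C**k * (1 - P_13C) ** (n_carbons - k)
        for k in range(n_peaks)
    ]


def most_abundant_peak(n_carbons: int) -> int:
    """Index of the tallest peak in the cluster (0 = monoisotopic peak)."""
    pattern = carbon_isotope_pattern(n_carbons)
    return pattern.index(max(pattern))


if __name__ == "__main__":
    for n in (50, 150, 250):
        print(n, [round(p, 3) for p in carbon_isotope_pattern(n)], most_abundant_peak(n))
```

In this model the peak with one 13C overtakes the monoisotopic peak once the molecule contains about 100 carbon atoms.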
Δ m/z: 1 → z=1; Δ m/z: 1/2 → z=2; Δ m/z: 1/3 → z=3
Isotope spacing enables charge determination
[Figure: FTMS full MS spectrum, m/z 519 to 532, relative abundance 0 to 100. Labeled peaks: 519.139, 520.139, 521.136, 522.136; 523.285, 523.788, 524.289; 528.303, 528.926, 529.261, 529.595, 529.929, 530.264]
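The charge determination described above can be sketched in Python. The peak values are the labels read from the spectrum; grouping them into three isotope clusters is an assumption made for this example, and the function name is illustrative.

```python
def charge_from_isotope_spacing(peaks: list[float]) -> int:
    """Estimate z from the mean spacing of consecutive isotope peaks (spacing ~ 1/z)."""
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(1.0 / mean_spacing)


if __name__ == "__main__":
    # Peak labels from the spectrum, assumed to form three isotope clusters.
    print(charge_from_isotope_spacing([528.926, 529.261, 529.595, 529.929]))
    print(charge_from_isotope_spacing([523.285, 523.788, 524.289]))
    print(charge_from_isotope_spacing([519.139, 520.139, 521.136]))
```

Averaging the spacings before rounding makes the estimate less sensitive to the small measurement error on each individual peak.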
MS analysis of proteins
Top down: intact proteins are investigated
Bottom up: peptides generated from proteins are investigated
Peptide mass fingerprinting
Tandem mass spectrometry
TOP‐DOWN APPROACH
• molecular weight determination of intact proteins
• fragmentation of intact isolated proteins (ETD, ECD, CID, HCD)
degradation products
sequence variants
combinations of post‐translational modifications
positions of disulfide bridges
• technically challenging
lower MW proteins
limited sample complexity
high resolution (R>200 000!), special instrument required
Bottom up approach
Generation of peptides from proteins
Chemically
Cyanogen bromide (Met ↓ (Trp ↓))
Acidic hydrolysis (Asp ↓, (Glu ↓))
Enzymatically
Endopeptidase | Specificity | pH‐range
Trypsin | R, K ↓ | 7.5‐9.0
Chymotrypsin | Y, F, W, L ↓ | 7.0‐9.0
Glu C | E, D ↓ | 7.5‐8.5
Asp N | ↓ D | 6.0‐8.0
Arg C | R ↓ | 7.5‐8.5
Lys C | K ↓ | 7.5‐8.5
Lys N | ↓ K | 8.5‐9.5
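The cleavage specificities in the endopeptidase table can be written as regular expressions for an in silico digestion. This is a minimal sketch covering a subset of the listed enzymes; the trypsin rule also includes the "not before Pro" exception, the test sequence is hypothetical, and missed cleavages are not modeled.

```python
import re

# Each pattern matches the (zero-width) position of the cut.
# Enzymes cutting after a residue use a look-behind, Asp N uses a look-ahead.
CLEAVAGE_RULES = {
    "Trypsin": r"(?<=[KR])(?!P)",  # after K or R, but not before P
    "Lys C": r"(?<=K)",
    "Arg C": r"(?<=R)",
    "Glu C": r"(?<=[ED])",
    "Asp N": r"(?=D)",
}


def digest(sequence: str, enzyme: str) -> list[str]:
    """Split a protein sequence at every cleavage site of the chosen enzyme."""
    pattern = CLEAVAGE_RULES[enzyme]
    # re.split accepts zero-width matches in Python 3.7+; drop empty pieces.
    return [piece for piece in re.split(pattern, sequence) if piece]


if __name__ == "__main__":
    protein = "AAKPGGRDEEK"  # hypothetical sequence
    for enzyme in CLEAVAGE_RULES:
        print(enzyme, digest(protein, enzyme))
```

Keeping the rules in one dictionary makes it easy to compare how different enzymes would cut the same sequence.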
Enzymatic protocols most often use trypsin
• Cleaves after Lys and Arg (except when the next AA is Pro)
• Provides at least one basic amino acid per peptide (facilitates ion generation for MS)
• Statistically, the size of tryptic peptides is well suited to the m/z range of analyzers (~10% of the AA content is R or K)
• Cheap, reliable, known sequence
• Modified trypsin is available commercially (↓ autolysis)
Peptide mass fingerprinting (PMF)
[Workflow: 2D gel electrophoresis → MS compatible staining → reduction → alkylation → trypsin digestion → peptide extraction → desalting → MALDI‐TOF (spectrum: Int. vs. m/z) → database search → protein list (1., 2., 3.)]
Reduction: DTT (dithiothreitol), mercaptoethanol, TCEP (tris(2‐carboxyethyl)phosphine)
MALDI Ion Source (Matrix Assisted Laser Desorption Ionization)
Commonly used matrices
DHB: 2,5‐Dihydroxybenzoic acid
CHCA: α‐Cyano‐4‐hydroxycinnamic acid
SA: Sinapinic acid
[Figure: MALDI process. The laser beam (hν) hits co‐crystallized acidic matrix and protein or peptide analyte molecules on the target plate; desorption, desolvation and ionization follow, and the ions pass to the analyzer]
TOF Analyzer (time‐of‐flight)
[Figure: TOF analyzer with laser beam, repeller plate, grid (‐ pole), detector, signal and resulting spectrum (Int. vs. m/z)]
Analytes with different m/z values (different velocity → different amount of time to reach the detector)
generated ions are accelerated by applied electric field
ions of same charge have the same kinetic energy
velocity of the ions depends on their mass‐to‐charge ratio
time to reach the detector is measured, m/z calculated
$\frac{m}{z} = \frac{2Vt^{2}}{L^{2}}$ or $t \propto \sqrt{m/z}$
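The relation between flight time and m/z can be checked numerically. This is an idealized sketch, assuming that V is the accelerating voltage and L the length of a field‐free flight path, and writing the elementary charge explicitly so that SI units work out; the 20 kV and 1 m used in the example are illustrative values, not instrument parameters from the slides.

```python
from math import sqrt

E_CHARGE = 1.602176634e-19  # elementary charge, C
DALTON = 1.66053906660e-27  # unified atomic mass unit, kg


def flight_time(mz: float, voltage: float, length: float) -> float:
    """Flight time in seconds for an ion of given m/z (Da per charge).

    From z*e*V = m*v^2/2 and t = L/v:  t = L * sqrt(m / (2*z*e*V)).
    """
    return length * sqrt(mz * DALTON / (2 * E_CHARGE * voltage))


def mz_from_time(t: float, voltage: float, length: float) -> float:
    """Inverse relation: m/z = 2*e*V*t^2 / L^2, converted to Da per charge."""
    return 2 * E_CHARGE * voltage * t**2 / (length**2 * DALTON)


if __name__ == "__main__":
    for mz in (1000.0, 4000.0):
        t = flight_time(mz, voltage=20000.0, length=1.0)
        print(mz, round(t * 1e6, 2), "microseconds")
```

A fourfold increase in m/z doubles the flight time, in line with the square‐root dependence.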
[Figure: mass spectrum, m/z 900 to 2100, a.i. 0 to 14000]
100% sequence coverage?
‐ peptide m/z out of detection range (short and long peptides)
‐ poor recovery of hydrophobic peptides (extraction from gel)
‐ loss of hydrophilic peptides (desalting)
‐ ion suppression (MS analysis)
‐ incorrect sequence in database
‐ protein processing (signal peptide, propeptide)
MALDI-TOF MS, without fractionation
PMF – SUMMARY
Protein is cleaved into peptides in a specific manner (usually
using trypsin)
The m/z value of the resulting peptides is determined using
mass spectrometry (typically MALDI‐TOF)
Protein is identified by database search by comparing
experimentally determined peptide m/z’s to theoretical ones
generated in silico from proteins present in database
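The three steps of the summary can be sketched end to end in Python. This is a toy illustration, not the scoring used by any real search engine: the protein sequences and names are hypothetical, the score is a plain count of matching masses, and the mass tolerance of 0.2 is an arbitrary choice. Residue masses are standard monoisotopic values.

```python
import re

# Monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def tryptic_peptides(protein: str) -> list[str]:
    """In silico tryptic digestion: cut after K/R, but not before P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def mh_plus(peptide: str) -> float:
    """Monoisotopic m/z of the singly protonated peptide (typical for MALDI)."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed: list[float], protein: str, tol: float = 0.2) -> int:
    """Number of observed m/z values explained by the protein's tryptic peptides."""
    theoretical = [mh_plus(p) for p in tryptic_peptides(protein)]
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def rank_proteins(observed: list[float], database: dict[str, str], tol: float = 0.2):
    """Rank database proteins by the number of matched peptide masses."""
    scores = {name: count_matches(observed, seq, tol) for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    database = {"protA": "SAMPLERTESTPEPTIDEK", "protB": "GGGGKAAAAR"}  # hypothetical
    observed = [803.41, 1362.64]  # hypothetical peak list
    print(rank_proteins(observed, database))
```

Real PMF search engines additionally weigh factors such as protein size and the number of unmatched peaks; the count used here only conveys the principle.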
For more complex peptide mixtures: MS/MS based protein ID
Tandem in space (two separate analyzers, Q‐TOF, Q‐TRAP, IT‐Orbitrap…)
Tandem in time (single analyzer, ion traps)
[Figure: MS/MS workflow: sample → ion source (ionization) → Analyzer 1 (ion selection) → fragmentation → m/z separation → detector (detection) → MS/MS (fragmentation) spectrum, Int. vs. m/z]
Electrospray ionization (ESI)
Sample nebulized to fine aerosol
Size of sample droplets is reduced by applied electric field and heat (desolvation)
Droplets explode when electrostatic repulsion overcomes surface tension
Multiply charged ions are formed
On‐line coupling to HPLC: LC‐MS/MS
[Figure: electrospray source: high voltage applied to the (heated) capillary, charged droplets, ions passing to the analyzer]
FRAGMENTATION
Energy‐based
Energy is put into the peptide (weakest bonds break)
Collision‐Induced Dissociation (CID/CAD, HCD)
Infra‐Red MultiPhoton Dissociation (IRMPD)
Radical‐based
Electrons create unstable radical ions that spontaneously fragment at
sites of electron capture
Electron Capture Dissociation (ECD)
Sequence ions are formed by fragmentation of the peptide backbone:
[Figure: peptide backbone (residues R1 to R4) with the cleavage sites labeled a1/x3, b1/y3, c1/z3, a2/x2, b2/y2, c2/z2, a3/x1, b3/y1, c3/z1]
[Figure: b fragment ions (b1 to b6) and y fragment ions (y1 to y6) of the peptide SAMPLER]
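The b and y ion series of the example peptide SAMPLER can be computed from residue masses. This is a minimal sketch for singly charged fragments, using standard monoisotopic residue masses; only the seven residues needed for this peptide are listed, and the function names are illustrative.

```python
# Monoisotopic residue masses (Da) for the residues of SAMPLER.
RESIDUE_MASS = {
    "S": 87.03203, "A": 71.03711, "M": 131.04049, "P": 97.05276,
    "L": 113.08406, "E": 129.04259, "R": 156.10111,
}
WATER = 18.010565
PROTON = 1.007276


def b_ions(peptide: str) -> list[float]:
    """Singly charged b ions: N-terminal fragments (residue sum + proton)."""
    masses, total = [], 0.0
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        masses.append(total + PROTON)
    return masses


def y_ions(peptide: str) -> list[float]:
    """Singly charged y ions: C-terminal fragments (residue sum + water + proton)."""
    masses, total = [], 0.0
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        masses.append(total + WATER + PROTON)
    return masses


if __name__ == "__main__":
    peptide = "SAMPLER"
    for i, (b, y) in enumerate(zip(b_ions(peptide), y_ions(peptide)), start=1):
        print(f"b{i} = {b:.4f}    y{i} = {y:.4f}")
```

A useful sanity check is that each complementary pair (b1 with y6, b2 with y5, and so on) sums to the mass of the doubly protonated peptide.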
Collision induced dissociation (CID)
Instrument‐dependent fragmentation
Fragment ions generated by CID:
1. Sequence ions (a, b, y)
2. Internal fragments (if multiple fragmentation)
3. Satellite ions (water and NH3 loss of fragment ions)
4. Immonium ions (info on amino acid content)
[Figure, three panels of an LC‐MS/MS run:
BPI: base peak chromatogram, RT 0 to 89.88 min, relative abundance vs. time (min). Ions eluting from HPLC
MS: full MS spectrum (FTMS, m/z 380 to 1600) at RT 41.43 min, peaks labeled with m/z and charge (e.g. 626.8540, z=2). Ions detected in Orbitrap analyzer at 41.43 min
MS/MS: ion trap MS2 spectrum (m/z 160 to 1265) at RT 41.44 min. MS/MS spectrum of precursor ion m/z: 626.854 (2+), fragmented and measured in ion trap]
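The precursor m/z and charge state together give the neutral peptide mass that a search engine works with. This is a small sketch, assuming protonated ions [M + zH]z+; the function names are illustrative.

```python
PROTON = 1.007276  # mass of a proton, Da


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral monoisotopic mass M of an ion observed as [M + zH]z+."""
    return (mz - PROTON) * charge


def mz_for_charge(mass: float, charge: int) -> float:
    """m/z at which a molecule of neutral mass M appears with the given charge."""
    return mass / charge + PROTON


if __name__ == "__main__":
    m = neutral_mass(626.854, 2)  # precursor from the MS/MS panel above
    print(round(m, 4))
    print(round(mz_for_charge(m, 1), 4), round(mz_for_charge(m, 3), 4))
```

An incorrect charge state or monoisotopic m/z therefore shifts the calculated mass and sends the search to the wrong peptide candidates.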
1. Generation of peak‐list (txt, mgf, dta …):
List of all MS/MS data acquired during an LC‐MS/MS experiment
For each MS/MS spectrum:
precursor m/z, precursor charge state
list of m/z and intensity values for observed fragment ions
2. Database search
All MS/MS spectra are searched individually; results are given as proteins listing the peptides assigned to them
Search engine (Mascot, Protein Prospector, OMSSA, Sequest, Proteome Discoverer, Byonic…)
1. defines peptide candidates with the same (theoretical) m/z that is observed for the
precursor (in‐silico digestion of proteins in the database)
2. compares observed fragment ion list to theoretical fragment ion list of peptide
candidates (instrument‐dependent fragmentation!!!)
3. assigns a score and / or a probability value (expectation value, E‐value) to peptide
matches for deciding the “goodness” of identifications
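The three search steps above can be mimicked with a toy example. This is a deliberately naive sketch, not the algorithm of Mascot or any other engine named above: candidates are taken from a short hypothetical peptide list instead of a digested protein database, only singly charged b and y ions are predicted, and the "score" is simply the number of observed fragments that match. The tolerances are arbitrary.

```python
# Monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def theoretical_fragments(peptide: str) -> list[float]:
    """Singly charged b and y ions (a simplification of CID fragmentation)."""
    residues = [RESIDUE_MASS[aa] for aa in peptide]
    fragments = []
    for i in range(1, len(peptide)):
        fragments.append(sum(residues[:i]) + PROTON)          # b ion
        fragments.append(sum(residues[i:]) + WATER + PROTON)  # complementary y ion
    return fragments


def search(precursor_mz, charge, fragment_mzs, peptides,
           precursor_tol=0.02, fragment_tol=0.5):
    """Return (peptide, number of matched fragments) pairs, best first."""
    observed_mass = (precursor_mz - PROTON) * charge
    # Step 1: keep candidates whose mass agrees with the precursor.
    candidates = [p for p in peptides
                  if abs(peptide_mass(p) - observed_mass) <= precursor_tol]
    # Step 2: compare observed fragments with each candidate's predicted ones.
    results = []
    for pep in candidates:
        theo = theoretical_fragments(pep)
        matched = sum(any(abs(f - t) <= fragment_tol for t in theo)
                      for f in fragment_mzs)
        results.append((pep, matched))
    # Step 3: rank by the (naive) score.
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    peptides = ["SAMPLER", "SAMPELR", "PEPTIDEK"]  # hypothetical candidates
    spectrum = theoretical_fragments("SAMPLER")    # idealized "observed" spectrum
    precursor = peptide_mass("SAMPLER") / 2 + PROTON
    print(search(precursor, 2, spectrum, peptides))
```

The two isomeric candidates pass the precursor filter together and are separated only by their fragment ions, which is the reason step 2 is needed at all.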
One peptide to multiple proteins?
• Homology
Between species
Protein families within the same species
Different isoforms of the same protein
• Coincidence
Sample | Gene | Acc.No. | Protein | Protein MW (kDa) | Unique Peptides | Coverage
1 | At3g02090 | Q42290 | mitochondrial-processing peptidase beta subunit, (MPP beta) | 59 | 32 | 55.00%
1 | At3g16480 | O04308 | mitochondrial-processing peptidase alpha subunit 2, (MPP alpha-2) | 54 | 1a | 3.00%
1 | At2g07727 | P42792 | Cytochrome b (MTCYB) (COB) (CYTB) | 44.5 | 1b | 3.30%
2 | At1g51980 | Q9ZU25 | mitochondrial-processing peptidase alpha subunit 1 (MPP Alpha-1) | 54c | 24 | 40.40%
2 | At3g16480 | O04308 | mitochondrial-processing peptidase alpha subunit 2, (MPP alpha-2) | 54c | 1c(9) | 16.40%
3 | At5g40810 | Q0WNJ4 | Cytochrome c1 (CYC1-2) | 33 | 6 | 20.80%
4 | At5g13430 | Q9LYR3 | Ubiquinol--cytochrome-c reductase (REISKE subunit) | 26 | 6 | 22.10%
5 | At4g32470 | Q9SUU5 | Ubiquinol-cytochrome C reductase complex 14 kDa protein | 14.5d | 5 | 49.20%
5 | At5g25450 | Q3E953 | Ubiquinol-cytochrome C reductase complex 14 kDa protein | | |
All MS/MS data assigned to peptides? – NO
‐non‐peptide components (salts, detergents, derivatizing agents…)
‐incomplete reduction / alkylation upon sample preparation
‐post‐translational modifications
‐side reactions during sample preparation
cyclization (N‐terminal Gln, Cys(CAM))
methylation (Glu, Asp)
oxidation (Met, Trp, Cys, Tyr)
carbamylation (N‐term, Lys)
carbamidomethylation (N‐term, Lys, Met, His, Asp)
deamidation (Asn, Gln)
S‐acrylamide formation (Cys)
formation of alkali metal adducts of peptides
‐nonspecific cleavages
‐incomplete digestion (number of missed cleavages?)
‐multiple peptides selected and fragmented
‐in‐source fragmentation resulting in “nonspecific” peptides
‐in‐source water loss (Ser, Thr, Asp, Glu)
‐short peptides may not yield enough fragments for confident ID
‐low quality spectra
‐incorrect monoisotopic m/z
‐incorrect charge state
• proteolytic cleavages
N‐terminal Met cleavage
signal peptide
propeptide
• chemical group
acetylation, phosphorylation, glycosylation, ubiquitination,
sumoylation, lipidation...
• intra‐ or inter‐ peptidic linkages
disulfide bonds...
(if) reflected in the molecular weight of the protein and corresponding peptide: amenable to MS
usually substoichiometric: requires enrichment of modified peptides prior to MS
modified peptides may feature similar or different fragmentation pattern: alternative fragmentation techniques (e.g. ETD for glycopeptides)
characteristic fragmentation: pinpointing modified peptides (carbohydrate oxonium ions for glycopeptides, neutral loss ions for phosphopeptides / Met‐oxidized peptides)
[Figure: TOF MS/MS spectra (ES+, m/z 100 to 1050, relative intensity %) of a peptide and the corresponding phosphopeptide, precursor m/z 607.97 and 647.97; annotations: +80 Da and ‐98 Da]
Phosphopeptides fragment similarly to unmodified peptides (CID)
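The +80 Da and ‐98 Da annotations translate into m/z differences that depend on the charge state. This is a short sketch, assuming that the two precursor values in the spectrum headers (607.97 and 647.97) are the unmodified and the phosphorylated form of the same peptide; the mass constants are the standard monoisotopic values for HPO3 and H3PO4, and the function names are illustrative.

```python
PHOSPHO = 79.96633      # HPO3 added on phosphorylation, Da
H3PO4_LOSS = 97.97690   # neutral loss of phosphoric acid, Da


def mz_shift(mass_shift: float, charge: int) -> float:
    """m/z difference caused by a mass difference at a given charge state."""
    return mass_shift / charge


def charge_from_shift(mz_a: float, mz_b: float, mass_shift: float = PHOSPHO) -> int:
    """Charge state implied by the m/z difference between two forms of a peptide."""
    return round(mass_shift / abs(mz_b - mz_a))


if __name__ == "__main__":
    z = charge_from_shift(607.97, 647.97)
    print("charge:", z)
    print("neutral loss seen at m/z", round(647.97 - mz_shift(H3PO4_LOSS, z), 2))
```

For a doubly charged precursor the neutral loss of phosphoric acid thus appears about 49 m/z units below the precursor, not 98.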
[Figure: ion trap MS/MS spectra of the peptide TPIVGQPSIPGGPVR (annotated with b and y sequence ions) and of the same peptide glycosylated (HexNAcHexSA); in the glycopeptide spectrum, sialic acid‐related oxonium ions and carbohydrate‐loss related fragment ions are marked]
Glycopeptides show different fragmentation
Quantitative proteomics 1.
Gel‐based
• 2D gel‐electrophoresis
– time consuming, labor intensive
– limited dynamic range
– not suitable for
• LMW (<15 kDa) and HMW (>150 kDa) proteins
• hydrophobic (e.g. membrane) proteins
• insoluble proteins
– needs min. 100 µg total protein/gel
– 3 replicates / sample
• Improvements
– more sensitive staining
– large format – higher resolving gel
– sample prefractionation
Quantitative proteomics 2.
Gel free – MS based
• Stable isotopic labeling (2H, 13C, 15N, 18O)
– Chemical labeling
• ICAT (isotope coded affinity tag, Cys)
• iTRAQ, TMT (Multiplexed isobaric tagging technology, Lys/peptide N‐term)
• ICPL (isotope‐coded protein label)
• Formaldehyde + NaBH3CN (peptide N‐term, Lys)
– Enzymatic labeling
• 16O/18O exchange catalyzed by trypsin (peptide C‐term)
– Metabolic labeling
• SILAC (Stable isotope labeling with amino acids in cell culture)
• 15N or 13C labeling (complete or partial)
• Label‐free LC‐MS quantification
[Figure: workflow Cell/tissue → Protein → Peptide → MS, with the labeling approaches SILAC, ICAT, iTRAQ/TMT and internal standards placed along it]
Thank you for your attention!
This work is supported by the European Union, co-financed by
the European Social Fund, within the framework of
"Practice-oriented, student-friendly modernization of the
biomedical education for strengthening the international
competitiveness of the rural Hungarian universities"