Introduction to Proteomics 1.0
CMSP Workshop
Tim Griffin
Associate Professor, BMBB
Faculty Director, CMSP
Why are we here?
Objectives
For participants:
• Learn basics of MS-based proteomics
• Learn what’s necessary for success using MS-based proteomics
• Designing experiments; sample preparation; data analysis
For CMSP staff:
• Prepare users so they are equipped to have success working with CMSP
• Manage expectations – what these technologies can and cannot do
Alien language made understandable
Terminology made sensible
[Figure: exchange between CMSP staff and participants over MS jargon: NanoLC, MS/MS, iTRAQ, monoisotopic, ESI, MALDI, b-ion, precursor ion, quadrupole, Stage-tip, ion trap, HCD, TOF. Participant reactions: "???" and "Right on!"]
Who we are
Center for Mass Spectrometry and Proteomics
• Operated through the Department of Biochemistry, Molecular Biology and Biophysics
• Serving biological MS-related research needs across UofM campus and external institutions/private companies
• Fee-for-service Internal Service Organization (ISO)
• Supported by all Colleges at UofM using CMSP and the Office of the Vice President for Research (plus a variety of granting sources)
• Extensive collaboration with Minnesota Supercomputing Institute
• Primary mission is to support research efforts at the University of Minnesota, but also to train others in the use of advanced technologies and research approaches
Who we are
• 150+ collective years of experience in biological MS; hundreds of scientific publications
• Diverse expertise – design, sample preparation, instrumentation, data analysis
• Experience with MANY sample types and research studies
• Fish, gophers, periodontal bacteria, snake venom...
‘Omic technologies and the molecular biology paradigm
Why proteomics and direct protein analysis?
(Genomic sequencing is cheaper, faster, more comprehensive…why proteomics?)
• DNA/RNA characterization cannot predict post-transcriptional events
Proteomics: A definition
“Proteomics includes not only the identification and quantification of proteins, but also the determination of their localization, modifications, interactions, activities, and, ultimately, their function.”
- Stan Fields in Science, 2001.
Alternatively: proteomics = high-throughput biochemistry
Proteomics as a complement to genomics
• measurement of protein response, which is not always indicated by mRNA response
• post-translational modifications
• macromolecular interactions
• sub-cellular location
• high-resolution structural and molecular characterization
• integration with genomic/transcriptomic data to comprehensively characterize biological systems
Proteomic technologies and approaches
• two-dimensional gel electrophoresis
• mass spectrometry
• protein chips
• yeast 2-hybrid
• phage display
• antibody engineering
• high-throughput protein expression
• high-throughput X-ray crystallography
• cell imaging
Enabling MS-based proteomics: “soft” ionization
Electrospray ionization (ESI)
Matrix-assisted laser desorption/ionization (MALDI)
• Making large, non-volatile biomolecules fly
Nuts and bolts of mass spectrometry
ionization → separation by m/z* → detection
• Ionization: MALDI; electrospray (liquid chromatography, nanospray)
• Separation by m/z: quadrupole, ion trap, time-of-flight
• Mass analysis of proteins, peptides
*m/z = mass-to-charge
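As a small illustration of the mass-to-charge idea, the sketch below computes the m/z at which a molecule would appear for several charge states. It assumes positive-mode ionization by protonation (as is typical for peptides in ESI); the function name `mz` and the 1500 Da example mass are made up for illustration.

```python
PROTON_MASS = 1.007276  # monoisotopic mass of a proton, in Da


def mz(neutral_mass: float, charge: int) -> float:
    """Return the m/z of a positive ion formed by adding `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical 1500 Da peptide observed at charge states 1+ to 3+
for z in (1, 2, 3):
    print(f"charge {z}+: m/z = {mz(1500.0, z):.4f}")
```

The same molecule shows up at a different position on the m/z axis for each charge state, which is why the instrument reports m/z rather than mass.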
Many instruments, same underlying process
Image from: http://www.medwow.com
Image from: https://www.sdstate.edu/chem/mass-spec
Image from: http://planetorbitrap.com
Example of technology progress: more sensitive MS
The information currency of MS
[Figure: mass spectrum, relative abundance versus m/z (axis range 200 to 1200)]
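A mass spectrum can be thought of as a list of (m/z, intensity) pairs. The sketch below shows how raw intensities are commonly scaled to the "relative abundance" axis, where the most intense peak (the base peak) is set to 100. The peak values and the function name are invented for illustration.

```python
def to_relative_abundance(peaks):
    """Scale (m/z, intensity) pairs so that the base peak equals 100."""
    base = max(intensity for _, intensity in peaks)
    return [(mz, 100.0 * intensity / base) for mz, intensity in peaks]


# Made-up peak list: (m/z, raw intensity)
example_peaks = [(301.2, 1.5e4), (654.8, 6.0e4), (1012.5, 3.0e4)]

for mz, rel in to_relative_abundance(example_peaks):
    print(f"m/z {mz:7.1f}   relative abundance {rel:5.1f}")
```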
The “guts” of a mass spectrometer
ionization
m/z separation
m/z separation and detection
m/z separation and detection
Doing protein and proteomic analysis via MS
Biological inquiry → Hypothesis → Experimental design → Sample preparation → MS analysis → Data analysis
• Workshop structured to follow this ordering
• All aspects are important: each must be done well for success
• Challenge:
• technologies within each component always changing
• interdisciplinary
The importance of sample preparation
• Garbage in, garbage out
• Protein mixtures isolated from biological sources are complex (hundreds to thousands of components)
• Mass spectrometers have limited peak capacities, requiring separation and fractionation of protein and peptide mixtures prior to analysis
• Separation methods include:
• gels
• liquid chromatography
• affinity chromatography
Protein chemistry: a challenge
• Proteins offer unique challenges compared to other biomolecules (e.g. nucleic acids):
– Solubility
– Abundance (no PCR!)
– Chemical heterogeneity
Each protein is a unique character!
The workhorse: LC-MS
• Separating molecular mixtures prior to introduction into MS
[Figure: LC-MS chromatogram of base peak intensity versus time, with organic concentration in mobile phase]
Some example applications: from simple to complex
Gygi et al. 1999, Molecular and Cellular Biology 19:1720
The “simple”: identifying a gel separated protein
2D gel electrophoresis: the original proteomics technology…but how to ID proteins?
Even the simple still requires care...
In-gel digestion
Peptide purification
LC-MS/MS
• Process of identifying a gel-separated protein
Gygi et al. 1999, Molecular and Cellular Biology 19:1720
A bit more complicated: Identifying PTMs on a protein
• Phosphorylation
• Glycosylation
• Oxidations
• Acetylation
• Methylation
• Lipid anchors
• Ubiquitinylation/sumoylation
• etc.
BUT... PTM analysis is not necessarily routine or easy!
(abundance, enrichment, ionization, fragmentation...)
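In MS, a PTM is typically detected as a characteristic mass shift on the modified peptide. The sketch below lists monoisotopic shifts for four of the modifications named above and adds them to an unmodified peptide mass. The helper name and the 1500 Da peptide are hypothetical; the other listed PTMs (glycosylation, lipid anchors, ubiquitinylation) are left out because their mass shifts depend on the specific structure.

```python
# Monoisotopic mass shifts in Da for a few common modifications
PTM_MASS_SHIFT = {
    "phosphorylation": 79.96633,  # adds HPO3
    "acetylation": 42.01057,      # adds C2H2O
    "methylation": 14.01565,      # adds CH2
    "oxidation": 15.99491,        # adds O
}


def modified_mass(unmodified_mass, modifications):
    """Return the peptide mass after adding the shift of each listed PTM."""
    return unmodified_mass + sum(PTM_MASS_SHIFT[name] for name in modifications)


# Hypothetical 1500 Da peptide carrying one phosphate and one oxidation
print(round(modified_mass(1500.0, ["phosphorylation", "oxidation"]), 5))
```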
Still more complicated: identification of proteins in complex mixtures
• More complicated sample preparation (fractionation)
A bit more complicated: Quantitative proteomics
S. cerevisiae cell cycle
(compliments of J.A. Huberman)
• Protein abundance is dynamic in response to environmental, genetic, biochemical, and pathological perturbations.
Quantitative proteomics: many methods available
• Labeled versus unlabeled
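Whichever method is used, relative quantification usually comes down to comparing a protein's measured signal between conditions. A minimal sketch, assuming one intensity value per protein per condition; the protein names and intensities are made up, and real workflows add normalization and replicate statistics that are not shown here.

```python
import math


def log2_fold_change(treated: float, control: float) -> float:
    """Return log2(treated / control); positive means higher in treated."""
    if treated <= 0 or control <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(treated / control)


# Made-up intensities: (treated, control)
intensities = {
    "protein_A": (2.0e6, 1.0e6),
    "protein_B": (5.0e5, 2.0e6),
}

for name, (treated, control) in intensities.items():
    print(name, round(log2_fold_change(treated, control), 2))
```

The log2 scale treats up- and down-regulation symmetrically: a doubling gives +1 and a halving gives -1.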
Systems biology: integrating ‘omics data
Dealing with the data: the rate-limiting step?
Data acquisition
Raw data processing (database searching)
Analysis of processed data (statistical filtering, quantitative analysis)
Data organization and interpretation
Archiving and databasing
Workflow for protein identification
KEGG pathways