A look at data for quantitative analysis using MSight and Phenyx

Atelier Protéomique Quantitative

25-27 June 2007

La Grande Motte

Pierre-Alain Binz

Institut Suisse de Bioinformatique

GeneBio SA

Already said

• Importance of the biological question, sample choice, and experimental strategy

• Sample complexity is a challenge for MS

– Peak capacity, concentration range, chemical properties, …

• Many methods, each with strengths and weaknesses

– iTRAQ, SILAC, ICAT, MRM, label-free, …

• Many instrumental settings: heterogeneity of data

– type, amount, resolution

• Many bioinformatics tools

– Identification, signal detection, quantitation

• Validation methods

P-A Binz, Atelier Proteomique Quantitative, juin 2007

Outlook

• Visualise LC-MS data

• Detect signal

• Align LC-MS runs

• Match images (differential analysis)

• Add identification results

• Quantitation with search engine

What data for quantitation?

• MS data dimensions:

– m/z

– Intensity

– Rt, pI, scan number

• Secondary data

– Sample (one, more than one)

– Molecular interpretation (peptide, protein)

– Quantitation method (label description, comparison method, thresholds, corrections)

Look at LC-MS data

• Raw MS traces or peaklists (spectrum view or gel view)

• Chromatographic profiles (TIC, XIC)

• 2D images (LC-MS)

• Annotated spectra

• Overlapped spectra, head-to-head view

• Overlapped images

Visualise LC-MS data: spectrum view, gel view, chromatograms

[Figure: spectrum view, gel view and chromatograms; axes: m/z, intensity (I), retention time (Rt)]

2D representation

[Figure: a 38x26 region of an LC-MS image shown as a grid of numeric intensity values; axes: m/z and Rt]

Example: LC-ESI-Q-TOF

[Figure: full LC-MS image; axes: time (0-40 min) and m/z (400-1200); scale bars: 200 Da, 20 min]

Sample: 42-59 kDa extract of human BJAB B-cell line

• Time axis

– Interval: 0-45 min

– Sampling rate: 3 s

– 900 spectra

• Mass axis

– Interval: 400-1200 m/z

– Sampling rate: 0.025 m/z

– 32’000 measures

• Total: 28’800’000 measures (55 MB)
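
The figures quoted for this LC-ESI-Q-TOF run follow directly from the interval and sampling rate on each axis. A minimal sketch of that arithmetic; the helper name `n_points` is hypothetical, and the 2 bytes per measure used for the size estimate is an assumption, not something stated on the slide:

```python
def n_points(start, stop, step):
    """Number of sampling steps of width `step` between `start` and `stop`."""
    return round((stop - start) / step)

n_spectra = n_points(0, 45 * 60, 3)      # time axis in seconds: 0-45 min, one spectrum every 3 s
n_mz = n_points(400.0, 1200.0, 0.025)    # m/z axis: 400-1200 m/z in 0.025 m/z steps
n_measures = n_spectra * n_mz            # one intensity value per (spectrum, m/z) cell

# Assumption (not stated on the slide): 2 bytes per stored intensity value.
size_mib = n_measures * 2 / 2**20

print(n_spectra, n_mz, n_measures)       # 900 32000 28800000
print(round(size_mib, 1))                # 54.9
```

Under that storage assumption the estimate lands close to the 55 MB quoted for the run.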

Data display principle

• MS data: 32000x900

• Image part to display: 6000x400

• Screen size: 800x600

[Figure: projection of the selected image part onto the screen; axes: time and m/z]
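
The selected image part holds more points than the screen has pixels, so several measures must be projected onto each pixel. The slide does not say which projection MSight uses; the sketch below assumes a block maximum, so that a narrow peak stays visible after shrinking. The function name `project_max` and the toy grid are illustrative only.

```python
def project_max(image, out_rows, out_cols):
    """Shrink a 2D intensity grid by keeping the maximum of each block.

    Assumes out_rows <= number of rows and out_cols <= number of columns.
    """
    n_rows, n_cols = len(image), len(image[0])
    out = []
    for r in range(out_rows):
        # Row range of the source block mapped to output row r.
        r0 = r * n_rows // out_rows
        r1 = max(r0 + 1, (r + 1) * n_rows // out_rows)
        row = []
        for c in range(out_cols):
            # Column range of the source block mapped to output column c.
            c0 = c * n_cols // out_cols
            c1 = max(c0 + 1, (c + 1) * n_cols // out_cols)
            row.append(max(image[i][j] for i in range(r0, r1) for j in range(c0, c1)))
        out.append(row)
    return out

# Toy 4x4 grid projected onto a 2x2 "screen".
image = [
    [0, 1, 0, 0],
    [0, 9, 0, 2],
    [3, 0, 0, 0],
    [0, 0, 0, 7],
]
small = project_max(image, 2, 2)
print(small)  # [[9, 2], [3, 7]]
```

A block mean would be an equally simple choice, but it dilutes isolated peaks.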

Full image

[Figure: full LC-MS image; axes: time (0-40 min) and m/z (400-1200); scale bars: 200 Da, 20 min]

Zoom 256x

Less than 0.001 % of the data displayed

[Figure: zoomed LC-MS image; m/z axis 657.5-660.5, time axis 32.5-33 min; scale bars: 0.5 Da, 30 s]

• Isotope spacing 0.33 m/z: charge 3+

• Isotope spacing 0.5 m/z: charge 2+
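
The labels 0.33 (3+) and 0.5 (2+) on the zoomed image reflect the usual rule that isotope peaks of an ion of charge z lie about 1/z apart on the m/z axis. A minimal sketch of that rule, assuming an isotope spacing of about 1.00335 Da (the 13C-12C mass difference); the function name is hypothetical:

```python
ISOTOPE_SPACING_DA = 1.00335  # approximate 13C - 12C mass difference

def charge_from_spacing(delta_mz):
    """Estimate the charge state from the m/z distance between isotope peaks."""
    return round(ISOTOPE_SPACING_DA / delta_mz)

print(charge_from_spacing(0.33))  # 3
print(charge_from_spacing(0.5))   # 2
```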

MSight

• LC-MS data analysis tool

• Developed by the Proteome Informatics Group of the Swiss Institute of Bioinformatics

• Based on the Melanie 2D gel analysis software

It looks a bit like Melanie

http://www.expasy.org

Why MSight?

• Generate and evaluate LC-MS images

– Import LC-MS and MS/MS runs from various MS instruments and formats

– Workspace to manage experiments and data

– Rich visualisation and annotation

– Visualise the complexity of an LC-MS run

– Detect contaminants and run aberrations

• Perform peak detection from raw LC-MS data

– Improve Rt and m/z accuracy by using both dimensions of the 2D image

• Quantitation and comparison

– Alignment and matching of LC-MS “images”

– Quantitation reports for differential expression analysis

– Label-free quantitation

– Generation of inclusion/exclusion lists

• Integrate with identification tools (Phenyx)

– Annotate MS “peaks” with peptide identity labels

– Use the annotations to validate matching peaks across LC-MS experiments

Import

• Raw LC-MS and MS/MS data formats

– Native formats (yep, baf, fid, T2D, dat)

– mzXML, mzData

– ASCII exports

• Handles large original files (100 MB-1 GB)

• Includes profile LC-MS traces and MS/MS spectra

Visualisation

• Open multiple images

• Zoom in/out

• Chromatographic profile (“XIC”)

• Spectrum view

• Editable and searchable annotations

– landmarks, Rt, m/z, peptide sequence, hyperlinks, others

• Synchronisation between views

• Superpose images in transparency mode and complementary colors

• 3D view

Artefacts

[Figures: two LC-MS image regions showing artefacts; scale bars: 1 min, 100 Da]

SDS-MALDI-TOF: mass calibration

• 90 spectra

• Mass axis

– Interval: 560-3000 m/z

– Sampling rate: 0.05 m/z

– 48’800 measures

• Total: 4’392’000 measures

[Figure: SDS-MALDI-TOF image; scale labels: 500 Da, 2 Da, 0.15]

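Contaminants

[Figure: LC-MS image regions showing contaminant signals; scale bars: 2 Da, 30 s, 100 Da, 5 min]

• Polymer (PEG): signals repeating every 44 Da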
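
A PEG-like polymer shows up as a ladder of peaks spaced by one repeat unit, about 44 Da for singly charged ions. A minimal sketch of how such a ladder could be picked out of a peak list; this is not MSight's method, the peak values are invented for illustration, and the tolerance and minimum length are arbitrary assumptions:

```python
PEG_UNIT_DA = 44.026  # monoisotopic mass of one C2H4O repeat unit

def find_repeat_series(mz_values, unit=PEG_UNIT_DA, charge=1, tol=0.02, min_len=3):
    """Return the longest chain of peaks separated by unit/charge (within tol)."""
    peaks = sorted(mz_values)
    step = unit / charge
    best = []
    for start in peaks:
        chain = [start]
        while True:
            target = chain[-1] + step
            candidates = [p for p in peaks if abs(p - target) <= tol]
            if not candidates:
                break
            # Extend the chain with the peak closest to the expected position.
            chain.append(min(candidates, key=lambda p: abs(p - target)))
        if len(chain) > len(best):
            best = chain
    return best if len(best) >= min_len else []

# Hypothetical peak list: four ladder peaks plus two unrelated signals.
peaks = [415.25, 459.28, 503.30, 547.33, 520.10, 610.77]
series = find_repeat_series(peaks)
print(series)  # [415.25, 459.28, 503.3, 547.33]
```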

Contaminants (2)

[Figure: LC-MS image region showing further contaminant signals; scale bars: 5 min, 100 Da]

Redundancy: Peptide modifications

[Figure: LC-MS image of a spot from a 2DE gel; scale bars: 10 min, 100 Da]

[Figure: zoomed region; scale bars: 2 min, 5 Da; two pairs of 3+ signals separated by 5.33 m/z, labelled “Oxidation”]

[Figure: overview; scale bars: 10 min, 100 Da; signals labelled with charge states 2+ to 5+ and “Oxidation”]
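
The 5.33 m/z distance labelled on the 3+ signals is what one oxidation (one added oxygen atom, about 16 Da) looks like at charge 3+. A short sketch of the conversion from mass shift to m/z shift; the function name is hypothetical:

```python
OXIDATION_DA = 15.9949  # monoisotopic mass of one oxygen atom

def mz_shift(delta_mass, charge):
    """m/z distance between modified and unmodified forms at a given charge."""
    return delta_mass / charge

for z in (2, 3, 4, 5):
    print(z, round(mz_shift(OXIDATION_DA, z), 2))
```

The same peptide therefore appears several times in the image: once per charge state and once per modification state.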

Outlook

• Visualise LC-MS data

• Detect signal

• Align LC-MS runs

• Match images (differential analysis)

• Add identification results

• Quantitation with search engine

Peak detection

• Detect and quantify MS peaks in a 2D image

• Interactive use

• Manual validation via visualisation

• Export in centroid mode

Peak detection variability

• High vs low resolution on the m/z axis

– Resolved isotopic profile vs unresolved bump

• Sampling resolution (Rt and m/z)

– LC-MALDI < ESI-MS with MS/MS < ESI-MS (QTOF < LTQ)

• Noise (chemical, electronic)

• Shape (rectangle, circle, other)

• Intensity (max, sum, fit max, integrate)

• And for quantitation:

– Detect peaks in each sample separately and then compare, vs align the runs first and use one single shape per aligned feature
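
The choice of intensity measure changes the number reported for the same peak. A minimal sketch comparing three of the listed options (max, sum, integrate) on one chromatographic profile; the profile values are invented, and the integration uses a plain trapezoidal rule as an assumption about what “integrate” means here. “Fit max” would need a peak model and is not shown.

```python
def intensity_measures(times, intensities):
    """Return (max, sum, trapezoidal area) for one chromatographic profile."""
    peak_max = max(intensities)
    peak_sum = sum(intensities)
    # Trapezoidal rule: average of neighbouring intensities times the time step.
    area = sum(
        (t1 - t0) * (i0 + i1) / 2
        for t0, t1, i0, i1 in zip(times, times[1:], intensities, intensities[1:])
    )
    return peak_max, peak_sum, area

# Hypothetical XIC sampled every 3 s.
times = [0, 3, 6, 9, 12]
xic = [0, 40, 100, 40, 0]
result = intensity_measures(times, xic)
print(result)  # (100, 180, 540.0)
```

Unlike the sum, the integrated area accounts for the sampling interval, which matters when runs are acquired at different scan rates.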

Locating the source of noise

[Figures: two zoom levels of the same LC-MS region, each with a label at 37.15; scale bars: 5 min, 5 Da and 15 s in the first, 2 min and 1 Da in the second]

Streak

[Figure: LC-MS image region with a streak and a spectrum over 807-810 m/z; scale bars: 10 min, 1 Da; labelled signals: 2000 (5+), 12000 (2+), 3000 (2+), 80]

[Figure: panels with signals labelled a to n; 28 min]

Peptide deconvolution

[Figure: LC-MS image region at time 31.9 min with overlapping peptide signals labelled 2+, 2+, 4+ and 2+; scale bars: 1 Da, 1 min]

Outlook

• Visualise LC-MS data

• Detect signal

• Align LC-MS runs

• Match images (relative quantitation)

• Add identification results

• Quantitation with identification results

Alignment and comparison

• Align images via landmarks (corrections for local deviations)

• Match images (pair peaks together)

• Report relative quantification information
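
Landmark-based alignment can be pictured as a retention-time mapping that is exact at each landmark and interpolated in between, which is what allows local deviations to be corrected. The sketch below assumes piecewise-linear interpolation between landmark pairs; the transformation MSight actually applies is not described on the slide, and the landmark values are invented.

```python
from bisect import bisect_right

def align_rt(rt, landmarks):
    """Map a retention time from run B onto run A.

    `landmarks` is a list of (rt_in_B, rt_in_A) pairs (at least two).
    Between landmarks the mapping is linear; outside them the nearest
    segment is extended.
    """
    pts = sorted(landmarks)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    # Index of the segment containing rt, clamped to the first/last segment.
    i = bisect_right(xs, rt) - 1
    i = max(0, min(i, len(xs) - 2))
    slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
    return ys[i] + slope * (rt - xs[i])

# Hypothetical landmarks: run B elutes early at first, then catches up.
landmarks = [(10.0, 10.5), (20.0, 21.0), (30.0, 30.0)]
print(align_rt(15.0, landmarks))
print(align_rt(25.0, landmarks))
```

Because each segment has its own slope, a drift that affects only part of the gradient is corrected without distorting the rest of the run.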

Alignment

[Figure: two LC-MS images and the alignment transformation between them; m/z axis 620-632; scale bar: 4 min]

Migration variability

[Figure: runs A and B and the difference image A - B; scale bars: 1 min, 2 Da]

Outlook

• Visualise LC-MS data

• Detect signal

• Align LC-MS runs

• Match images (differential analysis)

• Add identification results

• Quantitation with identification results

Quantitation

• Protein mixture

– 32-45 kDa fraction of a lysate from a culture of a B-cell line

– ~1 pmol

– Up to 180 proteins detectable in this sample when analysed extensively by LC-MS/MS

• BSA added: +26 fmol, +83 fmol, +520 fmol

[Figure: LC-MS image regions for the three BSA amounts, showing the peptide LGEYGFQNAL at m/z 740.35 (2+); scale bars: 10 Da, 5 min]

[Figure: zoom on a 3+ signal at the three BSA amounts (+26, +83, +520 fmol); scale bars: 2 Da, 2 min]

[Figure: quantitation at the three BSA amounts (+26, +83, +520 fmol)]

Differential (low resolution)

[Figure: LC-MS images of BSA and of BSA+Lyz; scale bars: 5 min, 20 Da]

Differential analysis

[Figure: images A and B and the difference image A - B; scale bar: 100 Da]

[Figure: zoom on image A and the difference image A - B; scale bar: 2 Da]
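
Once two runs are aligned, the A - B image is in principle a cell-by-cell subtraction, and signals present in only one run stand out. A minimal sketch under the assumption that both images are already aligned and on the same grid; the grids and the threshold are invented for illustration:

```python
def difference_image(a, b):
    """Cell-by-cell difference of two aligned intensity grids of equal shape."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def changed_cells(diff, threshold):
    """Coordinates (row, col) where the absolute difference exceeds threshold."""
    return [
        (r, c)
        for r, row in enumerate(diff)
        for c, value in enumerate(row)
        if abs(value) > threshold
    ]

# Hypothetical aligned images: one signal is stronger in A than in B.
a = [[0, 5, 0], [0, 9, 1]]
b = [[0, 5, 0], [0, 2, 1]]
diff = difference_image(a, b)
print(diff)                    # [[0, 0, 0], [0, 7, 0]]
print(changed_cells(diff, 3))  # [(1, 1)]
```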

Outlook

• Visualise LC-MS data

• Detect signal

• Align LC-MS runs

• Match images (differential analysis)

• Add identification results

• Quantitation with search engine

Coupling with identification

• So far, quantitation has been done without considering molecular interpretation

• To quantitate a protein, signals must be selected and coupled with peptide identifications

Phenyx

A software platform dedicated to the identification and characterization of proteins and peptides from mass spectrometry data

Developed by GeneBio, in collaboration with the Swiss Institute of Bioinformatics (SIB)

Launched in September 2004 (version 1.8)

Version 2.3 in April 2007

Rapidly developing and recognized tool

Integrated into a number of third-party software packages (Scaffold, TPP, MSight, ProteinScape, Proteus LIMS, …)

Adopted by a number of large, renowned proteomics centres

http://www.phenyx-ms.com

http://phenyx.vital-it.ch/pwi

Some features

• Core calculation

– Robust and flexible scoring, including log likelihood measures

– Conflict resolution algorithm

– Use of annotations in databases (PTMs, variants, AA modifs, …)

• Flexible and interactive interface: the “Phenyx Web Interface”

– User and job properties (user privileges, job sharing)

– Manual validation functionality

– Import of third-party jobs (Mascot, Sequest, X!Tandem, Popitam, …)

– Many exports (native Phenyx, Excel, XML, text, …)

– Results comparison functionality

• Integration of Phenyx into workflows: a job follows a suite of configurable events (pre-processing, processing and post-processing)

The Phenyx Web Interface

[Figure: screenshots of the Phenyx Web Interface: desktop, submission, results views, results comparison, management console, and Excel, XML and text exports]

http://phenyx.vital-it.ch/pwi

Integrate MSight and Phenyx

• Example: annotate LC-MS images with peptide identifications

[Diagram: raw LC-MS, peaklists, Phenyx interface, exported peptide identifications, annotated images]

• Phenyx results are stored as annotations in the images

LC-MS and MS/MS: undersampling

• LC-MS and LC-MS/MS on a QStar of 49-62 kDa SDS-separated and trypsin-digested proteins from a human B-cell line

• Focus on a small time x m/z region (about 1/250 of the full run): m/z 621-655, time 21.15-34.85 min

• 7/40 peptides analysed

• 3/7 identified: FFADLLDYIK, SLDLDSIIAEVK, LALDLEIATYR

• < 10% positively identified using stringent criteria

Outlook

• Visualise LC-MS data

• Detect signal

• Align LC-MS runs

• Match images (relative quantitation)

• Add identification results

• Quantitation with search engine

Quantitation with search engine

• Use of MS/MS data

– Reporter ions: isobaric labeling (iTRAQ, TMT)

– emPAI (~ratio observed/predicted peptides)

– Multiplex (SILAC, 18O)

• Use of MS raw traces

– Stable isotope labeling (ICAT, SILAC, AQUA, 18O, ICPL, …)

– Label-free
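
Of the MS/MS-based options, emPAI is the simplest to compute from a result list. The slide summarises it as roughly the ratio of observed to predicted peptides; in the published definition of the index that ratio is used as an exponent, emPAI = 10^(N_observed / N_observable) - 1. A minimal sketch with invented peptide counts; how observable peptides are counted is left out and would be an additional assumption:

```python
def empai(n_observed, n_observable):
    """Exponentially modified protein abundance index.

    n_observed: number of distinct peptides identified for the protein.
    n_observable: number of peptides predicted to be observable.
    """
    return 10 ** (n_observed / n_observable) - 1

# Hypothetical proteins: (observed, observable) peptide counts.
for name, counts in {"protein A": (5, 10), "protein B": (2, 20)}.items():
    print(name, round(empai(*counts), 2))
```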

Quantitation: needed information

• Need identified peptides

• Need access to intensities (MS/MS and MS)

• Need quantitation method

– Labeling method (fixed, variable mode)

– Definition of “pairs”

– Intensity correction factors

– Thresholds for which peptides to consider (confidence levels, scores, # peptides per protein)

– Create report, calculate ratios, evaluate outliers

– Include in search engine GUI

A quantitation module for Phenyx

[Diagram: generic quantitation methods; prediction of co-peptides; extraction of intensities at the MS level and at the MS/MS level; calculation of ratios and export]

[Diagram: the quantitation module with the InSilicoSpectro and PhenyxPerl APIs; files shown: (Phenyx) result file, labeling config file (XML), InSilicoDef definition file (XML), quantitation result file (text); external statistics (R)]

One possible integration with MSight (label-free)

[Diagram: for each of two runs, raw LC-MS, peaklists and exported peptide identifications lead to annotated images; the images are then aligned and compared to give annotated peptide ratios]

Phenyx: generate reports from identification results

Perl scripts to generate many kinds of exports

Example for iTRAQ

Examples of filters and search parameters that alter quantitation results

• Minimal number of peptides per protein

• Minimal number of proteotypic peptides

• Minimal score for each peptide

• Filter on redundancy

– Same sequence (same or different charge states)

– Exact same primary structure

– Embedded sequences (missed cleavages, etc.)

• Remove outliers (quant values > threshold CV)

• Number of missed cleavages allowed

• Semi-tryptic peptides and fully unspecific cleavages

• Number of queried modifications

Effect of filters

Filter                                    # peptides    # proteins
Z-score (only valid peptides)             22            6
+ 3 peptides (min. 3 valid peptides)      19            4
+ Intensity (intensities > 10’000)        15            4
+ CV (CV < 20%)                           7             2
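
The successive filters can be read as a small pipeline applied protein by protein. The sketch below is one possible reading, not the Phenyx implementation: the data, the protein names, the order of the steps and the use of the CV of peptide ratios within a protein are all assumptions made for illustration.

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100 * stdev(values) / mean(values)

def apply_filters(proteins, min_peptides=3, min_intensity=10_000, max_cv=20.0):
    """Keep proteins that pass the peptide-count, intensity and CV filters.

    `proteins` maps a protein name to a list of (ratio, intensity) pairs,
    one pair per valid peptide.
    """
    kept = {}
    for name, peptides in proteins.items():
        if len(peptides) < min_peptides:       # "+ 3 peptides"
            continue
        strong = [(r, i) for r, i in peptides if i > min_intensity]  # "+ Intensity"
        if len(strong) < 2:                    # a CV needs at least two values
            continue
        if cv_percent([r for r, _ in strong]) >= max_cv:             # "+ CV"
            continue
        kept[name] = strong
    return kept

# Hypothetical input: ratios and intensities are invented.
proteins = {
    "P1": [(1.0, 50_000), (1.1, 40_000), (0.9, 30_000)],
    "P2": [(1.0, 50_000), (2.0, 40_000), (0.5, 30_000)],  # ratios too scattered
    "P3": [(1.0, 50_000), (1.0, 20_000)],                 # too few peptides
    "P4": [(1.0, 50_000), (1.0, 5_000), (1.0, 20_000)],   # one weak peptide dropped
}
kept = apply_filters(proteins)
print(sorted(kept))  # ['P1', 'P4']
```

As in the table, each added filter can only shrink the set of peptides and proteins that reach the ratio calculation.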

False discovery rate export

• # peptides in decoy database / # peptides in forward database

[Figure: number of valid hits as a function of z-score; x-axis: z-score (4.0-14.0); y-axis: # hits (0-10000); series: true hits, hits in reverse]

[Figure: FDR (hits in rev / hits in fwd) as a function of z-score; x-axis: z-score (5.0-10.0); y-axis: FDR (0%-20%)]
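
The FDR curve is the ratio of hits in the reverse (decoy) database to hits in the forward database, recomputed at each z-score threshold. A minimal sketch of that calculation; the score lists are invented and far smaller than a real search:

```python
def fdr_at_threshold(forward_scores, reverse_scores, threshold):
    """FDR estimate: hits in reverse / hits in forward at or above threshold."""
    n_fwd = sum(s >= threshold for s in forward_scores)
    n_rev = sum(s >= threshold for s in reverse_scores)
    return n_rev / n_fwd if n_fwd else 0.0

# Hypothetical z-scores of peptide hits in each database.
forward = [5.2, 6.1, 6.8, 7.4, 8.0, 8.9, 9.5, 10.2, 11.0, 12.3]
reverse = [5.1, 5.6, 6.3]

for threshold in (5.0, 6.0, 7.0):
    print(threshold, round(fdr_at_threshold(forward, reverse, threshold), 3))
```

Raising the threshold lowers the estimated FDR at the cost of fewer forward hits, which is the trade-off the two plots show.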

Calibration status of instrument (3 datasets)

[Figure: z-score (3 to 19) against delta m/z (-0.6 to 0.4)]

Effect of the search parameters

• 1 round, only 3 fixed mods: 131 valid, 75% cov.

• 2 rounds, add variable mods: 205 valid, 84% cov.

• 2 rounds, with all mods and half cleaved: 348 valid, 90% cov.

Import jobs into Phenyx

Mascot

X!Tandem

Sequest

Phenyx

Manual validation, then quantitation as for a native Phenyx job

Results comparison tool

Which protein in which job?

Which peptide in which protein/job?

Concatenate results from different runs and search engines

And then go to quantitation…

Summary

• LC-MS data and 2D image analysis (MSight)

– Rich source of information

– Detect strange behaviors (discontinuity, contaminations, QC issues)

– Using both dimensions is efficient for signal detection

– Alignment of multiple MS runs: consider local aberrations

– Quantitation possible for pairs and for groups (statistics)

• Quantitation with protein identification tool only (Phenyx)

– Quantitation methods limited to information in peaklists (isobaric labeling, emPAI, Multiplex)

• Quantitation with MSight and Phenyx

– Get access to raw data information

– Full panel of quantitation methods

– Need tight integration (annotation, statistics, filters)

– Thanks to import functionality, access to other search engines

Take-home messages

• Error to assess: biological variability, experimental variability, quantitation method tolerance

• Many tools available; make your choice according to:

– the biological question

– the capacity to analyse data from the chosen quantitation method

– the capacity to analyse data from your instruments

– the possibility to validate generated data (interactivity)

• Understand, evaluate

Acknowledgements

• Phenyx development team

– Alexandre Masselot

– Nicolas Budin

– Anne Niknejad

– Olivier Evalet

• PIG group

– Ron Appel

– Daniel Walther

– Gerard Bouchet

– Sébastien Catherinet

– Stéphane Pelhâtre

– Patricia Palagi

• BPRG

– Ali Vaezzadeh

• PAF

– Manfredo Quadroni

• University Bern

– Manfred Heller

• IPBS

– David Bouyssié

Thank you for your attention!

MSight: http://www.expasy.org

Phenyx: http://phenyx.vital-it.ch/pwi
