This study was designed to investigate the correlation of the data obtained from the analysis of identical biological samples on two different mass spectrometry instruments. Data from different instruments is often compared in the literature, if not directly, then in the form of the conclusions drawn from that data. However, few studies directly compare the results obtained when biologically identical samples are run on different instruments. Though the two instruments used in this study are both considered to be high accuracy, there is potential for variation due to instrumental differences. The Waters Synapt G2 may be expected to give higher accuracy quantitation data, as quantitation in time-of-flight instruments is achieved via direct ion counting, rather than through a Fourier transform as in the Thermo Scientific Orbitrap VELOS. However, quantitation accuracy is also improved by increasing the number of peptides observed, and an orbitrap gives increased sensitivity due to the ion trapping and accumulation stage prior to detection.
These results show that, for these two instruments at least, the correlation between the results is high at the protein level, while it is somewhat lower at the feature level. The application of score thresholds (i.e. retaining the top 10, 25 or 50% of the data when ordered by peptide score) is not observed to have a predictable effect on the correlation of the two datasets; however, the application of abundance thresholds does show a favourable relationship between increased abundance thresholds and a higher Pearson value. While the feature level correlation can be improved considerably by applying abundance thresholds, this effect is reduced at the protein level because several peptide features are combined into each protein abundance measurement. On the basis of this dataset, there is therefore considered to be little benefit in setting thresholds at the peptide level, and doing so could be detrimental: proteins may be lost at the next level because they no longer possess sufficient quantitation peptides to be retained once peptide thresholds have been applied.
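The abundance thresholding described above can be illustrated with a minimal sketch. It assumes matched abundance pairs for the two instruments and ranks them by their mean abundance; the ranking rule, the function names and the example values are all hypothetical and are not taken from the analysis in this work.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def top_fraction_by_abundance(pairs, fraction):
    """Keep the top `fraction` of (a, b) pairs, ranked by mean abundance.

    Ranking on the mean of the two instruments is an assumption made
    for this sketch only.
    """
    ranked = sorted(pairs, key=lambda p: (p[0] + p[1]) / 2, reverse=True)
    keep = max(3, round(len(ranked) * fraction))
    return ranked[:keep]

# Hypothetical matched abundances (instrument A, instrument B); not real data.
pairs = [
    (10.0, 14.0), (12.0, 9.0), (20.0, 26.0), (35.0, 30.0),
    (50.0, 55.0), (80.0, 78.0), (120.0, 131.0), (200.0, 210.0),
    (400.0, 390.0), (650.0, 640.0), (900.0, 880.0), (1500.0, 1530.0),
]

for fraction in (1.0, 0.5, 0.25):
    kept = top_fraction_by_abundance(pairs, fraction)
    r = pearson([a for a, _ in kept], [b for _, b in kept])
    print(f"top {fraction:.0%} by abundance: n={len(kept)}, Pearson r={r:.4f}")
```

A score threshold could be sketched in the same way by changing the sort key to a peptide score, which is why the two kinds of threshold can be compared directly.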
The number of proteins identified by each instrument was also investigated, and it was seen that the Waters Synapt G2 identifies fewer proteins than the Thermo Orbitrap VELOS (720 vs 1792 proteins, the latter being 2.49 times the former) and that the majority of these proteins are common to the two instruments. These numbers are reduced for both instruments when a three
Waters Synapt G2 data respectively, but the ratio between the two instruments remains approximately the same (the number of peptides identified by the Thermo Orbitrap VELOS being 2.57 times greater than that for the Waters Synapt G2).
The assessment of the coefficient of variation for both instruments at all time points shows a greater variance in the Waters Synapt G2 data, though the median values for the average coefficient of variation are comparable between the two instruments. The box plots show that normalisation has a favourable effect on the coefficient of variation observed for both instruments, with a greater effect on the Thermo Orbitrap VELOS data.
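As an illustration of the measure discussed above, the following minimal sketch computes a per-protein coefficient of variation (sample standard deviation divided by the mean, expressed as a percentage) across replicates, and then the median across proteins. The protein names and abundance values are hypothetical, and the use of the sample standard deviation is an assumption of this sketch.

```python
from statistics import mean, median, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return stdev(values) / mean(values) * 100.0

# Hypothetical replicate abundances per protein at one time point; not real data.
replicates = {
    "protein_A": [100.0, 110.0, 90.0],
    "protein_B": [2000.0, 2100.0, 1900.0],
    "protein_C": [50.0, 80.0, 20.0],
}

cvs = {name: coefficient_of_variation(vals) for name, vals in replicates.items()}
median_cv = median(cvs.values())

for name, cv in cvs.items():
    print(f"{name}: CV = {cv:.1f}%")
print(f"median CV = {median_cv:.1f}%")
```

Because the median is insensitive to a few highly variable proteins, two datasets can share a similar median while differing in overall spread, which is the pattern a box plot makes visible.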
It is concluded that, for this data at least, there is little benefit to the application of score thresholds at either the feature or the protein level due to the unpredictable effect of their application on the correlation between the data from the two instruments. The most beneficial threshold to apply for this data seems to be a 25% abundance threshold at the protein level, which gives a positive effect on the correlation between instruments and a tolerable reduction in the number of identified proteins. Median absolute deviation normalisation is seen to reduce the instrument-dependent variation seen between the separate biological replicates studied for each time point, and is therefore recommended. Hi3 data is seen to yield a higher Pearson value than total abundance data, suggesting that the Hi3 method is preferable, at least for this data set.
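The difference between the two protein-level measures compared above can be sketched as follows. Hi3 is taken here as the mean of the three most abundant peptides for a protein, and total abundance as the sum over all of its peptides. The function names, the example values and the handling of proteins with fewer than three peptides (returning `None`) are assumptions of this sketch rather than a description of the software used in this work.

```python
def hi3_abundance(peptide_abundances, min_peptides=3):
    """Mean abundance of the three most abundant peptides of a protein.

    Returns None when fewer than `min_peptides` peptides are available;
    this cut-off behaviour is an assumption made for illustration.
    """
    if len(peptide_abundances) < min_peptides:
        return None
    top_three = sorted(peptide_abundances, reverse=True)[:3]
    return sum(top_three) / len(top_three)

def total_abundance(peptide_abundances):
    """Sum of all peptide abundances assigned to a protein."""
    return sum(peptide_abundances)

# Hypothetical peptide abundances for two proteins; not real data.
proteins = {
    "protein_A": [10.0, 50.0, 30.0, 20.0],
    "protein_B": [5.0, 6.0],
}

for name, peptides in proteins.items():
    print(name, hi3_abundance(peptides), total_abundance(peptides))
```

Because Hi3 uses only the strongest peptide signals, it is less dependent on how many low-abundance peptides each instrument happens to identify, whereas total abundance grows with every additional peptide observed.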
As future work, it would be beneficial to design or acquire a dataset composed of more samples, run on a greater variety of instruments with identical chromatographic conditions and with at least three technical replicates. It would also be beneficial to use multiple spike-ins at various protein concentrations (possibly as a protein mix spike-in, or by using a weighted sample containing multiple protein populations such as that described earlier in this thesis), in order to make an assessment of accuracy and reproducibility between instruments where the ground truth is fully known. Ideally this analysis would be completed using the most up-to-date instruments, for example using a Thermo Scientific Fusion (quadrupole-ion trap-orbitrap tribrid instrument) versus a Waters SYNAPT G2-Si (TOF). It would also be useful to conduct an analysis with identical chromatographic conditions, as it is likely that one set of the conditions used in this work is more suited to this particular sample and it is not possible to dismiss the use of different
chromatographic gradients as the source of some of the variation between instruments that was observed in this study. However, though identical chromatographic conditions are ideal for creating fully comparable results, it can be argued that the comparison as presented is more relevant to the field, as biological conclusions compared in the literature will arise from the optimised conditions used in each individual laboratory. While a difference in chromatographic conditions may impact peptide-level identifications, this effect should be mitigated when peptide data is taken up to the protein level. For those peptides which are identified, peptide quantitation should be broadly unaffected by any difference in the chromatographic conditions used.