
CHAPTER 2: MATERIALS AND METHODS

2.10 Data Analysis and Statistics

2.10.1 Chapter 3: IFN Screen

The normalised S:N values, generated as described above, were used for all further analysis, presentation and generation of plots. As it is challenging to confidently assign peptides to a specific human leukocyte antigen (HLA) allele, and to account for different alleles being expressed in the different donors, the S:N values were summed to give a single value each for HLA-A, HLA-B, HLA-C and HLA-DRB1. Additionally, all classical HLAs were excluded from the investigation of protein abundance.

In the dot plot of IFN induced changes at the surface of THP-1s, p-values were estimated based on significance B (Cox & Mann, 2008), calculated using Perseus (Tyanova et al, 2016). Significance B assumes a normal distribution of the observed fold changes (FCs), but allows a different standard deviation (SD) for up- and down-regulated proteins. Additionally, proteins are grouped into bins according to their S:N; proteins with a higher S:N are more accurately quantified and therefore a smaller FC may still be significant.
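The idea behind significance B can be sketched as below. This is an illustrative reading of the published definition (Cox & Mann, 2008), not the Perseus implementation: the function names, the interpolating percentile helper and the fixed bin size of 300 are assumptions made for the example. Within each S:N bin, the 15.87th, 50th and 84.13th percentiles of the log ratios give a robust centre and separate spreads for down- and up-regulated proteins.

```python
from math import erfc, sqrt


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in 0-100) of a pre-sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def significance_p_values(log_ratios):
    """One-sided p-values with separate spreads above and below the median.

    Assumes the ratios are not all identical (the spreads must be non-zero).
    """
    s = sorted(log_ratios)
    r_low, r_mid, r_high = (percentile(s, q) for q in (15.87, 50.0, 84.13))
    p_values = []
    for r in log_ratios:
        if r >= r_mid:
            z = (r - r_mid) / (r_high - r_mid)  # spread of up-regulated side
        else:
            z = (r_mid - r) / (r_mid - r_low)   # spread of down-regulated side
        p_values.append(0.5 * erfc(z / sqrt(2)))
    return p_values


def significance_b(log_ratios, sn_values, bin_size=300):
    """Apply the test separately within bins of proteins ordered by S:N.

    bin_size is an assumed parameter; leftovers are merged into the last bin.
    """
    order = sorted(range(len(log_ratios)), key=lambda i: sn_values[i])
    n_bins = max(1, len(order) // bin_size)
    p_values = [None] * len(order)
    for b in range(n_bins):
        start = b * bin_size
        end = len(order) if b == n_bins - 1 else start + bin_size
        idx = order[start:end]
        for i, p in zip(idx, significance_p_values([log_ratios[j] for j in idx])):
            p_values[i] = p
    return p_values
```

Because each bin has its own spread, a protein in a high S:N bin (where ratios scatter less) can reach a small p-value with a smaller FC than a protein in a low S:N bin.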

Abundance values for proteins were calculated based on the ‘iBAQ’ methodology, adapted from the original description (Schwanhäusser et al, 2011). The maximum MS1 precursor intensity for each peptide was determined, and a summed MS1 precursor intensity for each protein across all matching peptides was calculated. This summed intensity was divided by the number of theoretical tryptic peptides for that protein between 7 and 30 amino acid residues in length to give an estimated iBAQ value. To determine the abundance of a protein at the surface of unstimulated cells, the summed intensity was adjusted in proportion to the S:N values: (Donor 1 + Donor 2 + Donor 3 + Donor 4 + Donor 5 unstimulated) / ∑(all donors with/without IFN). Colour coding for the comparison of the abundance of proteins at the surface of monocytes and T cells was based upon calculating the mean and SD of the log transformed ratios for all proteins quantified, and defining cut-off ratios at one, two or three SDs away from the mean. For donor-to-donor comparison, the intensity was adjusted in proportion to the unstimulated sample for that donor (e.g. Donor 1 unstimulated / ∑(all donors +/- IFN)).
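As a worked illustration of the abundance calculation above, the following is a minimal sketch. The function names and input layout are hypothetical, and the count of theoretical tryptic peptides (7 to 30 residues) is assumed to come from a separate in silico digestion step that is not shown.

```python
def ibaq(peptide_max_intensities, n_theoretical_peptides):
    """Summed maximum MS1 precursor intensity per theoretical tryptic peptide."""
    if n_theoretical_peptides <= 0:
        raise ValueError("protein has no theoretical peptides of 7-30 residues")
    return sum(peptide_max_intensities) / n_theoretical_peptides


def scale_by_sn_share(intensity, sn_selected_channels, sn_all_channels):
    """Scale an intensity by the share of total S:N in the selected channels.

    For unstimulated abundance, sn_selected_channels holds the unstimulated
    channel of every donor; for donor-to-donor comparison it holds the single
    unstimulated channel of one donor.
    """
    total = sum(sn_all_channels)
    if total <= 0:
        raise ValueError("no S:N signal for this protein")
    return intensity * sum(sn_selected_channels) / total


# Hypothetical protein: three peptides, three theoretical tryptic peptides.
value = ibaq([2e6, 1e6, 3e6], 3)
unstim = [10, 10, 10, 10, 10]   # made-up S:N, unstimulated channels
stim = [30, 30, 30, 30, 30]     # made-up S:N, IFN stimulated channels
surface_unstimulated = scale_by_sn_share(value, unstim, unstim + stim)
donor_1_only = scale_by_sn_share(value, unstim[:1], unstim + stim)
```

Here the unstimulated channels carry a quarter of the total S:N, so a quarter of the intensity is attributed to unstimulated cells.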

When examining IFN induced FC, in order to avoid artificial inflation of changes due to poor quantitation of a protein in one channel for a given donor, the data for a donor were removed where either the stimulated or unstimulated channel contributed less than 2% of the total S:N across the 10-plex for that protein. This was only applied to the primary monocytes and T cells, where data for five donors were available. Only proteins with data for three or more donors were examined.
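The donor filter described above might look like the following sketch. The function name and the dictionary layout are hypothetical, and the total S:N is assumed to be computed across all ten channels before any donor is removed.

```python
def filter_donors(sn_by_donor, min_fraction=0.02, min_donors=3):
    """Drop donors with a poorly quantified channel for one protein.

    sn_by_donor maps donor -> (unstimulated S:N, stimulated S:N).
    Returns the retained donors, or None if fewer than min_donors remain.
    """
    total = sum(u + s for u, s in sn_by_donor.values())
    if total <= 0:
        return None
    kept = {
        donor: (u, s)
        for donor, (u, s) in sn_by_donor.items()
        if u / total >= min_fraction and s / total >= min_fraction
    }
    return kept if len(kept) >= min_donors else None


# Made-up S:N values: donor d3 has a near-empty unstimulated channel.
protein = {"d1": (10, 30), "d2": (12, 28), "d3": (1, 40),
           "d4": (9, 31), "d5": (11, 29)}
retained = filter_donors(protein)
```

Without the filter, donor d3 would report a 40-fold induction that rests on a barely quantified unstimulated channel.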

For identifying IFN regulated proteins, upregulated proteins met the ‘sensitive’ criteria by having an average FC > 1 SD away from the mean (calculated as described above for the abundance ratio), and a FC > 1 in all donors where quantified. The number of SDs away from the mean FC determined the colour of the average FC bar in the graphs. The filter for being stringently upregulated employed the additional criterion of a Benjamini-Hochberg (BH)-corrected p-value < 0.05, calculated using a paired, two-tailed Student’s t-test on log transformed data. Downregulated proteins were assessed identically. All calculations were performed in Excel.
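The two filters can be sketched as follows. This is an illustration, not the Excel workbook used: the function names are hypothetical, the average is assumed to be taken over log transformed FCs (so FC > 1 corresponds to a log FC > 0), and the raw p-values are assumed to come from a paired t-test computed elsewhere.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_upregulated(donor_log_fcs, population_mean, population_sd, bh_p):
    """Return (sensitive, stringent) flags for one protein."""
    average = sum(donor_log_fcs) / len(donor_log_fcs)
    sensitive = (average > population_mean + population_sd
                 and all(x > 0 for x in donor_log_fcs))
    stringent = sensitive and bh_p < 0.05
    return sensitive, stringent


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
flags = classify_upregulated([1.0, 1.2, 0.8], 0.0, 0.5, adjusted[0])
```

Downregulation would mirror this with the inequalities reversed.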

R² values for donor-to-donor comparisons were calculated using Excel.

2.10.2 Chapter 4: Vaccinia Virus Screen

For analysis of the triplicate time course dataset, the S:N in each channel was analysed as a fraction of the total observed S:N for that protein across the 11-plex. Considering the fractional TMT signal rather than the absolute normalised intensity effectively corrected for differences in the number of peptides quantified for each protein and for differences between replicates.
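The fractional signal is a simple per-protein rescaling, sketched below with a hypothetical function name and made-up S:N values.

```python
def fractional_signal(sn_channels):
    """Express each channel's S:N as a fraction of the protein's total S:N."""
    total = sum(sn_channels)
    if total <= 0:
        raise ValueError("no S:N signal for this protein")
    return [value / total for value in sn_channels]


# The same protein quantified from few peptides in one replicate and many
# in another gives very different absolute S:N but the same fractions.
few_peptides = fractional_signal([10, 10, 20])
many_peptides = fractional_signal([100, 100, 200])
```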

For analysis of proteomic changes in the host, proteins were defined as ‘sensitively’ upregulated if they had a FC > 2 at any infection time point compared to the 18 h mock, and to be stringently upregulated they were also required to have a BH-corrected p-value < 0.05 at that time point. The same criteria were applied to downregulation. A two-tailed Student’s t-test was used to estimate p-values for proteins quantified in all three replicates, comparing each time point with the 18 h mock, and these were BH-corrected. These p-values are also displayed in the dot plot and on graphs of expression profiles. Expression profiles display the mean abundance across all replicate time-courses where the protein was quantified, with error bars denoting the standard error of the mean (SEM). Hierarchical clustering was performed using Cluster 3.0 on the average of the three time-courses, with the FC at each time point compared to the average of the three mock samples.
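The SEM and the FC relative to the mock average can be computed as in this sketch. The function names, time point labels and numbers are hypothetical; the SEM is assumed to use the sample standard deviation (n - 1 denominator).

```python
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean, using the sample standard deviation."""
    return stdev(values) / sqrt(len(values))


def fold_change_vs_mock(timepoint_means, mock_values):
    """FC of each time point's mean relative to the mean of the mock samples."""
    mock_mean = mean(mock_values)
    return {t: value / mock_mean for t, value in timepoint_means.items()}


replicate_abundance = [1.0, 2.0, 3.0]   # made-up values, one per replicate
error_bar = sem(replicate_abundance)
fcs = fold_change_vs_mock({"2 h": 0.2, "6 h": 0.05}, [0.1, 0.1, 0.1])
```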

For clustering of viral proteins, the AraC channels were removed, and the data were normalised to the maximum signal for each protein. XLStat (Addinsoft) was used for k-means clustering, and each cluster was then subjected to hierarchical clustering using Cluster 3.0.

The 11-plex incorporating the MG132 and C6 deletion mutant data encompassed triplicate samples for WT infection, WT + MG132, and infection with a C6 deletion mutant. P-values were estimated using a two-tailed Student’s t-test on the normalised data, comparing either WT + MG132 or the C6 deletion mutant to the triplicate WT infection, and these were BH-corrected.

Viral replication assays were performed and analysed by Dr Yongxu Lu (Smith lab). Experiments were performed in triplicate, and the data displayed are the mean ± SEM. P-values were estimated using a two-tailed t-test.

Pathway analysis for identification of enriched categories of proteins was performed using the Database for Annotation, Visualisation and Integrated Discovery (DAVID) version 6.8 (Huang et al, 2009a, 2009b). The set of proteins indicated in the text was searched against a background of all human proteins quantified in that experiment.

2.10.3 Chapter 5: Candidate Antiviral Restriction Factors

Overlap between IFN stimulated proteins and virally downregulated proteins was determined using the sensitive criteria for IFN stimulation described above, and viral downregulation from multiple datasets as described in the text.

For all ‘single colour’ HCMV restriction assays, data are presented as the mean ± SEM if triplicate data are available. If the data are duplicate, error bars indicate the range and this is specified in the figure legend. P-values were estimated using a two-tailed Student’s t-test comparing the percent infection in the three test samples to the three control samples.

For two colour restriction assays, the restriction ratio is calculated as described in the text. Bar plots display the mean restriction ratio ± SEM where triplicate data are available. P-values were estimated using a paired two-tailed Student’s t-test, with paired samples representing the percent infection of the iRFP cells compared to the mCherry cells in each of the three replicate wells.
