

4.6 Post-screen sample preparation and sequencing

Upon completion of the screen, shRNA profiles were assessed for each assay condition and replicate, to identify the enrichment and depletion changes incurred through the screen and to allow comparative analysis between conditions. As a first step, genomic DNA was extracted from surviving K8484-inf cells to provide template for shRNA amplification, sequencing, and downstream analysis. For the coculture conditions, this extraction was preceded by a 96-hour exposure to 100 ng/ml diphtheria toxin to selectively ablate the MH17031-DTR cells, leaving a purified K8484-inf population. The shRNA cassette was amplified from this genomic DNA, followed by a separate barcoding PCR to individually label each condition and each replicate within it for pooled sequencing. Samples were sequenced using Illumina HiSeq to a depth of at least 10 million reads.

Sequencing reads were analysed by first mapping to an index of each shRNA sequence and gene name, as provided by the shRNA library manufacturer, Transomic Technologies. Differential expression analysis was then performed by comparing shRNA representation in each condition to the timepoint zero set. Differential expression was quantified using two separate approaches: DESeq2 (Love et al., 2014) and median Z scores. DESeq2 is a purpose-built tool for differential analysis of high-throughput sequencing data, which uses shrinkage estimators for expression variance and fold change to account for the low sample numbers typical of differential expression scoring. Median Z scores are created using a function that calculates the mean and standard deviation of each shRNA, defining a Z score that measures the number of standard deviations by which the shRNA expression level differs from the population median (DE scoring performed by Chandra Sekhar Reddy Chilamakuri, CRUK Cambridge Institute).
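The median Z scoring described above can be illustrated with a minimal sketch. The text does not state the normalisation, the pseudocount, or how replicates were combined, so everything here is an assumption: read counts are scaled to reads per million, a log2 fold change against timepoint zero is taken per replicate, each replicate is converted to Z scores around the population median, and the median across replicates is reported. The function name `median_z_scores` and its parameters are hypothetical and do not represent the code used for the screen.

```python
import numpy as np

def median_z_scores(counts, t0_counts, pseudocount=1.0):
    """Hypothetical sketch of a median Z score per shRNA.

    counts:    array of shape (n_shrnas, n_replicates), reads for one condition
    t0_counts: array of shape (n_shrnas,), reads at timepoint zero
    """
    counts = np.asarray(counts, dtype=float)
    t0 = np.asarray(t0_counts, dtype=float)

    # Assumption: scale each library to reads per million so that
    # replicates sequenced to different depths are comparable.
    cpm = counts / counts.sum(axis=0, keepdims=True) * 1e6
    t0_cpm = t0 / t0.sum() * 1e6

    # Log2 fold change of each replicate against timepoint zero.
    lfc = np.log2(cpm + pseudocount) - np.log2(t0_cpm + pseudocount)[:, None]

    # Z score per replicate: distance from the population median,
    # expressed in units of the population standard deviation.
    centre = np.median(lfc, axis=0, keepdims=True)
    spread = lfc.std(axis=0, ddof=1, keepdims=True)
    z = (lfc - centre) / spread

    # Summarise each shRNA by the median of its replicate Z scores.
    return np.median(z, axis=1)
```

Under these assumptions, an shRNA whose representation falls well below the bulk of the library receives a strongly negative score, which is what the ≤ -2 threshold used for hit calling is intended to capture.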

With genome-wide data, the threshold for what is termed a “hit” can be adjusted along a gradient, depending on the degree of confidence required and the number of candidate shRNAs needed for subsequent analytical steps, such as pathway analysis. Given that the purpose of this primary screen was to identify a large list of genes of interest to inform later, more targeted screening protocols, and factoring in the confounding issue of off-target-driven false negatives and positives, hit scoring criteria were set to be minimally restrictive. For DESeq2, shRNAs were considered hits if they had a Log2(FoldChange) of ≤ 0 and an adjusted p value of ≤ 0.05. Similarly, for median Z scoring, shRNAs were termed hits with a median Z score of ≤ -2, that is, an shRNA representation at least two standard deviations below the median shRNA representation value. Further, for a gene to be recorded as a hit, at least 50% of the shRNAs targeting that gene needed to be hits or, if only one shRNA targeted that gene, that shRNA had to be significantly depleted (Table 4.1).

Gene hit criteria

DESeq2                            Median Z score
padj ≤ 0.05                       Z score ≤ -2
Log2(FoldChange) < 0              -
≥ 50% targeting shRNAs hitting    ≥ 50% targeting shRNAs hitting

Table 4.1 Hit calling criteria using normalised primary shRNA screen data, for both DESeq2 and median Z score differential expression analysis methodologies.
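The hit-calling criteria can be expressed as a short sketch. The function names, argument names, and data structures below are hypothetical and chosen only for illustration; the thresholds are those stated in the text (Log2(FoldChange) ≤ 0 and adjusted p ≤ 0.05 for DESeq2, median Z ≤ -2, and at least 50% of targeting shRNAs hitting). Treating a missing adjusted p value as a non-hit is an assumption.

```python
import math
from collections import defaultdict

def deseq2_shrna_hit(log2fc, padj, lfc_max=0.0, padj_max=0.05):
    """shRNA-level DESeq2 criterion: depleted and significant."""
    # Assumption: a missing (None/NaN) adjusted p value is not a hit.
    if padj is None or math.isnan(padj):
        return False
    return log2fc <= lfc_max and padj <= padj_max

def median_z_shrna_hit(z, z_max=-2.0):
    """shRNA-level median Z criterion: at least two SDs below the median."""
    return z <= z_max

def call_gene_hits(shrna_to_gene, shrna_is_hit, min_fraction=0.5):
    """Gene-level criterion: at least `min_fraction` of targeting shRNAs hit.

    shrna_to_gene: dict mapping shRNA id -> gene name
    shrna_is_hit:  dict mapping shRNA id -> bool (shRNA-level call)
    """
    flags_by_gene = defaultdict(list)
    for shrna, gene in shrna_to_gene.items():
        flags_by_gene[gene].append(bool(shrna_is_hit.get(shrna, False)))

    # A gene targeted by a single shRNA passes only if that shRNA is a hit,
    # because 1/1 is the only fraction that reaches the threshold.
    return {
        gene
        for gene, flags in flags_by_gene.items()
        if sum(flags) / len(flags) >= min_fraction
    }
```

Expressing the gene-level rule as a fraction means the single-shRNA case needs no special handling, since a lone shRNA contributes either 0% or 100%.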


The subset of most interest within this screen was the coculture condition plus gemcitabine. By identifying genes that are significantly depleted in this subset when compared to timepoint zero, and not depleted in any of the other three conditions, it is possible to identify genes which may have a functional role in driving the resistance observed in the original model of coculture-driven resistance. The coculture plus gemcitabine condition had the highest number of significantly depleted shRNAs using both scoring methodologies, with 2,763 shRNAs depleted using DESeq2 and 5,563 using median Z scores (Table 4.2), corresponding to 291 genes significantly depleted with DESeq2 and 959 genes with median Z score. The distributions of shRNAs in the DESeq2 and median Z score analyses are shown in Figures 4.6 and 4.7, respectively. Notably, for DESeq2 analysis, over eight times as many shRNAs were identified as depleted relative to timepoint zero in the coculture plus gemcitabine condition as in the monoculture plus DMSO condition, and over three times as many as in the monoculture plus gemcitabine condition. These differences are potentially a product of the presence and concentration of drug used, whereby higher doses of gemcitabine may induce changes in gene expression within cells through their enhanced dependence on resistance-driving genes for survival. Additionally, with a higher GI50 in the coculture plus gemcitabine condition, there exists a larger range of gemcitabine GI50s to which cells can be sensitised, and therefore an increased ability to exhibit statistically significant change versus the DMSO control.
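The selection of genes depleted only in the coculture plus gemcitabine condition amounts to a set difference between gene-level hit lists. A minimal sketch follows; the function name `condition_specific_hits` and the dictionary layout are hypothetical, and any gene names passed in are placeholders, not results from the screen.

```python
def condition_specific_hits(hits_by_condition, target):
    """Return genes that are hits in `target` and in no other condition.

    hits_by_condition: dict mapping condition name -> iterable of gene names
                       called as depleted relative to timepoint zero
    target:            the condition of interest
    """
    # Pool the hits from every condition except the target.
    hits_elsewhere = set().union(
        *(genes for cond, genes in hits_by_condition.items() if cond != target)
    )
    # Keep only genes depleted in the target and nowhere else.
    return set(hits_by_condition[target]) - hits_elsewhere
```

Genes depleted in every condition, which are more likely to reflect general fitness effects than coculture-driven resistance, are removed by the same subtraction.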

Condition                    Median Z score        DESeq2
                             shRNAs    genes       shRNAs    genes
Coculture + gemcitabine      5563      959         2763      291
Coculture + DMSO             2741      448         1043      159
Monoculture + gemcitabine    2064      298         738       78
Monoculture + DMSO           1545      212         345       38

Table 4.2 Number of shRNAs and genes significantly depleted within the coculture and monoculture conditions +/- gemcitabine. Compared to timepoint zero, from primary shRNA


Figure 4.6 shRNA differential expression scoring between conditions using DESeq2.

Significantly differentially expressed shRNAs (red dots) called using R Bioconductor package DESeq2 for coculture plus (A) gemcitabine, (B) DMSO, and monoculture plus (C) gemcitabine and (D) DMSO. shRNAs with a Log2FoldChange value of less than 0 indicate depletion through the screen, whereas those greater than 0 indicate enrichment.


Figure 4.7 shRNA differential expression scoring between conditions using Median Z Scores. Significantly differentially expressed shRNAs called using Median Z Scores for coculture plus (A) gemcitabine, (B) DMSO, and monoculture plus (C) gemcitabine and (D) DMSO. shRNAs were ranked by Median Z Score value (standard deviations from population median expression), with shRNAs with a value of ≤ -2 or ≥ 2 (marked by red lines) considered significantly depleted or enriched, respectively.