3. Development and validation of an in vitro coculture model of gemcitabine resistance in
4.10 Design and implementation of a validation screen to refine gene target list
There are major inherent challenges around biological and technical biases and reproducibility in large-scale high-throughput screens, including those using technologies such as CRISPR and RNA interference. These can drive a high false positive rate, often as a product of off-target effects, which are widely documented as a confounding factor in data analysis (Jackson and Linsley, 2010). The primary risk of using shRNAs in a depletion screen is that an shRNA will appear to be depleted when an off-target, non-specific effect decreases the viability of a cell type, either in general or when exposed to gemcitabine. To counter this in the primary screen, multiple unique shRNAs were included per gene, and a gene was scored as a hit only when a majority of its shRNAs were significantly depleted. However, the large number of hits required further filtering to identify the most robust hits to follow up.
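The majority rule described above can be sketched as follows. This is a minimal illustration rather than the analysis code used for the screen: the function name `call_gene_hits`, the input layout and the toy scores are all hypothetical, "majority" is assumed to mean strictly more than half of a gene's shRNAs, and the default threshold of -2 is borrowed from the Z score cut-off quoted in this chapter.

```python
from collections import defaultdict


def call_gene_hits(shrna_scores, threshold=-2.0):
    """Return genes for which a strict majority of shRNAs score <= threshold.

    shrna_scores: iterable of (gene, score) pairs, one pair per shRNA.
    This layout is a hypothetical choice made for illustration.
    """
    depleted = defaultdict(int)
    total = defaultdict(int)
    for gene, score in shrna_scores:
        total[gene] += 1
        if score <= threshold:
            depleted[gene] += 1
    # Strict majority: more than half of the shRNAs for the gene are depleted.
    return {gene for gene in total if depleted[gene] * 2 > total[gene]}


# Toy data (invented): GeneA has 2 of 3 shRNAs depleted, GeneB 1 of 4, GeneC 1 of 2.
toy_scores = [
    ("GeneA", -3.1), ("GeneA", -2.4), ("GeneA", -0.2),
    ("GeneB", -2.8), ("GeneB", 0.1), ("GeneB", 0.4), ("GeneB", -0.3),
    ("GeneC", -2.2), ("GeneC", -0.5),
]
hits = call_gene_hits(toy_scores)
```

Under this rule a single strongly depleted shRNA, as for the hypothetical GeneB, is not enough to call the gene, which is the intended protection against an isolated off-target effect.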
Taking a more refined panel of genes highlighted as potentially causative of gemcitabine resistance in the initial screen, a second more targeted validation screen was designed. For this screen, 1,973 genes were collated from a variety of sources, including all hits coming from the initial coculture and monoculture plus gemcitabine conditions, the gene sets from the most significantly depleted GSEA networks, as well as some hypothesis-driven panels, such as enzymes involved in the gemcitabine metabolic and transport processes. These were combined with a panel of 200 human olfactory receptors to serve as negative controls. Within this screen there was an average of 4.21 shRNAs targeting each gene (Fig 4.15), as opposed to 3.46 for the initial screen, thereby serving to minimise off-target related false positives.
[Figure 4.15 plot. x-axis: number of shRNAs targeting each gene (1 to 11); y-axis: number of genes (0 to 1000).]
Figure 4.15 The number of shRNAs targeting each gene within the validation screen library used.
The screen protocol was replicated as before, with six cycles of treatment used for each condition (coculture and monoculture, each plus either gemcitabine or DMSO control). Data were analysed for differential expression within each condition using the same two scoring methodologies as above, DESeq2 and median Z scores.
Hit rates across each condition, except for coculture plus gemcitabine, were all significantly lower than in the initial screen, a product of the validation screen library being built primarily to interrogate the hits arising from the coculture plus gemcitabine condition. For the coculture plus drug condition, 17.2% of the library shRNAs were scored as significantly depleted when combining scoring methodologies, compared with only 8.5% in the initial screen, indicating that the targeted focus on this condition was evident in the output data (Table 4.5, Fig 4.16, 4.17).
Condition | Median Z score: shRNAs | Median Z score: genes | DESeq2: shRNAs | DESeq2: genes
Coculture + gemcitabine | 1343 | 299 | 1317 | 281
Coculture + DMSO | 221 | 40 | 164 | 11
Monoculture + gemcitabine | 210 | 39 | 118 | 7
Monoculture + DMSO | 157 | 35 | 66 | 5
Table 4.5 Number of shRNAs and genes in the validation screen (1973 genes targeted in total) significantly depleted within the coculture and monoculture conditions +/- gemcitabine, when compared to timepoint zero.
Figure 4.16 Differential depletion and enrichment of shRNAs in the validation screen calculated using Median Z Score.
Figure 4.17 Differential depletion and enrichment of shRNAs in the validation screen calculated using DESeq2.
Reproducing the outcome from the initial screen, both Chek1 and Atr were scored as significantly depleted in the coculture plus gemcitabine condition, again adding confidence to the biological relevance and translational value of the hits identified (Fig 4.18). Similarly, Dck, included as a biological negative control because its depletion increases gemcitabine resistance, was significantly enriched in the validation screen. These results held true for both the DESeq2 and median Z score scoring methodologies.
Figure 4.18 Differential expression of validation screen control shRNAs. (A) Differential expression levels of positive controls Atr and Chek1 alongside negative control Dck from validation screen data, using Median Z Scores. A value of ≤-2 is considered significant depletion of the corresponding shRNA. (B) Differential expression levels of the same shRNAs scored using DESeq2, with significant enrichment or depletion (padj ≤ 0.05) represented using red dots.
Interrogating the validation screen data, there were 299 genes significantly depleted using median Z scores, which is 31% of the number called from the original screen. This contrasts with the DESeq2 output, wherein the 281 genes depleted in the validation screen was similar to the number called in the original screen (291). The median Z score total is reduced to such an extent because the validation screen gene list had a strong skew towards genes with a role in the gemcitabine resistance effect, and the median expression level within the pool would therefore be skewed similarly. As the Z score is a relative score based on the median expression level in the entire population, a cut-off of Z score < -2 is significantly more restrictive. For DESeq2, by contrast, MA plots across conditions show similar distributions when compared with the timepoint zero population, indicating that calls of significance remain valid within the validation screen.
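The effect of library composition on a median-based score can be illustrated with a small sketch. The exact Z score formulation used for the screens is not restated in this section, so the code assumes one common robust form: centre on the library median and scale by 1.4826 times the median absolute deviation (MAD). The function names, the log fold change values and the two toy libraries are all invented for illustration.

```python
from statistics import median


def robust_z(values):
    """Median-centred Z scores scaled by 1.4826 * MAD (an assumed formulation)."""
    centre = median(values)
    mad = median([abs(v - centre) for v in values])
    scale = 1.4826 * mad
    return [(v - centre) / scale for v in values]


def z_for(value, library):
    """Z score that a given log fold change receives within a given library."""
    return robust_z(library)[library.index(value)]


# Invented log fold changes: a neutral pattern and a depleted pattern.
NEUTRAL = [-0.5, -0.25, 0.0, 0.25, 0.5]
DEPLETED = [-2.0, -3.0, -4.0, -5.0, -6.0]

genome_wide = NEUTRAL * 18 + DEPLETED * 2  # 100 shRNAs, 10% depleted
focused = NEUTRAL * 14 + DEPLETED * 6      # 100 shRNAs, 30% depleted

# The same log fold change (-3.0) scored against each library.
z_genome_wide = z_for(-3.0, genome_wide)
z_focused = z_for(-3.0, focused)
```

In this toy setting the same log fold change receives a markedly less extreme Z score in the focused library, because both the median and the spread shift towards the depleted shRNAs. A fixed cut-off therefore admits fewer shRNAs as the library becomes more enriched for true hits.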
Figure 4.19 Validation screen common hits between differential expression scoring methodologies in coculture plus gemcitabine condition exclusively.
Of interest was the consistency of hit calling for each scoring methodology between the first screen and the validation screen. For median Z scores, approximately 43% of the hits exclusive to coculture plus gemcitabine in the validation screen were also hits in the original primary screen, whereas for DESeq2 this figure was lower at 17% (Fig 4.20). The additional hits called from the validation screen may be a product of the additional shRNAs included per gene in the validation screen for genes involved in the networks and pathways of interest, which did not themselves significantly deplete in the primary screen but are associated with pathways that did exhibit depletion. Overall, the combined hit list from the validation screen supports more thorough investigation into select biological processes highlighted from the initial screen.
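The comparison between screens amounts to two set operations: restricting hits to those exclusive to one condition, then taking the fraction of those that recur in the primary screen. The sketch below assumes hits are held as sets of gene names keyed by condition. The function names, condition labels and gene identifiers are hypothetical and do not correspond to real screen output.

```python
def exclusive_hits(hits_by_condition, condition):
    """Hits called in `condition` and in no other condition."""
    others = set().union(
        *(hits for name, hits in hits_by_condition.items() if name != condition)
    )
    return set(hits_by_condition[condition]) - others


def fraction_replicated(validation_hits, primary_hits):
    """Fraction of validation-screen hits that were also primary-screen hits."""
    if not validation_hits:
        return 0.0
    return len(validation_hits & primary_hits) / len(validation_hits)


# Toy hit lists (invented identifiers).
validation = {
    "coculture_gem": {"G1", "G2", "G3", "G4", "G5"},
    "coculture_dmso": {"G5"},
    "monoculture_gem": {"G9"},
}
primary_exclusive = {"G1", "G2", "G7"}

validation_exclusive = exclusive_hits(validation, "coculture_gem")
shared = fraction_replicated(validation_exclusive, primary_exclusive)
```

Using the validation-screen hit count as the denominator matches the way the percentages are phrased in the text; taking the primary-screen count instead would answer a different question, namely how many primary hits were confirmed.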
Figure 4.20 Common hits for DE methodologies between the initial whole-genome screen and the subsequent validation screen. Shared hits between initial shRNA screen and validation screen in the coculture plus gemcitabine condition only for each of (A) Median Z Score and (B) DESeq2 differential expression scoring techniques.
4.11 Qualitative identification of highest confidence genes driving gemcitabine resistance