submitted to the
Combined Faculty of Natural Sciences and Mathematics
of the Ruperto Carola University Heidelberg, Germany
for the degree of
Doctor of Natural Sciences
Presented by
M.Sc. Janine Jung
Born in: Rottweil, Germany
Oral examination: 30.10.2020
Identifying and characterizing
functionally relevant microRNAs and 5’isomiRs
in triple-negative breast cancer
Referees: Prof. Dr. Stefan Wiemann
Prof. Dr. Peter Angel
Declaration of authorship
I hereby declare that the work presented in this thesis was carried out between November 2016 and August 2020 under the supervision of Prof. Dr. Stefan Wiemann in the group ‘Molecular Genome Analysis’ at the German Cancer Research Center (Heidelberg, Germany).
If not stated otherwise and referenced accordingly in the text, the data described in my dissertation is original, has been gathered by myself and has not yet been presented as part of a university examination. The main sources as well as the work of joint cooperation have been referenced appropriately. I, as author, herewith declare no potential conflict of interest.
Heidelberg, _____________________ Janine Jung
ACKNOWLEDGEMENTS
First of all, I would like to thank Prof. Dr. Stefan Wiemann for giving me the opportunity to conduct the work for this thesis in the Division of Molecular Genome Analysis and for providing a scientific environment that allowed me to develop my own project and to explore my ideas. I really enjoyed working at MGA.
Second, I would like to thank Dr. Cindy Körner for designing this project with me and for the supervision during the past years. I really appreciated your support, especially in difficult times. I would also like to thank my Thesis Advisory Committee members Prof. Dr. Peter Angel, Prof. Dr. Claudia Scholl and Prof. Dr. Dirk Grimm for their valuable advice. Moreover, I want to thank my examiners Prof. Dr. Stefan Wiemann, Prof. Dr. Peter Angel, Prof. Dr. Claudia Scholl and Prof. Dr. Joachim Wittbrodt for evaluating my thesis.
Furthermore, I would like to thank Subarna Palit, Susanne Ibing and Shashwat Sahay for the bioinformatic support.
I am also very grateful to Prof. Dr. Peter Sinn and Dr. Martina Kirchner for allowing me to use their NanoString nCounter system and for the experimental help provided by Martina.
Many thanks to all members of MGA for the warm and positive atmosphere in the lab. I am especially thankful for the nice and familial atmosphere that Angelika and Heike spread in ‘our’ lab wing. Thank you, Angelika, for your help with the last-minute experiments.
A huge ‘thank you’ goes to my ‘girls office’. Thank you so much for everything Xiaoya and Zhivka. I am grateful for all the scientific and non-scientific conversations we had and all the help you offered when I needed it. And most importantly, I enjoyed our shared love for food and us actually sharing a lot of food.
Furthermore, I want to thank Ana, Anni and Philipp from the R&D career day marketing team. I enjoyed working and not-working with you guys. I loved our coffee breaks!
I am also very grateful for the support my parents and my sister showed me throughout my entire education. In addition, I want to thank my parents-in-law for their support and for welcoming me into their family.
I owe more than words can say to my friends. Most of you have been my support system for many years and for you this is just the third thesis in a row, but to me it means so much more that you came this long way with me. We shared all the bad and mainly good and crazy times and not taking life or us too serious helped with everything. Thank you, Verena, Arthur, Dan, Jens, Bryan and
Björn, for all the parties, food, more food and all the funny and insane conversations (not to forget all the awesome business ideas). And Eddy, although we don’t manage to meet that often, it is always like our time as Bachelor students never ended whenever we see each other. Marcel, thank you for introducing us to the world of handball and enlightening the gym every single day (mostly with your good mood, but also with the color of your head and hair :D). A huge thank you goes to my best friend Michael. There are really no words that describe the past three years and I am grateful for all the funny and awesome moments and (partially insane) conversations.
My deepest gratitude, however, goes to my husband Michael for his constant support in literally everything. Without you this thesis and many more things would not have been possible or turned out very differently. This journey would have simply not been the same without you.
TABLE OF CONTENT
ACKNOWLEDGEMENTS ... 7
TABLE OF CONTENT ... 9
1. SUMMARY / ZUSAMMENFASSUNG ... 13
1.1 Summary ... 13
1.2 Zusammenfassung ... 14
2. INTRODUCTION ... 17
2.1 Breast cancer ... 17
2.1.1 Triple-negative and Basal breast cancer ... 18
2.1.2 Triple-negative breast cancer subtypes ... 18
2.1.3 Chemotherapy as standard care for triple-negative breast cancer patients ... 19
2.2 microRNAs ... 21
2.2.1 Biogenesis and function of microRNAs ... 21
2.2.2 Studying microRNA:target interactions ... 24
2.2.3 isomiRs - microRNA sequence variants ... 25
2.2.4 Quantification of isomiR expression ... 27
2.2.5 microRNAs in triple-negative breast cancer ... 28
2.2.6 Clinical application of microRNAs and isomiRs ... 28
2.3 Aim of the project ... 31
3. MATERIAL AND METHODS ... 32
3.1 Material ... 32
3.1.1 Laboratory equipment ... 32
3.1.2 Consumables ... 33
3.1.3 Chemicals and reagents ... 34
3.1.4 Commercial kits ... 35
3.1.6 Cell lines and growth medium ... 37
3.1.7 Bacterial strains ... 37
3.1.8 Mouse lines ... 37
3.1.9 Primers and oligos ... 37
3.1.8 siRNAs and microRNA mimics ... 43
3.1.9 Plasmids ... 43
3.1.10 Databases and software... 44
3.2 Methods ... 45
3.2.1 Cloning of the pre-microRNA library ... 45
3.2.2 General cell culture ... 47
3.2.3 Stable cell lines ... 49
3.2.4 Expression analysis ... 51
3.2.5 Phenotypic assays ... 55
3.2.6 Bioinformatic and statistical analysis ... 58
3.2.7 Establishment of a customized NanoString assay ... 59
3.2.8 Analysis of NanoString data ... 63
4. RESULTS ... 65
4.1 microRNAs and 5’isomiRs in chemoresistance of triple-negative breast cancer ... 65
4.1.1 Establishment of an in vitro system to study microRNAs in chemoresistance ... 65
4.1.2 Establishment of a NanoString assay to detect microRNAs modulating chemoresistance ... 72
4.1.3 NanoString assay detects strong enrichment of pre-miR-103a-1 in 3D ... 74
4.1.4 Heterogeneous overexpression and selection bias of pre-microRNAs in the library ... 79
4.2 Functional differences of divergent 5’isomiRs in triple-negative breast cancer ... 82
4.2.1 miR-1307-3p I0 and its divergent 5’isomiR are highly abundant in breast cancer... 82
4.2.2 pre-miR-1307 reduces mesenchymal traits of MDA-MB-231 ... 84
4.2.3 miR-1307-3p I1 reduces migration and proliferation ... 86
4.2.5 Myc as potential regulator of pre-miR-1307 and ATP5MD ... 89
4.2.6 miR-1307-5p I0 as driver for high miR-1307-3p levels in breast cancer patients? ... 92
4.2.7 miR-1307-3p I0 and miR-1307-3p I1 have distinct and shared target subsets ... 93
4.2.8 miR-1307-3p I1 might reduce migration and proliferation by targeting NCS1 ... 95
4.2.9 miR-1307-3p I0 might reduce migration by targeting LBH ... 97
4.2.10 miR-1307-3p I0 targets multiple ATPase subunits and might play a role in autophagy ... 99
5. DISCUSSION ... 102
5.1 Mammosphere assay to identify microRNAs in chemoresistance ... 102
5.1.1 Proof of principle - enrichment for chemoresistant BCSCs in mammospheres ... 102
5.1.2 Enrichment of pre-miR-103a-1 in spheres is not reflected on the functional level ... 104
5.1.3 Selection bias of microRNAs for the pre-microRNA library due to TCGA batch effects ... 108
5.2 Divergent 5’isomiR miR-1307-3p I1 promotes a different phenotype than miR-1307-3p I0 ... 111
5.2.1 pre-miR-1307 reduces migration and invasion in TNBC ... 111
5.2.2 miR-1307-3p I0 and I1 play divergent functional roles in a cell line-specific manner .. 112
5.2.3 Divergent target spectra provide target genes that might explain the phenotypic and mechanistic differences mediated by miR-1307-3p I0 and I1... 113
5.2.4 Upregulation of a tumor-suppressive microRNA in breast cancer... 118
5.2.5 Conclusion and outlook ... 121
6. SUPPLEMENTARY ... 124
7. LIST OF FIGURES ... 127
8. LIST OF TABLES ... 128
9. ABBREVIATIONS ... 129
1. SUMMARY / ZUSAMMENFASSUNG
1.1 Summary
Triple-negative breast cancer is a highly aggressive breast cancer subtype whose treatment options are mainly limited to chemotherapy, to which patients frequently develop resistance. As endogenous regulators of gene expression, microRNAs are involved in tumor development, progression and treatment resistance. microRNA sequence variants with a shifted seed sequence are termed 5’isomiRs and extend the complexity and impact of the miRNome in cancer. A shift in the seed sequence by only one nucleotide can drastically alter the target spectrum of a 5’isomiR compared to its canonical microRNA. Hence, this study aims to identify microRNAs and 5’isomiRs with a potential role in tumorigenesis and chemoresistance and focuses on characterizing their functional differences in triple-negative breast cancer.
I selected microRNAs and 5’isomiRs that were differentially expressed between tumor and normal tissue of patients from the TCGA cohort and, thus, potentially involved in tumorigenesis and chemoresistance. Growing mammospheres from MDA-MB-231, HCC1806 and SUM-159 cells that overexpressed the selected microRNAs as a pooled library enriched for cells with increased stemness and chemoresistance. Read-out of the library composition by NanoString after several sphere generations revealed strong enrichment of pre-miR-103a-1. In validation experiments, however, pre-miR-103a-1 overexpression did not influence stemness or chemoresistance.
In the second part of the project, I focused on the functional characterization of miR-1307-3p I0 and its 5’isomiR miR-1307-3p I1. Both were selected from the list of differentially expressed microRNAs based on their similar expression levels. Phenotypic assays in triple-negative breast cancer cell lines showed that both microRNAs reduce migration, miR-1307-3p I0 in a cell line-specific manner and less pronounced than miR-1307-3p I1. miR-1307-3p I1 repressed proliferation in a cell line-dependent context. Target predictions identified genes that might contribute to these phenotypes and explain differences between cell lines. The putative targets suggested that miR-1307-3p I0 plays a role in autophagy.
In summary, I showed that miR-1307-3p I0 and I1 influence both distinct and shared phenotypes in a partially cell line-dependent manner by targeting specific as well as shared putative target subsets. This study underlines how complex and context-dependent the modulation of gene expression by microRNAs and their 5’isomiRs is, and that 5’isomiRs are of biological relevance. Consequently, diagnostic, prognostic and therapeutic approaches should discriminate between 5’isomiRs.
1.2 Zusammenfassung
Triple-negative breast cancer is a very aggressive form of breast cancer, and treatment options are mainly limited to chemotherapy, to which patients, however, frequently develop resistance. As endogenous regulators of gene expression, microRNAs are involved in the development, progression and treatment resistance of tumors. microRNA sequence variants with a shifted seed sequence are called 5’isomiRs and extend the complexity and influence of the miRNome in cancer. A shift of the seed sequence by only one nucleotide can drastically alter the target spectrum of a 5’isomiR compared to the canonical microRNA. Therefore, this study aims to identify microRNAs and 5’isomiRs with a possible involvement in tumorigenesis and chemoresistance and focuses on characterizing their functional differences in triple-negative breast cancer.
I selected microRNAs and 5’isomiRs that are differentially expressed between tumor and normal tissue of patients from the TCGA cohort and are therefore likely involved in tumorigenesis and chemoresistance. 3D cultivation of MDA-MB-231, HCC1806 and SUM-159 cells that overexpressed the selected microRNAs as a pooled library enriched for cells with stem cell character and increased chemoresistance. Determining the library composition with NanoString after several 3D generations showed a strong enrichment of pre-miR-103a-1. In validation experiments, overexpression of pre-miR-103a-1 did not affect stem cell character or chemoresistance.
In the second part of the project, I focused on the functional characterization of miR-1307-3p I0 and its 5’isomiR miR-1307-3p I1. Both were selected from the list of differentially expressed microRNAs because of their similarly strong expression. Phenotypic experiments in triple-negative breast cancer cell lines showed that both microRNAs reduce migration, miR-1307-3p I0 in a cell line-specific manner and less markedly than miR-1307-3p I1. miR-1307-3p I1 suppressed proliferation in a cell line-dependent context. Using target predictions, genes were identified that likely contribute to these phenotypes and explain the differences between the cell lines. In addition, the putative targets indicated that miR-1307-3p I0 plays a role in autophagy.
In summary, I showed that miR-1307-3p I0 and I1 influence different and similar phenotypes in a partially cell line-specific manner. This occurs via specific and shared subsets of the target spectrum. This study underlines how complex and context-dependent the modulation of gene expression by microRNAs and their 5’isomiRs is and that they are of biological relevance. Consequently, diagnostic, prognostic and therapeutic approaches should distinguish between 5’isomiRs.
2. INTRODUCTION
2.1 Breast cancer
Breast cancer is a common disease in women [1]. In 2019, the American Cancer Society reported that 268,600 women were diagnosed with breast cancer and 41,760 died from breast cancer [2]. These numbers render breast cancer the most common cancer entity in women and the second most frequent cancer-related cause of death. The heterogeneity of breast cancer requires the discrimination between subtypes in order to recommend a suitable therapeutic strategy. In the past, breast cancer was classified mainly based on immunohistological characteristics, for instance, the presence of estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) [3]. ER+ is the most common receptor status of breast tumors and is treated with endocrine therapy, while patients with HER2+ breast cancer receive HER2-targeted therapy, for instance, Trastuzumab (Figure 1) [4]. Both patient groups have a more favorable prognosis than triple-negative breast cancer (TNBC) patients, whose tumors lack ER, PR and HER2 expression. The lack of receptors that can be targeted explains why there is currently no targeted therapy available for the 15-20 % of patients that are diagnosed with TNBC [5,6]. Approximately 75 % of all breast cancer patients, however, have a good prognosis since they are ER+ and, thus, benefit from endocrine therapy [7,8].
Figure 1: Breast cancer subtypes and targeted therapy options. Breast cancer classification based on receptor status: ER+, HER2+ and TNBC. These three groups strongly differ in their prognosis and different targeted therapy options are available. Each breast cancer type is divided into further subtypes. SERM = selective estrogen receptor modulator, AI = aromatase inhibitor. The figure was modified from Ma et al., 2018 [4].
Nowadays, gene expression profiling and RNA sequencing allow this classification to be refined by taking gene signatures into account and, thus, improve prognostic and therapeutic approaches. The PAM50 intrinsic subtype classifier discriminates between five breast cancer subtypes based on the mRNA expression pattern of 50 genes: Luminal A, Luminal B, HER2-enriched, Basal-like and Normal-like [9–11]. Luminal A is the most frequent PAM50 subtype (50-60 %), followed by Luminal B (15-20 %), HER2 (15-20 %) and Normal-like (5-10 %) [12]. The abundance of Basal breast cancer (8-37 %) varies considerably depending on the prevalence of poorly differentiated grade 3 cases in the evaluated population. Basal and HER2+ breast cancer are the most aggressive subtypes and are associated with poor prognosis [13,14]. Luminal A patients have the best prognosis, followed by Luminal B patients [12]. The Normal-like subtype, however, is poorly characterized.
2.1.1 Triple-negative and Basal breast cancer
In 70-80 % of all cases, Basal breast tumors are also classified as TNBC [11]. While TNBC lacks ER and PR expression as well as HER2 amplification, approximately 20 % of the Basal breast tumors overexpress either ER or HER2 [5]. Moreover, several other markers including EGFR, c-Kit as well as cytokeratins 5, 6, 14 and 17 are associated with Basal tumors [15]. Basal tumors overexpress genes related to proliferation, cell cycle and DNA damage response [16]. TNBC shows a high frequency of TP53 mutations (80 %), while PIK3CA mutations are present in 8 % of all cases [17,18]. The tumor suppressors PTEN, RB1 and BRCA1 are frequently lost in TNBC, whereas MYC is commonly amplified [19]. Despite several differences, TNBC and Basal breast cancer also have common features: both are highly aggressive and frequently metastasize to lungs and brain, and patients with residual disease after chemotherapy have a poor overall prognosis since they are more prone to relapse than patients with other subtypes [5,20–22].
2.1.2 Triple-negative breast cancer subtypes
Based on gene expression and pathway activities, TNBC was originally classified into six subtypes: Basal-Like 1 (BL1), Basal-Like 2 (BL2), Immunomodulatory (IM), Luminal Androgen Receptor (LAR), Mesenchymal (M) and Mesenchymal Stem-Like (MSL) (Figure 1) [23]. However, Lehmann et al. showed that the subtypes IM and MSL originated from tumor-infiltrating lymphocytes and stromal cells, respectively [24]. Thus, the classification was simplified to the four subtypes BL1, BL2, LAR and M. BL1 is characterized by an increase in gene expression related to cell cycle and DNA damage response. Moreover, this subtype is the most prevalent one and has the best prognosis [24]. The TNBC subtype M is driven by enhanced EMT and growth factor signaling. The subtypes with the worst prognosis, BL2 and LAR, are enriched in growth factor signaling and myoepithelial markers or luminal gene expression, respectively. The highly proliferative subtype BL1 responds well to therapies targeting mitosis, while M patients might benefit from agents targeting angiogenesis and PI3K/mTOR inhibitors based on their enrichment in EMT signatures [25]. The dependence on the androgen receptor renders anti-androgen therapies highly relevant for LAR patients [26]. Since the BL2 subtype is characterized by upregulated growth factor signaling, angiogenic factors, glycolysis and gluconeogenesis, inhibitors targeting the receptors of VEGF, PDGF and FGF show some promise [27]. Overall, the classification of TNBC into subtypes allows a more refined treatment strategy and is of high interest to patients who do not respond to chemotherapy or relapse after treatment. The distinct molecular signature allows identifying potential targets and therapeutically exploiting pathway vulnerabilities. The respective targeted therapies, however, still need to be tested in clinical trials.
2.1.3 Chemotherapy as standard care for triple-negative breast cancer patients
Due to the lack of targeted therapy options, neoadjuvant chemotherapy is the standard of care for patients with TNBC [28]. In 50 % of the patients, neoadjuvant chemotherapy does not result in pathological complete response (pCR). The lack of complete response is associated with a high rate of recurrence (40-60 %) [29]. Overall, TNBC patients with less progressed disease are more likely to respond to chemotherapy [30]. The pCR rate in the subtype BL1 is around 41-52 % [24,28,31], whereas the other subtypes have a worse response to chemotherapy. BL2 and LAR show a pCR rate of 0-18 % and 10-29 %, respectively [24,31]. In the subtype M, almost 40 % of the patients have a complete response to chemotherapy [24].
Anthracyclines and taxanes are administered as the standard chemotherapy regimen in the neoadjuvant setting [32]. Platinum-based regimens have been proposed, but their effectiveness is still under investigation and they are not part of the standard treatment protocol [32]. Adding cisplatin or carboplatin to the chemotherapy regimen increased pCR rates; whether survival improves, however, still needs to be determined [33]. Anthracyclines and platinum agents target DNA synthesis and integrity, while taxanes affect cytokinesis [34]. Patients that receive anthracycline-based chemotherapy and relapse can be treated with taxanes and vice versa. For patients with recurrent disease after treatment with anthracyclines and taxanes, fluorouracil/capecitabine, eribulin, gemcitabine, vinorelbine or ixabepilone represent alternative treatments [35].
While TNBC patients with residual disease after neoadjuvant chemotherapy but without a relapse are characterized by a luminal-like gene signature, patients with recurrent disease reveal a stem cell-like gene signature [36]. Stem cells and stemness-related pathways strongly contribute to chemoresistance in TNBC [37]. However, several other molecular mechanisms play a major role in the development of chemoresistance: senescence and autophagy, for instance, circumvent apoptosis and thereby allow TNBC cells to escape chemotherapy [38]. The upregulation of ABC transporters helps the cells to export the chemotherapeutic drugs [37]. In conclusion, insight into the mechanisms of chemoresistance of TNBC patients with recurrent disease after chemotherapy is urgently required to develop targeted therapies or strategies to overcome treatment resistance.
2.2 microRNAs
2.2.1 Biogenesis and function of microRNAs
MicroRNAs are major regulators of endogenous gene expression and typically regulate hundreds of targets [39,40]. Mammalian microRNA genes are mainly located within introns of protein-coding or non-coding genes, while a smaller fraction overlaps with exons of protein-coding or non-coding transcripts [41,42]. In several cases, the splicing pattern determines whether the microRNA gene is located within an exonic or intronic sequence [42]. microRNA biogenesis starts with the primary microRNA (pri-microRNA) being transcribed from the microRNA gene by RNA polymerase II [43] (Figure 2). The pri-microRNA is composed of 500-3,000 nucleotides, a 7-methylguanosine cap at the 5’terminus and a polyadenylated 3’tail [44]. In the nucleus, the pri-microRNA is cleaved by a complex formed by Drosha and DGCR8, which results in a precursor microRNA (pre-microRNA) of 70-80 nucleotides in length [45,46]. Export of the pre-microRNA into the cytoplasm by Exportin-5 allows further processing by the Dicer-TRBP complex, which forms a microRNA duplex consisting of a guide strand and a passenger strand [43]. Usually, the guide strand is characterized by lower stability at the 5’terminus or by a uracil at the 5’end [47]. After the microRNA duplex is loaded into the RNA-Induced Silencing Complex (RISC) formed by Argonaute proteins, both strands are unwound and the passenger strand is degraded [48]. The microRNA guide strand, however, remains incorporated into the RISC (miRISC) and guides the complex to the respective mRNA targets in order to repress their translation [49]. Alternatively, the target mRNA is degraded [40,50]. microRNA biogenesis can also occur via Drosha- or Dicer-independent pathways; these alternative pathways, however, are less well studied [51].
Mature microRNAs comprise 21-25 nucleotides, of which mainly the seed sequence (nucleotides 2-8) determines specific binding to a complementary region within the 3’UTR of mRNA targets [52,53]. The 3’UTRs of microRNA targets often harbor several binding sites for the same or other microRNAs [39,54], which amplifies the repression of the mRNA target. Moreover, mRNA targets may influence the levels of the microRNAs regulating them [55], which shows the complexity of these bidirectional microRNA-target relations. Overall, more than 60 % of all protein-coding genes were found to be conserved microRNA targets [56]. This shows that microRNA-regulated gene expression is a global phenomenon and that microRNA deregulation exerts a great impact on cellular fate by affecting multiple signaling pathways [57], which makes microRNAs crucial players in cancer
Figure 2: microRNA biogenesis. Mammalian microRNA biogenesis starts with pri-microRNA transcription by RNA polymerase II from the genome. Cleavage of the pri-microRNA by the microprocessor complex consisting of Drosha and DGCR8 results in the pre-microRNA. Subsequently, the pre-microRNA is exported to the cytoplasm by Exportin-5. In the cytoplasm, the Dicer-TRBP complex cleaves the pre-microRNA, which results in a microRNA duplex. The microRNA guide strand is bound by the RNA-Induced Silencing Complex (RISC) formed by Argonaute proteins. The mature microRNA guides the RISC to the mRNA targets, resulting in translational repression or degradation of the bound transcripts. During microRNA biogenesis, alternative Dicer cleavage sites, post-transcriptional modifications and RNA editing produce microRNA sequence variants, so-called isomiRs. The figure was modified from Bajan et al., 2014 [43].
Different types of binding sites within the 3’UTR determine the impact that microRNAs have on their mRNA targets. Canonical binding sites are the most common type and differ in the number of bases that are complementary to the seed sequence of the microRNA (Figure 3). The extent to which a microRNA represses its target is largely determined by the number of bases within the 3’UTR that are complementary to the seed sequence: 8mer binding sites in the 3’UTR consist of more base pairs that are complementary to the seed sequence than 7mer-m8, 7mer-A1 or 6mer binding sites, which results in a higher binding affinity and, thus, stronger repression of the mRNA target [59,60]. Besides the canonical binding sites, microRNAs can bind and regulate their targets via 3’supplementary and 3’compensatory sites. Both target sites provide additional base pairs complementary to parts of the microRNA: the 3’supplementary site is characterized by additional pairing of 3-4 nucleotides at position 13-16, while the 3’compensatory site reveals additional pairing at position 13-16 in order to compensate for discontinuous complementarity between 3’UTR and seed sequence [61,62].
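The canonical site types named above can be told apart programmatically. The following is a minimal sketch, assuming the commonly used definitions (6mer: pairing to microRNA nucleotides 2-7; 7mer-m8: additional pairing to nucleotide 8; 7mer-A1: an adenine opposite nucleotide 1; 8mer: both). These definitions, the function names and the toy sequences are assumptions for illustration and are not taken from this thesis.

```python
# Sketch only: site definitions follow a common convention and are an
# assumption here; TOY_MIRNA and the UTR strings are illustrative.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def classify_sites(mirna, utr):
    """Find canonical seed matches of `mirna` in `utr` (both 5'->3').

    Returns a list of (index of the 6mer core in the UTR, site type).
    """
    core = reverse_complement(mirna[1:7])      # pairs with microRNA nt 2-7
    m8_partner = reverse_complement(mirna[7])  # pairs with microRNA nt 8
    sites = []
    start = utr.find(core)
    while start != -1:
        end = start + len(core)
        has_m8 = start > 0 and utr[start - 1] == m8_partner
        has_a1 = end < len(utr) and utr[end] == "A"  # A opposite nt 1
        if has_m8 and has_a1:
            kind = "8mer"
        elif has_m8:
            kind = "7mer-m8"
        elif has_a1:
            kind = "7mer-A1"
        else:
            kind = "6mer"
        sites.append((start, kind))
        start = utr.find(core, start + 1)
    return sites


TOY_MIRNA = "UGAGGUAGUAGGUUGUAUAGUU"  # toy sequence for illustration
print(classify_sites(TOY_MIRNA, "AAACUACCUCAGGG"))
```

Real prediction tools additionally weigh conservation and sequence context, which this sketch deliberately ignores.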
Although microRNAs predominantly target 3’UTRs, they can also repress mRNA targets by binding to 5’UTRs or coding sequences [63]. microRNA-532-5p, for instance, downregulated RUNX3 by targeting the 5’UTR [64] and let-7 repressed Dicer by binding to the coding sequence [65]. Moreover, targeting of promoters can upregulate gene expression. miR-324-3p, for instance, induced expression of RelA by binding within the promoter sequence [66].
Figure 3: Different types of canonical microRNA binding sites. Canonical microRNA binding sites are differentiated based on the extent to which the seed sequence of the microRNA is complementary to a sequence within the 3’UTR of an mRNA target. The more bases of the 3’UTR are complementary to the microRNA seed sequence, the stronger the microRNA:target interaction and the resulting downregulation of gene expression.
2.2.2 Studying microRNA:target interactions
microRNA targets are usually identified and validated in a process consisting of multiple steps: target prediction tools such as PicTar, MiRanda or TargetScan are employed to detect targets with conserved microRNA binding sites [60]. A microRNA may target around 200 mRNAs; the predictions, however, often comprise thousands of potential targets and contain numerous false positives [40]. To exclude false positives, the predicted targets are overlapped with experimental perturbation data that is generated by microRNA overexpression. Typical formats for the experimental validation are sequencing or microarray-based profiling of all expressed genes, whereas qRT-PCR-based validation is carried out for a smaller number of targets that have been preselected, for instance, based on the literature. To confirm direct binding of validated mRNA targets by the microRNA, the 3’UTRs of the respective mRNA targets are cloned into a luciferase reporter construct. Cells are transfected with microRNA mimics together with the luciferase reporter and direct binding of the 3’UTR is detected by a reduction in luciferase activity compared to the non-targeting control [67]. Mutation of the binding sites within the reporter should abolish the effect of the microRNA on luciferase activity, which provides another layer of validation [67].
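The overlap between predicted targets and perturbation data described above amounts to a filtered set intersection. The sketch below is illustrative only: the gene names are placeholders, and the log2 fold change cutoff of -0.5 is an arbitrary assumption, not a threshold used in this thesis.

```python
def candidate_targets(predicted, log2_fold_changes, max_log2fc=-0.5):
    """Keep predicted targets that are downregulated upon microRNA overexpression.

    predicted:          set of gene symbols from a target prediction tool
    log2_fold_changes:  dict gene -> log2 fold change (overexpression vs. control)
    max_log2fc:         hypothetical cutoff; genes at or below it count as repressed
    """
    return sorted(
        gene
        for gene in predicted
        if gene in log2_fold_changes and log2_fold_changes[gene] <= max_log2fc
    )


# Placeholder data: GENE_D was predicted but not measured,
# GENE_E was repressed but not predicted.
predicted_targets = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
overexpression_log2fc = {
    "GENE_A": -1.2,
    "GENE_B": 0.1,
    "GENE_C": -0.6,
    "GENE_E": -2.0,
}

candidates = candidate_targets(predicted_targets, overexpression_log2fc)
print(candidates)
```

Genes that pass this filter would still need confirmation of direct binding, for example by the luciferase reporter assay described above.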
More advanced approaches to identify microRNA targets focus on determining the targets while the miRISC is interacting with them. Cross-linking and immunoprecipitation (CLIP) strategies employ UV light for crosslinking of mRNA targets bound by miRISC and immunoprecipitation of Ago to pull down the miRISC complex [68,69]. Subsequent high-throughput sequencing allows identifying the immunoprecipitated targets and microRNAs that were incorporated into the RISC (HITS-CLIP) [68,70]. For photoactivatable-ribonucleoside-enhanced CLIP (PAR-CLIP), cells are fed with medium containing the nucleoside analog 4-thiouridine and RNA:protein interactions are cross-linked at a different wavelength than in HITS-CLIP [70]. Further development of CLIP allows profiling of cross-linked sites at individual-nucleotide resolution (iCLIP) [71]. Although the mentioned techniques detect only RNA:protein interactions, intersecting the generated data with predicted targets allows narrowing down the number of putative microRNA:target interactions. Transfection of biotinylated microRNA mimics allows pull-down of mRNA targets regulated by the respective microRNA with streptavidin-coated beads [72]. Sequencing the precipitated mRNA targets and overlapping them with target predictions is another approach to obtain valid target candidates. One method that does not depend on intersecting the data with predicted targets to unravel putative microRNA:target interactions is crosslinking, ligation and sequencing of hybrids (CLASH). CLASH directly identifies the respective microRNA that represses the mRNA target [68]. While the
of Ago-RNA complexes. The modification is required to ligate microRNAs and their mRNA targets for subsequent sequencing [73]. A major benefit of CLASH is the detection of targets harboring canonical as well as non-canonical binding sites [74].
2.2.3 isomiRs - microRNA sequence variants
Currently, 2675 mature canonical microRNAs are annotated in miRBase [75]. In recent years, however, microRNA sequencing revealed that one locus gives rise to multiple sequence variants, which are termed isomiRs [76,77]. The variation in microRNA sequence length can partially be explained by alternative Drosha- or Dicer-mediated cleavage during microRNA biogenesis [78–81] (Figure 2). Alternative processing by Drosha or Dicer is a templated process since the isomiR sequence still matches the pre-microRNA sequence from which the mature isoform is derived [82]. Moreover, microRNA sequence variants can result from 3’trimming by 3'-to-5' exoribonucleases [83,84]. A non-templated mechanism that generates isomiRs is the post-transcriptional addition of nucleotides at the 5’ or 3’ terminus of the microRNA by nucleotidyl transferases [85]. The microRNA sequence variants generated by alternative Drosha/Dicer cleavage, 3’trimming or post-transcriptional 5’/3’ nucleotide addition differ in their length and/or sequence at the 5’ or 3’end (Figure 4) and are termed 5’isomiRs or 3’isomiRs, respectively [82]. Polymorphic isomiRs form the third class of isomiRs and differ in their sequence composition by harboring single nucleotide mismatches compared to the canonical microRNA [82]. Literature suggests that polymorphic isomiRs are a result of RNA editing [86].
The expression of isomiRs varies across cell and tissue types or developmental stages and reveals race- and gender-specific patterns76,87–90. Moreover, the abundance of isomiRs is regulated in a
dynamic manner and can be modulated by various biological stimuli and different conditions, for instance, hypoxia, ischemia or interferon β stimulation91–93. While 3’variants affect microRNA
stability as well as stability of the microRNA:target duplex by additional pairing with the mRNA target at the 3’end59,94, 5’isomiRs are of greater functional relevance since their seed sequence
differs from the canonical form87. A shift in the seed sequence of 5’isomiRs can drastically alter
the mRNA target spectrum95, which adds another layer of complexity to gene expression regulated
by the miRNome. However, many 5’isomiRs still share a large subset of their targets with the respective canonical microRNA and thereby might have synergistic functions96.
Figure 4: 5’isomiRs and 3’isomiRs - microRNA sequence variants. 5’isomiRs and 3’isomiRs differ in their
length and/or sequence at the respective end from the canonical microRNA. While 3’variants affect the stability of the microRNA and the microRNA:target duplex, 5’isomiRs are shifted in their seed sequence and, thus, are of functional relevance. 5’isomiRs are discriminated based on their target spectrum: looking at the longer sequence variant when comparing two microRNAs, a U at nucleotide 2 indicates a convergent seed sequence. A, C or G at position 2 of the longer microRNA identifies a 5’isomiR with a divergent seed sequence. Divergent seed sequences require more extensive seed pairing for both microRNAs to share the binding site within a target and, thus, are more likely to have discrete target spectra. In this project, all microRNA sequences with the same 5’ends were summed up and considered as one 5’isomiR. The nomenclature that was used in this thesis refers to the 5’end of the sequence only and indicates by how many bases the 5’end is shifted compared to the canonical microRNA. I1 indicates that the 5’isomiR is shifted by one nucleotide in 3’direction, whereas I-1 would indicate a shift towards the 5’end.
5’isomiRs are divided into two categories: 5’isomiRs with a convergent seed sequence and 5’isomiRs with a divergent seed sequence. When comparing two 5’isomiRs, the second nucleotide of the longer sequence variant decides whether both 5’isomiRs share a majority of their targets or whether they rather regulate distinct target subsets. The nucleotide U at position 2 of the microRNA sequence indicates a convergent seed sequence, while A, C or G at position 2 identifies a 5’isomiR with a divergent seed sequence96 (Figure 4). A divergent seed sequence of the longer
5’isomiR variant requires more nucleotides of the target 3’UTR to be complementary to the seed sequence of the microRNA. As a result, 5’isomiRs with a divergent seed sequence are more likely to have less overlapping target spectra.
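The classification rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the seed is taken here as nucleotides 2-8 of the microRNA, the function names are hypothetical, and the example sequences are the miR-1307-3p I0 and I1 mimic sequences listed in Table 4.

```python
def seed(mirna: str) -> str:
    """Return the seed, assumed here to span nucleotides 2-8 (1-based)."""
    return mirna[1:8]


def seed_type(longer_variant: str) -> str:
    """Classify a pair of 5'isomiRs by nucleotide 2 of the longer variant.

    U at position 2 -> convergent seed; A, C or G -> divergent seed.
    """
    nt2 = longer_variant.upper().replace("T", "U")[1]
    return "convergent" if nt2 == "U" else "divergent"


# Sequences taken from Table 4 (5'-3' direction)
MIR_1307_3P_I0 = "ACUCGGCGUGGCGUCGGUCGUG"  # canonical, longer variant
MIR_1307_3P_I1 = "CUCGGCGUGGCGUCGGUCGUG"   # 5'end shifted by one nucleotide

if __name__ == "__main__":
    print(seed(MIR_1307_3P_I0))       # CUCGGCG
    print(seed(MIR_1307_3P_I1))       # UCGGCGU
    print(seed_type(MIR_1307_3P_I0))  # divergent (C at position 2)
```

Under these assumptions, the two variants carry different seeds, and the C at position 2 of the longer variant places the pair in the divergent category.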
To annotate isomiRs, Loher et al. proposed a system that uses the 5’end and 3’end of the canonical microRNA annotated in miRBase as reference89. A shift of the 5’end or 3’end in 3’direction by one
nucleotide is annotated as I+1, whereas I-1 indicates a shift by one nucleotide in 5’direction. In this study, microRNA sequences with the same 5’ends were summed up and considered as one 5’isomiR disregarding the 3’ends (Figure 4). To simplify the nomenclature, a shift by one nucleotide in 3’direction is referred to as I1 instead of I+1.
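As an illustration of this nomenclature, the following Python sketch assigns the 5’end label to templated sequence variants and sums read counts that share the same 5’end. It is not the pipeline used in this thesis: the function names, the comparison window of 12 nucleotides, the maximum shift and all read counts are hypothetical assumptions; only the canonical miR-1307-3p sequence is taken from Table 4.

```python
from collections import Counter


def five_prime_label(canonical: str, variant: str,
                     max_shift: int = 3, window: int = 12) -> str:
    """Return I0, I1, I2, ... (shift in 3'direction) or I-1, I-2, ...
    (shift in 5'direction) for a templated variant of `canonical`."""
    for k in range(max_shift + 1):
        # Variant starts k nucleotides downstream of the canonical 5'end
        if canonical[k:k + window] == variant[:window]:
            return f"I{k}"
        # Variant carries k additional nucleotides upstream of the 5'end
        if variant[k:k + window] == canonical[:window]:
            return f"I-{k}"
    raise ValueError("no 5'end shift found within max_shift")


def collapse_by_five_prime_end(canonical: str, read_counts: dict) -> dict:
    """Sum read counts of all sequences sharing the same 5'end."""
    totals = Counter()
    for sequence, count in read_counts.items():
        totals[five_prime_label(canonical, sequence)] += count
    return dict(totals)


CANONICAL = "ACUCGGCGUGGCGUCGGUCGUG"  # miR-1307-3p I0 (Table 4)

# Hypothetical read counts, for illustration only
reads = {
    CANONICAL: 100,
    CANONICAL[:-1]: 50,    # same 5'end, 3'end trimmed by one nucleotide
    CANONICAL[1:]: 30,     # 5'end shifted by one nucleotide in 3'direction
    CANONICAL[1:-2]: 20,   # same shifted 5'end, shorter 3'end
}

if __name__ == "__main__":
    print(collapse_by_five_prime_end(CANONICAL, reads))  # {'I0': 150, 'I1': 50}
```

Because only a 5’-terminal window is compared, variants that differ solely at their 3’end fall into the same 5’isomiR group, mirroring the summing strategy described above.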
2.2.4 Quantification of isomiR expression
Commercial qRT-PCR assays do not allow isomiR-specific quantification of microRNA expression levels97, and neither do microarrays. While microRNA sequencing can reveal the isomiR
expression landscape of the entire cell, the availability of custom-made, isomiR-specific detection assays for individual isomiRs of interest is limited. For detection of individual isomiRs and reliable discrimination from other isoforms, the specificity of adapters and probes is the limiting factor. One method for isomiR-specific detection is Dumbbell-PCR, which employs adapters that are ligated to the 5’end or the 3’end of the isomiR of interest by T4 RNA ligase 298,99.
The high specificity of the T4 RNA ligase 2 and subsequent qRT-PCR with TaqMan probes targeting the ligation product allow isomiR expression levels to be determined.
Another method uses DNA probes as a detection switch for isomiRs. The DNA probe contains an RNA hybridization module for the isomiR, a switching module and a restriction site for the nicking endonuclease Nt.BstNBI100. The DNA switch is activated upon binding of the respective isomiR to
the DNA probe and reverse transcription of the assembled molecule. The DNA probe forms a hairpin only upon incorporation of nucleotides complementary to the 5’end of the bound isomiR. Subsequently, the signal is amplified by multiple cycles of DNA nicking and polymerization in a non-linear reaction.
A two-tailed qRT-PCR approach aims at isomiR detection by using two probes incorporated into a primer for reverse transcription101. One probe is located at the 5’end of the primer, the other
probe is located at the 3’terminus. Binding of both probes to the target isomiR forms a stable complex for reverse transcription. In a second step, qRT-PCR primers bind to the target for amplification and SYBRGreen-based detection. Overall, assays for quantifying isomiR-specific
expression are still under development and there is no straightforward solution yet. Moreover, optimization might be required for each individual isomiR in order to avoid unspecific detection of other isoforms or similar microRNA sequences.
2.2.5 microRNAs in triple-negative breast cancer
microRNAs have been implicated in a wide range of cancer-associated phenotypes and signaling pathways involved in tumorigenesis and therapy response. Features modulated by microRNAs include proliferation and cell cycle, motility and metastasis, apoptosis, autophagy, metabolism, stemness and resistance102–106. Cancer-associated phenotypes can be driven by various pathways
in different cancer entities and subtypes. Exploiting pathway dependencies of cancer subtypes for targeted therapy is a common strategy and of high interest for entities with limited therapeutic options, such as TNBC.
The Notch and Wnt pathways, for instance, have been identified as highly relevant in TNBC, especially for the stem cell fraction of TNBC107,108. This renders Notch and Wnt signaling
particularly relevant considering the association of stem cells and chemoresistance and the fact that only 60-70 % of the patients respond to chemotherapy109. As for the majority of signaling
pathways, microRNAs strongly affect these pathways: miR-124-3p was shown to promote TNBC cell growth via Wnt signaling110, whereas miR-125b increased proliferation as well as motility via
the Wnt pathway111. miR-6838-5p, on the other hand, repressed Wnt signaling and, thus, motility
in TNBC112. miR-105/93-3p affected TNBC stemness and chemoresistance by promoting Wnt
signaling113. So far, only two microRNAs were associated with Notch signaling in TNBC:
miR-106b-25 enhanced tumorigenesis via Notch signaling114, whereas miR-34a targeted the Notch pathway
to mediate tumor-suppressive effects115.
Owing to the heterogeneity of TNBC and its subtypes, a large variety of signaling pathways contributes to tumorigenesis, disease progression and patient survival. This provides many possibilities for targeted therapies; however, there is still a long way to go considering the complexity that microRNAs and isomiRs add to this context.
2.2.6 Clinical application of microRNAs and isomiRs
Since especially 5’isomiRs are of functional importance, discrimination between isomiRs is highly relevant in the context of biomarker discovery and clinical research. While microRNA-based
therapeutic approaches are still in the early phases of clinical approval, profiling of isomiRs opens extensive new possibilities for biomarker research that are exploited already116,117. 3’isomiRs of
miR-574-3p and miR-205-5p, for instance, were identified as diagnostic biomarkers for esophageal squamous cell carcinoma118. On a larger scale, isomiRs were employed to distinguish between 32
cancer entities using 11,000 samples from the TCGA patient data117 and to differentiate between
breast cancer subtypes119 or discriminate breast cancer tissue from normal tissue76. Even in
patient serum or extracellular vesicles from cell culture supernatant, deregulation of specific isomiRs allowed the detection of breast cancer120.
microRNA-based therapy works in two different ways: either by microRNA replacement therapy using microRNA mimics / precursors or by suppressing microRNAs with inhibitors or sponges121.
microRNA replacement is a suitable approach for tumor-suppressive microRNAs that are downregulated in the tumor, for instance. microRNA inhibition, on the other hand, aims at quenching oncogenic microRNAs that are highly abundant in the tumor. To date, no FDA-approved microRNA-based therapies are available, although some microRNA candidates are in clinical development or phase 1 / phase 2 trials. miR-16, for instance, completed a clinical phase 1 trial as second-line or third-line treatment for patients with lung cancer or recurrent thoracic cancer122. The clinical trial intravenously administered drug delivery vehicles that contained
miR-16 mimics and were tailored to the EGF receptor with an antibody123. Moreover, several phase 2
trials subcutaneously injected an antisense oligonucleotide targeting miR-122 for the treatment of hepatitis C virus infection and were successfully completed122,124,125.
Overall, there are a lot of promising microRNAs that could be exploited for therapeutic approaches. The major problem with their use as a drug, however, is their short half-life that is determined by the presence of nucleases and the necessary delivery to the cell via a carrier that allows passing membranes121. Two different classes of carriers are employed for microRNA
delivery: viral vectors and non-viral carriers121. Carrier-free approaches are under development as
well, for instance, the coupling of microRNAs to folate which allows the uptake into cancer cells that overexpress the folate receptor126. This method was refined recently to enhance endosomal
escape of the microRNAs that were successfully delivered to the cell127. The microRNAs were
coupled to nigericin in addition to folate, which promoted the swelling and bursting of endosomes and released the microRNA into the cytosol. However, folate receptor-mediated uptake of microRNAs strongly tailors this therapy to very specific subsets of cells or cancer types and cannot be applied as a general therapy concept.
microRNA-based therapies provide a lot of potential, especially for cancer entities and subtypes that lack targeted therapy options, such as TNBC. miR-708, for instance, was coupled to nanoparticles and administered to mice and reduced lung metastasis derived from the TNBC cell line MDA-MB-231128. Another study delivered nanoparticles tailored to the CD133 receptor and
coated with a miR-21 inhibitor to TNBC cells and breast cancer stem cells129. Despite some success
stories, microRNA-based therapies still have a long way to go until FDA approval and general use in the clinic. Moreover, 5’isomiRs are currently not exploited for clinical studies, which might result from the fact that a multitude of studies do not discriminate between isoforms when profiling the miRNome in tumor tissue or patient serum.
2.3 Aim of the project
For the highly aggressive breast cancer subtype TNBC, there are currently no targeted therapies available and patients treated with chemotherapy frequently develop resistance towards the treatment. Thus, unraveling the molecular mechanism of chemoresistance and identifying targets for therapeutic approaches is of high interest. Since microRNAs modulate gene expression, which affects the majority of signaling pathways, they play a crucial role in tumorigenesis and influence the cellular response towards chemotherapy. In the past years, our knowledge of the miRNome gained more complexity with the discovery of 5’isomiRs, microRNA sequence variants with a shifted seed sequence that can affect the target spectrum drastically. The majority of studies, however, do not take the functional divergence of microRNAs and their 5’isomiRs into account. Therefore, this study aimed at identifying microRNAs with a key role in tumorigenesis and chemoresistance and focused on characterizing the functional differences of particular microRNAs and their 5’isomiRs in this context.
To achieve this aim, this study focused on:
1) Identifying microRNAs and 5’isomiRs of high relevance for breast cancer tumorigenesis and with a potential impact on response to chemotherapy.
2) Establishing an experimental system to identify microRNAs and 5’isomiRs with a major role in chemoresistance.
3) Characterizing the effect of selected microRNAs and their 5’isomiRs on chemoresistance and other cancer-associated phenotypes.
4) Analyzing the functional and mechanistic differences between selected microRNAs and their 5’isomiRs with a focus on their direct targets.
3. MATERIAL AND METHODS
3.1 Material
3.1.1 Laboratory equipment
Bacterial incubator (37°C) Memmert
Bacterial shaking incubator (37°C) HT INFOS Minitron
Balance Kern
CASY cell counter Roche Innovatis
Cell culture hood HERA Safe Thermo Fisher Scientific
Cell culture incubator (37°C) Heraeus
Centrifuges Eppendorf, Heraeus
DNA gel apparatus Bio-Rad
Electrophoresis power supply Pharmacia
Freezer (-20°C) Liebherr
Freezer (-80°C) Eppendorf
Fridge (4°C) Liebherr
Gel documentation system Herolab
Glomax Microplate Reader Promega
Light microscope Hund
Micropipettes Gilson
Microwave Panasonic
ImageXpress Micro Confocal Microscope Molecular Devices
ImageXpress Micro XLS Widefield Microscope Molecular Devices
Multichannel pipette Eppendorf
Multipette plus Eppendorf
NanoDrop nd 1000 spectrophotometer Thermo Fisher Scientific
nCounter FLEX Analysis System NanoString
Pipetboy Integra Biosciences
Qubit Fluorometer Thermo Fisher Scientific
Real-time PCR Thermocycler Applied Biosystems
Rocker Platform NeoLab
Thermomixer Eppendorf
Tube Rotator VWR
Vacuum Aspirator Integra Biosciences
Vortex Mixer NeoLab
xCELLigence Real-time Cell Analyzer Roche
3.1.2 Consumables
10 cm Ø petri dish TPP
24-well transwell plates (5.0 μm, 8.0 μm) Corning
6-well plate, flat bottom, transparent Greiner Bio-One
96-well deep well plate, 2.2 mL Fischer Scientific
96-well plate, flat bottom, black Greiner Bio-One
96-well plate, flat bottom, transparent Greiner Bio-One
96-well plate, flat bottom, white Greiner Bio-One
Adhesive optically clear plate seals Thermo Fisher Scientific
Cell culture flasks (25 cm2, T75 cm2) TPP
Cell culture flasks (175 cm2) Greiner Bio-One
Cell scraper Corning
CIM Plate 16 OLS OMNI Life Science
Combitips advanced (1 mL, 5 mL, 10 mL) Eppendorf
Conical tubes (15 mL, 50 mL) Greiner Bio-One
Costar Ultra-low attachment plates (24-well, 6-well) Corning
Cryovials (1.8 mL) Nunc
E Plate 16 (PET) OLS OMNI Life Science
Filter tips (10 μL, 20 μL, 200 μL, 1000 μL) Neptune Scientific
Inoculation loops (10 μL) Copan
Matrigel invasion chamber (8.0 µm) Corning
MicroAmp optical 384-well reaction plate Applied Biosystems
Microcentrifuge tube (1.5 mL, 2.0 mL) Eppendorf
PCR strips Steinbrenner
Pasteur capillary pipettes (230 mm) Waltham
Serological Pipettes (2.5 mL, 5 mL, 10 mL, 25 mL) Corning
Sterile syringes Sigma
3.1.3 Chemicals and reagents
Agar Sigma
Agarose Roth
Ampicillin Sigma
B27 supplement (1x) Gibco
Bacto Trypton Difco
CASYton Roche Innovatis
Chloroform Sigma
Complete Mini Protease Inhibitor Cocktail Roche
DMEM Gibco
DMEM/F-12 Gibco
DMSO Sigma
Doxycycline Takara
D-PBS Gibco
EDTA Sigma
EGF, recombinant Corning
Epirubicin Biomol
Ethanol Sigma
Ethidium bromide Sigma
Fetal Bovine Serum (FBS) Gibco
FGF, recombinant human basic R&D
Geneticin Sigma
Glycerol Roth
Glycine Gerbu
Ham's F-12 Nutrient Mix Gibco
Heparin sodium salt Sigma
HEPES buffer solution Gibco
Hoechst 33342 Thermo Fisher Scientific
Insulin Sigma
Isopropanol Sigma
Methylcellulose Sigma
NEAA (non-essential amino acids, 100x) Gibco
Nuclease-free water Ambion
OptiMEM Gibco
Paclitaxel Biomol
PeqGold ladder (1 kb) Thermo Fisher Scientific
PhosSTOP Roche
Polybrene Merck Millipore
Poly-L-lysine Sigma
primaQUANT qPCR Probe Master Mix Steinbrenner
Propidium iodide Sigma
Puromycin Gibco
Restriction enzymes and buffer New England Biolabs
RNase Qiagen
RPMI 1640 Gibco
SOC medium Invitrogen
Tris HCl Sigma
Tris base Sigma
Trypsin-EDTA (0.05 %, 0.25 %) Sigma
Tween 20 Sigma
Yeast extract Gerbu
3.1.4 Commercial kits
DNeasy Blood & Tissue Kit Qiagen
miRNeasy Kit Qiagen
miScript precursor assays Qiagen
miScript primer assays Qiagen
miScript RT Kit Qiagen
miScript SYBR Green PCR Kit Qiagen
NEBuilder HiFi DNA Assembly Cloning Kit New England Biolabs
NucleoBond Xtra Midi Kit Macherey-Nagel
QIAprep Spin Miniprep Kit Qiagen
RNase-free DNase Set Qiagen
RNeasy Mini Kit Qiagen
Titanium Taq DNA polymerase CLONTECH
Universal Probe Library Roche
Wizard SV Clean Up System Promega
XT Elements Master Kit NanoString
XT Elements TagSet-84 NanoString
3.1.5 Solutions and buffers
LB Medium
    10 g Bacto Trypton
    5 g yeast extract
    10 g NaCl
    dissolve in ddH2O up to 1 L, autoclave
LB-Agar
    15 g in 1 L of LB medium, autoclave
50x TAE (Tris-acetate-EDTA)
    242 g Tris base
    57.1 mL acetic acid
    100 mL 0.5 M EDTA (pH 8.0)
    ad 1 L ddH2O
TE-Tween
    10 mM Tris pH 7.5
    1 mM EDTA
    0.1 % Tween 20
DNA precipitation buffer
    9 mL ethanol (absolute)
    300 µL 3 M NaAc (pH 5.2)
    1800 µL ddH2O
3.1.6 Cell lines and growth medium
The parental cell lines used in this thesis are listed in Table 1. All parental cell lines were authenticated before and at the end of the study (Multiplexion, Heidelberg). Parental cell lines and stable cell lines derived from the parental cell lines were cultivated in the growth medium specified below and tested for potential mycoplasma contamination on a regular basis.
Table 1: Cell lines and growth medium used in this thesis.
cell line | obtained from | derived from | full growth medium
HEK293-FT | ATCC (PTA-5077) | embryonic kidney cells, human, transformed with SV40 large T-antigen | DMEM, 10 % FBS, 1 % NEAA, 1 % Geneticin
MDA-MB-231 | ATCC (HTB-26) | breast adenocarcinoma (metastasis), human | RPMI-1640, 10 % FBS
HCC1806 | ATCC (CRL-2335) | breast squamous cell carcinoma, human | RPMI-1640, 10 % FBS
SUM-159 | Roberto Würth (A010, DKFZ) | pleomorphic breast carcinoma, human | Ham's F-12, 5 % FBS, 10 mM HEPES, 1 µg/mL Hydrocortisone, 5 µg/mL Insulin
3.1.7 Bacterial strains
Competent MACH1 (E.coli) Thermo Fisher Scientific
3.1.8 Mouse lines
NSG mice the mice were bred at the DKFZ mouse facility
3.1.9 Primers and oligos
All primers and oligos used for cloning, sequencing, TaqMan or as NanoString probes were purchased from Sigma-Aldrich. All sequences are given in 5' - 3' direction.
Sequencing primers
The miRseq5 primer (tgtttgaatgaggcttcagtac) published by Fellmann et al130 was used for
TaqMan primers
TaqMan primers were designed using the online tool 'Assay Design Center' for the 'Universal Probe Library' (UPL) of Roche. The primers and the respective UPL probes can be obtained from Table 2.
Table 2: Primers used for TaqMan assays.
target fw primer rev primer probe
House-keeping genes
ACTB ccaaccgcgagaagatga ccagaggcgtacagggatag 64
GAPDH gcccaatacgaccaaatcc agccacatcgctcagacac 60
HPRT1 tgaccttgatttattttgcatacc cgagcaagacgttcagtcct 73
Drug efflux pumps and detoxification enzymes
ABCC1 aatgcgccaagactaggaag ttctgtggggacttgacga 10
ABCC2 cttttcctggatcacctcca ccatcatcaaggctgaaaaga 1
CAT ctccggaacaacagccttc atagaatgcccgcacctg 1
GPX1 caaccagtttgggcatcag gttcacctcgcacttctcg 77
SOD2 aatcaggatccactgcaagg taagcgtgctcccacacat 3
Breast cancer stem cell markers
ALDH1A1 ccaaagacattgataaagccataa cacgccatagcaattcacc 82
CD24 atgggcagagcaatggtg ccagttgttgtttcactggaat 23
CD44 gacaccatggacaagttttgg cggcaggttatattcaaatcg 13
ITGA6 tggcctcttcatttggctat aaaatactgtggggctccaat 77
ITGB3 catccacgaccgaaaagaa tgaaggtagacgtggcctct 76
PROM1 ggaaactaagaagtatgggagaaca cgatgccactttctcactgat 86
NANOG tctccaacatcctgaacctca ttgctattcttcggccagtt 87
EMT marker
CDH1 cccgggacaacgtttattac gctggctcaagtcaaagtcc 35
FN1 gggagaataagctgtaccatcg tccattaccaagacacacacact 25
CDH2 ggtggaggagaagaagaccag ggcatcaggctccacagt 66
VIM gaccagctaaccaacgacaaa gaagcatctcctcctgcaat 39
SNAI1 tacagcgagctgcaggact atctccggaggtgggatg 11
SNAI2 tggttgcttcaaggacacat gcaaatgctctgttgcagtg 7
Others
ATP5MD ctccagctgtgaaagcaaca ttatcacatgatgagttggcatt 80
PDCD11 gagagggcccttaagacca cacccacacgttcagcttc 68
Primers for pre-amplification of the NanoString samples
The following primer pair was used to pre-amplify the pre-microRNA barcodes that were integrated into the genomic DNA. The primers bound to a sequence within the retroviral backbone RT3GEPIR130, which was used to generate the parental cell lines with the pre-microRNA library.
fw aacgagaagcgcgatcacatggt
rev gggaacttcctgactaggggagga
Oligos for cloning of the pre-microRNAs and probes for subsequent detection by NanoString
Each pre-microRNA was designed with two or three partially complementary oligos that covered the pre-microRNA sequence obtained from miRBase. After annealing, the pre-microRNA construct had sticky ends that matched with the EcoRI- and XhoI-digested vector RT3GEPIR. The probe oligos that were used to detect the retrovirally integrated pre-microRNAs are listed in Table 3. The probes were designed by NanoString.
Table 3: NanoString probes used for pre-microRNA detection.
target probe sequence
pre-let-7c A GCTCCAAGGAAAGCTAGAAGGTTGTACAGTTAACTCCCAGGGTGTAACTTCCTTCCTGTGTTCCAGCTACAAACTTAGAAAC
pre-miR-100 A CCTAACAGACACATACCTATAGATACAAGCTTGTGCGGACTAATACCACACATAAAATTGGTTTTGCCTTTCAGCAATTCAACTT
pre-miR-103a-1 A CAATGCCTTCATAGCCCTGTACAATGCTGCTTGATCCATATGCAACAACTGGTCAAGACTTGCATGAGGACCCGCAAATTCCT
pre-miR-103a-2 A TGGTTCTTTCATAGCCCTGTACAATGCTGCTTGACCTGAATGCTACCTTTCGTTGGGACGCTTGAAGCGCAAGTAGAAAAC
pre-miR-106b A CCTGCTGGAGCAGCAAGTACCCACAGTGCGGTAGCCCAGCAGACCTGCAATATCAAAGTTATAAGCGCGT
pre-miR-10a A AGAGCGGAGTGTTTATGTCAACTACATATTCCCCTAGATACGAATTTGTGCCTGCCAATGCACTCGATCTTGTCATTTTTTTGCG
pre-miR-10b A TGAAGTTTTTGCATCGACCATATATTCCCCTAGAATCGAATCTGTGACTACAAACTGGAGAGAGAAGTGAAGACGATTTAACCCA
pre-miR-125b-1 A AGCACGACTCGCAGCTCCCAAGAGCCTAACCCGTGCGATTGCTGCATTCCGCTCAACGCTTGAGGAAGTA
pre-miR-125b-2 A TCCCCTCCGCCTAGGTCCCAAGAGCCTGACTTGTGCTGAGGCTGTTAAAGCTGTAGCAACTCTTCCACGA
pre-miR-126 A TGCCGTGGACGGCGCATTATTACTCACGGTACGAGCTAGGACGCAAATCACTTGAAGAAGTGAAAGCGAG
pre-miR-127 A GATGATGAGACTTCCGACCAGCCAAGCTCAGACGGATCCACGCGATGACGTTCGTCAAGAGTCGCATAATCT
pre-miR-1307 A TGCATGACCGCCTATCTACCACGACCGACGCCACGCATTTGGAATGATGTGTACTGGGAATAAGACGACG
pre-miR-130b A GACCTGACCGATGCCCTTTCATCATTGCACTGCTTCCACAAGAATCCCTGCTAGCTGAAGGAGGGTCAAAC
pre-miR-139 A GTTACTCCAACAGGGCCGCGTCTCCAGCCTCCGAGCTTGACGTAGATTGCTATCAGGTTACGATGACTGC
pre-miR-140 A GGTGCCCCGGTATCCTGTCCGTGGTTCTACCCTGTCTTACAGATCGTGTGCTCATGACTTCCACAGACGT
pre-miR-141 A GAACCCACCCGGGAGCCATCTTTACCAGACAGTGTCTTGGAGGAGTTGATAGTGGTAAAACAACATTAGC
pre-miR-142 A CACAGTACACTCATCCATAAAGTAGGAAACACTACACCCTCCAGTGCTGTCCTACGTATATATCCAAGTGGTTATGTCCGACGGC
pre-miR-143 A GCTGCAGAACAACTTCTCTCTTCCTGAGCTACAGTGCTTCATCTCCAGCAAGAAGGAGTATGGAACTTATAGCAAGAGAG
pre-miR-144 A GGCGGTGCCCGGACTAGTACATCATCTATACTGTAGTGTCTCATCCACCCCTCCAAACGCATTCTTATTGGCAAATGGAA
pre-miR-145 A AACCATGACCTCAAGAACAGTATTTCCAGGAATCCCCATCTTAGCATCTACCCGAAGCAATACTGTCGTCACTCTGTATGTCCGT
pre-miR-148b A TAGAAAGCTTTCGAGACAAAGTTCTGTGATGCACTGACTTTCAGAGAGCCCCGGGAATCGGCATTTCGCATTCTTAGGATCTAAA
pre-miR-151a A GAGGTGAGTATGACCATCCCTGTCCTCAAGGAGCTTCAGTCTAGTACCGATCTTCATAACGGACAAACTGAACGGGCCATT
pre-miR-155 A CTGTTAATGCTAATATGTAGGAGTCAGTTGGAGGCAAAAACCCCTATCACCGCTATGCAGACGAGCTGGCAGAGGAGAGAAATCA
pre-miR-16-1 A GTCAACCTTACTTCAGCAGCACAGTTAATACTGGAGATAATTTTAGAATCCATTCGCAACCATGTGAAGTAATGTGAGCGTACTT
pre-miR-16-2 A GTCACACTAAAGCAGCACAGTAATATTGGTGTTTAATATATATTTCACTACACCAGTTAGCGTGGCGTATACCATGTTGTTAACA
pre-miR-17 A GTCACCATAATGCTACAAGTGCCTTCACTGCAGTAGATGCACATATCACTCCTGAATCAATAGAACAATATCAGTTATGGCGGTG
pre-miR-182 A GTGCCGGCTGAGTCCTCGCCCCATAGTTGGCAAGTCGGTTGTTAATATGACAGGCCGCTAAAGACGTTCT
pre-miR-183 A TCGTGGATCTGTCTCTGCTCTGTTTATGGCCCTTCGGTAATTCCCGTCTCAGATGAGTGGGTTAATCAATCAAGTATG
pre-miR-190b A CTGCTGCTGTAAGAATATGTTTGACATTTAGTTGGTTCCTAATTAAACAACTGACACATTAGTAACGTCGGCAAGCACTTAGTCG
pre-miR-191 A AGGCAGGAGAGCAGGGGACGAAATCCAAGCGCAGCCGTGAACCAGATTATGTATGGACGCGCAATAGATA
pre-miR-192 A GCTGGCATTGAGGCGAACATACCTGTGACCTATGGAATTGCATACGAAATTTGAGCAAGCAATTGAAGGCTTAGA
pre-miR-200a A GCGGGTCACCTTTGAACATCGTTACCAGACAGTGTTAGAGTCAAGCTATCAGCTAATAGGGTCGGCTCAACAGTGTATCC
pre-miR-200c A CCTCCATCATTACCCGGCAGTATTAGAGACTCCCAACCGCTATCAATTCGTGACCCCGATCATCCAGTCCAGAA
pre-miR-203a A TCGCTGTCGCCGCGCCCGCCGGGTCTAGTGGTCCTCTTGAGCTCTAGGCCCAAAACGACCTTAATGGTCA
pre-miR-204 A GCCAGTGATGACAATTGAACGTCCCTTTGCCTTCCCACTAGCCCAGATCCTACGAGATGAGCTACGTAACTA
pre-miR-21 A TGTCAGACAGCCCATCGACTGGTGTTGCCATGAGATTCAAATGCACTCTATATGGAGGGAGAGTAGCTGGAT
pre-miR-210 A GGGTCGCGCTGCCCAGGCACAGATCAGCCGCTGTCCCTGGTCTAGGTATCTAATTCGTGGGTCGGGTACT
pre-miR-22 A GGCAGAGGGCAACAGTTCTTCAACTGGCAGCTTTAGCATTAGCTCGGATGCTATCAGCTTGCGCCTATTAT
pre-miR-29a A ATAACCGATTTCAGATGGTGCTAGAAAATTATATTGACTCTGAACACCAACACGATCTGTATTTTGCACCTTTCGCTATGCTGAG
pre-miR-29b-1 A CCCCCAAGAACACTGATTTCAAATGGTGCTAGACAATCACTATTTAAATCCTGTGTCCGTCTATACGCATACTGGTCCACATATA
pre-miR-29b-2 A CTCCTAAAACACTGATTTCAAATGGTGCTAGATACAAAGATGGAAAAATCCATGTTGGAGTTAACGGAGACCCGCCATCGTTTAC
pre-miR-3065 A CTGTCCTCTCCAACAATATCCTGGTGCTGAGTGATGACTCAGGCGCTCATTTTGAACATACGATTGCGATTACGGAAA
pre-miR-326 A TGAATCCGCCTCGGGGCTGGAGGAAGGGCCCAGAGCCTATGCATCATGTGCCTCACTAGGACATCATGCT
pre-miR-337 A TTGAAGGGGATGAAGAAAGGCATCATATAGGAGCTGGATAACTGTGCATCCCTAAATTGGGAAAAAAGGTTTTAGCTATTGATGG
pre-miR-342 A TAAGTAGGCCAAGGTGACGGGTGCGATTTCTGTGTGAGCTTCAGTTAAAGGCTATCTTGCTCCGCTCGTTCTC
pre-miR-365a A TGCAAGAGCAATAAGGATTTTTAGGGGCATTATGATAGTGGAATGGAAACCTTAAAGCTATCCACGAATGTCAAAAATGTGGTTT
pre-miR-375 A GCCTCACGCGAGCCGAACGAACAAAACGCTCAGGTCCCGAATGTATAATGCTGACGTTCTTGCTTTTGGC
pre-miR-378a A AGGCCTTCTGACTCCAAGTCCAGTGCTATTTCTAGGTAACACACAGCCTATTGAAGCAATCCTCTCCCCAATACTTAAAAA
pre-miR-379 A AGAGTTAGTGGACCATGTTACATAGGTCAGAAATCATAACGCCTACGTTCCTACGGTTACCGTCTTTATAAGTGAACAAAACCGG
pre-miR-381 A TACTCACAGAGAGCTTGCCCTTGTATATTCCATGTCAATAAACCGAATATCTCTGTGAACTGTCATCGGTCCGATCAATTAGTCT
pre-miR-425 A GAAAGAGCACTGGGCGGACACGACATTCCCGATGGCTCCCCTTTCCCAAGTAAATGTACGGGAATTATCG
pre-miR-451a A TCTGGGTATAGCAAGAGAACCATTACCATTACTAAACTCAGTAATGGTAACGCTTTATTATGTGTTCGTCTAACTCTGTTTCTGT
pre-miR-452 A GCAAAGCACTTACTTCTTTGCAGATGAGACTGAGACATAGTTACAAAGTCCCGAGTGCATGAGCTGTCTTTCACATGATACATCG
pre-miR-455 A GATGACATAGGCCTTGAGGCAAGTGTATATGCCCATGGACTGCATGGTGCCTATTTCTGTTCACGGATGAAGGCCTATATCAATG
pre-miR-486-1 A GTATCCTGTACTGAGCTGCCCCGAGCTGGGCAGCACCATCCACTTTCATGGAAACAATAAGAGCAGGGAA
pre-miR-486-2 A CATCCTGTACTGAGCTGCCCCGAGGCCCTTCATGCCACAAACTCACTACTACCAACAACCTCACCAAAAA
pre-miR-497 A CCTCGGCGGTGCCTCCCCCACCCTCGCTCTAACACCTCATGTCCTCTGTTAATCCAGCCTGAATATGCCA
pre-miR-551b A TTATTCTCACAGCCTCTGAAACCAAGTATGGGTCGCCTTCCCAGAAATGTCACTCCCATGGTGGCTGATATAGAAA
pre-miR-7-1 A CTGTAGAGGCATGGCCTGTGCCATATGGCAGACTGCATGTCGAACCTTGGATAGGAGCGACCGATTACGT
pre-miR-92b A GGGCCGGGCGGGCCGGAGGCCGGGACGAGTGCAATCTCAGGTTGTTACTTGAAGGGTTCAACACGAGCTC
pre-miR-93 A CCGGCGGCTCGGGAAGTGCTAGCTCAGCAGTAGGTCAGAAGATCAAAAAACGATCCCTGTCCATCAATAC
pre-miR-99a A CACACTGACACAGACCCATAGAAGCGAGCTTGTGCCTTAGGCTACCAAATGAATTTAAAGCCAGCTGAAA
pre-miR-99b A GACACGGACCCACAGACACGAGCTTGTGTGCGGCGCCAATGCTTGCAGTATGTATCCTGATCGTGCGTGC
ctrl0_RT3GEPIR A GCATAGGAATTATAATGCTTATCTATACATCTGTGGCTTCACTATAGATACCTGCATTCTCATGGAAATGCAATGGATTCATTCC
ctrl1_cel-pre-miR-67 A AAGTTTTAAAATCGATCTACTCTTTCTAGGAGGTTGTGATGCTTAATCTGCCTGTTGCAGTATCACGTAAATACCTACTTCGATA
ctrl2_cel-pre-miR-239b A AGATAAAAGCAACTTGCCATTTTTGCACACCACAAAAGTGCTGAGCCTAGCTGTTATGGCTATTGCTGAAACAGCAAAATT
ctrl3_cel-pre-miR-1022 A AGCCTTGAACAGCTGGATCATCATTGGACTATCATCTTTATATTGCTTCACCTTACGACTTCACTGCAATTGACGATTCAGTTAA
ctrl4_cel-pre-miR-254 A AAAAACTGCATGTTCGCCGCCTACAGTCGCGAAAGATTTGCCTCATACCAATGTAAAGTATAGTTAACGCCCTGT
ctrl5_cel-pre-miR-36 A TCCGCGTCGGGGACCCATGCGAATTTTCACCCGGTCATCTCCATGACTGCTTGAGCGGCTGGAGAATCTG
ctrl6_cel-pre-miR-71 A TTCCAGGTCACGATCCCGACGGCGAAAAACAGAATAGTGATACCTTTCGCCACCCATATAAACCCCACTTCGTCCTCA
ctrl7_cel-pre-miR-800 A ACGGCGGCAGACAATTTCCGAGTTTGGCCACTGATTATAACAAGGCAGAGCAAATGTGACACTGTCTATCAGTAC
ctrl8_cel-pre-miR-90 A TGGCATCCAATTCAAGGGGCATTCAAACAACATATCAACACGCAAAAGTGCCTACATATATAGGAAAAGGGAAGGTAGAAGAGCT
all constructs B CGAAAGCCATGACCTCCGATCACTCTAAACAAGATAATTGCTCGAATTCTAGCCCCTTGAAGTCCGAGGCAGTAG
3.1.8 siRNAs and microRNA mimics
All siRNAs and microRNA mimics as well as the respective non-targeting controls used in this thesis are listed in Table 4. All sequences are given in 5'-3' direction. The siRNA pools obtained from siTools Biotech contained 30 different siRNAs targeting the gene of interest.
Table 4: siRNAs and microRNA mimics used in this study.
mimic sequence company
siAllStars - Qiagen
mimic ctrl2 - Dharmacon
miR-1307-3p I0 ACUCGGCGUGGCGUCGGUCGUG Dharmacon
miR-1307-3p I1 CUCGGCGUGGCGUCGGUCGUG Dharmacon
siRNA sequence company
sictrl1 - siTOOLs Biotech
sictrl2 - siTOOLs Biotech
siMyc GAGAACAGTTGAAACACAA GGACTTGTTGCGGAAACGA siTOOLs Biotech
GCCATAATGTAAACTGCCT GAGGAGCAAAAGCTCATTT
GGTACTATAAACCCTAATT CAGCATACATCCTGTCCGT
GGAAAACGATTCCTTCTAA GAGCTAAAACGGAGCTTTT
GGCGAACACACAACGTCTT CTGAAAGATTTAGCCATAA
CCCTGGTGCTCCATGAGGA CCTAGTATTATAGGTACTA
CTCACAACCTTGGCTGAGT GGGTCAAGTTGGACAGTGT
GCATGATCAAATGCAACCT CTCCTACGTTGCGGTCACA
CCCAAGGTAGTTATCCTTA CCCTACCCTCTCAACGACA
CTGCCTCAAATTGGACTTT GCCACAGCAAACCTCCTCA
GCCACGTCTCCACACATCA CAGATCCCGGAGTTGGAAA
GACTATCCTGCTGCCAAGA CAGAGGAGGAACGAGCTAA
CGGTGCAGCCGTATTTCTA GACATGGTGAACCAGAGTT
CCTATGAACTTGTTTCAAA CGACGAGACCTTCATCAAA
GTCCTGAGCAATCACCTAT CTGCTCTCCTCGACGGAGT
3.1.9 Plasmids
pHIT60 kindly provided by Yuko Soneoka131
pMD2.G Addgene
3.1.10 Databases and software
cBioPortal http://www.cbioportal.org/
Cellosaurus https://web.expasy.org/cellosaurus/
COSMIC https://cancer.sanger.ac.uk/cosmic/
GraphPad Prism 5 http://www.graphpad.com/
MiRanda http://www.microrna.org/microrna/home.do
miRBase http://www.mirbase.org/
Molecular Devices Analysis Software Molecular Devices
Molecular Signature Database Broad Institute
NCBI http://www.ncbi.nlm.nih.gov/
nSolver Software NanoString
QuantStudio Software Thermo Fisher Scientific
Roche UPL Design Center Roche
UCSC Genome Browser https://genome.ucsc.edu/
SDS 2.2 Applied Biosystems
TargetScan http://www.targetscan.org/
3.2 Methods
3.2.1 Cloning of the pre-microRNA library
Preparation of the vector
1-2 µg of the vector RT3GEPIR were digested with EcoRI and XhoI in parallel. For the digest, 5 µL NEB CutSmart Buffer (10x) and 1 µL of each restriction enzyme (20 units) were added to the vector. The reaction volume was adjusted to 50 µL with ddH2O and the reaction was incubated overnight at 37°C. This step was followed by heat inactivation at 65°C for ten minutes. The linearized plasmid was purified with the Wizard SV Clean Up System according to the manufacturer’s instructions.
Preparation of the pre-microRNAs
To generate pre-microRNA constructs, two or three oligos that were partially complementary to each other (15-80 base pairs overlap) were designed. For annealing of individual pre-microRNAs, 5 µL of each oligo (100 µM) were combined in a well of a 96-well plate. 2 µL NEB ligation buffer (10x) were added and the reaction was filled up to 20 µL with ddH2O. The mix was boiled at 95°C for five minutes and allowed to cool down slowly at room temperature.
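The concentrations in the annealing mix follow from simple dilution arithmetic, sketched below in Python. The helper names are hypothetical, and the calculation assumes complete annealing, so that the concentration of the annealed construct equals that of each oligo. The 0.5 µM value is the working concentration stated for the subsequent assembly step.

```python
def final_concentration_um(stock_um: float, volume_ul: float,
                           total_volume_ul: float) -> float:
    """Concentration (µM) after adding volume_ul of stock to total_volume_ul."""
    return stock_um * volume_ul / total_volume_ul


def amount_pmol(concentration_um: float, volume_ul: float) -> float:
    """Amount of substance in pmol (1 µM corresponds to 1 pmol/µL)."""
    return concentration_um * volume_ul


# 5 µL of each 100 µM oligo in a 20 µL annealing reaction
oligo_in_mix_um = final_concentration_um(100, 5, 20)   # 25.0 µM per oligo
oligo_amount_pmol = amount_pmol(100, 5)                # 500 pmol per oligo

# Overall dilution needed to reach 0.5 µM per pre-microRNA
dilution_factor = oligo_in_mix_um / 0.5                # 50-fold

if __name__ == "__main__":
    print(oligo_in_mix_um, oligo_amount_pmol, dilution_factor)
```

Under these assumptions, each annealed construct is present at 25 µM and has to be diluted 50-fold overall, including the dilution that results from pooling, to reach 0.5 µM.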
Ligation via NEBuilder reaction
The linearized vector RT3GEPIR was ligated with the annealed pre-microRNA oligos using the NEBuilder HiFi DNA assembly Master Mix. Since 72 pre-microRNAs were selected for the library, the constructs were cloned in pools. Ten or eleven of the annealed pre-microRNA oligos from the previous step were combined and diluted to a final concentration of 0.5 µM per pre-microRNA. The reaction mix was set up as described below and was incubated at 50°C for one hour.
NEBuilder reaction: 30 ng vector (linearized)
2 µL annealed oligos (0.5 µM, to use 1 pmol per pre-microRNA)
10 µL NEBuilder HiFi DNA assembly Master Mix (2x)