D) Reprogramming vectors disappearance
2.6 Genome Stability analysis
Reprogramming, prolonged culture, the correction process and clonal selection of pluripotent cells can each generate genomic aberrations. The PKD2iPSC line had already been shown to be karyotypically normal (Fig. 17) but, as reported by many authors, numerous genomic abnormalities may arise that are not detectable by G-banding of metaphase chromosomes, which has a resolution of 3-10 Mb. Therefore, we analyzed copy number variations (CNVs) by array-based comparative genomic hybridization (aCGH) and somatic mutations by exome sequencing in the following samples: peripheral blood mononuclear cells of the PKD2 patient (PKD2MNC), the hiPSC line generated from this sample (PKD2iPSC) and one of the corrected clones generated from PKD2iPSC (coPKD2iPSC).
Copy Number Variation (CNV) analysis
For CNV analysis, an array of probes spaced across the whole human genome was used. CNVs were detected as DNA segments that had gained (CN ≥ 3) or lost (CN ≤ 1) copies in comparison with the reference genomic DNA. Loss of heterozygosity (LOH) events were also detected, because the array includes probes for single nucleotide polymorphism (SNP) genotyping. Thirty-two CNVs were found in coPKD2iPSC clone 11 (Table 9). The majority of them were already present in the uncorrected PKD2iPSC (91%), and many of them also in the original population of mononuclear cells (PKD2MNC) (75%). A surprising observation was CNV nr. 7, an LOH of 6.3 megabases (Mb) found in PKD2MNC that was not detected in PKD2iPSC but was detected again in coPKD2iPSC. Only CNVs nr. 1 and 29 were generated during gene correction or during the subsequent clonal selection and culture. The first is a deletion of 60.6 kb that includes the annotated genes OR2T10, OR2T11 and OR2T35, which belong to a family of olfactory receptor genes. The second is an amplification of 6 kb that includes the FGD1 gene. Neither has been associated with a survival advantage in pluripotent cultures. Because they could also be a consequence of off-target TALEN cutting, we analyzed the sequences surrounding these CNVs for homology with the PKLR TALEN target sites. No sequences with fewer than 8 mismatches were found in the vicinity of these CNVs. CNVs marked in light red were detected in PKD2iPSC and not in PKD2MNC, probably as a consequence of reprogramming and prolonged culture. None of the genes in these amplified or deleted regions has been associated with a proliferative or survival advantage or with hematopoietic differentiation.
Table 9: Copy Number Variations (CNVs) in coPKD2iPSC, clone 11.
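The off-target homology check described above can be illustrated with a sliding-window mismatch count. This is only a minimal sketch, not the procedure actually used in this work: the function names are hypothetical, the sequences are invented toy strings rather than the real PKLR TALEN target sites, and a threshold of 7 mismatches is assumed to correspond to "fewer than 8 mismatches".

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T; other letters unchanged)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan_for_sites(region: str, target: str, max_mismatches: int):
    """Return (position, strand, mismatches) for every window of `region`
    that differs from `target` (or its reverse complement) at no more than
    `max_mismatches` positions. Positions are 0-based offsets in `region`."""
    region = region.upper()
    target = target.upper()
    k = len(target)
    hits = []
    for strand, query in (("+", target), ("-", revcomp(target))):
        for i in range(len(region) - k + 1):
            mm = hamming(region[i:i + k], query)
            if mm <= max_mismatches:
                hits.append((i, strand, mm))
    return hits


if __name__ == "__main__":
    # Toy data only: a made-up 10-nt "target" and a made-up flanking region.
    toy_target = "GATTACAGGC"
    toy_region = "CCCCC" + "GATTACAGGA" + "TTTTT"
    for pos, strand, mm in scan_for_sites(toy_region, toy_target, 1):
        print(pos, strand, mm)
```

In practice the scan would be run with `max_mismatches=7` on the reference sequence flanking each CNV breakpoint, once per TALEN half-site. An empty result corresponds to the statement that no candidate site was found.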
Somatic mutation analysis
The whole exome of the three samples (PKD2MNC, PKD2iPSC and coPKD2iPSC) was sequenced on an Illumina HiSeq 2000 system. The sequencing data were compared with the human reference genome and variant calling was performed, generating a list of 67729 variants. Removing the variants already present in PKD2MNC reduced the list to 5797 variants. Selecting only those in exonic regions reduced it to 420 variants, and removing those included in the SNP database left 202. Variants covered by fewer than 8 reads and variants present in less than 20% of the reads were then removed, leaving 76 variants. The raw sequencing data for these 76 genomic regions were inspected in the Integrative Genomics Viewer (IGV, Broad Institute). Some variants were removed because the region had fewer than 8 reads in the original PKD2MNC, which made it impossible to rule out that they were already present in the original population before reprogramming. The final list (Table 10) included 10 variants, 4 of which were also detected in PKD2iPSC. Of these, only 3 were present in around 40% of the reads. To verify these mutations, the corresponding regions were PCR-amplified and Sanger sequenced. The mutations in RUSC2, TACR2 and APOA5 were confirmed (Fig. 38). The remaining mutations were also analyzed by Sanger sequencing, but none of them was verified.
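The successive filters described above can be summarized as a cascade in which each step keeps only the variants passing one criterion. The following is a minimal sketch under stated assumptions, not the bioinformatics pipeline actually used: the `Variant` record, its field names and the function `filter_variants` are hypothetical, and only the thresholds (8 reads, 20% of reads) are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    key: str              # hypothetical identifier, e.g. "chr:pos:ref>alt"
    exonic: bool          # lies in an exonic region
    in_snp_db: bool       # already listed in the SNP database
    depth: int            # reads covering the site in the analyzed clone
    alt_fraction: float   # fraction of those reads carrying the variant
    parental_depth: int   # reads covering the same site in PKD2MNC


def filter_variants(variants, parental_keys, min_depth=8, min_alt_fraction=0.20):
    """Apply the filters in the order given in the text.

    Returns the surviving variants and a list of (step label, count) pairs
    so that the shrinking of the list can be followed step by step."""
    steps = [
        ("absent from PKD2MNC", lambda v: v.key not in parental_keys),
        ("exonic", lambda v: v.exonic),
        ("not in SNP database", lambda v: not v.in_snp_db),
        ("depth and read fraction",
         lambda v: v.depth >= min_depth and v.alt_fraction >= min_alt_fraction),
        ("sufficient PKD2MNC coverage", lambda v: v.parental_depth >= min_depth),
    ]
    remaining = list(variants)
    counts = [("called", len(remaining))]
    for label, keep in steps:
        remaining = [v for v in remaining if keep(v)]
        counts.append((label, len(remaining)))
    return remaining, counts


if __name__ == "__main__":
    # Toy records only; they do not correspond to variants in Table 10.
    toy = [
        Variant("v1", True, False, 30, 0.45, 25),
        Variant("v2", True, False, 5, 0.50, 25),   # too few reads
    ]
    kept, counts = filter_variants(toy, parental_keys=set())
    for label, n in counts:
        print(label, n)
```

The last step mirrors the manual inspection of the 76 candidates: a variant is only retained when the parental sample has enough coverage at that site to show that the variant was absent before reprogramming.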
Table 10: Genetic variants in coPKD2iPSC clone 11
Figure 38: Single nucleotide variants (SNVs) verified by Sanger sequencing in coPKD2iPSC clone 11. RUSC2: G>T; APOA5: G>A; TACR2: A>G (the reverse-complement sequence is displayed in the chromatogram, T>C).
VIII Discussion
In the present thesis we have generated hiPSCs from PKD patients, with high efficiency and in a safe manner, which recapitulated the PKD phenotype once differentiated into the erythroid lineage. Moreover, we have accomplished their gene correction through a knock-in approach in the PKLR locus that corrected the PKD phenotype, and we have analyzed the resulting genome alterations using state-of-the-art techniques for genome sequence analysis.
1 Cell Reprogramming
The idea of generating an unlimited source of cells able to differentiate into any cell type of an adult somatic tissue is a very promising approach for regenerative medicine (Robinton and Daley 2012, Cherry and Daley 2013, Svendsen 2013). The establishment of the conditions for generating hiPSCs by reprogramming has opened a wide window in this new field. Accordingly, the 2012 Nobel Prize in Physiology or Medicine was awarded to two researchers who were crucial for the development of this technology: Shinya Yamanaka (Takahashi and Yamanaka 2006) and John Gurdon (Gurdon 2006). The two main applications of reprogrammed cells are disease modeling and drug discovery, which are already under development (Egawa, Kitaoka et al. 2012), and their use as an autologous cell source for regenerative medicine, which will probably be demonstrated in the near future.