Cell lineage inference is a powerful way to decipher cell lineages without a priori assumptions, but it is not based on experimental evidence. To demonstrate experimentally the origin of each cell in a given system, new techniques that combine single-cell transcriptomics with cell lineage labelling have been published. These techniques are powerful because they integrate the genetic inheritance revealed by lineage labelling with the differentiation trajectories derived from single-cell transcriptome analysis (Kester and van Oudenaarden, 2018). The combination of both approaches generates an integrative scheme of the differentiation process (Fig. 32).
Fig. 32. Combination of Single-Cell Genetic Lineage Tracing and Single-Cell Transcriptomics. First a phylogenetic tree is constructed based on genetic labels identified in single cells. This tree can then be refined using transcriptomics-based lineage reconstruction algorithms. Finally, gene-expression gradients can be projected onto the phylogenetic tree to identify gene-expression dynamics throughout the system.
From (Kester and van Oudenaarden, 2018)
The different techniques combining both approaches are classified into two categories:
VII.1 Prospective lineage-tracing
a) Viral barcoding-based lineage tracing: mouse cells can be infected with libraries of viruses harbouring unique DNA barcodes, so that each founder cell is labelled with a unique barcode. The founder cells then differentiate, and their progeny are harvested and sequenced. The different clones can be identified through the barcode sequence, and the lineage trajectory can be described (Fig. 34a). This method has been used extensively, for example in the haematopoietic cell lineage.
b) Polylox mouse model: earlier approaches labelled founder cells with fluorescent markers through a Cre-lox system, which can be activated at any moment during the process of interest; these approaches are limited by the colours available to follow the lineage. Researchers have since improved the technique by using DNA barcodes instead of fluorescent probes (Fig. 34b).
c) CRISPR-Cas9 genome editing-based lineage tracing: this method relies on direct genome editing with CRISPR-Cas9. The idea is to use Cas9 to create deletions or insertions at transgenic target sites in the genome. These modifications are random and can be used as barcodes that are inherited by the progeny of the cells, providing the tool needed to follow the cell lineage in the developing tissue (Spanjaard and Junker, 2017) (Fig. 34c). This approach has been implemented in several ways. For example, McKenna’s group (Raj et al., 2018) generated a new transgenic zebrafish line with synthetic concatemerized target sites in the 3’ UTR of a GFP transgene, an approach the authors called GESTALT (Genome Editing of Synthetic Target Arrays for Lineage Tracing) (Fig. 33). Junker’s group developed LINNAEUS (Fig. 33), in which Cas9 targets an RFP transgene already present in an existing zebrafish line; in this way they made sure not to interfere with the normal development of the fish (Spanjaard and Junker, 2017; Kester and van Oudenaarden, 2018).
Fig. 33. Massively parallel lineage tracing using the CRISPR/Cas9 system. (a) In scartrace/LINNAEUS, an existing fish line with multiple integrations of a transgene is targeted by Cas9. The sequences of the resulting ‘genetic scars’ (light gray) are used as lineage markers. (b) GESTALT uses the same principle, but a new line with concatemerized target sites is used. Compared to scartrace/LINNAEUS, this has the advantage that the individual sites (different colors) can be distinguished. From (Spanjaard and Junker, 2017)
The problem with all these techniques is the need to introduce exogenous material, which in some cases is not easy because of the nature of the sample. This is why other techniques, termed retrospective lineage tracing, had to be implemented.
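The clone identification step shared by the prospective methods above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the records, barcode strings and cell-type names are hypothetical, the barcode stands for any heritable label (viral barcode, recombined Polylox cassette or CRISPR scar pattern), and the cell type is assumed to have been assigned beforehand from the single-cell transcriptome of each cell.

```python
from collections import Counter, defaultdict

# Hypothetical records: (cell_id, lineage_barcode, cell_type).
# The cell type is assumed to come from clustering the transcriptomes.
cells = [
    ("cell1", "ACGT", "erythroid"),
    ("cell2", "ACGT", "myeloid"),
    ("cell3", "ACGT", "erythroid"),
    ("cell4", "TTAG", "lymphoid"),
    ("cell5", "TTAG", "lymphoid"),
    ("cell6", "GGCA", "myeloid"),
]


def clones_by_barcode(records):
    """Group cells into clones by barcode and count the cell types per clone."""
    clones = defaultdict(Counter)
    for _cell_id, barcode, cell_type in records:
        clones[barcode][cell_type] += 1
    return {barcode: dict(counts) for barcode, counts in clones.items()}


clone_fates = clones_by_barcode(cells)
for barcode, fates in clone_fates.items():
    # A clone containing several cell types suggests a multipotent founder.
    kind = "multi-fate" if len(fates) > 1 else "single-fate"
    print(barcode, fates, kind)
```

Cells sharing a barcode are treated as the progeny of one founder, so the cell types found within a clone describe the fates that founder produced. Real data would additionally require filtering of sequencing errors and of barcodes shared by chance, which this sketch ignores.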
VII.2 Retrospective lineage-tracing
This consists of tracing lineages through naturally occurring mutations, such as somatic mutations or copy number variations (CNVs).
a) Tracking of somatic mutations: this technique arose from the limitations of the tracing methods described above, for example their limited applicability in humans. Its aim is to study the mutations that occur spontaneously in cells; these mutations are inherited by the cell progeny and can be analysed (Kester and van Oudenaarden, 2018) (Fig. 34d).
Fig. 34. Overview of Genetic Lineage Tracing Strategies. (A) Lineage tracing through viral barcoding. Cells are infected with a virus library containing many different barcodes. After a period of time clones can be identified through the barcode sequence. (B) The Polylox mouse model was created by the introduction of a set of barcodes interspersed with loxP sites. Upon activation of the Cre recombinase, the Polylox cassette recombines, thereby producing unique cellular barcodes via a combination of losses and inversions of single barcodes. After a period of time the DNA is isolated and clone identification is done through the assessment of the combination of losses and inversions of barcodes. (C) Lineage tracing using CRISPR-Cas9 can be done in cultured cells and zebrafish. Introduction of CRISPR-Cas9 and gRNA into the cells results in the scarring of the target sequence during a given time window. After a period of time cells are harvested, genomic DNA is isolated, and scars are sequenced. The combination of scars in each cell produces a unique barcode, and the construction of multi-level phylogenetic trees is possible since scarring takes place over a long period of time and a portion of the scars will be shared between clones. (D) Tracking of somatic mutations can be performed in any model organism. Somatic mutations arise spontaneously and accumulate over time, thereby marking single cells and all their progeny. Clones can be identified through whole-genome or targeted sequencing. Construction of multi-level phylogenetic trees is possible since mutations arise over a long period of time. From (Kester and van Oudenaarden, 2018)
b) Tracking CNVs: CNVs can easily be detected by sequencing. The problem is that they are uncommon in healthy tissues, so constructing phylogenetic trees from samples of healthy individuals is not really possible. However, CNVs are abundant in cancer and change during tumour development, which makes them suitable for lineage reconstruction in cancer. In their work on breast cancer, Navin and Wang (Navin et al., 2011; Wang et al., 2014) found enough CNVs to generate a phylogenetic tree.
c) Tracking single-nucleotide variants (SNVs), indels and repeated regions: all of these exist in non-coding regions of the genome, but their detection is only possible by whole-genome sequencing, which makes these isolated events difficult to detect; in addition, the amplification step required for whole-genome sequencing hampers the detection of SNVs. Besides SNVs, retrotransposon elements and microsatellite repeats have also been used for lineage tracing.
d) Tracing through the detection of epigenetic marks: using DNA methylation or hydroxymethylation. DNA methylation is a regulator of gene expression and is maintained during cell division. It can change over time depending on the requirements of the cell, but these changes occur very slowly, which makes them a good source of information for detecting cell lineage in a developing tissue.
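The reasoning behind retrospective tracing, namely that marks acquired early are shared by many descendants while later marks are restricted to smaller clones, can be illustrated with a short Python sketch. It is not the method of any of the cited studies: the cell and mutation names are hypothetical, and the code assumes an idealised "perfect phylogeny" in which each mutation arises only once, is never lost and is detected without error.

```python
from collections import Counter


def build_lineage_tree(cell_mutations):
    """Build a nested-dict lineage tree from per-cell sets of mutations.

    Assumption: perfect phylogeny (each mutation arose once, was never
    lost, and is observed without dropout or sequencing error).
    """
    # Count in how many cells each mutation is observed.
    counts = Counter(m for muts in cell_mutations.values() for m in muts)
    tree = {"cells": [], "children": {}}
    for cell, muts in sorted(cell_mutations.items()):
        # Widely shared mutations are taken as earlier events, so they
        # are placed closer to the root.
        path = sorted(muts, key=lambda m: (-counts[m], m))
        node = tree
        for m in path:
            node = node["children"].setdefault(m, {"cells": [], "children": {}})
        node["cells"].append(cell)
    return tree


def show(node, depth=0):
    """Print the tree, one mutation per line, indented by depth."""
    for mutation, child in sorted(node["children"].items()):
        print("  " * depth + mutation, child["cells"])
        show(child, depth + 1)


# Hypothetical genotypes: the mutations detected in each sequenced cell.
genotypes = {
    "c1": {"m1", "m2"},
    "c2": {"m1", "m2"},
    "c3": {"m1", "m3"},
    "c4": {"m4"},
}
tree = build_lineage_tree(genotypes)
show(tree)
```

In this toy example, cells c1, c2 and c3 share the early mutation m1 and therefore descend from a common ancestor, while m2 and m3 separate them into two subclones and c4 belongs to an independent clone. Real single-cell data contain allelic dropout and sequencing errors that violate the assumption made here, so dedicated statistical methods are needed in practice.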
The fast and vast development of single-cell transcriptomics has revolutionized the scientific community. With this new tool we can gain a deeper understanding of the cellular and molecular events regulating the biological processes that occur in the different systems and, now more than ever, we can describe and characterize the dysfunctional processes that lead to disease.
The scientific community is now working together on a challenging project: the Human Cell Atlas (HCA). The aim of this project is to define all human cell types in terms of their distinctive patterns of gene expression, physiological states, developmental trajectories and location, and to create “comprehensive reference maps of all human cells as a basis for both understanding human health and diagnosing, monitoring, and treating disease” (Hon et al., 2017; Paper, 2017).
This project will serve the community as a reference, bringing together the different discoveries and applications and creating a new tool to fight the diseases threatening human life, helping in the