Yeast cells divide rapidly, with a cell cycle time of between 90 minutes and 2 hours. Budding yeast reproduces through a mitotic cycle and is generally studied in the haploid state. The stages of the yeast cell cycle are similar to those of other eukaryotic cells. The cycle comprises two main phases, S phase and M phase, separated by two gap phases, G1 and G2 (see Figure 3.1 below).
Figure 3.1
The events during the eukaryotic yeast cell cycle. The main events of the cell cycle are chromosome duplication (S phase) and chromosome segregation, nuclear division, and cell division (M phase). G1 phase is the gap phase between M and S phases, whereas G2 is the gap phase between S and M phases. A yeast cell decides whether to commit to a new cell cycle at the start-transition (START) in G1 phase. Also shown in this diagram is the canonical model of yeast cell cycle regulation, derived from transcription factor binding data for eight well-known cell cycle transcription factors [87, 88].
During S phase, DNA is replicated and the chromosomes are duplicated by the proteins that carry out DNA synthesis. Protein synthesis (e.g. of histones) is also required, because the newly replicated DNA must be packaged into chromatin. The duplicated chromosomes are known as sister chromatids. Cytoplasmic components are also duplicated throughout the cell cycle. The gap between S phase and the following M phase is known as G2 phase. G2 phase provides additional time for cell growth, duplication and segregation of cellular components, and protein synthesis as the cell prepares for mitosis. During M phase, two major events occur: mitosis and cytokinesis. During mitosis, sister chromatids are distributed equally into a pair of daughter nuclei [89]. Mitosis includes two key sub-phases, metaphase and anaphase. In metaphase, each pair of sister chromatids is attached to opposite poles of the bipolar mitotic spindle. In anaphase, shortening of the spindle fibres pulls the sister chromatids apart towards opposite ends of the cell. During cytokinesis, the cell divides: a new plasma membrane and cell wall are generated, and actin filaments and myosin beneath the cell membrane contract. The fate of the two resulting daughter cells is determined in G1 phase, which acts as a checkpoint for cell cycle progression [89]. The start-transition (START) checkpoint at the end of G1 phase determines whether the cell cycle proceeds, depending on cell mass and on environmental cues such as nutrient availability and mating pheromone [90].
The regulation of yeast cell cycle-dependent genes has been investigated extensively. The transcription factors (TFs) that regulate these genes have been identified and include Ace2, Mbp1, Mcm1, Ndd1, Swi4, Swi5, Fkh1, and Fkh2 [88, 89, 91]. MBF (a complex of Mbp1 and Swi6) and SBF (a complex of Swi4 and Swi6) control the activation of genes required for the transition from G1 phase to S phase by binding to DNA sequence elements called MCB and SCB, respectively [88, 89]. The genes activated during this phase include the cyclins (Cln1, Cln2 and Cln3). Cyclins regulate cyclin-dependent kinases (Cdks), e.g. Cdk1, which promotes cell cycle progression to S phase [87]. In addition, the SBF and MBF heterodimers promote the activity of the S-phase cyclins Clb5 and Clb6. At the G2/M transition, another regulatory protein complex, Mcm1-Fkh1/2-Ndd1, activates the expression of G2/M genes encoding mitotic regulatory proteins, e.g. Clb2 and Cdc20, which are required for mitotic entry and mitotic exit [87]. In late mitosis (M/G1 phase), the TFs Swi5 and Ace2 stimulate the expression of M/G1 genes responsible for mitotic exit and cytokinesis.
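The regulator assignments described above can be written down as a small lookup table. The following is a minimal sketch in Python that only transcribes the complexes named in the text; the dictionary layout and the names `CANONICAL_REGULATORS` and `transitions_for` are illustrative assumptions and do not come from the cited studies.

```python
# Canonical regulators per cell cycle transition, transcribed from the text.
# The dictionary layout and helper name are illustrative assumptions.
CANONICAL_REGULATORS = {
    "G1/S": {
        "MBF": ("Mbp1", "Swi6"),
        "SBF": ("Swi4", "Swi6"),
    },
    "G2/M": {
        "Mcm1-Fkh1/2-Ndd1": ("Mcm1", "Fkh1", "Fkh2", "Ndd1"),
    },
    "M/G1": {
        "Swi5": ("Swi5",),
        "Ace2": ("Ace2",),
    },
}


def transitions_for(tf):
    """Return the sorted list of transitions at which `tf` acts."""
    return sorted({
        transition
        for transition, complexes in CANONICAL_REGULATORS.items()
        for members in complexes.values()
        if tf in members
    })


print(transitions_for("Swi6"))  # ['G1/S']
print(transitions_for("Fkh2"))  # ['G2/M']
```

A set is used inside `transitions_for` so that a protein belonging to several complexes at the same transition (such as Swi6 in both MBF and SBF) is reported only once.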
Around 204 transcription factors have been identified in Saccharomyces cerevisiae, but their functional regulatory networks have not been fully mapped. Mapping functional regulatory networks requires characterized functional interactions between TFs and their targets. Even in this simple organism, the mapping is highly complex: the observed TF-DNA interactions are not necessarily direct (i.e. a TF may associate with DNA through interaction with other proteins), and they can involve a cascade of downstream gene activation.
Furthermore, not all TF binding regulates gene expression. This should be kept in mind when building gene regulatory networks. Haynes and co-workers (2013) highlighted this issue of irrelevant (non-functional) TF binding: they found that 98% of yeast genes were bound by TFs, but only 45% of these genes were actually regulated by those TFs when the TFs were perturbed [92]. Joint clustering of gene regulatory information (i.e. TF binding) with gene expression (e.g. cell cycle time courses, TF knockouts, or other types of perturbation) could provide insight into functional as well as irrelevant TF binding and could also reveal new interactions in the networks. To date, investigations of regulatory control have either addressed small-scale systems, dealing with small genomic regions and a few genes, or focused on general properties of genome-scale systems, neglecting the details of control of any individual gene. We aim to generate models that yield greater mechanistic insight into the regulation of individual genes and groups of genes, using a novel probabilistic approach to jointly cluster regulatory information and gene expression patterns in yeast data.
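The distinction between binding and regulation can be made concrete with set operations on TF-gene pairs. The sketch below is a simplified toy illustration, not the analysis performed in [92]: the TF and gene names, the two input sets, and the function `fraction_functional` are all made up for illustration.

```python
# Toy illustration of functional versus irrelevant TF binding.
# All TF names, gene names, and pairs below are invented placeholders.

# (TF, gene) pairs for which binding was observed
bound = {
    ("TF_A", "gene1"), ("TF_A", "gene2"),
    ("TF_B", "gene2"), ("TF_B", "gene3"),
}
# (TF, gene) pairs where the gene's expression changed when the TF was perturbed
responsive = {
    ("TF_A", "gene1"), ("TF_B", "gene3"), ("TF_B", "gene4"),
}

functional = bound & responsive   # bound and responsive to perturbation
irrelevant = bound - responsive   # bound, but no expression change observed
unbound = responsive - bound      # responsive without observed binding


def fraction_functional(bound, responsive):
    """Fraction of binding events that are supported by a perturbation response."""
    if not bound:
        return 0.0
    return len(bound & responsive) / len(bound)


print(fraction_functional(bound, responsive))  # 0.5
```

Pairs in `unbound` are candidates for indirect regulation, for example through a downstream cascade, which is one reason binding data alone does not define the functional network.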
Methodology
To test whether our flexible model-based clustering can be applied to genetic regulation, we used the well-studied yeast cell cycle system. The experimental data set comprises microarray gene expression measurements for ~6000 yeast genes across 18 time points from Spellman and co-workers [77]. The yeast cells in this experiment were grown in rich media and sampled at each time point following α-factor synchronization at the very beginning of the experiment. For the transcription factor binding data set, binding data for 103 yeast transcription factors (TFs) were retrieved from the regulatory map published by Harbison and co-workers [93]. The yeast TF binding data from [93] and the yeast cell cycle gene expression data from [77] are widely used for constructing yeast functional regulatory networks. Our working hypothesis is that yeast has an underlying gene regulatory network that is always the same; thus, combining data sets from different experiments is possible, and this has been done previously in research that constructed yeast transcriptional regulatory networks. Before clustering, both the gene expression and the TF binding data were pre-processed.
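Combining two data sets from different experiments generally requires restricting both to the genes they share, so that each gene has one expression profile and one binding profile. The following is a minimal sketch of that alignment step under stated assumptions; it is not the pre-processing used in this work. The function name `align_on_shared_genes`, the gene identifiers, and all values are hypothetical, and the toy profiles are far shorter than the 18 time points and 103 TFs of the real data.

```python
def align_on_shared_genes(expression, binding):
    """Restrict two gene-keyed tables to their shared genes, in sorted order.

    expression: dict mapping gene -> list of expression values (one per time point)
    binding:    dict mapping gene -> list of binding values (one per TF)
    """
    shared = sorted(expression.keys() & binding.keys())
    expr_rows = [expression[gene] for gene in shared]
    bind_rows = [binding[gene] for gene in shared]
    return shared, expr_rows, bind_rows


# Hypothetical toy input: 3 time points and 2 TFs per gene.
expression = {
    "geneA": [0.1, -0.4, 0.9],
    "geneB": [1.2, 0.3, -0.7],
    "geneC": [0.0, 0.5, 0.2],
}
binding = {
    "geneB": [1, 0],
    "geneC": [0, 1],
    "geneD": [1, 1],
}

genes, expr_rows, bind_rows = align_on_shared_genes(expression, binding)
print(genes)  # ['geneB', 'geneC']
```

After alignment, row i of `expr_rows` and row i of `bind_rows` describe the same gene, which is the layout a joint clustering of expression and binding would need.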