Ecology and evolution in the RNA world dynamics and stability of prebiotic replicator systems

(1)

Ecology and evolution in the RNA world dynamics and stability of prebiotic replicator

systems

Szilágyi, András; Zachar, István; Scheuring, István; Könny, Balázs; Czárán, Tamás

Published in:

Life

DOI:

10.3390/life7040048

Publication date:

2017

Document version

Publisher's PDF, also known as Version of record

Document license:

CC BY-NC-ND

Citation for published version (APA):

Szilágyi, A., Zachar, I., Scheuring, I., Könny, B., & Czárán, T. (2017). Ecology and evolution in the RNA world

dynamics and stability of prebiotic replicator systems. Life, 7(4), [48]. https://doi.org/10.3390/life7040048

(2)

Review

Ecology and Evolution in the RNA World Dynamics

and Stability of Prebiotic Replicator Systems

András Szilágyi1,2,3, István Zachar1,2,3 ID_{, István Scheuring}1,3 ID_{, Ádám Kun}1,2,3_,

Balázs Könny ˝u4and Tamás Czárán1,3,5,*

1 _{Evolutionary Systems Research Group, MTA, Centre for Ecological Research,}

Hungarian Academy of Sciences, Klebelsberg Kuno u. 3, 8237 Tihany, Hungary; [email protected] (A.S.); [email protected] (I.Z.); [email protected] (I.S.); [email protected] (Á.K.)

2 _{Center for the Conceptual Foundations of Science, Parmenides Foundation, Kirchplatz 1,}

82049 Pullach/Munich, Germany

3 _{MTA-ELTE Theoretical Biology and Evolutionary Ecology Research Group, Department of Plant Systematics,}

Ecology and Theoretical Biology, Eötvös Loránd University, Pázmány Péter sétány. 1/c, 1117 Budapest, Hungary

4 _{Department of Plant Systematics, Ecology and Theoretical Biology, Eötvös Loránd University,}

Pázmány Péter sétány. 1/c, 1117 Budapest, Hungary; [email protected]

5 _{Biocomplexity Group, Niels Bohr Institute, Copenhagen University, Blegdamsvej 17,}

2100 Copenhagen, Denmark

* Correspondence: [email protected]

Received: 30 September 2017; Accepted: 13 November 2017; Published: 27 November 2017

Abstract:As of today, the most credible scientific paradigm pertaining to the origin of life on Earth is undoubtedly the RNA World scenario. It is built on the assumption that catalytically active replicators (most probably RNA-like macromolecules) may have been responsible for booting up life almost four billion years ago. The many different incarnations of nucleotide sequence (string) replicator models proposed recently are all attempts to explain on this basis how the genetic information transfer and the functional diversity of prebiotic replicator systems may have emerged, persisted and evolved into the first living cell. We have postulated three necessary conditions for an RNA World model system to be a dynamically feasible representation of prebiotic chemical evolution: (1) it must maintain and transfer a sufficient diversity of information reliably and indefinitely, (2) it must be ecologically stable and (3) it must be evolutionarily stable. In this review, we discuss the best-known prebiotic scenarios and the corresponding models of string-replicator dynamics and assess them against these criteria. We suggest that the most popular of prebiotic replicator systems, the hypercycle, is probably the worst performer in almost all of these respects, whereas a few other model concepts (parabolic replicator, open chaotic flows, stochastic corrector, metabolically coupled replicator system) are promising candidates for development into coherent models that may become experimentally accessible in the future.

Keywords:RNA-world; ribozymes; coexistence; ecological stability; evolutionary stability; template replication; modelling the origin of life; evolvability

1. Introduction

Prebiotic systems are assemblages of dynamically coupled replicative entities hypothesized to have existed before biological evolution, during the chemical evolutionary phase of molecules leading to the first cells and life, about 3.5–4 billion years ago on Earth. The idea of prebiotic evolution is not limited to our planet, of course: any habitat in the universe offering suitable physical-chemical conditions for the emergence and maintenance of such replicative entities may have undergone similar

(3)

evolution. The units of evolution at the prebiotic era on Earth were molecular replicators (most probably RNA molecules) and their evolution may have led to the emergence of the first chromosomes and, ultimately, to the first cells. None of the recent, highly evolved biochemical machinery controlling and regulating the replication of information (such as modern error-correction mechanisms of DNA copying) had existed then. Therefore, some serious obstacles had to be overcome on the evolutionary route leading to the first individual cells and biological evolution.

The first such problem that prebiotic systems may have faced was transgressing the information threshold, i.e., escaping Eigen’s paradox. The paradox poses the following issue: The critical amount of information within a replicator system that is sufficient to keep it running through many generations is constantly ruined by mutational loss. Lacking a reliable replication mechanism, the mutation rate was probably very high. As a consequence, the critical amount of information could not be condensed into a single, long replicator, because copying errors (mutations) would have easily eroded much of the vital information in a single step. Maintaining a sufficient diversity of different replicator species, each containing a small, more reliably replicable part of the critical information, could be the solution to the information threshold problem [1]. The combined information content of such a maintainable replicator set may have been sufficient to code for a viable system. However, several other system-dependent problems had to be solved by even the simplest prebiotic replicator system. We have defined a minimum set of system-level criteria that any prebiotic replicator set would certainly have had to meet in order to be able to maintain itself for a sufficiently long time and evolve toward higher complexity:

•

Ecological diversity—maintaining the coexistence of a sufficient number of different species (replicators, sequences, genotypes, etc.) in light of the Gause-principle (see later), which poses a strict limit on the number of coexisting species based on the number of regulating factors.

•

Ecological stability—maintaining dynamical stability in a given set of coexistent species against

external perturbations.

•

Evolutionary stability—maintaining an adequate amount of information (a critical diversity of replicator species) from generation to generation and avoiding information decay (diversity reduction) in spite of frequent mutations and the lack of error correction.

Any model intended to represent the dynamics of prebiotic systems must satisfy at least these three criteria (beyond biological plausibility and interpretability). Note that there are more criteria to be met by the replicators themselves for complex life to unfold from the prebiotic systems they constitute. For replicators to be the units of open-ended [2] evolutionary change, they have to be capable of unlimited heredity [3], self-referentialism and evolution of evolvability [4], etc. (for a summary, see [5]). In this paper, we focus our attention on the diversity and stability aspects of replicator communities as emphasized above, assuming that all other requirements are met by the constituent replicators.

We will use the term “replicator” for any kind of biological or chemical entity that is capable of replication in the broadest sense (see [6]), i.e., is multiplying, has variations that affect its reproduction/survival and is creating more of its type (with variations being heritable). Mutations will play a crucial role in the evolutionary dynamics of species of replicators, the time scale of which may or may not be substantially different from that of ecological changes, depending on the actual model.

In the following section—after a brief methodological characterization of dynamical modelling—we will clarify the concepts of diversity maintenance (coexistence), ecological stability and evolutionary stability in some detail. Then we will investigate a set of models previously introduced, along the lines of these three criteria, under a separate heading for each version (differing in spatial and/or temporal resolution) of each dynamical scenario. Our aim is to provide a comparative review of the field’s most important models. We will confine our focus on models of linear polymer replicators (string replicators) and will not survey models dealing only with higher-level (compositional) dynamics such as the GARD model [7–9] or the models of autocatalytic sets [10–12], as those models can be understood as special cases of others discussed in this paper (for critical analyses of GARD, see [13,14]).

(4)

2. The Three Pillars of Prebiotics

Prebiotic systems are usually investigated by dynamical models. In turn, we will discuss some of the most thoroughly studied ones. Dynamical models can be classified into different categories depending on certain aspects of the dynamics they assume. The two most important such aspects are temporal and structural resolution: models may be discrete or continuous in time and they may or may not postulate spatial, group or other structure with local interactions.

Continuous time models are formalized as differential equations specifying the state of the system at t + dt, based on the state at t, where dt is an infinitesimally short time period. Temporally discrete systems are difference equations or update rules that define the state of the system at time t + 1 as a function of its state at t. Spatially structured models can be treated in continuous space by partial differential equations (PDEs) or in discrete space, as cellular automata (on different types of grids or lattices). Note that the analysis of PDE models requires numerical methods in almost any case, just as the lattice models they approximate. Since the corresponding lattice model is usually much easier to handle and it can approximate continuous time by sequential random updating rules, PDE models play a minor role in studies of replicator dynamics.

2.1. Maintaining Diversity

Ideal populations of replicators not limited by external factors exhibit exponential growth. For any biological entity (or replicator), the size of offspring in the population is proportional to the actual number of reproducing entities in the population (or to the whole population if everyone reproduces). In models of population dynamics, the factor of proportionality is the Malthusian growth rate, characteristic of the species, denoted by r (r > 0). Thus, the continuous-time dynamics of a population that grows without any internal or external limitation is the following:

.

x

(

t

) =

rx

(

t

)

, (1)

where x

(

t

)

is the amount (or concentration) of a replicator species at time t and x.

(

t

)

is the time derivative. The solution of this differential equation is the well-known exponential growth formula x

(

t

) =

x

(

0

)

ert, defining the actual population size at any time t. Exponential growth would increase population size beyond all limits, whereas the growth of every natural population slows down and ultimately stops growing due to the exhaustion of the limiting resource (food, space, etc.). Such regulating factors are extremely important in the coexistence of different replicator (or biological) species (see later).

The interesting dynamics arose when multiple different species are competing for the same resource. Assume two replicators with Malthusian growth rates r1and r2. The ratio of their numbers

at time t is x1

(

t

)

x2

(

t

)

=

x1

(

0

)

x2

(

0

)

e(r1−r2)t ₍₂₎

of which the limit at t

=

∞ is:

lim t→∞ x1

(

t

)

x2

(

t

)

=

     0, i f r1

<

r2 ∞, i f r1

>

r2 x₁(0) x2(0), i f r1

=

r2 (3)

meaning that the replicator with the higher replication rate exponentially outcompetes the inferior replicator. The inferior species becomes extremely diluted in finite time, which practically means its extinction. The relative growth rate of the competitors is the difference between their Malthusian parameters. Coexistence of the two replicator populations is impossible without a mechanism that ensures the two growth rates to settle at exactly the same value. In case of any arbitrarily small difference between r1 and r2, the difference in the densities grows exponentially. This is the core

(5)

problem of maintaining diversity of exponentially growing populations—and exactly the same dynamics are the indispensable basis of natural selection. It is the requirement of both maintaining diversity and remaining selectable that makes the problem particularly difficult.

As mentioned before, coexistence requires regulating factors which mitigate exponential competitive exclusion [15] and thereby allow coexistence. Gause’s CCCC principle (“Complete Competitors Cannot Coexist”) can be rephrased using the concepts of modern ecology: the number of coexisting species cannot exceed the number of regulating factors in equilibrium. We may consider any factor a regulating factor if: (i) it affects the growth rate of a species and (ii) it is affected by the number (density) of the same (and possibly also some other) species. Any factor that is a regulating factor for at least one of the species in a given species pool increases the possible number of coexisting species and the robustness of their coexistence. Note that the identification of regulating factors is sometimes trivial (e.g., the number of limiting resources or self-inhibition) but often it is more difficult (e.g., spatial constraints, stochasticity, periodic solutions, etc.) Furthermore, the determination of coexisting species (and hence their number) may be also complicated (as is the case for replicating nucleotide sequences, where a complementary pair of strands counts as a single replicator instead of two [16]. The presence of different regulating factors increases the chance of coexistence by relaxing competition. This is obvious in case of the self-inhibition of replicators, for which the dynamics takes the following form:

.

x

(

t

) =

rx1/2

₍

_t

₎

_{; for details see [}₁₇_{] and the section about parabolic replicators in this paper.}

Beyond the occurrence of additional regulating factors, the intrinsic variability of the dynamics can also act as a factor facilitating and maintaining diversity. Periodic or chaotic variation of densities in time generated by the dynamics itself (intrinsic fluctuations) can help to maintain diversity, the fluctuations themselves acting as regulating factors, see e.g., [18–20].

2.2. Ecological Stability

While regulating factors affect the growth rates and are affected by the densities, other external factors may also affect growth rates (mortality and fecundity) but remain unaffected themselves by the densities of replicators. Such factors are usually abiotic, such as temperature, pH or humidity. The robustness of a fixed set of coexistent replicator species (community) against changes in external factors is the key concept of ecological stability. If typical external perturbations can cause a system to collapse or a reduction in the number of coexisting species, the system is considered ecologically unstable.

Assuming that the change in external factors occurs on a time scale shorter than that of evolutionary changes (the accumulation of mutations), ecological robustness applies to a fixed set of species, even if ecological and evolutionary time scales overlap in prebiotics (discussed in turn), due to the large mutation rates involved. Note that the variation of external factors is not necessarily detrimental to an established community; environmental variation can also act as a potential diversity-maintaining factor, as it is well known in ecology and discussed in the previous section. 2.3. Evolutionary Stability

Because of the high mutation rates in prebiotic scenarios, there is no clear distinction of “ecology” and “evolution” in terms of time scale separation in the dynamics. The evolutionary stability of a system means the robustness of the resident species against any invading mutant. If there are mutants that corrupt the system or reduce the number of coexisting species (thus decreasing the sustainable amount of information), the system is considered evolutionarily unstable, for detailed analysis, see [21]. Even though this aspect is often disregarded, any candidate model of a prebiotic system must meet the evolutionary stability criterion, otherwise it is seriously underestimating the potential effects of mutations. Evolutionary stability against deleterious mutants is at least as important as ecological stability, precisely because of the overlapping time scales.

An indispensable aspect of “forward” evolutionary stability is evolvability: the propensity of the system to adopt new replicator species (possibly originating as mutants of existing ones or supplied

(6)

from outside) if they are of any advantage in terms of the collective fitness of the system. (See the corresponding group selection arguments later, in Sections3.2.3and3.2.4). All the models discussed in this paper will be scrutinized also in this respect, by assessing the probability of the given system to produce and incorporate beneficial mutants.

3. Models of Prebiotic Systems

In this section, we will scrutinize some of the most important and mainstream dynamical models of prebiotics with respect to their ecological and evolutionary stability properties. We aim to analyse the explanatory power and applicability of these models in the context of the three criteria explicated in the Introduction above. Specifically, our analysis includes the following models:

•

The Quasispecies model (QS) [1,22]

•

The Hypercycle (HC), spatial hypercycle (SHC) [23–26], compartmentalized hypercycle (CHC) [27] models

•

The Parabolic replicators (PR) model [17]

•

The Stochastic Corrector model (SCM) [27–30]

•

The Open chaotic flow (OCF) model [31,32]

•

The Metabolically Coupled Replicator System (MCRS) model [33–38]

•

Trait Group Models (TGM) [39,40]

Table1categorizes these model types on the basis of their temporal and structural resolution; Figure1provides a “genealogy” of the models.

Table 1.Categorization of dynamical models with respect to their temporal and structural resolution. For details of the models see the main text and references. Note that unstructured replicator models in discrete time are generally lacking as fully (i.e., in both space and time) continuous models are much easier to handle analytically.

Structure/Time Discrete Time Continuous Time

Without structure (only global interactions) - QS, HC, PR With structure (global and local interactions) Compartmentalized SCM CHC, TGM

Spatial MCRS SHC, CM

3.1. Models Assuming no Structure

Models without spatial or compartmental structure can be easy to deal with, as there is no need to account for the corresponding spatial aspects of the dynamics, so that local differences in concentrations/amounts, limited ranges of interactions and localized physical processes (droplet formation, diffusion, bonding to surface, vesicle division, etc.) can be drastically simplified or even omitted. Mean-field simulations are usually easy to approximate analytically. On the other hand, the lack of any structure means that these models have a limited ability to maintain diversity.

(7)

Figure 1.Genealogy of prebiotic replicator models. The simplest possible model for replicator dynamics is exponential growth, which does not allow coexistence as the fittest always wins. Since it is an idealistic case, all sorts of extensions and deviations from the basic model are intended to make prebiotic systems more realistic and more permissive in terms of coexistence, ultimately crossing the barrier beyond which a sufficient amount of information can be stably maintained on the evolutionary timescale for cellular life to emerge.

3.1.1. Hypercycle (HC)

The hypercycle was proposed by Eigen and Schuster [22,41–43] as a solution to the error threshold [1], a severe limit to the information content of primordial biological sequences. The replication of information-carrying macromolecules is prone to error [44] and the error rate (mutation rate) was higher at the origin of life [45], due to the lack of effective and high fidelity replicase enzymes and proofreading mechanisms. A functional sequence is replicated but some of its progeny will be of a different—most probably non-functional—type due to mutations. The following equations describe a system of a replicating functional, wild-type sequence (its concentration denoted by xw) and all of its possible mutants lumped together (their total concentration denoted by xm):

.

xW

=

xw

[

QAW

−

Φ

]

(4)

.

xm

=

xm

[

Am

−

Φ

] + (

1

−

Q

)

AWxW (5)

where Q is the probability of faithful replication of a sequence; AWand Amare the replication rates

(Malthusian growth rates) of the wild-type sequence and its mutants, respectively; and Φ is the outflow term to keep the total concentration constant. It is evident that in such a system the wild type will go extinct if Am > AW. Coexistence, i.e., the survival of the wild type is only possible if QAW> Am.

(8)

replication fidelity. Given a sequence of length L, the fidelity of replication is Q = (1

−

µ)L

∼

=

e−Lµ. We can then arrive at the inequality of the error threshold setting an upper limit to reliably replicable sequence lengths:

L

<

ln

(

AW/Am

)

µ (6)

Assuming that the per nucleotide mutation rate is 1% [45] (which is a realistic assumption for replications unaided by efficient enzymes) and that the wild type has better replication rate than any mutant at least conforming to the ln(AW/Am) = 1 relationship, we find that a wild type sequence of length

100 but not more, can be stably maintained. Note that since the threshold expression (Equation (6)) is proportional to the logarithmic ratio of the functional and the non-functional replication rates, increasing the replication rate of the wild-type does not increase the length of the maintainable sequence too much. This result yields Eigen’s Paradox [46]: there is no accurate replicase without a large genome and there could be no large genome without an accurate replicase. Thus, the information that can be reliably replicated is less than the information necessary to code for the replicating machinery. This is a key dynamical problem to which the early evolution of life had to find a solution [47–49].

The hypercycle was devised to overcome Eigen’s Paradox. If a single sequence cannot maintain enough information, then the necessary amount of information needs to be replicated in several sequences. Information stored in short sequences can be replicated, whereas the same amount of information in a single sequence may be far above the error threshold assuming the same mutation rate. However, the different sequences will inevitably compete with each other and given the limited number of resources (monomers) and the lack of other regulatory constraints, only one (or as many as there are different resources) of the sequences will survive. Thus, a mechanism is required to establish cooperation among the sequences so that none of them outcompete the others. In the hypercycle, each replicator (sequence) catalyses the replication of another sequence in the set. Each replicator catalyses the replication of only one other replicator and receives catalytic aid from only one other replicator, the interaction thus occurring in a circular topology. For example, in a three-membered hypercycle R1 catalyses the replication of R2; R2 catalyses the replication of R3; and R3 catalyses the replication of R1 and closes the hypercycle (see Figure2).

Figure 2.A 3-membered hypercycle. Each member (Ri) of the hypercycle can catalyse its own replication

(Ai) and the replication of the next member in the cycle (Ki+1).

Formally, the concentration of a replicator i in an n-membered hypercycle can be written as

.

xi

=

xi

[

Q

(

Ai

+

Kixi−1

) −

Φ

]

(7)

where K is the catalytic aid received from the previous member in the hypercycle, i = 1 . . . n, (x0

≡

xn);

all other symbols are as above.

We need to stress here that members of the hypercycle catalyse the formation of the next member but they themselves are not converted to the next member (i.e., reactions are second order of the form

(9)

R1 + R2

→

R1 + 2R2). There is a lingering misconception in the literature [50] which results in the cyclic (first order) production of certain molecules being called a hypercycle, which it is not.

How efficient is a hypercycle in integrating information, i.e., how many functional sequences could coexist in it? The higher the number of coexistent sequences, the more information the system maintains. If for all i, Ai = 0, i.e., the replicators cannot replicate on their own, only with the help

of another catalyst, then the system is fully cooperative and all members coexist [51–56]. This is the homogeneous hypercycle. Assuming that all catalytic rates are the same, the dynamics leads to a stable fixed point for two-, three- and four-membered hypercycles [51,52]. Furthermore, if there are five or more members in the hypercycle, then the system approaches a stable limit cycle. Theoretically, any number of sequences can coexist but with high numbers of members some replicator concentrations may decrease to very low values during oscillations and with any one of the members lost the whole system collapses. Therefore, for n > 4 the system is unstable.

In the inhomogeneous hypercycle (Ai > 0) the members are also in competition and if the Ai

values are too large compared to the Ki values, then one or more of the sequences can be lost [54].

Again, hypercycles with n

≤

3 members converge to a stable fixed point [51] and ones with five or more members exhibit oscillatory behaviour (stable limit cycles [57,58]. Hypercycles of six or more members can be unstable [59]. Stability is further affected by differences in the catalytic aid members give to each other [60]. In conclusion, we may say that the hypercycle can show rich dynamics [61,62], although its ability to maintain the coexistence of even a moderate number of different replicators is limited.

So far, we have not considered the quasispecies, i.e., the cloud of mutants generated around the wild-type sequence in a hypercyclic system. We can lump all mutants together and follow their concentration in a way similar to that of Equation (5):

. xm

=

xm

[

Am

−

Φ

] + (

1

−

Q

)

" _n

∑

i=1

(

Aixi

+

Kixixi−1

)

# (8) Analysing the dynamics of hypercycles and the mutants of the master sequences uncovered a new threshold [63,64]. A replication fidelity lower than the error threshold does not allow for the maintenance of a single long molecule but shorter sequences organized into a hypercycle can coexist with their mutants. There is a lower critical copying fidelity below which even the hypercyclic organization collapses, because the mutants overwhelm the system. Yet there is a range of copying fidelity which does not allow a single long molecule to coexist with its mutants but the same amount of information arranged in a hypercycle can be maintained.

Silvestre and Fontanari [65] have cast some further doubt on the information integration capability of hypercycles. While they were able to show that even long hypercycles with n = 12 can be maintained, the copying fidelity puts an upper limit on the number of sequences (n) that can coexist. They find that

if all Aiare the same (A) then

n

<

Q

2

4A

(

1

−

Q

)

(9)

Thus—they argue—chopping up the information into many smaller bits does not help. On the other hand, differences in replication and catalytic rates can ensure that information in many pieces can be maintained whereas a long chromosome cannot [66].

The hypercycle as an organization is capable of information integration. The question now is whether it is capable of evolution toward increased information content? Once a hypercycle is established, it is difficult to replace it with another hypercycle [41]. The hypercycle as a whole system exhibits hyperbolic growth and entities initially having a higher population size have an advantage in such a growth regime [67]. Even if we start from the same concentration, no coexistence of competing hypercycles is possible [68].

(10)

Catalytic species sometimes also inhibit some reactions. The hypercycle was also studied considering inhibitory/suppressive interactions. If there is strong suppression, then even-membered hypercycles cannot maintain all their species, whereas odd-membered hypercycles can. But even-membered cycles outcompete odd-membered cycles and thus the hypercycle generally breaks down under strong suppression [69].

As a consequence, while half a dozen sequences can possibly coexist, the system cannot evolve to incorporate more members. The evolutionary potential of the hypercycle is thus severely limited.

Niesert and co-workers [70] and Maynard Smith [71] pointed out a series of even more severe problems of the hypercycle that arise if mutations are allowed. There are two kinds of mutation that can destroy the system. One mutation turns a regular member to a selfish parasite, a sequence that accepts the catalytic aid given by a member but does not reciprocate (does not help the next member). If this parasite receives strong enough catalysis, then it can spread and channel away catalytic aid, leading to the collapse of the hypercycle (see Figure3, left panel). A second class of mutation can alter the specificity of aid given to other members of the hypercycle. If a new mutant arises that helps the replication of a member of the hypercycle other than the next one in the cycle, then a shortcut forms (see Figure3, right panel). Such a shortcut parasite reduces the hypercycle to one that consists of fewer members than the original (i.e., reduces overall diversity). A shorter hypercycle having shared members with a longer hypercycle can spread in expense of the longer one. This represents evolution to shorter and shorter hypercycles. Information is lost with each loss of a member.

Figure 3. Evolutionary instability in the hypercycle. (a) A parasite (RM) that enjoys catalysis from

a member of the hypercycle (R2) but does not take part in the hypercycle organization. (b) A shortcut

mutation (red dotted arrow) which changes the specificity of the catalysis offered by a member of the hypercycle (R2) so that it catalyses the replication of a member it should not catalyse. R1, R2and R4

now form a 3-membered hypercycle, which can replicate faster than the 4-membered hypercycle.

So far, we have assumed that the replication rates of the wild-type sequences are higher than the replication rates of the mutants. If there is a mutant with a higher replication rate, then it could outcompete its slower master sequence. Functional sequences are usually long and their shorter mutants replicate faster, as demonstrated in Spiegelman’s experiment [72,73], in which the Qβ phage genome was replicated without selection for function. The functional sequence was lost by the fourth serial transfer. The sequence population was evolved to be mere fifth of the length of the original sequence but it was replicated 15 times faster [72]. That is, mutations allowing faster replication for the quasispecies are all potentially deleterious to the “naked” (non-spatial) hypercycle. For a recent review on the hypercycle, see [74].

(11)

3.1.2. Parabolic Replicators (PR)

The first problem that a prebiotic replicator community has to solve if it is to start up life is to avoid the competitive exclusion of its constituent replicators, i.e., to maintain a critical diversity of replicator species even in the face of the shortage of resources (for string replicators: nucleotides) that at some point inevitably occurs for any replicator population capable of exponential increase. The problem seems even more difficult to solve given the lack of other conceivable regulating factors (mainly due to the simple chemical kinetics of prebiotic systems). As we have shown in the previous section the hypercycle, the first solution attempt to the coexistence problem, may fail to be a solution for more than a few reasons. In this section, we present another simple chemical kinetics of template-based replication for a special case in which Darwinian selection does not occur and the system ends up in a “coexistence of everyone” regime. Note that in the “standard”, resource-regulated dynamics of template replication with a detailed chemical kinetics (per base elongation of sequences) a complementary pair of sequences counts as a single replicator, not the solitary strands. Consequently, four different nucleotides can maintain the coexistence of four complementary pairs of strands [16]. This poses a strict limit on the diversity of the coexistent replicator community in the lack of other regulating factors.

The simple kinetics of template-based replication is the following. Assume that replicator A reacts with resource R at rate K and produces another replicator A that remains associated with the original (AA) and there is an association-dissociation process between double and single strands (with rates k and k0, respectively):

A

+

R

→

K AA, A

+

A

→

k AA, AA

→

k0 A

+

A

For this type of dynamics to occur the self-association of molecules is needed, which is possible e.g., in case of palindromic sequences. An important and chemically plausible restriction is the order of the rate constant values: K < k0 << k, i.e., association is much more probable than dissociation. Note that the result of the replication is the complex AA which is inert to replication, thus this chemical machinery has an interesting feature: it is self-regulated—the higher the concentration of A, the stronger the self-regulating effect. The speed of replication is determined by the concentration of (dissociated) A as this can act as a template for the replication. As von Kiedrowski first pointed out in 1986 [75], chemically embodied artificial replicators (modified hexanucleotides) behave according to this type of kinetics (see [76] for a detailed analysis of dynamics and [77] for an overview of artificial self-replicators.)

This type of self-replication substantially alters the dynamics of the system; replicator concentrations undergo parabolic rather than exponential growth (cf. Equation (1) and see e.g., [75]):

.

x

=

rxp (10)

where x is the total concentration of A and AA, p = 1/2, r

=

ρK q

k0

2k and ρ denotes the concentration

of R.

In almost all experimentally investigated systems p

≈

1/2 (p = 1 corresponds to “standard” exponential dynamics, the 0 < p < 1 interval corresponds to the parabolic growth in a broader sense). It can be easily seen (cf. [17]) that this type of dynamics maintains the coexistence of an arbitrary number of replicators. By introducing the constraint of the total replicator concentration being 1, the dynamics of N different types of replicators with Malthusian parameters ritakes the following form:

. xi

=

rix_ip

−

xi N

∑

j=1 rjx_jp (11)

(12)

After a simple rearrangement we get . xi

=

x_ip ri

−

x1−p_i N

∑

j=1 rjx_jp !

<

x_ipri

−

Nrmaxx1−p_i (12)

where rmaxdenotes the largest Malthusian parameter and we used the following inequality:∑N_j=1rjx p j <

∑N

j=1rj<Nrmax. Obviously, any replicator has a positive growth rate if its concentration drops below a critical

threshold: xi< 1 N ri rmax _1−p1 (13) thus, at least theoretically, the advantage of rarity warrants the survival of everybody, whenever the replicators are in a competitive situation. This result was extended by Varga and Szathmáry [78] showing that there is a single internal and globally stable rest point of the system of Equation (11).

It is instructive to compare the solution of the dynamics of exponential growth, (Equation (1), or p = 1 in Equation (10)) and parabolic growth (Equation (10) with p = 1/2) for two replicators. While in exponential growth the ratio of the concentrations of the two replicators is an exponential function of time (resulting in competitive exclusion), in the parabolic case the ratio is:

x1(t) x2(t) = px1(0) + r1t/2 2 px1(0) + r2t/2 2 →_t→lim∞ x1(t) x2(t) = r 2 1 r₂2 (14)

meaning that the ratio of the equilibrium concentrations depends on the ratio of the squared Malthusian parameters (this is where the name of parabolic replication comes from). Interestingly, this result is in line with Darwin’s statement on the geometrical increase of populations: “The Struggle for Existence amongst all organic beings throughout the world [ . . . ] inevitably follows from their high geometrical power of increase . . . ” [79].

Note that for a large number of replicator types (N >> 1) the equilibrium concentration may be very low, so that stochastic drift can drive some replicator species extinct from the community even if their growth rates are positive. Despite this effect and the chemical constraint of the self-association of replicators, the system seems to be solving the problem of the maintenance of a critical diversity, because it is capable of storing a large amount of information (cf. Section3.1.1, the Eigen-model and information threshold). The beneficial ability of parabolic replication to maintain diversity is itself also its most serious fault: selection (in the Darwinian sense) is not possible in this type of system. Since better replicability does not mean competitive dominance, there is no room for evolution because selection is paralyzed. As we will see later, in this sense the parabolic replicator model is homologous to the model of replicator dynamics in chaotic flows.

The analysis can be extended by treating the dynamics of single and double strands separately. In this case the selection-free regime exists only above a critical total concentration [17,75]. The explanation is straightforward: at low concentrations, single strands do not frequently pair up to form double strands; thus, self-inhibition is weaker than cross-inhibition. The selection-free regime switches to selective upon assuming the (naturally present) exponential decay of single-strands [80]. Exponential decay is the most conservative assumption for decay of atoms and molecules including replicators, with the number of decaying replicators assumed to be proportional to the number of replicators present. The behaviour of the system may change if both single- and double-strand forms can exponentially decay (even if the decay rate of the double-strand is much smaller than that of the single-strands), as it is shown in [81] for two competing replicators. In this case the outcome of competition depends on the parameters, mainly on the influx of the resource and the decay rates of single- and double-strands. At high levels of resource influx, the replicator concentration is high and thus parabolic replication and coexistence remain possible, whereas below a critical level of influx, selection sets in and the superior replicator outcompetes all the others—the influx of nutrients acts as the control parameter of selectivity.

In an attempt to extend the investigations beyond the spatially homogenous case described by Equation (10), template-directed replication was assumed to occur on a surface [82]. Double-strands bind to the surface stronger than single-strands, which in terms of decay corresponds to the assumption that single-strands have a higher decay rate. A semi-analytic investigation of the corresponding model shows that two parabolic replicators competing for their building blocks on a mineral surface are subjects of Darwinian selection under a wide range of

(13)

parameter values. Differential adherence to the surface guarantees different decay rates, while the different influx of nutrients (the control parameter of different selective regimes) is due to different rates of resource supply.

Product inhibition leads to parabolic replication in non-enzymatic (artificial) replicator systems, resulting in parabolic amplification that switches off selection, consequently it cannot be the mechanism of evolutionary dynamics. A few investigations have revealed that Darwinian selection can still operate under rather specific circumstances (separate dynamics for single and double strands, exponential decomposition of strands and surface-bound template-based replication). Yet, at its present state parabolic replication seems to be of limited relevance in prebiotic evolutionary research, precisely because of its narrow scope for evolvability.

3.2. Models Assuming Structure

These models either assume a strict spatial order (usually on a 2D lattice) or a vesicular grouping of replicators into compartments. Either way, local interactions dominate the system, often coupled with multiple levels of dynamics and/or selection.

3.2.1. Spatial Hypercycle (SHC) and Compartmentalized Hypercycle (CHC)

There could be a way out of the evolutionary problem for the hypercycle, especially from the problem posed by fast replicating mutants. If the hypercycle is localized onto a surface or into compartments, then higher level selection can weed out the parasites.

Boerlijst and Hogeweg were the first to analyse a spatial version of the hypercycle [25,26]. The spatial version of the hypercycle alleviates the problem of the stability of large systems (consisting of more than four members), thus solving one of the problems of the non-spatial version. Furthermore, the spatial spiral patterns formed conveys some resistance to parasites. A pure parasite which appears after the formation of the spirals is ousted to the outer edges of the spiral, where it decays. Parasites can kill a spiral if they are introduced exactly to the middle of the spiral. Then neighbouring spirals take over the space and thus the parasite is destroyed or lingers in a kind of “cyst.” Inhibitory effects [83] and a gradient in the decay rate of molecules [84] can further fortify the spirals against parasites. The partial differential equation model of the same system exhibit less spiral formation and it is prone to parasites that kill the spirals [23,24].

Spatial arrangement and compartmentalization [27] seems to solve the problem of a fast replicating parasite. Shortcut mutants, however, still outcompete longer sequences [85]. The short cycles that cannot form spirals are at a selective disadvantage compared to ones that can form spirals and thus exclude parasites. So, a shortcut mutant can spread and then the system becomes prone to parasites. Evolutionarily the spatial hypercycle is as limited as the non-spatial one: once established no novel, disjunct hypercycle can invade the system [25].

Based on the above and to put it bluntly: the hypercycle does not work; it is evolutionarily unstable. This is an important and rather old result that has never been circumvented. Thus, the hypercycle cannot solve the problem of prebiotic information integration. Despite its rich literature, it is time to put this model to rest.

3.2.2. Coexistence in Open Chaotic Flow (OCF)

A prebiotic habitat without spatial structure is generally considered to be a set of replicators mixed intensively in an aquatic medium. However, the mixing of liquids is rarely perfect: peculiar spatial structures often emerge because of nonlinear phenomena in hydrodynamics. Open chaotic flows—one specific realization of the huge branch of complex hydrodynamical phenomena—are particularly interesting from our point of view. A flow is considered to be open if there is a continuous flux into and out of the observed region of the fluid medium and the recirculation time is much longer than the life time of the advected particles [86]. If the flow is time dependent but non-turbulent, then advected particles (replicators, in our case) move chaotically through this observed region following complex trajectories. The whole branch of possible trajectories then forms a fractal set and particles move along this fractal for a long time before they escape from the observed parcel of liquid (Figure4).

(14)

Figure 4.The motion of particles in an open chaotic flow. The blinking vortex-sink system is used for demonstration. It models the outflow from a large bath tub with two sinks that are opened in an alternating manner. Crosses denote the sinks. (a) Diverging trajectories of two particles that initially are close to each other. In this example, they even leave the bath in different sinks. (b) A snapshot on particles distributed along a fractal set (chaotic saddle) in open chaotic flow generated by the blinking vortex-sink system. (Based on [32].)

How do populations of replicators living and multiplying in an open chaotic flow behave? The dynamics of autocatalytic replicators in two-dimensional open chaotic flows has been derived by [87] as

.

x= −κx+νx−β (15)

where x is the number of replicators along the fractal filament and κ is the average escape rate from the observed region. The second term is the production proportional to the autocatalytic reaction rate defined above; β = (D−1)/(2−D) depends purely on the fractal dimension D of the filaments. Since 1 < D < 2 in two dimensional flows, then β > 0 and thus reproduction becomes more and more effective as the number of replicators decreases. This advantage of rarity is the consequence of the fractal structure itself, which therefore acts as a catalyst in generating the peculiar nonlinear population dynamics. The dynamics leads to a stable stationary equilibrium of replicators along the fractal set as x* = (ν/κ)1/(β+1)[87].

It is precisely this non-standard dynamics of replicators that leads to the advantage of local rarity, balancing the concentration differences of different replicators competing for the same limiting resource and thereby allowing for their coexistence. To formally approach this problem a two-dimensional flow is modelled around a cylinder. For a wide range of inflow velocities there is a periodic vortex detachment in the wake of the cylinder. The flow is, then, time periodic and purely deterministic but particles move chaotically in the wake of the cylinder [31]. The limiting resource flows constantly into the mixing region. Replicators were modelled as particles moving along the flow, decaying spontaneously and replicating if there are resource particles at sufficient density in the neighbourhood of a replicator. While competition for a single limiting resource leads to the survival of only the most effective replicator in a well-mixed habitat, simulations with the model have revealed that competitors coexist along the fractal set in the wake of the cylinder (Figure5).

(15)

Figure 5.The distribution of two replicators (red) B and (blue) C competing for the same resource material (white) in the wake of a cylinder. The flow is from left to right. The inset in (a) shows the time-dependence of the population numbers nB, nCand clearly indicates the approach to a steady state

of coexistence. A blow-up of the region indicated by a rectangle in (a) is seen in (b). B-s replication rate is 4/3 of the C-s, while decay rates are the same for the two species. Coexistence of 35 species is experienced in other simulations. (Based on [31].)

The results obtained by numerical simulations are reinforced by analysing the dynamics of competing replicators in open chaotic flows by mathematical means. The analysis makes use of the fact that there is only stretching and folding along the fractal set providing habitats for the competitors and thus they are arranged into more or less parallel stripes. In the simplest case with two competitors this leads to the dynamics

.

xi= −κxi−q(D−1)νx−β−1xi+qpi(x1/x2)νix−β

where q is a geometric constant, νiis the speed of the reaction front for replicator i (= 1,2), piis the probability that

replicator i is at the boundary of the habitat stripe and the resource and ν = p1ν1+ p2ν2is the average reaction

speed [88]. Due to dimensionality and symmetry reasons

p1 x1 x2 = x1 x2 α ω+ x1 x2 α, p2 x1 x2 =1−p1 x1 x2 (16)

where α and ω are positive constants. The positive fixed point of this system is stable if 0 < α < 1. For α = 1, ω = 1 which is the case if mixing is complete; then there is no coexistence. Similarly, for α > 1 the initially more abundant competitor wins (over dominance) [88]. Analysis of the individual-based (IB) model of this system pointed out that p_ireally follows Equation (15) and, whenever coexistence is observed, the inequalities 0 < α < 1 hold, just as the analysis forecasts. Competition rules can be defined in different ways in the IB model. Interestingly, competitors could not coexist when some rules were applied but these rules always imply α > 1, as expected. (For α = 1 either species 1 or 2 wins the competition depending on other parameters of the model.) Thus, IB models reveal that without knowing the detailed mechanism of competition we cannot determine the dynamical behaviour

(16)

of the replicators in open chaotic flows [88]. Moreover, although the analysis has been completed only for the two-species model so far, it is straightforward to extend it to many species with pi=ωixαi/∑

m

i=1ωixαi, where m

replicators compete along the fractal set. Using the method presented in [17] it can be shown that any number of replicators coexist if 0 < α < 1, see Section3.1.2. That is, the dynamics are formally equivalent to those of parabolic replication, although the subexponential term in the replication dynamics follows from imperfect mixing along the fractal set and not from the self-inhibition of replicators [89].

3.2.3. Trait Group Model and Kin Selection (TGM)

The requirement of maintaining an above-minimal level of information in a replicative system can be translated to the issue of slow replicators coexisting with faster ones. For structured populations, the first models dealing with coexistence originated from social ecology. There the problem translates to whether inferior replicators (slowly replicating but altruistic in terms of providing help to the group) can survive against selfish ones. In broader terms, the question is whether a useful replicator can successfully compete with its own mutants. Answering this question requires the introduction of multiple levels of selection.

Wilson constructed his trait group model (TGM, structured deme model) [39,90] to demonstrate that group selection at the supraindividual level can lead to the coexistence of altruistic individuals with inferior fitness compared to more selfish, opportunistic ones. Altruism in this context means favouring others at the expense of the fitness of the altruist. Wilson based his group dynamics on the reproduction and dispersal of multicellular organisms. Individuals compete within local groups called “trait groups”, which are smaller than a deme but larger than a single individual. After a generation, groups disband, individuals are dispersed and mixed and ultimately form new trait groups. The model effectively separates global genetic mixing at the deme level and local ecological interactions at the trait level. Ecological dynamics in the dispersal phase are different from those in the competitive, non-dispersal phase. This distinction effectively imposes structure on the population with a new, higher level of selection in effect at the deme level. Higher level success is indirectly linked to within-group selection governed by individual growth and replication rates, as traits affect the group’s fitness. Small group sizes, low migration rates and rapid removal of compartments infected with the selfish replicator all favour group selection [91].

Wilson’s model (see Figure6) explicitly assumes two types (two alleles of a gene) in the population. Wilson proves that “altruistic” individuals (those decreasing their own fitness in exchange for helping others) can coexist with selfish ones or increase their frequency if the variance within groups is higher than random, i.e., if the groups are not random samples of the population (see [91]). The slightest deviation from perfect genetic mixing and random reassortment could provide the required non-random distribution (for example, relatives tending to stay together) hence there is no need for compartmentation (physical separation) and Wilson’s model simplifies to any kin selection model fulfilling Hamilton’s equation [92], which requires a larger-than-zero relatedness for an altruistic trait to increase its frequency.

In Wilson’s model, individuals do not replicate within trait groups, only undergo selection, though this scenario can be replaced with replication and selection to comply with prebiotic replicators. Individuals sharing an altruistic trait correspond to cooperative replicators and individuals lacking this trait are selfish ones. Results in general are invariant for variable trait group sizes. In the general case, an all-defector population is stable against invasion by co-operators [93]. If, however, the defective replicator is in some sense dependent on the cooperative one, there is a wider scope for stable coexistence and an all-defector population may allow the invasion of co-operators. In a hypercyclic example (modelling defective interfering viruses, DIV), there are two outcomes: either the cooperative replicator wins if it helps itself more than it helps the defective one; or they will coexist and the co-operator cannot disappear [94]. Since a defector can only replicate by coupling with a co-operator (an assumption specific to the DIV model), any group consisting only of defectors perishes, ultimately increasing the overall frequency of co-operators in the population. Hence the all-defector group is evolutionarily unstable and any stochastic process generating co-operators (like mutation) would lead to their successful invasion, regardless of variable trait group sizes.

The trait group model directly relates to other models of the field. Increasing the diffusion rate in the MCRS (see Section3.2.5) leads to the TGM with the replicative phase taking place locally (due to surface-binding) but genetic mixing is intense (due to diffusion; [33]). If compartments divide instead of intense global mixing (and replicators independently replicate within the compartments), then the SCM emerges (see Section3.2.4).

The problem with the TGM is precisely what Maynard Smith recognized: a trait group, due to global mixing, does not form a true unit of selection but simply realizes kin selection (for example, by locally reproducing

(17)

organisms forming a kin). To tap the advantage of true group selection, one must maintain group structure continuously, so that selection at the group level can act against groups of inferior compositions. This is what the stochastic corrector model realizes.

Figure 6.Wilson’s trait group (or structured deme) model. (A) Individuals with different traits, black and white dots, form localized trait groups. (B) After ecological interactions and selection, (C) survivors are released to form a pool, where they can reproduce. (D) New groups form (After [39]).

3.2.4. Stochastic Corrector Model (SCM)

The SCM was designed [28,95] to remedy both the trait group model’s fuzzy compartmentation and lack of proper higher-level unit of selection and the hypercycle’s frailty against mutants [28,96,97].

The model implies the following steps (see Figure7): different replicator types proliferate within vesicles (compartments, possibly “simple cells”). When the internal replicator concentration reaches a limit, the cell splits in two due to naturally emerging physical constrains within its assumed lipid boundary. If there is an optimal composition of selfish and altruistic replicators (i.e., the compartment containing them has the highest fitness at the group level), then it can be proven that this optimal composition will always be present in the equilibrium distribution of various compositions. For this to occur, the following assumptions must be met:

• Template replicators replicate independently within vesicles (they are not hypercyclically coupled), competing for shared resources (nucleotides, enzymes, space).

• Replicators contribute to a common good (e.g., metabolism) such that they affect the selection of the whole group, thus compartment fitness (group replication rate) depends on composition.

• Replicators are essential: a group can only replicate if both replicator types are present.

• The redistribution of molecules during fission is not biased by any replicator type but is random for each molecule, hence they will follow a hypergeometric distribution in the offspring.

• Compartment size is relatively small and fission happens before equilibrium is reached in cells.

• Replicator migration (or other transposon-effects) from one compartment to another is negligible. The internal dynamics for the two replicator types x1and x2are:

. x1=ax1(x1x2) 1 4₋_dx 1−x1(x1K+x2) . x2=bx2(x1x2) 1 4₋_dx 2−x2(x1K+x2) (17)

where a and b are replication rate constants; a > b ensures competition. Both types’ degradation rates are d and the common carrying capacity K ensures that the internal population of a cell cannot grow to infinity. The exponent1_⁄₄

(18)

(in fact, any exponent smaller than1_⁄₂_{) ensures that in the limit both replicators go extinct (without group structure,}

of course), thus it acts as a worst-case assumption.

If vesicles were split after the internal equilibrium has settled, either both types or one of them would be extinct by the time of division according to dynamics, ultimately leading to the whole population losing information. Due to the stochastic nature of replication-degradation (“demographic stochasticity”), compartment fission and molecule reallocation, the optimal combination reappears and the distribution of probabilities for the different combinations can be calculated at group-level equilibrium [28]. Since replicators provide aspecific help to the group via a shared metabolism (like in the MCRS, see Section3.2.5) rather than direct help to other replicators (like in the DIV model), they are not forming a hypercycle.

Szathmáry and Demeter have applied the quasispecies model to the various compartments rather than to individual molecules (assuming a finite number of compartment types; [98])—emphasizing that in this case, compartmentalized groups of replicators follow their own internal dynamics depending on their internal states. It can be shown that internal dynamics and compartment splitting lead to a dominant equilibrium quasispecies [22] in which all compartment types coexist (and thus no replicator is lost; [28,29]). This condition is always met, as the stochastic replication and reassortment of molecules ensure that each compartment composition can turn into any other [28].

In conclusion, independently replicating different replicator types functionally complement each other within a compartment. Consequently, the compartment with the optimal composition of replicators has the best fitness. Stochasticity in replication and reallocation during fission generates the necessary variation, on which natural selection at the compartment level can act [28]. The SCM effectively realizes group selection that guarantees replicator coexistence.

Zintzaras and his co-workers compared compartmentalized hypercycles (CHC) with the SCM [27]. They have found that both compartment-systems can integrate information successfully, though the SCM is able to operate under higher deleterious mutation rates and settles at a lower equilibrium mutational load than the CHC, which, however, reaches better average fitness values. The important caveat here is that compartments only contained two-membered hypercycles. Scalability (maintaining more replicator types) obviously favours the SCM, as larger hypercycles are prone to shortcut mutations. Whether fusion of compartments increases diversity and stabilizes the system in general is not clear yet, though some theoretical results indicate positive effects [99]. Hubai and Kun, under more realistic assumptions, concluded that ~100 different genes could have survived in a simple protocell [30]. In vitro realizations prove that (transient) compartmentalization is effective in maintaining a functional diversity of replicators within vesicles [40].

It is worth noting that the SCM was the first model to explicitly assume all three subsystems of cellular organization and thus life, as defined by the Chemoton model [100–102]: it deals with the competition of different information carrying templates, while assuming an unspecified metabolism and a boundary subsystem that encloses the composition. A stochastic implementation of the chemoton model (approximating the SCM) proves that two different competing template replicators can coexist in a protocell [103]. The SCM also realizes multilevel selection properly, hence it is modelling the result of a major evolutionary transition in which competing elements of a lower level of selection are successfully integrated at a higher level [2,104].

Figure 7.The stochastic corrector model. The two replicator types are indicated with filled and empty small circles. Arrows indicate transitions, as individual compartments grow and divide. Compartments with green highlight (after division) contain the optimal composition of replicators (after [29]).

(19)

3.2.5. Metabolically Coupled Replicator System (MCRS) The Concept of a Metabolically Coupled RNA World

All the RNA-based models of prebiotic evolution are built on the assumption—which, at the time of writing, remains empirically unproven—that the template replication of the first RNA replicators was possible in the RNA World [98], even if it was slow and unreliable at the beginning. This assumption is indispensable because, for the evolutionary machinery to swing into action, populations of self-replicating entities are a necessary condition [2]. Thus far we know of no non-enzymatic RNA-replicating mechanism capable of copying reasonably long strands of RNA and it seems improbable that we can ever come up with one, so it is straightforward to postulate that RNA replication must have been enzymatic from the outset. Under the most likely prebiotic conditions, which surely did not provide the efficient peptide-based biochemical devices of recent cells, the necessary enzymatic help could not come from anywhere else but within the RNA World itself: RNA-dependent RNA polymerase ribozymes (or groups of ribozymes) that ignited prebiotic replicator evolution must have existed. However, recently synthesized RNA-dependent RNA polymerase (replicase) ribozymes [105–107] are highly complex and quite long (longer than the maximal length allowed by Eigen’s paradox). This is not surprising, given that RNA polymerization is a highly complex catalytic process that includes the ligation of nucleotides and the binding of template and copy strands, as well as their separation at the end of the process. Since ligation does not require a long ribozyme sequence [108], it is template binding and daughter-strand separation that necessitate the help of more complex and longer RNA polymerase ribozymes. Such a ribozyme complex has a very low chance of assembling in a short time, even from the huge random RNA population that we may assume to be produced by abiotic reactions in places such as near hydrothermal vents at the bottom of the prebiotic ocean [109]. However, considering the enormous amount of time at the disposal of prebiotic attempts to boot up RNA replication using a huge initial RNA pool with a fast turnover of random sequences, the assumption that some slow and vague ribozyme-aided RNA replication mechanism appeared by chance at some point and took the first evolutionary steps towards life seems not to be entirely remote. Such a self-replicating ribozyme replicase complex has not yet been discovered experimentally but neither have we spent eons of time looking for it in a practically infinite sequence pool. Wu and Higgs [110] suggest a simple model for a self-inducing evolutionary mechanism capable of producing replicase ribozymes with relatively high catalytic activity, starting from a very inefficient “primordial” replicase.

With the ribozyme-based replicase function in place, another necessary condition of self-replication must be met: a continuous supply of activated nucleotide monomers in spite of the rapidly increasing monomer consumption by the exponentially growing RNA replicase population. Given that abiotic monomer production must have been very slow (if it occurred at all) without enzymes under prebiotic circumstances [109,111], we must assume that metabolism was also catalysed and the necessary catalytic help for monomer production came from within the random pool of RNA sequences as well. Any sequence that happened to have some catalytic activity capable of speeding up, at least to some extent, a metabolic reaction of the actual reaction network producing the monomers offered an indirect selective edge to the replicase, which, therefore, “adopted” the new sequence by giving it a replication advantage in exchange for the metabolic one received. Keeping the metabolic machinery running requires all the ribozymes of the system that contribute to the metabolic reaction network with their enzymatic activities to remain robustly coexistent. This is not an easy task for different species of replicators competing for the same limiting resource (the monomer pool) that they depend on for their replication. The ecological principle of competitive exclusion (the Gause-principle, see Section2.1) permits the survival of just a single replicator species on a single limiting resource [15] but a single ribozyme cannot, obviously, catalyse all the chemical reactions of even a simple metabolic network. The mutual dependence of each metabolically indispensable replicator species on the presence of all the others may seem to alleviate the exclusion principle but it is easy to show that even the mandatory cooperation of the replicators is not sufficient for that to happen if the system is well-mixed without any local inhomogeneity permitted. With complete spatial homogeneity (and/or global mass interaction of the replicators and the metabolites they use and produce) assumed, the replicator system follows the simple mean-field dynamics

.

xi=rixiM(x) −Φi(x) (18)

where x = (x1, x2, . . . , xs) is the population density vector for the s different, metabolically essential replicator

species with replication rate vector r; M(x) is the monomer supply provided by metabolism at ribozyme densities x; and φ is an outflow vector ensuring that the total density∑s

(20)

In accordance with the assumption regarding the metabolic role of essential ribozymes, the metabolic function M(x) must take the value 0 if any one of xiis zero. A simple realization of this constraint is using a metabolic

function proportional to the geometric mean of replicator densities:

M(x) =c s

∏

i=1 xi !1_s (19) where c is a positive constant. Since the metabolic effect of the actual monomer supply is the same for all replicator species at any particular moment, their relative (instantaneous) growth rates are determined by their (constant) rigrowth parameters alone, i.e., the metabolic function has no effect on the order of the growth rates riM(x) at

any time. Therefore, the replicator with the highest growth parameter excludes all the others, in agreement with the Gause-principle.

The dynamics of the system are radically different, however, if the highly unrealistic assumption of complete global homogeneity postulated in the mean-field model is relaxed. The growing family of the Metabolically Coupled Replicator System (MCRS) models offer a chemically and ecologically feasible spatial mechanism for the robust maintenance of a metabolically active set of different ribozymes attached to mineral surfaces, assuming that

1. the replicase function is given: any RNA sequence is capable of producing a copy of itself by template replication if it has a sufficient concentration of monomers at its disposal.

2. all the members of the metabolic replicator set are indispensable in running a simple metabolic reaction network (metabolism) producing the monomers; if any one replicator type is missing from the set, monomers are not produced at all and the system goes extinct.

3. the replicators are attached to a 2D mineral surface on which their horizontal mobility is limited; replicators leaving the surface are lost to the system (replicator “death”).

4. nutrient compounds (external initial substrates of the metabolic reaction network) are supplied from the third spatial dimension in excess.

5. the metabolites (substrates and products of the reactions that the replicators catalyse) are also attached to the surface, on which they may diffuse to a certain distance d before either being detached from the surface and lost, or used in a metabolic reaction or in replication (Figure8).

6. the metabolic contribution to the probability of a certain replicator being copied is dependent on the local ribozyme composition of its metabolic neighbourhood (i.e., within the distance d defining the metabolic neighbourhood of the focal replicator); only metabolically complete neighbourhoods (which have at least one copy of each essential ribozyme) allow for replication.

7. the metabolically active set of ribozyme replicators may have enzymatically inactive parasites, i.e., replicators which do not contribute to monomer production but use the monomers produced by the cooperating replicators for their own replication.

Replicator Diversity and Ecological Stability in MCRS Models

Unlike the mean-field version, the stochastic (lattice) implementation of the spatially explicit MCRS model keeps all the metabolic replicators coexistent and shows robust ecological stability (Figure9) [33,36,37]. Limited mobility and localized interactions of the replicators and the metabolites allow local group selection to operate: parts of the community lacking any one of the metabolic replicators are doomed to local extinction, pre-empting the habitat for invasion from nearby, metabolically complete neighbourhoods. The system is also resistant to its parasites in the sense that, even though parasitic replicators can invade the metabolically cooperating ribozyme community, they cannot destroy the cooperation altogether, because the damage they inflict on the system remains local and ephemeral. Wherever parasites take over locally, they stop monomer production and thus, indirectly, they commit suicide by disrupting their own monomer supply and starving themselves to death. This result is in line with that of Szostak et al. [112], whose model predicts parasite invasion in a replicase ribozyme population but without the parasites destroying the system. The coexistence of a replicase ribozyme and its parasitic quasispecies in a multilevel selection regime has also been demonstrated in [113].

(21)

Figure 8. Basics of the MCRS algorithm. (1) Metabolic support of the four replicators in the von Neumann neighbourhood of an empty site (black X). Red, green and blue Ss denote different, metabolically active replicator species, the yellow P stands for a parasitic replicator. (2) The replicator taking the empty site by the next generation is determined by a random draw, with the empty site to remain empty having a constant claim and the claim of each adjacent replicator depending on its own replicability (R) and the metabolic support it receives from within its own 3×3 metabolic neighbourhood. (3) Each replication step is followed by replicator diffusion, implemented as D elementary steps of the Toffoli-Margolus algorithm [114] at random positions of the lattice.

Figure 9.Persistence of the MCRS at different sizes of the metabolic and the replication neighbourhood. Neighbourhood sizes are given as side lengths of a square-shaped section of the lattice that is centred on the focal site; N stands for von Neumann neighbourhood. The i/j values inside the table cells specify the numbers of persistent/extinct systems out of five replicate simulations; grayscale values are system densities in percentages of sites occupied by replicators within the lattice after 10,000 generations. Panel (a) D = 0, Panel (b) D = 1, Panel (c) D = 4 and Panel (d) D = 100 (Based on [36]).