Satisfiability and Evolution
Adi Livnat, Christos Papadimitriou, Aviad Rubinstein, Greg Valiant,
“One curious aspect of evolution is that everybody thinks he understands it!”
Jacques Monod
“Nothing makes sense in life except in the light of evolution”
Waddington’s Experiment (1952)
Generation 0
Generation 1, Temp: 40 °C
Generation 2, Temp: 40 °C
Generation 3, Temp: 40 °C
(…)
Generation 20, Temp: 40 °C
Surprise!
Generation 20, Temp: 20 °C
Genetic Assimilation
Is There a Genetic Explanation?
Suppose:
• The “red phenotype” depends on genes x_1, …, x_n plus h = “high temp”
• “red” = F(x, h), Boolean
How do Allele Frequencies Change from Generation to Generation?
• Suppose x_i = 1 with probability p_i
• Next generation?
A Genetic Explanation?
Wanted: Boolean function F(x, h) with these properties:
• Initially, Pr_{x ~ p[0]}[F(x, h = 0)] ≈ 0%
• Then Pr_{x ~ p[0]}[F(x, 1)] ≈ 15%
• After breeding, Pr_{x ~ p[1]}[F(x, 1)] ≈ 60%
A Genetic Explanation!
“red” = “x_1 + x_2 + … + x_10 + 3h ≥ 10”
• T = 0, ~0%
• T = 1, ~15%
• T = 2, ~60%
• T = 20, ~99%
• h = 0, ~25%
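The threshold model above can be checked by exact enumeration. The following is a minimal sketch, assuming every allele starts at frequency 0.5 and that breeding keeps only red individuals (0-1 selection), with the next generation drawn from the resulting marginals. Neither assumption is stated on the slides, so the numbers need not match the slide's percentages, and the names `is_red`, `prob_red`, and `breed` are illustrative.

```python
from itertools import product

N_GENES = 10  # genes x_1..x_10 in the threshold model


def is_red(x, h):
    """Threshold phenotype from the slide: x_1 + ... + x_10 + 3h >= 10."""
    return sum(x) + 3 * h >= 10


def genotype_prob(x, p):
    """Probability of genotype x under the product distribution p."""
    prob = 1.0
    for xi, pi in zip(x, p):
        prob *= pi if xi else 1.0 - pi
    return prob


def prob_red(p, h):
    """Probability that a random individual is red at temperature flag h."""
    return sum(genotype_prob(x, p)
               for x in product((0, 1), repeat=N_GENES) if is_red(x, h))


def breed(p, h=1):
    """One generation of 0-1 selection (assumed): p_i' = Pr_p[x_i = 1 | red]."""
    total = prob_red(p, h)
    new_p = []
    for i in range(N_GENES):
        joint = sum(genotype_prob(x, p)
                    for x in product((0, 1), repeat=N_GENES)
                    if x[i] == 1 and is_red(x, h))
        new_p.append(joint / total)
    return new_p


p = [0.5] * N_GENES  # assumed starting allele frequencies
for generation in range(4):
    # columns: generation, Pr[red | h = 1], Pr[red | h = 0]
    print(generation, round(prob_red(p, 1), 3), round(prob_red(p, 0), 3))
    p = breed(p)
```

Under these assumptions the red fraction at high temperature rises from one generation to the next, and the fraction that is red without the heat shock rises with it, which is the qualitative pattern of genetic assimilation.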
Stepping Back now:
Evolution Today
• A powerful and prestigious theory
• Founded on the ideas of Darwin and Mendel
• Informed by sophisticated math models developed in the early 20th century (mainly, population genetics)
Yet many important mysteries remain:
• What is the role of sex/recombination?
• Why is there so much genetic diversity within species?
Now back to Waddington’s Experiment
• The “red phenotype” seems to be a complex trait which actually does emerge…
So, how about generalizing it
• We have an arbitrary Boolean function of genes (no environmental variable h)
• Suppose the satisfying genotypes have a small fitness advantage (1 vs. 1 + ε, say)
• (Instead of a 0–1 advantage as in Waddington’s experiment)
Perhaps Monotone Functions?
• Recall: p_i' = Pr_p[x_i = 1 | F(x, h)]
• Now, p_i' ≈ (1 − ε) × p_i + ε × Pr_p[x_i = 1 | F(x)]
• If F(x) is monotone, and x_i has some influence, then p_i' > p_i
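The weak-selection update on this slide can be tried on a small monotone function. The sketch below applies the slide's approximation, p_i' ≈ (1 − ε) × p_i + ε × Pr_p[x_i = 1 | F(x)], by exact enumeration; the choice of majority-of-3 as F, the starting frequencies, and the function names are illustrative assumptions.

```python
from itertools import product


def genotype_prob(x, p):
    """Probability of genotype x under the product distribution p."""
    prob = 1.0
    for xi, pi in zip(x, p):
        prob *= pi if xi else 1.0 - pi
    return prob


def conditional_marginals(F, p):
    """Return Pr_p[x_i = 1 | F(x) = 1] for every gene i, by enumeration."""
    n = len(p)
    sat = 0.0
    joint = [0.0] * n
    for x in product((0, 1), repeat=n):
        if F(x):
            w = genotype_prob(x, p)
            sat += w
            for i in range(n):
                if x[i]:
                    joint[i] += w
    return [j / sat for j in joint]


def weak_selection_step(F, p, eps):
    """Slide's approximation: p_i' = (1 - eps) * p_i + eps * Pr_p[x_i = 1 | F]."""
    cond = conditional_marginals(F, p)
    return [(1 - eps) * pi + eps * ci for pi, ci in zip(p, cond)]


def majority3(x):
    """A monotone function in which every gene has some influence."""
    return sum(x) >= 2


p = [0.3, 0.5, 0.7]  # hypothetical allele frequencies
p_next = weak_selection_step(majority3, p, eps=0.05)
print(p_next)
```

Every frequency moves up, as the last bullet claims for monotone F when each x_i has influence.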
Monotone Functions (cont.)
• After exponentially many steps, done
Theorem: n / (ε×σ0) steps suffice
σ0 = initial satisfaction probability
Proof: Boolean Fourier analysis
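To get a feel for the step count, one can iterate the selection update on a small monotone function and compare with n / (ε × σ0). This is only a sketch under assumptions not taken from the slides: F is the AND of 3 genes, the update is the exact fitness-weighted one for fitness 1 + ε F(x), and "done" is taken to mean satisfaction probability at least 0.99.

```python
from itertools import product


def genotype_prob(x, p):
    """Probability of genotype x under the product distribution p."""
    prob = 1.0
    for xi, pi in zip(x, p):
        prob *= pi if xi else 1.0 - pi
    return prob


def sat_prob(F, p):
    """Satisfaction probability Pr_p[F(x) = 1]."""
    return sum(genotype_prob(x, p)
               for x in product((0, 1), repeat=len(p)) if F(x))


def selection_step(F, p, eps):
    """Exact update for fitness w(x) = 1 + eps * F(x): p_i' = E[x_i w] / E[w]."""
    n = len(p)
    mean_w = 0.0
    num = [0.0] * n
    for x in product((0, 1), repeat=n):
        w = genotype_prob(x, p) * (1.0 + eps * F(x))
        mean_w += w
        for i in range(n):
            if x[i]:
                num[i] += w
    return [v / mean_w for v in num]


def steps_until(F, p, eps, target=0.99, max_steps=100000):
    """Iterate selection until Pr[F] >= target; return (steps, history)."""
    history = [sat_prob(F, p)]
    while history[-1] < target and len(history) <= max_steps:
        p = selection_step(F, p, eps)
        history.append(sat_prob(F, p))
    return len(history) - 1, history


def and3(x):
    """AND of three genes, returned as 0/1 (an illustrative monotone F)."""
    return int(all(x))


n, eps = 3, 0.1
p0 = [0.5] * n
sigma0 = sat_prob(and3, p0)
steps, history = steps_until(and3, p0, eps)
print("steps taken:", steps, " n / (eps * sigma0):", n / (eps * sigma0))
```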
A parenthesis: Genetic Algorithms
• My road to Evolution
• In life, sex is successful and ubiquitous
• Why do GAs perform so poorly when compared to Simulated Annealing?
• Answer: Evolution is not a good metaphor for heuristics
Back to monotone functions:
But wait a minute…
• Why are we assuming a product distribution?
• Isn’t there linkage (correlation) in genetics?
• For example, imagine F = … (x_1 = x_2) …
• Pr[x_1 = x_2] = ¾ ≠ ½ × ½ + ½ × ½
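One way to arrive at the ¾ figure is sketched below. It assumes, hypothetically, 0-1 selection for x_1 = x_2 from a uniform start, followed by one round of free recombination in which each gene is copied from one of two random parents. The slides do not spell out this scenario, so treat it as an illustration of linkage rather than the speaker's derivation.

```python
from fractions import Fraction
from itertools import product


def offspring_distribution(parent_dist):
    """Offspring genotype distribution for two loci under free recombination.

    Two parents are drawn independently from parent_dist; each locus of the
    child is copied from either parent with probability 1/2, independently.
    """
    child = {}
    for (g1, w1), (g2, w2) in product(parent_dist.items(), repeat=2):
        parents = (g1, g2)
        for src1, src2 in product((0, 1), repeat=2):
            genotype = (parents[src1][0], parents[src2][1])
            weight = w1 * w2 * Fraction(1, 4)
            child[genotype] = child.get(genotype, Fraction(0)) + weight
    return child


# Assumed survivors of 0-1 selection for x_1 = x_2: genotypes 00 and 11.
selected = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}
child = offspring_distribution(selected)

prob_equal = sum(w for g, w in child.items() if g[0] == g[1])
p1 = sum(w for g, w in child.items() if g[0] == 1)
p2 = sum(w for g, w in child.items() if g[1] == 1)
# What a product distribution with the same marginals would predict:
product_equal = p1 * p2 + (1 - p1) * (1 - p2)
print(prob_equal, product_equal)
```

The true probability of x_1 = x_2 exceeds what the marginals alone predict, which is exactly the correlation a product distribution ignores.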
Nagylaki’s Theorem
Theorem [N 1993]: After O(log n) generations, LD = O(ε)
Why? Trace genes in ancestor tree
[Figure: ancestor tree, labeled “3 log n”]
Bounding Linkage Disequilibrium
• But if there is selection, sampling is not quite uniform
• ~ε bias introduced at each generation
• Therefore, LD = O(ε log n)
So, Assumption Justified!
• OK to assume product distribution.
• (Since fitness values are 1 and 1 + ε)
Arbitrary Boolean Functions!
Main Theorem: Any Boolean function of genes which confers a small evolutionary advantage will be
Wait a minute, this is wrong!
• XOR: Suppose that F = (x1 ≠ x2)
• What will happen if we start uniform?
• No change!
• Also, if, e.g., F = “exactly k out of n are true” and we start at p_i = k/n
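Both counterexamples can be checked numerically. The sketch below applies the weak-selection approximation from the monotone-functions slide, p_i' ≈ (1 − ε) × p_i + ε × Pr_p[x_i = 1 | F(x)], and confirms that the stated starting points do not move. The helper names and the concrete sizes (n = 4, k = 1) are illustrative assumptions.

```python
from itertools import product


def genotype_prob(x, p):
    """Probability of genotype x under the product distribution p."""
    prob = 1.0
    for xi, pi in zip(x, p):
        prob *= pi if xi else 1.0 - pi
    return prob


def conditional_marginals(F, p):
    """Return Pr_p[x_i = 1 | F(x) = 1] for every gene i, by enumeration."""
    n = len(p)
    sat = 0.0
    joint = [0.0] * n
    for x in product((0, 1), repeat=n):
        if F(x):
            w = genotype_prob(x, p)
            sat += w
            for i in range(n):
                if x[i]:
                    joint[i] += w
    return [j / sat for j in joint]


def weak_selection_step(F, p, eps):
    """p_i' = (1 - eps) * p_i + eps * Pr_p[x_i = 1 | F]."""
    cond = conditional_marginals(F, p)
    return [(1 - eps) * pi + eps * ci for pi, ci in zip(p, cond)]


def xor2(x):
    return x[0] != x[1]


def exactly_k(k):
    """F = 'exactly k of the genes are 1'."""
    return lambda x: sum(x) == k


xor_next = weak_selection_step(xor2, [0.5, 0.5], eps=0.1)
k_next = weak_selection_step(exactly_k(1), [0.25] * 4, eps=0.1)  # p_i = k/n
print(xor_next, k_next)
```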
Main Theorem: Precise statement
• To form the next generation:
1. Sample N individuals from the current product distribution {p_i}; let the empirical distribution be {q_i} (gets you unstuck in XOR etc.)
2. Then apply Selection: pi’ ≈ q
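The two-step process can be sketched as follows. Since the selection formula is cut off on the slide, this sketch substitutes the exact fitness-weighted update for fitness 1 + ε F(x); that choice, the function names, and the parameter values are assumptions made for illustration.

```python
import random
from itertools import product


def genotype_prob(x, p):
    """Probability of genotype x under the product distribution p."""
    prob = 1.0
    for xi, pi in zip(x, p):
        prob *= pi if xi else 1.0 - pi
    return prob


def sample_frequencies(p, N, rng):
    """Step 1: empirical allele frequencies q of N sampled individuals.

    Under a product distribution the per-gene counts are independent
    Binomial(N, p_i) draws, so each gene is sampled separately.
    """
    return [sum(rng.random() < pi for _ in range(N)) / N for pi in p]


def select(F, q, eps):
    """Step 2 (assumed form): p_i' = E_q[x_i w(x)] / E_q[w(x)], w = 1 + eps F."""
    n = len(q)
    mean_w = 0.0
    num = [0.0] * n
    for x in product((0, 1), repeat=n):
        w = genotype_prob(x, q) * (1.0 + eps * F(x))
        mean_w += w
        for i in range(n):
            if x[i]:
                num[i] += w
    return [v / mean_w for v in num]


def next_generation(F, p, N, eps, rng):
    """One sample-then-select generation; returns (p', q)."""
    q = sample_frequencies(p, N, rng)
    return select(F, q, eps), q


def xor2(x):
    return int(x[0] != x[1])


rng = random.Random(0)
p = [0.5, 0.5]  # the point where selection alone is stuck on XOR
p_next, q = next_generation(xor2, p, N=101, eps=0.1, rng=rng)
print(q, p_next)
```

With an odd N the empirical frequencies cannot sit exactly at ½, so the sampling step always moves the population off the XOR fixed point before selection acts.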
Main Theorem: Detailed Parameters
• n = number of genes involved
• σ0 = initial satisfaction probability
• ε = selection strength, must be < 1/n
• N = population > n^3 / (σ0)^4
ε < 1/n ?!?
• How come a theorem about the effectiveness of selection seems to fail when selection is strong?
• Intuitive explanation: In the interior of the cube the process is close to gradient descent
Main Open Problem
Greg’s Conjecture: Result remains true even if the fitness is 0–1
• Evidence from experiments
Main Theorem: Proof
• We want to show that the sample-select process leads to a satisfying population
• We track the expected fitness f(p)
• Core of the proof: We bound the variance introduced by sampling by the expected fitness increase in the selection step:
pi’ = q
Main Theorem: Proof (cont.)
p_i' = q_i + ε × Pr_q[x_i = 1 | F(x)]
• We show:
E_q[(f(q) − f(p))^2] ≤ E[f(p') − f(q)] / (N(1 − nε))
(left side: variance introduced; right side: fitness increase)
Main Theorem: Proof (cont.)
E_q[(f(q) − f(p))^2] ≤ E_q[Σ_i (F^q_{i})^2] / N
(Σ_i (F^q_{i})^2 is the linear mass of the q-biased Fourier transform of f)
Main Theorem: Proof (cont.)
E_q[(f(q) − f(p))^2] ≤ Σ_i (F^q_{i})^2
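The linear mass can be computed directly for small functions. The slides do not state a normalization, so this sketch assumes the standard orthonormal q-biased basis, in which the degree-1 coefficient for gene i is E_q[f(x) × (x_i − q_i) / √(q_i(1 − q_i))]; the function names are illustrative.

```python
from itertools import product
from math import sqrt


def genotype_prob(x, q):
    """Probability of genotype x under the product distribution q."""
    prob = 1.0
    for xi, qi in zip(x, q):
        prob *= qi if xi else 1.0 - qi
    return prob


def level_one_coefficient(f, q, i):
    """Degree-1 q-biased Fourier coefficient of f for gene i.

    Assumes 0 < q_i < 1 so that the basis function is well defined.
    """
    scale = sqrt(q[i] * (1.0 - q[i]))
    return sum(genotype_prob(x, q) * f(x) * (x[i] - q[i]) / scale
               for x in product((0, 1), repeat=len(q)))


def linear_mass(f, q):
    """Sum over genes of the squared degree-1 coefficients."""
    return sum(level_one_coefficient(f, q, i) ** 2 for i in range(len(q)))


def dictator(x):
    return x[0]


def xor2(x):
    return int(x[0] != x[1])


print(linear_mass(dictator, [0.3, 0.6]))
print(linear_mass(xor2, [0.5, 0.5]))
```

XOR has zero linear mass at the uniform point, which matches the earlier observation that selection alone leaves the uniform distribution unchanged for XOR.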
Main Theorem: Proof (cont.)
• Next: the total effect of the variance steps is small
• Idea: Σ_t [f(q_{t+1}) − f(p_t)] is a martingale
• But no obvious upper bound
• Need specialized martingale inequality
Main Theorem: Proof (cont.)
• Finally: the process gets so close to the boundary that the increase is minuscule
• Random walk (with absorbing boundaries!) will eventually get stuck at a vertex of the cube
• End of proof
Discussion
• An interesting and nontrivial algorithmic fact about satisfiability
• Remember Greg’s conjecture
• Parameter bounds should be very improvable
• Monotone functions bound essentially tight
Implications for Evolution?
• Interesting new mechanism for the emergence of complex traits
Implications for Evolution? (cont.)
• [CLPV, PNAS 2014]: Evolution under sex is tantamount to a repeated coordination game played by the genes: the strategies are the alleles, the probabilities are the frequencies in the population, the utility is fitness, and the game is played through multiplicative weights!
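The correspondence can be illustrated for two genes under the product-distribution assumption used earlier in the talk. The sketch below is an illustration with hypothetical fitness values, not the construction from [CLPV, PNAS 2014]: it checks that the fitness-weighted frequency update for one gene coincides with a multiplicative weights step whose utility is that allele's expected fitness differential.

```python
def selection_update(x, y, W):
    """Allele frequencies of gene A after selection.

    x, y are the allele frequencies of genes A and B; W[i][j] is the fitness
    of the genotype carrying allele i of A and allele j of B.
    """
    marginal = [sum(yj * W[i][j] for j, yj in enumerate(y))
                for i in range(len(x))]
    mean_fitness = sum(xi * mi for xi, mi in zip(x, marginal))
    return [xi * mi / mean_fitness for xi, mi in zip(x, marginal)]


def mwu_update(x, y, delta, eps):
    """Multiplicative weights step: x_i' proportional to x_i * (1 + eps * u_i),
    where u_i is the expected payoff of allele i against frequencies y."""
    utilities = [sum(yj * delta[i][j] for j, yj in enumerate(y))
                 for i in range(len(x))]
    weights = [xi * (1.0 + eps * ui) for xi, ui in zip(x, utilities)]
    total = sum(weights)
    return [w / total for w in weights]


eps = 0.01
delta = [[1.0, -2.0], [0.5, 3.0]]  # hypothetical payoff (fitness differential)
W = [[1.0 + eps * d for d in row] for row in delta]  # fitness = 1 + eps * payoff
x = [0.4, 0.6]  # allele frequencies of gene A
y = [0.7, 0.3]  # allele frequencies of gene B

by_selection = selection_update(x, y, W)
by_mwu = mwu_update(x, y, delta, eps)
print(by_selection, by_mwu)
```

Both genes receive the same payoff W[i][j], which is what makes the game a coordination game.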
Soooo, Evolution and TCS
Remember the three mysteries of Evolution (not the only ones, btw):
•What is the role of sex/recombination?
•Why is there so much genetic diversity within species?
• How do complex traits emerge?