• No results found

Stochastic Gene Expression in Prokaryotes: A Point Process Approach

N/A
N/A
Protected

Academic year: 2021

Share "Stochastic Gene Expression in Prokaryotes: A Point Process Approach"

Copied!
63
0
0

Loading.... (view fulltext now)

Full text

(1)

Stochastic Gene Expression in Prokaryotes:

A Point Process Approach

Emanuele LEONCINI INRIA Rocquencourt - INRA Jouy-en-Josas Mathematical Modeling in Cell Biology

March 27th 2013

(2)

Introduction Central role of protein production in prokaryotes

Central role of protein production

Proteins are the core of biologic processes: enzymes, DNA replication

machinery, . . .

∼50% of the bacteria dry weight

∼3.5 millions of proteins in each cell

∼2000 types of proteins produced at any time at any growth

condition (volume growth)

proteins ranging from few dozens up to 105

A highly consuming process:

at each generation, bacteria has to duplicate all proteins more than 85% of cell resources

(3)

Introduction Central role of protein production in prokaryotes

Central role of protein production

Proteins are the core of biologic processes: enzymes, DNA replication

machinery, . . .

∼50% of the bacteria dry weight

∼3.5 millions of proteins in each cell

∼2000 types of proteins produced at any time at any growth

condition (volume growth)

proteins ranging from few dozens up to 105

A highly consuming process:

at each generation, bacteria has to duplicate all proteins more than 85% of cell resources

(4)

Stochasticity in protein production: experimental viewpoint

ADk cytoplasm protein1

1Yuichi Taniguchi et al.Science(2010), pp. 533–538.

(5)

Introduction Stochasticity and living organisms

Stochasticity in bacteria

Sources of stochasticity:

bacterial cytoplasm: disordered medium

main cellular motility mechanism: diffusion in a stiff medium most cellular processes require the encounter of macromolecules (Poisson process)

Protein production: inherently stochastic process.

(6)

Stochasticity in bacteria

Sources of stochasticity:

bacterial cytoplasm: disordered medium

main cellular motility mechanism: diffusion in a stiff medium most cellular processes require the encounter of macromolecules (Poisson process)

Protein production: inherently stochastic process.

(7)

Model

Model

(8)

Central Dogma of molecular biology

DNA

TRANSCRIPTION

mRNA

protein

TRANSLATION
(9)

Model

3-stage model: activation

λ−1 λ+1

µ2 µ3

λ2 λ3

(10)

Model

3-stage model: transcription

λ−1 λ+1

µ2 µ3

λ2

λ3

(11)

Model

3-stage model: translation

λ−1 λ+1

µ2 µ3

λ2 λ3

(12)

3-stage model

2

λ−1

λ+1 µ2 µ3

λ2 λ3

2Johan Paulsson. Physics of life reviews(2005), pp. 157–175.

(13)

Model

3-stage model

Y(t)∈ {0,1} M(t)∈N P(t)∈N ∅ ∅ λ2 µ2 λ3 µ3 λ−1 λ+1 µ2 µ3 λ2 λ3

Goal: Characterize mean and variance of the number of

proteinsP at equilibrium

(14)

Model

Classic models: Paulsson’s survey

3

Properties

Assumption: each step hasexponentially distributed duration

Markovian description of the protein production

Tools

Markov processes Fokker-Plank equations

explicit analytic formulas of mean and variance as function of the main parameters

biologists classically use models because of the explicit char-acterization of protein fluctuations

3Johan Paulsson. Physics of life reviews(2005), pp. 157–175.

(15)

Model

Classic models: Paulsson’s survey

3

Properties

Assumption: each step hasexponentially distributed duration

Markovian description of the protein production

Tools

Markov processes Fokker-Plank equations

explicit analytic formulas of mean and variance as function of the main parameters

biologists classically use models because of the explicit char-acterization of protein fluctuations

3Johan Paulsson. Physics of life reviews(2005), pp. 157–175.

(16)

Classic models: Paulsson’s survey

3

Properties

Assumption: each step hasexponentially distributed duration

Markovian description of the protein production

Tools

Markov processes Fokker-Plank equations

explicit analytic formulas of mean and variance as function of the main parameters

biologists classically use models because of the explicit char-acterization of protein fluctuations

3Johan Paulsson. Physics of life reviews(2005), pp. 157–175.

(17)

Model

Classic approach:

Exponential assumption:

Not each described process has an exponentially distributed duration.

The duration of the following processes is not exponential protein elongation

mRNA elongation

protein degradation

(18)

Classic approach:

Exponential assumption:

Not each described process has an exponentially distributed duration.

The duration of the following processes is not exponential

protein elongation

mRNA elongation

protein degradation

(19)

Model

Protein chain elongation

30S 50S mRNA tRNA (charged) tRNA fMet tRNA (uncharged)

(20)

Model

Protein chain elongation

30S 50S mRNA tRNA (charged) tRNA fMet tRNA (uncharged)

(21)

Model

Protein chain elongation

30S 50S mRNA tRNA (charged) tRNA fMet tRNA (uncharged)

(22)

Model

Protein chain elongation

30S 50S mRNA tRNA (charged) tRNA fMet tRNA (uncharged)

(23)

Model

Protein chain elongation

Elongation: exponentially distributed elementary steps

the sum of exponential random variables is

not exponential

Large number of identical steps (N ≈400 a.a.)

elongation time described as normal ran-dom variable

(24)

Protein chain elongation

Elongation: exponentially distributed elementary steps

the sum of exponential random variables is

not exponential

Large number of identical steps (N ≈400 a.a.)

elongation time described as normal ran-dom variable

(25)

Model

Classic approach: a not completely satisfying description

The duration of the following processes is not exponential protein elongation

mRNA elongation

protein degradation

New description

Gene expression with non-exponential steps requires a new approach.

(26)

Marked Poisson Point Process (MPPP):

new description of gene expression

(27)

MPPP MPPP: mathematical tools & application

General distributions: introducing MPPP

F2(dt)

λ2

Assumptions:

mRNA births (sn) follow a Poisson process of parameterλ2

mRNA lifetimes (σ2,n) have distributionF2(dt)

(28)

MPPP MPPP: mathematical tools & application

mRNA

How many mRNAs at time t ?

t y =−s +t birth lifetime E1 σ2,1 s1+σ2,1 E2 σ2,2 s2+ σ2,2 E3 σ2,3 s3+σ2,3

Ei exponential random variables of parameterλ2.

(29)

MPPP MPPP: mathematical tools & application

mRNA

How many mRNAs at time t ?

t y =−s +t birth lifetime E1 σ2,1 s1+σ2,1 E2 σ2,2 s2+ σ2,2 E3 σ2,3 s3+σ2,3

Ei exponential random variables of parameter λ2.

(30)

MPPP MPPP: mathematical tools & application

mRNA

How many mRNAs at time t ?

t y =−s +t birth lifetime E1 σ2,1 s1+σ2,1 E2 σ2,2 s2+ σ2,2 E3 σ2,3 s3+σ2,3

Ei exponential random variables of parameterλ2.

(31)

MPPP MPPP: mathematical tools & application

mRNA

How many mRNAs at time t ?

t y =−s +t birth lifetime E1 σ2,1 s1+σ2,1 E2 σ2,2 s2+σ2,2 E3 σ2,3 s3+σ2,3

Ei exponential random variables of parameterλ2.

(32)

MPPP MPPP: mathematical tools & application

mRNA

How many mRNAs at time t ?

t y =−s+t birth lifetime E1 σ2,1 s1+σ2,1 E2 σ2,2 s2+σ2,2 E3 σ2,3 s3+σ2,3

Ei exponential random variables of parameterλ2.

(33)

MPPP MPPP: mathematical tools & application

mRNA: general results

mRNAs at equilibrium: M = Z R×R+ 1{u≤0≤u+v}Nλ2(du,dv) Proposition E[M] =δ+λ2E[σ2] var(M) =E[M] + 2λ22δ+(1−δ+)· · Z +∞ 0 Z 0 −u e−Λv(1−F2(u))(1−F2(u+v))dudv where F2(x) =F2([0,x]), Λ =λ+1 +λ − 1 and δ+=λ+1/Λ

(34)

mRNA: general results

mRNAs at equilibrium: M = Z R×R+ 1{u≤0≤u+v}Nλ2(du,dv) Proposition E[M] =δ+λ2E[σ2] var(M) =E[M] + 2λ22δ+(1−δ+)· · Z +∞ 0 Z 0 −u e−Λv(1−F2(u))(1−F2(u+v))dudv where F2(x) =F2([0,x]), Λ =λ+1 +λ − 1 and δ+=λ+1/Λ
(35)

MPPP MPPP: mathematical tools & application

Proteins: general results

Proteins at equilibrium: P = Z R×R+ Nλ2(du,dv) Z R×R+ 1{u≤x≤u+v}1{x≤0≤x+y}Nλ3u(dx,dy) Proposition E[P] =δ+λ2λ3E[σ2]E[σ3] var(P) =E[P] +λ2λ23δ+ Z R2+ Z (−s+t)∧0 −s F3(u)du ! F2(t)dtds +λ22λ23δ+(1−δ+) Z R4 + e−Λ|u1−u2+v1−v2| 2 Y i=1 F2(ui)F3(vi)duidvi

(36)

Proteins: general results

Proteins at equilibrium: P = Z R×R+ Nλ2(du,dv) Z R×R+ 1{u≤x≤u+v}1{x≤0≤x+y}Nλ3u(dx,dy) Proposition E[P] =δ+λ2λ3E[σ2]E[σ3] var(P) =E[P] +λ2λ23δ+ Z R2+ Z (−s+t)∧0 −s F3(u)du ! F2(t)dtds +λ22λ23δ+(1−δ+) Z R4 + e−Λ|u1−u2+v1−v2| 2 Y i=1 F2(ui)F3(vi)duidvi
(37)

MPPP Applications & Model Extensions

MPPP Applications & Model Extensions

(38)

MPPP Applications & Model Extensions MPPP 3-Stage Model

Application: MPPP 3-Stage Model

Y(t)∈ {0,1} M(t)∈N P(t)∈N

λ−1

λ+1 F2(dv) F3(dy)

µ2

λ2 λ3

(39)

MPPP Applications & Model Extensions MPPP 3-Stage Model

Application: MPPP 3-Stage Model

Y(t)∈ {0,1} M(t)∈N P(t)∈N λ−1 λ+1 F2(dv) F3(dy) µ2 λ2 λ3

(40)

Application: MPPP 3-Stage Model

Y(t)∈ {0,1} M(t)∈N P(t)∈N λ−1 λ+1 F 3(dy) µ2 λ2 λ3 Choices forF3(dy)

Exponential explicit close formula

depend-ing on model parameters

Normal analytic formula

Deterministic

explicit close formula depend-ing on model parameters (limit case)

(41)

MPPP Applications & Model Extensions MPPP 3-Stage Model

Deterministic vs Exponential

(42)

Application: MPPP 3-Stage Model

3-Stage model with deterministic elongation

varD(P) =E(P) 1 + 2λ3 µ2 1−µ3 µ2 (1−e−µ2/µ3) + 2λ2λ3(1−δ+)µ2 Λ2µ2 2 µ3 Λ2 h 1−e−Λ/µ3 i −µ3 µ3 2 Λ h 1−e−µ2/µ3 i + Λ 1 µ2 2 − 1 Λ2

3-Stage model with exponential elongation

varE(P) =E(P) 1 + λ3 µ2+µ3 +λ2λ3(1−δ+)(Λ +µ2+µ3) (µ2+µ3)(Λ +µ2)(Λ +µ3) where Λ =λ+1 +λ−1,δ+=λ+1/Λ

(43)

MPPP Applications & Model Extensions MPPP 3-Stage Model

Conclusions

(MPPP) appropriate mathematical tool to describe gene expression

averages independent of the chosen distribution

analytic form formula for any distribution

explicit formula depending on the model parameters for specific and interesting distributions

(44)

MPPP Applications & Model Extensions MPPP 3-Stage Model

Conclusions

(MPPP) appropriate mathematical tool to describe gene expression

averages independent of the chosen distribution

analytic form formula for any distribution

explicit formula depending on the model parameters for specific and interesting distributions

(45)

MPPP Applications & Model Extensions MPPP 3-Stage Model

Conclusions

(MPPP) appropriate mathematical tool to describe gene expression

averages independent of the chosen distribution

analytic form formula for any distribution

explicit formula depending on the model parameters for specific and interesting distributions

(46)

Conclusions

Biological consequences:

the computed variance may be underestimated

deterministic protein elongation might be an upper-bound for protein variance

possibility to compute protein variance under more realistic assumptions

(47)

MPPP Applications & Model Extensions MPPP 3-Stage Model

Few more results

More realistic description of gene expression

4-Stage model and general distribution for protein elongation

analysis and proof of the correct assumption for protein degradation (proteolysis/volume dilution)

counter-intuitive: varDET(P)>varEXP(P)

(48)

V. Fromion, E. Leoncini, and P. Robert. “Stochastic Gene Expression in

Cells: A Point Process Approach”. In: SIAM Journal on Applied

Mathematics 73.1 (2013), pp. 195–211

(49)

MPPP Applications & Model Extensions MPPP 3-Stage Model

Thanks.

(50)

One more thing: realistic model of gene expression

(51)

4-Stage Model MPPP 4-Stage model

4-Stage Model:

mRNAM(t) F2(dv) protein P(t) λ3 F3(dy) λ3 F3(dy) F4(dy) ribosome R(t) protein P(t) µ2 F3(dy)? µ4
(52)

4-Stage Model MPPP 4-Stage model

4-Stage Model:

mRNAM(t) F2(dv) protein P(t) λ3 F3(dy) λ3 F3(dy) F4(dy) ribosome R(t) protein P(t) µ2 F3(dy)? µ4
(53)

4-Stage Model MPPP 4-Stage model

4-Stage Model:

mRNAM(t) F2(dv) protein P(t) λ3 F3(dy) λ3 F3(dy) F4(dy) ribosome R(t) protein P(t) µ2 F3(dy)? µ4
(54)

4-Stage Model MPPP 4-Stage model

4-Stage Model:

mRNAM(t) F2(dv) protein P(t) λ3 F3(dy) λ3 F3(dy) F4(dy) ribosome R(t) protein P(t) µ2 F3(dy)? µ4
(55)

4-Stage Model MPPP 4-Stage model

4-Stage Model:

3-Stage model all exponential steps

var(3)(P) =E[P] 1 + λ3 µ2+µ3 (active gene) 4-Stage model all exponential steps

var(4)(P) =E[P] 1 + λ3µ3(µ2+µ3+µ4) (µ2+µ3)(µ2+µ4)(µ3+µ4) (active gene)

(56)

4-Stage Model MPPP 4-Stage model

4-Stage Model: protein elongation

lots of identical steps (∼400 amino acids per protein)

each step is exponentially distributed

The resulting distribution can be described by normal distribution

Gamma distribution

Problem: hard to obtain explicit formula depending on the model parameters

(57)

4-Stage Model MPPP 4-Stage model

4-Stage Model: protein elongation

lots of identical steps (∼400 amino acids per protein)

each step is exponentially distributed

The resulting distribution can be described by normal distribution

Gamma distribution

Problem: hard to obtain explicit formula depending on the model parameters

(58)

4-Stage Model: protein elongation

The resulting distribution can be described by normal distribution

Gamma distribution

Problem: hard to obtain explicit formula depending on the model parameters

Approximation: deterministic protein elongationτ3

(59)

4-Stage Model MPPP 4-Stage model

4-Stage Model:

4-Stage model with deterministic elongation

varD(P) = E(P) 1 + λ3 µ2+µ4 + λ2λ3(1−δ+)(Λ +µ2+µ4) (µ2+µ4)(Λ +µ2)(Λ +µ4) .

4-Stage model with exponential elongation

varE(P) =E(P) 1 + λ3µ3(µ2+µ3+µ4) (µ2+µ3)(µ2+µ4)(µ3+µ4) + λ2λ3(1−δ+)µ3µ24 (Λ +µ2)(µ24−µ23) Λ+µ2+µ3 µ3(µ2+µ3)(Λ+µ3)− Λ+µ2+µ4 µ4(µ2+µ4)(Λ+µ4) .

(60)

4-Stage Model:

4-Stage model with deterministic elongation

varD(P) = E(P) 1 + λ3 µ2+µ4 + λ2λ3(1−δ+)(Λ +µ2+µ4) (µ2+µ4)(Λ +µ2)(Λ +µ4) .

4-Stage model with exponential elongation

varE(P) =E(P) 1 + λ3µ3(µ2+µ3+µ4) (µ2+µ3)(µ2+µ4)(µ3+µ4) + λ2λ3(1−δ+)µ3µ24 (Λ +µ2)(µ24−µ23) Λ+µ2+µ3 µ3(µ2+µ3)(Λ+µ3)− Λ+µ2+µ4 µ4(µ2+µ4)(Λ+µ4) .

(61)

4-Stage Model MPPP 4-Stage model

Deterministic vs Exponential

(62)

4-Stage Model MPPP 4-Stage model

Conclusions

Few results:

explicit formula depending on the model parameters for specific assumptions;

counter-intuitive: varDET(P)>varEXP(P).

Biological consequences:

the estimated variance could have been underestimated;

deterministic protein elongation as upper-bound for protein variance; possibility to compute (numerically) more precise protein variance with realistic assumptions.

(63)

4-Stage Model MPPP 4-Stage model

Conclusions

Few results:

explicit formula depending on the model parameters for specific assumptions;

counter-intuitive: varDET(P)>varEXP(P).

Biological consequences:

the estimated variance could have been underestimated;

deterministic protein elongation as upper-bound for protein variance; possibility to compute (numerically) more precise protein variance with realistic assumptions.

References

Related documents

Decision Support Systems and Product Market Performance: Data analysis revealed that there is a positive and significant relationship between decision support

Wenn man sich neben der verminderten F¨ ahigkeit, Nahrungsger¨ uche zu.. erkennen, das ebenfalls eingeschr¨ ankte Diskriminationsverm¨ ogen f¨ ur Duft- stoffe vor Augen h¨ alt, l¨

This includes hazardous waste requests forms submitted for pick-up, monthly waste pick-up volume with generator’s name, and annual waste generation

Verksted 80 Bjørn Andersen ”Reinebuen” Fishing Vessel

Micro nutrients and heavy metals concentration in edible mushrooms grow on rice straw from different Governorates have been determined on a dry weight basis using

state creation would create immutable rules and would force all parent and successor states to follow those rules as a matter of international law. Even agreement

This liberalization of standing requirements resulted in a new class of legal action: public interest litigation (PIL). 114 PIL suits seek to redress wrongs to public

Jill and Earl is useful because it illustrates something that is true of all unwitting justification cases: regardless of whether unwittingly justified actors are guilty of