Academic year: 2021


Network analysis with the W-graph model

(via the Stochastic Block Model)

S. Robin

Joint work with P. Latouche and S. Ouadah

INRA / AgroParisTech

Outline

1 Modeling heterogeneity in interaction networks

2 Statistical inference of latent space models (focus on SBM)

3 From SBM to W-graph: Averaging models

Modeling heterogeneity in interaction networks

Heterogeneity in biological networks

Biological networks describe interactions between entities: genes, proteins, individuals, species...

Observed networks display heterogeneous topologies that one would like to decipher and better understand.

Dolphin social network [Newman and Girvan (2004)].


Heterogeneous means ...

... not homogeneous, that is: different from an Erdős-Rényi (ER) graph.

Erdős-Rényi random graph $G(n, p)$: consider $n$ nodes; node pairs $1 \le i < j \le n$ are independently connected with the same probability $p$:

$(Y_{ij})$ iid, $Y_{ij} \sim \mathcal{B}(p)$.

Very intensively studied. Fits very few real-life networks.
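As an illustration of the ER definition, here is a minimal Python sketch that samples $G(n, p)$ and checks that the empirical edge density is close to $p$. The function names `sample_er_graph` and `edge_density` and the values of `n`, `p` and `seed` are illustrative choices, not part of the talk.

```python
import random


def sample_er_graph(n, p, seed=0):
    """Sample a symmetric 0/1 adjacency matrix of G(n, p) without self-loops."""
    rng = random.Random(seed)
    y = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair i < j is connected independently with probability p.
            edge = 1 if rng.random() < p else 0
            y[i][j] = y[j][i] = edge
    return y


def edge_density(y):
    """Fraction of node pairs i < j that are connected."""
    n = len(y)
    n_pairs = n * (n - 1) // 2
    n_edges = sum(y[i][j] for i in range(n) for j in range(i + 1, n))
    return n_edges / n_pairs


y_er = sample_er_graph(n=200, p=0.1)
print(edge_density(y_er))  # expected to be close to p = 0.1
```

Since every pair shares the same $p$, all nodes have the same expected degree, which is the homogeneity that real networks typically violate.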

Latent space models

Latent variables make it possible to capture some underlying structure of a network (see the review [Matias and R. (2014)]).

General setting for binary graphs [Bollobás et al. (2007)]:

A latent (unobserved) variable $Z_i$ is associated with each node: $\{Z_i\}$ iid $\sim \pi$

Edges $Y_{ij} = \mathbb{I}\{i \sim j\}$ are independent conditionally on the $Z_i$'s: $\{Y_{ij}\}$ independent $| \{Z_i\}$: $\Pr\{Y_{ij} = 1\} = \gamma(Z_i, Z_j)$

We focus here on model-based approaches, in contrast with, e.g.

Graph clustering [Girvan and Newman (2002)], [Newman (2004)];


State-space model: principle.

Consider $n$ nodes ($i = 1..n$);

$Z_i$ = unobserved position of node $i$, e.g. $\{Z_i\}$ iid $\sim \mathcal{N}(0, I)$

Edges $\{Y_{ij}\}$ independent given $\{Z_i\}$, e.g. $\Pr\{Y_{ij} = 1\} = \gamma(Z_i, Z_j)$.

$$Y = \begin{pmatrix} 0 & 1 & 1 & 0 & 1 & \dots \\ 0 & 0 & 1 & 0 & 1 & \dots \\ 0 & 0 & 0 & 0 & 0 & \dots \\ 0 & 0 & 0 & 0 & 1 & \dots \\ 0 & 0 & 0 & 0 & 0 & \dots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

A variety of state-space models

Latent position models.

[Hoff et al. (2002)]: $Z_i \in \mathbb{R}^d$, $\operatorname{logit} \gamma(z, z') = a - |z - z'|$

[Handcock et al. (2007)]: $Z_i \sim \sum_k p_k \mathcal{N}_d(\mu_k, \sigma_k^2 I)$

[Daudin et al. (2010)]: $Z_i \in \mathcal{S}_K$, $\gamma(z, z') = \sum_{k,\ell} z_k z'_\ell \gamma_{k\ell}$
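To make the first of these models concrete, the sketch below evaluates the connection probability with $\operatorname{logit} \gamma(z, z') = a - |z - z'|$ and samples a graph from it. It assumes Euclidean distance and standard Gaussian positions (the "e.g." choice of the state-space principle); the function names and parameter values are hypothetical.

```python
import math
import random


def hoff_edge_probability(z1, z2, a):
    """gamma(z, z') such that logit gamma(z, z') = a - |z - z'| (Euclidean norm)."""
    dist = math.sqrt(sum((u - v) ** 2 for u, v in zip(z1, z2)))
    return 1.0 / (1.0 + math.exp(-(a - dist)))


def sample_latent_position_graph(n, d, a, seed=0):
    """Draw positions Z_i ~ N(0, I_d), then edges independently given the Z_i."""
    rng = random.Random(seed)
    z = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < hoff_edge_probability(z[i], z[j], a):
                y[i][j] = y[j][i] = 1
    return z, y


z_pos, y_pos = sample_latent_position_graph(n=50, d=2, a=1.0)
```

Nodes that are close in the latent space connect with higher probability, and the intercept `a` tunes the overall density.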

In this talk, focus on

the Stochastic Block Model (SBM) and


Stochastic Block Model (SBM)

A mixture model for random graphs.

[Nowicki and Snijders (2001)]

Consider $n$ nodes ($i = 1..n$);

$Z_i$ = unobserved label of node $i$: $\{Z_i\}$ iid $\sim \mathcal{M}(1; \pi)$, $\pi = (\pi_1, ..., \pi_K)$;

Edge $Y_{ij}$ depends on the labels: $\{Y_{ij}\}$ independent given $\{Z_i\}$,
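A possible simulation of the SBM is sketched below. The edge law is not spelled out on this slide; the code assumes $\Pr\{Y_{ij} = 1 \mid Z_i = k, Z_j = \ell\} = \gamma_{k\ell}$, consistent with the blockwise constant graphon used later in the talk. The function name `sample_sbm` and the parameter values are illustrative.

```python
import random


def sample_sbm(n, pi, gamma, seed=0):
    """Sample labels Z_i ~ M(1; pi), then edges with probability gamma[Z_i][Z_j]."""
    rng = random.Random(seed)
    n_blocks = len(pi)
    z = rng.choices(range(n_blocks), weights=pi, k=n)
    y = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Assumed edge law: Bernoulli with the block-pair probability.
            if rng.random() < gamma[z[i]][z[j]]:
                y[i][j] = y[j][i] = 1
    return z, y


# Two groups with dense within-group and sparse between-group connections.
z_sbm, y_sbm = sample_sbm(n=60, pi=[0.4, 0.6], gamma=[[0.8, 0.05], [0.05, 0.6]])
```

With a diagonal-dominant `gamma` this produces communities; other choices of `gamma` give hubs, bipartite-like or hierarchical structures with the same code.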


W-graph model

Latent variables: $(Z_i)$ iid $\sim \mathcal{U}[0, 1]$,

Graphon function $\gamma$: $\gamma(z, z'): [0, 1]^2 \to [0, 1]$

Edges: $\Pr\{Y_{ij} = 1\} = \gamma(Z_i, Z_j)$

[Figure: graphon function $\gamma(z, z')$]
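The three ingredients above translate directly into a sampler. The sketch below takes any symmetric function from $[0, 1]^2$ to $[0, 1]$ as the graphon; `product_graphon` is a hypothetical example chosen only for illustration.

```python
import random


def sample_w_graph(n, graphon, seed=0):
    """Draw Z_i ~ U[0, 1], then connect i and j with probability graphon(Z_i, Z_j)."""
    rng = random.Random(seed)
    z = [rng.random() for _ in range(n)]
    y = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < graphon(z[i], z[j]):
                y[i][j] = y[j][i] = 1
    return z, y


def product_graphon(u, v):
    """Hypothetical graphon: nodes with large latent values act as hubs."""
    return u * v


z_w, y_w = sample_w_graph(n=100, graphon=product_graphon)
```

A constant graphon gives back the ER graph, and a blockwise constant graphon gives an SBM.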

Interpreting the graphon function

The graphon function provides a global picture of the network’s topology.

A few words about the W-graph

Probabilistic point of view.

W-graphs have been mostly studied in the probability literature: [Lovász and Szegedy (2006)], [Diaconis and Janson (2008)]

Motif (sub-graph) frequencies are invariant characteristics of a W-graph.

Intrinsic non-identifiability of the graphon function $\gamma$ is often overcome by imposing that $u \mapsto \int \gamma(u, v)\, dv$ is monotonically increasing.

Statistical point of view.

Not much attention has been paid to its inference until recently: [Airoldi et al. (2013)], [Chatterjee et al. (2014)], [Olhede and Wolfe (2014)], ...


Some generalizations of latent space graph models

Latent space models can be extended in various directions.

Weighted or directed networks. Edges may have values: count, real, $\{0, +, -, \pm\}$, ... The latent space model can be adapted as

$Y_{ij} | Z_i, Z_j \sim F(\gamma(Z_i, Z_j))$

where $F$ can be any distribution: Poisson, normal, multinomial, etc.

Accounting for covariates. The latent space model can also accommodate covariates via a regression term:

$Y_{ij} | Z_i, Z_j \sim F(\gamma(Z_i, Z_j) + x_{ij}' \beta)$


Statistical inference of latent space models


Incomplete data models

Aim. Based on the observed network $Y = (Y_{ij})$, one typically wants to infer

the parameters $\theta = (\pi, \gamma)$,

the hidden states $Z = (Z_i)$.

State-space models belong to the class of incomplete data models, as

the edges $(Y_{ij})$ are observed,

the latent positions (or statuses) $(Z_i)$ are not,


Frequentist or Bayesian inference

Frequentist inference. $\theta$ is fixed and $Z$ is random. The aim is then to

provide an estimate $\hat{\theta}$ of $\theta$,

provide the conditional distribution $P_\theta(Z | Y)$ (for classification purposes and as a by-product of the inference).

Bayesian inference. Both $\theta$ and $Z$ are random. The aim is then to

provide the joint conditional distribution $P(\theta, Z | Y)$.

Whatever the approach, we have to deal with conditional distributions:


Conditional distributions (1/2)

Graphical models describe the conditional independences between the random variables of a model [Lauritzen (1996)].

Frequentist setting:

iid $Z_i$'s,

$P(Y_{ij} | Z_i, Z_j)$,

$P(Z_i, Z_j | Y)$: graph moralization,

this holds for each pair $(i, j)$.

Conditional distribution. The dependency graph of $Z$ given $Y$ is a clique.

→ No factorization can be hoped for (unlike for HMM).


Conditional distributions (2/2)

Bayesian perspective. Things get worse because $\theta = (\pi, \gamma)$ is also random.

Model: $P(\theta)\, P(Z | \pi)\, P(Y | \gamma, Z)$

$P(\theta, Z | Y)$ is even more involved.

Both frequentist and Bayesian inference require the calculation of conditional distributions that cannot be computed.


Variational (Bayes) inference

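Variational approximations aim at replacing an intractable exact distribution $P$ with a tractable approximate distribution $\tilde{P}$. Typically:

$P_\theta(Z | Y) \approx \prod_i \tilde{P}_{\theta,Y}(Z_i)$

$P(\theta, Z | Y) \approx \tilde{P}_Y(\theta) \times \tilde{P}_Y(Z)$

$P(\theta, Z | Y) \approx \tilde{P}_Y(\theta) \times \prod_i \tilde{P}_Y(Z_i)$

Popular strategy: minimize the Kullback-Leibler divergence between $\tilde{P}$ and $P$:

$\min KL[\tilde{P}(Z) \,\|\, P_\theta(Z | Y)]$ or $\min KL[\tilde{P}(\theta, Z) \,\|\, P(\theta, Z | Y)]$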
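For the first (frequentist) factorization, minimizing the KL divergence over product distributions leads to a fixed-point equation on $\tau_{ik} = \tilde{P}(Z_i = k)$, alternated with closed-form updates of $(\pi, \gamma)$. The sketch below is a plain variational EM for a binary SBM written from that principle. It is not the VBEM algorithm used in the talk, and the initialization, iteration count and function name are assumptions.

```python
import math
import random


def vem_sbm(y, n_blocks, n_iter=30, seed=0, eps=1e-10):
    """Mean-field variational EM for a binary, undirected SBM (illustrative)."""
    n = len(y)
    rng = random.Random(seed)
    # tau[i][k] approximates P(Z_i = k | Y); start from random soft labels.
    tau = []
    for _ in range(n):
        row = [rng.random() + 0.1 for _ in range(n_blocks)]
        total = sum(row)
        tau.append([t / total for t in row])

    pi, gamma = None, None
    for _ in range(n_iter):
        # M-step: closed-form updates of pi and gamma given tau.
        pi = [sum(tau[i][k] for i in range(n)) / n for k in range(n_blocks)]
        gamma = [[0.5] * n_blocks for _ in range(n_blocks)]
        for k in range(n_blocks):
            for l in range(n_blocks):
                num = den = 0.0
                for i in range(n):
                    for j in range(n):
                        if i != j:
                            w = tau[i][k] * tau[j][l]
                            num += w * y[i][j]
                            den += w
                if den > 0.0:
                    gamma[k][l] = min(max(num / den, eps), 1.0 - eps)

        # E-step: one sweep of the mean-field fixed-point equation for tau.
        new_tau = []
        for i in range(n):
            log_w = []
            for k in range(n_blocks):
                s = math.log(max(pi[k], eps))
                for j in range(n):
                    if j == i:
                        continue
                    for l in range(n_blocks):
                        g = gamma[k][l]
                        log_lik = math.log(g) if y[i][j] else math.log(1.0 - g)
                        s += tau[j][l] * log_lik
                log_w.append(s)
            top = max(log_w)
            w = [math.exp(v - top) for v in log_w]
            total = sum(w)
            new_tau.append([v / total for v in w])
        tau = new_tau
    return pi, gamma, tau


# Toy graph: two disconnected cliques of five nodes each.
n_toy = 10
y_toy = [[1 if i != j and (i < 5) == (j < 5) else 0 for j in range(n_toy)]
         for i in range(n_toy)]
pi_hat, gamma_hat, tau_hat = vem_sbm(y_toy, n_blocks=2)
```

Each E-step only needs sums over pairs, which is what makes the factorized approximation tractable where the exact $P_\theta(Z | Y)$ is not. Like any EM-type procedure, the result depends on the starting point, so several initializations are usually tried in practice.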

VBEM inference for SBM: E. coli's operon network

[Picard et al. (2009)]

Meta-graph representation.


Accuracy of VBEM estimates for SBM: Simulation study

Credibility intervals: $\pi_1$: +, $\gamma_{11}$: △, $\gamma_{12}$: ◦, $\gamma_{22}$: •

Width of the posterior credibility intervals: $\pi_1$, $\gamma_{11}$, $\gamma_{12}$, $\gamma_{22}$


First half summary

Latent space graph models are useful to describe network heterogeneity.

Their statistical inference raises some specific issues.

Variational approximations help to circumvent these issues.

And also

Theoretical justifications of these approximations exist for SBM: [Celisse et al. (2012)], [Mariadassou and Matias (2014)]

VEM and VBEM algorithms have been specifically developed for SBM: [Daudin et al. (2008)], [Latouche et al. (2012)]


From SBM to W-graph: Averaging models

SBM as a W-graph model

Latent variables: $(Z_i)$ iid $\sim \mathcal{M}(1, \pi)$

Blockwise constant graphon: $\gamma(z, z') = \gamma_{k\ell}$

Edges: $\Pr\{Y_{ij} = 1\} = \gamma(Z_i, Z_j)$

[Figure: graphon function $\gamma^{SBM}_K(z, z')$]

Variational Bayes estimation of $\gamma(z, z')$

VBEM inference provides the approximate posteriors:

$(\pi | Y) \approx \mathrm{Dir}(\pi^*)$

$(\gamma_{k\ell} | Y) \approx \mathrm{Beta}(\gamma^{0*}_{k\ell}, \gamma^{1*}_{k\ell})$

Estimate of $\gamma(u, v)$. Due to the uncertainty of the $\pi_k$, the posterior mean of $\gamma^{SBM}_K$ is smooth (explicit integration using [Gouda and Szántai (2010)]).
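The smoothing effect can be seen numerically. The sketch below replaces the explicit integration by a plain Monte Carlo average: it draws $\pi$ from the Dirichlet posterior, locates the blocks containing $u$ and $v$ on $[0, 1]$, and averages the posterior means of the corresponding $\gamma_{k\ell}$. It assumes block $k$ occupies the interval between the cumulative sums of $\pi$; `alpha` stands for $\pi^*$, `gamma_mean` for the matrix of posterior means of the $\gamma_{k\ell}$, and all numerical values are made up.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def sample_dirichlet(alpha, rng):
    """Draw from Dir(alpha) by normalizing independent Gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [v / total for v in g]


def block_of(u, pi):
    """Index k of the block whose interval of [0, 1] contains u."""
    bounds = list(accumulate(pi))
    return min(bisect_right(bounds, u), len(pi) - 1)


def posterior_mean_graphon(u, v, alpha, gamma_mean, n_draws=2000, seed=0):
    """Monte Carlo estimate of E[gamma_{C(u), C(v)} | Y] under the VB posterior."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        pi = sample_dirichlet(alpha, rng)
        total += gamma_mean[block_of(u, pi)][block_of(v, pi)]
    return total / n_draws


alpha_star = [12.0, 20.0]                 # made-up Dirichlet parameters
gamma_bar = [[0.8, 0.1], [0.1, 0.4]]      # made-up posterior means of gamma_kl
estimate = posterior_mean_graphon(0.35, 0.9, alpha_star, gamma_bar)
```

Near a block boundary the estimate mixes the values of neighbouring blocks in proportion to the posterior probability that the boundary lies on either side of $u$, which removes the discontinuities of the blockwise constant graphon.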

Bayesian model averaging

Bayesian model averaging (BMA). Consider a series of models $1, \dots, K, \dots$ in which a certain function of the parameter $f(\theta)$ can always be defined.

Bayesian inference within each model $K$ provides the posterior

$P(\theta | K, Y) \to P(f(\theta) | K, Y)$.

BMA [Hoeting et al. (1999)] relies on the marginal posterior of $f(\theta)$:

$P(f(\theta) | Y) = \sum_K P(K | Y)\, P(f(\theta) | K, Y)$


Variational Bayes model averaging

Pushing it further: Consider the model $K$ as an additional hidden variable:

$P(Z, \theta, K | Y) \approx \tilde{P}(Z, \theta, K) := \tilde{P}(Z | K) \times \tilde{P}(\theta | K) \times \tilde{P}(K)$

Note that no additional independence assumption is needed.

Variational Bayes model averaging (VBMA). The optimal approximation of $P(K | Y)$ satisfies [Volant et al. (2012)]:

$\tilde{P}(K) \propto P(K)\, e^{\log P(Y | K) - KL(K)} \propto P(K | Y)\, e^{-KL(K)}$
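In practice $\log P(Y | K) - KL(K)$ is the variational lower bound reached when fitting model $K$, so the weights can be computed from the list of lower bounds and the prior. The sketch below does this with the usual log-sum-exp shift for numerical stability. The function name, the uniform default prior and the numerical values are assumptions.

```python
import math


def vbma_weights(lower_bounds, log_prior=None):
    """Weights proportional to P(K) * exp(log P(Y|K) - KL(K)), one per model K."""
    if log_prior is None:
        log_prior = [0.0] * len(lower_bounds)  # uniform prior over models
    scores = [lp + lb for lp, lb in zip(log_prior, lower_bounds)]
    top = max(scores)  # subtract the maximum to avoid underflow in exp
    unnormalized = [math.exp(s - top) for s in scores]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]


# Made-up lower bounds for K = 1, 2, 3, 4.
weights = vbma_weights([-1520.3, -1498.7, -1497.9, -1503.2])
```

Lower bounds are typically large negative numbers, so exponentiating them directly would underflow to zero; only their differences matter for the weights.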


Inferring the graphon function

Model averaging: There is no 'true $K$' in the W-graph model.

Apply the VBMA recipe to $\gamma(z, z')$. For $K = 1..K_{max}$, fit an SBM via VBEM and compute

$\hat{\gamma}^{SBM}_K(z, z') = \tilde{E}[\gamma_{C(z), C(z')} | Y, K]$.

Then perform model averaging as

$\hat{\gamma}(z, z') = \tilde{E}[\gamma_{C(z), C(z')} | Y] = \sum_K \tilde{P}(K)\, \hat{\gamma}^{SBM}_K(z, z')$, [Latouche and R. (2013)].
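Once each $\hat{\gamma}^{SBM}_K$ has been evaluated on a common grid of $(z, z')$ values, the last formula is a pointwise weighted average. A minimal sketch, assuming the per-model estimates are stored as square nested lists and the weights $\tilde{P}(K)$ already sum to one (names and values are illustrative):

```python
def average_graphons(graphons, weights):
    """Pointwise weighted average of per-K graphon estimates on a common grid."""
    size = len(graphons[0])
    return [
        [sum(w * g[a][b] for w, g in zip(weights, graphons)) for b in range(size)]
        for a in range(size)
    ]


# Two made-up estimates on a 2 x 2 grid, for K = 1 and K = 2.
gamma_k1 = [[0.3, 0.3], [0.3, 0.3]]
gamma_k2 = [[0.7, 0.1], [0.1, 0.5]]
gamma_avg = average_graphons([gamma_k1, gamma_k2], weights=[0.25, 0.75])
```

Because the weights concentrate on the models best supported by the data, the average is dominated by a few values of $K$ without having to select a single one.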


PPI network

Like many PPI networks, E. coli's network is highly concentrated around few


Ecological network between fungal species


Brain network


Blog network (non-biological)


Goodness-of-fit


Motifs frequency

Network motifs have a biological (or sociological) interpretation in terms of building blocks of the global network

→ Triangles = 'friends of my friends are my friends'.

Latent space graph models only describe binary interactions, conditional on the latent positions
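As a concrete example of a motif count, the sketch below counts triangles in an observed adjacency matrix by enumerating node triples. This brute-force enumeration is only meant for small graphs, and the function name is illustrative.

```python
from itertools import combinations


def count_triangles(y):
    """Number of node triples {i, j, k} whose three edges are all present."""
    n = len(y)
    return sum(
        1
        for i, j, k in combinations(range(n), 3)
        if y[i][j] and y[j][k] and y[i][k]
    )


# Complete graph on four nodes: every triple forms a triangle.
k4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
print(count_triangles(k4))  # 4
```

Comparing such an observed count with its expected value under a fitted model is the idea behind the motif-based checks that follow.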

Moments of motif counts

Moments under SBM: The first moments $E N(m)$, $V N(m)$ of the count are known for exchangeable graph models (incl. SBM) [Picard et al. (2008)]:

$E_{SBM} N(m) \propto \mu_{SBM}(m) =: f(\theta_{SBM})$

where $\mu_{SBM}(m)$ is the motif occurrence probability under SBM.

Moments under W-graph: The motif probability under the W-graph can be estimated as

$\hat{\mu}(m) = \sum_K \tilde{P}(K)\, \tilde{E}(\mu_{SBM}(m) | X, K)$

Estimates of $E_W N(m)$ and $V_W N(m)$ can be derived accordingly [Latouche and R.
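For the triangle motif, the occurrence probability under an SBM has a simple closed form: summing over the labels of the three nodes gives $\mu_{SBM}(\triangle) = \sum_{k,\ell,m} \pi_k \pi_\ell \pi_m \gamma_{k\ell} \gamma_{km} \gamma_{\ell m}$, and the expected count is this probability times the number of node triples. The sketch below evaluates both; the function names are illustrative and only the triangle is covered.

```python
from itertools import product
from math import comb


def triangle_probability(pi, gamma):
    """Probability that three given nodes form a triangle under SBM(pi, gamma)."""
    n_blocks = len(pi)
    return sum(
        pi[k] * pi[l] * pi[m] * gamma[k][l] * gamma[k][m] * gamma[l][m]
        for k, l, m in product(range(n_blocks), repeat=3)
    )


def expected_triangle_count(n, pi, gamma):
    """Expected number of triangles: number of triples times the probability."""
    return comb(n, 3) * triangle_probability(pi, gamma)


mu = triangle_probability([0.5, 0.5], [[0.8, 0.1], [0.1, 0.6]])
```

With a single block this reduces to the ER value $p^3$, and plugging in posterior draws of $(\pi, \gamma)$ for each $K$ gives the ingredients of the averaged estimate $\hat{\mu}(m)$.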

Network frequencies in the blog network

One row per motif:

Count (×10³)   Mean (×10³)   Std. dev. (×10³)
    29.7           39.7            8.3
     3.8            4.6            1.3
   608.7          968.3          336.8
   279.8          428.9          154.0
    47.4           74.5           35.1
   270.5          397.0          177.0
    62.1           87.8           47.4
     6.5            8.8            5.4

Covariates: Tree interaction (valued) network

Data: $n = 51$ tree species, $Y_{ij}$ = number of shared parasites [Vacher et al. (2008)].

SBM: Given $Z_i = k$, $Z_j = \ell$, $Y_{ij} \sim \mathcal{P}(e^{\gamma_{k\ell}})$, $\gamma_{k\ell}$ = log-mean number of shared parasites.

Results: ICL selects $K = 7$ groups that are partly related to phyla.

$e^{\hat{\gamma}_{k\ell}}$      T1      T2      T3      T4      T5      T6      T7
T1          14.46    4.19    5.99    7.67    2.44    0.13    1.43
T2                  14.13    0.68    2.79    4.84    0.53    1.54
T3                           3.19    4.10    0.66    0.02    0.69
T4                                   7.42    2.57    0.04    1.05
T5                                           3.64    0.23    0.83
T6                                                   0.04    0.06
T7                                                           0.27
$\hat{\pi}_k$         7.8     7.8    13.7    13.7    15.7    19.6    21.6


Accounting for the taxonomic distance

Model: $x_{ij}$ = distance$(i, j)$, $Y_{ij} \sim \mathcal{P}(e^{\gamma_{k\ell} + \beta x_{ij}})$, [Mariadassou et al. (2010)].

Results: $\hat{\beta} = -0.317$.

→ for $x = 3.82$, $e^{\hat{\beta} x} = 0.298$

→ The mean number of shared parasites decreases with taxonomic distance.

$e^{\hat{\lambda}_{k\ell}}$     T'1     T'2     T'3     T'4
T'1          0.75    2.46    0.40    3.77
T'2                  4.30    0.52    8.77
T'3                          0.080   1.05
T'4                                 14.22
$\hat{\pi}_k$        17.7    21.5    23.5    37.3
$\hat{\beta}$      -0.317

→ Groups are no longer associated with the phylogenetic structure.
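The multiplicative effect quoted above can be checked with a few lines of arithmetic. The sketch below recomputes $e^{\hat{\beta} x}$ for $x = 3.82$ and applies it to one entry of the table; the helper name `poisson_mean` is an illustrative choice.

```python
import math

beta_hat = -0.317
x = 3.82

# Multiplicative effect of a taxonomic distance x on the Poisson mean.
factor = math.exp(beta_hat * x)
print(round(factor, 3))  # 0.298


def poisson_mean(exp_lambda_kl, beta, x_ij):
    """Mean of Y_ij under the Poisson model: exp(lambda_kl + beta * x_ij)."""
    return exp_lambda_kl * math.exp(beta * x_ij)


# Pair of species from group T'4 at distance 3.82 (table entry 14.22).
mean_t4 = poisson_mean(14.22, beta_hat, x)
```

So at this distance the expected number of shared parasites is about 30% of what it would be at distance zero, whatever the pair of groups.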


’Residual’ graphon

A simple graph model with covariates. When edge covariates $x_{ij}$ are available, simply fit a logistic regression [Pattison and Robins (2007)]:

$(Y_{ij})$ independent, $Y_{ij} \sim \mathcal{B}(p_{ij})$, $\operatorname{logit} p_{ij} = x_{ij}' \beta$.

Introducing a residual term. To assess the fit of the model, simply add a residual graphon-like term:

$(Z_i)$ iid $\mathcal{U}[0, 1]$, $Y_{ij} | Z_i, Z_j \sim \mathcal{B}(p_{ij})$, $\operatorname{logit} p_{ij} = x_{ij}' \beta + \gamma(Z_i, Z_j)$.

→ A VBEM algorithm can be designed to get $\tilde{P}(\beta, \theta, Z) \approx P(\beta, \theta, Z | Y)$:
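The role of the residual term is easiest to see on the connection probability itself. In the sketch below, `residual` stands for the value $\gamma(Z_i, Z_j)$, taken here on the logit scale; the function name and the numerical values are made up for illustration.

```python
import math


def edge_probability(x_ij, beta, residual=0.0):
    """p_ij such that logit p_ij = x_ij' beta + residual."""
    eta = sum(x * b for x, b in zip(x_ij, beta)) + residual
    return 1.0 / (1.0 + math.exp(-eta))


x_pair = [1.0, 0.5]          # made-up covariates for one pair of nodes
beta_example = [-0.4, 0.8]   # made-up regression coefficients

p_regression = edge_probability(x_pair, beta_example)
p_with_residual = edge_probability(x_pair, beta_example, residual=1.2)
```

When the residual term is identically zero the model reduces to the plain logistic regression, so a clearly non-constant estimated residual graphon points to structure that the covariates do not explain.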


Tree network

Binary version: Links between tree species if they host at least one common fungal parasite.

Regression: covariates = genetic distance, taxonomic distance, geographic distance

Blog network

Blog network: Already shown.

Regression: covariates = same political party, pair includes a journalist

Conclusion & future work

Some conclusions.

The graphon provides a representation of the network topology

It can be estimated using variational Bayes inference

→ R packages ’mixer’ and ’blockmodels’

It can be combined with covariates as a residual term

Future work.

Formal goodness-of-fit test

Quality of variational Bayes estimates in SBM with covariates

Airoldi, E. M., Costa, T. B. and Chan, S. H. (2013). Stochastic blockmodel approximation of a graphon: Theory and consistent estimation. In Advances in Neural Information Processing Systems, 692–700.

Beal, M. J. and Ghahramani, Z. (2003). The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures. Bayes. Statist. 7 543–52.

Bollobás, B., Janson, S. and Riordan, O. (2007). The phase transition in inhomogeneous random graphs. Rand. Struct. Algo. 31 (1) 3–122.

Celisse, A., Daudin, J.-J. and Pierre, L. (2012). Consistency of maximum-likelihood and variational estimators in the stochastic block model. Electron. J. Statis. 6 1847–99.

Chatterjee, S. et al. (2014). Matrix estimation by universal singular value thresholding. The Annals of Statistics. 43 (1) 177–214.

Daudin, J.-J., Picard, F. and Robin, S. (2008). A mixture model for random graphs. Stat. Comput. 18 (2) 173–83.

Daudin, J.-J., Pierre, L. and Vacher, C. (2010). Model for heterogeneous random networks using continuous latent variables and an application to a tree–fungus network. Biometrics. 66 (4) 1043–1051.

Diaconis, P. and Janson, S. (2008). Graph limits and exchangeable random graphs. Rend. Mat. Appl. 7 (28) 33–61.

Gazal, S., Daudin, J.-J. and Robin, S. (2012). Accuracy of variational estimates for random graph mixture models. Journal of Statistical Computation and Simulation. 82 (6) 849–862.

Girvan, M. and Newman, M. E. J. (2002). Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA. 99 (12) 7821–6.

Gouda, A. and Szántai, T. (2010). On numerical calculation of probabilities according to Dirichlet distribution. Ann. Oper. Res. 177 185–200. doi: 10.1007/s10479-009-0601-9.

Handcock, M., Raftery, A. and Tantrum, J. (2007). Model-based clustering for social networks. JRSSA. 170 (2) 301–54. doi: 10.1111/j.1467-985X.2007.00471.x.

Jaakkola, T. S. and Jordan, M. I. (2000). Bayesian parameter estimation via variational methods. Statistics and Computing. 10 (1) 25–37.

Latouche, P., Birmelé, E. and Ambroise, C. (2012). Variational Bayesian inference and complexity control for stochastic block models. Statis. Model. 12 (1) 93–115.

Latouche, P. and Robin, S. (2013). Bayesian model averaging of stochastic block models to estimate the graphon function and motif frequencies in a W-graph model. Technical report, arXiv:1310.6150.

Lauritzen, S. (1996). Graphical Models. Oxford Statistical Science Series. Clarendon Press.

Lovász, L. and Szegedy, B. (2006). Limits of dense graph sequences. Journal of Combinatorial Theory, Series B. 96 (6) 933–957.

von Luxburg, U., Belkin, M. and Bousquet, O. (2008). Consistency of spectral clustering. Ann. Stat. 36 (2) 555–586.

Mariadassou, M., Robin, S. and Vacher, C. (2010). Uncovering structure in valued graphs: a variational approach. Ann. Appl. Statist. 4 (2) 715–42.

Mariadassou, M. and Matias, C. (2014). Convergence of the groups posterior distribution in latent or stochastic block models. Bernoulli. ??–?? to appear.

Matias, C. and Robin, S. (2014). Modeling heterogeneity in random graphs through latent space models: a selective review. ESAIM: Proc. 47 55–74.

Newman, M. and Girvan, M. (2004). Finding and evaluating community structure in networks. Phys. Rev. E. 69 026113.

Newman, M. E. J. (2004). Fast algorithm for detecting community structure in networks. Phys. Rev. E. 69 066133.

Nowicki, K. and Snijders, T. (2001). Estimation and prediction for stochastic block-structures. J. Amer. Statist. Assoc. 96 1077–87.

Olhede, S. C. and Wolfe, P. J. (2014). Network histograms and universality of blockmodel approximation. Proceedings of the National Academy of Sciences. 111 (41) 14722–14727.

Pattison, P. E. and Robins, G. L. (2007). Handbook of Probability Theory with Applications. chapter Probabilistic Network Theory. Sage Publication.

Picard, F., Miele, V., Daudin, J.-J., Cottret, L. and Robin, S. (2009). Deciphering the connectivity structure of biological networks using mixnet. BMC Bioinformatics. Suppl 6 S17. doi: 10.1186/1471-2105-10-S6-S17.

Vacher, C., Piou, D. and Desprez-Loustau, M.-L. (2008). Architecture of an antagonistic tree/fungus network: The asymmetric influence of past evolutionary history. PLoS ONE. 3 (3) 1740. e1740. doi: 10.1371/journal.pone.0001740.

Volant, S., Magniette, M.-L. M. and Robin, S. (2012). Variational Bayes approach for model aggregation in unsupervised classification with Markovian dependency. Comput. Statis. & Data Analysis. 56 (8) 2375–2387.

Wainwright, M. J. and Jordan, M. I. (2008). Graphical models, exponential families, and variational inference. Found. Trends Mach. Learn. 1 (1–2) 1–305. http://dx.doi.org/10.1561/2200000001.
