Network analysis with the W-graph model
(via the Stochastic Block Model)
S. Robin
Joint work with P. Latouche and S. Ouadah INRA / AgroParisTech
Outline
1 Modeling heterogeneity in interaction networks
2 Statistical inference of latent space models (focus on SBM)
3 From SBM to W-graph: Averaging models
Modeling heterogeneity in interaction networks
Heterogeneity in biological networks
Biological networks describe interactions between entities: genes, proteins, individuals, species...
Observed networks display heterogeneous topologies, that one would like to decipher and better understand.
Dolphin social network.
[Newman and Girvan (2004)]
Heterogeneous means ...
... not homogeneous, that is: different from an Erdős-Rényi (ER) graph.
Erdős-Rényi random graph $G(n, p)$: consider $n$ nodes; node pairs $1 \le i < j \le n$ are independently connected with the same probability $p$:
$(Y_{ij})$ iid, $Y_{ij} \sim \mathcal{B}(p)$.
Very intensively studied. Fits very few real-life networks.
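As a point of comparison for the heterogeneous models that follow, here is a minimal sketch of how a $G(n, p)$ graph can be simulated. The function names and the values of $n$ and $p$ are illustrative assumptions, not part of the talk.

```python
import random

def sample_er_graph(n, p, seed=None):
    """Sample the upper triangle of an Erdos-Renyi graph G(n, p).

    Returns a dict mapping each pair (i, j), 0 <= i < j < n, to 0 or 1.
    """
    rng = random.Random(seed)
    return {(i, j): int(rng.random() < p)
            for i in range(n) for j in range(i + 1, n)}

def degrees(edges, n):
    """Degree of every node, computed from the pair -> {0, 1} dictionary."""
    deg = [0] * n
    for (i, j), y in edges.items():
        deg[i] += y
        deg[j] += y
    return deg

n, p = 200, 0.1  # illustrative values
edges = sample_er_graph(n, p, seed=0)
deg = degrees(edges, n)
density = sum(edges.values()) / len(edges)
# Under G(n, p) every degree is Binomial(n - 1, p), so degrees concentrate
# around (n - 1) * p: this is the homogeneity that real networks rarely show.
print(f"density = {density:.3f}, mean degree = {sum(deg) / n:.1f}, "
      f"min/max degree = {min(deg)}/{max(deg)}")
```

The narrow spread between the minimum and maximum degree is what "homogeneous" means here.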
Latent space models
Latent variables allow one to capture some underlying structure of a network (see the review [Matias and R. (2014)]).
General setting for binary graphs [Bollobás et al. (2007)]:
A latent (unobserved) variable $Z_i$ is associated with each node:
$\{Z_i\}$ iid $\sim \pi$
Edges $Y_{ij} = \mathbb{I}\{i \sim j\}$ are independent conditionally on the $Z_i$'s:
$\{Y_{ij}\}$ independent $\mid \{Z_i\}$: $\Pr\{Y_{ij} = 1\} = \gamma(Z_i, Z_j)$
We focus here on model-based approaches, in contrast with, e.g.,
graph clustering [Girvan and Newman (2002)], [Newman (2004)].
Latent space models
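State-space model: principle.
Consider $n$ nodes ($i = 1..n$);
$Z_i$ = unobserved position of node $i$, e.g. $\{Z_i\}$ iid $\sim \mathcal{N}(0, I)$
Edges $\{Y_{ij}\}$ independent given $\{Z_i\}$, e.g. $\Pr\{Y_{ij} = 1\} = \gamma(Z_i, Z_j)$.
$$Y = \begin{pmatrix} 0 & 1 & 1 & 0 & 1 & \cdots \\ 0 & 0 & 1 & 0 & 1 & \cdots \\ 0 & 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & 0 & 1 & \cdots \\ 0 & 0 & 0 & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$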
A variety of state-space models
Latent position models.
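[Hoff et al. (2002)]: $Z_i \in \mathbb{R}^d$, $\mathrm{logit}\, \gamma(z, z') = a - |z - z'|$
[Handcock et al. (2007)]: $Z_i \sim \sum_k p_k \mathcal{N}_d(\mu_k, \sigma_k^2 I)$
[Daudin et al. (2010)]: $Z_i \in \mathcal{S}_K$, $\gamma(z, z') = \sum_{k,\ell} z_k z'_\ell \gamma_{k\ell}$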
In this talk, focus on
the Stochastic Block Model (SBM) and
Stochastic Block Model (SBM)
A mixture model for random graphs.
[Nowicki and Snijders (2001)]
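Consider $n$ nodes ($i = 1..n$);
$Z_i$ = unobserved label of node $i$:
$\{Z_i\}$ iid $\sim \mathcal{M}(1; \pi)$, $\pi = (\pi_1, \dots, \pi_K)$;
Edge $Y_{ij}$ depends on the labels:
$\{Y_{ij}\}$ independent given $\{Z_i\}$, $\Pr\{Y_{ij} = 1\} = \gamma(Z_i, Z_j)$.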
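The generative definition above translates directly into a sampler. The sketch below is only an illustration: the function name, the number of nodes and the values of $\pi$ and $\gamma$ are assumptions chosen for the example.

```python
import random

def sample_sbm(n, pi, gamma, seed=None):
    """Sample labels Z and the upper triangle of Y from a binary SBM."""
    rng = random.Random(seed)
    K = len(pi)
    # Z_i iid ~ M(1; pi): one label in {0, ..., K-1} per node.
    z = rng.choices(range(K), weights=pi, k=n)
    # Y_ij | Z_i = k, Z_j = l ~ Bernoulli(gamma[k][l]), independently over pairs.
    y = {(i, j): int(rng.random() < gamma[z[i]][z[j]])
         for i in range(n) for j in range(i + 1, n)}
    return z, y

# Hypothetical parameters: two blocks, dense within, sparse between.
pi = [0.6, 0.4]
gamma = [[0.30, 0.02],
         [0.02, 0.20]]
z, y = sample_sbm(300, pi, gamma, seed=1)

def block_frequency(k, l):
    """Empirical edge frequency among pairs with one node in block k, one in block l."""
    pairs = [v for (i, j), v in y.items() if {z[i], z[j]} == {k, l}]
    return sum(pairs) / len(pairs)

for k in range(2):
    for l in range(k, 2):
        print(k, l, round(block_frequency(k, l), 3))
```

The empirical block frequencies should be close to the entries of `gamma`, which is what inference tries to recover when `z` is not observed.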
W-graph model
Latent variables: $(Z_i)$ iid $\sim \mathcal{U}[0,1]$,
Graphon function $\gamma$: $\gamma(z, z') : [0,1]^2 \to [0,1]$
Edges: $\Pr\{Y_{ij} = 1\} = \gamma(Z_i, Z_j)$
[Figure: graphon function $\gamma(z, z')$]
Interpreting the graphon function
The graphon function provides a global picture of the network’s topology.
A few words about the W-graph
Probabilistic point of view.
W-graphs have mostly been studied in the probability literature: [Lovász and Szegedy (2006)], [Diaconis and Janson (2008)].
Motif (sub-graph) frequencies are invariant characteristics of a W-graph.
The intrinsic non-identifiability of the graphon function $\gamma$ is often overcome by imposing that $u \mapsto \int \gamma(u, v)\, dv$ is monotonically increasing.
Statistical point of view.
Not much attention has been paid to its inference until recently: [Airoldi et al. (2013)], [Chatterjee et al. (2014)], [Olhede and Wolfe (2014)], ...
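To make the identifiability constraint concrete, the sketch below approximates the degree function $g(u) = \int \gamma(u, v)\, dv$ numerically and checks whether it is increasing. The graphon used here is a hypothetical example, and the midpoint rule is just one convenient way to approximate the integral.

```python
def degree_function(gamma, u, grid=1000):
    """Midpoint-rule approximation of g(u) = integral over [0, 1] of gamma(u, v) dv."""
    return sum(gamma(u, (m + 0.5) / grid) for m in range(grid)) / grid

def is_increasing(values, tol=1e-12):
    """True if the sequence never decreases (up to a small tolerance)."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))

# Hypothetical graphon, chosen only for illustration.
def gamma(u, v):
    return 0.8 * u * v

# Relabelling u -> 1 - u preserves the uniform distribution of the Z_i,
# so this graphon defines the same W-graph distribution as gamma.
def gamma_flipped(u, v):
    return gamma(1.0 - u, 1.0 - v)

us = [m / 20 for m in range(21)]
g = [degree_function(gamma, u) for u in us]
g_flipped = [degree_function(gamma_flipped, u) for u in us]
print(is_increasing(g), is_increasing(g_flipped))
```

Both graphons generate the same random graph, but only the first satisfies the monotonicity constraint, so the constraint selects one representative among equivalent graphons.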
Some generalizations of latent space graph models
Latent space models can be extended in various directions.
Weighted or directed networks. Edges may have values: count, real, $\{0, +, -, \pm\}$, ... The latent space model can be adapted as
$Y_{ij} \mid Z_i, Z_j \sim F(\gamma(Z_i, Z_j))$
where $F$ can be any distribution: Poisson, normal, multinomial, etc.
Accounting for covariates. Latent space models can also accommodate covariates, via a regression term:
$Y_{ij} \mid Z_i, Z_j \sim F(\gamma(Z_i, Z_j) + x_{ij}^\top \beta)$
Statistical inference of latent space models
Incomplete data models
Aim. Based on the observed network $Y = (Y_{ij})$, one typically wants to infer
the parameters $\theta = (\pi, \gamma)$,
the hidden states $Z = (Z_i)$.
State-space models belong to the class of incomplete data models, as
the edges $(Y_{ij})$ are observed,
the latent positions (or statuses) $(Z_i)$ are not.
Frequentist or Bayesian inference
Frequentist inference. $\theta$ is fixed and $Z$ is random. The aim is then to
provide an estimate $\hat\theta$ of $\theta$,
provide the conditional distribution $P_\theta(Z \mid Y)$ (for classification purposes and as a side product of the inference).
Bayesian inference. Both $\theta$ and $Z$ are random. The aim is then to
provide the joint conditional distribution $P(\theta, Z \mid Y)$.
Whatever the approach, we have to deal with conditional distributions.
Conditional distributions (1/2)
Graphical models describe the conditional independences between the random variables of a model [Lauritzen (1996)].
Frequentist setting:
iid $Z_i$'s,
$P(Y_{ij} \mid Z_i, Z_j)$,
$P(Z_i, Z_j \mid Y)$: graph moralization,
this holds for each pair $(i, j)$.
Conditional distribution. The dependency graph of $Z$ given $Y$ is a clique.
→ No factorization can be hoped for (unlike for HMM).
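The lack of factorization can be seen on a toy example. The sketch below computes $P(Z \mid Y)$ for a binary SBM by brute-force enumeration, which takes $K^n$ terms and is only feasible for a handful of nodes. All parameter values and the tiny graph are hypothetical.

```python
import itertools
import math

def log_joint(z, y, pi, gamma):
    """log P(Z = z, Y = y) for a binary SBM; y maps pairs (i, j), i < j, to 0/1."""
    lp = sum(math.log(pi[k]) for k in z)
    for (i, j), yij in y.items():
        p = gamma[z[i]][z[j]]
        lp += math.log(p if yij else 1.0 - p)
    return lp

def exact_posterior(y, n, pi, gamma):
    """P(Z | Y) by brute force: one term per labelling, so K**n terms in total."""
    K = len(pi)
    configs = list(itertools.product(range(K), repeat=n))
    logs = [log_joint(z, y, pi, gamma) for z in configs]
    top = max(logs)
    weights = [math.exp(l - top) for l in logs]
    total = sum(weights)
    return {z: w / total for z, w in zip(configs, weights)}

# Toy example (hypothetical values): 4 nodes, two connected pairs.
n = 4
y = {(0, 1): 1, (0, 2): 0, (0, 3): 0, (1, 2): 0, (1, 3): 0, (2, 3): 1}
pi = [0.5, 0.5]
gamma = [[0.9, 0.1], [0.1, 0.9]]
post = exact_posterior(y, n, pi, gamma)

# Marginals of Z_0 and Z_1 versus their joint probability.
p0 = sum(p for z, p in post.items() if z[0] == 0)
p1 = sum(p for z, p in post.items() if z[1] == 0)
p01 = sum(p for z, p in post.items() if z[0] == 0 and z[1] == 0)
print(len(post), round(p0 * p1, 3), round(p01, 3))
```

The joint probability of $(Z_0, Z_1)$ differs markedly from the product of the marginals: given $Y$, the labels are dependent, and the number of labellings to enumerate grows as $K^n$.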
Conditional distributions (2/2)
Bayesian perspective. Things get worse because $\theta = (\pi, \gamma)$ is also random.
Model: $P(\theta)\, P(Z \mid \pi)\, P(Y \mid \gamma, Z)$
$P(\theta, Z \mid Y)$ is even more involved.
Both frequentist and Bayesian inference require conditional distributions that cannot be computed in practice.
Variational (Bayes) inference
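Variational approximations aim at replacing an intractable exact distribution $P$ with a tractable approximate distribution $\tilde P$. Typically:
$P_\theta(Z \mid Y) \approx \prod_i \tilde P_{\theta,Y}(Z_i)$
$P(\theta, Z \mid Y) \approx \tilde P_Y(\theta) \times \tilde P_Y(Z)$
$P(\theta, Z \mid Y) \approx \tilde P_Y(\theta) \times \prod_i \tilde P_Y(Z_i)$
Popular strategy: minimize the Kullback-Leibler divergence between $\tilde P$ and $P$:
$\min KL[\tilde P(Z) \,\|\, P_\theta(Z \mid Y)]$ or $\min KL[\tilde P(\theta, Z) \,\|\, P(\theta, Z \mid Y)]$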
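For the first (frequentist) approximation, minimizing the KL divergence over product distributions leads to a fixed-point equation on $\tau_{ik} = \tilde P(Z_i = k)$, alternated with closed-form parameter updates. The sketch below is one possible mean-field variational EM for the binary SBM, in the spirit of the VEM algorithms cited in this talk. It is a simplified illustration with assumed names, a random initialisation and a fixed number of iterations, not the authors' implementation.

```python
import math
import random

def vem_sbm(y, n, K, n_iter=50, seed=0):
    """Mean-field variational EM for a binary SBM (illustrative sketch).

    y maps each pair (i, j), i < j, to 0/1. Returns (tau, pi, gamma), where
    tau[i][k] approximates P(Z_i = k | Y).
    """
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    for (i, j), v in y.items():
        adj[i][j] = adj[j][i] = v

    # Random initialisation of the variational probabilities tau.
    tau = []
    for _ in range(n):
        row = [rng.random() + 0.1 for _ in range(K)]
        s = sum(row)
        tau.append([r / s for r in row])

    eps = 1e-6
    for _ in range(n_iter):
        # M-step: closed-form updates of pi and gamma given tau.
        pi = [sum(tau[i][k] for i in range(n)) / n for k in range(K)]
        gamma = [[0.5] * K for _ in range(K)]
        for k in range(K):
            for l in range(K):
                num = den = 0.0
                for i in range(n):
                    for j in range(n):
                        if i != j:
                            w = tau[i][k] * tau[j][l]
                            num += w * adj[i][j]
                            den += w
                g = num / den if den > 0 else 0.5
                gamma[k][l] = min(max(g, eps), 1 - eps)

        # VE-step: one sweep of the mean-field fixed-point equation on tau.
        for i in range(n):
            logt = []
            for k in range(K):
                s = math.log(max(pi[k], eps))
                for j in range(n):
                    if j == i:
                        continue
                    for l in range(K):
                        p = gamma[k][l]
                        s += tau[j][l] * math.log(p if adj[i][j] else 1 - p)
                logt.append(s)
            top = max(logt)
            w = [math.exp(v - top) for v in logt]
            tot = sum(w)
            tau[i] = [v / tot for v in w]
    return tau, pi, gamma

# Toy data (hypothetical): two planted groups of 15 nodes each.
rng = random.Random(42)
n = 30
truth = [0] * 15 + [1] * 15
y = {(i, j): int(rng.random() < (0.7 if truth[i] == truth[j] else 0.05))
     for i in range(n) for j in range(i + 1, n)}
tau, pi, gamma = vem_sbm(y, n, K=2)
labels = [max(range(2), key=lambda k: tau[i][k]) for i in range(n)]
print(labels)
```

Each sweep costs on the order of $n^2 K^2$ operations, in contrast with the $K^n$ terms of the exact conditional distribution. Like any EM-type algorithm, the result depends on the initialisation, and block labels are only defined up to a permutation.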
VBEM inference for SBM: E. coli's operon network
[Picard et al. (2009)]
Meta-graph representation.
Accuracy of VBEM estimates for SBM: Simulation study
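[Figure: credibility intervals for $\pi_1$ (+), $\gamma_{11}$ (△), $\gamma_{12}$ (◦), $\gamma_{22}$ (•)]
[Figure: width of the posterior credibility intervals for $\pi_1$, $\gamma_{11}$, $\gamma_{12}$, $\gamma_{22}$]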
First half summary
Latent space graph models are useful to describe network heterogeneity.
Their statistical inference raises some specific issues.
Variational approximations help to circumvent these issues.
And also
Theoretical justifications of these approximations exist for SBM: [Celisse et al. (2012)], [Mariadassou and Matias (2014)]
VEM and VBEM algorithms have been specifically developed for SBM: [Daudin et al. (2008)], [Latouche et al. (2012)]
From SBM to W-graph: Averaging models
SBM as a W-graph model
Latent variables: $(Z_i)$ iid $\sim \mathcal{M}(1, \pi)$
Blockwise constant graphon: $\gamma(z, z') = \gamma_{k\ell}$
Edges: $\Pr\{Y_{ij} = 1\} = \gamma(Z_i, Z_j)$
[Figure: graphon function $\gamma^{SBM}_K(z, z')$]
Variational Bayes estimation of $\gamma(z, z')$
VBEM inference provides the approximate posteriors:
$(\pi \mid Y) \approx \mathrm{Dir}(\pi^*)$
$(\gamma_{k\ell} \mid Y) \approx \mathrm{Beta}(\gamma^{0*}_{k\ell}, \gamma^{1*}_{k\ell})$
Estimate of $\gamma(u, v)$. Due to the uncertainty of the $\pi_k$, the posterior mean of $\gamma^{SBM}_K$ is smooth (explicit integration using [Gouda and Szántai (2010)]).
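The smoothing effect can be illustrated numerically. The sketch below builds the blockwise-constant graphon of an SBM (block $k$ occupying an interval of width $\pi_k$, which is an assumed convention) and approximates its posterior mean by plain Monte Carlo. This replaces the explicit integration mentioned above with simulation, and all posterior parameters are hypothetical. The order of the two Beta parameters is also an assumption: here the first one is the pseudo-count of edges.

```python
import bisect
import itertools
import random

def sbm_graphon(pi, gamma):
    """Blockwise-constant graphon: block k occupies an interval of width pi[k]."""
    bounds = list(itertools.accumulate(pi))
    def block(u):
        return min(bisect.bisect_left(bounds, u), len(pi) - 1)
    return lambda u, v: gamma[block(u)][block(v)]

def posterior_mean_graphon(u, v, dir_params, beta_params, n_draws=2000, seed=0):
    """Monte Carlo posterior mean of gamma(u, v) under Dirichlet/Beta posteriors."""
    rng = random.Random(seed)
    K = len(dir_params)
    total = 0.0
    for _ in range(n_draws):
        # pi ~ Dirichlet(dir_params), via normalised Gamma draws.
        g = [rng.gammavariate(a, 1.0) for a in dir_params]
        s = sum(g)
        pi = [x / s for x in g]
        # gamma_kl ~ Beta(a, b), symmetric in (k, l).
        gamma = [[0.0] * K for _ in range(K)]
        for k in range(K):
            for l in range(k, K):
                a, b = beta_params[k][l]
                gamma[k][l] = gamma[l][k] = rng.betavariate(a, b)
        total += sbm_graphon(pi, gamma)(u, v)
    return total / n_draws

# Hypothetical approximate posteriors for K = 2.
dir_params = [30.0, 20.0]
beta_params = [[(40.0, 10.0), (2.0, 48.0)],
               [(2.0, 48.0), (10.0, 40.0)]]
for u in (0.2, 0.55, 0.6, 0.65, 0.9):
    print(u, round(posterior_mean_graphon(u, u, dir_params, beta_params), 3))
```

Along the diagonal the estimate moves gradually from the first block's value to the second block's value around $u \approx 0.6$, instead of jumping, because the block boundary itself is uncertain.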
Bayesian model averaging
Bayesian model averaging (BMA). Consider a series of models $1, \dots, K, \dots$ in which a certain function of the parameter $f(\theta)$ can always be defined.
Bayesian inference within each model $K$ provides the posterior
$P(\theta \mid K, Y) \to P(f(\theta) \mid K, Y)$.
BMA [Hoeting et al. (1999)] relies on the marginal posterior of $f(\theta)$:
$P(f(\theta) \mid Y) = \sum_K P(K \mid Y)\, P(f(\theta) \mid K, Y)$
Variational Bayes model averaging
Pushing it further: Consider the model $K$ as an additional hidden variable:
$P(Z, \theta, K \mid Y) \approx \tilde P(Z, \theta, K) := \tilde P(Z \mid K) \times \tilde P(\theta \mid K) \times \tilde P(K)$
Note that no additional independence assumption is needed.
Variational Bayes model averaging (VBMA). The optimal approximation of $P(K \mid Y)$ satisfies [Volant et al. (2012)]:
$\tilde P(K) \propto P(K)\, e^{\log P(Y \mid K) - KL(K)} \propto P(K \mid Y)\, e^{-KL(K)}$
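In practice the exponent $\log P(Y \mid K) - KL(K)$ is the variational lower bound returned by the fit of model $K$, so the weights can be computed by normalising on the log scale. The sketch below assumes such lower bounds are available. The numbers are made up and the function name is illustrative.

```python
import math

def vbma_weights(lower_bounds, log_prior=None):
    """Weights proportional to P(K) * exp(lower bound), normalised stably."""
    ks = sorted(lower_bounds)
    if log_prior is None:
        log_prior = {k: 0.0 for k in ks}  # uniform prior over the candidate K
    scores = [log_prior[k] + lower_bounds[k] for k in ks]
    top = max(scores)  # subtract the maximum to avoid underflow in exp
    w = [math.exp(s - top) for s in scores]
    total = sum(w)
    return {k: x / total for k, x in zip(ks, w)}

# Hypothetical lower bounds log P(Y | K) - KL(K) for K = 1..4.
bounds = {1: -1250.0, 2: -1180.0, 3: -1178.5, 4: -1181.0}
weights = vbma_weights(bounds)
print({k: round(w, 3) for k, w in weights.items()})
```

Subtracting the largest score before exponentiating matters because lower bounds are typically large negative numbers whose exponentials would underflow to zero.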
Inferring the graphon function
Model averaging: There is no 'true $K$' in the W-graph model.
Apply the VBMA recipe to $\gamma(z, z')$. For $K = 1..K_{max}$, fit an SBM model via VBEM and compute
$\hat\gamma^{SBM}_K(z, z') = \tilde E[\gamma_{C(z), C(z')} \mid Y, K]$.
Then perform model averaging as
$\hat\gamma(z, z') = \tilde E[\gamma_{C(z), C(z')} \mid Y] = \sum_K \tilde P(K)\, \hat\gamma^{SBM}_K(z, z')$ [Latouche and R. (2013)].
PPI network
Like many PPI networks, E. coli's network is highly concentrated around few
Ecological network between fungal species
Brain network
Blog network (non-biological)
Goodness-of-fit
Motifs frequency
Network motifs have a biological (or sociological) interpretation in terms of building blocks of the global network.
→ Triangles = 'friends of my friends are my friends'.
Latent space graph models only describe binary interactions, conditional on the latent positions.
Moments of motif counts
Moments under SBM: The first moments $E N(m)$, $V N(m)$ of the count are known for exchangeable graph models (incl. SBM) [Picard et al. (2008)]:
$E_{SBM} N(m) \propto \mu_{SBM}(m) =: f(\theta_{SBM})$
where $\mu_{SBM}(m)$ is the motif occurrence probability under SBM.
Moments under W-graph: The motif probability under the W-graph can be estimated as
$\hat\mu(m) = \sum_K \tilde P(K)\, \tilde E(\mu_{SBM}(m) \mid Y, K)$
Estimates of $E_W N(m)$ and $V_W N(m)$ can be derived accordingly [Latouche and R.
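As an illustration of $\mu_{SBM}(m)$ for one specific motif, the occurrence probability of a triangle under a binary SBM is $\sum_{k,\ell,m} \pi_k \pi_\ell \pi_m \gamma_{k\ell} \gamma_{\ell m} \gamma_{km}$, and the expected count multiplies it by the number of node triples. The sketch below evaluates both. The parameter values are hypothetical and only the triangle motif is covered.

```python
import itertools
import math

def triangle_probability(pi, gamma):
    """Probability that three given nodes form a triangle under a binary SBM."""
    K = len(pi)
    return sum(pi[k] * pi[l] * pi[m] * gamma[k][l] * gamma[l][m] * gamma[k][m]
               for k, l, m in itertools.product(range(K), repeat=3))

def expected_triangles(n, pi, gamma):
    """Expected number of triangles: one term per unordered triple of nodes."""
    return math.comb(n, 3) * triangle_probability(pi, gamma)

# Hypothetical SBM parameters.
pi = [0.6, 0.4]
gamma = [[0.30, 0.02], [0.02, 0.20]]
print(triangle_probability(pi, gamma), expected_triangles(100, pi, gamma))
```

With a single block the formula reduces to $p^3$, the Erdős-Rényi value, which gives a quick sanity check.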
Network frequencies in the blog network
Motif   Count (×10³)   Mean (×10³)   Std. dev. (×10³)
1       29.7           39.7          8.3
2       3.8            4.6           1.3
3       608.7          968.3         336.8
4       279.8          428.9         154.0
5       47.4           74.5          35.1
6       270.5          397.0         177.0
7       62.1           87.8          47.4
8       6.5            8.8           5.4
Covariates: Tree interaction (valued) network
Data: $n = 51$ tree species, $Y_{ij}$ = number of shared parasites [Vacher et al. (2008)].
SBM: Given $Z_i = k$, $Z_j = \ell$,
$Y_{ij} \sim \mathcal{P}(e^{\gamma_{k\ell}})$,
$\gamma_{k\ell}$ = log-mean number of shared parasites.
Results: ICL selects $K = 7$ groups that are partly related to phyla.
$e^{\hat\gamma_{k\ell}}$   T1      T2      T3      T4      T5      T6      T7
T1        14.46   4.19    5.99    7.67    2.44    0.13    1.43
T2                14.13   0.68    2.79    4.84    0.53    1.54
T3                        3.19    4.10    0.66    0.02    0.69
T4                                7.42    2.57    0.04    1.05
T5                                        3.64    0.23    0.83
T6                                                0.04    0.06
T7                                                        0.27
$\hat\pi_k$ (%)   7.8     7.8     13.7    13.7    15.7    19.6    21.6
Accounting for the taxonomic distance
Model: $x_{ij}$ = distance$(i, j)$, $Y_{ij} \sim \mathcal{P}(e^{\gamma_{k\ell} + \beta x_{ij}})$ [Mariadassou et al. (2010)].
Results: $\hat\beta = -0.317$.
→ for $x = 3.82$, $e^{\hat\beta x} = 0.298$
→ The mean number of shared parasites decreases with taxonomic distance.
$e^{\hat\lambda_{k\ell}}$   T'1     T'2     T'3     T'4
T'1       0.75    2.46    0.40    3.77
T'2               4.30    0.52    8.77
T'3                       0.080   1.05
T'4                               14.22
$\hat\pi_k$ (%)   17.7    21.5    23.5    37.3
$\hat\beta$       -0.317
→ Groups are no longer associated with the phylogenetic structure.
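The figure quoted for $x = 3.82$ is the multiplicative effect of the covariate on the Poisson mean $e^{\gamma_{k\ell} + \beta x_{ij}}$. It can be checked with a two-line computation, where the variable names are illustrative:

```python
import math

beta_hat = -0.317   # estimated regression coefficient reported above
x = 3.82            # taxonomic distance used in the example above

# Multiplicative effect of the covariate on the Poisson mean exp(gamma_kl + beta * x).
effect = math.exp(beta_hat * x)
print(round(effect, 3))  # 0.298
```

At this taxonomic distance, two species are expected to share about 30% of the parasites they would share at distance 0 within the same pair of groups.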
’Residual’ graphon
A simple graph model with covariates. When edge covariates $x_{ij}$ are available, simply fit a logistic regression [Pattison and Robins (2007)]:
$(Y_{ij})$ independent, $Y_{ij} \sim \mathcal{B}(p_{ij})$, $\mathrm{logit}\, p_{ij} = x_{ij}^\top \beta$.
Introducing a residual term. To assess the fit of the model, simply add a residual graphon-like term:
$(Z_i)$ iid $\mathcal{U}[0,1]$, $Y_{ij} \mid Z_i, Z_j \sim \mathcal{B}(p_{ij})$, $\mathrm{logit}\, p_{ij} = x_{ij}^\top \beta + \gamma(Z_i, Z_j)$.
→ A VBEM algorithm can be designed to get $\tilde P(\beta, \theta, Z) \approx P(\beta, \theta, Z \mid Y)$.
Tree network
Binary version: Links between tree species if they host at least one common fungal parasite.
Regression: covariates = genetic distance, taxonomic distance, geographic distance
Blog network
Blog network: Already shown.
Regression: covariates = same political party, pair includes a journalist
Conclusion & future work
Some conclusions.
The graphon provides a representation of the network topology.
It can be estimated using variational Bayes inference.
→R packages ’mixer’ and ’blockmodels’
It can be combined with covariates as a residual term
Future work.
Formal goodness-of-fit test
Quality of variational Bayes estimates in SBM with covariates
Airoldi, E. M., Costa, T. B. and Chan, S. H. (2013). Stochastic blockmodel approximation of a graphon: Theory and consistent estimation. In Advances in Neural Information Processing Systems, 692–700.
Beal, M. J. and Ghahramani, Z. (2003). The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures. Bayes. Statist. 7 543–52.
Bollobás, B., Janson, S. and Riordan, O. (2007). The phase transition in inhomogeneous random graphs. Rand. Struct. Algo. 31 (1) 3–122.
Celisse, A., Daudin, J.-J. and Pierre, L. (2012). Consistency of maximum-likelihood and variational estimators in the stochastic block model. Electron. J. Statis. 6 1847–99.
Chatterjee, S. et al. (2014). Matrix estimation by universal singular value thresholding. The Annals of Statistics. 43 (1) 177–214.
Daudin, J.-J., Picard, F. and Robin, S. (2008). A mixture model for random graphs. Stat. Comput. 18 (2) 173–83.
Daudin, J.-J., Pierre, L. and Vacher, C. (2010). Model for heterogeneous random networks using continuous latent variables and an application to a tree–fungus network. Biometrics. 66 (4) 1043–1051.
Diaconis, P. and Janson, S. (2008). Graph limits and exchangeable random graphs. Rend. Mat. Appl. 7 (28) 33–61.
Gazal, S., Daudin, J.-J. and Robin, S. (2012). Accuracy of variational estimates for random graph mixture models. Journal of Statistical Computation and Simulation. 82 (6) 849–862.
Girvan, M. and Newman, M. E. J. (2002). Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA. 99 (12) 7821–6.
Gouda, A. and Szántai, T. (2010). On numerical calculation of probabilities according to Dirichlet distribution. Ann. Oper. Res. 177 185–200. doi:10.1007/s10479-009-0601-9.
Handcock, M., Raftery, A. and Tantrum, J. (2007). Model-based clustering for social networks. JRSSA. 170 (2) 301–54. doi:10.1111/j.1467-985X.2007.00471.x.
Jaakkola, T. S. and Jordan, M. I. (2000). Bayesian parameter estimation via variational methods. Statistics and Computing. 10 (1) 25–37.
Latouche, P., Birmelé, E. and Ambroise, C. (2012). Variational Bayesian inference and complexity control for stochastic block models. Statis. Model. 12 (1) 93–115.
Latouche, P. and Robin, S. (2013). Bayesian model averaging of stochastic block models to estimate the graphon function and motif frequencies in a W-graph model. Technical report, arXiv:1310.6150.
Lauritzen, S. (1996). Graphical Models. Oxford Statistical Science Series. Clarendon Press.
Lovász, L. and Szegedy, B. (2006). Limits of dense graph sequences. Journal of Combinatorial Theory, Series B. 96 (6) 933–957.
von Luxburg, U., Belkin, M. and Bousquet, O. (2008). Consistency of spectral clustering. Ann. Stat. 36 (2) 555–586.
Mariadassou, M., Robin, S. and Vacher, C. (2010). Uncovering structure in valued graphs: a variational approach. Ann. Appl. Statist. 4 (2) 715–42.
Mariadassou, M. and Matias, C. (2014). Convergence of the groups posterior distribution in latent or stochastic block models. Bernoulli. To appear.
Matias, C. and Robin, S. (2014). Modeling heterogeneity in random graphs through latent space models: a selective review. ESAIM: Proc. 47 55–74.
Newman, M. and Girvan, M. (2004). Finding and evaluating community structure in networks. Phys. Rev. E. 69 026113.
Newman, M. E. J. (2004). Fast algorithm for detecting community structure in networks. Phys. Rev. E. 69 066133.
Nowicki, K. and Snijders, T. (2001). Estimation and prediction for stochastic block-structures. J. Amer. Statist. Assoc. 96 1077–87.
Olhede, S. C. and Wolfe, P. J. (2014). Network histograms and universality of blockmodel approximation. Proceedings of the National Academy of Sciences. 111 (41) 14722–14727.
Pattison, P. E. and Robins, G. L. (2007). Handbook of Probability Theory with Applications, chapter Probabilistic Network Theory. Sage Publication.
Picard, F., Miele, V., Daudin, J.-J., Cottret, L. and Robin, S. (2009). Deciphering the connectivity structure of biological networks using MixNet. BMC Bioinformatics. Suppl 6 S17. doi:10.1186/1471-2105-10-S6-S17.
Vacher, C., Piou, D. and Desprez-Loustau, M.-L. (2008). Architecture of an antagonistic tree/fungus network: The asymmetric influence of past evolutionary history. PLoS ONE. 3 (3) e1740. doi:10.1371/journal.pone.0001740.
Volant, S., Magniette, M.-L. M. and Robin, S. (2012). Variational Bayes approach for model aggregation in unsupervised classification with Markovian dependency. Comput. Statis. & Data Analysis. 56 (8) 2375–2387.
Wainwright, M. J. and Jordan, M. I. (2008). Graphical models, exponential families, and variational inference. Found. Trends Mach. Learn. 1 (1–2) 1–305. http://dx.doi.org/10.1561/2200000001.