Solving Relational Learning Tasks - A Three-Way Model for Relational Learning

2. A Three-Way Model for Relational Learning

2.3. Solving Relational Learning Tasks

likely to exist, whereas a relationship from entity₁ to entity₃ is very unlikely. This modeling is very similar to the modeling of stochastic equivalence employed in IRM and stochastic block models.

Homophily. It has been discussed previously that the similarity of entities in the latent space $A$ reflects their similarity in a relational domain. In RESCAL, homophily can therefore be modeled by a nearly diagonal interaction matrix $R_k$. Consider latent representations of entities, all of similar length $\|a_i\| \approx \alpha$. It follows from $R_k \approx I$ that

$$a_i^T R_k a_j \approx \alpha^2 \, \frac{a_i^T a_j}{\|a_i\| \, \|a_j\|},$$

such that the probability of a relationship $x_{ijk}$ depends on the cosine similarity of $a_i$ and $a_j$. Furthermore, the magnitude of the diagonal entries in $R_k$ can model the importance of a latent component for the homophily pattern in a particular relation. For instance, consider the following interaction matrix $R_k$ and entity representations $a_1, a_2, a_3$:

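$$R_k = \begin{bmatrix} 0.8 & 0.1 \\ 0.1 & 0.9 \end{bmatrix}, \quad a_1 = \begin{bmatrix} 0.6 \\ 0.4 \end{bmatrix}, \quad a_2 = \begin{bmatrix} 0.2 \\ 0.8 \end{bmatrix}, \quad a_3 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}$$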

As the latent representations of entity₁ and entity₃ are more similar than the representations of entity₁ and entity₂, it is more likely that a relationship exists between the former than between the latter pair of entities.
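The comparison in this example can be checked numerically. The following is a minimal sketch in plain Python that evaluates the bilinear score $a_i^T R_k a_j$ and the cosine similarity for both pairs of entities. The helper names `score` and `cosine` are illustrative choices and do not come from the original text; the numbers are those of the example.

```python
import math

# Interaction matrix and entity representations from the example.
R_k = [[0.8, 0.1],
       [0.1, 0.9]]
a = {1: [0.6, 0.4], 2: [0.2, 0.8], 3: [0.5, 0.5]}


def score(a_i, R, a_j):
    """Bilinear score a_i^T R a_j (hypothetical helper name)."""
    return sum(a_i[p] * R[p][q] * a_j[q]
               for p in range(len(a_i))
               for q in range(len(a_j)))


def cosine(u, v):
    """Cosine similarity of two vectors (hypothetical helper name)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


s13 = score(a[1], R_k, a[3])  # entity 1 and entity 3
s12 = score(a[1], R_k, a[2])  # entity 1 and entity 2
c13 = cosine(a[1], a[3])
c12 = cosine(a[1], a[2])

print("score(1,3) =", round(s13, 4), " cosine(1,3) =", round(c13, 4))
print("score(1,2) =", round(s12, 4), " cosine(1,2) =", round(c12, 4))
```

The scores come out at approximately 0.47 for the pair (entity₁, entity₃) and 0.44 for the pair (entity₁, entity₂), which agrees with the ordering of the cosine similarities.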

In general, the entries in the latent components of $A$ and $R$ are not confined to the interval [0, 1] but are allowed to take negative values as well as values larger than one. Although this complicates the interpretation of the latent representations, it does not alter the general ability of RESCAL to model such patterns.

2.3. Solving Relational Learning Tasks

Given the factorization of an adjacency tensor, RESCAL can be used to approach all relational learning tasks outlined in section 1.2.2 as follows:

L P

For link prediction, the task is to predict $P(x_{ijk} = 1)$. As discussed in section 2.2, under a least-squares loss function, the entries in the reconstruction $\hat{X} = R \times_1 A \times_2 A$ are not confined to the interval [0, 1] and do not have an easy interpretation as probabilities. However, usually we are not interested in exact probabilities, but only in a ranking of relationships relative to their likelihood. In such cases, we interpret the entries of $\hat{X}$ as confidence values that a particular relationship exists, meaning that the probability of a relationship is set to be proportional to the corresponding entry in $\hat{X}$, i.e. $P(x_{ijk} = 1 \mid a_i^T R_k a_j) \propto a_i^T R_k a_j$. Given a reconstruction $\hat{X}$, we then simply rank relationships by these confidence values. A similar approach has successfully been used by Liben-Nowell and J. Kleinberg (2007) for link prediction on social networks via SVD, as well as by Huang et al. (2011) for relational learning via matrix factorization. If it is necessary to obtain valid probabilities, we apply an additional post-processing step, which has been introduced by Platt (1999). Let $\mathrm{sig}_\varepsilon : \mathbb{R} \to [0, 1]$ be the sigmoidal transfer function, which is defined as

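$$\mathrm{sig}_\varepsilon(x) := \begin{cases} \dfrac{\varepsilon}{e} \exp\!\left(\dfrac{x}{\varepsilon}\right), & \text{if } x \le \varepsilon \\[1ex] x, & \text{if } \varepsilon < x < 1 - \varepsilon \\[1ex] 1 - \dfrac{\varepsilon}{e} \exp\!\left(\dfrac{1 - x}{\varepsilon}\right), & \text{if } x \ge 1 - \varepsilon \end{cases}$$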

where $\varepsilon \in [0, 0.5]$ is a user-given parameter and $e$ denotes Euler's number. The probability of a relationship $x_{ijk}$ is then set to

$$P(x_{ijk} = 1 \mid a_i^T R_k a_j) = \mathrm{sig}_\varepsilon(a_i^T R_k a_j)$$

Please note that $\mathrm{sig}_\varepsilon(\cdot)$ is a monotone function, such that the relative ordering of the entries $a_i^T R_k a_j$ is preserved. The parameter $\varepsilon$ is determined via cross-validation.
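The post-processing step can be illustrated with a short sketch in plain Python. It implements the piecewise transfer function as read from the definition above and checks on a handful of made-up confidence values that the ranking is unchanged. The function name `sig_eps`, the relationship labels, the score values, and the choice of 0.1 for the parameter are assumptions made for illustration only; in particular, the sketch excludes a parameter value of exactly zero, since the formula would then divide by zero.

```python
import math


def sig_eps(x, eps):
    """Piecewise sigmoidal transfer function mapping a score into [0, 1]."""
    # Assumption of this sketch: eps = 0 is rejected to avoid division by zero.
    if not 0.0 < eps <= 0.5:
        raise ValueError("eps must lie in (0, 0.5]")
    if x <= eps:
        # Lower tail: decays smoothly towards 0 for very negative scores.
        return (eps / math.e) * math.exp(x / eps)
    if x >= 1.0 - eps:
        # Upper tail: approaches 1 for very large scores.
        return 1.0 - (eps / math.e) * math.exp((1.0 - x) / eps)
    # Middle region: the score is kept as it is.
    return x


# Hypothetical confidence values a_i^T R_k a_j for five candidate relationships.
scores = {"r1": -0.30, "r2": 0.05, "r3": 0.47, "r4": 0.97, "r5": 1.20}
eps = 0.1  # illustrative value; the text determines it via cross-validation

# Ranking by raw confidence values, highest first.
ranking_raw = sorted(scores, key=scores.get, reverse=True)

# Post-processed values and the ranking they induce.
probs = {name: sig_eps(s, eps) for name, s in scores.items()}
ranking_prob = sorted(probs, key=probs.get, reverse=True)

print(ranking_raw)
print(ranking_prob)
```

Both branches of the function meet the identity part at the two thresholds, which is why the mapping is continuous and order-preserving in this sketch.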

E R  LB C

Entity resolution and link-based clustering are both learning tasks which are defined over the similarity of entities in a relational domain, meaning that similar entities are assumed to be identical (entity resolution) or that similar entities are grouped in identical clusters (link-based clustering). To approach these learning tasks, or any other task that is defined over the relational similarity of entities, we make use of the fact that the latent space $A$ reflects the similarity of entities in the relational domain, as discussed in section 2.2. Since $A$ is a vector space representation of entities, any feature-based machine learning method such as k-means or even non-linear kernel methods can be applied to these tasks and still exploit the similarity of the entities in the relational domain.
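As a rough illustration of link-based clustering on the latent space, the sketch below runs a basic k-means (Lloyd's algorithm) over the rows of a small, made-up matrix $A$. The matrix values, the function names `sq_dist` and `kmeans`, and the deterministic initialization with the first $k$ rows are all assumptions of this example and are not taken from the original text; any other feature-based method could be substituted.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def kmeans(points, k, n_iter=20):
    """Basic Lloyd's algorithm; returns one cluster label per point."""
    # Assumption: the first k points serve as initial centers (deterministic).
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical latent representations: one row of A per entity.
A = [
    [0.90, 0.10],  # entity 0
    [0.80, 0.20],  # entity 1
    [0.10, 0.90],  # entity 2
    [0.20, 0.80],  # entity 3
    [0.85, 0.15],  # entity 4
    [0.15, 0.85],  # entity 5
]

labels, centers = kmeans(A, k=2)
print(labels)
```

For entity resolution, the same rows of $A$ could instead be compared pairwise, treating entities whose representations are nearly identical as candidates for the same underlying entity.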

C C

Collective classification can be approached in two alternative ways. One way is to cast collective classification as a link prediction problem, by introducing an additional classOf


In document Nickel, Maximilian (2013): Tensor factorization for relational learning. Dissertation, LMU München: Fakultät für Mathematik, Informatik und Statistik (Page 57-59)