3. Large-Scale Relational Learning and Application on Linked Data
3.8.2. Comparison to Semantic Web Machine Learning Methods
To the best of our knowledge, there have not yet been any attempts to apply machine learning methods that compute a full relational model to Linked Data of the size considered in this chapter. In the context of the Semantic Web, Inductive Logic Programming and kernel learning have been the dominant approaches to machine learning so far (Bloehdorn and Sure, 2007; d’Amato et al., 2008; Fanizzi et al., 2008). Further approaches to learning from Semantic Web data include the work by Lin et al. (2011), who proposed learning Relational Bayesian Classifiers for RDF data via queries to a SPARQL endpoint. SPARQL-ML (Kiefer et al., 2008) extends SPARQL queries to support data mining constructs. Bicer et al. (2011) employ a coevolution-based genetic algorithm to learn kernels for RDF data. Recently, methods such as association rule mining and knowledge base fragment extraction have been applied to large Semantic Web databases for tasks like schema induction and learning complex class descriptions (Voelker and Niepert, 2011; Hellmann et al., 2009). However, all of these methods either do not scale to large data sizes or do not compute a full relational model of a complete knowledge base. Huang et al. (2011) proposed SUNS, a regularized matrix factorization to predict unknown triples in Semantic Web data. It has been shown in Sections 2.6 and 3.7 that RESCAL can outperform this approach significantly on data where collective learning is important. Probably most similar to our approach are TripleRank (Franz et al., 2009) and TOPHITS (Kolda et al., 2005), which employ the CP decomposition for learning from Semantic Web data and multigraphs. The limited scalability and collective learning ability of CP and CP-ALS compared to RESCAL and RESCAL-ALS therefore also carry over to these models.
3.9. Summary
In this chapter, we demonstrated that tensor factorization in the form of the RESCAL decomposition is a suitable approach for relational learning from Linked Data and showed that the proposed approach can scale to large knowledge bases. Through a thorough analysis of the computational and memory complexity of RESCAL-ALS, we showed that a sparse implementation scales linearly with the size of the relational data. Furthermore, we derived a very efficient way to compute updates of the core tensor within RESCAL-ALS, which decreases the runtime complexity with regard to the number of latent components from O(r^5) to O(r^3) and the memory complexity from O(r^4) to O(r^2). To handle attributes of entities efficiently, we proposed coupled tensor-matrix factorization, which improves the scalability of the algorithm with regard to the number of attribute values in the data. We corroborated the theoretical analysis with experiments on synthetic data, where we showed that RESCAL-ALS scales linearly with the number of entities, predicates, and known facts in a relational data set and can factorize data consisting of millions of entities, hundreds of predicates, and billions of known facts. Furthermore, we showed that our approach is able to factorize the YAGO 2 core ontology and predict the types of entities for various higher-level classes in this large knowledge base. Experimentally, we also demonstrated the effectiveness of the RESCAL model for difficult tasks that are important to Linked Data, such as collective link prediction and taxonomy learning. In its present form, a limitation of RESCAL-ALS is its cubic runtime complexity with regard to the number of latent components. While the algorithm is certainly scalable enough to handle large knowledge bases that require only a few hundred latent components, a full relational model for very complex knowledge bases is currently out of reach.
An interesting direction for future work is therefore to investigate if the inclusion of prior knowledge in the factorization can reduce the number of latent components that is needed to model complex knowledge bases. Despite this current limitation, we think that the proposed method opens up an interesting perspective towards learning from complete knowledge bases in the Linked Data cloud.
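To illustrate how the cost of a core update can be reduced, the following sketch compares two ways of solving a regularized least-squares problem for a single core slice. It assumes that a relation slice X is modeled as A R Aᵀ with a ridge penalty `lam` on R. This model form, the function names, and the SVD-based route are assumptions made for illustration and need not match the derivation used in this chapter.

```python
import numpy as np

def core_update_naive(X, A, lam):
    """Ridge solution for R via the explicit Kronecker system (small r only)."""
    r = A.shape[1]
    Z = np.kron(A, A)                      # (n*n) x (r*r) design matrix
    x = X.flatten(order="F")               # column-major vec(X)
    lhs = Z.T @ Z + lam * np.eye(r * r)    # r^2 x r^2 system
    vec_r = np.linalg.solve(lhs, Z.T @ x)
    return vec_r.reshape((r, r), order="F")

def core_update_svd(X, A, lam):
    """Same minimizer, computed from the thin SVD of A with r x r matrices."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    ss = np.outer(s, s)                    # products sigma_i * sigma_j
    S = ss / (ss ** 2 + lam)               # elementwise shrinkage weights
    return Vt.T @ (S * (U.T @ X @ U)) @ Vt

# Hypothetical toy data: 8 entities, 3 latent components, one sparse relation.
rng = np.random.default_rng(0)
n, r, lam = 8, 3, 0.1
A = rng.standard_normal((n, r))
X = (rng.random((n, n)) < 0.2).astype(float)

R_naive = core_update_naive(X, A, lam)
R_svd = core_update_svd(X, A, lam)
print(np.allclose(R_naive, R_svd))
```

The naive variant builds and solves an r² × r² system, whereas the SVD variant only works with r × r matrices once Uᵀ X U has been formed.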
Chapter 4
An Analysis of Tensor Models for Learning on Structured Data
While tensor factorizations have become increasingly popular for learning on various forms of structured data, only very few theoretical results exist on the generalization abilities of these methods. In this chapter, we will discuss the tensor product as a principled way to represent structured data in vector spaces for machine learning tasks. By extending known bounds for matrix factorizations, we will derive the first known generalization error bounds for tensor factorizations in a classification setting. This setting also subsumes link prediction on multi-relational data consisting of n-ary relations. Furthermore, we will examine analytically and experimentally how tensor factorization behaves when applied to over- and understructured representations, for instance, when two-way tensor factorization, i.e. matrix factorization, is applied to three-way tensor data.
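As a hedged illustration of the understructured case mentioned above, the sketch below unfolds a small three-way tensor into a matrix and applies a truncated SVD as a stand-in for matrix factorization. The tensor shape, the rank, and the helper names `unfold_mode1` and `truncated_svd_reconstruct` are assumptions chosen for the example and are not taken from the chapter.

```python
import numpy as np

def unfold_mode1(T):
    """Mode-1 unfolding: place the frontal slices T[:, :, j] side by side."""
    k = T.shape[2]
    return np.concatenate([T[:, :, j] for j in range(k)], axis=1)

def truncated_svd_reconstruct(M, rank):
    """Best rank-`rank` approximation of M in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

# Hypothetical toy data: 5 entities, 3 relations, sparse binary entries.
rng = np.random.default_rng(0)
T = (rng.random((5, 5, 3)) < 0.3).astype(float)

M = unfold_mode1(T)                                # shape (5, 15)
M_hat = truncated_svd_reconstruct(M, rank=2)       # two-way model of three-way data
M_full = truncated_svd_reconstruct(M, rank=min(M.shape))
print(M.shape, np.linalg.norm(M - M_hat))
```

After unfolding, every pair of column entity and relation becomes an independent column of the matrix, so the two-way model no longer ties together the representations of one entity across different relations.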
4.1. Introduction
Learning from structured data is a very active line of research in a variety of fields, including social network analysis, natural language processing, bioinformatics, and artificial intelligence. While tensor factorizations have a long tradition in psycho- and chemometrics, only more recently have they been applied to various tasks on structured data in machine learning. Examples include link prediction and entity resolution on multi-relational data and large knowledge bases as discussed in this thesis or in (Jenatton et al., 2012; Bordes et al., 2011), item recommendation on sequential data (Rendle et al., 2010; Rettinger et al., 2012), or the