5.3 Experiments
5.3.3 Case Study
Lastly, we provide some case studies on hyperparameter values, and some examples of relation prediction.
5.3.3.1 Hyperparameter study
We examine the hyperparameter α1, which controls the trade-off between CSM and HM. The results for relation prediction on YG15k are shown in Fig. 5.3. Although enabling HM with even a small value of α1 noticeably improves the performance of On2Vec, the influence of different values of α1 is not very notable, and the accuracy does not always increase with higher α1. In practice, α1 may be fine-tuned for marginal improvement, while α1 = 0.75 can be selected empirically.
5.3.3.2 Examples of relation prediction
Relation prediction is also performed on the complete data sets of CN30k and YG60k. To do so, we randomly select 20 million pairs of unlinked concepts from these two data sets, and rank all the predictions by the dissimilarity score Sd. The top-ranked predictions are then selected. Human evaluation is used in this procedure, since there is no ground truth for relation facts that do not already exist in the data. Like previous works (Lin et al., 2016; Zeng et al., 2015), we report P@200, i.e. the precision on the 200 predictions with the highest confidence, which is 73% and 71%, respectively. Some examples of top-ranked predictions are shown in Table 5.5.
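The P@k measure used above can be illustrated with a minimal Python sketch. It assumes that a lower dissimilarity score means higher confidence, and that human judgments are available as one boolean per prediction. The function name `precision_at_k` and the toy scores are hypothetical and are not taken from the On2Vec implementation.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float],
                   is_correct: Sequence[bool],
                   k: int) -> float:
    """Precision over the k predictions with the lowest dissimilarity score.

    Assumption: a lower score means a more confident prediction.
    """
    if len(scores) != len(is_correct):
        raise ValueError("scores and is_correct must have the same length")
    if k <= 0 or not scores:
        raise ValueError("k must be positive and scores must be non-empty")
    # Rank prediction indices by ascending dissimilarity (most confident first).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    top = ranked[:k]
    # Fraction of the top-k predictions that annotators judged correct.
    return sum(1 for i in top if is_correct[i]) / len(top)


# Toy data (illustrative only): five predictions with human judgments.
toy_scores = [0.9, 0.1, 0.4, 0.2, 0.7]
toy_labels = [False, True, False, True, True]
p_at_3 = precision_at_k(toy_scores, toy_labels, k=3)
```

In this toy case the three most confident predictions have scores 0.1, 0.2 and 0.4, of which two were judged correct, so `p_at_3` is 2/3. With k = 200 and annotator judgments over the top-ranked predictions, the same computation gives P@200.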
5.4 Conclusion
This chapter proposes a greatly improved translation-based graph embedding method that helps ontology population by way of relation prediction. The proposed On2Vec model can effectively address the learning issues on the two categories of complex semantic relations in ontology graphs, and improves previous methods using two dedicated component models. Extensive experiments on four data sets show promising capability of On2Vec on predicting and verifying relation facts. The results here are very encouraging, but we also point out opportunities for further work and improvements. In particular, we should explore the effects of other possible forms of component-specific projections, such as dynamic mapping matrices and bilinear mappings. Encoding other information such as the domain and range information of concepts may also improve the precision of our tasks. More advanced applications may also be developed using On2Vec, such as ontology-boosted question answering.
CHAPTER 6
Embedding Uncertain Knowledge Graphs
Uncertain knowledge graphs associate every relation fact with a confidence score that represents the likelihood of that relation fact. Examples of such knowledge graphs include the commonsense knowledge graphs Probase (Wu et al., 2012) and NELL (Mitchell et al., 2018), and the biological knowledge graphs STRING (Szklarczyk et al., 2016) and SKEMPI (Moal and Fernández-Recio, 2012). In this chapter, we propose a representation learning method for uncertain knowledge graphs (Chen et al., 2019c).
6.1 Introduction
While current methods focus on embedding deterministic knowledge, it is critical to incorporate uncertainty information into knowledge sources for several reasons. First, uncertainty is inherent in many forms of knowledge. An example of naturally uncertain knowledge is the interactions between proteins. As molecular reactions are random processes, biologists label protein-protein interactions with confidence according to the evidence for their occurrence, and represent them as uncertain knowledge graphs called protein-protein interaction (PPI) graphs. Second, uncertainty enhances inference in knowledge-driven applications. For example, short text understanding often entails interpreting real-world concepts that are ambiguous or intrinsically vague. The probabilistic knowledge graph Probase (Wu et al., 2012) provides a prior probability distribution of concepts for different English terms, and such probabilistic representations have critically supported short text understanding tasks involving disambiguation (Wang and Wang, 2016; Wang et al., 2015).
Besides, uncertain knowledge representations have benefited various other applications, such as question answering (Yih et al., 2013) and named entity recognition (Ratinov and Roth, 2009).
Capturing the quantified uncertainty information with multi-relational embeddings remains an unresolved problem. This is a non-trivial task for several reasons. First, to capture uncertainty, the embeddings need to encode additional information, as deterministic knowledge graph embedding methods only reflect whether a relation exists between entities. Second, while deterministic knowledge graph embedding models aim to minimize the estimated probability of false triples via negative sampling to enhance model learning, there is no clear border between observed low-confidence relation facts and unseen relation facts. Embedding models for deterministic knowledge graphs assume that all unseen relation facts are false. Therefore, they maximize the probability of observed training cases, and minimize the probability of unseen relation facts. However, since knowledge graphs are far from complete, unseen relation facts can represent unknown positive cases. This problem is especially significant for uncertain knowledge graphs. Existing techniques hence fall short of differentiating low-confidence relation facts from unseen relation facts.
To address the above issues, we propose a new embedding model called UKGE (Uncertain Knowledge Graph Embedding), which learns embeddings of entities and relations on uncertain knowledge graphs according to confidence scores. To enhance the precision of UKGE for predicting the uncertainty of unseen relation facts, we incorporate probabilistic soft logic into the learning process, which seeks to propagate the confidence information of unseen relation facts. We define three variants of UKGE that differ in the mappings from the triple plausibility estimation to confidence scores. We conducted extensive experiments on three real-world uncertain knowledge graphs for three tasks: (i) confidence prediction seeks to predict confidence scores of unseen relation facts; (ii) relation fact ranking focuses on retrieving tail entities for the query (h, r, ?t), and ranking these retrieved tails in the right order; (iii) relation fact classification decides whether a given relation fact is a "strong" relation fact with high confidence.
6.2 Related Work
To the best of our knowledge, there has been no previous work on learning embeddings for uncertain knowledge graphs. Besides the deterministic knowledge graph embedding methods that have been discussed in Section 2.1, we discuss two lines of work that are closely related to this topic.
Uncertain Knowledge Graphs. An uncertain knowledge graph provides a confidence score along with every relation fact. The development of relation extraction and crowdsourcing in recent years has enabled the construction of large-scale uncertain knowledge bases. ConceptNet (Speer et al., 2017) is a multilingual uncertain knowledge graph for commonsense knowledge that is collected via crowdsourcing. The confidence scores in ConceptNet mainly come from the co-occurrence frequency of the labels in crowdsourced task results. Probase (Wu et al., 2012) consists of a universal probabilistic taxonomy that is built by relation extraction. Every fact in Probase is associated with a joint probability PisA(x, y). NELL (Mitchell et al., 2018) collects relation facts by reading web pages, and learns their confidence scores through semi-supervised learning with the Expectation-Maximization (EM) algorithm. The aforementioned uncertain knowledge graphs have enabled numerous knowledge-driven applications. For example, Wang and Wang (2016) utilize Probase to help understand short texts.
One recent work has proposed a matrix-factorization-based approach to embed uncertain networks (Hu et al., 2017). However, it cannot be generalized to embed uncertain knowledge graphs, as the model only considers the node proximity in such networks without explicit relations, and only generates the node embeddings. As far as we know, we are among the first to study the uncertain knowledge graph embedding problem.
Probabilistic Soft Logic. Probabilistic soft logic (PSL) (Kimmig et al., 2012) is a framework for probabilistic reasoning. A PSL program consists of a set of first-order logic rules with conjunctive bodies and single literal heads. PSL takes a confidence from the interval [0,1] as the soft truth value for every atom. It uses the Łukasiewicz t-norm (Lukasiewicz and Straccia, 2008) to determine the degree to which a ground rule is satisfied. PSL is widely used for Most Probable Explanation (MPE) inference and Maximum a Posteriori (MAP) inference on Hinge-Loss Markov Random Fields (HL-MRF) (Bach et al., 2013). PSL, in combination with HL-MRF, is widely used in probabilistic reasoning tasks, such as social-trust prediction and preference prediction (Bach et al., 2013, 2017). In this work, we adopt PSL to support the inference for unseen relation facts.
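To make the role of the Łukasiewicz t-norm concrete, the following Python sketch shows the standard Łukasiewicz relaxations of conjunction, disjunction and negation over soft truth values in [0,1], together with the distance to satisfaction of a ground rule body → head. The function names and the example confidences are illustrative assumptions; the sketch is not the PSL library API, nor necessarily the exact formulation used by UKGE.

```python
def luk_and(a: float, b: float) -> float:
    """Lukasiewicz t-norm (soft conjunction): max(0, a + b - 1)."""
    return max(0.0, a + b - 1.0)


def luk_or(a: float, b: float) -> float:
    """Lukasiewicz t-conorm (soft disjunction): min(1, a + b)."""
    return min(1.0, a + b)


def luk_not(a: float) -> float:
    """Soft negation: 1 - a."""
    return 1.0 - a


def distance_to_satisfaction(body: float, head: float) -> float:
    """How far the ground rule body -> head is from being satisfied.

    The rule is fully satisfied when the head is at least as true as
    the body; otherwise the distance is body - head.
    """
    return max(0.0, body - head)


# Hypothetical ground rule: fact1 AND fact2 -> fact3,
# with assumed confidences 0.9, 0.8 and 0.5 (illustrative values only).
body_truth = luk_and(0.9, 0.8)
rule_distance = distance_to_satisfaction(body_truth, 0.5)
```

Here the body has a soft truth value of about 0.7, so a head with confidence 0.5 leaves a distance to satisfaction of about 0.2. A head with confidence 0.7 or more would satisfy the rule completely, giving a distance of 0.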