Unifying Knowledge Base Completion with PU Learning to Mitigate the Observation Bias


  • Jonas Schouterden KU Leuven Leuven.AI
  • Jessa Bekker KU Leuven Leuven.AI
  • Jesse Davis KU Leuven Leuven.AI
  • Hendrik Blockeel KU Leuven Leuven.AI




Data Mining & Knowledge Management (DMKM), Machine Learning (ML)


Methods for Knowledge Base Completion (KBC) reason about a knowledge base (KB) in order to derive new facts that should be included in the KB. This is challenging for two reasons. First, KBs only contain positive examples, which complicates model evaluation because evaluation needs both positive and negative examples. Second, the facts that were selected for inclusion in the KB are most likely not an i.i.d. sample of the true facts, due to the way KBs are constructed. In this paper, we focus on rule-based approaches, which traditionally address the first challenge by making assumptions that enable identifying negative examples, which in turn makes it possible to compute a rule's confidence or precision. However, they largely ignore the second challenge, which means that their estimates of a rule's confidence can be biased. This paper approaches rule-based KBC through the lens of PU learning (learning from positive and unlabeled data), which can cope with both challenges. We make three contributions: (1) We provide a unifying view that formalizes the relationship between multiple existing confidence measures based on (i) what assumption they make about the selection mechanism and (ii) how their accuracy depends on it. (2) We introduce two new confidence measures that can mitigate known biases by using propensity scores that quantify how likely a fact is to be included in the KB. (3) We show through theoretical and empirical analysis that taking the bias into account improves the confidence estimates, even when the propensity scores are not known exactly.
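
To make the role of propensity scores concrete, the sketch below contrasts a confidence estimate that treats every prediction missing from the KB as false (a closed-world style assumption) with a generic inverse-propensity-weighted estimate of the kind used in PU learning. It is an illustration under stated assumptions, not the confidence measures introduced in the paper: the function names, the toy facts, and the propensity values are all hypothetical, and the propensity of each observed fact is assumed to be known.

```python
# Illustrative sketch only: names, facts and propensities are hypothetical,
# and this is a generic inverse-propensity-weighted estimator, not
# necessarily the measures proposed in the paper.

def closed_world_confidence(predictions, kb):
    """Fraction of a rule's predictions found in the KB.

    Every prediction absent from the KB is counted as a negative example.
    """
    if not predictions:
        raise ValueError("the rule makes no predictions")
    return sum(1 for fact in predictions if fact in kb) / len(predictions)


def propensity_weighted_confidence(predictions, kb, propensity):
    """Inverse-propensity-weighted confidence estimate.

    Each prediction observed in the KB is weighted by 1 / e(fact), where
    e(fact) is the assumed probability that a true fact was selected for
    inclusion in the KB. Unobserved predictions contribute zero.
    """
    if not predictions:
        raise ValueError("the rule makes no predictions")
    total = 0.0
    for fact in predictions:
        if fact in kb:
            e = propensity[fact]
            if not 0.0 < e <= 1.0:
                raise ValueError(f"propensity for {fact} must be in (0, 1]")
            total += 1.0 / e
    return total / len(predictions)


# Toy data: four facts predicted by one rule, two of them present in the KB.
predictions = [
    ("anna", "livesIn", "leuven"),
    ("bart", "livesIn", "gent"),
    ("carl", "livesIn", "brugge"),
    ("dina", "livesIn", "namur"),
]
kb = {predictions[0], predictions[1]}

# Hypothetical propensities: the first fact is always recorded when true,
# the second only half of the time.
propensity = {predictions[0]: 1.0, predictions[1]: 0.5}

cwa_estimate = closed_world_confidence(predictions, kb)
ipw_estimate = propensity_weighted_confidence(predictions, kb, propensity)
print(cwa_estimate, ipw_estimate)
```

In this toy setting the weighted estimate is higher than the closed-world one, because a fact that is recorded only half of the time when true is counted as standing in for two true facts. If every propensity equals 1, the two estimates coincide.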




How to Cite

Schouterden, J., Bekker, J., Davis, J., & Blockeel, H. (2022). Unifying Knowledge Base Completion with PU Learning to Mitigate the Observation Bias. Proceedings of the AAAI Conference on Artificial Intelligence, 36(4), 4137-4145. https://doi.org/10.1609/aaai.v36i4.20332



AAAI Technical Track on Data Mining and Knowledge Management