Relation Extraction

Relation Extraction (RE) is the task of detecting relations between entities in (unstructured or unlabelled) text. RE is an actively researched field: recent years have brought many interesting papers and promising algorithms, along with a multitude of high-quality datasets.

This repository's goal is to provide an overview of the current research challenges and how they are addressed.

If a link is broken, feel free to update it and open a pull request. (Or just notify me, that's fine too.)

Surveys:

  1. (Bach and Badaskar, 2007) A Review of Relation Extraction
  2. (de Abreu et al., 2013) A review on Relation Extraction with an eye on Portuguese
  3. (Konstantinova, 2014) Review of Relation Extraction Methods: What is New Out There?
  4. (Asghar, 2016) Automatic Extraction of Causal Relations from Natural Language Texts: A Comprehensive Survey
  5. (Kumar, 2017) A Survey of Deep Learning Methods for Relation Extraction
  6. (Pawar et al., 2017) Relation extraction: A survey
  7. (Cui et al., 2017) A Survey on Relation Extraction
  8. (Chakraborty et al., 2019) Introduction to Neural Network based Approaches for Question Answering over Knowledge Graphs
  9. (Han et al., 2020) More Data, More Relations, More Context and More Openness: A Review and Outlook for Relation Extraction
  10. (Fu et al., 2020) A Survey on Complex Question Answering over Knowledge Base: Recent Advances and Challenges
  11. (Yang et al., 2021) A Survey on Extraction of Causal Relations from Natural Language Text
  12. (Nayak et al., 2021) Deep Neural Approaches to Relation Triplets Extraction: A Comprehensive Survey
  13. (Wang et al., 2021) Deep Neural Network Based Relation Extraction: An Overview
  14. (Aydar et al., 2021) Neural Relation Extraction: A Review
  15. (Pawar et al., 2021) Techniques for Jointly Extracting Entities and Relations: A Survey
  16. (Lan et al., 2021) A Survey on Complex Knowledge Base Question Answering: Methods, Challenges and Solutions

Knowledge Graphs / Knowledge Bases

  1. DBpedia Website / GitHub / Paper
  2. Freebase Website / DEPRECATED / Paper
  3. YAGO Website / Latest Release / Paper
  4. Wikidata Website / Paper

Datasets:

Here is a distribution of the most-used datasets, showing their usage frequency across more than 550 surveyed papers.

[Figure: usage frequency of the most common datasets across the surveyed papers]

If you created a new dataset or found something missing, please don't hesitate to create a pull request to add it here.

  1. Datasets for Semantic Parsing

    1. LC-QuAD Paper / Website / Repository
    2. LC-QuAD 2.0 Paper / Website
    3. ComplexWebQuestions Paper / Website
    4. WebQuestionsSP Paper / Download
    5. QALD Series Website
    6. CompositionalFreebaseQuestions (CFQ) Paper / Repository
  2. Datasets for Information Retrieval

    1. SimpleQuestions Paper / Repository
    2. WebQuestions Paper / Website
    3. ComplexQuestions (unfortunately, two different datasets share the name ComplexQuestions)
      1. ComplexQuestions (sometimes referred to as CompQ) Paper / Repository
      2. ComplexQuestions Paper / Website – Note that this dataset was provided by a different set of authors
    4. MetaQA Paper / Repository
  3. Datasets for Reinforcement Learning

    1. UMLS Paper / Repository (MINERVA Repository)
    2. NELL-995 Paper / Repository (MINERVA Repository)
    3. Kinship Paper / Repository (MINERVA Repository)
    4. FB15K-237 Paper (Original FB15K) / Paper (FB15K-237 Variant) / Download (FB15K-237)
    5. WN18RR Paper / Repository
    6. Countries Paper / Repository (MINERVA Repository)
  4. Datasets for Hybrid KGQA

    1. CommonSenseQA Paper / Website
    2. OpenBookQA Paper / Website

Performance Leaderboard per Dataset

When reporting the results of your approach, be as precise as possible; you would be surprised how many papers report ambiguous results. If your approach outperforms all others on a certain benchmark, make sure to mark it in bold.

You should be familiar with the following metrics, but in case you are not, here is a short recap:

TP = True Positives
FP = False Positives
TN = True Negatives
FN = False Negatives

P = Precision

Measures how many of your positive predictions are actually correct.
Formula: P = TP / (TP + FP)

R = Recall

Measures how many of all existing positive labels you actually found.
Formula: R = TP / (TP + FN)

F = F1

The harmonic mean of precision and recall.
Formula: F1 = 2 * P * R / (P + R) = 2 * TP / (2 * TP + FP + FN)
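As a quick sanity check when reporting numbers, all three metrics can be computed directly from the confusion counts. A minimal Python sketch (the counts at the bottom are made-up example values):

```python
def precision(tp: int, fp: int) -> float:
    # Of all positive predictions, how many were correct?
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp: int, fn: int) -> float:
    # Of all existing positives, how many did we find?
    return tp / (tp + fn) if tp + fn else 0.0

def f1(tp: int, fp: int, fn: int) -> float:
    # Harmonic mean of precision and recall,
    # equivalent to 2*TP / (2*TP + FP + FN).
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0

# Made-up example counts:
tp, fp, fn = 80, 20, 40
print(precision(tp, fp))  # 0.8
print(recall(tp, fn))     # 0.666...
print(f1(tp, fp, fn))     # 0.727...
```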

RE = Relation Extraction Subtask

This metric refers solely to the RE subtask, i.e. how well the correct relations are identified. It is distinct from E2E.

E2E = End to End

This metric reports the result of running an algorithm end to end, i.e. the whole pipeline from input question to final answer, on the dataset's test set.



Leaderboards are provided for the following datasets:

  1. QALD-Series
  2. LC-QuAD
  3. FreebaseQA
  4. SimpleQuestions
  5. WebQuestions + Derivatives
  6. Free917
  7. ComplexQuestions
  8. MetaQA
  9. PathQuestion
  10. MSF
  11. NYT
  12. Hybrid QA
  13. Reinforcement Learning
  14. KBC
  15. PQA



QALD-Series

QALD-5
HCqa (Asadifar et al., 2019)*: P = 0.70, R = 1.00, F = 0.81

*) Tested only on 10 questions

QALD-6
HCqa (Asadifar et al., 2019)*: P = 0.42, R = 0.42, F = 0.52

*) Tested only on 25 questions

QALD-7
SLING (Mihindukulasooriya et al., 2020): P = 0.57, R = 0.76, F = 0.65
EARL (Dubey et al., 2018): RE = 0.47
GGNN (Sorokin and Gurevych, 2018): P = 0.2686, R = 0.3179, F = 0.2588

QALD-9
SLING (Mihindukulasooriya et al., 2020): P = 0.50, R = 0.64, F = 0.56



LC-QuAD

LC-QuAD 1
SLING (Mihindukulasooriya et al., 2020): P = 0.41, R = 0.44, F = 0.48
EARL (Dubey et al., 2018): RE = 0.36



FreebaseQA

FreebaseQA (Paper / Repository)
Retrieve and Re-rank (Wang et al., 2021): E2E = 0.517



SimpleQuestions

SimpleQuestions
AdvT-MMRD (Zhang et al., 2020): RE = 0.938, E2E = 0.790
MLTA (Wang et al., 2019): RE = 0.824
Question Matching (Abolghasemi et al., 2020): RE = 0.9341
Relation Splitting (Hsiao et al., 2017): E2E = 0.767
KSA-BiGRU (Zhu et al., 2019): P = 0.867, R = 0.848, F = 0.849, E2E = 0.731
Alias Matching (Buzaaba and Amagasa, 2021): RE = 0.8288, E2E = 0.7464
Synthetic Data (Sidiropoulos et al., 2020): RE* (unseen domain) = 0.7041, E2E (seen domain) = 0.77, E2E* (unseen domain) = 0.6657
Transfer Learning with BERT (Lukovnikov et al., 2020): RE = 0.836, E2E = 0.773
Retrieve and Re-rank (Wang et al., 2021): E2E = 0.797
HR-BiLSTM (Yu et al., 2017): RE = 0.933, E2E = 0.787
Multi-View Matching (Yu et al., 2018): RE = 0.9375

*) Average of Micro + Macro

SimpleQuestions-Balanced (Paper / Repository)
HR-BiLSTM (Yu et al., 2017): RE* (seen) = 0.891, RE* (unseen) = 0.412, RE* (seen+unseen avg.) = 0.673
Representation Adapter (Wu et al., 2019): RE* (seen) = 0.8925, RE* (unseen) = 0.7515, RE* (seen+unseen avg.) = 0.83

*) Average of Micro + Macro




WebQuestions + Derivatives

WebQuestions
Support Sentences (Li et al., 2017): P = 0.572, R = 0.396, F = 0.382, E2E = 0.423
QARDTE (Zheng et al., 2018): P = 0.512, R = 0.613, F = 0.558, RE = 0.843
HybQA (Mohamed et al., 2017): F = 0.57

WebQuestionsSP
HR-BiLSTM (Yu et al., 2017): RE = 0.8253
UHOP (Chen et al., 2019) (w/ HR-BiLSTM): RE = 0.8260
OPQL (Sun et al., 2021): RE = 0.8540, E2E = 0.519
Multi-View Matching (Yu et al., 2018): RE = 0.8595
Masking Mechanism (Chen et al., 2018): RE = 0.77

WebQuestionsSP-WD (Paper / Repository)
GGNN (Sorokin and Gurevych, 2018): P = 0.2686, R = 0.3179, F = 0.2588



Free917

Free917 (Original Paper / Data)
QARDTE (Zheng et al., 2018): P = 0.683, R = 0.679, F = 0.663



ComplexQuestions

ComplexQuestions
HCqa (Asadifar et al., 2019): F = 0.536



MetaQA

MetaQA
OPQL (Sun et al., 2021): E2E (2-Hop) = 0.885, E2E (3-Hop) = 0.871
RDAS (Wang et al., 2021): E2E (1-Hop) = 0.991, E2E (2-Hop) = 0.97, E2E (3-Hop) = 0.856
Incremental Sequence Matching (Lan et al., 2019): F = 0.981, E2E (1-Hop) = 0.963, E2E (2-Hop) = 0.991, E2E (3-Hop) = 0.996



PathQuestion

PathQuestion (Paper / Repository)
Incremental Sequence Matching (Lan et al., 2019): F = 0.96, E2E* = 0.967
RDAS (Wang et al., 2021): E2E (2-Hop) = 0.736, E2E (3-Hop) = 0.910

*) 2-Hop and 3-Hop mixed




MSF

MSF (Paper / Repository)
OPQL (Sun et al., 2021): E2E (2-Hop) = 0.492, E2E (3-Hop) = 0.297



NYT

NYT (Paper / Data)
Deep RL (Qin et al., 2018): F* = 0.778
ReQuest (Wu et al., 2018): P = 0.404, R = 0.48, F = 0.439

*) Average




Hybrid QA

ComplexWebQuestions
OPQL (Sun et al., 2021): E2E = 0.407

OpenBookQA
MHGRN (Feng et al., 2020): E2E = 0.806
QA-GNN (Yasunaga et al., 2021): E2E = 0.828

CommonsenseQA
MHGRN (Feng et al., 2020): E2E = 0.765
QA-GNN (Yasunaga et al., 2021): E2E = 0.761



Reinforcement Learning

Kinship
MINERVA (Das et al., 2018): E2E = 0.605
Reward Shaping (Lin et al., 2018): E2E = 0.811

UMLS
MINERVA (Das et al., 2018): E2E = 0.728
Reward Shaping (Lin et al., 2018): E2E = 0.902

Countries
MINERVA (Das et al., 2018): E2E* = 0.9582

*) Average of S1, S2 and S3

WN18RR
MINERVA (Das et al., 2018): E2E = 0.413
Reward Shaping (Lin et al., 2018): E2E = 0.437

FB15K-237
MINERVA (Das et al., 2018): E2E = 0.217
Reward Shaping (Lin et al., 2018): E2E = 0.329

NELL-995
MINERVA (Das et al., 2018): E2E = 0.663
Reward Shaping (Lin et al., 2018): E2E = 0.656



KBC

KBC (Paper / Repository)
ROP (Yin et al., 2018): E2E* = 0.7616

*) Here: mean average precision (MAP)




PQA

PQA (Paper / Repository)
ROP (Yin et al., 2018): E2E = 0.907



Research Challenges:

For each solution to a challenge, a short description is provided. If you have written a paper that deals with one of these challenges, you can create a pull request and add a link to your paper together with a short description. If it fits none of the challenges listed here, you may create a new entry and add your paper there, along with a short description of the new challenge.

Table of Contents


  1. Lexical Gap
  2. Incomplete Knowledge Graphs
  3. Disambiguation Problem
  4. Noise From Distant Supervision
  5. Inclusion of Structured Information From Subgraphs
  6. Hybrid Question-Answering
  7. New and Unseen Domains
  8. Integration of Language Models for Relation Extraction
  9. Candidate Generation
  10. Low Relation Extraction Accuracy

Lexical Gap

The lexical gap refers to situations in which the natural-language expression of a relation differs from its representation in the KB (this problem is closely related to relation linking). When faced with the question "Where was Angela Merkel born?", the corresponding relation "birthPlace" does not appear in the question at all. Exact matching procedures therefore fail, and a softer matching mechanism is required (a toy illustration follows the list below).

  1. SLING (Mihindukulasooriya et al., 2020)
    • Integrate abstract meaning representation (AMR) to improve question understanding
  2. AdvT-MMRD (Zhang et al., 2020)
    • Use semantic and literal question-relation matching and incorporate entity type information with adversarial training
  3. MLTA (Wang et al., 2019)
    • Similarity computation between the question and relation candidates on multiple levels using an attention mechanism
  4. Support Sentences (Li et al., 2017)
    • Enrich candidate pairs with support sentences from an external source
  5. Question Matching (Abolghasemi et al., 2020)
    • Retrieve the stored question that best matches the input question
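To make the failure of exact matching concrete, here is a toy sketch in the spirit of alias matching: looking up relation names directly misses "birthPlace", while a small alias lexicon bridges the gap. The lexicon entries are made up for illustration; real systems learn such mappings or use embedding similarity instead.

```python
# Illustrative relation lexicon mapping KB relations to
# natural-language trigger words (all entries are made up).
relation_aliases = {
    "birthPlace": {"born", "birthplace", "birth"},
    "spouse": {"married", "wife", "husband", "spouse"},
    "almaMater": {"studied", "graduated", "university"},
}

question = "Where was Angela Merkel born?"
tokens = {t.strip("?.,").lower() for t in question.split()}

# Exact matching against the relation names themselves fails:
print([r for r in relation_aliases if r.lower() in tokens])  # []

# Soft matching via the alias lexicon bridges the lexical gap:
print([r for r, aliases in relation_aliases.items() if aliases & tokens])
# ['birthPlace']  ("born" is an alias of birthPlace)
```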

Incomplete Knowledge Graphs

One of the best-known problems in KGQA is that KGs are incomplete (Min et al., 2013), i.e. certain relations or entities are missing, which is natural considering how vast and complex the body of human knowledge is (and that it keeps growing daily). This problem is especially evident in highly technical and specialised areas.

  1. OPQL (Sun et al., 2021)
    • Construct a virtual knowledge base
  2. MINERVA (Das et al., 2018)
    • Infer missing knowledge using RL
  3. Reward Shaping (Lin et al., 2018)
    • Improve reward mechanism of MINERVA
  4. ROP (Yin et al., 2018)
    • Predict KG paths using an RNN to infer new information

Disambiguation Problem

A difficult challenge for QA systems to overcome is the ambiguity of natural language: certain relations may have the same name but a different meaning depending on the context. An example on the KB level (taken from Hsiao et al., 2017) is the Freebase relation genre, which appears both as film.film.genre and as music.artist.genre.

  1. Relation Splitting (Hsiao et al., 2017)
    • Further split a relation into its type and property parts (see the sketch after this list)
  2. KSA-BiGRU (Zhu et al., 2019)
    • Computing a probability distribution for every relation
  3. Alias Matching (Buzaaba and Amagasa, 2021)
    • Match alias from question with KB and pick most likely relation
  4. EARL (Dubey et al., 2018)
    • Perform entity and relation linking jointly
  5. HR-BiLSTM (Yu et al., 2017)
    • Use a hierarchical BiLSTM model and entity re-ranking
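A minimal sketch of the relation-splitting idea: a Freebase-style relation name of the form domain.type.property is split so that the type and the property can be matched separately. The helper below is illustrative, not the authors' implementation.

```python
# A Freebase relation name has the form "domain.type.property".
# Splitting it makes the ambiguity of a shared property explicit,
# so it can be resolved using the entity's type.
def split_relation(relation: str) -> tuple[str, str]:
    *type_parts, prop = relation.split(".")
    return ".".join(type_parts), prop

print(split_relation("film.film.genre"))     # ('film.film', 'genre')
print(split_relation("music.artist.genre"))  # ('music.artist', 'genre')
# Same property "genre", but different types.
```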

Noise From Distant Supervision

In some domains, training data is sparse, and annotating it correctly typically involves manual human labour. This process is very time-consuming and therefore does not scale. To overcome this problem, distant supervision (DS) was proposed, which generates training data automatically. The problem with DS is that the resulting training data can be very noisy, which in turn degrades the performance of any model trained on it (a minimal sketch of the DS heuristic follows the list below).

  1. ReQuest (Wu et al., 2018)
    • Use indirect supervision from external QA corpus
  2. Deep RL (Qin et al., 2018)
    • Use a policy-based RL agent to find false positives
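For intuition on where the noise comes from: the classic DS heuristic (in the spirit of Mintz et al., 2009) labels every sentence that mentions both arguments of a KB triple as a positive example for that relation. A minimal sketch with made-up data:

```python
# Known KB triples: (subject, relation, object).
kb = [("Angela Merkel", "birthPlace", "Hamburg")]

corpus = [
    "Angela Merkel was born in Hamburg.",       # correct match
    "Angela Merkel gave a speech in Hamburg.",  # false positive!
]

# DS heuristic: any sentence containing both entities is labelled
# as expressing the relation, hence the noisy training data.
training_data = [
    (sentence, rel)
    for sentence in corpus
    for subj, rel, obj in kb
    if subj in sentence and obj in sentence
]
for sentence, rel in training_data:
    print(rel, "|", sentence)
# birthPlace | Angela Merkel was born in Hamburg.
# birthPlace | Angela Merkel gave a speech in Hamburg.  <- noise
```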

Inclusion of Structured Information From Subgraphs

The main idea of this research challenge is that subgraphs, either generated from the input query or retrieved from a KB using the input query, contain useful structural information. This information can be leveraged to perform KGQA more accurately.

  1. RDAS (Wang et al., 2021)
    • Incorporate information direction within reasoning
  2. GGNN for SP (Sorokin and Gurevych, 2018)
    • Integrate the structure of the semantic query
  3. MHGRN (Feng et al., 2020)
    • Capture relations between entities using a Graph Relation Network

Hybrid Question-Answering

The hybrid QA challenge involves answering questions by consulting not only a KB but also external, often natural-language, textual sources. This can be especially helpful in domains where knowledge is not readily available in triple form. This challenge overlaps with the Incomplete KG challenge.

  1. HCqa (Asadifar et al., 2019)
    • Extract knowledge from text using linguistic patterns
  2. QARDTE (Zheng et al., 2018)
    • A neural network with an attention mechanism that extracts question-dependent features from unstructured text for use during candidate re-ranking
  3. HybQA (Mohamed et al., 2017)
    • Filter answers using Wikipedia as external source

New and Unseen Domains

Sidiropoulos et al. (2020) define an unseen domain as a domain for which facts exist in a given KB/KG but are absent from the training data.

  1. Representation Adapter (Wu et al., 2019)
    • Use an adapter to map from general purpose representations to task specific ones (model-centric)
  2. Synthetic Data (Sidiropoulos et al., 2020)
    • Generation of synthetic training data (distant supervision) for new, unseen domains (data-centric)

Integration of Language Models for Relation Extraction

Pre-trained language models capture knowledge in a general sense, which means they can struggle in situations where structured or factual knowledge is required (Kassner and Schütze, 2020). Using language models alone for KGQA can therefore lead to poor performance. However, combining language models with structural information from KGs can lead to better question understanding and higher accuracy (Yasunaga et al., 2021).

  1. Transfer Learning with BERT (Lukovnikov et al., 2020)
    • Use BERT to predict the relation of the input question (a minimal sketch follows this list)
  2. QA-GNN (Yasunaga et al., 2021)
    • Integrate QA context with KG subgraphs
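As a rough illustration of the BERT-based approach, relation prediction can be framed as sequence classification over a fixed relation inventory. The sketch below uses the Hugging Face transformers library; the tiny label set is made up, and the required fine-tuning on (question, relation) pairs is omitted, so this is not Lukovnikov et al.'s exact setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative label set: one class per KB relation (tiny on purpose).
relations = ["birthPlace", "spouse", "almaMater"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# NOTE: the classification head is randomly initialised here and
# would need fine-tuning on (question, relation) pairs first.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(relations)
)

inputs = tokenizer("Where was Angela Merkel born?", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(relations[logits.argmax(dim=-1).item()])  # meaningless before fine-tuning
```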

Candidate Generation

Generating a set of relation candidates for an input query is challenging: the right candidates must be found while the candidate set is kept small. Furthermore, the candidates must be ranked correctly in order to retrieve the correct answer. The following research addresses these problems.

  1. UHOP (Chen et al., 2019)
    • Lifting the limit of hops without increasing the candidate set's size
  2. Incremental Sequence Matching (Lan et al., 2019)
    • Iterative candidate path generation and pruning
  3. Retrieve and Re-rank (Wang et al., 2021)
    • Build an inverted index, retrieve a candidate set with TF-IDF, and re-rank the candidates using BERT (see the sketch after this list)
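A minimal sketch of the retrieve step, assuming scikit-learn for TF-IDF: each relation is represented by a short textual "document" (made up here), candidates are scored against the question, and the BERT re-ranking step is omitted.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative "documents": one textual description per KB relation.
relation_docs = {
    "birthPlace": "place of birth born location",
    "spouse": "married to wife husband spouse",
    "almaMater": "studied at university college education",
}

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(relation_docs.values())

# Retrieve: score all relations against the question and keep the
# top k as the candidate set (re-ranking with BERT would follow).
question = "where was angela merkel born"
scores = cosine_similarity(vectorizer.transform([question]), doc_matrix)[0]
ranked = sorted(zip(relation_docs, scores), key=lambda x: -x[1])
print(ranked[:2])  # birthPlace scores highest ("born" overlaps)
```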

Low Relation Extraction Accuracy

The goal of the following research is to increase the accuracy of RE.

  1. Multi-View Matching (Yu et al., 2018)
    • Match the input question to multiple views from the KG to capture more information
  2. Masking Mechanism (Chen et al., 2018)
    • Set a hop limit of 2 to hide faraway relations, which are likely irrelevant (a sketch follows this list)
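To make the hop-limit idea concrete, here is a toy breadth-first search that collects only the relations reachable within two hops of the topic entity; the graph and the limit are illustrative.

```python
from collections import deque

# Tiny illustrative KG: entity -> list of (relation, neighbour).
kg = {
    "Angela_Merkel": [("birthPlace", "Hamburg"), ("spouse", "Joachim_Sauer")],
    "Hamburg": [("country", "Germany")],
    "Germany": [("capital", "Berlin")],
}

def relations_within_hops(kg, start, max_hops=2):
    """BFS that masks out relations farther than max_hops away."""
    seen, candidates = {start}, set()
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # relations beyond the hop limit stay hidden
        for relation, neighbour in kg.get(node, []):
            candidates.add(relation)
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return candidates

print(relations_within_hops(kg, "Angela_Merkel"))
# {'birthPlace', 'spouse', 'country'} -- 'capital' (3 hops away) is masked
```

The hop limit trades recall for precision: relations outside the radius can never be proposed, which keeps the candidate set small but assumes the answer lies close to the topic entity.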
