# Relation Extraction

By Jun Tian 2018-03-21

The task of relation extraction (RE) is to extract semantic links between pairs of nominals.

Given a sentence $$S$$ with annotated pairs of nominals $$e_1$$ and $$e_2$$, we aim to identify the relations between $$e_1$$ and $$e_2$$.

Typical relation types include birthdate(PER, DATE) and founder-of(PER, ORG). Example instances of these relations are birthdate(John Smith, 1985-01-01) and founder-of(Bill Gates, Microsoft).
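One way to make this notation concrete is to treat each relation type as a name with a pair of argument types, and each extracted relation as an instance of it. The sketch below assumes that representation; `RelationInstance`, `SCHEMA`, and `is_well_typed` are hypothetical names chosen for illustration, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationInstance:
    relation: str  # relation type, e.g. "founder-of"
    arg1: str      # first nominal (e_1)
    arg2: str      # second nominal (e_2)


# Illustrative schema: relation name -> (type of e_1, type of e_2)
SCHEMA = {
    "birthdate": ("PER", "DATE"),
    "founder-of": ("PER", "ORG"),
}


def is_well_typed(instance, entity_types):
    """Check that both arguments carry the types the relation expects."""
    expected = SCHEMA.get(instance.relation)
    if expected is None:
        return False
    actual = (entity_types.get(instance.arg1), entity_types.get(instance.arg2))
    return actual == expected


entity_types = {"Bill Gates": "PER", "Microsoft": "ORG", "1985-01-01": "DATE"}
founder = RelationInstance("founder-of", "Bill Gates", "Microsoft")
print(is_well_typed(founder, entity_types))
```

A type check like this only filters out impossible candidates; deciding whether the relation actually holds in the sentence is the job of the approaches described next.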

## Approaches

### Rule Based Approaches

There are two different types of rule-based approaches:

• Stand-alone rule-based systems, which have poor recall and cannot generalize to unseen patterns.
• Systems that learn rules for inference to complement other relation extraction approaches.

1. Reiss, Frederick, et al. “An algebraic approach to rule-based information extraction.” Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on. IEEE, 2008.
2. Dong, Xin, et al. “Knowledge vault: A web-scale approach to probabilistic knowledge fusion.” Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2014.

The approach in Dong et al. starts with a pair of entities that are known to be in a relation according to a (seed) knowledge base, then performs a random walk over the knowledge graph to find other paths that connect these entities.
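The random-walk idea described for Dong et al. can be sketched on a toy graph. This is a minimal illustration under stated assumptions, not the method as implemented in that paper: the graph contents, the function name `random_walk_paths`, and the parameters `num_walks`, `max_len`, and `seed` are all made up for the example. Starting from a seed pair assumed to be in a founder-of relation, it samples walks from the first entity and counts the relation paths that reach the second.

```python
import random
from collections import Counter

# Toy knowledge graph (illustrative only): node -> list of (relation, neighbor)
GRAPH = {
    "Bill Gates": [
        ("ceo-of", "Microsoft"),
        ("born-in", "Seattle"),
        ("chairs", "Microsoft Board"),
    ],
    "Microsoft Board": [("board-of", "Microsoft")],
    "Seattle": [("located-in", "Washington")],
}


def random_walk_paths(graph, source, target, num_walks=200, max_len=3, seed=0):
    """Count relation paths from source to target found by random walks."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_walks):
        node, path = source, []
        for _ in range(max_len):
            edges = graph.get(node)
            if not edges:
                break  # dead end: abandon this walk
            relation, node = rng.choice(edges)
            path.append(relation)
            if node == target:
                counts[tuple(path)] += 1
                break
    return counts


# Seed pair assumed to be in the founder-of relation
paths = random_walk_paths(GRAPH, "Bill Gates", "Microsoft")
for path, count in paths.most_common():
    print(" -> ".join(path), count)
```

Paths that frequently connect seed pairs of the same relation can then serve as candidate inference rules or as features for a downstream classifier.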

### Supervised Approaches

Pros: Best performance, provided that a sufficient amount of labeled training data is available.

Cons: Labeled training data is expensive to produce and thus limited in quantity.

Typical features:

• n-grams of words to the left and right of the entities;
• n-grams of the part-of-speech (POS) tags of words to the left and right of the entities;
• a flag indicating which entity comes first in the sentence;
• the sequence of POS tags and the bag of words (BOW) between the two entities;
• the dependency path between subject and object;
• POS tags of words on the dependency path between the two entities;
• lemmas on the dependency path; and
• relation embeddings.
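The surface-level features in this list can be computed from a tokenized, POS-tagged sentence alone. The sketch below is a simplified illustration: it assumes tokens and POS tags are already available, uses a fixed context window in place of full n-gram enumeration, and omits the dependency-path and embedding features, which would require a parser and a trained model. The function name `extract_features` and its parameters are hypothetical.

```python
def extract_features(tokens, pos_tags, e1_span, e2_span, window=2):
    """Build surface features for one entity pair.

    Spans are (start, end) token offsets with end exclusive, and are
    assumed not to overlap.
    """
    e1_first = e1_span[0] < e2_span[0]
    left, right = (e1_span, e2_span) if e1_first else (e2_span, e1_span)

    before = slice(max(0, left[0] - window), left[0])  # context left of the pair
    between = slice(left[1], right[0])                 # tokens between the entities
    after = slice(right[1], right[1] + window)         # context right of the pair

    return {
        "e1_first": e1_first,
        "words_left": tuple(tokens[before]),
        "words_right": tuple(tokens[after]),
        "pos_left": tuple(pos_tags[before]),
        "pos_right": tuple(pos_tags[after]),
        "pos_between": tuple(pos_tags[between]),
        "bow_between": frozenset(t.lower() for t in tokens[between]),
    }


tokens = ["Bill", "Gates", "founded", "Microsoft", "in", "1975", "."]
pos_tags = ["NNP", "NNP", "VBD", "NNP", "IN", "CD", "."]
features = extract_features(tokens, pos_tags, e1_span=(0, 2), e2_span=(3, 4))
print(features["pos_between"], features["words_right"])
```

A feature dictionary like this would typically be vectorized and passed to a standard classifier trained on the labeled relation instances.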

### Unsupervised Approaches

Pros: Leverage large amounts of data and extract large numbers of relations.