Paper title
Extracting N-ary Cross-sentence Relations using Constrained Subsequence Kernel
Paper authors
Paper abstract
Most past work in relation extraction deals with relations that occur within a single sentence and have only two entity arguments. We propose a new formulation of the relation extraction task in which relations are more general than intra-sentence relations: they may span multiple sentences and may have more than two arguments. At the same time, these relations are more specific than corpus-level relations: their scope is limited to a single document, and they are not valid globally throughout the corpus. We propose a novel sequence representation to characterize instances of such relations. We then explore various classifiers whose features are derived from this sequence representation. For the SVM classifier, we design a Constrained Subsequence Kernel, which is a variant of the Generalized Subsequence Kernel. We evaluate our approach on three datasets across two domains: biomedical and general.