
Proceedings of the Fifteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-15): Latest Publications

Learning Clause Representation from Dependency-Anchor Graph for Connective Prediction
Yanjun Gao, Ting-Hao 'Kenneth' Huang, R. Passonneau
Semantic representation that supports the choice of an appropriate connective between pairs of clauses inherently addresses discourse coherence, which is important for tasks such as narrative understanding, argumentation, and discourse parsing. We propose a novel clause embedding method that applies graph learning to a data structure we refer to as a dependency-anchor graph. The dependency-anchor graph incorporates two kinds of syntactic information, constituency structure and dependency relations, to highlight the subject and verb phrase relation. This enhances coherence-related aspects of representation. We design a neural model that learns a semantic representation for clauses from graph convolution over latent representations of the subject and verb phrase. We evaluate our method on two new datasets: a subset of a large corpus where the source texts are published novels, and a new dataset collected from students’ essays. The results demonstrate a significant improvement over tree-based models, confirming the importance of emphasizing the subject and verb phrase. The performance gap between the two datasets illustrates the challenges of analyzing students’ written text, and points to a potential evaluation task for coherence modeling as well as an application that suggests revisions to students.
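As an illustration of the kind of graph convolution the abstract describes, the sketch below aggregates node features over a toy dependency-anchor graph and pools the subject and verb-phrase anchor nodes into a clause vector. This is a minimal stand-in, not the authors' model; the adjacency matrix, feature sizes, and pooling choice are all assumptions.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: add self-loops, mean-aggregate, project, ReLU."""
    adj_hat = adj + np.eye(adj.shape[0])          # self-loops
    deg = adj_hat.sum(axis=1, keepdims=True)
    h = (adj_hat / deg) @ feats @ weight          # mean-aggregate neighbours
    return np.maximum(h, 0.0)                     # ReLU

# toy graph: node 0 = subject anchor, node 1 = verb-phrase anchor,
# nodes 2-3 = dependents attached to the anchors
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 0],
                [0, 1, 0, 0]], dtype=float)
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))                   # per-node input vectors
weight = rng.normal(size=(8, 8))

h = gcn_layer(adj, feats, weight)
clause_vec = np.concatenate([h[0], h[1]])         # pool subject + verb anchors
print(clause_vec.shape)                           # (16,)
```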
DOI: 10.18653/V1/11.TEXTGRAPHS-1.6 · Published: 2021-06-01
Citations: 1
A Three-step Method for Multi-Hop Inference Explanation Regeneration
Yuejia Xiang, Yunyan Zhang, Xiaoming Shi, Bo Liu, Wandi Xu, Xi Chen
Multi-hop inference for explanation generation combines two or more facts to support an inference. The task focuses on generating explanations for elementary science questions, where the relevance between the explanations and the QA pairs is of vital importance. To address the task, a three-step framework is proposed. First, vector distance between two texts is used to recall the top-K relevant explanations for each question, reducing computational cost. Then, a selection module chooses the most relevant facts in an autoregressive manner, giving a preliminary order for the retrieved facts. Third, a re-ranking module re-ranks the retrieved candidate explanations according to the relevance between each fact and the QA pair. Experimental results illustrate the effectiveness of the proposed framework, with an improvement of 39.78% in NDCG over the official baseline.
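The three-step framework can be sketched roughly as follows. The cosine-similarity recall, the greedy context-updating selector, and the question-only rerank are simplified stand-ins for the paper's learned modules, and all vectors here are random toys.

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def recall_top_k(q_vec, fact_vecs, k):
    """Step 1: recall the K facts closest to the question vector."""
    sims = [cosine(q_vec, f) for f in fact_vecs]
    return sorted(range(len(fact_vecs)), key=lambda i: -sims[i])[:k]

def select_autoregressive(q_vec, fact_vecs, candidates):
    """Step 2: greedily order facts, conditioning on what was already chosen
    (a stand-in for the paper's learned autoregressive selector)."""
    chosen, context, pool = [], q_vec.copy(), list(candidates)
    while pool:
        best = max(pool, key=lambda i: cosine(context, fact_vecs[i]))
        chosen.append(best)
        pool.remove(best)
        context = context + fact_vecs[best]       # fold the fact into context
    return chosen

def rerank(q_vec, fact_vecs, order):
    """Step 3: re-score each candidate against the question alone."""
    return sorted(order, key=lambda i: -cosine(q_vec, fact_vecs[i]))

rng = np.random.default_rng(1)
facts = rng.normal(size=(50, 16))
question = rng.normal(size=16)
top = recall_top_k(question, facts, k=10)
ranked = rerank(question, facts, select_autoregressive(question, facts, top))
print(ranked[:3])
```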
DOI: 10.18653/V1/11.TEXTGRAPHS-1.19 · Published: 2021-06-01
Citations: 1
Improving Human Text Simplification with Sentence Fusion
Max Schwarzer, Teerapaun Tanprasert, David Kauchak
The quality of fully automated text simplification systems is not good enough for use in real-world settings; instead, human simplifications are used. In this paper, we examine how to improve the cost and quality of human simplifications by leveraging crowdsourcing. We introduce a graph-based sentence fusion approach to augment human simplifications, and a reranking approach both to select high-quality simplifications and to target simplifications at varying levels of simplicity. Using the Newsela dataset (Xu et al., 2015), we show consistent improvements over experts at varying simplification levels and find that the additional sentence-fusion simplifications allow for simpler output than the human simplifications alone.
DOI: 10.18653/V1/11.TEXTGRAPHS-1.10 · Published: 2021-06-01
Citations: 2
Entity Prediction in Knowledge Graphs with Joint Embeddings
Matthias Baumgartner, Daniele Dell'Aglio, A. Bernstein
Knowledge Graphs (KGs) have become increasingly popular in recent years. However, as knowledge constantly grows and changes, existing KGs inevitably need to be extended with entities that emerged, or became relevant to the scope of the KG, after its creation. Research on updating KGs typically relies on extracting named entities and relations from text. However, these approaches cannot infer entities or relations that were not explicitly stated. Alternatively, embedding models exploit implicit structural regularities to predict missing relations, but cannot predict missing entities. In this article, we introduce a novel method to enrich a KG with new entities given their textual descriptions. Our method leverages joint embedding models and hence does not require entities or relations to be named explicitly. We show that our approach can identify new concepts in a document corpus and transfer them into the KG, and we find that the performance of our method improves substantially when extended with techniques from association rule mining, text mining, and active learning.
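A minimal illustration of scoring a new entity against a KG via embeddings, using a TransE-style translation score as a generic stand-in for the paper's joint embedding model. The score function and toy vectors are assumptions, not the authors' method.

```python
import numpy as np

def transe_score(head, rel, tail):
    """TransE-style plausibility: larger (less negative) is more plausible."""
    return -np.linalg.norm(head + rel - tail)

def predict_entity(desc_vec, rel_vec, entity_vecs):
    """Rank existing entities as the tail of (new_entity, relation, ?)."""
    scores = [transe_score(desc_vec, rel_vec, e) for e in entity_vecs]
    return int(np.argmax(scores))

rng = np.random.default_rng(2)
entities = rng.normal(size=(5, 8))        # embeddings of known entities
rel = rng.normal(size=8)                  # embedding of one relation
desc = entities[3] - rel                  # a description that lands on entity 3
print(predict_entity(desc, rel, entities))  # → 3
```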
DOI: 10.18653/V1/11.TEXTGRAPHS-1.3 · Published: 2021-06-01
Citations: 3
TextGraphs 2021 Shared Task on Multi-Hop Inference for Explanation Regeneration
Peter Alexander Jansen, Mokanarangan Thayaparan, Marco Valentino, Dmitry Ustalov
The Shared Task on Multi-Hop Inference for Explanation Regeneration asks participants to compose large multi-hop explanations for questions by assembling large chains of facts from a supporting knowledge base. While previous editions of this shared task aimed to evaluate explanatory completeness (finding a set of facts that forms a complete inference chain, without gaps, from question to correct answer), the 2021 instantiation concentrates on the subtask of determining relevance in large multi-hop explanations. To this end, this edition of the shared task makes use of a large set of approximately 250k manual explanatory relevance ratings that augment the 2020 shared task data. In this summary paper, we describe the details of the explanation regeneration task, the evaluation data, and the participating systems. Additionally, we perform a detailed analysis of participating systems, evaluating various aspects involved in the multi-hop inference process. The best-performing system achieved an NDCG of 0.82 on this challenging task, substantially increasing performance over baseline methods by 32%, while also leaving significant room for future improvement.
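NDCG, the evaluation metric quoted above, can be computed from a ranked list of expert relevance ratings as follows. This is the standard textbook formulation, not code from the shared task; the toy ratings are invented.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance ratings."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(ranked_rels):
    """Normalize DCG by the DCG of the ideal (sorted) ordering."""
    ideal = sorted(ranked_rels, reverse=True)
    return dcg(ranked_rels) / dcg(ideal) if dcg(ideal) > 0 else 0.0

# a system ranked five candidate facts; expert relevance ratings shown in rank order
print(round(ndcg([3, 2, 3, 0, 1]), 3))  # → 0.972
```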
DOI: 10.18653/V1/11.TEXTGRAPHS-1.17 · Published: 2021-06-01
Citations: 8
Structural Realization with GGNNs
Jinman Zhao, Gerald Penn, Huan Ling
In this paper, we define an abstract task called structural realization that generates words given a prefix of words and a partial representation of a parse tree. We also present a method for solving instances of this task using a Gated Graph Neural Network (GGNN). We evaluate it with standard accuracy measures as well as perplexity, where comparison to previous work on language modelling quantifies the information that the presence of syntactic knowledge adds to a lexical selection task. That adding parse-tree-internal nodes to this neural model should improve it, with respect both to accuracy and to more conventional measures such as perplexity, may seem unsurprising, but previous attempts have not met with nearly as much success. We have also learned that transverse links through the parse tree compromise the model’s accuracy at generating adjectival and nominal parts of speech.
DOI: 10.18653/V1/11.TEXTGRAPHS-1.11 · Published: 2021-06-01
Citations: 0
Hierarchical Graph Convolutional Networks for Jointly Resolving Cross-document Coreference of Entity and Event Mentions
Duy Phung, Tuan Ngo Nguyen, Thien Huu Nguyen
This paper studies the problem of cross-document event coreference resolution (CDECR), which seeks to determine whether event mentions across multiple documents refer to the same real-world events. Prior work has demonstrated the benefits of predicate-argument information and document context for resolving the coreference of event mentions. However, such information has not been captured effectively in prior work on CDECR. To address these limitations, we propose a novel deep learning model for CDECR that introduces hierarchical graph convolutional neural networks (GCNs) to jointly resolve entity and event mentions. Sentence-level GCNs encode important context words for event mentions and their arguments, while the document-level GCN leverages the interaction structures of event mentions and arguments to compute document representations for CDECR. Extensive experiments demonstrate the effectiveness of the proposed model.
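The final coreference decision in models of this kind typically groups mention embeddings into clusters. The sketch below uses a simple greedy cosine-threshold clustering as a generic stand-in, not the paper's method; the threshold and toy vectors are assumptions.

```python
import numpy as np

def cluster_mentions(vecs, threshold=0.8):
    """Greedy clustering: a mention joins the first cluster whose centroid
    it matches above `threshold` cosine similarity, else starts a new one."""
    clusters = []
    for v in vecs:
        v = v / np.linalg.norm(v)
        for members in clusters:
            centroid = np.mean(members, axis=0)
            centroid = centroid / np.linalg.norm(centroid)
            if float(v @ centroid) > threshold:
                members.append(v)
                break
        else:
            clusters.append([v])
    return clusters

# toy mention embeddings: three near-duplicates of one event, two distinct ones
mentions = [np.array([1.0, 0.1, 0.0]),
            np.array([1.0, 0.0, 0.1]),
            np.array([0.9, 0.05, 0.0]),
            np.array([0.0, 1.0, 0.0]),
            np.array([0.0, 0.0, 1.0])]
clusters = cluster_mentions(mentions)
print([len(c) for c in clusters])  # → [3, 1, 1]
```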
DOI: 10.18653/V1/11.TEXTGRAPHS-1.4 · Published: 2021-06-01
Citations: 8
Textgraphs-15 Shared Task System Description: Multi-Hop Inference Explanation Regeneration by Matching Expert Ratings
Sureshkumar Vivek Kalyan, Sam Witteveen, Martin Andrews
Creating explanations for answers to science questions is a challenging task that requires multi-hop inference over a large set of fact sentences. This year, to refocus the Textgraphs Shared Task on the problem of gathering relevant statements (rather than solely finding a single ‘correct path’), the WorldTree dataset was augmented with expert ratings of ‘relevance’ of statements to each overall explanation. Our system, which achieved second place on the Shared Task leaderboard, combines initial statement retrieval; language models trained to predict the relevance scores; and ensembling of a number of the resulting rankings. Our code implementation is made available at https://github.com/mdda/worldtree_corpus/tree/textgraphs_2021
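One common way to ensemble several rankings, as the system description mentions, is reciprocal rank fusion. This is a generic stand-in rather than the paper's combination scheme; the constant `k`, the run contents, and the fact labels are invented.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Combine rankings: each item scores the sum of 1/(k + rank) over runs."""
    scores = {}
    for ranking in rankings:
        for pos, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + pos + 1)
    return sorted(scores, key=scores.get, reverse=True)

# three retrieval runs ranking the same three candidate facts
runs = [["f3", "f1", "f2"],
        ["f1", "f3", "f2"],
        ["f1", "f2", "f3"]]
print(reciprocal_rank_fusion(runs))  # → ['f1', 'f3', 'f2']
```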
DOI: 10.18653/V1/11.TEXTGRAPHS-1.20 · Published: 2021-06-01
Citations: 1
MG-BERT: Multi-Graph Augmented BERT for Masked Language Modeling
Parishad BehnamGhader, Hossein Zakerinia, Mahdieh Soleymani Baghshah
Pre-trained models such as Bidirectional Encoder Representations from Transformers (BERT) have recently made a big leap forward in Natural Language Processing (NLP) tasks. However, these models still fall short on the Masked Language Modeling (MLM) task. In this paper, we first introduce a multi-graph comprising different types of relations between words. We then propose the Multi-Graph augmented BERT (MG-BERT) model, which is based on BERT. MG-BERT embeds tokens while taking advantage of a static multi-graph containing global word co-occurrences in the text corpus alongside global real-world facts about words in knowledge graphs. The proposed model also employs a dynamic sentence graph to capture local context effectively. Experimental results demonstrate that our model considerably enhances performance on the MLM task.
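The MLM objective that MG-BERT targets can be illustrated with the standard masking step: hide random tokens and train the model to recover them. This shows generic MLM, not MG-BERT's graph machinery; BERT's usual 15% mask rate is raised to 0.3 here, with a fixed seed, so the short toy sentence actually gets masked.

```python
import random

def mask_tokens(tokens, mask_prob=0.3, mask_token="[MASK]", seed=1):
    """Randomly replace tokens with [MASK]; return the masked sequence and
    the positions the model must recover (the MLM training targets)."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets

sent = "the model predicts the missing word from context".split()
masked, targets = mask_tokens(sent)
print(masked)
print(targets)
```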
DOI: 10.18653/V1/11.TEXTGRAPHS-1.12 · Published: 2021-06-01
Citations: 1
Keyword Extraction Using Unsupervised Learning on the Document’s Adjacency Matrix
Eirini Papagiannopoulou, Grigorios Tsoumakas, A. Papadopoulos
This work revisits the information given by the graph-of-words and its typical use in graph-based ranking approaches for keyword extraction. Recent well-known graph-based approaches typically employ knowledge from word vector representations during ranking via popular centrality measures (e.g., PageRank), without giving a primary role to the vectors’ distribution. We consider the adjacency matrix that corresponds to the graph-of-words of a target text document as the vector representation of its vocabulary. We propose distribution-based modeling of this adjacency matrix using unsupervised learning algorithms. An extensive experimental study based on the F1 score confirms the efficacy of the distribution-based modeling approaches compared to state-of-the-art graph-based methods. Our code is available on GitHub.
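A graph-of-words and its adjacency matrix, which the abstract treats as the vector representation of the vocabulary, can be built with a sliding co-occurrence window. A minimal sketch; the window size and toy document are assumptions.

```python
import numpy as np

def graph_of_words(tokens, window=2):
    """Symmetric adjacency matrix of word co-occurrences within a window."""
    vocab = sorted(set(tokens))
    idx = {w: i for i, w in enumerate(vocab)}
    adj = np.zeros((len(vocab), len(vocab)))
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            a, b = idx[w], idx[tokens[j]]
            if a != b:
                adj[a, b] += 1
                adj[b, a] += 1
    return vocab, adj

doc = "graph based keyword extraction ranks words in a graph".split()
vocab, adj = graph_of_words(doc)
# each row of `adj` is the vector representation of that vocabulary word
print(vocab.index("graph"), adj[vocab.index("graph")].sum())
```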
DOI: 10.18653/V1/11.TEXTGRAPHS-1.9 · Published: 2021-06-01
Citations: 6