
Proceedings of COLING. International Conference on Computational Linguistics: Latest Publications

Structural Bias for Aspect Sentiment Triplet Extraction
Pub Date : 2022-09-02 DOI: 10.48550/arXiv.2209.00820
Chen Zhang, Lei Ren, Fang Ma, Jingang Wang, Wei Yu Wu, Dawei Song
Structural bias has recently been exploited for aspect sentiment triplet extraction (ASTE) and has led to improved performance. On the other hand, it is recognized that explicitly incorporating structural bias has a negative impact on efficiency, whereas pretrained language models (PLMs) can already capture implicit structures. Thus, a natural question arises: is structural bias still a necessity in the context of PLMs? To answer this question, we propose to address the efficiency issues by using an adapter to integrate structural bias into the PLM and by using a cheap-to-compute relative position structure in place of the syntactic dependency structure. Benchmarking evaluation is conducted on the SemEval datasets. The results show that our proposed structural adapter is beneficial to PLMs and achieves state-of-the-art performance over a range of strong baselines, with a light parameter demand and low latency. Meanwhile, we raise the concern that the current default of evaluating on small-scale data leaves conclusions under-confident. Consequently, we release a large-scale dataset for ASTE. The results on the new dataset hint that the structural adapter remains effective and efficient at large scale. Overall, we conclude that structural bias is still a necessity even with PLMs.
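The "cheap-to-compute relative position structure" can be illustrated with a short sketch (not the authors' code; the function name and the clipping window are assumptions): instead of running a dependency parser, the structural bias between two tokens is simply their signed distance, clipped to a window, which an adapter can embed and add to attention scores.

```python
def relative_position_matrix(n_tokens, max_dist=4):
    """Signed token distances clipped to [-max_dist, max_dist].

    A cheap stand-in for a syntactic dependency structure: entry (i, j)
    says how far token j sits from token i, saturating at the window edge.
    """
    matrix = []
    for i in range(n_tokens):
        row = []
        for j in range(n_tokens):
            d = j - i
            row.append(max(-max_dist, min(max_dist, d)))
        matrix.append(row)
    return matrix

# For a 5-token sentence with a window of 2:
bias = relative_position_matrix(5, max_dist=2)
```

An adapter layer could then look up one embedding per clipped distance and add it to the PLM's attention logits, costing only O(n^2) arithmetic instead of a parser invocation.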
Pages: 6736-6745
Citations: 6
KoCHET: A Korean Cultural Heritage Corpus for Entity-related Tasks
Pub Date : 2022-09-01 DOI: 10.48550/arXiv.2209.00367
Gyeongmin Kim, Jinsung Kim, Junyoung Son, Heu-Jeoung Lim
As digitized traditional cultural heritage documents have rapidly increased, creating a growing need for preservation and management, practical recognition of entities and typification of their classes has become essential. To achieve this, we propose KoCHET, a Korean cultural heritage corpus for the typical entity-related tasks, i.e., named entity recognition (NER), relation extraction (RE), and entity typing (ET). Advised by cultural heritage experts following the data construction guidelines of government-affiliated organizations, KoCHET consists of 112,362, 38,765, and 113,198 examples for the NER, RE, and ET tasks, respectively, covering all entity types related to Korean cultural heritage. Moreover, unlike existing public corpora, modified redistribution is permitted for both domestic and foreign researchers. Our experimental results demonstrate the practical usability of KoCHET in the cultural heritage domain. We also provide practical insights into KoCHET through statistical and linguistic analysis. Our corpus is freely available at https://github.com/Gyeongmin47/KoCHET.
Pages: 3496-3505
Citations: 1
Focus-Driven Contrastive Learning for Medical Question Summarization
Pub Date : 2022-09-01 DOI: 10.48550/arXiv.2209.00484
Minghua Zhang, Shuai Dou, Ziyang Wang, Yunfang Wu
Automatic medical question summarization can significantly help a system understand consumer health questions and retrieve correct answers. The Seq2Seq model based on maximum likelihood estimation (MLE) has been applied to this task, but it faces two general problems: the model cannot capture the question focus well, and the traditional MLE strategy lacks the ability to understand sentence-level semantics. To alleviate these problems, we propose a novel question focus-driven contrastive learning framework (QFCL). Specifically, we propose an easy and effective approach to generate hard negative samples based on the question focus, and exploit contrastive learning at both the encoder and decoder to obtain better sentence-level representations. On three medical benchmark datasets, our proposed model achieves new state-of-the-art results, obtaining performance gains of 5.33, 12.85, and 3.81 points over the baseline BART model on the three datasets, respectively. Further human judgement and detailed analysis prove that our QFCL model learns better sentence representations with the ability to distinguish different sentence meanings, and generates high-quality summaries by capturing the question focus.
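The contrastive objective at the encoder and decoder can be sketched with a generic InfoNCE-style loss in plain Python. This is a sketch of the general technique, not the QFCL implementation; the temperature value and the toy vectors below are made up.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is closer to the positive
    than to every (hard) negative, high otherwise."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    denom = sum(math.exp(l - m) for l in logits)
    return -(logits[0] - m - math.log(denom))

# Anchor close to the positive -> small loss; roles swapped -> large loss.
loss_good = contrastive_loss([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0]])
loss_bad = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1]])
```

Hard negatives (here, a vector near-orthogonal to the anchor) are what make the denominator informative; easy negatives contribute almost nothing to the gradient.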
Pages: 6176-6186
Citations: 3
MultiCoNER: A Large-scale Multilingual Dataset for Complex Named Entity Recognition
Pub Date : 2022-08-30 DOI: 10.48550/arXiv.2208.14536
S. Malmasi, Anjie Fang, B. Fetahu, Sudipta Kar, Oleg Rokhlenko
We present AnonData, a large multilingual dataset for Named Entity Recognition that covers 3 domains (Wiki sentences, questions, and search queries) across 11 languages, as well as multilingual and code-mixing subsets. This dataset is designed to represent contemporary challenges in NER, including low-context scenarios (short and uncased text), syntactically complex entities like movie titles, and long-tail entity distributions. The 26M-token dataset is compiled from public resources using techniques such as heuristic-based sentence sampling, template extraction and slotting, and machine translation. We tested the performance of two NER models on our dataset: a baseline XLM-RoBERTa model, and a state-of-the-art NER GEMNET model that leverages gazetteers. The baseline achieves moderate performance (macro-F1=54%). GEMNET, which uses gazetteers, improves significantly (an average macro-F1 improvement of +30%), which demonstrates the difficulty of our dataset. AnonData poses challenges even for large pre-trained language models, and we believe that it can help further research in building robust NER systems.
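A gazetteer, as leveraged by the GEMNET baseline, is at its core a longest-match lookup that attaches entity-type features to tokens. The sketch below shows that core idea only; the BIO tag scheme and the entries are illustrative, not GEMNET's actual feature pipeline.

```python
def gazetteer_features(tokens, gazetteer):
    """BIO-tag tokens by longest case-insensitive gazetteer match.

    gazetteer maps a tuple of lowercased tokens to an entity type.
    """
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        hit = None
        for j in range(len(tokens), i, -1):  # prefer the longest span
            span = tuple(t.lower() for t in tokens[i:j])
            if span in gazetteer:
                hit = (j, gazetteer[span])
                break
        if hit:
            j, etype = hit
            tags[i] = "B-" + etype
            for k in range(i + 1, j):
                tags[k] = "I-" + etype
            i = j
        else:
            i += 1
    return tags

# "CW" (creative work) covers syntactically complex entities like movie titles.
gaz = {("the", "dark", "knight"): "CW", ("gotham",): "LOC"}
tags = gazetteer_features("watch The Dark Knight in Gotham".split(), gaz)
```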
Pages: 3798-3809
Citations: 51
Do Language Models Make Human-like Predictions about the Coreferents of Italian Anaphoric Zero Pronouns?
Pub Date : 2022-08-30 DOI: 10.48550/arXiv.2208.14554
J. Michaelov, B. Bergen
Some languages allow arguments to be omitted in certain contexts. Yet human language comprehenders reliably infer the intended referents of these zero pronouns, in part because they construct expectations about which referents are more likely. We ask whether Neural Language Models also extract the same expectations. We test whether 12 contemporary language models display expectations that reflect human behavior when exposed to sentences with zero pronouns from five behavioral experiments conducted in Italian by Carminati (2005). We find that three models - XGLM 2.9B, 4.5B, and 7.5B - capture the human behavior from all the experiments, with others successfully modeling some of the results. This result suggests that human expectations about coreference can be derived from exposure to language, and also indicates features of language models that allow them to better reflect human behavior.
Pages: 1-14
Citations: 4
Reweighting Strategy Based on Synthetic Data Identification for Sentence Similarity
Pub Date : 2022-08-29 DOI: 10.48550/arXiv.2208.13376
Taehee Kim, chaeHun Park, Jimin Hong, Radhika Dua, E. Choi, J. Choo
Semantically meaningful sentence embeddings are important for numerous tasks in natural language processing. To obtain such embeddings, recent studies have explored the idea of utilizing synthetically generated data from pretrained language models (PLMs) as a training corpus. However, PLMs often generate sentences that differ from those written by humans. We hypothesize that treating all these synthetic examples equally during training can have an adverse effect on learning semantically meaningful embeddings. To analyze this, we first train a classifier that identifies machine-written sentences and observe that the linguistic features of sentences identified as machine-written are significantly different from those of human-written sentences. Based on this, we propose a novel approach that first trains the classifier to measure the importance of each sentence. The distilled information from the classifier is then used to train a reliable sentence embedding model. Through extensive evaluation on four real-world datasets, we demonstrate that our model trained on synthetic data generalizes well and outperforms the baselines.
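The reweighting idea, down-weighting examples that the classifier judges machine-written, can be sketched as follows. The weighting formula and the floor parameter are assumptions for illustration, not the paper's exact scheme.

```python
def reweighted_loss(losses, p_machine, floor=0.1):
    """Weighted mean training loss where each example's weight shrinks
    with the classifier's probability that it is machine-written.
    """
    weights = [max(floor, 1.0 - p) for p in p_machine]
    total = sum(w * l for w, l in zip(weights, losses))
    return total / sum(weights)

# Example 2 is almost certainly machine-written, so it barely contributes:
# weights are [0.9, 0.1], giving (0.9*0.5 + 0.1*2.0) / 1.0.
mean_loss = reweighted_loss([0.5, 2.0], [0.1, 0.9])
```

The floor keeps every synthetic example contributing a little, so the embedding model still sees the full diversity of the corpus.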
Pages: 4853-4863
Citations: 0
A Multi-Format Transfer Learning Model for Event Argument Extraction via Variational Information Bottleneck
Pub Date : 2022-08-27 DOI: 10.48550/arXiv.2208.13017
Jie Zhou, Qi Zhang, Qin Chen, Liang He, Xuanjing Huang
Event argument extraction (EAE) aims to extract arguments with given roles from texts and has been widely studied in natural language processing. Most previous works have achieved good performance on specific EAE datasets with dedicated neural architectures. However, these architectures are usually difficult to adapt to new datasets/scenarios with various annotation schemas or formats. Furthermore, they rely on large-scale labeled data for training, which is often unavailable due to the high labeling cost. In this paper, we propose a multi-format transfer learning model with a variational information bottleneck, which makes use of the information, especially the common knowledge, in existing datasets for EAE in new datasets. Specifically, we introduce a shared-specific prompt framework to learn both format-shared and format-specific knowledge from datasets with different formats. To further absorb the common knowledge for EAE and eliminate irrelevant noise, we integrate a variational information bottleneck into our architecture to refine the shared representation. We conduct extensive experiments on three benchmark datasets and obtain new state-of-the-art performance on EAE.
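The variational information bottleneck adds a KL penalty that pressures the shared representation toward an uninformative prior, squeezing out task-irrelevant noise. Below is a minimal sketch of that penalty for a diagonal Gaussian posterior against a standard normal prior; this is the standard closed form, not the paper's full objective.

```python
import math

def vib_kl(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over dimensions.

    This is the bottleneck term added to the task loss: the further the
    posterior drifts from the prior, the more information it is storing.
    """
    return 0.5 * sum(s * s + m * m - 1.0 - math.log(s * s)
                     for m, s in zip(mu, sigma))

# A posterior equal to the prior pays no penalty; shifting one mean does.
no_penalty = vib_kl([0.0, 0.0], [1.0, 1.0])
shifted = vib_kl([1.0, 0.0], [1.0, 1.0])
```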
Pages: 1990-2000
Citations: 6
Domain-Specific NER via Retrieving Correlated Samples
Pub Date : 2022-08-27 DOI: 10.48550/arXiv.2208.12995
Xin Zhang, Yong Jiang, Xiaobin Wang, Xuming Hu, Yueheng Sun, Pengjun Xie, Meishan Zhang
Successful machine-learning-based Named Entity Recognition models can fail on texts from special domains, for instance, Chinese addresses and e-commerce titles, which require adequate background knowledge. Such texts are also difficult for human annotators. In fact, we can obtain potentially helpful information from correlated texts that share common entities, which aids text understanding. One can then easily reason out the correct answer by referencing correlated samples. In this paper, we suggest enhancing NER models with correlated samples. We draw correlated samples with a sparse BM25 retriever from large-scale in-domain unlabeled data. To explicitly simulate the human reasoning process, we perform training-free entity type calibration by majority voting. To capture correlation features in the training stage, we suggest modeling correlated samples with a transformer-based multi-instance cross-encoder. Empirical results on datasets from the above two domains show the efficacy of our methods.
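The training-free majority-vote calibration can be sketched in a few lines: given the entity types predicted for the same mention across retrieved correlated samples, override the local prediction only when a clear majority disagrees with it. The vote threshold is an assumption for illustration, not the paper's exact rule.

```python
from collections import Counter

def calibrate_type(predicted, retrieved_types, min_votes=2):
    """Override `predicted` when a strict majority of correlated samples
    agrees on another type; otherwise keep the local prediction."""
    if not retrieved_types:
        return predicted
    top, count = Counter(retrieved_types).most_common(1)[0]
    if count >= min_votes and count > len(retrieved_types) / 2:
        return top
    return predicted

# Three of four correlated samples say LOC, so the local ORG is overridden.
label = calibrate_type("ORG", ["LOC", "LOC", "LOC", "ORG"])
```

No parameters are trained: the calibration only needs the BM25-retrieved neighbors and the model's own predictions on them.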
Pages: 2398-2404
Citations: 10
AiM: Taking Answers in Mind to Correct Chinese Cloze Tests in Educational Applications
Pub Date : 2022-08-26 DOI: 10.48550/arXiv.2208.12505
Yusen Zhang, Zhongli Li, Qingyu Zhou, Ziyi Liu, Chao Li, Mina W. Ma, Yunbo Cao, Hongzhi Liu
To automatically correct handwritten assignments, the traditional approach is to use an OCR model to recognize characters and compare them to the answers. The OCR model is easily confused when recognizing handwritten Chinese characters, and the textual information of the answers is missing during model inference. However, teachers always have these answers in mind when reviewing and correcting assignments. In this paper, we focus on Chinese cloze test correction and propose a multimodal approach (named AiM). The encoded representations of the answers interact with the visual information of students' handwriting. Instead of predicting 'right' or 'wrong', we perform sequence labeling on the answer text to infer, in a fine-grained way, which answer characters differ from the handwritten content. We take samples from OCR datasets as the positive samples for this task and develop a negative sample augmentation method to scale up the training data. Experimental results show that AiM outperforms OCR-based methods by a large margin. Extensive studies demonstrate the effectiveness of our multimodal approach.
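The fine-grained labeling over the answer text can be illustrated with a toy that compares the answer to a transcript character by character. In AiM the evidence is the handwriting image rather than an OCR transcript, and the 'T'/'F' label names here are invented for the sketch.

```python
def label_answer(answer, written):
    """Per-character labels over the answer text: 'T' if the written
    character at that position matches the expected answer, else 'F'."""
    return ["T" if i < len(written) and written[i] == ch else "F"
            for i, ch in enumerate(answer)]

# The student wrote 党 where the answer expects 觉.
labels = label_answer("春眠不觉晓", "春眠不党晓")
```

Labeling the answer text, rather than transcribing the handwriting, is what lets the model exploit the answer the teacher already has in mind.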
Pages: 3042-3053
Citations: 1
GRASP: Guiding Model with RelAtional Semantics Using Prompt for Dialogue Relation Extraction
Pub Date: 2022-08-26 DOI: 10.48550/arXiv.2208.12494
Junyoung Son, Jinsung Kim, J. Lim, Heu-Jeoung Lim
The dialogue-based relation extraction (DialogRE) task aims to predict the relations between argument pairs that appear in a dialogue. Most previous studies fine-tune pre-trained language models (PLMs) with extensive additional features to compensate for the low information density of multi-speaker dialogue. To effectively exploit the inherent knowledge of PLMs without extra layers, and to account for the scattered semantic cues about the relation between the arguments, we propose a Guiding model with RelAtional Semantics using Prompt (GRASP). We adopt a prompt-based fine-tuning approach and capture the relational semantic clues of a given dialogue with 1) an argument-aware prompt marker strategy and 2) a relational clue detection task. In our experiments, GRASP achieves state-of-the-art performance in terms of both F1 and F1c scores on the DialogRE dataset, even though our method only leverages PLMs without adding any extra layers.
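The argument-aware prompt marker strategy can be pictured as wrapping each argument mention in the dialogue with special marker tokens and appending a cloze-style prompt for the PLM to fill in. The sketch below is a simplified illustration: the marker tokens ([S], [/S], [O], [/O]), the prompt wording, and the naive substring replacement are all assumptions for exposition, not GRASP's actual vocabulary or template.

```python
def mark_arguments(dialogue, subj, obj,
                   subj_marker=("[S]", "[/S]"), obj_marker=("[O]", "[/O]")):
    """Build a marked, cloze-style prompt for a dialogue and an argument pair.

    Wraps every surface mention of the subject and object with marker
    tokens, then appends a [MASK]-style relation prompt. Simplified sketch:
    a real implementation would match mentions by span, not str.replace,
    and register the markers as special tokens in the PLM's tokenizer.
    """
    marked = []
    for turn in dialogue:
        turn = turn.replace(subj, f"{subj_marker[0]} {subj} {subj_marker[1]}")
        turn = turn.replace(obj, f"{obj_marker[0]} {obj} {obj_marker[1]}")
        marked.append(turn)
    # Cloze-style tail: the PLM predicts the relation at the [MASK] slot.
    return " ".join(marked) + f" The relation between {subj} and {obj} is [MASK]."
```

Feeding such a marked prompt to a masked-language-model head lets the model score candidate relation labels at the [MASK] position without any task-specific layers.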
Junyoung Son, Jinsung Kim, J. Lim, Heu-Jeoung Lim. "GRASP: Guiding Model with RelAtional Semantics Using Prompt for Dialogue Relation Extraction." Proceedings of COLING. International Conference on Computational Linguistics, pp. 412-423, 2022. DOI: 10.48550/arXiv.2208.12494
Citations: 6