
Latest Publications in AACL Bioflux

Enhancing Tabular Reasoning with Pattern Exploiting Training
Q3 Environmental Science Pub Date: 2022-10-21 DOI: 10.48550/arXiv.2210.12259
Abhilash Shankarampeta, Vivek Gupta, Shuo Zhang
Recent methods based on pre-trained language models have exhibited superior performance on tabular tasks (e.g., tabular NLI), despite showing inherent problems such as not using the right evidence and making inconsistent predictions across inputs while reasoning over tabular data (Gupta et al., 2021). In this work, we utilize Pattern-Exploiting Training (PET) (i.e., strategic MLM) on pre-trained language models to strengthen these tabular reasoning models’ pre-existing knowledge and reasoning abilities. Our upgraded model exhibits a superior understanding of knowledge facts and tabular reasoning compared to current baselines. Additionally, we demonstrate that such models are more effective on the underlying downstream task of tabular inference on INFOTABS. Furthermore, we show our model’s robustness against adversarial sets generated through various character- and word-level perturbations.
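As a concrete illustration of strategic MLM, the sketch below linearises a small key-value table, wraps a hypothesis in a cloze pattern, and reads off the verbaliser token a masked LM ranks highest. The pattern text, the Yes/No/Maybe verbalisers, and the roberta-base checkpoint are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal PET-style cloze scoring over a linearised table (sketch only).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

def linearise(table: dict) -> str:
    # Flatten a key-value table (e.g., an INFOTABS infobox) into plain text.
    return ". ".join(f"{k} is {v}" for k, v in table.items())

def pet_predict(table: dict, hypothesis: str) -> str:
    pattern = f"{linearise(table)}. Question: {hypothesis}? Answer: {tokenizer.mask_token}."
    inputs = tokenizer(pattern, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    # Hypothetical verbalisers mapping label words to NLI classes.
    verbalisers = {"Yes": "entailment", "No": "contradiction", "Maybe": "neutral"}
    ids = {w: tokenizer.convert_tokens_to_ids(tokenizer.tokenize(" " + w))[0]
           for w in verbalisers}
    best = max(ids, key=lambda w: logits[0, mask_pos, ids[w]].item())
    return verbalisers[best]

print(pet_predict({"Born": "1980", "Occupation": "singer"}, "This person is a musician"))
```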
Citations: 4
AugCSE: Contrastive Sentence Embedding with Diverse Augmentations
Q3 Environmental Science Pub Date: 2022-10-20 DOI: 10.48550/arXiv.2210.13749
Zilu Tang, Muhammed Yusuf Kocyigit, D. Wijaya
Data augmentation techniques have been proven useful in many NLP applications. Most augmentations are task-specific, however, and cannot be used as a general-purpose tool. In our work, we present AugCSE, a unified framework that utilizes diverse sets of data augmentations to achieve a better, general-purpose sentence embedding model. Building upon the latest sentence embedding models, our approach uses a simple antagonistic discriminator that differentiates the augmentation types. With a finetuning objective borrowed from domain adaptation, we show that diverse augmentations, which often lead to conflicting contrastive signals, can be tamed to produce a better and more robust sentence representation. Our methods achieve state-of-the-art results on downstream transfer tasks and perform competitively on semantic textual similarity tasks, using only unsupervised data.
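Reading the abstract, the training signal combines a standard contrastive objective with an adversarial discriminator over augmentation types. Below is a minimal PyTorch sketch of that combination; the gradient-reversal trick, module sizes, and equal loss weighting are assumptions made for illustration rather than the paper's implementation.

```python
# Contrastive InfoNCE loss plus an adversarial augmentation-type discriminator.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Identity forward; flips gradients so the encoder opposes the discriminator.
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, grad):
        return -grad

def info_nce(z1, z2, temperature=0.05):
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.T / temperature      # (batch, batch) similarities
    labels = torch.arange(z1.size(0))  # positives sit on the diagonal
    return F.cross_entropy(sim, labels)

class AugCSEStyleLoss(nn.Module):
    def __init__(self, dim=768, num_aug_types=8):
        super().__init__()
        self.discriminator = nn.Linear(dim, num_aug_types)

    def forward(self, z_orig, z_aug, aug_type):
        contrastive = info_nce(z_orig, z_aug)
        # The discriminator classifies the augmentation; reversed gradients push
        # the encoder toward augmentation-invariant sentence embeddings.
        disc_logits = self.discriminator(GradReverse.apply(z_aug))
        return contrastive + F.cross_entropy(disc_logits, aug_type)

loss_fn = AugCSEStyleLoss()
z1 = torch.randn(4, 768, requires_grad=True)
z2 = torch.randn(4, 768, requires_grad=True)
loss_fn(z1, z2, torch.randint(0, 8, (4,))).backward()
```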
Citations: 4
GCDT: A Chinese RST Treebank for Multigenre and Multilingual Discourse Parsing
Q3 Environmental Science Pub Date: 2022-10-19 DOI: 10.48550/arXiv.2210.10449
Siyao Peng, Yang Janet Liu, Amir Zeldes
A lack of large-scale human-annotated data has hampered the hierarchical discourse parsing of Chinese. In this paper, we present GCDT, the largest hierarchical discourse treebank for Mandarin Chinese in the framework of Rhetorical Structure Theory (RST). GCDT covers over 60K tokens across five genres of freely available text, using the same relation inventory as contemporary RST treebanks for English. We also report on this dataset’s parsing experiments, including state-of-the-art (SOTA) scores for Chinese RST parsing and RST parsing on the English GUM dataset, using cross-lingual training in Chinese and English with multilingual embeddings.
Citations: 5
Domain Specific Sub-network for Multi-Domain Neural Machine Translation
Q3 Environmental Science Pub Date: 2022-10-18 DOI: 10.48550/arXiv.2210.09805
Amr Hendy, M. Abdelghaffar, M. Afify, Ahmed Tawfik
This paper presents Domain-Specific Sub-network (DoSS). It uses a set of masks obtained through pruning to define a sub-network for each domain and finetunes the sub-network parameters on domain data. This performs very close to finetuning the whole network on each domain while drastically reducing the number of trained parameters. We also propose a method to make masks unique per domain and show that it greatly improves generalization to unseen domains. In our experiments on German-to-English machine translation, the proposed method outperforms the strong baseline of continued training on multi-domain (medical, tech, and religion) data by 1.47 BLEU points. Continued training of DoSS on a new domain (legal) likewise outperforms the multi-domain (medical, tech, religion, legal) baseline by 1.52 BLEU points.
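The core mechanism here, per-domain parameter masks obtained by pruning, can be sketched in a few lines of PyTorch: keep the top-magnitude weights as the domain sub-network and zero out all other gradients during finetuning. The magnitude criterion and keep ratio below are illustrative assumptions, and the sketch ignores complications such as optimizer momentum leaking updates into unmasked weights.

```python
# Derive a binary mask per domain by magnitude pruning, then finetune only
# the masked parameters (sketch of the sub-network idea, not DoSS itself).
import torch

def magnitude_mask(model: torch.nn.Module, keep_ratio: float = 0.3) -> dict:
    masks = {}
    for name, p in model.named_parameters():
        k = max(1, int(keep_ratio * p.numel()))
        threshold = p.abs().flatten().topk(k).values.min()
        masks[name] = (p.abs() >= threshold).float()
    return masks

def masked_step(model, masks, loss, optimizer):
    optimizer.zero_grad()
    loss.backward()
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is not None:
                p.grad *= masks[name]  # update only the domain's sub-network
    optimizer.step()
```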
Citations: 0
Systematic Evaluation of Predictive Fairness
Q3 Environmental Science Pub Date: 2022-10-17 DOI: 10.48550/arXiv.2210.08758
Xudong Han, Aili Shen, Trevor Cohn, Timothy Baldwin, Lea Frermann
Mitigating bias when training on biased datasets is an important open problem. Several techniques have been proposed; however, the typical evaluation regime is very limited, considering only very narrow data conditions. For instance, the effect of target class imbalance and stereotyping is under-studied. To address this gap, we examine the performance of various debiasing methods across multiple tasks, spanning binary classification (Twitter sentiment), multi-class classification (profession prediction), and regression (valence prediction). Through extensive experimentation, we find that data conditions have a strong influence on relative model performance, and that general conclusions cannot be drawn about method efficacy when evaluating only on standard datasets, as is current practice in fairness research.
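To make the abstract's point concrete, a controlled evaluation can resample the test set to different target class imbalances and track a fairness metric at each setting. The maximum per-group accuracy gap and the resampling scheme below are illustrative choices, not the paper's protocol.

```python
# Sweep target class imbalance and report a per-group accuracy gap (toy sketch).
import numpy as np

def group_accuracy_gap(y_true, y_pred, groups):
    accs = [np.mean(y_pred[groups == g] == y_true[groups == g])
            for g in np.unique(groups)]
    return max(accs) - min(accs)

def resample_to_imbalance(y_true, positive_rate, rng):
    pos, neg = np.where(y_true == 1)[0], np.where(y_true == 0)[0]
    n = min(len(pos), len(neg))
    n_pos = int(positive_rate * n)
    return np.concatenate([rng.choice(pos, n_pos, replace=False),
                           rng.choice(neg, n - n_pos, replace=False)])

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 10_000)
groups = rng.integers(0, 2, 10_000)                              # protected attribute
y_pred = np.where(rng.random(10_000) < 0.9, y_true, 1 - y_true)  # stand-in model
for rate in (0.1, 0.3, 0.5):
    idx = resample_to_imbalance(y_true, rate, rng)
    print(rate, group_accuracy_gap(y_true[idx], y_pred[idx], groups[idx]))
```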
Citations: 3
Some Languages are More Equal than Others: Probing Deeper into the Linguistic Disparity in the NLP World
Q3 Environmental Science Pub Date: 2022-10-16 DOI: 10.48550/arXiv.2210.08523
Surangika Ranathunga, Nisansa de Silva
Linguistic disparity in the NLP world is a problem that has been widely acknowledged recently. However, the different facets of this problem, and the reasons behind this disparity, are seldom discussed within the NLP community. This paper provides a comprehensive analysis of the disparity that exists among the languages of the world. We show that simply categorising languages by data availability may not always be correct. Using an existing language categorisation based on speaker population and vitality, we analyse the distribution of language data resources, the amount of NLP/CL research, inclusion in multilingual web-based platforms, and inclusion in pre-trained multilingual models. We show that many languages are not covered by these resources or platforms, and that even among languages belonging to the same language group there is wide disparity. We analyse the impact of language family, geographical location, GDP, and speaker population, and provide possible reasons for this disparity, along with some suggestions for overcoming it.
Citations: 10
COFAR: Commonsense and Factual Reasoning in Image Search
Q3 Environmental Science Pub Date: 2022-10-16 DOI: 10.48550/arXiv.2210.08554
Prajwal Gatti, A. S. Penamakuri, Revant Teotia, Anand Mishra, Shubhashis Sengupta, Roshni Ramnani
One characteristic that makes humans superior to modern artificially intelligent models is the ability to interpret images beyond what is visually apparent. Consider the following two natural language search queries: (i) “a queue of customers patiently waiting to buy ice cream” and (ii) “a queue of tourists going to see a famous Mughal architecture in India”. Interpreting these queries requires one to reason with (i) commonsense, such as interpreting people as customers or tourists and actions as waiting to buy or going to see; and (ii) fact or world knowledge associated with named visual entities, for example, whether the store in the image sells ice cream or whether the landmark in the image is a Mughal architecture located in India. Such reasoning goes beyond just visual recognition. To enable both commonsense and factual reasoning in image search, we present a unified framework, the Knowledge Retrieval-Augmented Multimodal Transformer (KRAMT), that treats the named visual entities in an image as a gateway to encyclopedic knowledge and leverages them, along with the natural language query, to ground relevant knowledge. Further, KRAMT seamlessly integrates visual content and grounded knowledge to learn alignment between images and search queries. This unified framework is then used to perform image search requiring commonsense and factual reasoning. The retrieval performance of KRAMT is evaluated and compared with related approaches on a new dataset we introduce, COFAR.
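The retrieval-augmented flow the abstract outlines can be schematised as: detected entity names key into an encyclopedic store, and the retrieved facts are fused with the image's textual representation before scoring against the query. The toy knowledge base, the sentence-transformers scorer, and fusion by plain concatenation below are all illustrative assumptions, not KRAMT's architecture.

```python
# Schematic entity-keyed knowledge retrieval and query scoring (sketch only).
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical encyclopedic snippets keyed by named visual entities.
knowledge = {
    "Taj Mahal": "The Taj Mahal is a Mughal-era mausoleum in Agra, India.",
    "Baskin-Robbins": "Baskin-Robbins is a chain of ice cream parlours.",
}

def score_image(query: str, detected_entities: list[str], caption: str) -> float:
    facts = " ".join(knowledge.get(e, "") for e in detected_entities)
    fused = f"{caption} {facts}"   # ground the query in caption text plus facts
    q, d = encoder.encode(query), encoder.encode(fused)
    return float(util.cos_sim(q, d))

print(score_image(
    "a queue of tourists going to see a famous Mughal architecture in India",
    ["Taj Mahal"],
    "people standing in line outside a white marble monument",
))
```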
Citations: 3
Legal Case Document Summarization: Extractive and Abstractive Methods and their Evaluation
Q3 Environmental Science Pub Date: 2022-10-14 DOI: 10.48550/arXiv.2210.07544
A. Shukla, Paheli Bhattacharya, Soham Poddar, Rajdeep Mukherjee, Kripabandhu Ghosh, Pawan Goyal, Saptarshi Ghosh
Summarization of legal case judgement documents is a challenging problem in Legal NLP. However, few analyses exist of how different families of summarization models (e.g., extractive vs. abstractive) perform when applied to legal case documents. This question is particularly important since many recent transformer-based abstractive summarization models restrict the number of input tokens, and legal documents are known to be very long. How best to evaluate legal case document summarization systems also remains an open question. In this paper, we carry out extensive experiments with several extractive and abstractive summarization methods (both supervised and unsupervised) over three legal summarization datasets that we have developed. Our analyses, which include evaluation by law practitioners, lead to several interesting insights on legal summarization in particular and long-document summarization in general.
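One practical consequence of the token limits mentioned above is that a long judgement cannot be fed to an abstractive model in one pass. A common workaround, sketched below, is chunked summarisation: split the document, summarise each chunk, and concatenate the partial summaries. The chunk size and the bart-large-cnn checkpoint are illustrative; this is a generic workaround, not the paper's pipeline.

```python
# Chunked abstractive summarisation for documents beyond the model's input limit.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def summarize_long(document: str, chunk_words: int = 600) -> str:
    words = document.split()
    chunks = [" ".join(words[i:i + chunk_words])
              for i in range(0, len(words), chunk_words)]
    parts = [summarizer(c, max_length=128, min_length=32)[0]["summary_text"]
             for c in chunks]
    return " ".join(parts)  # naive fusion of per-chunk summaries
```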
Citations: 13
Improving Graph-Based Text Representations with Character and Word Level N-grams
Q3 Environmental Science Pub Date: 2022-10-12 DOI: 10.48550/arXiv.2210.05999
Wenzhe Li, Nikolaos Aletras
Graph-based text representation focuses on how text documents are represented as graphs that exploit dependency information between tokens and documents within a corpus. Despite the increasing interest in graph representation learning, there is limited research exploring new ways of constructing graph-based text representations, which are important for downstream natural language processing tasks. In this paper, we first propose a new heterogeneous word-character text graph that combines word and character n-gram nodes together with document nodes, allowing us to better learn dependencies among these entities. Additionally, we propose two new graph-based neural models, WCTextGCN and WCTextGAT, for modeling our proposed text graph. Extensive experiments on text classification and automatic text summarization benchmarks demonstrate that our proposed models consistently outperform competitive baselines and state-of-the-art graph-based models.
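The proposed graph can be pictured as three node types wired by co-occurrence: document nodes link to the words they contain, and words link to their character n-grams. The sketch below builds such a graph with networkx; using word unigrams, character trigrams, and raw counts as edge weights are simplifying assumptions for illustration.

```python
# Build a heterogeneous document-word-character n-gram graph (sketch).
from collections import Counter
import networkx as nx

def char_ngrams(word: str, n: int = 3):
    padded = f"#{word}#"   # boundary markers around the word
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def build_graph(docs: list[str]) -> nx.Graph:
    g = nx.Graph()
    for d, text in enumerate(docs):
        for w, c in Counter(text.lower().split()).items():
            g.add_edge(("doc", d), ("word", w), weight=c)
            for cg in char_ngrams(w):
                g.add_edge(("word", w), ("char", cg), weight=1)
    return g

g = build_graph(["graph based text representation",
                 "text graphs for classification"])
print(g.number_of_nodes(), g.number_of_edges())
```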
Citations: 1
How Well Do Multi-hop Reading Comprehension Models Understand Date Information?
Q3 Environmental Science Pub Date: 2022-10-11 DOI: 10.48550/arXiv.2210.05208
Xanh Ho, Saku Sugawara, Akiko Aizawa
Several multi-hop reading comprehension datasets have been proposed to resolve the issue of reasoning shortcuts by which questions can be answered without performing multi-hop reasoning. However, the ability of multi-hop models to perform step-by-step reasoning when finding an answer to a comparison question remains unclear. It is also unclear how questions about the internal reasoning process are useful for training and evaluating question-answering (QA) systems. To evaluate the model precisely in a hierarchical manner, we first propose a dataset, HieraDate, with three probing tasks in addition to the main question: extraction, reasoning, and robustness. Our dataset is created by enhancing two previous multi-hop datasets, HotpotQA and 2WikiMultiHopQA, focusing on multi-hop questions on date information that involve both comparison and numerical reasoning. We then evaluate the ability of existing models to understand date information. Our experimental results reveal that the multi-hop models do not have the ability to subtract two dates even when they perform well in date comparison and number subtraction tasks. Other results reveal that our probing questions can help to improve the performance of the models (e.g., by +10.3 F1) on the main QA task and our dataset can be used for data augmentation to improve the robustness of the models.
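The probing idea is easy to make concrete: from two dates mentioned in a multi-hop question, generate comparison and subtraction sub-questions whose gold answers are computed directly. The phrasing below is a hypothetical template, not HieraDate's, but the arithmetic matches the skills being probed.

```python
# Generate date comparison and subtraction probes with gold answers (sketch).
from datetime import date

def date_probes(name_a: str, d_a: date, name_b: str, d_b: date):
    comparison = (f"Was {name_a} born before {name_b}?",
                  "yes" if d_a < d_b else "no")
    subtraction = (f"How many days apart are the birth dates of {name_a} and {name_b}?",
                   str(abs((d_b - d_a).days)))
    return [comparison, subtraction]

for q, a in date_probes("Alice", date(1980, 5, 1), "Bob", date(1979, 12, 24)):
    print(q, "->", a)   # the second probe's gold answer here is 129 days
```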
Citations: 2