
Proceedings of the Natural Legal Language Processing Workshop 2022: Latest Papers

Combining WordNet and Word Embeddings in Data Augmentation for Legal Texts
Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.nllp-1.4
Sezen Perçin, Andrea Galassi, F. Lagioia, Federico Ruggeri, Piera Santin, G. Sartor, Paolo Torroni
Creating balanced labeled textual corpora for complex tasks, like legal analysis, is a challenging and expensive process that often requires the collaboration of domain experts. To address this problem, we propose a data augmentation method based on the combination of GloVe word embeddings and the WordNet ontology. We present an example of application in the legal domain, specifically on decisions of the Court of Justice of the European Union. Our evaluation with human experts confirms that our method is more robust than the alternatives.
Citations: 3
Towards Cross-Domain Transferability of Text Generation Models for Legal Text
Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.nllp-1.9
Vinayshekhar Bannihatti Kumar, Kasturi Bhattacharjee
Legalese is often filled with verbose, domain-specific jargon, which can make it challenging for non-experts to understand and use. Succinct summaries of legal documents often make them easier for users to comprehend. However, obtaining labeled data for every domain of legal text is challenging, which makes the cross-domain transferability of text generation models for legal text an important area of research. In this paper, we explore the ability of existing state-of-the-art T5- and BART-based summarization models to transfer across legal domains. We leverage publicly available datasets across four domains for this task, one of which is a new resource for summarizing privacy policies that we curate and release for academic research. Our experiments demonstrate the low cross-domain transferability of these models, while also highlighting the benefits of combining different domains. Further, we compare the effectiveness of standard metrics for this task and illustrate the vast differences in their performance.
Citations: 1
Detecting Relevant Differences Between Similar Legal Texts
Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.nllp-1.24
Xiang Li, Jiaxun Gao, D. Inkpen, Wolfgang Alschner
Given two similar legal texts, it is useful to be able to focus only on the parts that contain relevant differences. However, because of variation in linguistic structure and terminology, it is not easy to identify true semantic differences. An accurate model for detecting differences between similar legal texts is therefore in demand, in order to increase the efficiency of legal research and document analysis. In this paper, we automatically label a training dataset of sentence pairs using an existing legal resource of international investment treaties that were already manually annotated with metadata. We then propose models based on state-of-the-art deep learning techniques for the novel task of detecting relevant differences. In addition to providing solutions for this task, we include models for automatically producing metadata for the treaties that do not have it.
Citations: 1
On What it Means to Pay Your Fair Share: Towards Automatically Mapping Different Conceptions of Tax Justice in Legal Research Literature
Pub Date : 1900-01-01 DOI: 10.18653/v1/2022.nllp-1.2
Reto Gubelmann, Peter Hongler, Elina Margadant, S. Handschuh
In this article, we explore the potential and challenges of applying transformer-based pre-trained language models (PLMs) and statistical methods to a particularly challenging, yet highly important and largely uncharted domain: normative discussions in tax law research. In our view, the role of NLP in this essentially contested territory is to make implicit normative assumptions explicit and to foster debates across ideological divides. To this end, we propose the first steps towards a method that automatically labels normative statements in tax law research and suggests the normative background of these statements. Our results are encouraging, but it is clear that there is still room for improvement.
Citations: 0