基于转换器的神经网络框架，用于预测包含缩写和上下文的全名

IF 2.7 3区计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Data & Knowledge Engineering Pub Date : 2023-12-30 DOI:10.1016/j.datak.2023.102275

Ziming Ye , Shuangyin Li

{"title":"基于转换器的神经网络框架，用于预测包含缩写和上下文的全名","authors":"Ziming Ye , Shuangyin Li","doi":"10.1016/j.datak.2023.102275","DOIUrl":null,"url":null,"abstract":"<div><p>With the rapid spread of information, abbreviations are used more and more common because they are convenient. However, the duplication of abbreviations can lead to confusion in many cases, such as information management and information retrieval. The resultant confusion annoys users. Thus, inferring a full name from an abbreviation has practical and significant advantages. The bulk of studies in the literature mainly inferred full names based on rule-based methods, statistical models, the similarity of representation, etc. However, these methods are unable to use various grained contexts properly. In this paper, we propose a flexible framework of Multi-attention mask Abbreviation Context and Full name language model<span>, named MACF to address the problem. With the abbreviation and contexts as the inputs, the MACF can automatically predict a full name by generation, where the contexts can be variously grained. That is, different grained contexts ranging from coarse to fine can be selected to perform such complicated tasks in which contexts include paragraphs, several sentences, or even just a few keywords. A novel multi-attention mask mechanism is also proposed, which allows the model to learn the relationships among abbreviations, contexts, and full names, a process that makes the most of various grained contexts. The three corpora of different languages and fields were analyzed and measured with seven metrics in various aspects to evaluate the proposed framework. According to the experimental results, the MACF yielded more significant and consistent outputs than other baseline methods. Moreover, we discuss the significance and findings, and give the case studies to show the performance in real applications.</span></p></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"150 ","pages":"Article 102275"},"PeriodicalIF":2.7000,"publicationDate":"2023-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A transformer-based neural network framework for full names prediction with abbreviations and contexts\",\"authors\":\"Ziming Ye , Shuangyin Li\",\"doi\":\"10.1016/j.datak.2023.102275\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>With the rapid spread of information, abbreviations are used more and more common because they are convenient. However, the duplication of abbreviations can lead to confusion in many cases, such as information management and information retrieval. The resultant confusion annoys users. Thus, inferring a full name from an abbreviation has practical and significant advantages. The bulk of studies in the literature mainly inferred full names based on rule-based methods, statistical models, the similarity of representation, etc. However, these methods are unable to use various grained contexts properly. In this paper, we propose a flexible framework of Multi-attention mask Abbreviation Context and Full name language model<span>, named MACF to address the problem. With the abbreviation and contexts as the inputs, the MACF can automatically predict a full name by generation, where the contexts can be variously grained. That is, different grained contexts ranging from coarse to fine can be selected to perform such complicated tasks in which contexts include paragraphs, several sentences, or even just a few keywords. A novel multi-attention mask mechanism is also proposed, which allows the model to learn the relationships among abbreviations, contexts, and full names, a process that makes the most of various grained contexts. The three corpora of different languages and fields were analyzed and measured with seven metrics in various aspects to evaluate the proposed framework. According to the experimental results, the MACF yielded more significant and consistent outputs than other baseline methods. Moreover, we discuss the significance and findings, and give the case studies to show the performance in real applications.</span></p></div>\",\"PeriodicalId\":55184,\"journal\":{\"name\":\"Data & Knowledge Engineering\",\"volume\":\"150 \",\"pages\":\"Article 102275\"},\"PeriodicalIF\":2.7000,\"publicationDate\":\"2023-12-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Data & Knowledge Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0169023X23001350\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data & Knowledge Engineering","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0169023X23001350","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

随着信息的迅速传播，缩写因其方便快捷而被越来越多地使用。然而，在信息管理和信息检索等许多情况下，缩略语的重复使用会导致混乱。由此造成的混乱会让用户感到厌烦。因此，从缩写中推断全名具有实际而显著的优势。文献中的大量研究主要是基于规则方法、统计模型、表征相似性等来推断全名。然而，这些方法无法正确使用各种粒度的上下文。本文提出了一种灵活的多注意掩码缩写上下文和全名语言模型框架（命名为 MACF）来解决这一问题。以缩写和上下文为输入，MACF 可以自动生成预测全名，其中上下文可以是不同粒度的。也就是说，可以选择从粗粒到细粒的不同粒度上下文，来完成这种复杂的任务，其中上下文包括段落、几个句子，甚至只是几个关键词。此外，还提出了一种新颖的多注意掩码机制，该机制允许模型学习缩写、上下文和全名之间的关系，这一过程充分利用了不同粒度的上下文。通过对三个不同语言和领域的语料库进行分析，并从七个方面进行衡量，对所提出的框架进行了评估。实验结果表明，与其他基线方法相比，MACF 得出的结果更显著、更一致。此外，我们还讨论了实验的意义和结果，并通过案例研究展示了其在实际应用中的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

A transformer-based neural network framework for full names prediction with abbreviations and contexts

With the rapid spread of information, abbreviations are used more and more common because they are convenient. However, the duplication of abbreviations can lead to confusion in many cases, such as information management and information retrieval. The resultant confusion annoys users. Thus, inferring a full name from an abbreviation has practical and significant advantages. The bulk of studies in the literature mainly inferred full names based on rule-based methods, statistical models, the similarity of representation, etc. However, these methods are unable to use various grained contexts properly. In this paper, we propose a flexible framework of Multi-attention mask Abbreviation Context and Full name language model, named MACF to address the problem. With the abbreviation and contexts as the inputs, the MACF can automatically predict a full name by generation, where the contexts can be variously grained. That is, different grained contexts ranging from coarse to fine can be selected to perform such complicated tasks in which contexts include paragraphs, several sentences, or even just a few keywords. A novel multi-attention mask mechanism is also proposed, which allows the model to learn the relationships among abbreviations, contexts, and full names, a process that makes the most of various grained contexts. The three corpora of different languages and fields were analyzed and measured with seven metrics in various aspects to evaluate the proposed framework. According to the experimental results, the MACF yielded more significant and consistent outputs than other baseline methods. Moreover, we discuss the significance and findings, and give the case studies to show the performance in real applications.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Data & Knowledge Engineering 工程技术-计算机：人工智能

CiteScore

5.00

自引率

0.00%

发文量

审稿时长

6 months

期刊介绍： Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a world-wide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.