Enhancing Named Entity Recognition for Holocaust Testimonies through Pseudo Labelling and Transformer-based Models

Isuri Anuradha Nanomi Arachchige, L. Ha, R. Mitkov, Johannes-Dieter Steinert
Published in: Proceedings of the 7th International Workshop on Historical Document Imaging and Processing
Publication date: 2023-08-25
DOI: 10.1145/3604951.3605514 (https://doi.org/10.1145/3604951.3605514)
Citations: 1

Abstract

The Holocaust was a tragic and catastrophic event in the history of World War II (WWII) that resulted in the loss of millions of lives. In recent years, the emergence of digital humanities has made the study of Holocaust testimonies an important area of research for historians, Holocaust educators, social scientists, and linguists. One challenge in analysing Holocaust testimonies is recognising and categorising named entities such as concentration camps, military officers, ships, and ghettos, a task made difficult by the scarcity of annotated data. This paper presents a study of a domain-specific hybrid named-entity recognition (NER) model, focusing on NER models tailored to the Holocaust domain. To overcome data scarcity, we employed a hybrid annotation approach to train different transformer model architectures to recognise the named entities. The results show that the transformer models perform well compared with other approaches.
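The abstract does not spell out the hybrid annotation procedure, so the following is only a rough illustration of one common form of pseudo labelling, not the authors' method: a model trained on a small annotated set tags unlabelled sentences, and only confidently tagged sentences are added to the training data. The function names, the label names (B-CAMP, B-GHETTO), the toy gazetteer tagger, and the 0.9 threshold are all assumptions made for this sketch.

```python
from typing import Callable, List, Tuple

# A prediction is (token, label, confidence); a training example is (token, label).
Prediction = Tuple[str, str, float]
Example = Tuple[str, str]


def select_pseudo_labels(
    sentences: List[List[str]],
    tagger: Callable[[List[str]], List[Prediction]],
    threshold: float = 0.9,
) -> List[List[Example]]:
    """Keep only sentences in which every predicted tag meets the threshold."""
    selected = []
    for tokens in sentences:
        tagged = tagger(tokens)
        # The least confident token decides whether the sentence is kept.
        if tagged and min(conf for _, _, conf in tagged) >= threshold:
            selected.append([(tok, label) for tok, label, _ in tagged])
    return selected


def toy_tagger(tokens: List[str]) -> List[Prediction]:
    """Hypothetical stand-in for a trained NER model (assumption, not from the paper)."""
    gazetteer = {"Auschwitz": ("B-CAMP", 0.97), "Warsaw": ("B-GHETTO", 0.62)}
    return [(tok, *gazetteer.get(tok, ("O", 0.99))) for tok in tokens]


unlabelled = [
    ["We", "arrived", "at", "Auschwitz"],
    ["I", "lived", "in", "Warsaw"],
]
pseudo_labelled = select_pseudo_labels(unlabelled, toy_tagger, threshold=0.9)
```

In this toy run the first sentence is kept because all of its tags clear the threshold, while the second is dropped because the tag for "Warsaw" has low confidence. In practice the kept sentences would be merged with the manually annotated data before fine-tuning a transformer model.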