
Proceedings of the 7th International Workshop on Historical Document Imaging and Processing: Latest Publications

Investigations on Self-supervised Learning for Script-, Font-type, and Location Classification on Historical Documents
Johan Zenk, Florian Kordon, Martin Mayr, Mathias Seuret, V. Christlein
In the context of automated classification of historical documents, we investigate three contemporary self-supervised learning (SSL) techniques (SimSiam, Dino, and VICReg) for the pre-training of three different document analysis tasks, namely script-type, font-type, and location classification. Our study draws samples from multiple datasets that contain images of manuscripts, prints, charters, and letters. The representations derived via pre-text training are taken as inputs for k-NN classification and a parametric linear classifier. The latter is placed atop the pre-trained backbones to enable fine-tuning of the entire network to further improve the classification by exploiting task-specific label data. The network’s final performance is assessed via independent test sets obtained from the ICDAR2021 Competition on Historical Document Classification. We empirically show that representations learned with SSL are significantly better suited for subsequent document classification than features generated by commonly used transfer learning on ImageNet.
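The k-NN evaluation protocol described above can be sketched as follows. This is a minimal cosine-similarity k-NN over frozen embeddings, assuming the features have already been extracted from an SSL-pre-trained backbone; it illustrates the protocol, not the authors' code.

```python
import numpy as np

def knn_classify(train_emb, train_labels, test_emb, k=3):
    """Cosine-similarity k-NN over frozen (pre-trained) embeddings."""
    # L2-normalise so dot products equal cosine similarity
    tr = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    te = test_emb / np.linalg.norm(test_emb, axis=1, keepdims=True)
    sims = te @ tr.T                        # (n_test, n_train)
    nn = np.argsort(-sims, axis=1)[:, :k]   # indices of k nearest neighbours
    votes = train_labels[nn]                # (n_test, k) labels of neighbours
    # majority vote per test sample
    return np.array([np.bincount(v).argmax() for v in votes])
```

Given a labelled support set of embeddings, held-out samples are classified by majority vote among their k most similar training samples, with no parameters trained.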
{"title":"Investigations on Self-supervised Learning for Script-, Font-type, and Location Classification on Historical Documents","authors":"Johan Zenk, Florian Kordon, Martin Mayr, Mathias Seuret, V. Christlein","doi":"10.1145/3604951.3605519","DOIUrl":"https://doi.org/10.1145/3604951.3605519","url":null,"abstract":"In the context of automated classification of historical documents, we investigate three contemporary self-supervised learning (SSL) techniques (SimSiam, Dino, and VICReg) for the pre-training of three different document analysis tasks, namely script-type, font-type, and location classification. Our study draws samples from multiple datasets that contain images of manuscripts, prints, charters, and letters. The representations derived via pre-text training are taken as inputs for k-NN classification and a parametric linear classifier. The latter is placed atop the pre-trained backbones to enable fine-tuning of the entire network to further improve the classification by exploiting task-specific label data. The network’s final performance is assessed via independent test sets obtained from the ICDAR2021 Competition on Historical Document Classification. 
We empirically show that representations learned with SSL are significantly better suited for subsequent document classification than features generated by commonly used transfer learning on ImageNet.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121022906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Gauging the Limitations of Natural Language Supervised Text-Image Metrics Learning by Iconclass Visual Concepts
Kai Labusch, Clemens Neudecker
Identification of images that are close to each other in terms of their iconographical meaning requires an applicable distance measure for text-image or image-image pairs. To obtain such a measure of distance, we finetune a group of contrastive loss based text-to-image similarity models (MS-CLIP) with respect to a large number of Iconclass visual concepts by means of natural language supervised learning. We show that there are certain Iconclass concepts that actually can be learned by the models whereas other visual concepts cannot be learned. We hypothesize that the visual concepts that can be learned more easily are intrinsically different from those that are more difficult to learn and that these qualitative differences can provide a valuable orientation for future research directions in text-to-image similarity learning.
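The natural-language-supervised fine-tuning mentioned above rests on a contrastive text-image objective. The NumPy sketch below shows a generic symmetric InfoNCE-style loss of the kind CLIP-family models optimise; it is not the authors' MS-CLIP training code, and the temperature value is an assumed default.

```python
import numpy as np

def clip_style_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE-style loss over a batch of matched image/text pairs
    (row i of each matrix is assumed to describe the same item)."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = (img @ txt.T) / temperature      # (n, n) cosine similarities
    n = len(logits)

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()  # diagonal = matches

    # pull matched pairs together in both directions, push mismatches apart
    return 0.5 * (xent(logits) + xent(logits.T))
```

Correctly aligned batches yield a lower loss than shuffled ones, which is what makes the resulting similarity score usable as a text-image distance measure.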
{"title":"Gauging the Limitations of Natural Language Supervised Text-Image Metrics Learning by Iconclass Visual Concepts","authors":"Kai Labusch, Clemens Neudecker","doi":"10.1145/3604951.3605516","DOIUrl":"https://doi.org/10.1145/3604951.3605516","url":null,"abstract":"Identification of images that are close to each other in terms of their iconographical meaning requires an applicable distance measure for text-image or image-image pairs. To obtain such a measure of distance, we finetune a group of contrastive loss based text-to-image similarity models (MS-CLIP) with respect to a large number of Iconclass visual concepts by means of natural language supervised learning. We show that there are certain Iconclass concepts that actually can be learned by the models whereas other visual concepts cannot be learned. We hypothesize that the visual concepts that can be learned more easily are intrinsically different from those that are more difficult to learn and that these qualitative differences can provide a valuable orientation for future research directions in text-to-image similarity learning.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114424739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Classifying The Scripts of Aramaic Incantation Bowls
Said Naamneh, Nour Atamni, Boraq Madi, Daria Vasyutinsky Shapira, Irina Rabaev, Jihad El-Sana, Shoshana Boardman
Aramaic incantation bowls are magical objects commonly used in Sasanian Mesopotamia (the region that includes modern-day Iraq and Iran) between the 4th and 7th centuries CE. These bowls were typically made of clay and inscribed with incantations in three dialects of Aramaic, the languages widely spoken in the region then. This paper focuses on bowls written in Jewish Babylonian Aramaic. The purpose of these bowls was to protect the homes of their owners from evil spirits and demons. The inscriptions on the bowls were most often written in a spiral fashion and often included the names of various demons and invocations of protective spirits and angels, alongside the names and family relationships of the clients, Biblical quotations, and other interesting material. The bowls were buried upside down beneath the floor of a home so that the incantations faced downward towards the underworld. This study tackles the problem of automatic classification of the script style of incantation bowls. To this end, we prepare and introduce a new dataset of incantation bowl images from the 4th to 7th centuries CE. We experiment with and compare several Siamese-based architectures, and introduce a new Multi-Level-of-Detail architecture, which extracts features at different scales. Our results establish baselines for future research and make valuable contributions to ongoing research addressing challenges in working with ancient artifact images.
{"title":"Classifying The Scripts of Aramaic Incantation Bowls","authors":"Said Naamneh, Nour Atamni, Boraq Madi, Daria Vasyutinsky Shapira, Irina Rabaev, Jihad El-Sana, Shoshana Boardman","doi":"10.1145/3604951.3605510","DOIUrl":"https://doi.org/10.1145/3604951.3605510","url":null,"abstract":"Aramaic incantation bowls are a magical object commonly used in Sasanian Mesopotamia (the region that includes modern-day Iraq and Iran) between the 4th and 7th centuries CE. These bowls were typically made of clay and inscribed with incantations in three dialects of Aramaic, the languages widely spoken in the region then. This paper focuses on bowls written in Jewish Babylonian Aramaic. The purpose of these bowls was to protect the homes of their owners from evil spirits and demons. The inscriptions on the bowls were most often written in a spiral fashion and often included the names of various demons and invocations of protective spirits and angels, alongside the names and family relationships of the clients, Biblical quotations, and other interesting material. The bowls were buried upside down beneath the floor of a home so that the incantations faced downward towards the underworld. This study tackles the problem of automatic classification of the script style of incantation bowls. To this end, we prepare and introduce a new dataset of incantation bowl images from the 4th to 7th centuries CE. We experiment with and compare several Siamese-based architectures, and introduce a new Multi-Level-of-Detail architecture, which extracts features at different scales. 
Our results establish baselines for future research and make valuable contributions to ongoing research addressing challenges in working with ancient artifact images.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123772409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Two-step sequence transformer based method for Cham to Latin script transliteration
Tien-Nam Nguyen, J. Burie, Thi-Lan Le, Anne-Valérie Schweyer
Fusing visual and textual information is an interesting way to better represent the features. In this work, we propose a method for the text line transliteration of Cham manuscripts by combining visual and textual modality. Instead of using a standard approach that directly recognizes the words in the image, we split the problem into two steps. Firstly, we propose a scenario for recognition where similar characters are considered as unique characters; then we use the transformer model, which considers both visual and context information, to adjust the prediction when it concerns similar characters, so as to be able to distinguish them. Based on this two-step strategy, the proposed method consists of a sequence-to-sequence model and a multi-modal transformer. Thus, we can take advantage of both the sequence-to-sequence model and the transformer model. Extensive experiments prove that the proposed method outperforms the approaches of the literature on our Cham manuscripts dataset.
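The two-step idea can be illustrated with a toy example: step 1 collapses visually confusable characters into a single placeholder class, and step 2 resolves each placeholder from its context. The confusable group and the context rule below are invented for illustration; in the paper, step 2 is a multi-modal transformer, not a lookup rule.

```python
# Step 1: recognise with similar glyphs merged into one placeholder class.
SIMILAR_GROUPS = {"u": "U*", "v": "U*"}   # hypothetical confusable pair

def collapse(chars):
    """Map each character to its merged class (step-1 output)."""
    return [SIMILAR_GROUPS.get(c, c) for c in chars]

# Step 2: a stand-in context model resolves each placeholder using its
# left neighbour; the paper uses a multi-modal transformer for this step.
def disambiguate(chars, context_model):
    out = []
    for i, c in enumerate(chars):
        if c == "U*":
            left = chars[i - 1] if i > 0 else ""
            out.append(context_model(left))
        else:
            out.append(c)
    return out
```

Splitting the task this way lets the recogniser avoid committing to a choice between near-identical glyphs, deferring that decision to a model that sees context.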
{"title":"Two-step sequence transformer based method for Cham to Latin script transliteration","authors":"Tien-Nam Nguyen, J. Burie, Thi-Lan Le, Anne-Valérie Schweyer","doi":"10.1145/3604951.3605525","DOIUrl":"https://doi.org/10.1145/3604951.3605525","url":null,"abstract":"Fusion information between visual and textual information is an interesting way to better represent the features. In this work, we propose a method for the text line transliteration of Cham manuscripts by combining visual and textual modality. Instead of using a standard approach that directly recognizes the words in the image, we split the problem into two steps. Firstly, we propose a scenario for recognition where similar characters are considered as unique characters, then we use the transformer model which considers both visual and context information to adjust the prediction when it concerns similar characters to be able to distinguish them. Based on this two-step strategy, the proposed method consists of a sequence to sequence model and a multi-modal transformer. Thus, we can take advantage of both the sequence-to-sequence model and the transformer model. Extensive experiments prove that the proposed method outperforms the approaches of the literature on our Cham manuscripts dataset.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132429168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Study of historical Byzantine seal images: the BHAI project for computer-based sigillography
Victoria Eyharabide, Laurence Likforman-Sulem, Lucia Maria Orlandi, Alexandre Binoux, Theophile Rageau, Qijia Huang, A. Fiandrotti, Beatrice Caseau, Isabelle Bloch
BHAI 1 (Byzantine Hybrid Artificial Intelligence) is the first project based on artificial intelligence dedicated to Byzantine seals. The scientific consortium comprises a multidisciplinary team involving historians specialized in the Byzantine period, specialists in sigillography, and computer science experts. This article describes the main objectives of this project: data acquisition of seal images, text and iconography recognition, seal dating, as well as our current achievements and first results on character recognition and spatial analysis of personages.
{"title":"Study of historical Byzantine seal images: the BHAI project for computer-based sigillography","authors":"Victoria Eyharabide, Laurence Likforman-Sulem, Lucia Maria Orlandi, Alexandre Binoux, Theophile Rageau, Qijia Huang, A. Fiandrotti, Beatrice Caseau, Isabelle Bloch","doi":"10.1145/3604951.3605523","DOIUrl":"https://doi.org/10.1145/3604951.3605523","url":null,"abstract":"BHAI 1 (Byzantine Hybrid Artificial Intelligence) is the first project based on artificial intelligence dedicated to Byzantine seals. The scientific consortium comprises a multidisciplinary team involving historians specialized in the Byzantine period, specialists in sigillography, and computer science experts. This article describes the main objectives of this project: data acquisition of seal images, text and iconography recognition, seal dating, as well as our current achievements and first results on character recognition and spatial analysis of personages.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123746170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A hybrid CNN-Transformer model for Historical Document Image Binarization
V. Rezanezhad, Konstantin Baierer, Clemens Neudecker
Document image binarization is one of the main preprocessing steps in document image analysis for text recognition. Noise, faint characters, bad scanning conditions, uneven lighting or paper aging can cause artifacts that negatively impact text recognition algorithms. The task of binarization is to segment the foreground (text) from these degradations in order to improve optical character recognition (OCR) results. Convolutional Neural Networks (CNNs) are a popular method for binarization, but they tend to focus on local context in a document image. We have applied a hybrid CNN-Transformer model to convert a document image into a binary output. For the model training, we used datasets from the Document Image Binarization Contests (DIBCO). For the datasets DIBCO-2012, DIBCO-2017 and DIBCO-2018, our model outperforms the state-of-the-art algorithms.
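For context, learned binarization models such as this one are typically compared against classical global thresholding on the DIBCO benchmarks. Below is a pure-NumPy sketch of the standard Otsu baseline (not the authors' model): it picks the grey-level threshold that maximises between-class variance.

```python
import numpy as np

def otsu_binarize(gray):
    """Classical global-threshold baseline (Otsu) for a uint8 grey image.
    Returns 1 for background (bright) pixels and 0 for ink (dark)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()   # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0        # dark-class mean
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1   # bright-class mean
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return (gray >= best_t).astype(np.uint8)
```

A single global threshold fails exactly on the degradations listed in the abstract (uneven lighting, faint ink), which is the motivation for learned binarization models.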
{"title":"A hybrid CNN-Transformer model for Historical Document Image Binarization","authors":"V. Rezanezhad, Konstantin Baierer, Clemens Neudecker","doi":"10.1145/3604951.3605508","DOIUrl":"https://doi.org/10.1145/3604951.3605508","url":null,"abstract":"Document image binarization is one of the main preprocessing steps in document image analysis for text recognition. Noise, faint characters, bad scanning conditions, uneven lighting or paper aging can cause artifacts that negatively impact text recognition algorithms. The task of binarization is to segment the foreground (text) from these degradations in order to improve optical character recognition (OCR) results. Convolutional Neural Networks (CNNs) are one popular method for binarization. But they suffer from focusing on the local context in a document image. We have applied a hybrid CNN-Transformer model to convert a document image into a binary output. For the model training, we used datasets from the Document Image Binarization Contests (DIBCO). For the datasets DIBCO-2012, DIBCO-2017 and DIBCO-2018, our model outperforms the state-of-the-art algorithms.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125099427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Homer restored: Virtual reconstruction of Papyrus Bodmer 1
Simon Perrin, Léopold Cudilla, Yejing Xie, H. Mouchère, Isabelle Marthot-Santaniello
In this paper, we propose a complete method to reconstruct a damaged piece of papyrus using its image annotated at the character level and the original ancient Greek text (known otherwise). Our reconstruction allows us to recreate the written surface, making it readable and consistent with the original one. Our method is in two stages. First, the text is reconstructed by pasting character patches in their possible locations. Second, we reconstruct the background of the papyrus by applying inpainting methods. Two different inpainting techniques are tested in this article, one traditional and one using a GAN. This global reconstruction method is applied on a piece of Papyrus Bodmer 1. The results are evaluated visually by the authors of the paper and by researchers in papyrology. This reconstruction allows historians to investigate new paths on the topic of writing culture and materiality while it significantly improves the ability of non specialists to picture what this papyrus, and ancient books in general, could have looked like in Antiquity.
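The first stage (pasting character patches at their annotated positions) can be sketched as below; the canvas, patch contents, and coordinates are hypothetical stand-ins for the annotated papyrus data, and the second stage (background inpainting) is not shown.

```python
import numpy as np

def paste_patches(canvas, patches):
    """Step 1 of the reconstruction: paste character patches at their
    annotated (row, col) positions on a blank canvas. Inpainting of the
    papyrus background (step 2) would then fill the remaining area."""
    out = canvas.copy()
    for (r, c), patch in patches:
        h, w = patch.shape
        out[r:r + h, c:c + w] = patch
    return out
```

Keeping the two stages separate means the text layer stays faithful to the character-level annotations while the background can be synthesised independently (traditionally or with a GAN).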
{"title":"Homer restored: Virtual reconstruction of Papyrus Bodmer 1","authors":"Simon Perrin, Léopold Cudilla, Yejing Xie, H. Mouchère, Isabelle Marthot-Santaniello","doi":"10.1145/3604951.3605518","DOIUrl":"https://doi.org/10.1145/3604951.3605518","url":null,"abstract":"In this paper, we propose a complete method to reconstruct a damaged piece of papyrus using its image annotated at the character level and the original ancient Greek text (known otherwise). Our reconstruction allows us to recreate the written surface, making it readable and consistent with the original one. Our method is in two stages. First, the text is reconstructed by pasting character patches in their possible locations. Second, we reconstruct the background of the papyrus by applying inpainting methods. Two different inpainting techniques are tested in this article, one traditional and one using a GAN. This global reconstruction method is applied on a piece of Papyrus Bodmer 1. The results are evaluated visually by the authors of the paper and by researchers in papyrology. This reconstruction allows historians to investigate new paths on the topic of writing culture and materiality while it significantly improves the ability of non specialists to picture what this papyrus, and ancient books in general, could have looked like in Antiquity.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134501182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
NAME – A Rich XML Format for Named Entity and Relation Tagging
C. Clausner, S. Pletschacher, A. Antonacopoulos
We present NAME XML, a schema for named entities and relations in documents. The standout features are: option to reference a variety of document formats (such as PAGE XML or plain text), support of entity hierarchies, custom entity types via ontologies, more expressivity due to disambiguation of base entities and entity attributes (e.g. “person” and “person name”), and relations between entities that can be directed or undirected. We describe the format in detail, show examples, and discuss real-world use cases.
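A hypothetical instance of such an entity/relation record, built with Python's ElementTree, might look like the following. All tag and attribute names here are invented for illustration; the actual NAME schema defines its own element names and structure.

```python
import xml.etree.ElementTree as ET

# Build a small record: one "person" base entity carrying a "person-name"
# attribute (illustrating the base-entity/attribute disambiguation the
# abstract describes), plus one directed relation to a second entity.
root = ET.Element("NamedEntities")
person = ET.SubElement(root, "Entity", id="e1", type="person")
ET.SubElement(person, "Attribute", type="person-name", value="Ada Lovelace")
ET.SubElement(root, "Relation", source="e1", target="e2",
              type="child-of", directed="true")
xml_text = ET.tostring(root, encoding="unicode")
```

Separating the base entity ("person") from its attributes ("person name") is what gives the format its extra expressivity over flat entity tags.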
{"title":"NAME – A Rich XML Format for Named Entity and Relation Tagging","authors":"C. Clausner, S. Pletschacher, A. Antonacopoulos","doi":"10.1145/3604951.3605521","DOIUrl":"https://doi.org/10.1145/3604951.3605521","url":null,"abstract":"We present NAME XML, a schema for named entities and relations in documents. The standout features are: option to reference a variety of document formats (such as PAGE XML or plain text), support of entity hierarchies, custom entity types via ontologies, more expressivity due to disambiguation of base entities and entity attributes (e.g. “person” and “person name”), and relations between entities that can be directed or undirected. We describe the format in detail, show examples, and discuss real-world use cases.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134394575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Enhancing Named Entity Recognition for Holocaust Testimonies through Pseudo Labelling and Transformer-based Models
Isuri Anuradha Nanomi Arachchige, L. Ha, R. Mitkov, Johannes-Dieter Steinert
The Holocaust was a tragic and catastrophic event in World War II (WWII) history that resulted in the loss of millions of lives. In recent years, the emergence of the field of digital humanities has made the study of Holocaust testimonies an important area of research for historians, Holocaust educators, social scientists, and linguists. One of the challenges in analysing Holocaust testimonies is the recognition and categorisation of named entities such as concentration camps, military officers, ships, and ghettos, due to the scarcity of annotated data. This paper presents a research study on a domain-specific hybrid named-entity recognition model, which focuses on developing NER models specifically tailored for the Holocaust domain. To overcome the problem of data scarcity, we employed a hybrid annotation approach to train different transformer model architectures to recognise the named entities. Results show that transformer models perform well compared to other approaches.
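The pseudo-labelling step can be sketched generically: the current model labels unannotated text, and only high-confidence predictions are kept to grow the training set for the next round. The confidence threshold and the model interface below are assumptions, not values from the paper.

```python
def pseudo_label(model, unlabelled, threshold=0.9):
    """One round of pseudo-labelling: keep only the predictions the current
    model is confident about. `model` returns a (label, confidence) pair for
    a token; the 0.9 threshold is an assumed value for illustration."""
    accepted = []
    for token in unlabelled:
        label, conf = model(token)
        if conf >= threshold:
            accepted.append((token, label))
    return accepted
```

Iterating this loop (retrain on gold plus accepted pseudo-labels, then relabel) is the standard way to stretch a small annotated corpus over a large unannotated one.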
{"title":"Enhancing Named Entity Recognition for Holocaust Testimonies through Pseudo Labelling and Transformer-based Models","authors":"Isuri Anuradha Nanomi Arachchige, L. Ha, R. Mitkov, Johannes-Dieter Steinert","doi":"10.1145/3604951.3605514","DOIUrl":"https://doi.org/10.1145/3604951.3605514","url":null,"abstract":"The Holocaust was a tragic and catastrophic event in World War II (WWII) history that resulted in the loss of millions of lives. In recent years, the emergence of the field of digital humanities has made the study of Holocaust testimonies an important area of research for historians, Holocaust educators, social scientists, and linguists. One of the challenges in analysing Holocaust testimonies is the recognition and categorisation of named entities such as concentration camps, military officers, ships, and ghettos, due to the scarcity of annotated data. This paper presents a research study on a domain-specific hybrid named-entity recognition model, which focuses on developing NER models specifically tailored for the Holocaust domain. To overcome the problem of data scarcity, we employed hybrid annotation approach to training different transformer model architectures in order to recognise the named entities. Results show transformer models to have good performance compared to other approaches.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122105646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts
Mohamed Ali Souibgui, Pau Torras, Jialuo Chen, A. Fornés
This paper investigates the effectiveness of different deep learning HTR families, including LSTM, Seq2Seq, and transformer-based approaches with self-supervised pretraining, in recognizing ciphered manuscripts from different historical periods and cultures. The goal is to identify the most suitable method or training techniques for recognizing ciphered manuscripts and to provide insights into the challenges and opportunities in this field of research. We evaluate the performance of these models on several datasets of ciphered manuscripts and discuss their results. This study contributes to the development of more accurate and efficient methods for recognizing historical manuscripts for the preservation and dissemination of our cultural heritage.
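HTR systems like those evaluated here are conventionally scored by character error rate (CER): Levenshtein edit distance divided by reference length. A self-contained sketch of that metric (a standard formula, not code from the paper):

```python
def cer(reference, hypothesis):
    """Character error rate: Levenshtein edits / reference length.
    Single-row dynamic-programming implementation."""
    m, n = len(reference), len(hypothesis)
    dp = list(range(n + 1))               # edit distances for empty reference
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i            # prev holds dp[i-1][j-1]
        for j in range(1, n + 1):
            cur = dp[j]
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dp[j] = min(dp[j] + 1,        # deletion
                        dp[j - 1] + 1,    # insertion
                        prev + cost)      # substitution or match
            prev = cur
    return dp[n] / max(m, 1)
```

Lower is better; a CER of 0.0 means the transcription matches the reference exactly.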
{"title":"An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts","authors":"Mohamed Ali Souibgui, Pau Torras, Jialuo Chen, A. Fornés","doi":"10.1145/3604951.3605509","DOIUrl":"https://doi.org/10.1145/3604951.3605509","url":null,"abstract":"This paper investigates the effectiveness of different deep learning HTR families, including LSTM, Seq2Seq, and transformer-based approaches with self-supervised pretraining, in recognizing ciphered manuscripts from different historical periods and cultures. The goal is to identify the most suitable method or training techniques for recognizing ciphered manuscripts and to provide insights into the challenges and opportunities in this field of research. We evaluate the performance of these models on several datasets of ciphered manuscripts and discuss their results. This study contributes to the development of more accurate and efficient methods for recognizing historical manuscripts for the preservation and dissemination of our cultural heritage.","PeriodicalId":375632,"journal":{"name":"Proceedings of the 7th International Workshop on Historical Document Imaging and Processing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128040705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0