Papy-S-Net: A Siamese Network to match papyrus fragments

A. Pirrone, M. Beurton-Aimar, N. Journet
Proceedings of the 5th International Workshop on Historical Document Imaging and Processing
Published: 2019-09-20
DOI: 10.1145/3352631.3352646 (https://doi.org/10.1145/3352631.3352646)
Citations: 12

Abstract

Like all heritage documents, papyri are the subject of in-depth study by scientists. While large volumes of papyri have been digitized and indexed, many still await processing. Studying a papyrus takes time, mainly because papyri are rarely available in one piece. Papyrologists must review a large number of fragments, find those that go together, and then assemble them before they can analyze the text. Unfortunately, some fragments no longer exist. The result is a time-consuming puzzle in which not all the pieces are available and fragment boundaries do not match perfectly.

This article describes a method to help papyrologists save time in solving this complex puzzle. We provide a solution in which an expert uses a fragment as a query and retrieves the fragments that belong to the same papyrus. The main contribution is a deep siamese network architecture, called Papy-S-Net (for Papyrus-Siamese-Network), designed for papyrus fragment matching. The network is trained and validated on approximately 500 papyrus fragments. We compare the results of Papy-S-Net with previous work by Koch et al. [14], which proposes a siamese network to match written symbols. To train and validate the network, we extract patches from the papyrus fragments to create our ground truth. Papy-S-Net outperforms Koch et al.'s network. We also evaluate our approach on a real use case, in which Papy-S-Net achieves 79% correct matches.
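The abstract states that patches are extracted from the fragments to build the ground truth used to train and validate the siamese network, but it does not say how patches are paired. The following is a minimal sketch of one common way to do this, assuming that two patches from the same papyrus form a positive pair (label 1) and two patches from different papyri form a negative pair (label 0). The function name `build_patch_pairs`, the `negatives_per_positive` parameter, and the toy patch identifiers are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
import random


def build_patch_pairs(patches_by_papyrus, negatives_per_positive=1, seed=0):
    """Build (patch_a, patch_b, label) triples for siamese training.

    Assumption (not from the paper): label 1 means both patches come from
    the same papyrus, label 0 means they come from different papyri.
    """
    rng = random.Random(seed)  # seeded so the sampled negatives are reproducible
    papyrus_ids = sorted(patches_by_papyrus)
    pairs = []
    for pid in papyrus_ids:
        # Other papyri that have at least one patch to sample a negative from.
        others = [p for p in papyrus_ids if p != pid and patches_by_papyrus[p]]
        for a, b in combinations(patches_by_papyrus[pid], 2):
            pairs.append((a, b, 1))  # positive pair: same papyrus
            if not others:
                continue
            for _ in range(negatives_per_positive):
                other = rng.choice(others)
                pairs.append((a, rng.choice(patches_by_papyrus[other]), 0))
    return pairs


# Toy data: strings stand in for image patches.
toy = {"papyrus_A": ["A1", "A2", "A3"], "papyrus_B": ["B1", "B2"]}
pairs = build_patch_pairs(toy)
```

Sampling one negative per positive keeps the two classes balanced, which is a design choice of this sketch rather than a detail reported in the abstract.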