John Pavlopoulos, Maria Konstantinidou, Elpida Perdiki, Isabelle Marthot-Santaniello, Holger Essler, Georgios Vardakas, Aristidis Likas
{"title":"Explainable dating of greek papyri images","authors":"John Pavlopoulos, Maria Konstantinidou, Elpida Perdiki, Isabelle Marthot-Santaniello, Holger Essler, Georgios Vardakas, Aristidis Likas","doi":"10.1007/s10994-024-06589-w","DOIUrl":null,"url":null,"abstract":"<p>Greek literary papyri, which are unique witnesses of antique literature, do not usually bear a date. They are thus currently dated based on palaeographical methods, with broad approximations which often span more than a century. We created a dataset of 242 images of papyri written in “bookhand” scripts whose date can be securely assigned, and we used it to train algorithms for the task of dating, showing its challenging nature. To address data scarcity, we extended our dataset by segmenting each image into its respective text lines. By using the line-based version of our dataset, we trained a Convolutional Neural Network, equipped with a fragmentation-based augmentation strategy, and we achieved a mean absolute error of 54 years. The results improve further when the task is cast as a multi-class classification problem, predicting the century. Using our network, we computed precise date estimations for papyri whose date is disputed or vaguely defined, employing explainability to understand dating-driving features.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"29 1","pages":""},"PeriodicalIF":4.3000,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10994-024-06589-w","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Greek literary papyri, which are unique witnesses of antique literature, do not usually bear a date. They are thus currently dated based on palaeographical methods, with broad approximations which often span more than a century. We created a dataset of 242 images of papyri written in “bookhand” scripts whose date can be securely assigned, and we used it to train algorithms for the task of dating, showing its challenging nature. To address data scarcity, we extended our dataset by segmenting each image into its respective text lines. By using the line-based version of our dataset, we trained a Convolutional Neural Network, equipped with a fragmentation-based augmentation strategy, and we achieved a mean absolute error of 54 years. The results improve further when the task is cast as a multi-class classification problem, predicting the century. Using our network, we computed precise date estimations for papyri whose date is disputed or vaguely defined, employing explainability to understand dating-driving features.
期刊介绍:
Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.