Visual language processing (VLP) of ancient manuscripts: Converting collections to windows on the past

2013 7th IEEE GCC Conference and Exhibition (GCC) Pub Date : 2013-11-01 DOI:10.1109/IEEEGCC.2013.6705813

M. Cheriet, R. F. Moghaddam, R. Hedjam


Citations: 4

Abstract

Ancient manuscripts are a primary carrier of cultural heritage, and they are being intensively digitized around the world to ensure their preservation and, ultimately, wide access to their content. Critical to this research process are the legibility of the documents in image form and access to live texts. Several state-of-the-art methods have been proposed and developed to address the challenges of processing these manuscripts. However, the huge amount of data involved, together with the high cost and scarcity of human expert feedback and reference data, calls for fundamental approaches that encompass all these aspects in an objective and tractable manner. In this paper, we propose one such approach: a novel framework for the computational pattern analysis of ancient manuscripts that is data-driven, multilevel, self-sustaining, and learning-based, and that takes advantage of the large quantities of unprocessed data available. Unlike many approaches, which fast-forward to the processing and analysis of feature vectors, our framework starts from ground zero of the problem, namely the definition of objects. In addition, it leverages data-driven mining of relations among objects to discover hidden but persistent links between them. The problem is addressed at three main levels. At the lowest level, that of images, the framework tackles automatic, data-driven enhancement and restoration of document images using spatial, spectral, sparse, and graph-based representations of visual objects. At the second level, that of transliteration, directed graphical models, hidden Markov models (HMMs), undirected random fields, and spatial relations models are used to extract the live text of manuscript images, which reduces dependency on human experts. Finally, the highest level, that of network analysis of the relations among objects (from patches and words to manuscripts and writers), involves the search for "social networks" linking manuscripts. Considering this approach under the umbrella of Visual Language Processing (VLP), we hope that the research community will further enrich it with new insights and approaches contributed at the various levels.
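As a rough illustration of the third level only, the search for "social networks" linking manuscripts can be pictured as building a graph over manuscripts and reading off its connected groups. The sketch below is an assumption-laden toy, not the method of the paper: the abstract does not say how links are scored, so cosine similarity, the `threshold` value, the function names `cosine` and `link_manuscripts`, and the feature vectors are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_manuscripts(features, threshold=0.9):
    """Group manuscript ids that are connected by similarity >= threshold.

    Both the similarity measure and the threshold are illustrative
    assumptions; the paper's abstract does not specify them.
    """
    # Undirected graph: one node per manuscript, one edge per similar pair.
    neighbours = {name: set() for name in features}
    for a, b in combinations(features, 2):
        if cosine(features[a], features[b]) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Connected components via iterative depth-first search.
    seen, groups = set(), []
    for start in features:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Made-up feature vectors for three hypothetical manuscripts (not paper data).
toy = {
    "ms_a": [1.0, 0.0, 0.2],
    "ms_b": [0.9, 0.1, 0.2],
    "ms_c": [0.0, 1.0, 0.0],
}
groups = link_manuscripts(toy)
```

In a real setting the nodes could equally be patches, words, or writers, as the abstract suggests, and the edge criterion would presumably be learned from data rather than fixed by hand.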


