A Scheme Towards Automatic Word Indexation System for Balinese Palm Leaf Manuscripts

Journal of ICT Research and Applications (IF 0.5, Q4, Computer Science, Information Systems) | Pub Date: 2021-10-07 | DOI: 10.5614/itbj.ict.res.appl.2021.15.2.1
M. W. A. Kesiman, G. Pradnyana
Citations: 0

Abstract

This paper proposes an initial scheme for an automatic word indexation system for collections of Balinese lontars (palm leaf manuscripts). The scheme consists of two submodules: one that extracts patch images from the text areas of a lontar, and one that transliterates word images. It is the first word indexation system to be proposed for lontar collections. To detect the parts of a lontar image that contain text, a Gabor filter is used to provide initial information about the presence of text texture in the image. An adaptive sliding patch algorithm is also proposed for extracting patch images from lontars. The word image transliteration submodule was built using a long short-term memory (LSTM) model. The results showed that the patch extraction process succeeded in optimally detecting the text areas in lontars and in extracting patch images at suitable positions. The proposed scheme successfully extracted between 20% and 40% of the keywords in lontars. It can therefore at least give prospective readers an initial description of the content of a lontar collection, or help them find which lontar collection contains certain keywords.
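The abstract does not give the details of the Gabor filtering step or of the adaptive sliding patch algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a textbook real-valued Gabor kernel, a text "energy" map summed over four orientations, and a hypothetical adaptation rule in which each sliding window is re-centred on the energy centroid it contains. All function names, parameter values, and the threshold are assumptions made for illustration.

```python
import numpy as np

def gabor_kernel(ksize=15, sigma=3.0, theta=0.0, wavelength=6.0, gamma=0.5):
    """Real part of a standard Gabor kernel (parameter values are assumptions)."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * x_r / wavelength)
    return kernel - kernel.mean()  # zero mean: flat background gives no response

def filter_fft(image, kernel):
    """Same-size 2D convolution computed with NumPy FFTs."""
    h, w = image.shape
    kh, kw = kernel.shape
    shape = (h + kh - 1, w + kw - 1)
    spectrum = np.fft.rfft2(image, s=shape) * np.fft.rfft2(kernel, s=shape)
    full = np.fft.irfft2(spectrum, s=shape)
    return full[kh // 2:kh // 2 + h, kw // 2:kw // 2 + w]

def text_energy(image, thetas=(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Sum of absolute Gabor responses over several orientations."""
    energy = np.zeros(image.shape, dtype=float)
    for theta in thetas:
        energy += np.abs(filter_fft(image, gabor_kernel(theta=theta)))
    return energy

def extract_patches(energy, patch=(32, 64), stride=(32, 64)):
    """Hypothetical adaptive sliding patch: keep windows whose mean energy
    exceeds the image mean, then re-centre each kept window on its centroid."""
    ph, pw = patch
    h, w = energy.shape
    threshold = energy.mean()  # assumed threshold, not taken from the paper
    rows, cols = np.arange(ph), np.arange(pw)
    boxes = []
    for top in range(0, h - ph + 1, stride[0]):
        for left in range(0, w - pw + 1, stride[1]):
            win = energy[top:top + ph, left:left + pw]
            if win.mean() <= threshold:
                continue  # window is mostly background
            total = win.sum()
            cy = (win.sum(axis=1) * rows).sum() / total
            cx = (win.sum(axis=0) * cols).sum() / total
            new_top = int(round(top + cy - (ph - 1) / 2))
            new_left = int(round(left + cx - (pw - 1) / 2))
            new_top = min(max(new_top, 0), h - ph)    # stay inside the image
            new_left = min(max(new_left, 0), w - pw)
            boxes.append((new_top, new_left, ph, pw))
    return boxes

# Synthetic stand-in for a lontar image: blank leaf with one striped "text" block.
image = np.zeros((128, 256))
image[40:72, 64:192] = (np.arange(128) % 6 < 3).astype(float)
energy = text_energy(image)
boxes = extract_patches(energy)
```

Each returned box is a `(top, left, height, width)` tuple. In the scheme described above, such patches would then be passed to the LSTM-based transliteration submodule, which is not sketched here.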
Source journal
Journal of ICT Research and Applications (Computer Science, Information Systems)
CiteScore: 1.60
Self-citation rate: 0.00%
Articles published: 13
Review time: 24 weeks
Journal description: Journal of ICT Research and Applications welcomes full research articles in the area of Information and Communication Technology from the following subject areas: Information Theory, Signal Processing, Electronics, Computer Network, Telecommunication, Wireless & Mobile Computing, Internet Technology, Multimedia, Software Engineering, Computer Science, Information System and Knowledge Management. Authors are invited to submit articles that have not been published previously and are not under consideration elsewhere.