Title: A Scheme Towards Automatic Word Indexation System for Balinese Palm Leaf Manuscripts
Authors: M. W. A. Kesiman, G. Pradnyana
Journal: Journal of ICT Research and Applications (JCR Q4, Computer Science, Information Systems; impact factor 0.5)
Published: 2021-10-07
DOI: 10.5614/itbj.ict.res.appl.2021.15.2.1 (https://doi.org/10.5614/itbj.ict.res.appl.2021.15.2.1)
Citations: 0
Abstract
This paper proposes an initial scheme for an automatic word indexation system for collections of Balinese lontar (palm leaf manuscripts). The scheme consists of two submodules: one extracts patch images from the text areas of a lontar, and the other transliterates word images. It is the first word indexation system proposed for lontar collections. To detect the parts of a lontar image that contain text, a Gabor filter provides initial information about the presence of text texture in the image. An adaptive sliding patch algorithm for extracting patch images from lontars is also proposed. The word image transliteration submodule was built using a long short-term memory (LSTM) model. The results show that the patch extraction process succeeded in optimally detecting text areas in lontars and in extracting patch images at suitable positions. The proposed scheme successfully extracted between 20% and 40% of the keywords in lontars. It can therefore at least give prospective readers an initial description of the content of a lontar collection, or help them find which lontar collection contains certain keywords.
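As a rough illustration of the text-detection idea described in the abstract, the sketch below applies a Gabor kernel to a tiny synthetic image and slides a window over the response to keep high-texture regions. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`gabor_kernel`, `filter_response`, `text_patches`), all parameter values, and the synthetic stripe image are hypothetical, and the window here uses a fixed step rather than the adaptive sliding patch algorithm the paper proposes.

```python
import math

def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5):
    """Real part of a Gabor kernel as a size x size list of lists (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2 * sigma * sigma))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    # Subtract the mean so that flat (texture-free) regions give zero response.
    mean = sum(map(sum, kernel)) / (size * size)
    return [[v - mean for v in row] for row in kernel]

def filter_response(image, kernel):
    """Absolute value of the valid-mode correlation of image with kernel."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    acc += image[i + a][j + b] * kernel[a][b]
            row.append(abs(acc))
        out.append(row)
    return out

def text_patches(response, patch_w, step, threshold):
    """Fixed-step sliding window (an assumed simplification, not the adaptive version):
    keep column ranges of the response map whose mean exceeds threshold."""
    h, w = len(response), len(response[0])
    kept = []
    for x0 in range(0, w - patch_w + 1, step):
        total = sum(response[y][x] for y in range(h) for x in range(x0, x0 + patch_w))
        if total / (h * patch_w) > threshold:
            kept.append((x0, x0 + patch_w))
    return kept

# Synthetic 9 x 40 image: flat on the left, vertical stripes (period 4) on the right,
# standing in for blank leaf versus stroke-like text texture.
image = [[(1.0 if x >= 20 and x % 4 < 2 else 0.0) for x in range(40)] for _ in range(9)]
kernel = gabor_kernel(size=7, sigma=2.0, theta=0.0, wavelength=4.0)
response = filter_response(image, kernel)
kept = text_patches(response, patch_w=8, step=8, threshold=1.0)
```

The kernel wavelength is matched to the stripe period here, so the striped half responds strongly while the flat half gives zero. Note that the kept ranges are in response-map coordinates, which are offset from image columns by the kernel half-width. On real manuscript images the orientation, wavelength, and threshold would need tuning, and a library convolution would replace the nested loops.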
About the journal:
Journal of ICT Research and Applications welcomes full research articles in the area of Information and Communication Technology from the following subject areas: Information Theory, Signal Processing, Electronics, Computer Network, Telecommunication, Wireless & Mobile Computing, Internet Technology, Multimedia, Software Engineering, Computer Science, Information System and Knowledge Management. Authors are invited to submit articles that have not been published previously and are not under consideration elsewhere.