Extracting and Tagging Unstructured Citation of a Hebrew Religious Document

Dror Mughaz, Yaakov HaCohen-Kerner, D. Gabbay
Proceedings of the 2019 InSITE Conference, published 2019-06-06
DOI: 10.28945/4345 (https://doi.org/10.28945/4345)
Citation count: 1

Abstract

Aim/Purpose: Finding and tagging citations in an ancient Hebrew religious document. Such documents have no structured citations and no bibliography.

Background: We look for common patterns within Hebrew religious texts.

Methodology: We developed a method that scans the texts and extracts sentences containing the names of three famous authors. Within these sentences we identify the common ways of addressing those three authors, and with these patterns we find references to various other authors.

Contribution: This type of text is rich in citations and references to authors, but because the references have no fixed structure, it is very difficult for a computer to identify them automatically. We hope that the method we have developed will make it easier for a computer to identify references and even turn them into hyperlinks.

Findings: We provide an algorithm that addresses the problem of unstructured citations in old Hebrew plain text. The algorithm found many citations but missed some types of citations.

Impact on Society: When a computer recognizes references, it can build (at least partially) a bibliography, which currently does not exist in such texts at all. Over time, more and more ancient texts are scanned with OCR. This method can make access to these texts, and understanding of them, much easier.

Future Research: After identifying the references, we plan to automatically create a bibliography for these texts and even transform the references into hyperlinks.
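The methodology in the abstract (take sentences that mention known authors, learn how those authors are introduced, then reuse the introductions to find other authors) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the three seed authors or describe the pattern format, so the placeholder names (`AuthorA`, `AuthorX`, and so on), the English toy text, the two-word context window, and the function names are all assumptions made for illustration.

```python
import re
from collections import Counter

def split_sentences(text):
    # Naive splitter on periods and colons; an assumption, not the paper's rule.
    return [s.strip() for s in re.split(r"[.:]\s*", text) if s.strip()]

def learn_patterns(sentences, seed_names, window=2):
    # Count the `window` words that immediately precede each seed author name.
    patterns = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token in seed_names and i >= window:
                patterns[tuple(tokens[i - window:i])] += 1
    return patterns

def find_candidates(sentences, patterns, seed_names, window=2):
    # A word that follows a learned pattern and is not a seed name
    # is treated as a candidate reference to another author.
    found = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(window, len(tokens)):
            context = tuple(tokens[i - window:i])
            if context in patterns and tokens[i] not in seed_names:
                found[tokens[i]] += 1
    return found

# Hypothetical toy text and placeholder names, not taken from the paper.
toy_text = (
    "and so wrote AuthorA in his book. "
    "as explained by AuthorB there. "
    "and so wrote AuthorX on this matter. "
    "as explained by AuthorY at length. "
    "we do not rule this way."
)
seed_names = {"AuthorA", "AuthorB", "AuthorC"}

sentences = split_sentences(toy_text)
patterns = learn_patterns(sentences, seed_names)
candidates = find_candidates(sentences, patterns, seed_names)
print(sorted(candidates))
```

A fixed window of preceding words is the simplest possible pattern. It will miss citations phrased in ways never seen next to a seed author, which is consistent with the abstract's remark that some types of citations were missed.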