Skaitmeninių metodų taikymas praeities kelionių tyrimuose (Applying Digital Methods to the Study of Past Travel)

Knygotyra (Q2, Arts and Humanities); Pub Date: 2023-12-22; DOI: 10.15388/knygotyra.2023.81.6
Rimvydas Laužikas

Abstract

The mass digitisation of written historical sources, optical character recognition (OCR) of their texts, and their online availability in recent decades have created new opportunities and challenges for historical research. The digital humanities research model presented in this paper is based on the information organisation paradigm and on the application of digital methods to the study of past travel. The model was developed and tested using the materials of the project “Homo Viator: Travel Space and Travellers’ Experiences in Early Modern Lithuania”.

The main research problem is that egodocuments (letters, diaries, memoirs, etc.), one of the essential sources of information about past travel, describe journeys alongside the other important life events of a particular person. Travel descriptions form only a small part of the text of any given egodocument and are unevenly distributed among egodocuments. Given the length of egodocuments, the size of their collections, and the number of egodocuments published in different languages, studying them as sources for a single aspect (travel) requires substantial human and time resources. Other sources of knowledge about past travel pose a similar problem: a massive number of documents have been published in digital form (including OCR), their texts are voluminous, and the fragments related to travel, country descriptions, travel routes, travel and mobility infrastructure, and travellers’ experiences are relatively small and scattered throughout the source text.

The research model described in the paper has two steps: (i) collection of a corpus of OCR source texts; (ii) collection of empirical data using a dictionary-based computer-aided (or computer-assisted) qualitative text analysis method implemented with the MaxQDA software. The source text corpus is collected by applying the general principles and methods of online searching for scholarly publications. The corpus comprises authentic, published sources relevant to the study (letters, diaries, memoirs, etc.) and scholarly publications about them, thus forming two blocks of text: sources and literature. The literature block serves as additional material for a more precise selection and interpretation of the source texts. A key element of dictionary-based computer-aided qualitative text analysis is a high-quality dictionary that accurately describes the concepts (categories) relevant to the research. Because of the specificity of the sources (the languages used in them and in their translations), a multilingual dictionary (Lithuanian-Polish-English-Russian-German) was compiled. The dictionary is structured around six concepts (categories) related to past travel: (i) journey (general description), (ii) road and its infrastructure (bridges, fords, etc.), (iii) means of transportation, (iv) resting and accommodation places (towns, villages, taverns, post offices, etc.), (v) people encountered on the way (inn-keepers, highwaymen, guides, etc.), and (vi) food of the journey. Each concept is described by a set of keywords and phrases.

In the last stage of the study, the research model was tested. The testing showed that the model could solve the problems described above, which arose during the project.
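The dictionary-based coding step can be illustrated with a minimal Python sketch. It is not the MaxQDA workflow used in the project and does not reproduce the project's dictionary: the category names, the keyword stems, the `code_segments` function, and the sample text are all hypothetical, chosen only to mirror the six categories listed in the abstract. The sketch assumes that paragraphs are the coding unit and that matching word beginnings is an acceptable stand-in for handling inflected forms.

```python
import re

# Hypothetical miniature dictionary: category -> keyword stems in several
# languages. These stems are illustrative and NOT taken from the paper.
TRAVEL_DICTIONARY = {
    "journey": ["journey", "travel", "kelion", "podróż", "reise"],
    "road_infrastructure": ["road", "bridge", "tilt", "brück"],
    "transport": ["carriage", "horse", "sleigh", "arkl", "koń", "pferd"],
    "rest_accommodation": ["tavern", "karčem", "karczm", "wirtshaus"],
    "people_on_the_way": ["innkeeper", "highwaym", "guide"],
    "food": ["bread", "beer", "duon", "chleb"],
}


def compile_patterns(dictionary):
    """Build one case-insensitive regex per category.

    Each stem must start at a word boundary; trailing word characters are
    allowed so that inflected forms (e.g. "karczmy") are still matched.
    """
    return {
        category: re.compile(
            r"\b(?:" + "|".join(re.escape(stem) for stem in stems) + r")\w*",
            re.IGNORECASE,
        )
        for category, stems in dictionary.items()
    }


def code_segments(text, dictionary=TRAVEL_DICTIONARY):
    """Return the paragraphs that contain at least one dictionary hit."""
    patterns = compile_patterns(dictionary)
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    coded = []
    for index, paragraph in enumerate(paragraphs):
        hits = {}
        for category, pattern in patterns.items():
            found = pattern.findall(paragraph)
            if found:
                hits[category] = found
        if hits:
            coded.append({"paragraph": index, "hits": hits})
    return coded


# Invented sample text: one paragraph without travel content, two with it.
SAMPLE = """We buried my uncle in the spring and the estate was divided.

The journey to Vilnius took three days; the bridge was broken, so the carriage waited at a tavern.

Wieczorem dotarliśmy do karczmy, gdzie podano chleb."""

if __name__ == "__main__":
    for segment in code_segments(SAMPLE):
        print(segment["paragraph"], segment["hits"])
```

On the sample, only the second and third paragraphs are returned, which reflects the problem the abstract describes: travel passages are a small, scattered share of an egodocument. Stem matching is deliberately crude and will produce false positives on real OCR text (a stem such as "tilt" also matches the English word), so any hits would still need the manual reading that qualitative analysis assumes.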