Kirjanduslikud digikeskkonnad keeleressursside baasina: mõjukriitika juhtumiuuring päringusüsteemis KORP / Digital literary heritage projects as a source of language resources: a case of Estonian criticism in KORP

Methis (Q1, Arts and Humanities). Pub Date: 2020-12-15. DOI: 10.7592/methis.v21i26.16916
Marin Laak

Abstract

The Estonian Literary Museum has been one of the pioneers of digital humanities since the 1990s, when computer culture began to spread more widely. In managing its valuable collections, its mission has been to make them accessible to the public. Cultural heritage was opened up to a wider audience in two directions: content-based searchable databases and relation-based data environments. The aim of this article is to show the present-day possibilities of computational literary studies and the related creation of literary language resources in cooperation with corpus linguists. In the article I analyse how cultural heritage content environments and data collections can be used as machine-readable language resources. The first such experiments have produced annotated language corpora of correspondence and criticism in the query system KORP. Taking the problem of influence criticism in the early 20th century as an example, the present study brings out the potential of literary language corpora for research on cultural heritage.

Estonia can soon expect an explosive growth in digital heritage and text resources due to the current project of mass digitisation of national cultural heritage (printed books, archival documents, photos, art, audiovisual, and ethnographic artifacts) (2019–2023). This will give new opportunities for different fields of digital humanities and make digitised heritage accessible to everyone in the form of open data. The project will focus on the usage of the heritage, on the needs of education, e-learning, and the creative industry, including digital creative arts.

The aim of this article is to examine some research possibilities that have opened up for literary history through the digitisation of literary works and archival sources, and to place them in the general context of digital humanities.

Although the field of digital humanities is broad, the meaning of DH is often reduced to methods of computational, language-centred analysis, mainly based on different tools and programming languages (R, Stylo, Python, Gephi, topic modelling, etc.). While corpus-based research is already a professional standard in linguistics, literary scholars are still more used to working with traditional methods. This article introduces two digital literary history projects belonging to the field of digital humanities and analyses them as language resources for creating text corpora. It also presents some results of a case study of Estonian criticism from the Young Estonia movement up to the 1920s, carried out using the literary text corpora in the corpus query system KORP (https://korp.keeleressursid.ee) of the Centre of Estonian Language Resources.

During the past twenty years, I have mainly focussed on developing large-scale implementation projects for the digital representation of Estonian literary history. The objective of these experimental projects has been to develop fundamentally new non-linear models of Estonian literary history for the digital environment. These activities were based on my research, using traditional methods, on the intertextual relations between authors, literary works, and critical texts.

The first content-based literary history project, “ERNI. Estonian Literary History in Texts 1924–1925” (www2.kirmus.ee/erni), was based on a hypertextual network of literary source texts and reviews. We re-conceptualised literary history as a non-linear narrative and a gallery with many entrances. The task of the project was also to ensure its usability in education: a significant number of study materials have been added in cooperation with schoolteachers.

In 2004, we initiated our long-term and still running project “Kreutzwald’s Century: the Estonian Cultural History Web” (http://kreutzwald.kirmus.ee) at the Estonian Literary Museum. The objective of this project was to make literary sources of the period accessible as a dynamic, interactive information environment. This was a hybrid project which synthesised the classical study of Estonian literary history, the needs of the digital media user, and the expanding digital resources of different memory institutions. Its underlying idea was to link together all the works of fiction of an author, as well as their biography, manuscripts, and photos, and to make them visible to the user on five interactive time axes. The project uses a specially created platform. Today, this platform is extensively used by schoolteachers: in 2020 (Jan.–Dec.) it received 8,986,555 clicks, and over seven years (Dec. 2013–Dec. 2020) it collected 64,627,380 clicks.

To find out how such content-based models of literary heritage fit into the context of digital humanities, we need to compare the previous modelling practices with our current experimental project in the corpus query system KORP. Our interdisciplinary project “Literary Studies Meet Corpus Linguistics” (2017–2020) concentrated on studying literary history sources with linguistic methods. As a result of the project, two literary text corpora were created: “Epistolary text corpus of Estonian writers Johannes Semper and Johannes Vares-Barbarus” and “Corpus of the Estonian literary criticism, Noor-Eesti and the 1920s”. Both were pilot projects in the field; they started with converting the digitised archival and printed sources into machine-readable format, followed by text and data mining for corpus creation.

The query system KORP allows us to organise the language data by all the categories used in the corpus, for example, to learn who mentioned the name of the French writer André Gide and in what context. The second currently running project is the morphologically annotated corpus of literary criticism. This corpus contains texts of literary reviews and criticism in different genres, drawn from the projects ERNI and “Kreutzwald’s Century”. The first results in studying the dynamics of literary values can already be seen.

A query in KORP for the word ‘mõju’ (‘influence’) revealed that the manifesto “More of European culture!” of the group Young Estonia, voiced in 1905, was replaced during the independent Estonian Republic by the valuing of a specific national character. The corpus query showed a change in the meaning of the word: in the criticism contemporary to Young Estonia, the word ‘mõju’ was only associated with the historical pressure from Russian and German cultures. The foundation for modern comparative linguistics at the University of Tartu was laid in the 1920s by the professorship in Estonian literature.
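
The kind of lemma query described in the abstract can be illustrated with a minimal Python sketch. It does not use KORP or its interface. It only mimics, on an in-memory toy corpus, what a lemma search over a morphologically annotated corpus returns: keyword-in-context lines plus counts grouped by period. The texts, years, and word forms in TEXTS are invented placeholders, the function names kwic and hits_per_period are hypothetical, and the boundary year 1918 is an assumption chosen here to separate the Young Estonia years from the independent republic.

```python
from collections import Counter

# Hypothetical toy corpus (invented for illustration, not taken from the
# real corpora): each text has a year and a list of (word, lemma) pairs,
# as a morphological annotation layer would provide.
TEXTS = [
    {"year": 1909, "tokens": [("vene", "vene"), ("mõju", "mõju"), ("all", "all")]},
    {"year": 1910, "tokens": [("saksa", "saksa"), ("mõjud", "mõju")]},
    {"year": 1925, "tokens": [("rahvuslik", "rahvuslik"), ("mõjust", "mõju"), ("vaba", "vaba")]},
]


def kwic(texts, lemma, window=1):
    """Return (year, left context, hit, right context) for each lemma match.

    Matching on the lemma finds inflected forms (mõju, mõjud, mõjust)
    with a single query. Context never crosses a text boundary.
    """
    hits = []
    for text in texts:
        tokens = text["tokens"]
        for i, (word, tok_lemma) in enumerate(tokens):
            if tok_lemma == lemma:
                left = " ".join(w for w, _ in tokens[max(0, i - window):i])
                right = " ".join(w for w, _ in tokens[i + 1:i + 1 + window])
                hits.append((text["year"], left, word, right))
    return hits


def hits_per_period(texts, lemma, boundary=1918):
    """Count lemma matches in texts dated before vs. from the boundary year."""
    counts = Counter()
    for text in texts:
        period = "before" if text["year"] < boundary else "from"
        counts[period] += sum(1 for _, tok_lemma in text["tokens"] if tok_lemma == lemma)
    return counts


for year, left, hit, right in kwic(TEXTS, "mõju"):
    print(f"{year}: {left} [{hit}] {right}")
print(dict(hits_per_period(TEXTS, "mõju")))
```

Reading the concordance lines period by period is what lets a shift in the word's associations become visible; the counts alone only show how often the lemma occurs, not what it is associated with.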