Text Analysis and Visualization Research on the Hetu Dangse During the Qing Dynasty of China

Information Technology and Libraries (IF 1.3; Zone 4, Management Science; JCR Q3, Computer Science, Information Systems) Pub Date: 2021-09-20 DOI: 10.6017/ital.v40i3.13279

Zhiyu Wang, Jingyu Wu, Guang Yu, Zhiping Song


Citations: 1

Abstract

In traditional historical research, interpreting historical documents subjectively and manually causes problems such as one-sided understanding, selective analysis, and one-way knowledge connection. In this study, we use machine learning to analyze and explore historical documents automatically from a text analysis and visualization perspective. This approach addresses the problem of analyzing large-scale historical data that is difficult for humans to read and understand intuitively. Our data sample is the Qing Dynasty Hetu Dangse, preserved in the Archives of Liaoning Province. China’s Hetu Dangse is the largest Qing Dynasty thematic archive in the world written in Manchu and Chinese characters. Using word frequency analysis, correlation analysis, co-word clustering, the word2vec model, and the SVM (support vector machine) algorithm, we visualize the historical documents, reveal the relationships between the functions of government departments in the Shengjing area during the Qing Dynasty, classify historical archives automatically, make the use of historical materials more efficient, and build connections between pieces of historical knowledge. The results can give archivists practical guidance in managing and compiling historical materials.
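Two of the methods named in the abstract, word frequency analysis and co-word analysis, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy documents, the English tokens, and the function names `word_frequencies` and `co_word_counts` are assumptions made for illustration, and the real archive would first need word segmentation of the Manchu and Chinese text.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, pre-segmented toy documents (assumption: not taken from the archive).
docs = [
    ["ministry", "grain", "tax", "shengjing"],
    ["ministry", "grain", "transport"],
    ["garrison", "shengjing", "grain"],
]


def word_frequencies(documents):
    """Count how often each term occurs across all documents."""
    counts = Counter()
    for tokens in documents:
        counts.update(tokens)
    return counts


def co_word_counts(documents):
    """For each unordered term pair, count the documents that contain both terms."""
    pairs = Counter()
    for tokens in documents:
        # Deduplicate and sort so each pair has one canonical (a, b) key per document.
        for a, b in combinations(sorted(set(tokens)), 2):
            pairs[(a, b)] += 1
    return pairs


freq = word_frequencies(docs)
pairs = co_word_counts(docs)
print(freq.most_common(3))
print(pairs.most_common(3))
```

A co-occurrence table of this kind is the usual input to co-word clustering: terms that frequently appear in the same documents are grouped together, which is one plausible way relationships between government departments could surface from the text.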




Source journal

Information Technology and Libraries (Management Science; Computer Science: Information Systems)

CiteScore: 2.90

Self-citation rate: 5.60%

Review time: 1 month

Journal description: Information Technology and Libraries publishes original material related to all aspects of information technology in all types of libraries. Topic areas include, but are not limited to, library automation, digital libraries, metadata, identity management, distributed systems and networks, computer security, intellectual property rights, technical standards, geographic information systems, desktop applications, information discovery tools, web-scale library services, cloud computing, digital preservation, data curation, virtualization, search-engine optimization, emerging technologies, social networking, open data, the semantic web, mobile services and applications, usability, universal access to technology, library consortia, vendor relations, and digital humanities.
