Challenges of Multi-Source and Bilingual Data Curation for the Research of Tribute during the Qing Dynasty
Authors: Loretta E. Kim, Eugenia Kim
Journal: Tushuguanxue yu Zixun Kexue, volume 39 1, pages 84-95
Published: 2013-10-01 (Journal Article)
DOI: 10.6245/JLIS.2013.392/626 (https://doi.org/10.6245/JLIS.2013.392/626)
Citations: 0
Abstract
This paper discusses the objectives, process, and outcomes of creating a digital dataset for a historical research project on the tribute system in Heilongjiang during the Qing dynasty (1644-1911). Relevant information from non-digital primary sources was compiled into the dataset to facilitate quantitative and qualitative analyses of the system's attributes. In the course of curating the data, the investigators addressed two challenges: defining a common set of variables and matching the original Chinese data with English translations. They also tested methods for learning to create datasets that can accommodate heterogeneous sources and be shared among multiple users.
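As a rough illustration of what "a common set of variables" with paired Chinese and English values might look like, the following minimal Python sketch maps rows from two differently structured sources onto one record type. Everything in it is an assumption made for illustration: the `TributeRecord` fields, the `normalize` helper, the column names, and the sample values are hypothetical placeholders and are not taken from the paper or its dataset.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class TributeRecord:
    """Hypothetical common schema; field names are illustrative only."""
    source_id: str            # which primary source the row came from
    year: Optional[int]       # None when the source gives no date
    item_zh: str              # term as written in the Chinese source
    item_en: str              # English translation stored beside the original
    quantity: Optional[int]   # None when the source gives no count


def normalize(raw: dict, source_id: str, field_map: dict) -> TributeRecord:
    """Map one source's own column names onto the common variables.

    field_map goes from common variable name to that source's column name.
    Columns missing from the row become None (or "" for text fields).
    """
    values = {common: raw.get(local) for common, local in field_map.items()}
    return TributeRecord(
        source_id=source_id,
        year=values.get("year"),
        item_zh=values.get("item_zh") or "",
        item_en=values.get("item_en") or "",
        quantity=values.get("quantity"),
    )


# Made-up placeholder rows from two sources with different layouts.
source_a_row = {"年": 1750, "貢物": "貂皮", "譯名": "sable pelts", "數量": 10}
source_a_map = {"year": "年", "item_zh": "貢物", "item_en": "譯名", "quantity": "數量"}

source_b_row = {"date": 1800, "item": "貂皮", "item_translation": "sable pelts"}
source_b_map = {"year": "date", "item_zh": "item",
                "item_en": "item_translation", "quantity": "count"}

records = [
    normalize(source_a_row, "A", source_a_map),
    normalize(source_b_row, "B", source_b_map),
]

for record in records:
    print(asdict(record))
```

Keeping the Chinese term and its translation in the same record, rather than in separate files, is one simple way to keep the two aligned when several people edit the data; whether the authors took this approach is not stated in the abstract.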