Visibility Threshold: Some Considerations on Data Mining Applied to the Study of Eblaite Society
Eric W Scarpa
H2D|Revista de Humanidades Digitais, published 2022-01-13
https://doi.org/10.21814/h2d.3466
Accounting is a routine activity. Through repetition, the scribes of the Ebla Archives (Syria, 24th century BCE) were able to record thousands of transactions. They organized and stored accounting data covering more than thirty years of Palace G activities. The recurring textual patterns that characterize the administrative corpus are a byproduct of this routine-based approach. The ability to see recurring patterns in the textual record is fundamental when dealing with an administrative corpus; however, this ability fails when the patterns are buried in data. In this paper, I argue that the theoretical aspects of data mining are not far from the theoretical and methodological tenets of the historical approach. Data mining is a useful technique for identifying document clusters and relevant information that would otherwise remain hidden. Furthermore, textual pattern recognition is critical for addressing topics such as the study of society: because it belongs to a category of complex problems, any socio-historical investigation requires dealing with multiple interconnected variables. However, not all research topics require such an approach. I define the line beyond which digital approaches are extremely useful (if not indispensable) as the 'visibility threshold'. The position of this threshold is relative and subjective.