Visibility Threshold: Some Considerations on Data Mining Applied to the Study of Eblaite Society

Eric W Scarpa
Journal: H2D|Revista de Humanidades Digitais
Publication date: 2022-01-13
DOI: 10.21814/h2d.3466 (https://doi.org/10.21814/h2d.3466)
Citation count: 1

Abstract

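Accounting is a routine activity. Through repetition, the scribes of the Ebla Archives (Syria, 24th century BCE) were able to record thousands of transactions. They organized and stored accounting data covering more than thirty years of Palace G activities. The recurring textual patterns that characterize the administrative corpus are a byproduct of this routine-based approach. The ability to see recurring patterns in the textual record is fundamental when dealing with an administrative corpus; however, this ability fails when the patterns are buried in data. In this paper, I argue that the theoretical aspects of data mining are not far from the theoretical and methodological tenets of the historical approach. Data mining is a useful technique for identifying document clusters and relevant information that would otherwise remain hidden. Furthermore, textual pattern recognition is critical for addressing topics such as the study of society: because it belongs to a category of complex problems, any socio-historical investigation requires dealing with multiple interconnected variables. However, not all research topics require such an approach. I define the line beyond which digital approaches are extremely useful (if not indispensable) as the 'visibility threshold'. The position of this interface is relative and subjective.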