Validating Benfordness on contaminated data
Marco Di Marzio, Stefania Fensore, Chiara Passamonti
Socio-economic Planning Sciences, volume 95, Article 102008. Published 2024-06-29. DOI: 10.1016/j.seps.2024.102008
https://www.sciencedirect.com/science/article/pii/S0038012124002076
Abstract
Benford’s law is a mathematical model of digit frequencies that recurs in practice across a wide variety of datasets. A well-established use of statistical tests of Benfordness is in investigations that aim to ascertain whether balance sheet and income statement data are genuine. A typical, frustrating problem with such tests on large, real-world datasets is that they often yield p-values smaller than expected even when the Benfordness null hypothesis is very realistic. A possible reason is that the data are contaminated by some kind of noise. In this paper we propose a deconvolution approach to alleviate this issue and illustrate it on both simulated and real data.
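For readers unfamiliar with Benfordness testing, the following is a minimal sketch of a first-digit check, assuming the standard first-digit form of Benford's law, P(d) = log10(1 + 1/d) for d = 1, ..., 9, and a Pearson chi-square statistic. It is not the deconvolution approach proposed in the paper, which the abstract does not detail. The function names and the simulated sample are illustrative assumptions.

```python
import math
import random


def benford_probs():
    """First-digit Benford probabilities P(d) = log10(1 + 1/d), d = 1..9."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(x):
    """Leading significant digit of a nonzero number (illustrative helper)."""
    if x == 0:
        raise ValueError("zero has no leading significant digit")
    # Scientific notation puts the leading digit first, e.g. '3.14e-03'.
    return int(f"{abs(x):.15e}"[0])


def chi_square_stat(values):
    """Pearson chi-square statistic of observed first digits vs. Benford."""
    probs = benford_probs()
    counts = [0] * 9
    for v in values:
        counts[first_digit(v) - 1] += 1
    n = len(values)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


# Hypothetical sample: log-uniform values over six decades, which follow
# Benford's law for the first digit.
random.seed(0)
sample = [10 ** random.uniform(0, 6) for _ in range(5000)]
stat = chi_square_stat(sample)

# Under the null the statistic is approximately chi-square with 8 degrees
# of freedom; the 5% critical value is about 15.507.
print(f"chi-square statistic: {stat:.2f}")
```

With very large samples, a statistic of this kind can pick up even small departures from the Benford frequencies. That is one way to read the abstract's remark about p-values that are smaller than expected.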
About the journal:
Studies directed toward the more effective utilization of existing resources, e.g. mathematical programming models of health care delivery systems with relevance to more effective program design; systems analysis of fire outbreaks and its relevance to the location of fire stations; statistical analysis of the efficiency of a developing country economy or industry.
Studies relating to the interaction of various segments of society and technology, e.g. the effects of government health policies on the utilization and design of hospital facilities; the relationship between housing density and the demands on public transportation or other service facilities; patterns and implications of urban development and air or water pollution.
Studies devoted to the anticipation of and response to future needs for social, health and other human services, e.g. the relationship between industrial growth and the development of educational resources in affected areas; investigation of future demands for maternal and child health resources in a developing country; design of effective recycling in an urban setting.