CLADAG 2021 special issue: Selected papers on classification and data analysis

C. Bocci, A. Gottard, T. B. Murphy, G. C. Porzio
Journal: Statistical Analysis and Data Mining: The ASA Data Science Journal
Published: 2023-07-04
DOI: 10.1002/sam.11633 (https://doi.org/10.1002/sam.11633)

Abstract

This special issue of Statistical Analysis and Data Mining contains a selection of the papers presented at the 13th Scientific Meeting of the Classification and Data Analysis Group (CLADAG), scheduled for September 9–11, 2021 in Florence, Italy. Due to the COVID-19 pandemic, the conference was held online.

CLADAG is a Section of the Italian Statistical Society (SIS) and a member of the International Federation of Classification Societies (IFCS). It was founded in 1997 to promote advanced methodological research in multivariate statistics, focusing on Data Analysis and Classification. The Section organizes a biennial international scientific meeting, offers classification and data analysis courses, publishes a newsletter, and collaborates on planning conferences and meetings with other IFCS societies. The previous 12 CLADAG meetings were held in various locations throughout Italy: Pescara (1997), Roma (1999), Palermo (2001), Bologna (2003), Parma (2005), Macerata (2007), Catania (2009), Pavia (2011), Modena and Reggio Emilia (2013), Cagliari (2015), Milano (2017), and Cassino (2019).

Following a blind peer-review process, six papers presented at the conference and submitted to this special issue have been selected for publication. The articles cover a broad range of data analysis topics: gender gap analysis, income clustering, structural equation modeling, multivariate nonparametric methods, and classifier selection. Their content is briefly described below.

In studying the gender gap, a relevant topic for promoting equality and social justice, Greselin et al. propose a new parametric approach that combines the relative distribution method with Dagum parametric inference. They also assess how to select the covariates that impact gender gaps. The proposed approach is applied to measure and compare the gender gap in Poland and Italy, using data from the 2018 European Survey of Income and Living Conditions.

In a related field, Condino proposes a procedure for clustering income data using a share density-based dynamic clustering algorithm. The paper compares subgroups' income inequality using a dissimilarity measure based on information theory. This measure is then used for clustering, providing a prototype descriptor of income inequality for the clustered earners. The proposal is applied to data from the Survey on Household Income and Wealth by the Bank of Italy.

The paper by Yu et al. introduces a refinement of the so-called Henseler–Ogasawara specification, which integrates composites (linear combinations of variables) into structural equation models. The refined version addresses some concerns about the original specification; it is less complex and less prone to misspecification mistakes. The paper also provides a strategy for computing standard errors.

Statistical depth functions are a valuable tool for multivariate nonparametric data analysis, extending the concepts of ranks, orderings, and quantiles to the multivariate setup. The paper by Laketa and Nagy investigates fundamental open problems of contemporary depth research, the so-called characterization and reconstruction questions, focusing on the simplicial depth. Their results are illustrated via several insightful examples. On the same topic, Nagy revisits the classical definition of the simplicial depth and explores its theoretical properties, in particular those of the simplicial median. The author provides the exact simplicial depth in several scenarios, outlining undesirable behaviors of this depth function.

Carpita and Golia tackle the problem of choosing the rule that assigns a unit to a category given the estimated probabilities. In particular, the paper compares the classical Bayesian Classifier, which minimizes the expected classification error rate, with the Max Difference Classifier and the Max Ratio Classifier, showing when these classifiers should be preferred. Findings are illustrated by means of a broad simulation study and an application on benchmark data sets.

To conclude, we believe that this special issue accurately portrays the scientific features of the CLADAG community nowadays and supports the CLADAG mission of facilitating the exchange of ideas in Classification and Data Analysis. We warmly encourage all readers to attend the
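
As an illustration of the simplicial depth discussed above, the following is a minimal sketch of the standard sample version in the plane: the depth of a point is the fraction of triangles spanned by three sample points that contain it. The function names are hypothetical, closed triangles are used, and the sample is assumed to be in general position (no three points collinear). It is not taken from the papers by Laketa and Nagy or by Nagy.

```python
from itertools import combinations


def _orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _in_closed_triangle(p, a, b, c):
    """True if p lies inside or on the boundary of triangle abc.

    Assumes a, b, c are not collinear (general position).
    """
    d1, d2, d3 = _orient(a, b, p), _orient(b, c, p), _orient(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # p is inside exactly when it is never on opposite sides of two edges.
    return not (has_neg and has_pos)


def simplicial_depth_2d(p, points):
    """Sample simplicial depth of p with respect to a planar sample.

    Brute force over all triples of sample points, so the cost grows
    cubically with the sample size; intended for illustration only.
    """
    triangles = list(combinations(points, 3))
    if not triangles:
        raise ValueError("at least three sample points are required")
    hits = sum(_in_closed_triangle(p, a, b, c) for a, b, c in triangles)
    return hits / len(triangles)


if __name__ == "__main__":
    sample = [(0, 0), (4, 0), (0, 4), (4, 4)]
    print(simplicial_depth_2d((1, 2), sample))    # point inside the square
    print(simplicial_depth_2d((10, 10), sample))  # point far outside the sample
```

Points near the center of the data cloud lie in many triangles and receive a high depth, while points outside the convex hull of the sample receive depth zero. A simplicial median is a point that maximizes this depth.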