Comparison between Two Algorithms for Computing the Weighted Generalized Affinity Coefficient in the Case of Interval Data

Journal: Stats · IF 0.9 · JCR Q4 (Mathematics, Interdisciplinary Applications) · Pub Date: 2023-10-13 · DOI: 10.3390/stats6040068
Áurea Sousa, Osvaldo Silva, Leonor Bacelar-Nicolau, João Cabral, Helena Bacelar-Nicolau
Citations: 1

Abstract

Starting from the affinity coefficient between two discrete probability distributions proposed by Matusita, Bacelar-Nicolau introduced the affinity coefficient in a cluster analysis context and extended it to different types of data, including complex and heterogeneous data within the scope of symbolic data analysis (SDA). In this study, we refer to the most significant partitions obtained using hierarchical cluster analysis (h.c.a.) of two well-known datasets taken from the literature on complex (symbolic) data analysis. The h.c.a. is based on the weighted generalized affinity coefficient for interval data and on probabilistic aggregation criteria from a VL parametric family. Two alternative algorithms were used to calculate the values of this coefficient, and their results were compared. Both algorithms detected clusters of macrodata (data aggregated into groups of interest) that were consistent with those reported in the literature, but one performed better than the other in some specific cases. Moreover, both approaches allow large microdatabases (non-aggregated data) to be treated once the microdata have been transformed into macrodata.
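As a rough illustration of the quantities named in the abstract, the sketch below computes Matusita's affinity between two discrete distributions, the sum over i of sqrt(p_i * q_i), and then a weighted sum of per-variable affinities for interval-valued data. The abstract does not describe either of the two algorithms that were compared, so the interval part is only an assumption: it scores two intervals by their overlap length divided by the geometric mean of their lengths. The function names `affinity`, `interval_affinity`, and `weighted_affinity` are hypothetical.

```python
import math


def affinity(p, q):
    """Matusita affinity between two discrete distributions: sum of sqrt(p_i * q_i)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of categories")
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def interval_affinity(a, b):
    """Assumed interval analogue (not taken from the paper):
    overlap length divided by sqrt(len(a) * len(b))."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    len_a, len_b = a_hi - a_lo, b_hi - b_lo
    if len_a <= 0 or len_b <= 0:
        raise ValueError("intervals must have positive length")
    overlap = max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))
    return overlap / math.sqrt(len_a * len_b)


def weighted_affinity(x, y, weights):
    """Weighted sum of per-variable interval affinities.

    x and y are lists of (low, high) intervals, one per variable;
    weights holds one non-negative weight per variable (assumed to sum to 1).
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    return sum(w * interval_affinity(a, b) for w, a, b in zip(weights, x, y))


if __name__ == "__main__":
    # Two objects described by two interval-valued variables, equal weights.
    obj_a = [(0.0, 4.0), (0.0, 1.0)]
    obj_b = [(2.0, 6.0), (0.0, 1.0)]
    print(weighted_affinity(obj_a, obj_b, [0.5, 0.5]))
```

Under these assumptions the coefficient lies in [0, 1] when the weights sum to 1: identical intervals score 1 on a variable and disjoint intervals score 0.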
Source journal metrics: CiteScore 0.60 · Self-citation rate 0.00% · Articles published 0 · Review time 7 weeks