Journal: Information Sciences (Impact Factor: 8.1)
Published: 2024-08-30
DOI: 10.1016/j.ins.2024.121405
URL: https://www.sciencedirect.com/science/article/pii/S0020025524013197
Citations: 0
Asymmetry index for data and its verification in dimensionality reduction and data visualization
We propose an asymmetry index that measures the degree of asymmetry of a given dataset. It provides additional information about the dataset that can guide and improve further analysis. The index reflects the intensity of the asymmetric relationships among data points that result from a hierarchical data structure. The information it provides helps justify and explain the effectiveness of subsequent asymmetric data analysis methods, and helps prepare the asymmetrization of the tools used in that analysis. The asymmetry index is based on the k-nearest neighbors graph representing the data under consideration. It therefore uses intrinsic, geometry-based information about the data and so provides insight into the data structure. Our experiments on real data are designed to verify the usefulness of the asymmetry index and the correctness of its theoretical foundations. In our empirical validation, we employ symmetric and asymmetric dimensionality reduction algorithms and evaluate their results on the basis of clustering in the 2-dimensional visualization space. We test whether our index indeed predicts the degree to which the asymmetric methods outperform their symmetric counterparts.
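The abstract does not give the formula for the index, so the following is only an illustrative sketch of why a k-nearest neighbors graph carries asymmetry information at all: point j can be among the k nearest neighbors of point i without i being among those of j. The function names `knn_indices` and `nonreciprocal_edge_fraction`, the Euclidean metric, and the choice of "fraction of one-way edges" as a score are all assumptions made for illustration; this is a simple stand-in proxy, not the asymmetry index proposed in the paper.

```python
import numpy as np


def knn_indices(points: np.ndarray, k: int) -> np.ndarray:
    """Return, for each row of `points`, the indices of its k nearest
    neighbors under the Euclidean distance (an assumed metric)."""
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    # Stable sort so that ties are broken by index, deterministically.
    return np.argsort(dist, axis=1, kind="stable")[:, :k]


def nonreciprocal_edge_fraction(points: np.ndarray, k: int) -> float:
    """Hypothetical proxy score: the fraction of directed k-NN edges
    i -> j for which the reverse edge j -> i is absent.

    0.0 means every neighbor relation is mutual (a symmetric graph);
    larger values mean more one-way neighbor relations.
    """
    n = points.shape[0]
    neighbors = knn_indices(points, k)

    # Directed adjacency matrix: adjacency[i, j] is True when j is
    # one of the k nearest neighbors of i.
    adjacency = np.zeros((n, n), dtype=bool)
    rows = np.repeat(np.arange(n), k)
    adjacency[rows, neighbors.ravel()] = True

    one_way = adjacency & ~adjacency.T
    return float(one_way.sum() / adjacency.sum())
```

For three points on a line at 0, 1 and 3 with k = 1, the point at 3 has the point at 1 as its nearest neighbor, but not the other way round, so one of the three directed edges is one-way and the score is 1/3.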
About the journal:
Information Sciences (Informatics and Computer Science, Intelligent Systems, Applications) is an international journal that publishes original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and survey contributions.
Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.