An efficient automatic clustering algorithm for probability density functions and its applications in surface material classification
Thao Nguyen-Trang, Tai Vo-Van, Ha Che-Ngoc
Statistica Neerlandica (impact factor 1.4; JCR Q2, Statistics & Probability). Published 2023-08-07. DOI: 10.1111/stan.12315 (https://doi.org/10.1111/stan.12315)
Clustering is a technique used to partition a dataset into groups of similar elements. In addition to traditional clustering methods, clustering for probability density functions (CDF) has been studied to capture data uncertainty. Within CDF, automatic clustering is a technique that determines the number of clusters automatically. However, current automatic clustering algorithms update each probability density function (pdf) $f_i(t)$ as the weighted mean of all pdfs from the previous iteration, $f_j(t-1)$, $j = 1, 2, \ldots, N$, which results in slow convergence. This paper proposes an efficient automatic clustering algorithm for pdfs. In the proposed approach, $f_i(t)$ is updated as the weighted mean of $\{f_1(t), f_2(t), \ldots, f_{i-1}(t), f_i(t-1), f_{i+1}(t-1), \ldots, f_N(t-1)\}$, where $N$ is the number of pdfs and $i = 1, 2, \ldots, N$. This incorporates the pdfs that have already been updated in the current iteration, leading to faster convergence. This paper also pioneers the application of certain CDF algorithms to surface image recognition. The numerical examples demonstrate that the proposed method can converge rapidly within the early iterations. It also outperforms other state-of-the-art automatic clustering methods in terms of the Adjusted Rand Index (ARI) and the Normalized Mutual Information (NMI). Additionally, the proposed algorithm proves competitive when clustering material images contaminated by noise. These results highlight the applicability of the proposed method to surface image recognition.
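The difference between the two update schemes described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not specify the weights, so the Gaussian kernel on the L1 distance, the `bandwidth` parameter, the grid discretisation, and all function names here are assumptions made for the example. The only part taken from the abstract is which pdfs enter each weighted mean: the baseline sweep uses only pdfs from iteration t − 1, while the sequential sweep reuses the pdfs already updated in iteration t.

```python
import math

def gaussian_pdf(mean, sd, grid, dx):
    """Discretised normal density, renormalised so that sum(values) * dx == 1."""
    values = [math.exp(-0.5 * ((x - mean) / sd) ** 2) for x in grid]
    total = sum(values) * dx
    return [v / total for v in values]

def l1_distance(f, g, dx):
    """Approximate L1 distance between two pdfs sampled on the same grid."""
    return sum(abs(a - b) for a, b in zip(f, g)) * dx

def kernel_weights(target, pdfs, dx, bandwidth):
    """Hypothetical weights (an assumption): pdfs close to `target` count more."""
    return [math.exp(-(l1_distance(target, p, dx) / bandwidth) ** 2) for p in pdfs]

def weighted_mean(pdfs, weights):
    """Pointwise weighted mean of pdfs; a convex combination, so again a pdf."""
    total = sum(weights)
    return [sum(w * p[k] for w, p in zip(weights, pdfs)) / total
            for k in range(len(pdfs[0]))]

def sweep_synchronous(pdfs, dx, bandwidth):
    """Baseline: every f_i(t) is built from the previous pdfs f_j(t-1) only."""
    return [weighted_mean(pdfs, kernel_weights(f, pdfs, dx, bandwidth)) for f in pdfs]

def sweep_sequential(pdfs, dx, bandwidth):
    """Proposed idea: f_i(t) reuses f_1(t), ..., f_{i-1}(t) updated in this sweep."""
    current = [list(f) for f in pdfs]
    for i in range(len(current)):
        # current[:i] already holds iteration-t pdfs, current[i:] iteration t-1.
        weights = kernel_weights(current[i], current, dx, bandwidth)
        current[i] = weighted_mean(current, weights)
    return current

def sweeps_until_stable(sweep, pdfs, dx, bandwidth, tol=1e-6, max_sweeps=200):
    """Count sweeps until no pdf moves by more than `tol` in L1 distance."""
    for t in range(1, max_sweeps + 1):
        updated = sweep(pdfs, dx, bandwidth)
        change = max(l1_distance(f, g, dx) for f, g in zip(pdfs, updated))
        pdfs = updated
        if change < tol:
            return t, pdfs
    return max_sweeps, pdfs

# Toy data: two pdfs near 0 and two near 5, sampled on a grid over [-6, 11].
dx = 0.05
grid = [-6 + k * dx for k in range(341)]
pdfs = [gaussian_pdf(m, 1.0, grid, dx) for m in (0.0, 0.2, 5.0, 5.3)]
for name, sweep in (("synchronous", sweep_synchronous), ("sequential", sweep_sequential)):
    n_sweeps, _ = sweeps_until_stable(sweep, pdfs, dx, bandwidth=0.4)
    print(name, "sweeps:", n_sweeps)
```

In this toy setting, pdfs that start close together drift toward a common limit, so the number of distinct limits can be read as the number of clusters. The sweep counts printed here depend on the assumed kernel and bandwidth and say nothing about the results reported in the paper.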
About the journal:
Statistica Neerlandica has been the journal of the Netherlands Society for Statistics and Operations Research since 1946. It covers all areas of statistics, from theoretical to applied, with a special emphasis on mathematical statistics, statistics for the behavioural sciences and biostatistics. This wide scope is reflected by the expertise of the journal’s editors representing these areas. The diverse editorial board is committed to a fast and fair reviewing process, and will judge submissions on quality, correctness, relevance and originality. Statistica Neerlandica encourages transparency and reproducibility, and offers online resources to make data, code, simulation results and other additional materials publicly available.