Similarity Statistics for Clusterability Analysis with the Application of Cell Formation Problem
Yingyu Zhu, Simon Li
DOI: 10.1155/2018/1348147 (https://doi.org/10.1155/2018/1348147)
Published: 2018-12-02
This paper proposes using the statistics of similarity values to evaluate the clusterability, or structuredness, of a cell formation (CF) problem. Typically, the structuredness of a CF solution cannot be known until the CF problem is solved. This paper therefore investigates the similarity statistics of machine pairs to estimate the potential structuredness of a given CF problem without solving it. One key observation is that a well-structured CF solution matrix has a relatively high percentage of high-similarity machine pairs. Histograms are then used to study the distributions of similarity values, which leads to the U-shape criteria and a criterion based on the Kolmogorov-Smirnov test. Accordingly, a procedure is developed to classify whether an input CF problem can potentially lead to a well-structured or ill-structured CF matrix. In the numerical study, 20 matrices were initially used to determine the threshold values of the criteria, and 40 additional matrices were used to verify the results. These matrix examples further show that a genetic algorithm cannot effectively improve well-structured CF solutions (those with high grouping efficacy values) obtained by hierarchical clustering, one type of heuristic. This result supports the use of similarity statistics to pre-examine an input CF problem instance and suggest a suitable solution approach.
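To make the idea of similarity statistics concrete, the following is a minimal Python sketch, not the authors' procedure. It assumes a binary machine-part incidence matrix (rows are machines) and uses the Jaccard coefficient as the pairwise similarity; the abstract does not say which coefficient the paper uses. The function names, the 10-bin histogram, and the 0.5 cut-off for "high similarity" are hypothetical choices for illustration. Grouping efficacy is computed with the commonly used definition: (ones inside cells) / (all ones + zeros inside cells).

```python
from itertools import combinations


def jaccard(row_a, row_b):
    """Jaccard similarity of two binary machine rows (assumed coefficient)."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return both / either if either else 0.0


def pair_similarities(matrix):
    """Similarity value for every unordered machine pair."""
    return [jaccard(matrix[i], matrix[j])
            for i, j in combinations(range(len(matrix)), 2)]


def histogram(values, bins=10):
    """Counts in equal-width bins over [0, 1]; 1.0 falls in the last bin."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return counts


def high_similarity_share(values, threshold=0.5):
    """Fraction of pairs at or above a (hypothetical) similarity threshold."""
    return sum(1 for v in values if v >= threshold) / len(values)


def grouping_efficacy(matrix, machine_cell, part_cell):
    """(ones - exceptional elements) / (ones + voids) for a given cell assignment."""
    ones = exceptional = voids = 0
    for i, row in enumerate(matrix):
        for j, x in enumerate(row):
            inside = machine_cell[i] == part_cell[j]
            if x:
                ones += 1
                if not inside:
                    exceptional += 1  # a 1 outside the diagonal blocks
            elif inside:
                voids += 1  # a 0 inside a diagonal block
    return (ones - exceptional) / (ones + voids)


# Toy 4-machine x 5-part matrix with two clean cells:
# machines 0-1 process parts 0-2, machines 2-3 process parts 3-4.
block_diagonal = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

sims = pair_similarities(block_diagonal)
print(histogram(sims))              # pairs pile up in the first and last bins
print(high_similarity_share(sims))  # share of high-similarity pairs
print(grouping_efficacy(block_diagonal, [0, 0, 1, 1], [0, 0, 0, 1, 1]))
```

In this toy case the similarity values sit only at the two ends of the histogram, which is the kind of shape the U-shape criteria refer to. A real instance would show a more spread-out distribution, and the paper's actual thresholds would have to be taken from the paper itself.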
About the journal:
Accounts of Chemical Research presents short, concise and critical articles offering easy-to-read overviews of basic research and applications in all areas of chemistry and biochemistry. These short reviews focus on research from the author’s own laboratory and are designed to teach the reader about a research project. In addition, Accounts of Chemical Research publishes commentaries that give an informed opinion on a current research problem. Special Issues online are devoted to a single topic of unusual activity and significance.
Accounts of Chemical Research replaces the traditional article abstract with an article "Conspectus." These entries synopsize the research, affording the reader a closer look at the content and significance of an article. By providing a more detailed description of the article contents, the Conspectus enhances the article's discoverability by search engines and the exposure of the research.