P. Ashok, G. M. Kadhar Nawaz, K. Thangavel, E. Elayaraja
Published in: 2013 International Conference on Pattern Recognition, Informatics and Mobile Engineering
Publication date: 2013-04-15
DOI: 10.1109/ICPRIME.2013.6496519
URL: https://doi.org/10.1109/ICPRIME.2013.6496519
Outliers detection on protein localization sites by partitional clustering methods
A protein is a large molecule composed of one or more chains of amino acids in a specific order; the order is determined by the base sequence of nucleotides in the gene that codes for the protein. Proteins are required for the structure, function, and regulation of the body's cells, tissues, and organs, and each protein has unique functions. A cellular mechanism identifies the localization site of each protein and moves the protein to its corresponding organelle. In this paper, we introduce clustering and two of its types, K-Means and K-Medoids. The clustering algorithms are improved by using two proposed initial centroid selection methods instead of selecting the initial centroids randomly. The resulting K-Means variants are compared using the Davies-Bouldin index, and the proposed Algorithm 1 overcomes the drawbacks of initial cluster center selection better than the other methods. In the yeast dataset, the defective proteins (objects) are considered outliers, which are identified by the clustering methods with the ADOC (Average Distance between Object and Centroid) function. The outlier detection methods and the performance analysis methods are studied and compared; the experimental results show that the K-Medoids method performs well when compared with K-Means clustering.
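The abstract names the building blocks (K-Means, the Davies-Bouldin index, and an ADOC-based outlier test) without giving their details, so the following is only a minimal sketch of how they might fit together, not the authors' implementation. Several things here are assumptions: the initial centroids are fixed by hand rather than chosen by either proposed selection algorithm, the data are toy 2-D points rather than the yeast dataset, and the rule "flag an object whose distance to its centroid exceeds `factor` times its cluster's ADOC" (including the hypothetical `factor` parameter) is one plausible reading of the ADOC function.

```python
import math

def assign(points, centroids):
    # Label each point with the index of its nearest centroid.
    return [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points]

def kmeans(points, init_centroids, max_iter=100):
    # Plain Lloyd-style K-Means starting from the given initial centroids.
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j, c in enumerate(centroids):
            members = [p for p, l in zip(points, labels) if l == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(tuple(sum(x) / len(members) for x in zip(*members))
                           if members else c)
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)

def adoc(points, centroids, labels):
    # Average Distance between Object and Centroid, one value per cluster.
    result = []
    for j, c in enumerate(centroids):
        d = [math.dist(p, c) for p, l in zip(points, labels) if l == j]
        result.append(sum(d) / len(d) if d else 0.0)
    return result

def adoc_outliers(points, centroids, labels, factor=1.5):
    # Assumed rule: an object is an outlier if it lies farther from its
    # centroid than factor * ADOC of its own cluster.
    avg = adoc(points, centroids, labels)
    return [i for i, (p, l) in enumerate(zip(points, labels))
            if math.dist(p, centroids[l]) > factor * avg[l]]

def davies_bouldin(points, centroids, labels):
    # Davies-Bouldin index (lower is better); assumes at least two
    # non-empty clusters with distinct centroids.
    s = adoc(points, centroids, labels)
    k = len(centroids)
    return sum(max((s[i] + s[j]) / math.dist(centroids[i], centroids[j])
                   for j in range(k) if j != i)
               for i in range(k)) / k

# Toy data: two tight groups plus one stray point at index 4.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (4, 4),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, labels = kmeans(points, [(0, 0), (10, 10)])
outliers = adoc_outliers(points, centroids, labels)
db_index = davies_bouldin(points, centroids, labels)
print(labels, outliers, round(db_index, 3))
```

Running the same comparison with different initial centroids and keeping the run with the lower Davies-Bouldin value mirrors, in spirit, how the abstract describes comparing initialization methods; a K-Medoids variant would replace the mean update with the choice of the member that minimizes the total distance to the other members of its cluster.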