Combinatorial Algorithms in Machine Learning
Peter Shaw
2018 First International Conference on Artificial Intelligence for Industries (AI4I), September 2018
DOI: 10.1109/AI4I.2018.8665720

The classic data clustering problem, although quite old, asks for a segmentation of the data into homogeneous groups, where homogeneity is measured by, for example, the Gini index. Classical techniques group the data by what one could describe as a “smart” trial-and-error procedure. I will show how data can be clustered using entirely combinatorial techniques in which neither the Gini index nor mean squared error is mentioned at all. The Cluster-Editing algorithm, also known as “Edit-Distance”, shows great promise for those intractable high-dimensional problems because it is entirely indifferent to the dimensionality of the data.
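
To make the idea concrete, here is a minimal sketch of Cluster Editing as the problem is usually stated: given an undirected graph, find the fewest edge insertions and deletions that turn it into a disjoint union of cliques. The abstract does not describe the algorithm used in the talk, so this is not the author's method. It is an exhaustive search that only works on tiny inputs, and the function names (`partitions`, `edit_cost`, `cluster_editing`) and the toy graph are assumptions made for illustration. It also assumes the data have already been turned into a similarity graph, for instance by linking pairs of points judged similar; from that point on only the graph is used, which is the sense in which the dimensionality of the data plays no role.

```python
def partitions(items):
    """Yield every set partition of a list (feasible only for tiny inputs)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or give `first` a block of its own.
        yield [[first]] + part


def edit_cost(edges, clusters):
    """Count edge insertions plus deletions that make `clusters` disjoint cliques.

    `edges` is assumed to list each undirected edge exactly once.
    """
    label = {v: i for i, block in enumerate(clusters) for v in block}
    inside_pairs = sum(len(b) * (len(b) - 1) // 2 for b in clusters)
    inside_edges = sum(1 for u, v in edges if label[u] == label[v])
    between_edges = len(edges) - inside_edges
    # Missing edges inside clusters are inserted; edges between clusters are deleted.
    return (inside_pairs - inside_edges) + between_edges


def cluster_editing(vertices, edges):
    """Brute-force search for a minimum-cost clustering (illustrative only)."""
    best = min(partitions(list(vertices)), key=lambda p: edit_cost(edges, p))
    return best, edit_cost(edges, best)


# Hypothetical toy graph: a triangle {0, 1, 2} attached to the edge {3, 4} via 2-3.
toy_vertices = [0, 1, 2, 3, 4]
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
toy_clusters, toy_cost = cluster_editing(toy_vertices, toy_edges)
print(sorted(sorted(block) for block in toy_clusters), toy_cost)
```

On the toy graph, a single edit (deleting the edge 2-3) separates the triangle from the remaining pair. The exhaustive search grows with the number of set partitions, so anything beyond a handful of vertices needs a proper exact or heuristic method; the sketch only illustrates the objective being minimized.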