An analysis of a fuzzy dissimilarity measure to perform Escherichia coli source tracking
Authors: Hyo-Jin Suh, J. Keller, C. Carson
Published in: The 12th IEEE International Conference on Fuzzy Systems, 2003 (FUZZ '03), 2003-05-25
DOI: https://doi.org/10.1109/FUZZ.2003.1206540
To identify the source of Escherichia coli (E. coli) fecal bacterial contamination, we propose a fuzzy dissimilarity measure for comparing E. coli DNA patterns. The measure preserves the dimension of the DNA patterns while allowing for variation among patterns from the same host. It produces a dissimilarity matrix, a form of relational data. To classify data in this representation, we present a weighted k-nearest neighbor algorithm, which uses the classical k-nearest neighbor rule but resolves ties among multiple classes. In addition, we suggest an ensemble data set method for sample sets with a large range of class sizes. The proposed system showed potential as a stable system for detecting fecal bacterial hosts and as a basis for future studies in interpreting DNA patterns.
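The abstract does not give the weighting scheme, so the following is only a minimal sketch of how a weighted k-nearest neighbor vote over one row of a dissimilarity matrix might break ties. It assumes inverse-dissimilarity weights, which may differ from the authors' method; the function name `weighted_knn`, the `eps` parameter, and the sample values are hypothetical.

```python
from collections import defaultdict


def weighted_knn(dissim_row, labels, k=3, eps=1e-9):
    """Classify one query pattern from its dissimilarities to labeled patterns.

    dissim_row[i] is the dissimilarity between the query and reference
    pattern i; labels[i] is that pattern's host class.
    """
    # Indices of the k reference patterns with the smallest dissimilarity.
    nearest = sorted(range(len(labels)), key=lambda i: dissim_row[i])[:k]

    votes = defaultdict(float)
    for i in nearest:
        # Assumed weighting (not from the paper): closer neighbors vote more.
        # eps avoids division by zero for an exact match.
        votes[labels[i]] += 1.0 / (dissim_row[i] + eps)

    # Weighted totals rarely coincide, so a 1-1 count tie is usually resolved.
    return max(votes, key=votes.get)


# Hypothetical dissimilarities from one query to four reference patterns.
row = [0.10, 0.40, 0.35, 0.90]
hosts = ["cow", "human", "human", "dog"]

# With k=2 the plain vote is tied (one cow, one human); the weights decide.
print(weighted_knn(row, hosts, k=2))
```

With unweighted voting, the k=2 case above would be a tie between the two hosts; under the assumed weights, the nearer neighbor dominates.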