Adjusting SVMs for Large Data Sets using Balanced Decision Trees
Authors: Cristina Vatamanu, Dragos Gavrilut, George Popoiu
Venue: 2018 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)
Published: 2018-09-01
DOI: https://doi.org/10.1109/SYNASC.2018.00043
While machine learning techniques were successfully used for malware identification, they were not without challenges. Over the years, several key requirements for using such algorithms in practical applications have emerged: a low (close to 0) number of false positives, a fast evaluation method, and a reasonable memory and disk footprint. Because of these constraints, security vendors had to choose a simple algorithm (one that can meet all of the above requirements) instead of a more complex one, even if the latter had better detection rates. The present paper describes a hybrid approach that can be used in conjunction with an SVM classifier, allowing us to overcome some of the above-mentioned constraints.