Title: Research and application of random forest model in mining automobile insurance fraud
Authors: Yaqi Li, Chun Yan, W. Liu, Maozhen Li
Published in: 2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)
Publication date: 2016-08-01
DOI: 10.1109/FSKD.2016.7603443 (https://doi.org/10.1109/FSKD.2016.7603443)
Citations: 20
Abstract
Automobile insurance fraud is spreading worldwide, and detecting it is attracting growing attention. Because real automobile insurance claims data are large and class-imbalanced, real data from an automobile insurance company were used to build a random forest fraud mining model grounded in the theory of automobile insurance fraud mining. The data were preprocessed to screen the indicators, the importance of each input variable to the output variable was obtained, and the model's error was analyzed. The method was then verified empirically. The empirical results show that, compared with the traditional model, the fraud mining model based on Random Forest is suitable for large and imbalanced data sets, is better suited to classifying and predicting automobile insurance claims data and mining fraud rules, and has better accuracy and robustness.
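The abstract does not give implementation details, so the following is only a minimal from-scratch sketch of the kind of pipeline it describes: a random forest trained on imbalanced claims data, with variable importance and an error estimate. Everything specific here is an assumption rather than the authors' method: the synthetic data and its hypothetical features (claim amount, days to report, a noise column), the balanced bootstrap used to handle class imbalance, Gini-decrease importance, and the out-of-bag error as the error measure.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def best_split(rows, labels, features):
    """Best (gain, feature, threshold) over the candidate features, or None."""
    best, parent, n = None, gini(labels), len(rows)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def grow(rows, labels, mtry, depth, rng, importance):
    """Grow one tree: leaves are class labels, inner nodes are tuples."""
    if depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    # Random feature subset at every node (the "random" in random forest).
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), mtry))
    if split is None:
        return majority(labels)
    gain, f, t = split
    importance[f] += gain * len(rows)  # weighted Gini decrease
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            grow([rows[i] for i in left], [labels[i] for i in left],
                 mtry, depth - 1, rng, importance),
            grow([rows[i] for i in right], [labels[i] for i in right],
                 mtry, depth - 1, rng, importance))

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def predict(trees, row):
    """Majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree in trees])

def fit_forest(rows, labels, n_trees=25, mtry=2, depth=4, seed=0):
    rng = random.Random(seed)
    by_class = {c: [i for i, y in enumerate(labels) if y == c] for c in set(labels)}
    per_class = min(len(ix) for ix in by_class.values())
    importance = [0.0] * len(rows[0])
    trees, oob_votes = [], [Counter() for _ in rows]
    for _ in range(n_trees):
        # Balanced bootstrap (assumption): same number of draws from each class.
        sample = [rng.choice(ix) for ix in by_class.values() for _ in range(per_class)]
        tree = grow([rows[i] for i in sample], [labels[i] for i in sample],
                    mtry, depth, rng, importance)
        trees.append(tree)
        # Rows left out of this bootstrap vote on the out-of-bag estimate.
        for i in set(range(len(rows))) - set(sample):
            oob_votes[i][predict_tree(tree, rows[i])] += 1
    voted = [(v.most_common(1)[0][0], labels[i]) for i, v in enumerate(oob_votes) if v]
    oob_error = sum(p != y for p, y in voted) / max(len(voted), 1)
    total = sum(importance) or 1.0
    return trees, [x / total for x in importance], oob_error

def make_claims(n=400, fraud_rate=0.1, seed=1):
    """Synthetic, hypothetical claims: [amount, days_to_report, noise]; 1 = fraud."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = 1 if rng.random() < fraud_rate else 0
        rows.append([round(rng.gauss(5 + 3 * y, 1), 1),
                     round(rng.gauss(10 - 4 * y, 2), 1),
                     round(rng.random(), 1)])
        labels.append(y)
    return rows, labels

if __name__ == "__main__":
    rows, labels = make_claims()
    trees, importance, oob_error = fit_forest(rows, labels)
    print("variable importance:", [round(v, 3) for v in importance])
    print("out-of-bag error:", round(oob_error, 3))
```

In practice one would more likely use a library implementation such as scikit-learn's `RandomForestClassifier`, which exposes `class_weight`, `oob_score`, and `feature_importances_` for the same three concerns. The from-scratch version is shown only to make the mechanics visible.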