Oscar M. Cumbicus-Pineda, Tania E. Abad-Eras, Lisset A. Neyra-Romero
Data Mining to Determine the Causes of Gender-Based Violence against Women in Ecuador
2021 IEEE Fifth Ecuador Technical Chapters Meeting (ETCM), published 2021-10-12
DOI: 10.1109/ETCM53643.2021.9590664 (https://doi.org/10.1109/ETCM53643.2021.9590664)
Citations: 1
Abstract
In this paper, we applied data mining to determine the causes of gender-based violence against women in Ecuador. We divided the original database into 30 subsets, according to the scopes in which violence occurs. We first classified these subsets using six algorithms: Decision Trees (J48), Exhaustive CHAID, Neural Networks, Nearest Neighbors (IBk), Decision Tables, and Random Forests. The results of this classification showed a bias towards the majority class; for this reason, we applied the Synthetic Minority Oversampling Technique (SMOTE) to balance the classes and obtain better results. For the predictions of the causes of violence, we used Exhaustive CHAID because our variables are mostly non-binary, and this algorithm allowed us to generate trees with more than two branches. The IBk algorithm was the best at classifying the data overall, and Random Forests achieved the best classification precision. The predictions obtained from the 30 subsets of data show that the most common causes for a woman to suffer violence are: when her partner consumes alcohol or drugs; when her partner is in another relationship; when her husband suffered violence in childhood; and when the woman was touched without her consent or denigrated for being a woman.
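The class-balancing step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy data are hypothetical, and the sketch assumes purely numeric features, whereas the abstract describes the survey variables as mostly non-binary, so a categorical-aware variant would likely be needed in practice.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (numeric features only)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical toy data: 8 majority rows, 3 minority rows, two numeric features.
majority = [(float(x), float(y)) for x in range(4) for y in range(2)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 12.0)]

# Oversample the minority class until both classes have the same size.
new_rows = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows
print(len(majority), len(balanced_minority))  # prints: 8 8
```

Because every synthetic row is a convex combination of two real minority rows, the new points stay inside the region already occupied by the minority class. This is what distinguishes SMOTE from simply duplicating rows.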