Chen-Kai Wang, Hong-Jie Dai, Feng-Duo Wang, E. C. Su
Adverse Drug Reaction Post Classification with Imbalanced Classification Techniques
2018 Conference on Technologies and Applications of Artificial Intelligence (TAAI), published 2018-11-01
DOI: 10.1109/TAAI.2018.00011 (https://doi.org/10.1109/TAAI.2018.00011)
Citations: 5
Abstract
Users increasingly turn to social media to post public messages about their health. As social media use has grown, so has the number of posts describing adverse drug reactions (ADRs). Mining social media data for this information can support pharmacological post-marketing surveillance and monitoring. However, developing automatic ADR detection systems remains challenging because corpora compiled from real-world social media are usually highly imbalanced, which makes it difficult to build classifiers with reliable performance. In this work, we implemented a variety of imbalanced classification techniques and compared their performance on two large imbalanced data sets released for the purpose of detecting ADR posts. Compared with state-of-the-art approaches developed for the two data sets, the classifiers built with these imbalanced classification techniques used far fewer features and achieved comparable or even better F-scores.
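The abstract does not name the specific imbalanced classification techniques that were compared. As an illustration only, the sketch below shows one common technique of this kind, random oversampling of the minority class, together with the F-score used for evaluation. The function names, the toy placeholder posts, and the label convention (1 for an ADR post, 0 otherwise) are assumptions made for this example and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen examples of each smaller class
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for cls, n in counts.items():
        pool = [t for t, y in zip(texts, labels) if y == cls]
        for _ in range(target - n):
            out_texts.append(rng.choice(pool))
            out_labels.append(cls)
    return out_texts, out_labels


def f_score(y_true, y_pred, positive=1):
    """F1 score for the positive class (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy corpus: 2 ADR posts (label 1) and 8 non-ADR posts (label 0).
posts = [f"adr_post_{i}" for i in range(2)] + [f"other_post_{i}" for i in range(8)]
labels = [1] * 2 + [0] * 8

# Oversample the training data only, so both classes are equally represented.
balanced_posts, balanced_labels = random_oversample(posts, labels)
print(Counter(balanced_labels))

# A classifier that always predicts "non-ADR" is 80% accurate on the toy
# corpus but has an F-score of zero for the ADR class.
always_negative = [0] * len(labels)
print(f_score(labels, always_negative))
```

In a setup like this, oversampling would be applied to the training split only, with the F-score computed on an untouched test split. This matters because accuracy alone rewards a classifier that ignores the rare ADR class.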