{"title":"仇恨团体论坛中激进观点识别的部分监督学习","authors":"Ming Yang, Hsinchun Chen","doi":"10.1109/ISI.2012.6284099","DOIUrl":null,"url":null,"abstract":"Web forums are frequently used as platforms for the exchange of information and opinions, as well as propaganda dissemination. But online content can be misused when the information being distributed, such as radical opinions, is unsolicited or inappropriate. However, radical opinion is highly hidden and distributed in Web forums, while non-radical content is unspecific and topically more diverse. It is costly and time consuming to label a large amount of radical content (positive examples) and non-radical content (negative examples) for training classification systems. Nevertheless, it is easy to obtain large volumes of unlabeled content in Web forums. In this paper, we propose and develop a topic-sensitive partially supervised learning approach to address the difficulties in radical opinion identification in hate group Web forums. Specifically, we design a labeling heuristic to extract high quality positive examples and negative examples from unlabeled datasets. The empirical evaluation results from two large hate group Web forums suggest that our proposed approach generally outperforms the benchmark techniques and exhibits more stable performance than its counterparts.","PeriodicalId":199734,"journal":{"name":"2012 IEEE International Conference on Intelligence and Security Informatics","volume":"57 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Partially supervised learning for radical opinion identification in hate group web forums\",\"authors\":\"Ming Yang, Hsinchun Chen\",\"doi\":\"10.1109/ISI.2012.6284099\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Web forums are frequently used as platforms for the exchange of information and opinions, as well as propaganda dissemination. But online content can be misused when the information being distributed, such as radical opinions, is unsolicited or inappropriate. However, radical opinion is highly hidden and distributed in Web forums, while non-radical content is unspecific and topically more diverse. It is costly and time consuming to label a large amount of radical content (positive examples) and non-radical content (negative examples) for training classification systems. Nevertheless, it is easy to obtain large volumes of unlabeled content in Web forums. In this paper, we propose and develop a topic-sensitive partially supervised learning approach to address the difficulties in radical opinion identification in hate group Web forums. Specifically, we design a labeling heuristic to extract high quality positive examples and negative examples from unlabeled datasets. The empirical evaluation results from two large hate group Web forums suggest that our proposed approach generally outperforms the benchmark techniques and exhibits more stable performance than its counterparts.\",\"PeriodicalId\":199734,\"journal\":{\"name\":\"2012 IEEE International Conference on Intelligence and Security Informatics\",\"volume\":\"57 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-06-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 IEEE International Conference on Intelligence and Security Informatics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISI.2012.6284099\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE International Conference on Intelligence and Security Informatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISI.2012.6284099","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Partially supervised learning for radical opinion identification in hate group web forums
Web forums are frequently used as platforms for the exchange of information and opinions, as well as propaganda dissemination. But online content can be misused when the information being distributed, such as radical opinions, is unsolicited or inappropriate. However, radical opinion is highly hidden and distributed in Web forums, while non-radical content is unspecific and topically more diverse. It is costly and time consuming to label a large amount of radical content (positive examples) and non-radical content (negative examples) for training classification systems. Nevertheless, it is easy to obtain large volumes of unlabeled content in Web forums. In this paper, we propose and develop a topic-sensitive partially supervised learning approach to address the difficulties in radical opinion identification in hate group Web forums. Specifically, we design a labeling heuristic to extract high quality positive examples and negative examples from unlabeled datasets. The empirical evaluation results from two large hate group Web forums suggest that our proposed approach generally outperforms the benchmark techniques and exhibits more stable performance than its counterparts.