Enhanced lexicon based model for web forum answer detection
A. I. Obasa, N. Salim, Atif Khan
2015 Fifth International Conference on Digital Information Processing and Communications (ICDIPC), 2015-11-12
DOI: 10.1109/ICDIPC.2015.7323035 (https://doi.org/10.1109/ICDIPC.2015.7323035)
A Web forum is an online community that connects people who share common interests. Within the forum, members interact to share knowledge, expertise and resources. A major issue in detecting web forum answers is establishing a good relationship between the question and the candidate answer. This relationship is often established using lexical features. Web forum text, unlike news articles, is noisy, and this noise hinders the performance of lexical features. In this paper, we investigate the effect of noise on most of the common lexical features used in mining web forum answers, with a view to normalizing the text to enhance the performance of the features. We propose 13 lexical features for exploration. These features belong to four different quality dimensions that can guarantee good answers. We empirically address the following questions in the paper: (1) What category of noise is most prevalent in web forums? (2) Which lexical mining features are most susceptible to noise? (3) Will normalization of a forum corpus enhance the performance of lexical features in detecting web forum answers? We used three publicly available datasets of varying technical degrees for the experiments. The experimental results revealed that proper normalization of web forum corpora can yield up to a 9% increase in the performance of the lexical features.
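To illustrate why noise can weaken a lexical feature, the following minimal sketch compares a word-overlap score between a question and a candidate answer before and after normalization. It is not the authors' method: the abstract does not list the 13 features or the normalization procedure, so the Jaccard overlap feature, the `NOISE_LEXICON` mapping, and the function names `tokenize`, `normalize`, and `jaccard` are all assumptions chosen for illustration.

```python
import re

# Hypothetical slang/abbreviation lexicon (an assumption for this sketch;
# the paper's actual normalization resources are not given in the abstract).
NOISE_LEXICON = {
    "u": "you",
    "ur": "your",
    "b4": "before",
    "plz": "please",
    "thx": "thanks",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def normalize(tokens):
    """Collapse runs of 3+ repeated characters, then expand known slang."""
    cleaned = []
    for tok in tokens:
        tok = re.sub(r"(.)\1{2,}", r"\1", tok)  # "plzzz" -> "plz"
        cleaned.append(NOISE_LEXICON.get(tok, tok))
    return cleaned


def jaccard(question_tokens, answer_tokens):
    """Jaccard overlap of the two token sets: |Q & A| / |Q | A|."""
    q, a = set(question_tokens), set(answer_tokens)
    if not q and not a:
        return 0.0
    return len(q & a) / len(q | a)


question = "How can you reset your router before updating"
answer = "u can reset ur router b4 updating plzzz"

q_tokens = tokenize(question)
a_tokens = tokenize(answer)

raw_score = jaccard(q_tokens, a_tokens)
normalized_score = jaccard(normalize(q_tokens), normalize(a_tokens))

print(f"raw overlap:        {raw_score:.3f}")
print(f"normalized overlap: {normalized_score:.3f}")
```

In this toy pair the answer is relevant, but abbreviations such as "u" and "b4" hide most of the shared vocabulary, so the raw overlap understates the match. A dictionary lookup is the simplest possible normalizer; it was chosen here only to keep the sketch short.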