Enhanced lexicon based model for web forum answer detection

A. I. Obasa, N. Salim, Atif Khan
DOI: 10.1109/ICDIPC.2015.7323035 (https://doi.org/10.1109/ICDIPC.2015.7323035)
Published in: 2015 Fifth International Conference on Digital Information Processing and Communications (ICDIPC)
Publication date: 2015-11-12
Citations: 4

Abstract

A Web forum is an online community that connects people with common interests. Within the forum, members interact to share knowledge, expertise and resources. A major issue in detecting web forum answers is establishing a good relationship between the question and the candidate answer. This relationship is often established using lexical features. Web forum text, unlike news articles, is noisy, and this hinders the performance of lexical features. In this paper, we investigate the effect of noise on most of the common lexical features used in mining web forum answers, with a view to normalizing the text to enhance the performance of the features. We propose 13 lexical features for exploration. These features belong to four different quality dimensions that can guarantee good answers. We empirically address the following questions: Which category of noise is most rampant in web forums? Which lexical mining features are most susceptible to noise? Will normalization of a forum corpus enhance the performance of lexical features in detecting web forum answers? We used three publicly available datasets of varying technical degrees for the experiments. The experimental results revealed that proper normalization of web forum corpora can yield up to a 9% increase in the performance of the lexical features.
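
To make the idea of a lexical feature and the effect of normalization concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not list the 13 features or the normalization steps, so the Jaccard word-overlap feature, the tiny SLANG table, the crude elongation rule, and the example question and answer are all assumptions chosen for illustration.

```python
import re

# Hypothetical slang/abbreviation table (an assumption; the paper's actual
# normalization resources are not described in the abstract).
SLANG = {"u": "you", "ur": "your", "pls": "please", "thx": "thanks"}


def normalize(text):
    """Lowercase, collapse character elongations, tokenize, expand slang."""
    text = text.lower()
    # Crude rule: collapse any run of 3+ identical characters to one,
    # e.g. "sooo" -> "so", "???" -> "?".
    text = re.sub(r"(.)\1{2,}", r"\1", text)
    tokens = re.findall(r"[a-z0-9']+", text)
    return [SLANG.get(tok, tok) for tok in tokens]


def jaccard(tokens_a, tokens_b):
    """Word-overlap feature: |A intersect B| / |A union B| over token sets."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


# Illustrative (made-up) forum question and candidate answer.
question = "How do u reset ur password pls???"
answer = "You can reset your password from the settings page."

# Feature on the raw text: whitespace tokens, no cleanup.
raw_q, raw_a = question.split(), answer.split()
raw_score = jaccard(raw_q, raw_a)

# Same feature after normalization.
norm_q, norm_a = normalize(question), normalize(answer)
norm_score = jaccard(norm_q, norm_a)

print(f"raw overlap:        {raw_score:.3f}")
print(f"normalized overlap: {norm_score:.3f}")
```

In this toy pair, normalization maps "u" and "ur" onto the words used in the answer, so the overlap feature rises. That is the kind of effect the abstract describes, although the size of the gain here says nothing about the reported 9% figure.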