Improving Similar Question Retrieval using a Novel Tripartite Neural Network based Approach

Anirban Sen, Manjira Sinha, Sandya Mannarswamy
Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation, published 2017-12-08
DOI: 10.1145/3158354.3158355 (https://doi.org/10.1145/3158354.3158355)
Citations: 1

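Abstract

Community Question Answering (CQA) services such as Quora, Yahoo Answers, and Stack Overflow distill the collective intelligence of the crowd: users share their knowledge, providing both informational and experiential support to other users. Because users often search for similar information, there is a high probability that, for a new incoming question, a related question-answer pair already exists in the CQA dataset. An efficient technique for similar question identification is therefore needed. While data is not a bottleneck in this scenario, the vocabulary diversity generated by a varied pool of users certainly is. This paper proposes a novel tripartite neural network based approach to the similar question retrieval problem. The network takes as input a triplet made up of a question-answer pair and a new question, and learns internal representations from the similarities among them. Our approach achieves classification performance of up to 77% on a real-world CQA dataset. We also compared our method with two other baselines and found that it performs significantly better in handling vocabulary diversity and 'zero lexical overlap' among questions.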
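The abstract does not describe the network's layers, so the following is only a minimal sketch of the triplet idea, not the authors' model. It assumes a plain bag-of-words count vector as a stand-in for a learned encoder, and the helper names (`bag_of_words`, `cosine`, `triplet_features`) and the example sentences are hypothetical. It shows why comparing a new question against the archived answer as well as the archived question can help when the two questions share no words.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; an assumed stand-in for a learned encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def triplet_features(question, answer, new_question):
    """Pairwise similarities among the three parts of a triplet."""
    q, a, n = (bag_of_words(t) for t in (question, answer, new_question))
    return {
        "new_vs_question": cosine(n, q),
        "new_vs_answer": cosine(n, a),
        "question_vs_answer": cosine(q, a),
    }


# Hypothetical example: the two questions share no tokens at all,
# but the archived answer overlaps with the new question.
features = triplet_features(
    question="how do i reset my password",
    answer="open settings and choose reset password to recover your login",
    new_question="cannot recover login credentials",
)
print(features)
```

In this toy example the question-to-question similarity is zero while the new question still overlaps with the archived answer, which is the kind of signal a triplet input makes available. A trained model would replace the count vectors with learned representations and feed the similarities into a classifier.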