Comparative Web Search Questions

Alexander Bondarenko, Pavel Braslavski, Michael Völske, Rami Aly, Maik Fröbe, Alexander Panchenko, Christian Biemann, Benno Stein, Matthias Hagen
Proceedings of the 13th International Conference on Web Search and Data Mining. Published 2020-01-20.
DOI: 10.1145/3336191.3371848 (https://doi.org/10.1145/3336191.3371848)
Citations: 22

Abstract

We analyze comparative questions, i.e., questions asking to compare different items, that were submitted to Yandex in 2012. Responses to such questions might be quite different from the simple "ten blue links" and could, for example, aggregate pros and cons of the different options as direct answers. However, changing the result presentation is an intricate decision such that the classification of comparative questions forms a highly precision-oriented task. From a year-long Yandex log, we annotate a random sample of 50,000 questions, 2.8% of which are comparative. For these annotated questions, we develop a precision-oriented classifier by combining carefully hand-crafted lexico-syntactic rules with feature-based and neural approaches, achieving a recall of 0.6 at a perfect precision of 1.0. After running the classifier on the full year log (on average, there is at least one comparative question per second), we analyze 6,250 comparative questions using more fine-grained subclasses (e.g., should the answer be a "simple" fact or rather a more verbose argument) for which individual classifiers are trained. An important insight is that more than 65% of the comparative questions demand argumentation and opinions, i.e., reliable direct answers to comparative questions require more than the facts from a search engine's knowledge graph. In addition, we present a qualitative analysis of the underlying comparative information needs (separated into 14 categories like consumer electronics or health), their seasonal dynamics, and possible answers from community question answering platforms.
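The phrase "recall at a perfect precision of 1.0" can be made concrete with a small sketch. The code below is not the authors' classifier; it only illustrates, under assumptions, how one might pick a decision threshold on validation scores so that no false positive is accepted, and then read off the recall. The function name `recall_at_perfect_precision` and the toy scores and labels are hypothetical and invented for illustration.

```python
from typing import Optional, Sequence, Tuple


def recall_at_perfect_precision(
    scores: Sequence[float], labels: Sequence[bool]
) -> Tuple[Optional[float], float]:
    """Return (threshold, recall) for the lowest threshold with no false positives.

    A question is predicted comparative when score >= threshold.
    Returns (None, 0.0) if no positive outscores every negative.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives:
        return None, 0.0
    # Every accepted question must score strictly above the best-scoring negative.
    floor = max(negatives) if negatives else float("-inf")
    accepted = [s for s in positives if s > floor]
    if not accepted:
        return None, 0.0
    return min(accepted), len(accepted) / len(positives)


# Toy validation data, made up for illustration (not taken from the paper).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.20]
labels = [True, True, True, False, True, True, False]
threshold, recall = recall_at_perfect_precision(scores, labels)
print(threshold, recall)
```

On this toy data the highest-scoring non-comparative question has score 0.70, so only the three comparative questions above it are accepted and two are missed. Trading recall for precision this way matches the abstract's framing of the task as precision-oriented: a missed comparative question falls back to ordinary results, while a false positive would change the result presentation unnecessarily.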