Modern Theoretical Tools for Understanding and Designing Next-generation Information Retrieval System

Da Xu, Chuanwei Ruan
{"title":"Modern Theoretical Tools for Understanding and Designing Next-generation Information Retrieval System","authors":"Da Xu, Chuanwei Ruan","doi":"10.1145/3488560.3501394","DOIUrl":null,"url":null,"abstract":"In the relatively short history of machine learning, the subtle balance between engineering and theoretical progress has been proved critical at various stages. The most recent wave of AI has brought to the IR community powerful techniques, particularly for pattern recognition. While many benefits from the burst of ideas as numerous tasks become algorithmically feasible, the balance is tilting toward the application side. The existing theoretical tools in IR can no longer explain, guide, and justify the newly-established methodologies. With no choices, we have to bet our design on black-box mechanisms that we only empirically understand. The consequences can be suffering: in stark contrast to how the IR industry has envisioned modern AI making life easier, many are experiencing increased confusion and costs in data manipulation, model selection, monitoring, censoring, and decision making. This reality is not surprising: without handy theoretical tools, we often lack principled knowledge of the pattern recognition model's expressivity, optimization property, generalization guarantee, and our decision-making process has to rely on over-simplified assumptions and human judgments from time to time. Facing all the challenges, we started researching advanced theoretical tools emerging from various domains that can potentially resolve modern IR problems. We encountered many impactful ideas and made several independent publications emphasizing different pieces. Time is now to bring the community a systematic tutorial on how we successfully adapt those tools and make significant progress in understanding, designing, and eventually productionize impactful IR systems. We emphasize systematicity because IR is a comprehensive discipline that touches upon particular aspects of learning, causal inference analysis, interactive (online) decision-making, etc. It thus requires systematic calibrations to render the actual usefulness of the imported theoretical tools to serve IR problems, as they usually exhibit unique structures and definitions. Therefore, we plan this tutorial to systematically demonstrate our learning and successful experience of using advanced theoretical tools for understanding and designing IR systems.","PeriodicalId":348686,"journal":{"name":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","volume":"32 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3488560.3501394","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

In the relatively short history of machine learning, the subtle balance between engineering and theoretical progress has been proved critical at various stages. The most recent wave of AI has brought to the IR community powerful techniques, particularly for pattern recognition. While many benefits from the burst of ideas as numerous tasks become algorithmically feasible, the balance is tilting toward the application side. The existing theoretical tools in IR can no longer explain, guide, and justify the newly-established methodologies. With no choices, we have to bet our design on black-box mechanisms that we only empirically understand. The consequences can be suffering: in stark contrast to how the IR industry has envisioned modern AI making life easier, many are experiencing increased confusion and costs in data manipulation, model selection, monitoring, censoring, and decision making. This reality is not surprising: without handy theoretical tools, we often lack principled knowledge of the pattern recognition model's expressivity, optimization property, generalization guarantee, and our decision-making process has to rely on over-simplified assumptions and human judgments from time to time. Facing all the challenges, we started researching advanced theoretical tools emerging from various domains that can potentially resolve modern IR problems. We encountered many impactful ideas and made several independent publications emphasizing different pieces. Time is now to bring the community a systematic tutorial on how we successfully adapt those tools and make significant progress in understanding, designing, and eventually productionize impactful IR systems. We emphasize systematicity because IR is a comprehensive discipline that touches upon particular aspects of learning, causal inference analysis, interactive (online) decision-making, etc. It thus requires systematic calibrations to render the actual usefulness of the imported theoretical tools to serve IR problems, as they usually exhibit unique structures and definitions. Therefore, we plan this tutorial to systematically demonstrate our learning and successful experience of using advanced theoretical tools for understanding and designing IR systems.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
理解和设计下一代信息检索系统的现代理论工具
在机器学习相对较短的历史中,工程和理论进步之间的微妙平衡在各个阶段都被证明是至关重要的。最近的人工智能浪潮给IR社区带来了强大的技术,特别是模式识别。虽然随着众多任务在算法上变得可行,想法的爆发带来了许多好处,但平衡正在向应用程序方面倾斜。IR中现有的理论工具不再能够解释、指导和证明新建立的方法。在没有选择的情况下,我们不得不把我们的设计押在我们只能凭经验理解的黑盒机制上。其后果可能是痛苦的:与IR行业设想的现代人工智能使生活更容易形成鲜明对比的是,许多人在数据操作、模型选择、监控、审查和决策方面遇到了越来越多的困惑和成本。这一现实并不令人惊讶:如果没有方便的理论工具,我们往往缺乏对模式识别模型的表达能力、优化特性、泛化保证的原则性知识,我们的决策过程不得不依赖于过度简化的假设和人类的判断。面对所有的挑战,我们开始研究来自不同领域的先进理论工具,这些工具有可能解决现代红外问题。我们遇到了许多有影响力的想法,并制作了几份独立的出版物,强调不同的作品。现在是时候给社区带来一个系统的教程,告诉他们我们如何成功地适应这些工具,并在理解、设计和最终生产有影响力的IR系统方面取得重大进展。我们强调系统性,因为IR是一门涉及学习、因果推理分析、互动(在线)决策等特定方面的综合性学科。因此,它需要系统的校准,以提供实际有用的进口理论工具,以解决IR问题,因为它们通常表现出独特的结构和定义。因此,我们计划本教程系统地展示我们使用先进的理论工具来理解和设计红外系统的学习和成功经验。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
AdaptKT: A Domain Adaptable Method for Knowledge Tracing Doctoral Consortium of WSDM'22: Exploring the Bias of Adversarial Defenses Half-Day Tutorial on Combating Online Hate Speech: The Role of Content, Networks, Psychology, User Behavior, etc. Near Real Time AI Personalization for Notifications at LinkedIn k-Clustering with Fair Outliers
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1