Foundations and Trends in Information Retrieval: Latest Articles

From Foundations to GPT in Text Classification: A Comprehensive Survey on Current Approaches and Future Trends

IF 10.4 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Foundations and Trends in Information Retrieval

Pub Date : 2025-04-16 DOI: 10.1561/1500000107

Marco Siino, Ilenia Tinnirello, Marco La Cascia

Text classification stands as a cornerstone within the realm of Natural Language Processing (NLP), particularly when viewed through computer science and engineering. The past decade has seen deep learning revolutionize text classification, propelling advancements in text retrieval, categorization, information extraction, and summarization. The scholarly literature includes datasets, models, and evaluation criteria, with English being the predominant language of focus, despite studies involving Arabic, Chinese, Hindi, and others. The efficacy of text classification models relies heavily on their ability to capture intricate textual relationships and non-linear correlations, necessitating a comprehensive examination of the entire text classification pipeline.

In the NLP domain, a plethora of text representation techniques and model architectures have emerged, with Large Language Models (LLMs) and Generative Pre-trained Transformers (GPTs) at the forefront. These models are adept at transforming extensive textual data into meaningful vector representations encapsulating semantic information. The multidisciplinary nature of text classification, encompassing data mining, linguistics, and information retrieval, highlights the importance of collaborative research to advance the field. This work integrates traditional and contemporary text mining methodologies, fostering a holistic understanding of text classification.

This monograph provides an in-depth exploration of the text classification pipeline, with a particular emphasis on evaluating the impact of each component on the overall performance of text classification models. The pipeline includes state-of-the-art datasets, text preprocessing techniques, text representation methods, classification models, evaluation metrics, and future trends. Each section examines these stages, presenting technical innovations and recent findings. The work assesses various classification strategies, offering comparative analyses, examples, and case studies. These contributions extend beyond a typical survey, providing a detailed and insightful exploration of the field.

Citations: 0

Search as Learning

IF 10.4 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Foundations and Trends in Information Retrieval

Pub Date : 2025-03-10 DOI: 10.1561/1500000084

Kelsey Urgo, Jaime Arguello

Search systems are often designed to support simple look-up tasks, such as fact-finding and navigation tasks. However, people increasingly use search engines to complete tasks that require deeper learning. In recent years, the search as learning (SAL) research community has argued that search systems should also be designed to support information-seeking tasks that involve complex learning as an important outcome. This monograph aims to provide a comprehensive review of prior research in search as learning and related areas.

Searching to learn can be characterized by specific learning objectives, strategies, and context. Therefore, we begin by reviewing research in education that has aimed at characterizing learning objectives, strategies, and context. Then, we review methods used in prior studies to measure learning during a search session. Here, we discuss two important recommendations for future work: (1) measuring learning retention and (2) measuring a learner's ability to transfer their new knowledge to a novel scenario. Following this, we discuss studies that have focused on understanding factors that influence learning during search and search behaviors that are predictive of learning. Next, we survey tools that have been developed to support learning during search. Searching for the purpose of learning is often a solitary activity. Research in self-regulated learning (SRL) aims to understand how people monitor and control their own learning. Therefore, we review existing models of SRL, methods to measure engagement with specific SRL processes, and tools to support effective SRL. We conclude by discussing potential areas for future research.

Citations: 0

Understanding and Mitigating Gender Bias in Information Retrieval Systems

IF 10.4 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Foundations and Trends in Information Retrieval

Pub Date : 2025-02-09 DOI: 10.1561/1500000103

Shirin Seyedsalehi, Amin Bigdeli, Negar Arabzadeh, Batool AlMousawi, Zack Marshall, Morteza Zihayat, Ebrahim Bagheri

Gender bias is a pervasive issue that continues to influence various aspects of society, including the outcomes of information retrieval (IR) systems. As these systems become increasingly integral to accessing and navigating the vast amounts of information available today, the need to understand and mitigate gender bias within them is paramount. This monograph provides a comprehensive examination of the origins, manifestations, and consequences of gender bias in IR systems, as well as the current methodologies employed to address these biases.

Theoretical frameworks surrounding gender and its representation in artificial intelligence (AI) systems are explored, particularly focusing on how traditional gender binaries are perpetuated and reinforced through data and algorithmic processes. Metrics and methodologies used to identify and measure gender bias within IR systems are then analyzed, offering a detailed evaluation of existing approaches and their limitations.

Subsequent sections address the sources of gender bias, including biased input queries, retrieval methods, and gold standard datasets. Various data-driven and method-level debiasing strategies are presented, including techniques for debiasing neural embeddings and algorithmic approaches aimed at reducing bias in IR system outputs. The monograph concludes with a discussion of the challenges and limitations faced by current debiasing efforts and provides insights into future research directions that could lead to more equitable and inclusive IR systems.

This monograph serves as a valuable resource for researchers, practitioners, and students in the fields of information retrieval, artificial intelligence, and data science, providing the knowledge and tools needed to address gender bias and contribute to the development of fair and unbiased information systems.

Citations: 0

Mathematical Information Retrieval: Search and Question Answering

IF 10.4 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Foundations and Trends in Information Retrieval

Pub Date : 2025-01-27 DOI: 10.1561/1500000095

Richard Zanibbi, Behrooz Mansouri, Anurag Agarwal

Mathematical information is essential for technical work, but its creation, interpretation, and search are challenging. To help address these challenges, researchers have developed multimodal search engines and mathematical question answering systems. This monograph begins with a simple framework characterizing the information tasks that people and systems perform as we work to answer math-related questions. The framework is used to organize and relate the other core topics of the monograph, including interactions between people and systems, representing math formulas in sources, and evaluation. We close by addressing some key questions and presenting directions for future work. This monograph is intended for students, instructors, and researchers interested in systems that help us find and use mathematical information.

Citations: 0

Information Discovery in E-commerce

IF 10.4 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Foundations and Trends in Information Retrieval

Pub Date : 2024-12-30 DOI: 10.1561/1500000097

Zhaochun Ren, Xiangnan He, Dawei Yin, Maarten de Rijke

Electronic commerce, or e-commerce, is the buying and selling of goods and services, or the transmitting of funds or data online. E-commerce platforms come in many kinds, with global players such as Amazon, Airbnb, Alibaba, Booking.com, eBay, and JD.com and platforms targeting specific geographic regions such as Bol.com and Flipkart.com. Information retrieval has a natural role to play in e-commerce, especially in connecting people to goods and services. Information discovery in e-commerce concerns different types of search (e.g., exploratory search vs. lookup tasks), recommender systems, and natural language processing in e-commerce portals. The rise in popularity of e-commerce sites has made research on information discovery in e-commerce an increasingly active research area. This is witnessed by an increase in publications and dedicated workshops in this space. Methods for information discovery in e-commerce largely focus on improving the effectiveness of e-commerce search and recommender systems, on enriching and using knowledge graphs to support e-commerce, and on developing innovative question answering and bot-based solutions that help to connect people to goods and services. In this survey, an overview is given of the fundamental infrastructure, algorithms, and technical solutions for information discovery in e-commerce. The topics covered include user behavior and profiling, search, recommendation, and language technology in e-commerce.

Citations: 0

Fairness in Search Systems

IF 10.4 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Foundations and Trends in Information Retrieval

Pub Date : 2024-12-23 DOI: 10.1561/1500000101

Yi Fang, Ashudeep Singh, Zhiqiang Tao

Search engines play a crucial role in organizing and delivering information to billions of users worldwide. However, these systems often reflect and amplify existing societal biases and stereotypes through their search results and rankings. This concern has prompted researchers to investigate methods for measuring and reducing algorithmic bias, with the goal of developing more equitable search systems. This monograph presents a comprehensive taxonomy of fairness in search systems and surveys the current research landscape. We systematically examine how bias manifests across key search components, including query interpretation and processing, document representation and indexing, result ranking algorithms, and system evaluation metrics. By critically analyzing the existing literature, we identify persistent challenges and promising research directions in the pursuit of fairer search systems. Our aim is to provide a foundation for future work in this rapidly evolving field while highlighting opportunities to create more inclusive and equitable information retrieval technologies.

Citations: 0

User Simulation for Evaluating Information Access Systems

IF 10.4 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Foundations and Trends in Information Retrieval

Pub Date : 2024-06-12 DOI: 10.1561/1500000098

Krisztian Balog, ChengXiang Zhai

Information access systems, such as search engines, recommender systems, and conversational assistants, have become integral to our daily lives as they help us satisfy our information needs. However, evaluating the effectiveness of these systems presents a long-standing and complex scientific challenge. This challenge is rooted in the difficulty of assessing a system’s overall effectiveness in assisting users to complete tasks through interactive support, and further exacerbated by the substantial variation in user behaviour and preferences. To address this challenge, user simulation emerges as a promising solution.

This monograph focuses on providing a thorough understanding of user simulation techniques designed specifically for evaluation purposes. We begin with background on information access system evaluation and explore the diverse applications of user simulation. Subsequently, we systematically review the major research progress in user simulation, covering general frameworks for designing user simulators, the use of user simulation for evaluation, and specific models and algorithms for simulating user interactions with search engines, recommender systems, and conversational assistants. Realizing that user simulation is an interdisciplinary research topic, whenever possible, we attempt to establish connections with related fields, including machine learning, dialogue systems, user modeling, and economics. We end the monograph with a broad discussion of important future research directions, many of which extend beyond the evaluation of information access systems and are expected to have broader impact on how to evaluate interactive intelligent systems in general.

Citations: 0

Multi-hop Question Answering

IF 10.4 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Foundations and Trends in Information Retrieval

Pub Date : 2024-06-12 DOI: 10.1561/1500000102

Vaibhav Mavi, Anubhav Jangra, Adam Jatowt

The task of Question Answering (QA) has attracted significant research interest for a long time. Its relevance to language understanding and knowledge retrieval tasks, along with the simple setting, makes the task of QA crucial for strong AI systems. Recent success on simple QA tasks has shifted the focus to more complex settings. Among these, Multi-Hop QA (MHQA) is one of the most researched tasks over recent years. In broad terms, MHQA is the task of answering natural language questions that involve extracting and combining multiple pieces of information and doing multiple steps of reasoning. An example of a multi-hop question would be “The Argentine PGA Championship record holder has won how many tournaments worldwide?”. Answering the question would need two pieces of information: “Who is the record holder for Argentine PGA Championship tournaments?” and “How many tournaments did [Answer of Sub Q1] win?”. The ability to answer multi-hop questions and perform multi-step reasoning can significantly improve the utility of NLP systems. Consequently, the field has seen a surge of high-quality datasets, models, and evaluation strategies. The notion of ‘multiple hops’ is somewhat abstract, which results in a large variety of tasks that require multi-hop reasoning. This leads to different datasets and models that differ significantly from each other and make the field challenging to generalize and survey. We aim to provide a general and formal definition of the MHQA task, and organize and summarize existing MHQA frameworks. We also outline some best practices for building MHQA datasets. This monograph provides a systematic and thorough introduction to this highly interesting yet quite challenging task, and structures the existing attempts at it.

Citations: 0

Conversational Information Seeking

IF 10.4 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Foundations and Trends in Information Retrieval

Pub Date : 2023-08-02 DOI: 10.1561/1500000081

Hamed Zamani, Johanne R. Trippas, Jeff Dalton, Filip Radlinski

Conversational information seeking (CIS) is concerned with a sequence of interactions between one or more users and an information system. Interactions in CIS are primarily based on natural language dialogue, while they may include other types of interactions, such as click, touch, and body gestures. This monograph provides a thorough overview of CIS definitions, applications, interactions, interfaces, design, implementation, and evaluation. This monograph views CIS applications as including conversational search, conversational question answering, and conversational recommendation. Our aim is to provide an overview of past research related to CIS, introduce the current state-of-the-art in CIS, highlight the challenges still being faced in the community, and suggest future directions.

Citations: 49

Perspectives of Neurodiverse Participants in Interactive Information Retrieval

IF 10.4 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Foundations and Trends in Information Retrieval

Pub Date : 2023-07-26 DOI: 10.1561/1500000086

Laurianne Sitbon, Gerd Berget, Margot Brereton

This monograph offers a survey of work to date to inform how interactions in information retrieval systems could afford inclusion of users who are neurodiverse. This existing work is positioned within a range of philosophies, frameworks and epistemologies which frame the importance of including neurodiverse users in all stages of research and development of Interactive Information Retrieval (IIR) systems. The monograph also offers examples and practical approaches to include neurodiverse users in IIR research, and explores the challenges ahead in the field.

Citations: 0
