Text classification stands as a cornerstone within the realm of Natural Language Processing (NLP), particularly when viewed through computer science and engineering. The past decade has seen deep learning revolutionize text classification, propelling advancements in text retrieval, categorization, information extraction, and summarization. The scholarly literature includes datasets, models, and evaluation criteria, with English being the predominant language of focus, despite studies involving Arabic, Chinese, Hindi, and others. The efficacy of text classification models relies heavily on their ability to capture intricate textual relationships and non-linear correlations, necessitating a comprehensive examination of the entire text classification pipeline.
In the NLP domain, a plethora of text representation techniques and model architectures have emerged, with Large Language Models (LLMs) and Generative Pre-trained Transformers (GPTs) at the forefront. These models are adept at transforming extensive textual data into meaningful vector representations encapsulating semantic information. The multidisciplinary nature of text classification, encompassing data mining, linguistics, and information retrieval, highlights the importance of collaborative research to advance the field. This work integrates traditional and contemporary text mining methodologies, fostering a holistic understanding of text classification.
This monograph provides an in-depth exploration of the text classification pipeline, with a particular emphasis on evaluating the impact of each component on the overall performance of text classification models. The pipeline includes state-of-the-art datasets, text preprocessing techniques, text representation methods, classification models, evaluation metrics, and future trends. Each section examines these stages, presenting technical innovations and recent findings. The work assesses various classification strategies, offering comparative analyses, examples, and case studies. These contributions extend beyond a typical survey, providing a detailed and insightful exploration of the field.
Marco Siino, Ilenia Tinnirello, Marco La Cascia. "From Foundations to GPT in Text Classification: A Comprehensive Survey on Current Approaches and Future Trends." Foundations and Trends in Information Retrieval, published 2025-04-16. https://doi.org/10.1561/1500000107
Search systems are often designed to support simple look-up tasks, such as fact-finding and navigation tasks. However, people increasingly use search engines to complete tasks that require deeper learning. In recent years, the search as learning (SAL) research community has argued that search systems should also be designed to support information-seeking tasks that involve complex learning as an important outcome. This monograph aims to provide a comprehensive review of prior research in search as learning and related areas.
Searching to learn can be characterized by specific learning objectives, strategies, and context. Therefore, we begin by reviewing research in education that has aimed at characterizing learning objectives, strategies, and context. Then, we review methods used in prior studies to measure learning during a search session. Here, we discuss two important recommendations for future work: (1) measuring learning retention and (2) measuring a learner's ability to transfer their new knowledge to a novel scenario. Following this, we discuss studies that have focused on understanding factors that influence learning during search and search behaviors that are predictive of learning. Next, we survey tools that have been developed to support learning during search. Searching for the purpose of learning is often a solitary activity. Research in self-regulated learning (SRL) aims to understand how people monitor and control their own learning. Therefore, we review existing models of SRL, methods to measure engagement with specific SRL processes, and tools to support effective SRL. We conclude by discussing potential areas for future research.
Kelsey Urgo, Jaime Arguello. "Search as Learning." Foundations and Trends in Information Retrieval, published 2025-03-10. https://doi.org/10.1561/1500000084
Gender bias is a pervasive issue that continues to influence various aspects of society, including the outcomes of information retrieval (IR) systems. As these systems become increasingly integral to accessing and navigating the vast amounts of information available today, the need to understand and mitigate gender bias within them is paramount. This monograph provides a comprehensive examination of the origins, manifestations, and consequences of gender bias in IR systems, as well as the current methodologies employed to address these biases.
Theoretical frameworks surrounding gender and its representation in artificial intelligence (AI) systems are explored, particularly focusing on how traditional gender binaries are perpetuated and reinforced through data and algorithmic processes. Metrics and methodologies used to identify and measure gender bias within IR systems are then analyzed, offering a detailed evaluation of existing approaches and their limitations.
Subsequent sections address the sources of gender bias, including biased input queries, retrieval methods, and gold standard datasets. Various data-driven and method-level debiasing strategies are presented, including techniques for debiasing neural embeddings and algorithmic approaches aimed at reducing bias in IR system outputs. The monograph concludes with a discussion of the challenges and limitations faced by current debiasing efforts and provides insights into future research directions that could lead to more equitable and inclusive IR systems.
This monograph serves as a valuable resource for researchers, practitioners, and students in the fields of information retrieval, artificial intelligence, and data science, providing the knowledge and tools needed to address gender bias and contribute to the development of fair and unbiased information systems.
Shirin Seyedsalehi, Amin Bigdeli, Negar Arabzadeh, Batool AlMousawi, Zack Marshall, Morteza Zihayat, Ebrahim Bagheri. "Understanding and Mitigating Gender Bias in Information Retrieval Systems." Foundations and Trends in Information Retrieval, published 2025-02-09. https://doi.org/10.1561/1500000103
Mathematical information is essential for technical work, but its creation, interpretation, and search are challenging. To help address these challenges, researchers have developed multimodal search engines and mathematical question answering systems. This monograph begins with a simple framework characterizing the information tasks that people and systems perform as we work to answer math-related questions. The framework is used to organize and relate the other core topics of the monograph, including interactions between people and systems, representing math formulas in sources, and evaluation. We close by addressing some key questions and presenting directions for future work. This monograph is intended for students, instructors, and researchers interested in systems that help us find and use mathematical information.
Richard Zanibbi, Behrooz Mansouri, Anurag Agarwal. "Mathematical Information Retrieval: Search and Question Answering." Foundations and Trends in Information Retrieval, published 2025-01-27. https://doi.org/10.1561/1500000095
Zhaochun Ren, Xiangnan He, Dawei Yin, Maarten de Rijke
Electronic commerce, or e-commerce, is the buying and selling of goods and services, or the transmitting of funds or data online. E-commerce platforms come in many kinds, with global players such as Amazon, Airbnb, Alibaba, Booking.com, eBay, and JD.com and platforms targeting specific geographic regions such as Bol.com and Flipkart.com. Information retrieval has a natural role to play in e-commerce, especially in connecting people to goods and services. Information discovery in e-commerce concerns different types of search (e.g., exploratory search vs. lookup tasks), recommender systems, and natural language processing in e-commerce portals. The rise in popularity of e-commerce sites has made research on information discovery in e-commerce an increasingly active research area. This is witnessed by an increase in publications and dedicated workshops in this space. Methods for information discovery in e-commerce largely focus on improving the effectiveness of e-commerce search and recommender systems, on enriching and using knowledge graphs to support e-commerce, and on developing innovative question answering and bot-based solutions that help to connect people to goods and services. In this survey, an overview is given of the fundamental infrastructure, algorithms, and technical solutions for information discovery in e-commerce. The topics covered include user behavior and profiling, search, recommendation, and language technology in e-commerce.
Zhaochun Ren, Xiangnan He, Dawei Yin, Maarten de Rijke. "Information Discovery in E-commerce." Foundations and Trends in Information Retrieval, published 2024-12-30. https://doi.org/10.1561/1500000097
Search engines play a crucial role in organizing and delivering information to billions of users worldwide. However, these systems often reflect and amplify existing societal biases and stereotypes through their search results and rankings. This concern has prompted researchers to investigate methods for measuring and reducing algorithmic bias, with the goal of developing more equitable search systems. This monograph presents a comprehensive taxonomy of fairness in search systems and surveys the current research landscape. We systematically examine how bias manifests across key search components, including query interpretation and processing, document representation and indexing, result ranking algorithms, and system evaluation metrics. By critically analyzing the existing literature, we identify persistent challenges and promising research directions in the pursuit of fairer search systems. Our aim is to provide a foundation for future work in this rapidly evolving field while highlighting opportunities to create more inclusive and equitable information retrieval technologies.
Yi Fang, Ashudeep Singh, Zhiqiang Tao. "Fairness in Search Systems." Foundations and Trends in Information Retrieval, published 2024-12-23. https://doi.org/10.1561/1500000101
Information access systems, such as search engines, recommender systems, and conversational assistants, have become integral to our daily lives as they help us satisfy our information needs. However, evaluating the effectiveness of these systems presents a long-standing and complex scientific challenge. This challenge is rooted in the difficulty of assessing a system’s overall effectiveness in assisting users to complete tasks through interactive support, and further exacerbated by the substantial variation in user behaviour and preferences. To address this challenge, user simulation emerges as a promising solution.
This monograph focuses on providing a thorough understanding of user simulation techniques designed specifically for evaluation purposes. We begin with background on information access system evaluation and explore the diverse applications of user simulation. Subsequently, we systematically review the major research progress in user simulation, covering general frameworks for designing user simulators, the use of user simulation for evaluation, and specific models and algorithms for simulating user interactions with search engines, recommender systems, and conversational assistants. Realizing that user simulation is an interdisciplinary research topic, whenever possible, we attempt to establish connections with related fields, including machine learning, dialogue systems, user modeling, and economics. We end the monograph with a broad discussion of important future research directions, many of which extend beyond the evaluation of information access systems and are expected to have broader impact on how to evaluate interactive intelligent systems in general.
Krisztian Balog, ChengXiang Zhai. "User Simulation for Evaluating Information Access Systems." Foundations and Trends in Information Retrieval, published 2024-06-12. https://doi.org/10.1561/1500000098
The task of Question Answering (QA) has attracted significant research interest for a long time. Its relevance to language understanding and knowledge retrieval tasks, along with the simple setting, makes the task of QA crucial for strong AI systems. Recent success on simple QA tasks has shifted the focus to more complex settings. Among these, Multi-Hop QA (MHQA) is one of the most researched tasks over recent years. In broad terms, MHQA is the task of answering natural language questions that involve extracting and combining multiple pieces of information and doing multiple steps of reasoning. An example of a multi-hop question would be “The Argentine PGA Championship record holder has won how many tournaments worldwide?” Answering the question would need two pieces of information: “Who is the record holder for Argentine PGA Championship tournaments?” and “How many tournaments did [Answer of Sub Q1] win?” The ability to answer multi-hop questions and perform multi-step reasoning can significantly improve the utility of NLP systems. Consequently, the field has seen a surge of high-quality datasets, models, and evaluation strategies. The notion of ‘multiple hops’ is somewhat abstract, which results in a large variety of tasks that require multi-hop reasoning. This leads to datasets and models that differ significantly from each other and make the field challenging to generalize and survey. We aim to provide a general and formal definition of the MHQA task, and to organize and summarize existing MHQA frameworks. We also outline some best practices for building MHQA datasets. This monograph provides a systematic and thorough introduction to, and a structuring of, the existing approaches to this highly interesting yet quite challenging task.
Vaibhav Mavi, Anubhav Jangra, Adam Jatowt. "Multi-hop Question Answering." Foundations and Trends in Information Retrieval, published 2024-06-12. https://doi.org/10.1561/1500000102
Hamed Zamani, Johanne R. Trippas, Jeff Dalton, Filip Radlinski
Conversational information seeking (CIS) is concerned with a sequence of interactions between one or more users and an information system. Interactions in CIS are primarily based on natural language dialogue, while they may include other types of interactions, such as click, touch, and body gestures. This monograph provides a thorough overview of CIS definitions, applications, interactions, interfaces, design, implementation, and evaluation. This monograph views CIS applications as including conversational search, conversational question answering, and conversational recommendation. Our aim is to provide an overview of past research related to CIS, introduce the current state-of-the-art in CIS, highlight the challenges still being faced in the community, and suggest future directions.
Conversational Information Seeking. Foundations and Trends in Information Retrieval, published 2023-08-02. DOI: 10.1561/1500000081 (https://doi.org/10.1561/1500000081)
This monograph offers a survey of work to date to inform how interactions in information retrieval systems could afford inclusion of users who are neurodiverse. This existing work is positioned within a range of philosophies, frameworks and epistemologies which frame the importance of including neurodiverse users in all stages of research and development of Interactive Information Retrieval (IIR) systems. The monograph also offers examples and practical approaches to include neurodiverse users in IIR research, and explores the challenges ahead in the field.
Perspectives of Neurodiverse Participants in Interactive Information Retrieval. Laurianne Sitbon, Gerd Berget, Margot Brereton. Foundations and Trends in Information Retrieval, published 2023-07-26. DOI: 10.1561/1500000086 (https://doi.org/10.1561/1500000086)