
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval: Latest Articles

Health Cards for Consumer Health Search
Jimmy, G. Zuccon, B. Koopman, Gianluca Demartini
This paper investigates the impact of health cards in consumer health search (CHS), that is, people seeking health advice online. Health cards are concise presentations of a health concept shown alongside search results for specific health queries; they have the potential to convey health information in an easily digestible form for the general public. However, little evidence exists on how effective health cards actually are for users searching for health advice online, and whether their effectiveness is limited to specific health search intents. To understand the impact of health cards on CHS, we conducted a laboratory study to observe users completing CHS tasks using two search interface variants: one with result snippets only and one containing both result snippets and health cards. Our study makes the following contributions: (1) it reveals how and when health cards are beneficial to users in completing consumer health search tasks, and (2) it identifies the features of health cards that helped users complete their tasks. This is the first study that thoroughly investigates the effectiveness of health cards in supporting consumer health search.
DOI: https://doi.org/10.1145/3331184.3331194 (published 2019-07-18)
Citations: 14
Session details: Session 7A: Relevance and Evaluation 1
M. Sanderson
DOI: https://doi.org/10.1145/3349691 (published 2019-07-18)
Citations: 0
Third International Workshop on Recent Trends in News Information Retrieval (NewsIR'19)
M. Albakour, Miguel Martinez, S. Tippmann, Ahmet Aker, J. Stray, Shiri Dori-Hacohen, Alberto Barrón-Cedeño
The journalism industry has undergone a revolution in the past decade, leading to new opportunities as well as challenges. News consumption, production and delivery have all been affected and transformed by technology. Readers require new mechanisms to cope with the vast volume of information in order to be informed about news events. Reporters have begun to use natural language processing (NLP) and information retrieval (IR) techniques for investigative work. Publishers and aggregators are seeking new business models, and new ways to reach and retain their audience. A shift in business models has led to a gradual shift in styles of journalism in attempts to increase page views; and, far more concerning, to real mis- and dis-information, alongside allegations of "fake news" threatening the journalistic freedom and integrity of legitimate news outlets. Social media platforms drive viewership, creating filter bubbles and an increasingly polarized readership. News documents have always been a part of research on information access and retrieval methods. Over the last few years, the IR community has increasingly recognized these challenges in journalism and opened a conversation about how we might begin to address them. Evidence of this recognition is the participation in the two previous editions of our NewsIR workshop, held at ECIR 2016 and 2018. One of the most important outcomes of those workshops is an increasing awareness in the community about the changing nature of journalism and the IR challenges it entails. To move yet another step forward, the goal of the third edition of our workshop will be to create a multidisciplinary venue that brings together news experts from both technology and journalism. This would take NewsIR from a European forum targeting mainly IR researchers into a more inclusive and influential international forum. We hope that this new format will foster further understanding for both news professionals and IR researchers, and produce better outcomes for news consumers. We will address the possibilities and challenges that technology offers to journalists, the challenges that new developments in journalism create for IR researchers, and the complexity of information access tasks for news readers.
DOI: https://doi.org/10.1145/3331184.3331646 (published 2019-07-18)
Citations: 1
Session details: Session 2C: Knowledge and Entities
A. D. Vries
DOI: https://doi.org/10.1145/3349680 (published 2019-07-18)
Citations: 0
Fast Approximate Filtering of Search Results Sorted by Attribute
F. M. Nardini, Roberto Trani, Rossano Venturini
Several Web search services let their users sort the list of results by a specific attribute, e.g., sort "by price" in e-commerce. However, sorting the results by attribute could bring marginally relevant results into the top positions, thus leading to a poor user experience. This motivates the definition of the relevance-aware filtering problem, which asks to remove results from the attribute-sorted list so as to maximize its final overall relevance. Recently, an optimal solution to this problem has been proposed. However, it has strong limitations in the Web scenario due to its high computational cost. In this paper, we propose ϵ-Filtering: an efficient approximate algorithm with strong approximation guarantees on the relevance of the final list. More precisely, given an allowed approximation error ϵ, the proposed algorithm finds a (1-ϵ)-optimal filtering, i.e., the relevance of its solution is at least (1-ϵ) times the optimum. We conduct a comprehensive evaluation of ϵ-Filtering against state-of-the-art competitors on two real-world public datasets. Experiments show that ϵ-Filtering achieves the desired levels of effectiveness with a speedup of up to two orders of magnitude with respect to the optimal solution while guaranteeing very small approximation errors.
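The (1-ϵ) guarantee stated in the abstract can be written out as a short formula. The symbols below are assumptions introduced only for illustration: L is the attribute-sorted list, F(L) is the set of lists obtainable from L by removing results, and rel(·) is the overall relevance of a list. The abstract does not say how overall relevance is computed, so rel is left abstract here.

```latex
% Assumed notation (not from the abstract): L, \mathcal{F}(L), \mathrm{rel}, S^{*}, S_{\epsilon}
S^{*} = \operatorname*{arg\,max}_{S \in \mathcal{F}(L)} \mathrm{rel}(S),
\qquad
\mathrm{rel}(S_{\epsilon}) \;\ge\; (1-\epsilon)\,\mathrm{rel}(S^{*}),
\qquad 0 < \epsilon < 1 .
```

For instance, with ϵ = 0.01 the filtered list returned by the approximate algorithm would retain at least 99% of the relevance of the optimal filtering.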
DOI: https://doi.org/10.1145/3331184.3331227 (published 2019-07-18)
Citations: 3
Learning from Fact-checkers: Analysis and Generation of Fact-checking Language
Nguyen Vo, Kyumin Lee
In fighting against fake news, many fact-checking systems comprising human-based fact-checking sites (e.g., snopes.com and politifact.com) and automatic detection systems have been developed in recent years. However, online users still keep sharing fake news even when it has been debunked. This means that early fake news detection may be insufficient and that we need another complementary approach to mitigate the spread of misinformation. In this paper, we introduce a novel application of text generation for combating fake news. In particular, we (1) leverage online users named fact-checkers, who cite fact-checking sites as credible evidence to fact-check information in public discourse; (2) analyze linguistic characteristics of fact-checking tweets; and (3) propose and build a deep learning framework to generate responses with fact-checking intention to increase the fact-checkers' engagement in fact-checking activities. Our analysis reveals that the fact-checkers tend to refute misinformation and use formal language (e.g., few swear words and little Internet slang). Our framework successfully generates relevant responses, and outperforms competing models by achieving up to 30% improvements. Our qualitative study also confirms the superiority of our generated responses over responses generated from the existing models.
DOI: https://doi.org/10.1145/3331184.3331248 (published 2019-07-18)
Citations: 51
Web Table Extraction, Retrieval and Augmentation
Shuo Zhang, K. Balog
This tutorial synthesizes and presents research on web tables over the past two decades. We group the tasks into six main categories of information access tasks: (i) table extraction, (ii) table interpretation, (iii) table search, (iv) question answering on tables, (v) knowledge base augmentation, and (vi) table completion. For each category, we identify and introduce seminal approaches, present relevant resources, and point out interdependencies among the different tasks.
DOI: https://doi.org/10.1145/3331184.3331385 (published 2019-07-18)
Citations: 36
Investigating Passage-level Relevance and Its Role in Document-level Relevance Judgment
Zhijing Wu, Jiaxin Mao, Yiqun Liu, Min Zhang, Shaoping Ma
Understanding the process of relevance judgment helps to inspire the design of retrieval models. Traditional retrieval models usually estimate relevance based on document-level signals. Recent works consider more fine-grained, passage-level relevance information, which can further enhance retrieval performance. However, a detailed analysis of how passage-level relevance signals determine or influence the relevance judgment of the whole document is still lacking. To investigate the role of passage-level relevance in the document-level relevance judgment, we construct an ad-hoc retrieval dataset with both passage-level and document-level relevance labels. A thorough analysis reveals that: 1) there is a strong correlation between the document-level relevance and the fractions of irrelevant passages to highly relevant passages; 2) the position, length and query similarity of passages play different roles in the determination of document-level relevance; 3) the sequential passage-level relevance within a document is a potential indicator for the document-level relevance. Based on the relationship between passage-level and document-level relevance, we also show that utilizing passage-level relevance signals can improve existing document ranking models. This study helps us better understand how users perceive relevance for a document and inspires the design of novel ranking models leveraging fine-grained, passage-level relevance signals.
DOI: https://doi.org/10.1145/3331184.3331233 (published 2019-07-18)
Citations: 24
Which Diversity Evaluation Measures Are "Good"?
T. Sakai, Zhaohao Zeng
This study evaluates 30 IR evaluation measures or their instances, of which nine are for adhoc IR and 21 are for diversified IR, primarily from the viewpoint of whether their preferences of one SERP (search engine result page) over another actually align with users' preferences. The gold preferences were constructed by hiring 15 assessors, who independently examined 1,127 SERP pairs and made preference assessments. Two sets of preference assessments were obtained: one based on a relevance question, "Which SERP is more relevant?", and the other based on a diversity question, "Which SERP is likely to satisfy a higher number of users?" To our knowledge, our study is the first to have collected diversity preference assessments in this way and evaluated diversity measures successfully. Our main results are that (a) popular adhoc IR measures such as nDCG actually align quite well with the gold relevance preferences; and that (b) while the ♯-measures align well with the gold diversity preferences, intent-aware measures perform relatively poorly. Moreover, as by-products of our analysis of existing evaluation measures, we define new adhoc measures called iRBU (intentwise Rank-Biased Utility) and EBR (Expected Blended Ratio); we demonstrate that an instance of iRBU performs as well as nDCG when compared to the gold relevance preferences. On the other hand, the original RBU, a recently proposed diversity measure, underperforms the best ♯-measures when compared to the gold diversity preferences.
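The idea of checking whether a measure's preference between two SERPs agrees with a gold preference can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: it uses a common nDCG variant (linear gains, log2 discount), and the function names, the tie handling, and the simple agreement rate are all hypothetical choices, since the abstract does not specify them.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with linear gains and a log2 rank discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains, judged_gains, cutoff=10):
    """nDCG@cutoff: DCG of the SERP divided by the DCG of an ideal ordering."""
    ideal = dcg(sorted(judged_gains, reverse=True)[:cutoff])
    return dcg(gains[:cutoff]) / ideal if ideal > 0 else 0.0


def measure_prefers(serp_a, serp_b, judged_gains, cutoff=10):
    """Return "A", "B", or "tie" depending on which SERP nDCG scores higher."""
    a = ndcg(serp_a, judged_gains, cutoff)
    b = ndcg(serp_b, judged_gains, cutoff)
    if math.isclose(a, b):
        return "tie"  # assumption: equal scores count as a tie
    return "A" if a > b else "B"


def agreement(measure_prefs, gold_prefs):
    """Fraction of SERP pairs where the measure matches the gold preference."""
    if len(measure_prefs) != len(gold_prefs):
        raise ValueError("preference lists must have the same length")
    if not gold_prefs:
        return 0.0
    matches = sum(m == g for m, g in zip(measure_prefs, gold_prefs))
    return matches / len(gold_prefs)
```

Each SERP is represented here as a list of per-rank gain values, and `judged_gains` holds the gains of all judged documents for the topic, from which the ideal ordering is built.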
DOI: https://doi.org/10.1145/3331184.3331215 (published 2019-07-18)
Citations: 36
Scalable Deep Multimodal Learning for Cross-Modal Retrieval
Peng Hu, Liangli Zhen, Dezhong Peng, Pei Liu
Cross-modal retrieval takes one type of data as the query to retrieve relevant data of another type. Most existing cross-modal retrieval approaches learn a common subspace in a joint manner, where the data from all modalities have to be involved during the whole training process. For these approaches, the optimal parameters of different modality-specific transformations depend on each other, and the whole model has to be retrained when handling samples from new modalities. In this paper, we present a novel cross-modal retrieval method, called Scalable Deep Multimodal Learning (SDML). It predefines a common subspace in which the between-class variation is maximized while the within-class variation is minimized. Then, it trains m modality-specific networks for m modalities (one network for each modality) to transform the multimodal data into the predefined common subspace to achieve multimodal learning. Unlike many of the existing methods, our method can train different modality-specific networks independently and thus scales with the number of modalities. To the best of our knowledge, the proposed SDML could be one of the first works to independently project data of an unfixed number of modalities into a predefined common subspace. Comprehensive experimental results on four widely-used benchmark datasets demonstrate that the proposed method is effective and efficient in multimodal learning and outperforms the state-of-the-art methods in cross-modal retrieval.
DOI: https://doi.org/10.1145/3331184.3331213
Published: 2019-07-18
Citations: 76