Leveraging Out-of-the-Box Retrieval Models to Improve Mental Health Support

Proceedings of the International Conference on Health Informatics and Medical Application Technology. Pub Date: 2023-01-01. DOI: 10.5220/0011634300003414

Theo Rummer-Downing, Julie Weeds

Pages: 64-73

Citations: 0

Abstract


This work compares the performance of several information retrieval (IR) models in retrieving mental health documents that are relevant to forum post queries from a fully moderated online mental health service. Three architectures are assessed: a sparse lexical model, BM25, is used as a baseline, alongside two neural SBERT-based architectures, the bi-encoder and the cross-encoder. We highlight the credibility of using pretrained language models (PLMs) out-of-the-box, without an additional fine-tuning stage, to achieve high retrieval quality across a limited set of resources. Error analysis of the ranking results suggested that PLMs make errors on documents which contain so-called red herrings (words which are semantically related but irrelevant to the query), whereas human judgements were found to suffer when queries are vague and present no clear information need. Further, we show that bias towards an author’s writing style within a PLM affects retrieval quality and, therefore, can impact the success of mental health support if left unaddressed.
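To make the sparse lexical baseline concrete, the following is a minimal sketch of BM25 ranking in plain Python. It is not the authors' implementation: the tokenizer, the parameter values `k1=1.5` and `b=0.75`, the non-negative IDF variant, the function names, and the example query and documents are all assumptions chosen for illustration. The bi-encoder and cross-encoder are not sketched here because they depend on pretrained model weights.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric runs (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Return (doc_index, score) pairs sorted by descending BM25 score."""
    if not documents:
        return []
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Guard against a collection of empty documents.
    avg_len = (sum(len(toks) for toks in tokenized) / n_docs) or 1.0

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))

    query_terms = set(tokenize(query))
    scores = []
    for idx, toks in enumerate(tokenized):
        term_freq = Counter(toks)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            # Non-negative IDF variant (assumed here; other variants exist).
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalisation.
            norm = tf + k1 * (1.0 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1.0) / norm
        scores.append((idx, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical example data, not taken from the paper.
documents = [
    "Breathing exercises can help reduce anxiety before exams.",
    "Tips for building a regular sleep routine.",
    "How to talk to a friend about feeling low.",
]
ranking = bm25_rank("I feel anxiety about my exams", documents)
```

Because BM25 scores only exact term overlap, the third document in this toy example receives a non-zero score purely from the shared word "about", while "feel" in the query does not match "feeling". Neural models are typically used to address this kind of lexical mismatch.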


Source journal

Proceedings of the International Conference on Health Informatics and Medical Application Technology

Self-citation rate

0.00%

Number of articles published