深度提示调优密集通道检索

Proceedings of COLING. International Conference on Computational Linguistics Pub Date : 2022-08-24 DOI:10.48550/arXiv.2208.11503

Zhen-Quan Tang, Benyou Wang, Ting Yao

{"title":"深度提示调优密集通道检索","authors":"Zhen-Quan Tang, Benyou Wang, Ting Yao","doi":"10.48550/arXiv.2208.11503","DOIUrl":null,"url":null,"abstract":"Deep prompt tuning (DPT) has gained great success in most natural language processing (NLP) tasks. However, it is not well-investigated in dense retrieval where fine-tuning (FT) still dominates. When deploying multiple retrieval tasks using the same backbone model (e.g., RoBERTa), FT-based methods are unfriendly in terms of deployment cost: each new retrieval model needs to repeatedly deploy the backbone model without reuse. To reduce the deployment cost in such a scenario, this work investigates applying DPT in dense retrieval. The challenge is that directly applying DPT in dense retrieval largely underperforms FT methods. To compensate for the performance drop, we propose two model-agnostic and task-agnostic strategies for DPT-based retrievers, namely retrieval-oriented intermediate pretraining and unified negative mining, as a general approach that could be compatible with any pre-trained language model and retrieval task. The experimental results show that the proposed method (called DPTDR) outperforms previous state-of-the-art models on both MS-MARCO and Natural Questions. We also conduct ablation studies to examine the effectiveness of each strategy in DPTDR. We believe this work facilitates the industry, as it saves enormous efforts and costs of deployment and increases the utility of computing resources. Our code is available at https://github.com/tangzhy/DPTDR.","PeriodicalId":91381,"journal":{"name":"Proceedings of COLING. International Conference on Computational Linguistics","volume":"14 1","pages":"1193-1202"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"DPTDR: Deep Prompt Tuning for Dense Passage Retrieval\",\"authors\":\"Zhen-Quan Tang, Benyou Wang, Ting Yao\",\"doi\":\"10.48550/arXiv.2208.11503\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep prompt tuning (DPT) has gained great success in most natural language processing (NLP) tasks. However, it is not well-investigated in dense retrieval where fine-tuning (FT) still dominates. When deploying multiple retrieval tasks using the same backbone model (e.g., RoBERTa), FT-based methods are unfriendly in terms of deployment cost: each new retrieval model needs to repeatedly deploy the backbone model without reuse. To reduce the deployment cost in such a scenario, this work investigates applying DPT in dense retrieval. The challenge is that directly applying DPT in dense retrieval largely underperforms FT methods. To compensate for the performance drop, we propose two model-agnostic and task-agnostic strategies for DPT-based retrievers, namely retrieval-oriented intermediate pretraining and unified negative mining, as a general approach that could be compatible with any pre-trained language model and retrieval task. The experimental results show that the proposed method (called DPTDR) outperforms previous state-of-the-art models on both MS-MARCO and Natural Questions. We also conduct ablation studies to examine the effectiveness of each strategy in DPTDR. We believe this work facilitates the industry, as it saves enormous efforts and costs of deployment and increases the utility of computing resources. Our code is available at https://github.com/tangzhy/DPTDR.\",\"PeriodicalId\":91381,\"journal\":{\"name\":\"Proceedings of COLING. International Conference on Computational Linguistics\",\"volume\":\"14 1\",\"pages\":\"1193-1202\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-08-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of COLING. International Conference on Computational Linguistics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2208.11503\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of COLING. International Conference on Computational Linguistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2208.11503","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 7

摘要

深度提示调优(Deep prompt tuning, DPT)在大多数自然语言处理(NLP)任务中取得了巨大的成功。然而，在精细调整(FT)仍然占主导地位的密集检索中，尚未得到很好的研究。当使用相同的骨干模型(例如RoBERTa)部署多个检索任务时，基于ft的方法在部署成本方面是不友好的:每个新的检索模型都需要重复部署骨干模型，而不能重用。为了降低这种情况下的部署成本，本文研究了DPT在密集检索中的应用。挑战在于直接将DPT应用于密集检索在很大程度上不如FT方法。为了弥补性能的下降，我们提出了两种模型不可知和任务不可知的基于dpt的检索器策略，即面向检索的中间预训练和统一负挖掘，作为一种可以兼容任何预训练语言模型和检索任务的通用方法。实验结果表明，所提出的方法(称为DPTDR)在MS-MARCO和Natural Questions上都优于先前最先进的模型。我们还进行消融研究，以检查每种策略在DPTDR中的有效性。我们相信这项工作有利于整个行业，因为它节省了大量的工作和部署成本，并提高了计算资源的效用。我们的代码可在https://github.com/tangzhy/DPTDR上获得。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

DPTDR: Deep Prompt Tuning for Dense Passage Retrieval

Deep prompt tuning (DPT) has gained great success in most natural language processing (NLP) tasks. However, it is not well-investigated in dense retrieval where fine-tuning (FT) still dominates. When deploying multiple retrieval tasks using the same backbone model (e.g., RoBERTa), FT-based methods are unfriendly in terms of deployment cost: each new retrieval model needs to repeatedly deploy the backbone model without reuse. To reduce the deployment cost in such a scenario, this work investigates applying DPT in dense retrieval. The challenge is that directly applying DPT in dense retrieval largely underperforms FT methods. To compensate for the performance drop, we propose two model-agnostic and task-agnostic strategies for DPT-based retrievers, namely retrieval-oriented intermediate pretraining and unified negative mining, as a general approach that could be compatible with any pre-trained language model and retrieval task. The experimental results show that the proposed method (called DPTDR) outperforms previous state-of-the-art models on both MS-MARCO and Natural Questions. We also conduct ablation studies to examine the effectiveness of each strategy in DPTDR. We believe this work facilitates the industry, as it saves enormous efforts and costs of deployment and increases the utility of computing resources. Our code is available at https://github.com/tangzhy/DPTDR.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of COLING. International Conference on Computational Linguistics

自引率

0.00%

发文量