域外检测的度量学习和自适应边界

International Conference on Applications of Natural Language to Data Bases Pub Date : 2022-04-22 DOI:10.48550/arXiv.2204.10849

Petr Lorenc, Tommaso Gargiani, Jan Pichl, Jakub Konrád, Petro Marek, Ondrej Kobza, J. Sedivý

{"title":"域外检测的度量学习和自适应边界","authors":"Petr Lorenc, Tommaso Gargiani, Jan Pichl, Jakub Konrád, Petro Marek, Ondrej Kobza, J. Sedivý","doi":"10.48550/arXiv.2204.10849","DOIUrl":null,"url":null,"abstract":"Conversational agents are usually designed for closed-world environments. Unfortunately, users can behave unexpectedly. Based on the open-world environment, we often encounter the situation that the training and test data are sampled from different distributions. Then, data from different distributions are called out-of-domain (OOD). A robust conversational agent needs to react to these OOD utterances adequately. Thus, the importance of robust OOD detection is emphasized. Unfortunately, collecting OOD data is a challenging task. We have designed an OOD detection algorithm independent of OOD data that outperforms a wide range of current state-of-the-art algorithms on publicly available datasets. Our algorithm is based on a simple but efficient approach of combining metric learning with adaptive decision boundary. Furthermore, compared to other algorithms, we have found that our proposed algorithm has significantly improved OOD performance in a scenario with a lower number of classes while preserving the accuracy for in-domain (IND) classes.","PeriodicalId":136374,"journal":{"name":"International Conference on Applications of Natural Language to Data Bases","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Metric Learning and Adaptive Boundary for Out-of-Domain Detection\",\"authors\":\"Petr Lorenc, Tommaso Gargiani, Jan Pichl, Jakub Konrád, Petro Marek, Ondrej Kobza, J. Sedivý\",\"doi\":\"10.48550/arXiv.2204.10849\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Conversational agents are usually designed for closed-world environments. Unfortunately, users can behave unexpectedly. Based on the open-world environment, we often encounter the situation that the training and test data are sampled from different distributions. Then, data from different distributions are called out-of-domain (OOD). A robust conversational agent needs to react to these OOD utterances adequately. Thus, the importance of robust OOD detection is emphasized. Unfortunately, collecting OOD data is a challenging task. We have designed an OOD detection algorithm independent of OOD data that outperforms a wide range of current state-of-the-art algorithms on publicly available datasets. Our algorithm is based on a simple but efficient approach of combining metric learning with adaptive decision boundary. Furthermore, compared to other algorithms, we have found that our proposed algorithm has significantly improved OOD performance in a scenario with a lower number of classes while preserving the accuracy for in-domain (IND) classes.\",\"PeriodicalId\":136374,\"journal\":{\"name\":\"International Conference on Applications of Natural Language to Data Bases\",\"volume\":\"21 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-04-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Applications of Natural Language to Data Bases\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2204.10849\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Applications of Natural Language to Data Bases","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2204.10849","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

会话代理通常是为封闭环境设计的。不幸的是，用户的行为可能出乎意料。基于开放世界环境，我们经常会遇到训练数据和测试数据来自不同分布的情况。然后，来自不同分布的数据被称为域外(OOD)。一个健壮的会话代理需要对这些OOD话语做出充分的反应。因此，强调了鲁棒OOD检测的重要性。不幸的是，收集OOD数据是一项具有挑战性的任务。我们设计了一种独立于OOD数据的OOD检测算法，该算法在公开可用的数据集上优于当前各种最先进的算法。我们的算法基于一种简单而有效的方法，将度量学习与自适应决策边界相结合。此外，与其他算法相比，我们发现我们提出的算法在类数量较少的场景下显著提高了OOD性能，同时保持了域内(IND)类的准确性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Metric Learning and Adaptive Boundary for Out-of-Domain Detection

Conversational agents are usually designed for closed-world environments. Unfortunately, users can behave unexpectedly. Based on the open-world environment, we often encounter the situation that the training and test data are sampled from different distributions. Then, data from different distributions are called out-of-domain (OOD). A robust conversational agent needs to react to these OOD utterances adequately. Thus, the importance of robust OOD detection is emphasized. Unfortunately, collecting OOD data is a challenging task. We have designed an OOD detection algorithm independent of OOD data that outperforms a wide range of current state-of-the-art algorithms on publicly available datasets. Our algorithm is based on a simple but efficient approach of combining metric learning with adaptive decision boundary. Furthermore, compared to other algorithms, we have found that our proposed algorithm has significantly improved OOD performance in a scenario with a lower number of classes while preserving the accuracy for in-domain (IND) classes.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

International Conference on Applications of Natural Language to Data Bases

自引率

0.00%

发文量