{"title":"凡事皆有代价:成本敏感型机器学习的基础及其在心理学中的应用。","authors":"Philipp Sterner, David Goretzko, Florian Pargent","doi":"10.1037/met0000586","DOIUrl":null,"url":null,"abstract":"<p><p>Psychology has seen an increase in the use of machine learning (ML) methods. In many applications, observations are classified into one of two groups (binary classification). Off-the-shelf classification algorithms assume that the costs of a misclassification (false positive or false negative) are equal. Because this is often not reasonable (e.g., in clinical psychology), cost-sensitive machine learning (CSL) methods can take different cost ratios into account. We present the mathematical foundations and introduce a taxonomy of the most commonly used CSL methods, before demonstrating their application and usefulness on psychological data, that is, the drug consumption data set (<i>N</i> = 1, 885) from the University of California Irvine ML Repository. In our example, all demonstrated CSL methods noticeably reduced mean misclassification costs compared to regular ML algorithms. We discuss the necessity for researchers to perform small benchmarks of CSL methods for their own practical application. Thus, our open materials provide R code, demonstrating how CSL methods can be applied within the mlr3 framework (https://osf.io/cvks7/). (PsycInfo Database Record (c) 2023 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6000,"publicationDate":"2023-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Everything has its price: Foundations of cost-sensitive machine learning and its application in psychology.\",\"authors\":\"Philipp Sterner, David Goretzko, Florian Pargent\",\"doi\":\"10.1037/met0000586\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Psychology has seen an increase in the use of machine learning (ML) methods. In many applications, observations are classified into one of two groups (binary classification). Off-the-shelf classification algorithms assume that the costs of a misclassification (false positive or false negative) are equal. Because this is often not reasonable (e.g., in clinical psychology), cost-sensitive machine learning (CSL) methods can take different cost ratios into account. We present the mathematical foundations and introduce a taxonomy of the most commonly used CSL methods, before demonstrating their application and usefulness on psychological data, that is, the drug consumption data set (<i>N</i> = 1, 885) from the University of California Irvine ML Repository. In our example, all demonstrated CSL methods noticeably reduced mean misclassification costs compared to regular ML algorithms. We discuss the necessity for researchers to perform small benchmarks of CSL methods for their own practical application. Thus, our open materials provide R code, demonstrating how CSL methods can be applied within the mlr3 framework (https://osf.io/cvks7/). (PsycInfo Database Record (c) 2023 APA, all rights reserved).</p>\",\"PeriodicalId\":20782,\"journal\":{\"name\":\"Psychological methods\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":7.6000,\"publicationDate\":\"2023-08-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Psychological methods\",\"FirstCategoryId\":\"102\",\"ListUrlMain\":\"https://doi.org/10.1037/met0000586\",\"RegionNum\":1,\"RegionCategory\":\"心理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"PSYCHOLOGY, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Psychological methods","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.1037/met0000586","RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHOLOGY, MULTIDISCIPLINARY","Score":null,"Total":0}
Everything has its price: Foundations of cost-sensitive machine learning and its application in psychology.
Psychology has seen an increase in the use of machine learning (ML) methods. In many applications, observations are classified into one of two groups (binary classification). Off-the-shelf classification algorithms assume that the costs of a misclassification (false positive or false negative) are equal. Because this is often not reasonable (e.g., in clinical psychology), cost-sensitive machine learning (CSL) methods can take different cost ratios into account. We present the mathematical foundations and introduce a taxonomy of the most commonly used CSL methods, before demonstrating their application and usefulness on psychological data, that is, the drug consumption data set (N = 1, 885) from the University of California Irvine ML Repository. In our example, all demonstrated CSL methods noticeably reduced mean misclassification costs compared to regular ML algorithms. We discuss the necessity for researchers to perform small benchmarks of CSL methods for their own practical application. Thus, our open materials provide R code, demonstrating how CSL methods can be applied within the mlr3 framework (https://osf.io/cvks7/). (PsycInfo Database Record (c) 2023 APA, all rights reserved).
期刊介绍:
Psychological Methods is devoted to the development and dissemination of methods for collecting, analyzing, understanding, and interpreting psychological data. Its purpose is the dissemination of innovations in research design, measurement, methodology, and quantitative and qualitative analysis to the psychological community; its further purpose is to promote effective communication about related substantive and methodological issues. The audience is expected to be diverse and to include those who develop new procedures, those who are responsible for undergraduate and graduate training in design, measurement, and statistics, as well as those who employ those procedures in research.