Estimating classification consistency of machine learning models for screening measures.
Oscar Gonzalez, A. R. Georgeson, William E. Pelham
Psychological Assessment, 36(6-7), 395-406. Published 2024-06-01.
DOI: 10.1037/pas0001313 (https://doi.org/10.1037/pas0001313)
Citations: 0
Abstract
This article illustrates novel quantitative methods to estimate classification consistency in machine learning models used for screening measures. Screening measures are used in psychology and medicine to classify individuals into diagnostic classifications. In addition to achieving high accuracy, it is ideal for the screening process to have high classification consistency, which means that respondents would be classified into the same group every time if the assessment were repeated. Although machine learning models are increasingly being used to predict a screening classification based on individual item responses, methods to describe the classification consistency of machine learning models have not yet been developed. This article addresses this gap by describing methods to estimate classification inconsistency in machine learning models arising from two different sources: sampling error during model fitting and measurement error in the item responses. These methods use data resampling techniques such as the bootstrap and Monte Carlo sampling. These methods are illustrated using three empirical examples predicting a health condition/diagnosis from item responses. R code is provided to facilitate the implementation of the methods. This article highlights the importance of considering classification consistency alongside accuracy when studying screening measures and provides the tools and guidance necessary for applied researchers to obtain classification consistency indices in their machine learning research on diagnostic assessments. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
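The two sources of inconsistency described in the abstract lend themselves to a short illustration. The article itself provides R code, which is not reproduced here; the following is a hypothetical Python sketch using a stand-in nearest-centroid classifier and simulated data. Bootstrap resampling of respondents captures sampling error during model fitting, while Monte Carlo perturbation of the item responses (with an assumed error standard deviation of 0.3) captures measurement error. All variable names, the classifier, and the per-respondent consistency index (agreement with the modal classification across refits) are illustrative assumptions, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 300, 8

# Simulated item responses and a binary screening classification.
X = rng.normal(size=(n, p))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 0).astype(int)

def fit_centroids(X, y):
    """Stand-in 'model': the centroid of each diagnostic class."""
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def predict(X, c0, c1):
    """Classify each respondent by the nearer class centroid."""
    d0 = ((X - c0) ** 2).sum(axis=1)
    d1 = ((X - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)

n_rep = 200

# Source 1: sampling error -- refit the model on bootstrap resamples
# of respondents, then classify every respondent with each refit.
boot = np.empty((n_rep, n), dtype=int)
for b in range(n_rep):
    idx = rng.integers(0, n, n)              # bootstrap resample
    c0, c1 = fit_centroids(X[idx], y[idx])   # refit on resampled data
    boot[b] = predict(X, c0, c1)

# Source 2: measurement error -- fix the model fit on the observed
# data and re-classify Monte Carlo perturbations of item responses.
c0, c1 = fit_centroids(X, y)
mc = np.empty((n_rep, n), dtype=int)
for m in range(n_rep):
    X_noisy = X + rng.normal(scale=0.3, size=X.shape)  # assumed error SD
    mc[m] = predict(X_noisy, c0, c1)

# Per-respondent consistency: proportion of bootstrap refits agreeing
# with that respondent's modal (majority) classification.
modal = (boot.mean(axis=0) >= 0.5).astype(int)
consistency = (boot == modal).mean(axis=0)
```

A `consistency` value near 1.0 means a respondent would receive the same classification under nearly every refit, while values near 0.5 flag respondents whose classification is unstable; the measurement-error replications in `mc` can be summarized the same way.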
Journal description:
Psychological Assessment is concerned mainly with empirical research on measurement and evaluation relevant to the broad field of clinical psychology. Submissions are welcome in the areas of assessment processes and methods. Included are:
- clinical judgment and the application of decision-making models
- paradigms derived from basic psychological research in cognition, personality–social psychology, and biological psychology
- development, validation, and application of assessment instruments, observational methods, and interviews