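Are automated video interviews smart enough? Behavioral modes, reliability, validity, and bias of machine learning cognitive ability assessments.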

IF 9.4; Zone 1 (Psychology); Q1 (MANAGEMENT); Journal of Applied Psychology; Pub Date: 2024-09-26; DOI: 10.1037/apl0001236
Louis Hickman, Louis Tay, Sang Eun Woo
Citations: 0

Abstract

Automated video interviews (AVIs) that use machine learning (ML) algorithms to assess interviewees are increasingly popular. Extending prior AVI research focusing on noncognitive constructs, the present study critically evaluates the possibility of assessing cognitive ability with AVIs. By developing and examining AVI ML models trained to predict measures of three cognitive ability constructs (i.e., general mental ability, verbal ability, and intellect [as observed at zero acquaintance]), this research contributes to the literature in several ways. First, it advances our understanding of how cognitive abilities relate to interviewee behavior. Specifically, we found that verbal behaviors best predicted interviewee cognitive abilities, while neither paraverbal nor nonverbal behaviors provided incremental validity, suggesting that only verbal behaviors should be used to assess cognitive abilities. Second, across two samples of mock video interviews, we extensively evaluated the psychometric properties of the verbal behavior AVI ML model scores, including their reliability (internal consistency across interview questions and test-retest), validity (relationships with other variables and content), and fairness and bias (measurement and predictive). Overall, the general mental ability, verbal ability, and intellect AVI models captured similar behavioral manifestations of cognitive ability. Validity evidence results were mixed: For example, AVIs trained on observer-rated intellect exhibited superior convergent and criterion relationships (compared to the observer ratings they were trained to model) but had limited discriminant validity evidence. Our findings illustrate the importance of examining psychometric properties beyond convergence with the test that ML algorithms are trained to model. We provide recommendations for enhancing discriminant validity evidence in future AVIs. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
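
The abstract refers to reliability as internal consistency across interview questions. One widely used index of internal consistency is Cronbach's alpha; the abstract does not say which coefficient the authors computed, so the sketch below is only an illustration under that assumption. The function name `cronbach_alpha` and the demo scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: list of rows, one row per interviewee; each row holds one
    model score per interview question (hypothetical layout).
    """
    n_people = len(scores)
    n_items = len(scores[0])
    if n_people < 2 or n_items < 2:
        raise ValueError("need at least 2 interviewees and 2 questions")

    # Sample variance of each question's scores across interviewees.
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Sample variance of each interviewee's summed score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Made-up scores: 4 interviewees x 3 interview questions.
    demo = [[3, 4, 3], [5, 5, 4], [2, 3, 2], [4, 4, 5]]
    print(round(cronbach_alpha(demo), 3))
```

Values closer to 1 indicate that scores from different questions rank interviewees similarly. Test-retest reliability, also mentioned in the abstract, would instead compare scores for the same interviewees across two occasions.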
Source journal
CiteScore: 17.60
Self-citation rate: 6.10%
Articles published: 175
About the journal: The Journal of Applied Psychology® focuses on publishing original investigations that contribute new knowledge and understanding to fields of applied psychology (excluding clinical and applied experimental or human factors, which are better suited for other APA journals). The journal primarily considers empirical and theoretical investigations that enhance understanding of cognitive, motivational, affective, and behavioral psychological phenomena in work and organizational settings. These phenomena can occur at individual, group, organizational, or cultural levels, and in various work settings such as business, education, training, health, service, government, or military institutions. The journal welcomes submissions from both public and private sector organizations, for-profit or nonprofit. It publishes several types of articles, including:
1. Rigorously conducted empirical investigations that expand conceptual understanding (original investigations or meta-analyses).
2. Theory development articles and integrative conceptual reviews that synthesize literature and generate new theories on psychological phenomena to stimulate novel research.
3. Rigorously conducted qualitative research on phenomena that are challenging to capture with quantitative methods or require inductive theory building.
Latest articles from the journal
Prospects for reducing group mean differences on cognitive tests via item selection strategies.
Self-promotion in entrepreneurship: A driver for proactive adaptation.
Coping with work-nonwork stressors over time: A person-centered, multistudy integration of coping breadth and depth.
A person-centered approach to behaving badly at work: An examination of workplace deviance patterns.
How perceived lack of benevolence harms trust of artificial intelligence management.