Pub Date: 2025-12-02 DOI: 10.3758/s13428-025-02891-0
Thomas W Nugent, Andrew J Zele
Lighting is routinely specified only by its impact on the three cone photoreceptors via the correlated color temperature (CCT), ignoring the visual and non-visual contributions of the melanopsin photoreceptors. Disentangling the behavioral effects of the CCT from those of the melanopsin excitation is complex but necessary to understand melanopsin's effects and to inform the design of new lighting spectra for the built environment. Melanopsin photoreception is important for driving many visual and non-visual functions in humans, including circadian rhythms, mood, attention, and arousal. Here, we introduce a methodology using a widely available LED source (Philips Hue Play, Signify N.V.) to decouple the effects of melanopsin from those of cone photoreceptors. We present a computational algorithm for producing two ambient illuminations with different melanopsin and rhodopsin activation levels, whilst maintaining the same cone excitations, CCT and visual appearance (i.e., the two lighting conditions are cone metamers); this simple and inexpensive method removes the major confounding factor present in approaches that alter the melanopsin excitation of a light by exchanging the wavelength, color, or CCT. The method may find applications in behavioral experiments, including for clinical trials.
{"title":"A method for setting the melanopsin and rhodopsin content in commercial LED sources to investigate the effects of ambient light on behavior.","authors":"Thomas W Nugent, Andrew J Zele","doi":"10.3758/s13428-025-02891-0","DOIUrl":"https://doi.org/10.3758/s13428-025-02891-0","url":null,"abstract":"<p><p>Lighting is routinely specified only by its impact on the three cone photoreceptors via the correlated color temperature (CCT), ignoring the visual and non-visual contributions of the melanopsin photoreceptors. Disentangling the behavioral effects of the CCT from those of the melanopsin excitation is complex but necessary to understand melanopsin's effects and to inform the design of new lighting spectra for the built environment. Melanopsin photoreception is important for driving many visual and non-visual functions in humans, including circadian rhythms, mood, attention, and arousal. Here, we introduce a methodology using a widely available LED source (Philips Hue Play, Signify N.V.) to decouple the effects of melanopsin from those of cone photoreceptors. We present a computational algorithm for producing two ambient illuminations with different melanopsin and rhodopsin activation levels, whilst maintaining the same cone excitations, CCT and visual appearance (i.e., the two lighting conditions are cone metamers); this simple and inexpensive method removes the major confounding factor present in approaches that alter the melanopsin excitation of a light by exchanging the wavelength, color, or CCT. The method may find applications in behavioral experiments, including for clinical trials.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 1","pages":"14"},"PeriodicalIF":3.9,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145660072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-01 DOI: 10.3758/s13428-025-02872-3
Eden Elbaz, Itay Yaron, Liad Mudrik
A major challenge in studying unconscious processing is to effectively suppress the critical stimulus while allowing maximal signal strength for adequate sensitivity to detect an effect, if it exists. A possible way to do this is to calibrate stimulus strength. While calibrating stimulus strength is common in psychophysics, current calibration methods are not designed to find the maximal intensity at which the stimulus can still be rendered unconscious (i.e., to find the upper subliminal threshold for each participant). Here, we demonstrate how calibration can be utilized to estimate, for each observer, this targeted threshold. We present a novel calibration procedure, the Subliminal Threshold Estimation Procedure (STEP), specifically designed for estimating the upper subliminal threshold for each individual. Using simulations, we showed that STEP outperforms existing calibration methods, which yielded strikingly low accuracy. We then further validated STEP using three empirical experiments. Together, these results establish STEP as highly beneficial for the study of unconscious processing.
{"title":"The Subliminal Threshold Estimation Procedure (STEP): A calibration method tailored for estimating subliminal thresholds.","authors":"Eden Elbaz, Itay Yaron, Liad Mudrik","doi":"10.3758/s13428-025-02872-3","DOIUrl":"10.3758/s13428-025-02872-3","url":null,"abstract":"<p><p>A major challenge in studying unconscious processing is to effectively suppress the critical stimulus while allowing maximal signal strength for adequate sensitivity to detect an effect, if it exists. A possible way to do this is to calibrate stimulus strength. While calibrating stimulus strength is common in psychophysics, current calibration methods are not designed to find the maximal intensity in which the stimulus can still be rendered unconscious (i.e., find the upper subliminal threshold for each participant). Here, we demonstrate how calibration can be utilized to estimate, for each observer, this targeted threshold. We present a novel calibration procedure: the Subliminal Threshold Estimation Procedure (STEP), specifically designed for estimating the upper subliminal threshold for each individual. Using simulations, we showed that STEP outperforms existing calibration methods, which yielded strikingly low accuracy. We then further validated STEP using three empirical experiments. Together, these results establish STEP as highly beneficial for the study of unconscious processing.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 1","pages":"13"},"PeriodicalIF":3.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12669343/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145653366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-01 DOI: 10.3758/s13428-025-02870-5
S A Bögemann, F Krause, A van Kraaij, M A Marciniak, J M van Leeuwen, J Weermeijer, J Mituniewicz, L M C Puhlmann, M Zerban, Z C Reppmann, D Kobylińska, K S L Yuen, B Kleim, H Walter, I Myin-Germeys, R Kalisch, I M Veer, K Roelofs, E J Hermans
Stress-related disorders present a significant global burden, highlighting the need for effective, preventive measures. Mobile just-in-time adaptive interventions (JITAI) can be applied in real time and context-specifically, precisely when individuals need them most. Yet, they are rarely applied in stress research. This study introduces a novel approach by performing real-time analysis of both psychological and physiological data to trigger interventions during moments of high stress. We evaluated the feasibility of this JITAI algorithm, which integrates ecological momentary assessments (EMA) and ecological physiological assessments (EPA) to generate a stress score that triggers interventions in real time by relating the score to a personalized stress threshold. The feasibility of the technical implementation, participant adherence, and user experience were assessed within a multicenter study with 215 participants conducted across five research sites. The JITAI algorithm successfully processed EMA and EPA data to trigger real-time interventions. A total of 68% (standard deviation [SD] = 29%) of EMA beeps contained extracted EPA features, demonstrating technical feasibility. The algorithm triggered 1.61 (SD = 1.26) interventions per day, with 43% (SD = 27%) of EMA beeps per week leading to triggered interventions. Compliance rates of 43% (SD = 22%) for EMA and 43% (SD = 30%) for the JITAI were achieved, with feedback indicating areas for improvement, particularly for daily-life integration. Our findings provide preliminary support for the feasibility of the developed JITAI algorithm, demonstrating effective data processing and intervention triggering in real time, while also highlighting areas for improvement. Future research should focus on minimizing participant burden, including the intensity of EMA protocols, to improve participant adherence and acceptability while maintaining the benefits of real-time intervention delivery.
{"title":"Triggering just-in-time adaptive interventions based on real-time detection of daily-life stress: Methodological development and longitudinal multicenter evaluation.","authors":"S A Bögemann, F Krause, A van Kraaij, M A Marciniak, J M van Leeuwen, J Weermeijer, J Mituniewicz, L M C Puhlmann, M Zerban, Z C Reppmann, D Kobylińska, K S L Yuen, B Kleim, H Walter, I Myin-Germeys, R Kalisch, I M Veer, K Roelofs, E J Hermans","doi":"10.3758/s13428-025-02870-5","DOIUrl":"10.3758/s13428-025-02870-5","url":null,"abstract":"<p><p>Stress-related disorders present a significant global burden, highlighting the need for effective, preventive measures. Mobile just-in-time adaptive interventions (JITAI) can be applied in real time and context-specifically, precisely when individuals need them most. Yet, they are rarely applied in stress research. This study introduces a novel approach by performing real-time analysis of both psychological and physiological data to trigger interventions during moments of high stress. We evaluated the feasibility of this JITAI algorithm, which integrates ecological momentary assessments (EMA) and ecological physiological assessments (EPA) to generate a stress score that triggers interventions in real time by relating the score to a personalized stress threshold. The feasibility of the technical implementation, participant adherence, and user experience were assessed within a multicenter study with 215 participants conducted across five research sites. The JITAI algorithm successfully processed EMA and EPA data to trigger real-time interventions. A total of 68% (standard deviation [SD] = 29%) of EMA beeps contained extracted EPA features, demonstrating technical feasibility. The algorithm triggered 1.61 (SD = 1.26) interventions per day, with 43% (SD = 27%) of EMA beeps per week leading to triggered interventions. Compliance rates of 43% (SD = 22%) for EMA and 43% (SD = 30%) for the JITAI were achieved, with feedback indicating areas for improvement, particularly for daily-life integration. Our findings provide preliminary support for the feasibility of the developed JITAI algorithm, demonstrating effective data processing and intervention triggering in real time, while also highlighting areas for improvement. Future research should focus on minimizing participant burden, including the intensity of EMA protocols, to improve participant adherence and acceptability while maintaining the benefits of real-time intervention delivery.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 1","pages":"12"},"PeriodicalIF":3.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145653283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-01 DOI: 10.3758/s13428-025-02876-z
Katrina May Dulay, Jelena Mirković, Margaret Mary Rosary Carmel Fua, Deeksha Prabhu, Sonali Nag
In this study, we present age-of-acquisition (AoA) ratings for 885 Kannada and Filipino words as a new resource for research and education purposes. Beyond this, we address methodological and theoretical considerations in measuring AoA in morphologically rich, specifically agglutinative, languages to study child language acquisition. Parents, teachers, and experts provided subjective ratings of when they thought a child acquired each word. Results were generally consistent between the two languages. Mixed-effects models demonstrated that word characteristics, including parts-of-speech category, word length, and age band of first occurrence in a print corpus, were significantly related to AoA ratings, whereas rater characteristics, including participant type, age, gender, and number of languages spoken, had generally non-significant associations with AoA ratings. The number of morphemes was significantly associated with AoA ratings in some analyses; however, crosslinguistic differences in the directionality of the relationships suggested the need to investigate underlying drivers of morphological complexity such as morpheme frequency, transparency/consistency, and function. The age-of-acquisition ratings were internally reliable and demonstrated consistency with the first occurrences of words in print and known trends in child language research. The results demonstrate the potential of these resources and open new directions for AoA research in morphologically rich languages.
{"title":"Measurement of age-of-acquisition in morphologically rich languages: Insights from Kannada and Filipino.","authors":"Katrina May Dulay, Jelena Mirković, Margaret Mary Rosary Carmel Fua, Deeksha Prabhu, Sonali Nag","doi":"10.3758/s13428-025-02876-z","DOIUrl":"10.3758/s13428-025-02876-z","url":null,"abstract":"<p><p>In this study, we present age-of-acquisition (AoA) ratings for 885 Kannada and Filipino words as a new resource for research and education purposes. Beyond this, we consider the methodological and theoretical considerations of measuring AoA in morphologically rich, specifically agglutinative, languages, to study child language acquisition. Parents, teachers, and experts provided subjective ratings of when they thought a child acquired each word. Results were generally consistent between the two languages. Mixed-effects models demonstrated that word characteristics, including parts-of-speech category, word length, and age band of first occurrence in a print corpus, were significantly related to AoA ratings, whereas rater characteristics, including participant type, age, gender, and number of languages spoken, had generally non-significant associations with AoA ratings. The number of morphemes was significantly associated with AoA ratings in some analyses; however, crosslinguistic differences in the directionality of the relationships suggested the need to investigate underlying drivers of morphological complexity such as morpheme frequency, transparency/consistency, and function. The age-of-acquisition ratings were internally reliable and demonstrated consistency with the first occurrences of words in print and known trends in child language research. The results demonstrate the potential of these resources and open new directions for AoA research in morphologically rich languages.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 1","pages":"11"},"PeriodicalIF":3.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12669312/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145653349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-25 DOI: 10.3758/s13428-025-02866-1
Caroline Kuhne, Quentin F Gronau, Reilly J Innes, Gavin Cooper, Niek Stevenson, Jon-Paul Cavallaro, Scott D Brown, Guy E Hawkins
Estimating quantitative cognitive models from data is a staple of modern psychological science, but it can be difficult and inefficient. Particle Metropolis within Gibbs (PMwG) is a robust and efficient sampling algorithm that supports model estimation in a hierarchical Bayesian framework. This tutorial shows how cognitive modeling can proceed efficiently using pmwg, a new open-source package for the R language. We step through implementing the pmwg package, from simple signal detection theory models to more complex cognitive models in which two tasks are jointly modeled. Through this process, we also address questions of model adequacy and model selection, which must be solved in order to answer meaningful psychological questions. PMwG, and the pmwg package, have the potential to move the field of psychology ahead in new and interesting directions, and to resolve questions that were once too hard to answer with previously available sampling methods.
{"title":"Hierarchical Bayesian estimation for cognitive models using Particle Metropolis within Gibbs (PMwG): A tutorial.","authors":"Caroline Kuhne, Quentin F Gronau, Reilly J Innes, Gavin Cooper, Niek Stevenson, Jon-Paul Cavallaro, Scott D Brown, Guy E Hawkins","doi":"10.3758/s13428-025-02866-1","DOIUrl":"https://doi.org/10.3758/s13428-025-02866-1","url":null,"abstract":"<p><p>Estimating quantitative cognitive models from data is a staple of modern psychological science, but can be difficult and inefficient. Particle Metropolis within Gibbs (PMwG) is a robust and efficient sampling algorithm that supports model estimation in a hierarchical Bayesian framework. This tutorial shows how cognitive modeling can proceed efficiently using pmwg, a new open-source package for the R language. We step through implementing the pmwg package with simple signal detection theory models, to more complex cognitive models in which two tasks are jointly modeled together. Through this process, we also address questions of model adequacy and model selection, which must be solved in order to answer meaningful psychological questions. PMwG, and the pmwg package, has the potential to move the field of psychology ahead in new and interesting directions, and to resolve questions that were once too hard to answer with previously available sampling methods.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 1","pages":"9"},"PeriodicalIF":3.9,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145601969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-24 DOI: 10.3758/s13428-025-02879-w
Jens H Fünderich, Lukas J Beinhauer, Frank Renkewitz
Data from rating scales are subject to very specific restrictions: they have a lower limit, an upper limit, and they consist of only a few integers. These characteristics produce particular dependencies between means and standard deviations. A non-integer mean, for example, can never be associated with zero variability, while a mean equal to one of the scale's limits can only be associated with zero variability. The relationship can be described by umbrella plots, for which we present a formalization. We use that formalization to explore implications for statistical power and for the relationship between heterogeneity in unstandardized and standardized effect sizes. The analysis illustrates that power is affected not only by the mean difference and sample size, but also by the position of a mean within the respective scale. Further, the umbrella restrictions of rating scales can impede the interpretability of meta-analytic heterogeneity. Estimates of relative heterogeneity can diverge between unstandardized and standardized effects, raising questions about which of the two patterns of heterogeneity we would want to explain (for example, through moderators). We reanalyze data from the Many Labs projects to illustrate the issue and finally discuss the implications of our observations as well as ways to utilize these properties of rating scales. To facilitate in-depth exploration and practical application of our formalization, we developed the Shiny Umbrellas app, which is publicly available at https://www.apps.meta-rep.lmu.de/shiny_umbrellas/.
{"title":"Under my umbrella: Rating scales obscure statistical power and effect size heterogeneity.","authors":"Jens H Fünderich, Lukas J Beinhauer, Frank Renkewitz","doi":"10.3758/s13428-025-02879-w","DOIUrl":"10.3758/s13428-025-02879-w","url":null,"abstract":"<p><p>Data from rating scales underlie very specific restrictions: They have a lower limit, an upper limit, and they only consist of a few integers. These characteristics produce particular dependencies between means and standard deviations. A mean that is a non-integer, for example, can never be associated with zero variability, while a mean equal to one of the scale's limits can only be associated with zero variability. The relationship can be described by umbrella plots for which we present a formalization. We use that formalization to explore implications for statistical power and for the relationship between heterogeneity in unstandardized and standardized effect sizes. The analysis illustrates that power is not only affected by the mean difference and sample size, but also by the position of a mean within the respective scale. Further, the umbrella restrictions of rating scales can impede interpretability of meta-analytic heterogeneity. Estimations of relative heterogeneity can diverge between unstandardized and standardized effects, raising questions about which of the two patterns of heterogeneity we would want to explain (for example, through moderators). We reanalyze data from the Many Labs projects to illustrate the issue and finally discuss the implications of our observations as well as ways to utilize these properties of rating scales. To facilitate in-depth exploration and practical application of our formalization, we developed the Shiny Umbrellas app, which is publicly available at https://www.apps.meta-rep.lmu.de/shiny_umbrellas/ .</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 1","pages":"5"},"PeriodicalIF":3.9,"publicationDate":"2025-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12644166/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145595525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-24 DOI: 10.3758/s13428-025-02875-0
Hao He, Yucheng Duan
This study explores expert-novice differences in anticipation under uncertainty by combining partially observable Markov decision process (POMDP) modeling with machine learning classification. Forty-eight participants (24 experts, 24 novices) completed a basketball pass/shot anticipation task. Through POMDP modeling, two cognitive parameters, sensory precision (SP) and prior belief (pB), were extracted to capture internal decision processes. Results showed that experts fit the POMDP model more closely, requiring more iterations for parameter convergence and achieving higher pseudo-R² values than novices. Experts demonstrated significantly higher SP, indicating superior ability to filter key cues under noisy conditions. Their pB values remained closer to neutral, suggesting flexible reliance on prior knowledge. In contrast, novices exhibited more biased priors and a lower, more dispersed SP. Machine learning analyses revealed that SP and pB jointly formed distinct clusters for experts and novices in a two-dimensional parameter space, with classification accuracies exceeding 90% across multiple methods. These findings indicate that expertise entails both enhanced perceptual precision and adaptive prior calibration, reflecting deeper cognitive reorganization rather than simple skill increments. Our dual-parameter approach offers a model-based perspective on expert cognition and may inform future research on the multifaceted nature of expertise.
{"title":"Beyond performance: A POMDP-based machine learning framework for expert cognition.","authors":"Hao He, Yucheng Duan","doi":"10.3758/s13428-025-02875-0","DOIUrl":"https://doi.org/10.3758/s13428-025-02875-0","url":null,"abstract":"<p><p>This study explores expert-novice differences in anticipation under uncertainty by combining partially observable Markov decision process (POMDP) modeling with machine learning classification. Forty-eight participants (24 experts, 24 novices) completed a basketball pass/shot anticipation task. Through POMDP modeling, two cognitive parameters-sensory precision (SP) and prior belief (pB)-were extracted to capture internal decision processes. Results showed that experts fit the POMDP model more closely, requiring more iterations for parameter convergence and achieving higher pseudo R<sup>2</sup> values than novices. Experts demonstrated significantly higher SP, indicating superior ability to filter key cues under noisy conditions. Their pB values remained closer to neutral, suggesting flexible reliance on prior knowledge. In contrast, novices exhibited more biased priors and a lower, more dispersed SP. Machine learning analyses revealed that SP and pB jointly formed distinct clusters for experts and novices in a two-dimensional parameter space, with classification accuracies exceeding 90% across multiple methods. These findings indicate that expertise entails both enhanced perceptual precision and adaptive prior calibration, reflecting deeper cognitive reorganization rather than simple skill increments. Our dual-parameter approach offers a model-based perspective on expert cognition and may inform future research on the multifaceted nature of expertise.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 1","pages":"6"},"PeriodicalIF":3.9,"publicationDate":"2025-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145595568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-24 DOI: 10.3758/s13428-025-02901-1
Madeline Jarvis, Adam Vasarhelyi, Joe Anderson, Caitlyn Mulley, Ottmar V Lipp, Luke J Ney
The measurement of pupil size has become a topic of interest in psychology research over the past two decades due to its sensitivity to psychological processes such as arousal or cognitive load. However, pupil measurements have been limited by the necessity of conducting experiments in laboratory settings using high-quality and costly equipment. The current article describes the development and use of a jsPsych plugin and extension that incorporates existing software (mEye) for estimating pupil size using consumer-grade hardware, such as a webcam. We validated this new program (js-mEye) across two separate studies, each of which manipulated screen luminance and color using a novel luminance task, as well as different levels of cognitive load using the N-back and Stroop tasks. Changes in luminance and color produced significant changes in pupil size in the hypothesized direction. Changes in cognitive load induced in the N-back and Stroop tasks produced less clear findings; however, these findings were explained to some extent when participant engagement, indexed by task performance, was controlled for. Most importantly, all data were at least moderately correlated with data simultaneously recorded using an EyeLink 1000, suggesting that mEye was able to effectively substitute for a gold-standard eye-tracking device. This work presents an exciting future direction for pupillometry and, with further validation, may provide a platform for measuring pupil size in online research studies, as well as in laboratory-based experiments that require minimal equipment.
{"title":"js-mEye: An extension and plugin for the measurement of pupil size in the online platform jsPsych.","authors":"Madeline Jarvis, Adam Vasarhelyi, Joe Anderson, Caitlyn Mulley, Ottmar V Lipp, Luke J Ney","doi":"10.3758/s13428-025-02901-1","DOIUrl":"https://doi.org/10.3758/s13428-025-02901-1","url":null,"abstract":"<p><p>The measurement of pupil size has become a topic of interest in psychology research over the past two decades due to its sensitivity to psychological processes such as arousal or cognitive load. However, pupil measurements have been limited by the necessity to conduct experiments in laboratory settings using high-quality and costly equipment. The current article describes the development and use of a jsPsych plugin and extension that incorporates an existing software that estimates pupil size using consumer-grade hardware, such as a webcam. We validated this new program (js-mEye) across two separate studies, which each manipulated screen luminance and color using a novel luminance task, as well as different levels of cognitive load using the N-back and the Stroop tasks. Changes in luminance and color produced significant changes in pupil size in the hypothesized direction. Changes in cognitive load induced in the N-back and Stroop tasks produced less clear findings; however, these findings were explained to some extent when participant engagement - indexed by task performance - was controlled for. Most importantly, all data were at least moderately correlated with data simultaneously recorded using an EyeLink 1000, suggesting that mEye was able to effectively substitute for a gold-standard eye-tracking device. This work presents an exciting future direction for pupillometry and, with further validation, may present a platform for measuring pupil size in online research studies, as well as in laboratory-based experiments that require minimal equipment.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 1","pages":"8"},"PeriodicalIF":3.9,"publicationDate":"2025-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145595586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-24 DOI: 10.3758/s13428-025-02865-2
Pablo Martínez-López, Antonio Vázquez-Millán, Francisco Garre-Frutos, David Luque
Animal research has shown that repeatedly performing a rewarded action leads to its transition into a habit: an inflexible response controlled by stimulus-response associations. Efforts to reproduce this principle in humans have yielded mixed results. Only two laboratory paradigms have demonstrated behavioral habitualization following extensive instrumental training compared to minimal training: the forced-response task and the "aliens" outcome-devaluation task. These paradigms assess habitualization through distinct measures. The forced-response task focuses on the persistence of a trained response when a reversal is required, whereas the outcome-devaluation task measures reaction time switch costs, that is, slowdowns in goal-directed responses that conflict with the trained habit. Although both measures have produced results consistent with learning theory (showing stronger evidence of habits in overtrained conditions), their construct validity remains insufficiently established. In this study, participants completed 4 days of training in each paradigm. We replicated previous results in the forced-response task; in the outcome-devaluation task, a similar pattern emerged, with the loss of a response speed advantage gained through training. We then examined the reliability of each measure and evaluated their convergent validity. Habitual responses in the forced-response task and reaction time switch costs in the outcome-devaluation task demonstrated good reliability, allowing us to assess whether individual differences remained stable. However, the two measures were not associated, providing no evidence of convergent validity. This suggests that these measures capture distinct aspects of the balance between habitual and goal-directed control. Our results highlight the need for further evaluation of the validity and reliability of current measures of habitual control in humans.
{"title":"Assessing the validity evidence for habit measures based on time pressure.","authors":"Pablo Martínez-López, Antonio Vázquez-Millán, Francisco Garre-Frutos, David Luque","doi":"10.3758/s13428-025-02865-2","DOIUrl":"https://doi.org/10.3758/s13428-025-02865-2","url":null,"abstract":"<p><p>Animal research has shown that repeatedly performing a rewarded action leads to its transition into a habit-an inflexible response controlled by stimulus-response associations. Efforts to reproduce this principle in humans have yielded mixed results. Only two laboratory paradigms have demonstrated behavior habitualization following extensive instrumental training compared to minimal training: the forced-response task and the \"aliens\" outcome-devaluation task. These paradigms assess habitualization through distinct measures. The forced-response task focuses on the persistence of a trained response when a reversal is required, whereas the outcome-devaluation task measures reaction time switch costs-slowdowns in goal-directed responses conflicting with the trained habit. Although both measures have produced results consistent with the learning theory-showing stronger evidence of habits in overtrained conditions-their construct validity remains insufficiently established. In this study, participants completed 4 days of training in each paradigm. We replicated previous results in the forced-response task; in the outcome-devaluation task, a similar pattern emerged, observing the loss of a response speed advantage gained through training. We then examined the reliability of each measure and evaluated their convergent validity. Habitual responses in the forced-response task and reaction time switch costs in the outcome-devaluation task demonstrated good reliability, allowing us to assess whether individual differences remained stable. However, the two measures were not associated, providing no evidence of convergent validity. This suggests that these measures capture distinct aspects of the balance between habitual and goal-directed control. Our results highlight the need for further evaluation of the validity and reliability of current measures of habitual control in humans.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 1","pages":"7"},"PeriodicalIF":3.9,"publicationDate":"2025-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145595565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}