Pub Date: 2025-11-18 | DOI: 10.3758/s13428-025-02899-6
Correction: SUBTLEX-AR: Arabic word distributional characteristics based on movie subtitles.
Sami Boudelaa, Manuel Carreiras, Nazrin Jariya, Manuel Perea
{"title":"Correction: SUBTLEX-AR: Arabic word distributional characteristics based on movie subtitles.","authors":"Sami Boudelaa, Manuel Carreiras, Nazrin Jariya, Manuel Perea","doi":"10.3758/s13428-025-02899-6","DOIUrl":"10.3758/s13428-025-02899-6","url":null,"abstract":"","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"349"},"PeriodicalIF":3.9,"publicationDate":"2025-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145547756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-18 | DOI: 10.3758/s13428-025-02831-y
Cross-cultural adaptation of the Language and Social Background Questionnaire: Psychometric properties emerging from the Persian version.
Mehri Maleki, Fatemeh Jahanjoo, Samin Shibafar, Gelavizh Karimijavan, Mohammad Hassan Torabi, Farnoush Jarollahi
The self-reported Language and Social Background Questionnaire (LSBQ) measures an individual's language proficiency and usage quantitatively. This cross-sectional study aimed to evaluate the psychometric properties of the LSBQ in the Persian (Farsi) language. A total of 325 adults aged between 15 and 59 years (mean age = 21.00 years, SD = 3.56; 251 females, 70 males) from Tabriz and Tehran participated in the study. Exploratory factor analysis (EFA) was employed to evaluate the questionnaire's factor structure, and its psychometric properties were assessed through various validity measures, reliability analysis, and receiver operating characteristic (ROC) curve analysis. The overall content validity ratio for the questionnaire was 0.98, with an impact score of 4.47. The internal consistency of the scale was satisfactory, with a Cronbach's alpha of 0.707. The EFA identified five key factors: "dominant language at home and community," "non-Persian use," "non-Persian proficiency," "Persian comprehension," and "switching." Using Youden's J criterion, an optimal cut-off point of -1.00 was determined to distinguish between monolinguals and non-monolinguals. To assess the convergent and discriminant validity of the instrument, Spearman's correlation was used to analyze the relationships among the variables. The Persian version of the LSBQ is a reliable and valid tool for assessing language proficiency and usage among Persian-speaking participants, and it effectively distinguishes between monolingual and non-monolingual individuals. Researchers and clinicians can use the LSBQ effectively, provided it aligns with their specific research questions and the language experiences of their target population.
{"title":"Cross-cultural adaptation of the Language and Social Background Questionnaire: Psychometric properties emerging from the Persian version.","authors":"Mehri Maleki, Fatemeh Jahanjoo, Samin Shibafar, Gelavizh Karimijavan, Mohammad Hassan Torabi, Farnoush Jarollahi","doi":"10.3758/s13428-025-02831-y","DOIUrl":"10.3758/s13428-025-02831-y","url":null,"abstract":"<p><p>The self-reported Language and Social Background Questionnaire (LSBQ) measures an individual's language proficiency and usage quantitatively. This cross-sectional study aims to evaluate the psychometric properties of the LSBQ in the Persian (Farsi) language. A total of 325 adults aged between 15 and 59 years (mean age = 21.00 years, SD = 3.56; 251 females, 70 males) from Tabriz and Tehran participated in this study. To evaluate the Language and Social Background Questionnaire (LSBQ), exploratory factor analysis (EFA) was employed. The psychometric properties of the Persian LSBQ were assessed through various validity measures, as well as reliability analysis and receiver operating characteristic (ROC) curve analysis. The overall content validity ratio for the questionnaire was 0.98, with an impact score of 4.47. The internal consistency of the scale was satisfactory, with a Cronbach's alpha of 0.707. The EFA identified five key factors: \"dominant language at home and community,\" \"non-Persian use,\" \"non-Persian proficiency,\" \"Persian comprehension,\" and \"switching\". Using Youden's J criterion, an optimal cut-off points of - 1.00 was determined to effectively distinguish between monolinguals and non-monolinguals. To assess the convergent and discriminant validity of the instrument, Spearman's correlation was utilized to analyze the relationships among the variables. The Persian version of the LSBQ is a reliable and valid tool for assessing language proficiency and usage among Persian-speaking participants. It effectively distinguishes between monolingual and non-monolingual individuals. Researchers and clinicians can utilize the LSBQ effectively, provided it aligns with their specific research questions and the language experiences of their target population.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"346"},"PeriodicalIF":3.9,"publicationDate":"2025-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145538883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-18 | DOI: 10.3758/s13428-025-02888-9
A modified hidden Markov model for detecting insufficient effort responses in questionnaires.
Hangqi Xu, Jiawei Xiong, Feiming Li
Insufficient effort response (IER) significantly compromises the quality of questionnaire data, affecting the validity of resulting inferences. Traditional methods for detecting IER often fail to adequately capture various types of IER or consider participants' internal state transitions. This study expanded the hidden Markov model for analyzing participants' response strategies by reconstructing response and response time (RT) models that target the identification of IER in the context of questionnaires. The method takes into account the characteristics of IER in terms of response and RT, with the aim of dynamically detecting various types of IER. The simulation study demonstrated that a modified hidden Markov model (M-HMM) effectively recovers parameters, with its detection sensitivity primarily influenced by the prevalence of IER, differences in RT distributions between insufficient and effortful responses, and variations in IER severity and type among participants. Utilizing the M-HMM to analyze empirical data allowed for a deeper understanding of IER occurrences and improved item quality assessment, offering valuable insights for practitioners.
{"title":"A modified hidden Markov model for detecting insufficient effort responses in questionnaires.","authors":"Hangqi Xu, Jiawei Xiong, Feiming Li","doi":"10.3758/s13428-025-02888-9","DOIUrl":"10.3758/s13428-025-02888-9","url":null,"abstract":"<p><p>Insufficient effort response (IER) significantly compromises the quality of questionnaire data, affecting the validity of resulting inferences. Traditional methods for detecting IER often fail to adequately capture various types of IER or consider participants' internal state transitions. This study expanded the hidden Markov model for analyzing participants' response strategies by reconstructing response and response time (RT) models that target the identification of IER in the context of questionnaires. The method takes into account the characteristics of IER in terms of response and RT, with the aim of dynamically detecting various types of IER. The simulation study demonstrated that a modified hidden Markov model (M-HMM) effectively recovers parameters, with its detection sensitivity primarily influenced by the prevalence of IER, differences in RT distributions between insufficient and effortful responses, and variations in IER severity and type among participants. Utilizing the M-HMM to analyze empirical data allowed for a deeper understanding of IER occurrences and improved item quality assessment, offering valuable insights for practitioners.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"347"},"PeriodicalIF":3.9,"publicationDate":"2025-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145547778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-14 | DOI: 10.3758/s13428-025-02863-4
The HeCz corpus: A large, richly annotated reading corpus of newspaper headlines in Czech.
Jan Chromý, Markéta Ceháková, James Brand
Large behavioral datasets that provide detailed data on reading processes are valuable resources for researchers working in linguistics, psychology, and cognitive science. This paper presents the HeCz corpus, which comprises self-paced reading data for 1919 newspaper headlines (23,634 words) in Czech. Each headline is accompanied by a yes-no comprehension question, yielding a rich dataset of word-by-word reading times and comprehension accuracy. The corpus is novel in the sheer scale of its data collection: 1872 native Czech speakers each read approximately 120 headlines, and 1162 of these participants completed the experiment again with the same stimuli in a retest session approximately one month later. Participant-level metadata are also available, covering basic demographic information, reading habits, and a profile of mood state prior to the experiment. Beyond the behavioral and demographic data, we include a range of linguistic annotations for several variables, e.g., frequency, surprisal, and morphological tags. To better understand how these variables might impact processing, we present exploratory analyses predicting word-level reading times; the results indicate important roles for linguistic, demographic, and methodological variables. Given its multidisciplinary applications, we hope the HeCz corpus will provide a valuable and unprecedented resource for research on reading processes.
{"title":"The HeCz corpus: A large, richly annotated reading corpus of newspaper headlines in Czech.","authors":"Jan Chromý, Markéta Ceháková, James Brand","doi":"10.3758/s13428-025-02863-4","DOIUrl":"10.3758/s13428-025-02863-4","url":null,"abstract":"<p><p>Large behavioral datasets that provide detailed data on reading processes are valuable resources for a range of researchers working in linguistics, psychology and cognitive science. This paper presents the HeCz corpus, which comprises self-paced reading data for 1919 newspaper headlines (23,634 words) in Czech, with each headline being accompanied by a yes-no comprehension question, resulting in a rich dataset of reading times for each individual word and comprehension accuracy. The corpus is novel in terms of the sheer scale of data collection, with 1872 native Czech speakers, each reading approximately 120 headlines, with 1162 of those participants also completing the experiment again in a re-testing session using the same stimuli approximately 1 month later. There is participant level meta-data also available relating to basic demographic information, reading habits and a profile of their mood state prior to completing the experiment. Beyond the behavioral and demographic data, we also include a range of linguistic annotations for several variables, e.g., frequency, surprisal, morphological tagging. To better understand how these variables might impact processing, we present exploratory analyses where we predicted the reading times for words, with the results indicating important roles for linguistic, demographic, and methodological variables. Given the range of multidisciplinary applications of the HeCz corpus, we hope that it will provide a valuable and unprecedented resource for a range of research applications related to reading processes.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"345"},"PeriodicalIF":3.9,"publicationDate":"2025-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12618343/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145523017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-14 | DOI: 10.3758/s13428-025-02885-y
Missing data in microrandomized trials: Challenges and opportunities.
Shiyu Zhang, John J Dziak, Lizbeth Benson, Jamie R T Yap, Dusti R Jones, Cho Y Lam, Lindsey N Potter, David W Wetter, Inbal Nahum-Shani
The vision of leveraging digital technologies to deliver real-time psychological interventions in everyday settings is realized via just-in-time adaptive interventions (JITAIs): an intervention design that uses rapidly changing information about a person's internal states and contexts to decide whether and how to intervene in daily life. Microrandomized trials (MRTs) were developed as an experimental design for addressing scientific questions about how best to construct JITAIs, enabling scientists to investigate whether, what type of, and under what conditions intervention delivery can promote behavior change. However, missing data present challenges to the ability of MRTs to inform the development of JITAIs. This article articulates the multiple sources of missing data that can arise in MRT studies; discusses how such missing data can impact (1) bias, (2) variance, and (3) the future implementation of JITAIs; and outlines strategies both for minimizing missing data in an MRT design and for handling missing data when they occur. The overarching goal is to provide a conceptual framework that will guide future investigators in anticipating missing data and making informed decisions to manage them. Throughout, we illustrate concepts using existing data from the Mobile Assistance for Regulating Smoking (MARS) study, a 10-day MRT (n = 99) that included up to six randomizations per person per day.
{"title":"Missing data in microrandomized trials: Challenges and opportunities.","authors":"Shiyu Zhang, John J Dziak, Lizbeth Benson, Jamie R T Yap, Dusti R Jones, Cho Y Lam, Lindsey N Potter, David W Wetter, Inbal Nahum-Shani","doi":"10.3758/s13428-025-02885-y","DOIUrl":"10.3758/s13428-025-02885-y","url":null,"abstract":"<p><p>The vision of leveraging digital technologies to deliver real-time psychological interventions in everyday settings is realized via just-in-time adaptive interventions (JITAI) - an intervention design that guides the use of rapidly changing information about a person's internal states and contexts to decide whether and how to intervene in daily life. Microrandomized trials (MRTs) were developed as an experimental design to address scientific questions about how to best construct JITAIs, enabling scientists to investigate whether, what type, and under what conditions, intervention delivery can promote behavior change. However, missing data present challenges to the ability of MRTs to inform the development of JITAIs. This article articulates the multiple sources of missing data that can manifest in MRT studies, discusses how such missing data can impact (1) bias, (2) variance, and (3) the future implementation of JITAIs, and discusses strategies for both minimizing missing data in an MRT design and handling missing data when they occur. The overarching goal is to provide a conceptual framework that will guide future investigators in anticipating missing data and making informed decisions to manage them. Throughout, we illustrate concepts using existing data from the Mobile Assistance for Regulating Smoking (MARS) study. MARS (n = 99) involved a 10-day MRT that included up to six randomizations per person per day.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"344"},"PeriodicalIF":3.9,"publicationDate":"2025-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12618347/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145522996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-11 | DOI: 10.3758/s13428-025-02878-x
Quantifying word informativeness and its impact on eye-movement reading behavior: Cross-linguistic variability and individual differences.
Inbal Kimchi, Sascha Schroeder, Noam Siegelman
The importance or centrality of a linguistic unit to a larger unit's meaning is known to affect reading behavior. However, there is an ongoing debate on how to quantify a unit's degree of importance or centrality, with previous quantifications relying either on subjective ratings or on computational solutions with limited interpretability. Here we introduce a novel measure, which we term "informativeness," to assess the significance of a word to the meaning of the sentence in which it appears. Our measure compares a vector representation of the full sentence with that of a revised sentence without the target word, yielding an easily interpretable and objective quantification. We show that the new measure correlates in expected ways with other psycholinguistic variables (e.g., frequency, length, predictability) and, importantly, uniquely predicts eye-movement reading behavior in large-scale datasets of first-language (L1) and second-language (L2) readers (from the Multilingual Eye-tracking Corpus, MECO). We also show that informativeness effects generalize to diverse writing systems and are stronger for poorer than for better readers. Together, our work provides new avenues for investigating informativeness effects, toward a deeper understanding of how informativeness shapes reading behavior.
{"title":"Quantifying word informativeness and its impact on eye-movement reading behavior: Cross-linguistic variability and individual differences.","authors":"Inbal Kimchi, Sascha Schroeder, Noam Siegelman","doi":"10.3758/s13428-025-02878-x","DOIUrl":"10.3758/s13428-025-02878-x","url":null,"abstract":"<p><p>The importance or centrality of a linguistic unit to a larger unit's meaning is known to affect reading behavior. However, there is an ongoing debate on how to quantify a unit's degree of importance or centrality, with previous quantifications using either subjective ratings or computational solutions with limited interpretability. Here we introduce a novel measure, which we term \"informativeness\", to assess the significance of a word to the meaning of the sentence in which it appears. Our measure is based on the comparison of vectorial representations of the full sentence with a revised sentence without the target word, resulting in an easily interpretable and objective quantification. We show that our new measure correlates in expected ways with other psycholinguistic variables (e.g., frequency, length, predictability), and, importantly, uniquely predicts eye-movement reading behavior in large-scale datasets of first (L1) and second language (L2) readers (from the Multilingual Eye-tracking Corpus, MECO). We also show that the effects of informativeness generalize to diverse writing systems, and are stronger for poorer than better readers. Together, our work provides new avenues for investigating informativeness effects, towards a deeper understanding of the way it impacts reading behavior.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"343"},"PeriodicalIF":3.9,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12605455/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145494451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-11 | DOI: 10.3758/s13428-025-02861-6
The impact of dichotomization on network recovery.
Nikola Sekulovski, Tessa F Blanken, Jonas M B Haslbeck, Maarten Marsman
Graphical models have become an important method for studying the network structure of multivariate psychological data. Accurate recovery of the underlying network structure is paramount and requires that the models are appropriate for the data at hand. Traditionally, Gaussian graphical models for continuous data and Ising models for binary data have dominated the literature. However, psychological research often relies on ordinal data from Likert-scale items, creating a model-data mismatch. This paper examines the effect of dichotomizing ordinal variables on network recovery, as opposed to analyzing the data at their original level of measurement, using a Bayesian analysis of the ordinal Markov random field model implemented in the R package bgms. Our analysis shows that dichotomization results in a loss of information, which affects the accuracy of network recovery; this is particularly true when considering the interplay between the dichotomization cut-offs used and the distribution of the ordinal categories. In addition, we demonstrate that the accuracy obtained with dichotomized data differs depending on whether an edge is present or absent in the true network, which highlights the effectiveness of the ordinal model in recovering conditional independence relationships. These findings underscore the importance of using models that deal directly with ordinal data to ensure more reliable and valid inferred network structures in psychological research.
{"title":"The impact of dichotomization on network recovery.","authors":"Nikola Sekulovski, Tessa F Blanken, Jonas M B Haslbeck, Maarten Marsman","doi":"10.3758/s13428-025-02861-6","DOIUrl":"10.3758/s13428-025-02861-6","url":null,"abstract":"<p><p>Graphical models have become an important method for studying the network structure of multivariate psychological data. Accurate recovery of the underlying network structure is paramount and requires that the models are appropriate for the data at hand. Traditionally, Gaussian graphical models for continuous data and Ising models for binary data have dominated the literature. However, psychological research often relies on ordinal data from Likert scale items, creating a model-data mismatch. This paper examines the effect of dichotomizing ordinal variables on network recovery, as opposed to analyzing the data at its original level of measurement, using a Bayesian analysis of the ordinal Markov random field model. This model is implemented in the R package bgms. Our analysis shows that dichotomization results in a loss of information, which affects the accuracy of network recovery. This is particularly true when considering the interplay between the dichotomization cutoffs used and the distribution of the ordinal categories. In addition, we demonstrate a difference in accuracy when using dichotomized data, depending on whether edges are included or excluded in the true network, which highlights the effectiveness of the ordinal model in recovering conditional independence relationships. These findings underscore the importance of using models that deal directly with ordinal data to ensure more reliable and valid inferred network structures in psychological research.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"342"},"PeriodicalIF":3.9,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12605567/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145494392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02867-0
In-lab versus web-based eye-tracking in decision-making: A systematic comparison on multiple display-size conditions mimicking common electronic devices.
Sebastián Muñoz, Vladimir Maksimenko, Bastian Henriquez-Jara, Prateek Bansal, Omar David Perez
Eye-tracking has gained considerable attention across multiple research domains. Recently, web-based eye-tracking has become feasible, demonstrating reliable performance in perceptual and cognitive tasks. However, it has yet to be systematically evaluated in decision-making research. Here we compare a laboratory-based eye tracker (the EyeLink 1000 Plus) with a webcam-based method (WebGazer) across two discrete-choice experiments. We systematically manipulated display size to approximate common device classes (monitor, laptop, tablet, mobile) and task complexity (simple vs. complex choice matrices). We find that, on larger displays and with simpler tasks, WebGazer produces gaze patterns and computational-model parameter inferences comparable to those from the EyeLink. However, reliability diminishes on smaller displays and with more complex choice matrices. These results provide the first systematic evaluation of web-based eye-tracking for decision-making research and offer practical guidance regarding its viability for online behavioral studies.
{"title":"In-lab versus web-based eye-tracking in decision-making: A systematic comparison on multiple display-size conditions mimicking common electronic devices.","authors":"Sebastián Muñoz, Vladimir Maksimenko, Bastian Henriquez-Jara, Prateek Bansal, Omar David Perez","doi":"10.3758/s13428-025-02867-0","DOIUrl":"10.3758/s13428-025-02867-0","url":null,"abstract":"<p><p>Eye-tracking has gained considerable attention across multiple research domains. Recently, web-based eye-tracking has become feasible, demonstrating reliable performance in perceptual and cognitive tasks. However, its systematic evaluation in decision-making remains unknown. Here we compare a laboratory-based eye tracker (the EyeLink 1000 Plus) with a webcam-based method (WebGazer) across two discrete-choice experiments. We systematically manipulated display size to approximate common device classes (monitor, laptop, tablet, mobile) and task complexity (simple vs. complex choice matrices). We find that on larger displays and simpler tasks, WebGazer produces gaze patterns and parameter inferences from computational models of behavior comparable to EyeLink. However, reliability diminishes on smaller displays and with more complex choice matrices. These results provide the first systematic evaluation of web-based eye tracking for decision-making research and offer practical guidance regarding its viability for online behavioral studies.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"339"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02801-4
A practice-oriented guide to statistical inference in linear modeling for non-normal or heteroskedastic error distributions.
Hanna Rajh-Weber, Stefan Ernest Huber, Martin Arendasy
Selecting an appropriate statistical method is a challenge frequently encountered by applied researchers, especially if the assumptions of classical, parametric approaches are violated. To provide some guidelines and support, we compared classical hypothesis tests, with their typical distributional assumptions of normality and homoskedasticity, against common and easily accessible alternative inference methods (HC3, HC4, and six bootstrap methods) in the framework of ordinary least squares (OLS) regression. The methods' performance was assessed for four different regression models with varying levels of non-normality and heteroskedasticity of errors, and for five sample sizes ranging from 25 to 500 cases. For each scenario, 10,000 samples of observations were generated, and type I error rates, coverage rates, power, and standard error bias were examined. No method considered here performed satisfactorily on all counts. Using HC3 or HC4 standard errors, or a wild bootstrap procedure with percentile confidence intervals, yielded reliable results in many, but not all, scenarios. In the case of assumption violations, researchers might therefore refer to the method that performed best in the scenario most similar to their data situation; to aid this selection, we provide tables comparing relative performance across all considered scenarios.
{"title":"A practice-oriented guide to statistical inference in linear modeling for non-normal or heteroskedastic error distributions.","authors":"Hanna Rajh-Weber, Stefan Ernest Huber, Martin Arendasy","doi":"10.3758/s13428-025-02801-4","DOIUrl":"10.3758/s13428-025-02801-4","url":null,"abstract":"<p><p>Selecting an appropriate statistical method is a challenge frequently encountered by applied researchers, especially if assumptions for classical, parametric approaches are violated. To provide some guidelines and support, we compared classical hypothesis tests with their typical distributional assumptions of normality and homoskedasticity with common and easily accessible alternative inference methods (HC3, HC4, and six bootstrap methods) in the framework of ordinary least squares (OLS) regression. The method's performance was assessed for four different regression models with varying levels of non-normality and heteroskedasticity of errors, and for five different sample sizes ranging from 25 to 500 cases. For each scenario, 10,000 samples of observations were generated. Type I error and coverage rates, power, and standard error bias were examined to assess the methods' performance. No method considered here performed satisfactorily on all accounts. Using HC3 or HC4 standard errors, or a wild bootstrap procedure with percentile confidence intervals, could yield reliable results in many, but not all, scenarios. We suppose that, in the case of assumption violations, researchers might refer to a method that performed best in a scenario most similar to their data situation. To aid the selection of an appropriate method, we provide tables comparing relative performances in all considered scenarios.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"338"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602623/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02786-0
Dimensionality reduction techniques in pupillometry research: A primer for behavioral scientists.
Serena Castellotti, Irene Petrizzo, Roberto Arrighi, Elvio Blini
The measurement of pupil size is a classic tool in psychophysiology, but its popularity has recently surged due to the rapid developments of the eye-tracking industry. Concurrently, several authors have outlined a wealth of strategies for tackling pupillary recordings analytically. The consensus is that the "temporal" aspect of changes in pupil size is key, and that the analytical approach should be mindful of the temporal factor. Here we take a more radical stance on the matter by suggesting that, by the time significant changes in pupil size are detected, it is already too late. We suggest that these changes are indeed the result of distinct, core physiological processes that originate several hundred milliseconds before that moment and altogether shape the observed signal. These processes can be recovered indirectly by leveraging dimensionality reduction techniques. Here we therefore outline key concepts of temporal principal components analysis and related rotations to show that they reveal a latent, low-dimensional space that represents these processes very efficiently: a pupillary manifold. We elaborate on why assessing the pupillary manifold provides an alternative, appealing analytical solution for data analysis. In particular, dimensionality reduction returns scores that are (1) mindful of the relevant physiology underlying the observed changes in pupil size, (2) extremely handy and manageable for statistical modelling, and (3) devoid of several arbitrary choices. We elaborate on these points in the form of a tutorial paper for the functions provided in the accompanying R library "Pupilla."
{"title":"Dimensionality reduction techniques in pupillometry research: A primer for behavioral scientists.","authors":"Serena Castellotti, Irene Petrizzo, Roberto Arrighi, Elvio Blini","doi":"10.3758/s13428-025-02786-0","DOIUrl":"10.3758/s13428-025-02786-0","url":null,"abstract":"<p><p>The measurement of pupil size is a classic tool in psychophysiology, but its popularity has recently surged due to the rapid developments of the eye-tracking industry. Concurrently, several authors have outlined a wealth of strategies for tackling pupillary recordings analytically. The consensus is that the \"temporal\" aspect of changes in pupil size is key, and that the analytical approach should be mindful of the temporal factor. Here we take a more radical stance on the matter by suggesting that, by the time significant changes in pupil size are detected, it is already too late. We suggest that these changes are indeed the result of distinct, core physiological processes that originate several hundreds of milliseconds before that moment and altogether shape the observed signal. These processes can be recovered indirectly by leveraging dimensionality reduction techniques. Here we therefore outline key concepts of temporal principal components analysis and related rotations to show that they reveal a latent, low-dimensional space that represents these processes very efficiently: a pupillary manifold. We elaborate on why assessing the pupillary manifold provides an alternative, appealing analytical solution for data analysis. In particular, dimensionality reduction returns scores that are (1) mindful of the relevant physiology underlying the observed changes in pupil size, (2) extremely handy and manageable for statistical modelling, and (3) devoid of several arbitrary choices. We elaborate on these points in the form of a tutorial paper for the functions provided in the accompanying R library \"Pupilla.\"</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"337"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602682/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}