Pub Date: 2025-11-11 | DOI: 10.3758/s13428-025-02878-x
Inbal Kimchi, Sascha Schroeder, Noam Siegelman
The importance or centrality of a linguistic unit to a larger unit's meaning is known to affect reading behavior. However, there is an ongoing debate on how to quantify a unit's degree of importance or centrality, with previous quantifications using either subjective ratings or computational solutions with limited interpretability. Here we introduce a novel measure, which we term "informativeness", to assess the significance of a word to the meaning of the sentence in which it appears. Our measure is based on the comparison of vectorial representations of the full sentence with those of a revised sentence without the target word, resulting in an easily interpretable and objective quantification. We show that our new measure correlates in expected ways with other psycholinguistic variables (e.g., frequency, length, predictability), and, importantly, uniquely predicts eye-movement reading behavior in large-scale datasets of first (L1) and second language (L2) readers (from the Multilingual Eye-tracking Corpus, MECO). We also show that the effects of informativeness generalize to diverse writing systems, and are stronger for poorer than better readers. Together, our work provides new avenues for investigating informativeness effects, towards a deeper understanding of how informativeness impacts reading behavior.
{"title":"Quantifying word informativeness and its impact on eye-movement reading behavior: Cross-linguistic variability and individual differences.","authors":"Inbal Kimchi, Sascha Schroeder, Noam Siegelman","doi":"10.3758/s13428-025-02878-x","DOIUrl":"10.3758/s13428-025-02878-x","url":null,"abstract":"<p><p>The importance or centrality of a linguistic unit to a larger unit's meaning is known to affect reading behavior. However, there is an ongoing debate on how to quantify a unit's degree of importance or centrality, with previous quantifications using either subjective ratings or computational solutions with limited interpretability. Here we introduce a novel measure, which we term \"informativeness\", to assess the significance of a word to the meaning of the sentence in which it appears. Our measure is based on the comparison of vectorial representations of the full sentence with a revised sentence without the target word, resulting in an easily interpretable and objective quantification. We show that our new measure correlates in expected ways with other psycholinguistic variables (e.g., frequency, length, predictability), and, importantly, uniquely predicts eye-movement reading behavior in large-scale datasets of first (L1) and second language (L2) readers (from the Multilingual Eye-tracking Corpus, MECO). We also show that the effects of informativeness generalize to diverse writing systems, and are stronger for poorer than better readers. Together, our work provides new avenues for investigating informativeness effects, towards a deeper understanding of the way it impacts reading behavior.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"343"},"PeriodicalIF":3.9,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12605455/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145494451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-11 | DOI: 10.3758/s13428-025-02861-6
Nikola Sekulovski, Tessa F Blanken, Jonas M B Haslbeck, Maarten Marsman
Graphical models have become an important method for studying the network structure of multivariate psychological data. Accurate recovery of the underlying network structure is paramount and requires that the models are appropriate for the data at hand. Traditionally, Gaussian graphical models for continuous data and Ising models for binary data have dominated the literature. However, psychological research often relies on ordinal data from Likert scale items, creating a model-data mismatch. This paper examines the effect of dichotomizing ordinal variables on network recovery, as opposed to analyzing the data at its original level of measurement, using a Bayesian analysis of the ordinal Markov random field model. This model is implemented in the R package bgms. Our analysis shows that dichotomization results in a loss of information, which affects the accuracy of network recovery. This is particularly true when considering the interplay between the dichotomization cutoffs used and the distribution of the ordinal categories. In addition, we demonstrate a difference in accuracy when using dichotomized data, depending on whether edges are included or excluded in the true network, which highlights the effectiveness of the ordinal model in recovering conditional independence relationships. These findings underscore the importance of using models that deal directly with ordinal data to ensure more reliable and valid inferred network structures in psychological research.
{"title":"The impact of dichotomization on network recovery.","authors":"Nikola Sekulovski, Tessa F Blanken, Jonas M B Haslbeck, Maarten Marsman","doi":"10.3758/s13428-025-02861-6","DOIUrl":"10.3758/s13428-025-02861-6","url":null,"abstract":"<p><p>Graphical models have become an important method for studying the network structure of multivariate psychological data. Accurate recovery of the underlying network structure is paramount and requires that the models are appropriate for the data at hand. Traditionally, Gaussian graphical models for continuous data and Ising models for binary data have dominated the literature. However, psychological research often relies on ordinal data from Likert scale items, creating a model-data mismatch. This paper examines the effect of dichotomizing ordinal variables on network recovery, as opposed to analyzing the data at its original level of measurement, using a Bayesian analysis of the ordinal Markov random field model. This model is implemented in the R package bgms. Our analysis shows that dichotomization results in a loss of information, which affects the accuracy of network recovery. This is particularly true when considering the interplay between the dichotomization cutoffs used and the distribution of the ordinal categories. In addition, we demonstrate a difference in accuracy when using dichotomized data, depending on whether edges are included or excluded in the true network, which highlights the effectiveness of the ordinal model in recovering conditional independence relationships. These findings underscore the importance of using models that deal directly with ordinal data to ensure more reliable and valid inferred network structures in psychological research.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"342"},"PeriodicalIF":3.9,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12605567/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145494392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02867-0
Sebastián Muñoz, Vladimir Maksimenko, Bastian Henriquez-Jara, Prateek Bansal, Omar David Perez
Eye-tracking has gained considerable attention across multiple research domains. Recently, web-based eye-tracking has become feasible, demonstrating reliable performance in perceptual and cognitive tasks. However, it has not yet been systematically evaluated in decision-making research. Here we compare a laboratory-based eye tracker (the EyeLink 1000 Plus) with a webcam-based method (WebGazer) across two discrete-choice experiments. We systematically manipulated display size to approximate common device classes (monitor, laptop, tablet, mobile) and task complexity (simple vs. complex choice matrices). We find that on larger displays and simpler tasks, WebGazer produces gaze patterns and parameter inferences from computational models of behavior that are comparable to those obtained with EyeLink. However, reliability diminishes on smaller displays and with more complex choice matrices. These results provide the first systematic evaluation of web-based eye tracking for decision-making research and offer practical guidance regarding its viability for online behavioral studies.
{"title":"In-lab versus web-based eye-tracking in decision-making: A systematic comparison on multiple display-size conditions mimicking common electronic devices.","authors":"Sebastián Muñoz, Vladimir Maksimenko, Bastian Henriquez-Jara, Prateek Bansal, Omar David Perez","doi":"10.3758/s13428-025-02867-0","DOIUrl":"10.3758/s13428-025-02867-0","url":null,"abstract":"<p><p>Eye-tracking has gained considerable attention across multiple research domains. Recently, web-based eye-tracking has become feasible, demonstrating reliable performance in perceptual and cognitive tasks. However, its systematic evaluation in decision-making remains unknown. Here we compare a laboratory-based eye tracker (the EyeLink 1000 Plus) with a webcam-based method (WebGazer) across two discrete-choice experiments. We systematically manipulated display size to approximate common device classes (monitor, laptop, tablet, mobile) and task complexity (simple vs. complex choice matrices). We find that on larger displays and simpler tasks, WebGazer produces gaze patterns and parameter inferences from computational models of behavior comparable to EyeLink. However, reliability diminishes on smaller displays and with more complex choice matrices. These results provide the first systematic evaluation of web-based eye tracking for decision-making research and offer practical guidance regarding its viability for online behavioral studies.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"339"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02801-4
Hanna Rajh-Weber, Stefan Ernest Huber, Martin Arendasy
Selecting an appropriate statistical method is a challenge frequently encountered by applied researchers, especially if assumptions for classical, parametric approaches are violated. To provide some guidelines and support, we compared classical hypothesis tests, with their typical distributional assumptions of normality and homoskedasticity, with common and easily accessible alternative inference methods (HC3, HC4, and six bootstrap methods) in the framework of ordinary least squares (OLS) regression. The methods' performance was assessed for four different regression models with varying levels of non-normality and heteroskedasticity of errors, and for five different sample sizes ranging from 25 to 500 cases. For each scenario, 10,000 samples of observations were generated. Type I error and coverage rates, power, and standard error bias were examined to assess the methods' performance. No method considered here performed satisfactorily on all accounts. Using HC3 or HC4 standard errors, or a wild bootstrap procedure with percentile confidence intervals, could yield reliable results in many, but not all, scenarios. We suggest that, in the case of assumption violations, researchers might refer to a method that performed best in a scenario most similar to their data situation. To aid the selection of an appropriate method, we provide tables comparing relative performances in all considered scenarios.
{"title":"A practice-oriented guide to statistical inference in linear modeling for non-normal or heteroskedastic error distributions.","authors":"Hanna Rajh-Weber, Stefan Ernest Huber, Martin Arendasy","doi":"10.3758/s13428-025-02801-4","DOIUrl":"10.3758/s13428-025-02801-4","url":null,"abstract":"<p><p>Selecting an appropriate statistical method is a challenge frequently encountered by applied researchers, especially if assumptions for classical, parametric approaches are violated. To provide some guidelines and support, we compared classical hypothesis tests with their typical distributional assumptions of normality and homoskedasticity with common and easily accessible alternative inference methods (HC3, HC4, and six bootstrap methods) in the framework of ordinary least squares (OLS) regression. The method's performance was assessed for four different regression models with varying levels of non-normality and heteroskedasticity of errors, and for five different sample sizes ranging from 25 to 500 cases. For each scenario, 10,000 samples of observations were generated. Type I error and coverage rates, power, and standard error bias were examined to assess the methods' performance. No method considered here performed satisfactorily on all accounts. Using HC3 or HC4 standard errors, or a wild bootstrap procedure with percentile confidence intervals, could yield reliable results in many, but not all, scenarios. We suppose that, in the case of assumption violations, researchers might refer to a method that performed best in a scenario most similar to their data situation. To aid the selection of an appropriate method, we provide tables comparing relative performances in all considered scenarios.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"338"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602623/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02786-0
Serena Castellotti, Irene Petrizzo, Roberto Arrighi, Elvio Blini
The measurement of pupil size is a classic tool in psychophysiology, but its popularity has recently surged due to the rapid developments of the eye-tracking industry. Concurrently, several authors have outlined a wealth of strategies for tackling pupillary recordings analytically. The consensus is that the "temporal" aspect of changes in pupil size is key, and that the analytical approach should be mindful of the temporal factor. Here we take a more radical stance on the matter by suggesting that, by the time significant changes in pupil size are detected, it is already too late. We suggest that these changes are indeed the result of distinct, core physiological processes that originate several hundreds of milliseconds before that moment and altogether shape the observed signal. These processes can be recovered indirectly by leveraging dimensionality reduction techniques. Here we therefore outline key concepts of temporal principal components analysis and related rotations to show that they reveal a latent, low-dimensional space that represents these processes very efficiently: a pupillary manifold. We elaborate on why assessing the pupillary manifold provides an alternative, appealing analytical solution for data analysis. In particular, dimensionality reduction returns scores that are (1) mindful of the relevant physiology underlying the observed changes in pupil size, (2) extremely handy and manageable for statistical modelling, and (3) devoid of several arbitrary choices. We elaborate on these points in the form of a tutorial paper for the functions provided in the accompanying R library "Pupilla."
{"title":"Dimensionality reduction techniques in pupillometry research: A primer for behavioral scientists.","authors":"Serena Castellotti, Irene Petrizzo, Roberto Arrighi, Elvio Blini","doi":"10.3758/s13428-025-02786-0","DOIUrl":"10.3758/s13428-025-02786-0","url":null,"abstract":"<p><p>The measurement of pupil size is a classic tool in psychophysiology, but its popularity has recently surged due to the rapid developments of the eye-tracking industry. Concurrently, several authors have outlined a wealth of strategies for tackling pupillary recordings analytically. The consensus is that the \"temporal\" aspect of changes in pupil size is key, and that the analytical approach should be mindful of the temporal factor. Here we take a more radical stance on the matter by suggesting that, by the time significant changes in pupil size are detected, it is already too late. We suggest that these changes are indeed the result of distinct, core physiological processes that originate several hundreds of milliseconds before that moment and altogether shape the observed signal. These processes can be recovered indirectly by leveraging dimensionality reduction techniques. Here we therefore outline key concepts of temporal principal components analysis and related rotations to show that they reveal a latent, low-dimensional space that represents these processes very efficiently: a pupillary manifold. We elaborate on why assessing the pupillary manifold provides an alternative, appealing analytical solution for data analysis. In particular, dimensionality reduction returns scores that are (1) mindful of the relevant physiology underlying the observed changes in pupil size, (2) extremely handy and manageable for statistical modelling, and (3) devoid of several arbitrary choices. We elaborate on these points in the form of a tutorial paper for the functions provided in the accompanying R library \"Pupilla.\"</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"337"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602682/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02852-7
Cameron S Kay
Several prior studies have used advanced methodological techniques to demonstrate that there is an issue with the quality of data that can be collected on Amazon's Mechanical Turk (MTurk). The goal of the present project was to provide an accessible demonstration of this issue. We administered 27 semantic antonyms, pairs of items that assess clearly contradictory content (e.g., "I talk a lot" and "I rarely talk"), to samples drawn from Connect (N1 = 100), Prolific (N2 = 100), and MTurk (N3 = 400; N4 = 600). Despite most of these item pairs being negatively correlated on Connect and Prolific, over 96% were positively correlated on MTurk. This issue could not be remedied by screening the data using common attention check measures nor by recruiting only "high-productivity" and "high-reputation" participants. These findings provide clear evidence that data collected on MTurk simply cannot be trusted.
{"title":"Why you shouldn't trust data collected on MTurk.","authors":"Cameron S Kay","doi":"10.3758/s13428-025-02852-7","DOIUrl":"10.3758/s13428-025-02852-7","url":null,"abstract":"<p><p>Several prior studies have used advanced methodological techniques to demonstrate that there is an issue with the quality of data that can be collected on Amazon's Mechanical Turk (MTurk). The goal of the present project was to provide an accessible demonstration of this issue. We administered 27 semantic antonyms-pairs of items that assess clearly contradictory content (e.g., \"I talk a lot\" and \"I rarely talk\")-to samples drawn from Connect (N<sub>1</sub> = 100), Prolific (N<sub>2</sub> = 100), and MTurk (N<sub>3</sub> = 400; N<sub>4</sub> = 600). Despite most of these item pairs being negatively correlated on Connect and Prolific, over 96% were positively correlated on MTurk. This issue could not be remedied by screening the data using common attention check measures nor by recruiting only \"high-productivity\" and \"high-reputation\" participants. These findings provide clear evidence that data collected on MTurk simply cannot be trusted.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"340"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02820-1
Jesse Boot, Jill de Ron, Jonas Haslbeck, Sacha Epskamp
In psychological studies, it is common practice to select a sample based on the sum score of the modeled variables (e.g., based on symptom severity when investigating the associations between those same symptoms). However, this practice introduces bias if the sum score selection imperfectly defines the population of interest. Here, we propose a correction for this type of selection bias in the Ising model, a popular network model for binary data. Our correction can be applied when one wants to obtain (1) full population estimates when only the sum score subset of the data is available, and (2) improved estimates of a subpopulation, if we observe a mixture of populations that differ from each other in the sum score. In a simulation study, we verify that our correction recovers the network structure of the desired population after a sum score selection using both a node-wise regression and a multivariate estimation of the Ising model. In an example, we show how our correction can be used in practice using empirical data on symptoms of major depression from the National Comorbidity Survey Replication (N = 9,282). We implemented our correction in four commonly used R packages for estimating the Ising model, namely IsingFit, IsingSampler, psychonetrics, and bootnet.
{"title":"Correcting for selection bias after conditioning on a sum score in the Ising model.","authors":"Jesse Boot, Jill de Ron, Jonas Haslbeck, Sacha Epskamp","doi":"10.3758/s13428-025-02820-1","DOIUrl":"10.3758/s13428-025-02820-1","url":null,"abstract":"<p><p>In psychological studies, it is common practice to select a sample based on the sum score of the modeled variables (e.g., based on symptom severity when investigating the associations between those same symptoms). However, this practice introduces bias if the sum score selection imperfectly defines the population of interest. Here, we propose a correction for this type of selection bias in the Ising model, a popular network model for binary data. Possible applications of our correction are when one wants to obtain (1) full population estimates when only the sum score subset of the data is available, and (2) improved estimates of a subpopulation, if we observe a mixture of populations that differ from each other in the sum score. In a simulation study, we verify that our correction recovers the network structure of the desired population after a sum score selection using both a node-wise regression and a multivariate estimation of the Ising model. In an example, we show how our correction can be used in practice using empirical data on symptoms of major depression from the National Comorbidity Study Replication (N = 9,282). We implemented our correction in four commonly used R packages for estimating the Ising model, namely IsingFit, IsingSampler, psychonetrics, and bootnet.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"341"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602587/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02868-z
Yu Wang, Wen Qu
Natural language is a primary medium for expressing thoughts and emotions, making text analysis a vital tool in psychological research. It enables insights into personality traits, mental health, and sentiment in interpersonal communication. Traditional approaches, such as human coding, dictionary-based methods, or training models from scratch, often suffer from limitations, including inefficiency, incomplete coverage, or high data requirements. This tutorial introduces the pretrain-finetune paradigm, a transformative approach in text analysis and natural language processing (NLP) that leverages large pretrained language models. Unlike conventional methods, this paradigm enables efficient fine-tuning even with limited labeled data, making it particularly valuable for social science research where annotated samples are scarce. Our tutorial offers a comprehensive introduction to the pretrain-finetune framework, beginning with core concepts of pretraining and fine-tuning, followed by hands-on exercises with real-world applications and the introduction of finetuneR, an R package developed to make these methods accessible to R users, and concluding with a discussion of common misconceptions in existing resources and best practices. We demonstrate its effectiveness across diverse tasks, including multi-class classification and regression, showing its advantages over traditional methods, feature extraction-based approaches, and GPT-based strategies. By emphasizing its efficiency, accessibility, and superior performance, this tutorial aims to encourage broader adoption of the pretrain-finetune paradigm in psychological and behavioral research.
{"title":"A tutorial on fine-tuning pretrained language models: Applications in social and behavioral science research.","authors":"Yu Wang, Wen Qu","doi":"10.3758/s13428-025-02868-z","DOIUrl":"10.3758/s13428-025-02868-z","url":null,"abstract":"<p><p>Natural language is a primary medium for expressing thoughts and emotions, making text analysis a vital tool in psychological research. It enables insights into personality traits, mental health, and sentiment in interpersonal communication. Traditional approaches - such as human coding, dictionary-based methods, or training models from scratch - often suffer from limitations, including inefficiency, incomplete coverage, or high data requirements. This tutorial introduces the pretrain-finetune paradigm, a transformative approach in text analysis and natural language processing (NLP) that leverages large pretrained language models. Unlike conventional methods, this paradigm enables efficient fine-tuning even with limited labeled data, making it particularly valuable for social science research where annotated samples are scarce. Our tutorial offers a comprehensive introduction to the pretrain-finetune framework, beginning with core concepts of pretraining and fine-tuning, followed by hands-on exercises with real-world applications, the introduction of finetuneR, an R package developed to make these methods accessible to R users, and concluding with a discussion of common misconceptions in existing resources and best practices. We demonstrate its effectiveness across diverse tasks, including multi-class classification and regression, showing its advantages over traditional methods, feature extraction-based approaches, and GPT-based strategies. By emphasizing its efficiency, accessibility, and superior performance, this tutorial aims to encourage broader adoption of the pretrain-finetune paradigm in psychological and behavioral research.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"336"},"PeriodicalIF":3.9,"publicationDate":"2025-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145450662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02819-8
Nir Ofir, Ayelet N Landau
Multiple systems in the brain track the passage of time and can adapt their activity to temporal requirements. While the neural implementation of timing varies widely between neural substrates and behavioral tasks, at the algorithmic level, many of these behaviors can be described using drift-diffusion models of decision-making. In this work, we develop a drift-diffusion model to fit performance in the temporal generalization task, in which participants are required to categorize an interval as being the same or different compared to a standard, or reference, duration. The model includes a drift-diffusion process which starts with interval onset, representing the internal estimate of elapsed duration, and two boundaries. If the drift-diffusion process at interval offset is between the boundaries, the interval is categorized as equal to the standard. If it is below the lower boundary or above the upper boundary, the interval is categorized as different. This model outperformed previous models in fitting the data of single participants and in parameter recovery analyses. We also used the drift-diffusion model to analyze data from two experiments, one comparing performance between vision and audition and another examining the effect of learning. We found that decision boundaries can be modified independently: While the upper boundary was higher in vision than in audition, the lower boundary decreased with learning in the task. In both experiments, timing noise was positively correlated with upper boundaries across participants, which reflects an accuracy-maximizing strategy in the task.
{"title":"A drift-diffusion model of temporal generalization outperforms existing models and captures modality differences and learning effects.","authors":"Nir Ofir, Ayelet N Landau","doi":"10.3758/s13428-025-02819-8","DOIUrl":"10.3758/s13428-025-02819-8","url":null,"abstract":"<p><p>Multiple systems in the brain track the passage of time and can adapt their activity to temporal requirements. While the neural implementation of timing varies widely between neural substrates and behavioral tasks, at the algorithmic level, many of these behaviors can be described using drift-diffusion models of decision-making. In this work, wedevelop a drift-diffusion model to fit performance in the temporal generalization task, in which participants are required to categorize an interval as being the same or different compared to a standard, or reference, duration. The model includes a drift-diffusion process which starts with interval onset, representing the internal estimate of elapsed duration, and two boundaries. If the drift-diffusion process at interval offset is between the boundaries, the interval is categorized as equal to the standard. If it is below the lower boundary or above the upper boundary, the interval is categorized as different. This model outperformed previous models in fitting the data of single participants and in parameter recovery analyses. We also used the drift-diffusion model to analyze data from two experiments, one comparing performance between vision and audition and another examining the effect of learning. We found that decision boundaries can be modified independently: While the upper boundary was higher in vision than in audition, the lower boundary decreased with learning in the task. In both experiments, timing noise was positively correlated with upper boundaries across participants, which reflects an accuracy-maximizing strategy in the task.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"334"},"PeriodicalIF":3.9,"publicationDate":"2025-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12589256/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145450735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02853-6
Xiao-Yang Sui, Jia-Wen Niu, Xiaoqian Liu, Li-Lin Rao
Probabilities are typically expressed in two forms: numerical (e.g., 50%) and verbal (e.g., likely). In this regard, understanding how verbal probabilities map to their numerical equivalents is crucial for examining the probabilistic language used in various texts. This study addresses this issue by introducing the Chinese Lexicon of Verbal Probability (CLVP), comprising 343 verbal probability phrases that are each paired with corresponding numerical probabilities, membership functions, and frequency data from three corpora. We analyze the distribution of subjective values of verbal probability phrases in Chinese based on the CLVP, compare them with their English counterparts, and create a benchmark of seven high-frequency verbal probability phrases for organizational use. Overall, this study provides a valuable tool for converting verbal probabilities into numerical equivalents, contributing to cross-linguistic and cross-cultural research.
{"title":"Bridging numerical and verbal probabilities: Construction and application of the Chinese Lexicon of Verbal Probability.","authors":"Xiao-Yang Sui, Jia-Wen Niu, Xiaoqian Liu, Li-Lin Rao","doi":"10.3758/s13428-025-02853-6","DOIUrl":"10.3758/s13428-025-02853-6","url":null,"abstract":"<p><p>Probabilities are typically expressed in two forms: numerical (e.g., 50%) and verbal (e.g., likely). In this regard, understanding how verbal probabilities map to their numerical equivalents is crucial for examining the probabilistic language used in various texts. This study addresses this issue by introducing the Chinese Lexicon of Verbal Probability (CLVP), comprising 343 verbal probability phrases that are each paired with corresponding numerical probabilities, membership functions, and frequency data from three corpora. We analyze the distribution of subjective values of verbal probability phrases in Chinese based on the CLVP, compare them with their English counterparts, and create a benchmark of seven high-frequency verbal probability phrases for organizational use. Overall, this study provides a valuable tool for converting verbal probabilities into numerical equivalents, contributing to cross-linguistic and cross-cultural research.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"335"},"PeriodicalIF":3.9,"publicationDate":"2025-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145450721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}