Pub Date: 2024-10-01 | Epub Date: 2024-07-10 | DOI: 10.3758/s13428-024-02464-7
Haoran Li
Generalized linear mixed models (GLMMs) have great potential for handling count data in single-case experimental designs (SCEDs). However, applied researchers face challenges in making the many statistical decisions such advanced techniques require. This study investigated a critical issue: selecting an appropriate distribution for the different types of count data that arise in SCEDs due to overdispersion and/or zero-inflation. To this end, I proposed two model selection frameworks, one based on information criteria (AIC and BIC) and one based on a multistage model-selection procedure. Four data scenarios were simulated: Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), and zero-inflated negative binomial (ZINB). The same set of models (i.e., Poisson, NB, ZIP, and ZINB) was fitted in each scenario. In the simulation, I evaluated 10 model selection strategies within the two frameworks by assessing model selection bias and its consequences for the accuracy of treatment effect estimates and inferential statistics. Based on the simulation results and previous work, I provide recommendations on which model selection methods to adopt in different scenarios. The implications, limitations, and future research directions are also discussed.
Title: Model selection of GLMMs in the analysis of count data in single-case studies: A Monte Carlo simulation. Journal: Behavior Research Methods.
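The information-criterion framework described above can be illustrated with a minimal, self-contained sketch. This is not the paper's GLMM code (which involves random effects across cases); it simply shows how AIC discriminates between a Poisson and a negative binomial fit on overdispersed counts, using method-of-moments NB estimates as a stand-in for maximum likelihood. The data vector is invented for illustration.

```python
import math

def poisson_aic(counts):
    """AIC of a Poisson fit; the MLE of the mean is the sample mean."""
    mu = sum(counts) / len(counts)
    ll = sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in counts)
    return 2 * 1 - 2 * ll  # one estimated parameter

def nb_aic(counts):
    """AIC of a negative binomial fit (size r, prob p) via method of moments."""
    n = len(counts)
    mu = sum(counts) / n
    var = sum((y - mu) ** 2 for y in counts) / n
    if var <= mu:  # no overdispersion: NB degenerates to Poisson
        return float("inf")
    r = mu * mu / (var - mu)
    p = r / (r + mu)
    ll = sum(math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(p) + y * math.log(1 - p) for y in counts)
    return 2 * 2 - 2 * ll  # two estimated parameters

overdispersed = [0, 1, 0, 2, 9, 0, 1, 14, 3, 0, 7, 1]  # variance >> mean
print(poisson_aic(overdispersed), nb_aic(overdispersed))
```

For these counts the NB AIC is markedly lower than the Poisson AIC, which is the kind of evidence the information-criterion framework uses to pick a distribution; zero-inflated variants add a mixture component on top of this.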
Pub Date: 2024-10-01 | Epub Date: 2024-05-15 | DOI: 10.3758/s13428-024-02420-5
Frouke Hermens
Observational studies of human behaviour often require the annotation of objects in video recordings. Automatic object detection has been strongly facilitated by the development of YOLO ('you only look once') and particularly by YOLOv8 from Ultralytics, which is easy to use. The present study examines the conditions required for accurate object detection with YOLOv8. The results show almost perfect object detection even when the model was trained on a small dataset (100 to 350 images). The detector, however, does not extrapolate well to the same object against other backgrounds. Training the detector on images from a variety of backgrounds restores excellent object detection. YOLOv8 could be a game changer for behavioural research that requires object annotation in video recordings.
Title: Automatic object detection for behavioural research using YOLOv8. Journal: Behavior Research Methods. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362367/pdf/
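Detection accuracy of the kind reported here is conventionally scored by intersection-over-union (IoU) between predicted and ground-truth bounding boxes; a prediction typically counts as correct when IoU exceeds a threshold such as 0.5. The sketch below shows the standard metric, not code from the study:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # partial overlap: 25/175
```

In practice, training a custom YOLOv8 detector on a small annotated dataset is done through the Ultralytics Python API (roughly: load a pretrained checkpoint with `YOLO("yolov8n.pt")`, then call `.train()` with a dataset config); consult the Ultralytics documentation for the exact arguments.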
Pub Date: 2024-10-01 | Epub Date: 2024-06-11 | DOI: 10.3758/s13428-024-02414-3
Nadine Fitzpatrick, Caroline Floccia
Investigating how infants first establish relationships between words is a necessary step towards understanding how an interconnected network of semantic relationships develops in the adult lexical-semantic system. Stimuli selection for these child studies is critical, since words must be both familiar and highly imageable. However, English infant studies to date have relied on adult word association norms to inform stimuli selection, as no resource currently exists for child-specific word associations. We present three experiments that explore the strength of word-word relationships in 3-year-olds. Experiment 1 collected children's word associations (WAs) (N = 150; female = 84, L1 = British English) and compared them to adult associative norms (Moss & Older, 1996; Nelson et al., 2004, Behavior Research Methods, Instruments, & Computers, 36(3), 402-407). Experiment 2 replicated the WAs from Experiment 1 in an online adaptation of the task (N = 24: 13 female, L1 = British English). Both experiments indicated a high proportion of child-specific WAs not represented in the adult norms (Moss & Older, 1996; Nelson et al., 2004). Experiment 3 tested noun-noun WAs from these responses in an online semantic priming study (N = 40: 19 female, L1 = British English) and found that association type modulated priming (F(2.57, 100.1) = 13.13, p < .0001, generalized η² = .19). This research provides a resource of child-specific, imageable noun-noun word pair stimuli suitable for testing young children in word recognition and semantic priming studies.
Title: Comparing child word associations to adult associative norms: Evidence for child-specific associations with a strong priming effect in 3-year-olds. Journal: Behavior Research Methods. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362254/pdf/
Pub Date: 2024-10-01 | Epub Date: 2024-07-25 | DOI: 10.3758/s13428-024-02468-3
Debora de Chiusole, Umberto Granziol, Andrea Spoto, Luca Stefanutti
Indexes for estimating the overall reliability of a test in the framework of knowledge space theory (KST) are proposed and analyzed. First, the possibility of applying existing classical test theory (CTT) methods in KST, based on the ratio of true score variance to the total variance of the measure, was explored. However, these methods are not suitable because, in KST, error and true score are not independent. Therefore, two new indexes based on the concepts of entropy and conditional entropy are developed. One index estimates the reliability of the response pattern given the knowledge state, while the second refers to the reliability of a person's estimated knowledge state. Theoretical considerations, simulations, and an empirical example on real data are provided in a study of the behavior of these indexes under a range of conditions.
Title: Reliability of a probabilistic knowledge structure. Journal: Behavior Research Methods. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362261/pdf/
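The entropy-based idea behind such indexes can be sketched in a few lines: the lower the conditional entropy of responses given the underlying state, the more determined (reliable) the responses are. The code below shows only the textbook Shannon quantities on toy joint distributions, not the paper's specific KST indexes:

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum p * log2(p) of a distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def conditional_entropy(joint):
    """H(Y|X) from a joint distribution given as {(x, y): p}."""
    px = {}
    for (x, _), p in joint.items():
        px[x] = px.get(x, 0.0) + p
    return -sum(p * math.log2(p / px[x])
                for (x, _), p in joint.items() if p > 0)

# Perfectly reliable: the (toy) knowledge state fully determines the response.
perfect = {(0, 0): 0.5, (1, 1): 0.5}
# Maximally noisy: each state yields either response half the time.
noisy = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
print(conditional_entropy(perfect), conditional_entropy(noisy))
```

Here H(response | state) is 0 bits for the perfectly reliable case and 1 bit for the noisy one; a reliability index can be built by normalizing such quantities.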
Pub Date: 2024-10-01 | Epub Date: 2024-08-02 | DOI: 10.3758/s13428-024-02458-5
Alice Xu, Ji Y Son, Catherine M Sandhofer
This paper introduces the A Library for Innovative Category Exemplars (ALICE) database, a resource that enhances research efficiency in cognitive and developmental studies by providing printable 3D objects representing 30 novel categories. Our research consists of three experiments validating the novelty and complexity of the objects in ALICE. Experiment 1 assessed the novelty of the objects through adult participants' subjective familiarity ratings and their agreement on object naming and descriptions; the results confirm the general novelty of the objects. Experiment 2 employed multidimensional scaling (MDS) to analyze perceived similarities between objects, revealing a three-dimensional structure based solely on shape, indicative of their complexity. Experiment 3 used two clustering techniques to categorize the objects: k-means clustering to create nonoverlapping global categories, and hierarchical clustering to allow global categories that overlap and have a hierarchical structure. Through stability tests, we verified the robustness of each clustering method and observed moderate to good consensus between them, affirming the strength of our dual approach in effectively and accurately delineating meaningful object categories. By offering easy access to customizable novel stimuli, ALICE provides a practical solution to the challenges of creating novel physical objects for experimental purposes.
Title: A library for innovative category exemplars (ALICE) database: Streamlining research with printable 3D novel objects. Journal: Behavior Research Methods. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362262/pdf/
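The k-means step used in Experiment 3 is standard Lloyd's algorithm: alternate between assigning points to their nearest centroid and moving each centroid to its cluster mean. A minimal sketch on invented 2D coordinates (the study clustered perceived-similarity coordinates from MDS):

```python
def kmeans(points, centroids, iters=20):
    """Plain Lloyd's algorithm on 2D points with caller-supplied initial centroids."""
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[d.index(min(d))].append(p)
        # Update step: centroids move to the mean of their cluster.
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
cents, clus = kmeans(pts, [(0, 0), (10, 10)])
```

Hierarchical clustering, the second technique, instead merges the closest clusters iteratively and yields a dendrogram that can be cut to allow the overlapping, nested categories described above.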
Pub Date: 2024-10-01 | Epub Date: 2024-03-19 | DOI: 10.3758/s13428-024-02365-9
Ruoxuan Li, Lijuan Wang
Causal-formative indicators are often used in social science research. To achieve identification in causal-formative indicator modeling, constraints must be applied. One conventional method is to constrain the weight of a single formative indicator to be 1; the choice of which indicator receives the fixed weight, however, may influence statistical inferences about the structural path coefficients from the causal-formative construct to outcomes. Another conventional method uses equal weights (e.g., 1), assuming that all indicators contribute equally to the latent construct, which can be a strong assumption. To address the limitations of these methods, we proposed an alternative constraint in which the sum of the weights is fixed to a constant. We analytically studied the relations and interpretations of the structural path coefficients under each constraint method, and the results showed that the proposed method yields better interpretations of the path coefficients. Simulation studies compared the performance of the weight constraint methods in causal-formative indicator modeling with one or two outcomes. The conventional methods produced higher biases in the path coefficient estimates than the proposed method, which showed negligible bias and satisfactory coverage rates in the studied conditions. This study emphasizes the importance of using an appropriate weight constraint method in causal-formative indicator modeling.
Title: Investigating weight constraint methods for causal-formative indicator modeling. Journal: Behavior Research Methods.
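The two identification strategies contrasted above amount to different rescalings of the same raw weight vector. A toy sketch (invented weights; the actual constraints are imposed during model estimation, not applied after the fact):

```python
def fix_one_weight(raw, index=0):
    """Conventional identification: rescale so one indicator's weight is 1."""
    return [w / raw[index] for w in raw]

def fix_weight_sum(raw, total=1.0):
    """Sum-constraint identification: rescale so the weights sum to a constant."""
    s = sum(raw)
    return [w * total / s for w in raw]

raw = [2.0, 1.0, 1.0]
print(fix_one_weight(raw))        # depends on which indicator is anchored
print(fix_one_weight(raw, 1))     # anchoring a different indicator rescales all paths
print(fix_weight_sum(raw))        # one scale regardless of the anchor choice
```

The point of the contrast: under the fix-one-weight constraint, the construct's scale (and hence the structural path coefficient) changes with the arbitrary choice of anchor indicator, whereas the sum constraint fixes a single scale for the construct.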
Pub Date: 2024-10-01 | Epub Date: 2024-03-25 | DOI: 10.3758/s13428-024-02388-2
Stefan Schneider, Raymond Hernandez, Doerte U Junghaenel, Haomiao Jin, Pey-Jiuan Lee, Hongxin Gao, Danny Maupin, Bart Orriens, Erik Meijer, Arthur A Stone
Questionnaires are ever-present in survey research. In this study, we examined whether an indirect indicator of general cognitive ability could be developed from response patterns in questionnaires. We drew on two established phenomena characterizing connections between cognitive ability and people's performance on basic cognitive tasks and examined whether they apply to questionnaire responses. (1) The worst performance rule (WPR) states that people's worst performance on multiple sequential tasks is more indicative of their cognitive ability than their average or best performance. (2) The task complexity hypothesis (TCH) suggests that relationships between cognitive ability and performance increase with task complexity. We conceptualized the items of a questionnaire as a series of cognitively demanding tasks. A graded response model was used to estimate respondents' performance on each item from the difference between the observed and model-predicted response ("response error" scores). Analyzing data from 102 items (21 questionnaires) collected from a large-scale nationally representative sample of people aged 50+ years, we found robust associations of cognitive ability with a person's largest but not smallest response error scores (supporting the WPR), and stronger associations of cognitive ability with response errors for more complex than for less complex questions (supporting the TCH). Results replicated across two independent samples and six assessment waves. A latent variable of response errors estimated for the most complex items correlated .50 with a latent cognitive ability factor, suggesting that response patterns can be used to extract a rough indicator of general cognitive ability in survey research.
Title: Can you tell people's cognitive ability level from their response patterns in questionnaires? Journal: Behavior Research Methods. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362444/pdf/
Pub Date: 2024-10-01 | Epub Date: 2024-05-06 | DOI: 10.3758/s13428-024-02429-w
Tanja Kutscher, Michael Eid
Rating scales are susceptible to response styles that undermine scale quality. Optimizing a rating scale can tailor it to individuals' cognitive abilities, thereby preventing response styles related to a suboptimal response format. However, the discrimination ability of individuals in a sample may vary, suggesting that different rating scales may suit different individuals. This study examines (1) whether response styles can be avoided when individuals are allowed to choose a rating scale and (2) whether the psychometric properties of self-chosen rating scales improve relative to given rating scales. To address these objectives, data from the flourishing scale were used as an illustrative example. Workers on Amazon's Mechanical Turk platform (N = 7042) completed an eight-item flourishing scale twice: (1) using a randomly assigned four-, six-, or 11-point rating scale, and (2) using a self-chosen rating scale. Applying the restrictive mixed generalized partial credit model (rmGPCM) allowed examination of category use across the conditions. Correlations with external variables were calculated to assess the effects of the rating scales on criterion validity. The results revealed consistent use of self-chosen rating scales, with approximately equal proportions of the three response styles. Ordinary response behavior was observed in 55-58% of individuals, an increase of 12-15% compared to assigned rating scales. The self-chosen rating scales also exhibited superior psychometric properties. The implications of these findings are discussed.
Title: Psychometric benefits of self-chosen rating scales over given rating scales. Journal: Behavior Research Methods. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362426/pdf/
Pub Date: 2024-10-01 | Epub Date: 2024-05-21 | DOI: 10.3758/s13428-024-02440-1
Valeria A Pfeifer, Trish D Chilton, Matthew D Grilli, Matthias R Mehl
Human transcription has long been the gold standard for preparing spoken language corpora for text analysis in psychology. However, this standard comes at considerable cost and creates barriers to quantitative spoken language analysis that recent advances in speech-to-text technology could address. The current study quantifies the accuracy of AI-generated transcripts relative to human-corrected transcripts across younger (n = 100) and older (n = 92) adults and two spoken language tasks. Further, it evaluates the validity of Linguistic Inquiry and Word Count (LIWC) features extracted from these two kinds of transcripts, as well as from transcripts specifically prepared for LIWC analyses via tagging. We find that, overall, AI-generated transcripts are highly accurate, with a word error rate of 2.50% to 3.36%, albeit slightly less accurate for younger than for older adults. LIWC features extracted from either type of transcript are highly correlated, while the tagging procedure significantly alters filler word categories. Based on these results, automatic speech-to-text appears ready for psychological language research using spoken language tasks in relatively quiet environments, unless filler words are of interest to researchers.
Title: How ready is speech-to-text for psychological language research? Evaluating the validity of AI-generated English transcripts for analyzing free-spoken responses in younger and older adults. Journal: Behavior Research Methods. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11365748/pdf/
Pub Date : 2024-10-01Epub Date: 2024-05-23DOI: 10.3758/s13428-024-02434-z
Serena Marchesi, Davide De Tommaso, Kyveli Kompatsiari, Yan Wu, Agnieszka Wykowska
In the last decade, scientists investigating human social cognition have started bringing traditional laboratory paradigms more "into the wild" to examine how socio-cognitive mechanisms of the human brain work in real-life settings. As this implies transferring 2D observational paradigms to 3D interactive environments, there is a risk of compromising experimental control. In this context, we propose a methodological approach that uses humanoid robots as proxies for social interaction partners and embeds them in experimental protocols that adapt classical paradigms of cognitive psychology to interactive scenarios. This allows for a relatively high degree of "naturalness" of interaction and excellent experimental control at the same time. Here, we present two case studies in which our methods and tools were applied and replicated across two different laboratories, namely the Italian Institute of Technology in Genova (Italy) and the Agency for Science, Technology and Research in Singapore. In the first case study, we present a replication of an interactive version of a gaze-cueing paradigm reported in Kompatsiari et al. (J Exp Psychol Gen 151(1):121-136, 2022). The second case study presents a replication of a "shared experience" paradigm reported in Marchesi et al. (Technol Mind Behav 3(3):11, 2022). As both studies replicate results across labs and different cultures, we argue that our methods allow for reliable and replicable setups, even though the protocols are complex and involve social interaction. We conclude that our approach can benefit the field of social cognition research and afford higher replicability, for example, in cross-cultural comparisons of social cognition mechanisms.
{"title":"Tools and methods to study and replicate experiments addressing human social cognition in interactive scenarios.","authors":"Serena Marchesi, Davide De Tommaso, Kyveli Kompatsiari, Yan Wu, Agnieszka Wykowska","doi":"10.3758/s13428-024-02434-z","DOIUrl":"10.3758/s13428-024-02434-z","url":null,"abstract":"<p><p>In the last decade, scientists investigating human social cognition have started bringing traditional laboratory paradigms more \"into the wild\" to examine how socio-cognitive mechanisms of the human brain work in real-life settings. As this implies transferring 2D observational paradigms to 3D interactive environments, there is a risk of compromising experimental control. In this context, we propose a methodological approach which uses humanoid robots as proxies of social interaction partners and embeds them in experimental protocols that adapt classical paradigms of cognitive psychology to interactive scenarios. This allows for a relatively high degree of \"naturalness\" of interaction and excellent experimental control at the same time. Here, we present two case studies where our methods and tools were applied and replicated across two different laboratories, namely the Italian Institute of Technology in Genova (Italy) and the Agency for Science, Technology and Research in Singapore. In the first case study, we present a replication of an interactive version of a gaze-cueing paradigm reported in Kompatsiari et al. (J Exp Psychol Gen 151(1):121-136, 2022). The second case study presents a replication of a \"shared experience\" paradigm reported in Marchesi et al. (Technol Mind Behav 3(3):11, 2022). As both studies replicate results across labs and different cultures, we argue that our methods allow for reliable and replicable setups, even though the protocols are complex and involve social interaction. 
We conclude that our approach can be of benefit to the research field of social cognition and grant higher replicability, for example, in cross-cultural comparisons of social cognition mechanisms.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":null,"pages":null},"PeriodicalIF":4.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362199/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141086535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}