Pub Date: 2024-12-28; DOI: 10.3758/s13428-024-02561-7
Marc Brysbaert, Gonzalo Martínez, Pedro Reviriego
This study investigates the potential of large language models (LLMs) to estimate the familiarity of words and multi-word expressions (MWEs). We validated LLM estimates for isolated words using existing human familiarity ratings and found strong correlations. LLM familiarity estimates performed even better in predicting lexical decision and naming performance in megastudies than the best available word frequency measures. We then applied LLM estimates to MWEs and found them effective at measuring the familiarity of these expressions. We have created a list of more than 400,000 English words and MWEs with LLM-generated familiarity estimates, which we hope will be a valuable resource for researchers. There is also a cleaned-up list of nearly 150,000 entries, excluding lesser-known stimuli, to streamline stimulus selection. Our findings highlight the advantages of LLM-based familiarity estimates, including their better performance than traditional word frequency measures (particularly for predicting word recognition accuracy), their ability to generalize to MWEs, their availability for large lists of words, and the ease of obtaining new estimates for all types of stimuli.
Title: Moving beyond word frequency based on tally counting: AI-generated familiarity estimates of words and phrases are an interesting additional index of language knowledge. Journal: Behavior Research Methods, 57(1), 28.
Pub Date: 2024-12-28; DOI: 10.3758/s13428-024-02567-1
Justin Hachenberger, Axel Mayer, Denny Kerkhoff, Friederike Eyssel, Stefan Fries, Tina B Lonsdorf, Hilmar Zech, Lorenz Deserno, Sakari Lemola
Following the (revised) latent state-trait theory, the present study investigates the within-subject reliability, occasion specificity, common consistency, and construct validity of cognitive control measures in an intensive longitudinal design. These indices were calculated by applying dynamic structural equation modeling while accounting for autoregressive effects and trait change. In two studies, participants completed two cognitive control tasks (Stroop and go/no-go) and answered questions about goal pursuit, self-control, executive functions, and situational aspects multiple times per day. The sample (aged 18-30 years in both studies) consisted of 21 participants (14 female) in the pilot study and 70 participants (48 female) in the main study. Findings indicated poor within-subject reliability for the Stroop task error rate and the reaction time difference between congruent and incongruent trials, and moderate to good within-subject reliability for the go/no-go task error rate and reaction time. Occasion specificity (the systematic variance accounted for by state residuals) was modest (between 1.4% and 11.1%) for the Stroop error rate and reaction time difference, and moderate (between 16.1% and 37.2%) for the go/no-go error rate and reaction time in the two studies. Common consistency (the variance accounted for by latent trait variables) was moderate to high for all of the investigated scores. Indicative of construct validity, the Stroop and go/no-go task error rates correlated positively with each other at the within- and between-subject levels. Within-subject correlations between task scores and subjective self-control measures were very small and mostly nonsignificant.
Title: Within-subject reliability, occasion specificity, and validity of fluctuations of the Stroop and go/no-go tasks in ecological momentary assessment. Journal: Behavior Research Methods, 57(1), 29. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11682018/pdf/
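As a reminder of how the indices named in the abstract above relate to one another, the classic latent state-trait variance decomposition can be written as follows. The symbols are assumptions chosen for illustration, and the study's revised, autoregressive formulation estimated with dynamic structural equation modeling refines these definitions and is not reproduced here.

```latex
% Classic LST decomposition (illustrative notation, not the paper's own):
% Y_{it}   observed score of person i at occasion t
% \xi_i    latent trait, \zeta_{it} latent state residual, \varepsilon_{it} measurement error
\begin{align}
Y_{it} &= \xi_i + \zeta_{it} + \varepsilon_{it} \\
\operatorname{Var}(Y_{it}) &= \operatorname{Var}(\xi_i) + \operatorname{Var}(\zeta_{it}) + \operatorname{Var}(\varepsilon_{it}) \\
\text{Consistency} &= \frac{\operatorname{Var}(\xi_i)}{\operatorname{Var}(Y_{it})}, \qquad
\text{Occasion specificity} = \frac{\operatorname{Var}(\zeta_{it})}{\operatorname{Var}(Y_{it})} \\
\text{Reliability} &= \text{Consistency} + \text{Occasion specificity}
\end{align}
```

Under this decomposition, low reliability means that most of the observed variance is attributed to measurement error rather than to trait or state components.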
Pub Date: 2024-12-28; DOI: 10.3758/s13428-024-02534-w
Emanuel Silva, Isabel C Lisboa, Nélson Costa
The vibration perception threshold (VPT) is the minimum amplitude required for conscious vibration perception. VPT assessments are essential in medical diagnostics, safety, and human-machine interaction technologies. However, factors like age, health conditions, and external variables affect VPTs. Various methodologies and distinct procedures have been used to assess VPTs, leading to challenges in establishing standardized protocols. Following the PRISMA methodology, this systematic review was conducted to answer the research question: "How are vibration perception thresholds assessed on the glabrous skin of the hands and fingers of healthy humans?" Searches were conducted across five databases to locate recent studies reporting data from VPT assessments, published in English between 2012 and 2023. Thirty-nine studies met the inclusion criteria. Data on study goals and various methodological aspects were categorized and analyzed. Information gaps were identified, and this review offers recommendations for future studies to enhance standardization and facilitate data comparison. This review also suggests directions for future research, aiming to improve our understanding of how humans perceive haptic information.
Title: How to determine hands' vibration perception thresholds - a systematic review. Journal: Behavior Research Methods, 57(1), 27. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11682013/pdf/
Pub Date: 2024-12-28; DOI: 10.3758/s13428-024-02548-4
Simona Amenta, Andrea Gregor de Varda, Pawel Mandera, Emmanuel Keuleers, Marc Brysbaert, Marco Marelli
Despite being widely spoken and studied by language and cognitive scientists, Italian lacks large resources of language processing data. The Italian Crowdsourcing Project (ICP) is a dataset of word recognition times and accuracy including responses to 130,465 words, which makes it the largest dataset of its kind item-wise. The data were collected in an online word knowledge task in which over 156,000 native speakers of Italian took part. We validated the ICP dataset by (1) showing that ICP reaction times correlate strongly (r = .78) with lexical decision latencies collected in a traditional lab experiment, (2) showing that the effects of major psycholinguistic variables (e.g., frequency and length) can be replicated in this dataset, and (3) replicating the effect of word prevalence, which we compute here for the first time for Italian. Given the inclusion of many inflectional forms of verbs, adjectives, and nouns, we further showcase the potential of this dataset by exploring two phenomena (inflectional entropy in verb paradigms and the clitic effect in isolated word recognition) that build on the peculiar properties of Italian. In this paper we present the ICP resource and release response times, accuracy, and prevalence estimates for all the words included.
Title: The Italian Crowdsourcing Project: Visual word recognition times for 130,495 Italian words. Journal: Behavior Research Methods, 57(1), 26.
Pub Date: 2024-12-28; DOI: 10.3758/s13428-024-02528-8
Yiu-Kei Tsang, Ming Yan, Jinger Pan, Megan Yin Kan Chan
The absence of explicit word boundaries is a distinctive characteristic of Chinese script that sets it apart from most alphabetic scripts and leads to word boundary disagreement among readers. Previous studies have examined how this feature may influence reading performance. However, further investigations are required to generate more ecologically valid and generalizable findings. To advance our understanding of the impact of word boundaries in Chinese reading, we introduce the Chinese Word Segmentation Agreement (CWSA) corpus. This corpus consists of 500 sentences, comprising 9813 character tokens and 1590 character types, and provides data on word segmentation agreement at each character position. The data revealed a high level of overall segmentation agreement (92%). However, participants disagreed on the position of word boundaries in 8.96% of the cases. Moreover, about 85% of the sentences contained at least one ambiguous word boundary. The character strings with high levels of disagreement were tentatively classified into three categories, namely the morphosyntactic type (e.g., "-"), modifier-head type (e.g., "-"), and others (e.g., "-"). Finally, the agreement scores also significantly influenced reading behaviors, as evidenced by analyses with published eye movement data. Specifically, a high level of disagreement was associated with longer single fixation durations. We discuss the implications of these results and highlight how the CWSA corpus can facilitate future research on word segmentation in Chinese reading.
Title: A corpus of Chinese word segmentation agreement. Journal: Behavior Research Methods, 57(1), 25. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11682008/pdf/
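To make the idea of segmentation agreement at each character position concrete, here is a minimal Python sketch. It assumes that agreement at a gap between two characters is the share of participants who made the majority choice (boundary or no boundary) at that gap; the corpus may define its scores differently. The function name `boundary_agreement` and the placeholder sentence "ABCD" are hypothetical.

```python
def boundary_agreement(segmentations):
    """Return one agreement score per gap between adjacent characters.

    `segmentations` is a list with one entry per participant; each entry is
    that participant's list of words, and all entries must join to the same
    sentence. The score for a gap is the proportion of participants who made
    the majority choice there (assumed definition, for illustration only).
    """
    sentence = "".join(segmentations[0])
    counts = [0] * (len(sentence) - 1)  # boundary votes per gap
    for words in segmentations:
        if "".join(words) != sentence:
            raise ValueError("all segmentations must cover the same sentence")
        position = 0
        for word in words[:-1]:  # the end of the sentence is not a gap
            position += len(word)
            counts[position - 1] += 1
    n = len(segmentations)
    return [max(c, n - c) / n for c in counts]


# Four hypothetical participants segmenting the placeholder string "ABCD".
example = [["AB", "CD"], ["AB", "C", "D"], ["AB", "CD"], ["A", "B", "CD"]]
print(boundary_agreement(example))
```

A gap where every participant makes the same choice scores 1.0, and scores approach 0.5 as participants split evenly.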
Pub Date: 2024-12-19; DOI: 10.3758/s13428-024-02533-x
Kyla McConnell, Florian Hintz, Antje S Meyer
Experimental psychologists and psycholinguists increasingly turn to online research for data collection due to the ease of sampling many diverse participants in parallel. Online research has shown promising validity and consistency, but is it suitable for all paradigms? Specifically, is it reliable enough for individual differences research? The current paper reports performance on 15 tasks from a psycholinguistic individual differences battery, including timed and untimed assessments of linguistic abilities, as well as domain-general skills. From a demographically homogeneous sample of young Dutch people, 149 participants took part in the lab study and 515 took part online. Our results indicate that there is no reason to assume that participants tested online will underperform compared with those tested in the lab, though they highlight the importance of motivation and the potential for external help (e.g., through looking up answers) online. Overall, we conclude that there is reason for optimism about the future of online research into individual differences.
Title: Individual differences in online research: Comparing lab-based and online administration of a psycholinguistic battery of linguistic and domain-general skills. Journal: Behavior Research Methods, 57(1), 22. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11659378/pdf/
Pub Date: 2024-12-19; DOI: 10.3758/s13428-024-02556-4
Robert C A Bendall, Sam Royle, James Dodds, Hugh Watmough, Jamie C Gillman, David Beevers, Simon Cassidy, Ben Short, Paige Metcalfe, Michael J Lomas, Draco Graham-Kevan, Samantha E A Gregory
The growing interest in harnessing natural environments to enhance mental health, including cognitive functioning and mood, has yielded encouraging results in initial studies. Given that images of nature have demonstrated similar benefits, they are frequently employed as proxies for real-world environments. To ensure precision and control, researchers often manipulate images of natural environments. The effectiveness of this approach relies on standardization of imagery, and therefore, inconsistency in methods and stimuli has limited the synthesis of research findings in the area. Responding to these limitations, the current paper introduces the Salford Nature Environments Database (SNED), a standardized database of natural images created to support ongoing research into the benefits of nature exposure. The SNED is currently the most comprehensive nature image database available, comprising 500 high-quality, standardized photographs capturing a variety of natural environments across the seasons. It also includes normative scores for user-rated (801 participants) characteristics of fascination, refuge and prospect, compatibility, preference, valence, arousal, and approach-avoidance, as well as data on physical properties of the images, specifically luminance, contrast, entropy, CIELAB colour space parameter values, and fractal dimensions. All image ratings and content details, along with participant details, are freely available online. Researchers are encouraged to use this open-access database in accordance with the specific aims and design of their study. The SNED represents a valuable resource for continued research in areas such as nature-based therapy, social prescribing, and experimental approaches investigating underlying mechanisms that help explain how natural environments improve mental health and wellbeing.
Title: The Salford Nature Environments Database (SNED): an open-access database of standardized high-quality pictures from natural environments. Journal: Behavior Research Methods, 57(1), 21. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11659377/pdf/
Pub Date: 2024-12-19; DOI: 10.3758/s13428-024-02564-4
Ignace T C Hooge, Roy S Hessels, Diederick C Niehorster, Richard Andersson, Marta K Skrok, Robert Konklewski, Patrycjusz Stremplewski, Maciej Nowakowski, Szymon Tamborski, Anna Szkulmowska, Maciej Szkulmowski, Marcus Nyström
Irrespective of the precision, the inaccuracy of a pupil-based eye tracker is about 0.5°. This paper examines two factors that potentially increase the inaccuracy of the gaze signal: (1) pupil-size changes and the pupil-size artefact (PSA), and (2) the putative inability of experienced individuals to precisely refixate a visual target. Experiment 1 uses a traditional pupil-CR eye tracker, while Experiment 2 employs a retinal eye tracker, the FreezeEye tracker, eliminating the pupil-based estimation. Results reveal that the PSA significantly affects gaze accuracy, introducing inaccuracies of up to 0.5° during calibration and validation. Corrections based on the relation between pupil size and apparent gaze shift substantially reduce inaccuracies, underscoring the PSA's influence on eye-tracking quality. Conversely, Experiment 2 demonstrates humans' precise refixation abilities, suggesting that the accuracy of the gaze signal is not limited by human refixation inconsistencies.
Title: Eye tracker calibration: How well can humans refixate a target? Journal: Behavior Research Methods, 57(1), 23. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11659352/pdf/
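The abstract above mentions corrections based on the relation between pupil size and apparent gaze shift. One way such a correction could work is sketched below in Python, assuming the relation is linear and is fitted per participant while a fixed target is fixated. The linear model, the function names `fit_line` and `correct_gaze`, and the example values are all assumptions for illustration, not the authors' procedure.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correct_gaze(gaze, pupil, slope, reference_pupil):
    """Remove the gaze shift predicted from pupil size.

    Each gaze sample is shifted back by the amount the fitted line predicts
    for the difference between the current pupil size and the pupil size at
    calibration (`reference_pupil`).
    """
    return [g - slope * (p - reference_pupil) for g, p in zip(gaze, pupil)]


# Hypothetical samples: pupil diameter (mm) and apparent horizontal gaze (deg)
# recorded while the participant keeps fixating the same target.
pupil = [3.0, 4.0, 5.0, 6.0]
gaze = [-0.1, 0.0, 0.1, 0.2]
slope, intercept = fit_line(pupil, gaze)
corrected = correct_gaze(gaze, pupil, slope, reference_pupil=4.0)
print(slope, corrected)
```

In this toy data the apparent gaze drifts with pupil size even though the eye does not move, so the corrected signal stays close to the calibrated position.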
Pub Date: 2024-12-19; DOI: 10.3758/s13428-024-02511-3
Bruno Michelot, Alexandra Corneyllie, Marc Thevenet, Stefan Duffner, Fabien Perrin
Artificial intelligence techniques offer promising avenues for exploring human body features from videos, yet no freely accessible tool has reliably provided holistic and fine-grained behavioral analyses to date. To address this, we developed a machine learning tool based on a two-level approach: a first, lower level of processing that uses computer vision to extract fine-grained and comprehensive behavioral features such as skeleton or facial points, gaze, and action units; and a second level of machine learning classification coupled with explainability, which provides modularity, to determine which behavioral features are triggered by specific environments. To validate our tool, we filmed 16 participants across six conditions, varying according to the presence of a person ("Pers"), a sound ("Snd"), or silence ("Rest"), and according to emotional levels using self-referential ("Self") and control ("Ctrl") stimuli. We demonstrated the effectiveness of our approach by extracting and correcting behavior from videos using two computer vision software packages (OpenPose and OpenFace) and by training two algorithms (XGBoost and long short-term memory [LSTM]) to differentiate between experimental conditions. High classification rates were achieved for "Pers" conditions versus "Snd" or "Rest" (AUC = 0.8-0.9), with explainability revealing action units and gaze as key features. Additionally, moderate classification rates were attained for "Snd" versus "Rest" (AUC = 0.7), attributed to action units, limbs, and head points, as well as for "Self" versus "Ctrl" (AUC = 0.7-0.8), due to facial points. These findings were consistent with a more conventional hypothesis-driven approach. Overall, our study suggests that our tool is well suited for holistic and fine-grained behavioral analysis and offers modularity for extension into more complex naturalistic environments.
Title: A modular machine learning tool for holistic and fine-grained behavioral analysis. Journal: Behavior Research Methods, 57(1), 24.
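The classification rates above are reported as AUC (area under the ROC curve). As a minimal sketch of what that number measures, and not the authors' pipeline, the following computes AUC directly as the probability that a randomly chosen positive sample receives a higher classifier score than a randomly chosen negative one. The function name `roc_auc` and the labels and scores are made up for illustration.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC for binary labels (1 = positive, 0 = negative).

    Computed as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative data only: 1 could stand for a "Pers" trial, 0 for a "Rest" trial.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two conditions, which is why values of 0.8-0.9 are described as high and 0.7 as moderate.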
Pub Date: 2024-12-18 DOI: 10.3758/s13428-024-02566-2
Anna Eifert, Christian Julmi
This article develops a comprehensive database comprising 5956 German affective norms specifically tailored for the study of organizational atmospheres through computational verbal language analysis. This dictionary adopts both dimensional and categorical approaches. The theoretical foundation of this study is the circumplex model of affective atmospheres. Similar to established methodologies, each word is rated based on the dimensions of valence and arousal. Going beyond the dimensional approach, this article introduces a classification system with 11 distinct atmospheric categories, assigning the words to their corresponding categories. This dictionary represents the first attempt to apply computer-aided text analysis (CATA) to the study of organizational atmospheres, providing a practical tool to support research in this developing area.
Title: 5956 German affective norms for atmospheres in organizations (GANAiO). Journal: Behavior Research Methods, 57(1), 20. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11655590/pdf/
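Dictionary-based text analysis with valence and arousal norms typically works by looking up each word of a text in the norm list and averaging the ratings of the words found. The sketch below illustrates that general idea only. The lexicon entries, their ratings, the rating scale, and the function name `score_text` are hypothetical placeholders and are not taken from GANAiO.

```python
import re

# Hypothetical placeholder entries: these words and ratings are invented
# for illustration and do NOT come from the GANAiO database.
LEXICON = {
    "freundlich": {"valence": 0.8, "arousal": 0.3},
    "hektisch": {"valence": -0.4, "arousal": 0.9},
    "ruhig": {"valence": 0.5, "arousal": -0.6},
}


def score_text(text, lexicon):
    """Average the valence and arousal of all tokens found in the lexicon.

    Returns None when no token of the text appears in the lexicon.
    """
    tokens = re.findall(r"\w+", text.lower())
    hits = [lexicon[t] for t in tokens if t in lexicon]
    if not hits:
        return None
    n = len(hits)
    return {
        "valence": sum(h["valence"] for h in hits) / n,
        "arousal": sum(h["arousal"] for h in hits) / n,
        "matched": n,
    }


result = score_text("Das Team ist freundlich, aber der Alltag ist hektisch.", LEXICON)
print(result)
```

A categorical analysis would follow the same lookup pattern, counting how many matched words fall into each atmospheric category instead of averaging ratings.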