Pub Date: 2025-11-11 | DOI: 10.3758/s13428-025-02878-x
Inbal Kimchi, Sascha Schroeder, Noam Siegelman
The importance or centrality of a linguistic unit to a larger unit's meaning is known to affect reading behavior. However, there is an ongoing debate on how to quantify a unit's degree of importance or centrality, with previous quantifications using either subjective ratings or computational solutions with limited interpretability. Here we introduce a novel measure, which we term "informativeness", to assess the significance of a word to the meaning of the sentence in which it appears. Our measure is based on the comparison of vectorial representations of the full sentence with those of a revised sentence without the target word, resulting in an easily interpretable and objective quantification. We show that our new measure correlates in expected ways with other psycholinguistic variables (e.g., frequency, length, predictability), and, importantly, uniquely predicts eye-movement reading behavior in large-scale datasets of first (L1) and second language (L2) readers (from the Multilingual Eye-tracking Corpus, MECO). We also show that the effects of informativeness generalize to diverse writing systems, and are stronger for poorer than better readers. Together, our work provides new avenues for investigating informativeness effects, towards a deeper understanding of how informativeness impacts reading behavior.
{"title":"Quantifying word informativeness and its impact on eye-movement reading behavior: Cross-linguistic variability and individual differences.","authors":"Inbal Kimchi, Sascha Schroeder, Noam Siegelman","doi":"10.3758/s13428-025-02878-x","DOIUrl":"10.3758/s13428-025-02878-x","url":null,"abstract":"<p><p>The importance or centrality of a linguistic unit to a larger unit's meaning is known to affect reading behavior. However, there is an ongoing debate on how to quantify a unit's degree of importance or centrality, with previous quantifications using either subjective ratings or computational solutions with limited interpretability. Here we introduce a novel measure, which we term \"informativeness\", to assess the significance of a word to the meaning of the sentence in which it appears. Our measure is based on the comparison of vectorial representations of the full sentence with a revised sentence without the target word, resulting in an easily interpretable and objective quantification. We show that our new measure correlates in expected ways with other psycholinguistic variables (e.g., frequency, length, predictability), and, importantly, uniquely predicts eye-movement reading behavior in large-scale datasets of first (L1) and second language (L2) readers (from the Multilingual Eye-tracking Corpus, MECO). We also show that the effects of informativeness generalize to diverse writing systems, and are stronger for poorer than better readers. Together, our work provides new avenues for investigating informativeness effects, towards a deeper understanding of the way it impacts reading behavior.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"343"},"PeriodicalIF":3.9,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12605455/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145494451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-11 | DOI: 10.3758/s13428-025-02861-6
Nikola Sekulovski, Tessa F Blanken, Jonas M B Haslbeck, Maarten Marsman
Graphical models have become an important method for studying the network structure of multivariate psychological data. Accurate recovery of the underlying network structure is paramount and requires that the models are appropriate for the data at hand. Traditionally, Gaussian graphical models for continuous data and Ising models for binary data have dominated the literature. However, psychological research often relies on ordinal data from Likert scale items, creating a model-data mismatch. This paper examines the effect of dichotomizing ordinal variables on network recovery, as opposed to analyzing the data at its original level of measurement, using a Bayesian analysis of the ordinal Markov random field model. This model is implemented in the R package bgms. Our analysis shows that dichotomization results in a loss of information, which affects the accuracy of network recovery. This is particularly true when considering the interplay between the dichotomization cutoffs used and the distribution of the ordinal categories. In addition, we demonstrate a difference in accuracy when using dichotomized data, depending on whether edges are included or excluded in the true network, which highlights the effectiveness of the ordinal model in recovering conditional independence relationships. These findings underscore the importance of using models that deal directly with ordinal data to ensure more reliable and valid inferred network structures in psychological research.
{"title":"The impact of dichotomization on network recovery.","authors":"Nikola Sekulovski, Tessa F Blanken, Jonas M B Haslbeck, Maarten Marsman","doi":"10.3758/s13428-025-02861-6","DOIUrl":"10.3758/s13428-025-02861-6","url":null,"abstract":"<p><p>Graphical models have become an important method for studying the network structure of multivariate psychological data. Accurate recovery of the underlying network structure is paramount and requires that the models are appropriate for the data at hand. Traditionally, Gaussian graphical models for continuous data and Ising models for binary data have dominated the literature. However, psychological research often relies on ordinal data from Likert scale items, creating a model-data mismatch. This paper examines the effect of dichotomizing ordinal variables on network recovery, as opposed to analyzing the data at its original level of measurement, using a Bayesian analysis of the ordinal Markov random field model. This model is implemented in the R package bgms. Our analysis shows that dichotomization results in a loss of information, which affects the accuracy of network recovery. This is particularly true when considering the interplay between the dichotomization cutoffs used and the distribution of the ordinal categories. In addition, we demonstrate a difference in accuracy when using dichotomized data, depending on whether edges are included or excluded in the true network, which highlights the effectiveness of the ordinal model in recovering conditional independence relationships. These findings underscore the importance of using models that deal directly with ordinal data to ensure more reliable and valid inferred network structures in psychological research.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"342"},"PeriodicalIF":3.9,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12605567/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145494392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02867-0
Sebastián Muñoz, Vladimir Maksimenko, Bastian Henriquez-Jara, Prateek Bansal, Omar David Perez
Eye-tracking has gained considerable attention across multiple research domains. Recently, web-based eye-tracking has become feasible, demonstrating reliable performance in perceptual and cognitive tasks. However, it has not yet been systematically evaluated in decision-making research. Here we compare a laboratory-based eye tracker (the EyeLink 1000 Plus) with a webcam-based method (WebGazer) across two discrete-choice experiments. We systematically manipulated display size to approximate common device classes (monitor, laptop, tablet, mobile) and task complexity (simple vs. complex choice matrices). We find that on larger displays and simpler tasks, WebGazer produces gaze patterns and parameter inferences from computational models of behavior that are comparable to those obtained with EyeLink. However, reliability diminishes on smaller displays and with more complex choice matrices. These results provide the first systematic evaluation of web-based eye tracking for decision-making research and offer practical guidance regarding its viability for online behavioral studies.
{"title":"In-lab versus web-based eye-tracking in decision-making: A systematic comparison on multiple display-size conditions mimicking common electronic devices.","authors":"Sebastián Muñoz, Vladimir Maksimenko, Bastian Henriquez-Jara, Prateek Bansal, Omar David Perez","doi":"10.3758/s13428-025-02867-0","DOIUrl":"10.3758/s13428-025-02867-0","url":null,"abstract":"<p><p>Eye-tracking has gained considerable attention across multiple research domains. Recently, web-based eye-tracking has become feasible, demonstrating reliable performance in perceptual and cognitive tasks. However, its systematic evaluation in decision-making remains unknown. Here we compare a laboratory-based eye tracker (the EyeLink 1000 Plus) with a webcam-based method (WebGazer) across two discrete-choice experiments. We systematically manipulated display size to approximate common device classes (monitor, laptop, tablet, mobile) and task complexity (simple vs. complex choice matrices). We find that on larger displays and simpler tasks, WebGazer produces gaze patterns and parameter inferences from computational models of behavior comparable to EyeLink. However, reliability diminishes on smaller displays and with more complex choice matrices. These results provide the first systematic evaluation of web-based eye tracking for decision-making research and offer practical guidance regarding its viability for online behavioral studies.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"339"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02801-4
Hanna Rajh-Weber, Stefan Ernest Huber, Martin Arendasy
Selecting an appropriate statistical method is a challenge frequently encountered by applied researchers, especially if assumptions for classical, parametric approaches are violated. To provide some guidelines and support, we compared classical hypothesis tests, with their typical distributional assumptions of normality and homoskedasticity, with common and easily accessible alternative inference methods (HC3, HC4, and six bootstrap methods) in the framework of ordinary least squares (OLS) regression. The methods' performance was assessed for four different regression models with varying levels of non-normality and heteroskedasticity of errors, and for five different sample sizes ranging from 25 to 500 cases. For each scenario, 10,000 samples of observations were generated. Type I error and coverage rates, power, and standard error bias were examined to assess the methods' performance. No method considered here performed satisfactorily on all accounts. Using HC3 or HC4 standard errors, or a wild bootstrap procedure with percentile confidence intervals, could yield reliable results in many, but not all, scenarios. We suggest that, in the case of assumption violations, researchers might refer to a method that performed best in a scenario most similar to their data situation. To aid the selection of an appropriate method, we provide tables comparing relative performances in all considered scenarios.
{"title":"A practice-oriented guide to statistical inference in linear modeling for non-normal or heteroskedastic error distributions.","authors":"Hanna Rajh-Weber, Stefan Ernest Huber, Martin Arendasy","doi":"10.3758/s13428-025-02801-4","DOIUrl":"10.3758/s13428-025-02801-4","url":null,"abstract":"<p><p>Selecting an appropriate statistical method is a challenge frequently encountered by applied researchers, especially if assumptions for classical, parametric approaches are violated. To provide some guidelines and support, we compared classical hypothesis tests with their typical distributional assumptions of normality and homoskedasticity with common and easily accessible alternative inference methods (HC3, HC4, and six bootstrap methods) in the framework of ordinary least squares (OLS) regression. The method's performance was assessed for four different regression models with varying levels of non-normality and heteroskedasticity of errors, and for five different sample sizes ranging from 25 to 500 cases. For each scenario, 10,000 samples of observations were generated. Type I error and coverage rates, power, and standard error bias were examined to assess the methods' performance. No method considered here performed satisfactorily on all accounts. Using HC3 or HC4 standard errors, or a wild bootstrap procedure with percentile confidence intervals, could yield reliable results in many, but not all, scenarios. We suppose that, in the case of assumption violations, researchers might refer to a method that performed best in a scenario most similar to their data situation. To aid the selection of an appropriate method, we provide tables comparing relative performances in all considered scenarios.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"338"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602623/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02786-0
Serena Castellotti, Irene Petrizzo, Roberto Arrighi, Elvio Blini
The measurement of pupil size is a classic tool in psychophysiology, but its popularity has recently surged due to the rapid developments of the eye-tracking industry. Concurrently, several authors have outlined a wealth of strategies for tackling pupillary recordings analytically. The consensus is that the "temporal" aspect of changes in pupil size is key, and that the analytical approach should be mindful of the temporal factor. Here we take a more radical stance on the matter by suggesting that, by the time significant changes in pupil size are detected, it is already too late. We suggest that these changes are indeed the result of distinct, core physiological processes that originate several hundreds of milliseconds before that moment and altogether shape the observed signal. These processes can be recovered indirectly by leveraging dimensionality reduction techniques. Here we therefore outline key concepts of temporal principal components analysis and related rotations to show that they reveal a latent, low-dimensional space that represents these processes very efficiently: a pupillary manifold. We elaborate on why assessing the pupillary manifold provides an alternative, appealing analytical solution for data analysis. In particular, dimensionality reduction returns scores that are (1) mindful of the relevant physiology underlying the observed changes in pupil size, (2) extremely handy and manageable for statistical modelling, and (3) devoid of several arbitrary choices. We elaborate on these points in the form of a tutorial paper for the functions provided in the accompanying R library "Pupilla."
{"title":"Dimensionality reduction techniques in pupillometry research: A primer for behavioral scientists.","authors":"Serena Castellotti, Irene Petrizzo, Roberto Arrighi, Elvio Blini","doi":"10.3758/s13428-025-02786-0","DOIUrl":"10.3758/s13428-025-02786-0","url":null,"abstract":"<p><p>The measurement of pupil size is a classic tool in psychophysiology, but its popularity has recently surged due to the rapid developments of the eye-tracking industry. Concurrently, several authors have outlined a wealth of strategies for tackling pupillary recordings analytically. The consensus is that the \"temporal\" aspect of changes in pupil size is key, and that the analytical approach should be mindful of the temporal factor. Here we take a more radical stance on the matter by suggesting that, by the time significant changes in pupil size are detected, it is already too late. We suggest that these changes are indeed the result of distinct, core physiological processes that originate several hundreds of milliseconds before that moment and altogether shape the observed signal. These processes can be recovered indirectly by leveraging dimensionality reduction techniques. Here we therefore outline key concepts of temporal principal components analysis and related rotations to show that they reveal a latent, low-dimensional space that represents these processes very efficiently: a pupillary manifold. We elaborate on why assessing the pupillary manifold provides an alternative, appealing analytical solution for data analysis. In particular, dimensionality reduction returns scores that are (1) mindful of the relevant physiology underlying the observed changes in pupil size, (2) extremely handy and manageable for statistical modelling, and (3) devoid of several arbitrary choices. We elaborate on these points in the form of a tutorial paper for the functions provided in the accompanying R library \"Pupilla.\"</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"337"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602682/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02852-7
Cameron S Kay
Several prior studies have used advanced methodological techniques to demonstrate that there is an issue with the quality of data that can be collected on Amazon's Mechanical Turk (MTurk). The goal of the present project was to provide an accessible demonstration of this issue. We administered 27 semantic antonyms, pairs of items that assess clearly contradictory content (e.g., "I talk a lot" and "I rarely talk"), to samples drawn from Connect (N1 = 100), Prolific (N2 = 100), and MTurk (N3 = 400; N4 = 600). Despite most of these item pairs being negatively correlated on Connect and Prolific, over 96% were positively correlated on MTurk. This issue could not be remedied by screening the data using common attention check measures nor by recruiting only "high-productivity" and "high-reputation" participants. These findings provide clear evidence that data collected on MTurk simply cannot be trusted.
{"title":"Why you shouldn't trust data collected on MTurk.","authors":"Cameron S Kay","doi":"10.3758/s13428-025-02852-7","DOIUrl":"10.3758/s13428-025-02852-7","url":null,"abstract":"<p><p>Several prior studies have used advanced methodological techniques to demonstrate that there is an issue with the quality of data that can be collected on Amazon's Mechanical Turk (MTurk). The goal of the present project was to provide an accessible demonstration of this issue. We administered 27 semantic antonyms-pairs of items that assess clearly contradictory content (e.g., \"I talk a lot\" and \"I rarely talk\")-to samples drawn from Connect (N<sub>1</sub> = 100), Prolific (N<sub>2</sub> = 100), and MTurk (N<sub>3</sub> = 400; N<sub>4</sub> = 600). Despite most of these item pairs being negatively correlated on Connect and Prolific, over 96% were positively correlated on MTurk. This issue could not be remedied by screening the data using common attention check measures nor by recruiting only \"high-productivity\" and \"high-reputation\" participants. These findings provide clear evidence that data collected on MTurk simply cannot be trusted.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"340"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02820-1
Jesse Boot, Jill de Ron, Jonas Haslbeck, Sacha Epskamp
In psychological studies, it is common practice to select a sample based on the sum score of the modeled variables (e.g., based on symptom severity when investigating the associations between those same symptoms). However, this practice introduces bias if the sum score selection imperfectly defines the population of interest. Here, we propose a correction for this type of selection bias in the Ising model, a popular network model for binary data. Our correction can be applied when one wants to obtain (1) full population estimates when only the sum score subset of the data is available, and (2) improved estimates of a subpopulation, if we observe a mixture of populations that differ from each other in the sum score. In a simulation study, we verify that our correction recovers the network structure of the desired population after a sum score selection using both a node-wise regression and a multivariate estimation of the Ising model. In an example, we show how our correction can be used in practice using empirical data on symptoms of major depression from the National Comorbidity Survey Replication (N = 9,282). We implemented our correction in four commonly used R packages for estimating the Ising model, namely IsingFit, IsingSampler, psychonetrics, and bootnet.
{"title":"Correcting for selection bias after conditioning on a sum score in the Ising model.","authors":"Jesse Boot, Jill de Ron, Jonas Haslbeck, Sacha Epskamp","doi":"10.3758/s13428-025-02820-1","DOIUrl":"10.3758/s13428-025-02820-1","url":null,"abstract":"<p><p>In psychological studies, it is common practice to select a sample based on the sum score of the modeled variables (e.g., based on symptom severity when investigating the associations between those same symptoms). However, this practice introduces bias if the sum score selection imperfectly defines the population of interest. Here, we propose a correction for this type of selection bias in the Ising model, a popular network model for binary data. Possible applications of our correction are when one wants to obtain (1) full population estimates when only the sum score subset of the data is available, and (2) improved estimates of a subpopulation, if we observe a mixture of populations that differ from each other in the sum score. In a simulation study, we verify that our correction recovers the network structure of the desired population after a sum score selection using both a node-wise regression and a multivariate estimation of the Ising model. In an example, we show how our correction can be used in practice using empirical data on symptoms of major depression from the National Comorbidity Study Replication (N = 9,282). We implemented our correction in four commonly used R packages for estimating the Ising model, namely IsingFit, IsingSampler, psychonetrics, and bootnet.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"341"},"PeriodicalIF":3.9,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602587/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145487712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02868-z
Yu Wang, Wen Qu
Natural language is a primary medium for expressing thoughts and emotions, making text analysis a vital tool in psychological research. It enables insights into personality traits, mental health, and sentiment in interpersonal communication. Traditional approaches, such as human coding, dictionary-based methods, or training models from scratch, often suffer from limitations, including inefficiency, incomplete coverage, or high data requirements. This tutorial introduces the pretrain-finetune paradigm, a transformative approach in text analysis and natural language processing (NLP) that leverages large pretrained language models. Unlike conventional methods, this paradigm enables efficient fine-tuning even with limited labeled data, making it particularly valuable for social science research where annotated samples are scarce. Our tutorial offers a comprehensive introduction to the pretrain-finetune framework, beginning with core concepts of pretraining and fine-tuning, followed by hands-on exercises with real-world applications and the introduction of finetuneR, an R package developed to make these methods accessible to R users, and concluding with a discussion of common misconceptions in existing resources and best practices. We demonstrate its effectiveness across diverse tasks, including multi-class classification and regression, showing its advantages over traditional methods, feature extraction-based approaches, and GPT-based strategies. By emphasizing its efficiency, accessibility, and superior performance, this tutorial aims to encourage broader adoption of the pretrain-finetune paradigm in psychological and behavioral research.
{"title":"A tutorial on fine-tuning pretrained language models: Applications in social and behavioral science research.","authors":"Yu Wang, Wen Qu","doi":"10.3758/s13428-025-02868-z","DOIUrl":"10.3758/s13428-025-02868-z","url":null,"abstract":"<p><p>Natural language is a primary medium for expressing thoughts and emotions, making text analysis a vital tool in psychological research. It enables insights into personality traits, mental health, and sentiment in interpersonal communication. Traditional approaches - such as human coding, dictionary-based methods, or training models from scratch - often suffer from limitations, including inefficiency, incomplete coverage, or high data requirements. This tutorial introduces the pretrain-finetune paradigm, a transformative approach in text analysis and natural language processing (NLP) that leverages large pretrained language models. Unlike conventional methods, this paradigm enables efficient fine-tuning even with limited labeled data, making it particularly valuable for social science research where annotated samples are scarce. Our tutorial offers a comprehensive introduction to the pretrain-finetune framework, beginning with core concepts of pretraining and fine-tuning, followed by hands-on exercises with real-world applications, the introduction of finetuneR, an R package developed to make these methods accessible to R users, and concluding with a discussion of common misconceptions in existing resources and best practices. We demonstrate its effectiveness across diverse tasks, including multi-class classification and regression, showing its advantages over traditional methods, feature extraction-based approaches, and GPT-based strategies. By emphasizing its efficiency, accessibility, and superior performance, this tutorial aims to encourage broader adoption of the pretrain-finetune paradigm in psychological and behavioral research.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"336"},"PeriodicalIF":3.9,"publicationDate":"2025-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145450662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02819-8
Nir Ofir, Ayelet N Landau
Multiple systems in the brain track the passage of time and can adapt their activity to temporal requirements. While the neural implementation of timing varies widely between neural substrates and behavioral tasks, at the algorithmic level, many of these behaviors can be described using drift-diffusion models of decision-making. In this work, we develop a drift-diffusion model to fit performance in the temporal generalization task, in which participants are required to categorize an interval as being the same or different compared to a standard, or reference, duration. The model includes a drift-diffusion process which starts with interval onset, representing the internal estimate of elapsed duration, and two boundaries. If the drift-diffusion process at interval offset is between the boundaries, the interval is categorized as equal to the standard. If it is below the lower boundary or above the upper boundary, the interval is categorized as different. This model outperformed previous models in fitting the data of single participants and in parameter recovery analyses. We also used the drift-diffusion model to analyze data from two experiments, one comparing performance between vision and audition and another examining the effect of learning. We found that decision boundaries can be modified independently: While the upper boundary was higher in vision than in audition, the lower boundary decreased with learning in the task. In both experiments, timing noise was positively correlated with upper boundaries across participants, which reflects an accuracy-maximizing strategy in the task.
{"title":"A drift-diffusion model of temporal generalization outperforms existing models and captures modality differences and learning effects.","authors":"Nir Ofir, Ayelet N Landau","doi":"10.3758/s13428-025-02819-8","DOIUrl":"10.3758/s13428-025-02819-8","url":null,"abstract":"<p><p>Multiple systems in the brain track the passage of time and can adapt their activity to temporal requirements. While the neural implementation of timing varies widely between neural substrates and behavioral tasks, at the algorithmic level, many of these behaviors can be described using drift-diffusion models of decision-making. In this work, wedevelop a drift-diffusion model to fit performance in the temporal generalization task, in which participants are required to categorize an interval as being the same or different compared to a standard, or reference, duration. The model includes a drift-diffusion process which starts with interval onset, representing the internal estimate of elapsed duration, and two boundaries. If the drift-diffusion process at interval offset is between the boundaries, the interval is categorized as equal to the standard. If it is below the lower boundary or above the upper boundary, the interval is categorized as different. This model outperformed previous models in fitting the data of single participants and in parameter recovery analyses. We also used the drift-diffusion model to analyze data from two experiments, one comparing performance between vision and audition and another examining the effect of learning. We found that decision boundaries can be modified independently: While the upper boundary was higher in vision than in audition, the lower boundary decreased with learning in the task. In both experiments, timing noise was positively correlated with upper boundaries across participants, which reflects an accuracy-maximizing strategy in the task.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"334"},"PeriodicalIF":3.9,"publicationDate":"2025-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12589256/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145450735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02853-6
Xiao-Yang Sui, Jia-Wen Niu, Xiaoqian Liu, Li-Lin Rao
Probabilities are typically expressed in two forms: numerical (e.g., 50%) and verbal (e.g., likely). In this regard, understanding how verbal probabilities map to their numerical equivalents is crucial for examining the probabilistic language used in various texts. This study addresses this issue by introducing the Chinese Lexicon of Verbal Probability (CLVP), comprising 343 verbal probability phrases that are each paired with corresponding numerical probabilities, membership functions, and frequency data from three corpora. We analyze the distribution of subjective values of verbal probability phrases in Chinese based on the CLVP, compare them with their English counterparts, and create a benchmark of seven high-frequency verbal probability phrases for organizational use. Overall, this study provides a valuable tool for converting verbal probabilities into numerical equivalents, contributing to cross-linguistic and cross-cultural research.
{"title":"Bridging numerical and verbal probabilities: Construction and application of the Chinese Lexicon of Verbal Probability.","authors":"Xiao-Yang Sui, Jia-Wen Niu, Xiaoqian Liu, Li-Lin Rao","doi":"10.3758/s13428-025-02853-6","DOIUrl":"10.3758/s13428-025-02853-6","url":null,"abstract":"<p><p>Probabilities are typically expressed in two forms: numerical (e.g., 50%) and verbal (e.g., likely). In this regard, understanding how verbal probabilities map to their numerical equivalents is crucial for examining the probabilistic language used in various texts. This study addresses this issue by introducing the Chinese Lexicon of Verbal Probability (CLVP), comprising 343 verbal probability phrases that are each paired with corresponding numerical probabilities, membership functions, and frequency data from three corpora. We analyze the distribution of subjective values of verbal probability phrases in Chinese based on the CLVP, compare them with their English counterparts, and create a benchmark of seven high-frequency verbal probability phrases for organizational use. Overall, this study provides a valuable tool for converting verbal probabilities into numerical equivalents, contributing to cross-linguistic and cross-cultural research.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"335"},"PeriodicalIF":3.9,"publicationDate":"2025-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145450721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}