The Beijing Sentence Corpus II: A cross-script comparison between traditional and simplified Chinese sentence reading.
Behavior Research Methods, 57(2), 60. Pub Date: 2025-01-17. DOI: 10.3758/s13428-024-02523-z
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11748476/pdf/
Ming Yan, Jinger Pan, Reinhold Kliegl
We introduce a sentence corpus with eye-movement data in traditional Chinese (TC), based on the original Beijing Sentence Corpus (BSC) in simplified Chinese (SC). The most noticeable difference between the TC and SC character sets is their visual complexity. Reaction-time corpora exist for isolated TC character/word lexical decision and naming tasks; however, until now, no natural TC sentence-reading corpus with recorded eye movements has been available to the general public. We report effects of word frequency, visual complexity, and predictability on fixation location and duration, based on eye movements from 60 native TC readers. In addition, because the current BSC-II sentences are nearly identical to the original BSC sentences, we report similarities and differences in the linguistic influences on eye movements for the two varieties of written Chinese. The results shed light on how visual complexity affects eye movements. Together, the two sentence corpora comprise a useful tool for establishing cross-script similarities and differences between TC and SC.
{"title":"The Beijing Sentence Corpus II: A cross-script comparison between traditional and simplified Chinese sentence reading.","authors":"Ming Yan, Jinger Pan, Reinhold Kliegl","doi":"10.3758/s13428-024-02523-z","DOIUrl":"10.3758/s13428-024-02523-z","url":null,"abstract":"<p><p>We introduce a sentence corpus with eye-movement data in traditional Chinese (TC), based on the original Beijing Sentence Corpus (BSC) in simplified Chinese (SC). The most noticeable difference between TC and SC character sets is their visual complexity. There are reaction time corpora in isolated TC character/word lexical decision and naming tasks. However, up to now natural TC sentence reading corpus with recorded eye movements has not been available for general public. We report effects of word frequency, visual complexity, and predictability on eye movements on fixation location and duration based on 60 native TC readers. In addition, because the current BSC-II sentences are nearly identical to the original BSC sentences, we report similarities and differences of the linguistic influences on eye movements for the two varieties of written Chinese. The results shed light on how visual complexity affects eye movements. Together, the two sentence corpora comprise a useful tool to establish cross-script similarities and differences in TC and SC.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 2","pages":"60"},"PeriodicalIF":4.6,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11748476/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142999269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predicting image memorability from evoked feelings.
Behavior Research Methods, 57(1), 58. Pub Date: 2025-01-14. DOI: 10.3758/s13428-024-02510-4
Cheyenne Wakeland-Hart, Mariam Aly
While viewing a visual stimulus, we often cannot tell whether it is inherently memorable or forgettable. However, the memorability of a stimulus can be quantified and partially predicted by a collection of conceptual and perceptual factors. Higher-level properties that represent the "meaningfulness" of a visual stimulus to viewers best predict whether it will be remembered or forgotten across a population. Here, we hypothesize that the feelings evoked by an image, operationalized as the valence and arousal dimensions of affect, significantly contribute to the memorability of scene images. We ran two complementary experiments to investigate the influence of affect on scene memorability, in the process creating a new image set (VAMOS) of hundreds of natural scene images for which we obtained valence, arousal, and memorability scores. From our first experiment, we found memorability to be highly reliable for scene images that span a wide range of evoked arousal and valence. From our second experiment, we found that both valence and arousal are significant but weak predictors of image memorability. Scene images were most memorable if they were slightly negatively valenced and highly arousing. Images that were extremely positive or unarousing were most forgettable. Valence and arousal together accounted for less than 8% of the variance in image memorability. These findings suggest that evoked affect contributes to the overall memorability of a scene image but, like other singular predictors, does not fully explain it. Instead, memorability is best explained by an assemblage of visual features that combine, in perhaps unintuitive ways, to predict what is likely to stick in our memory.
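The core analysis described above amounts to regressing per-image memorability scores on mean valence and arousal ratings and inspecting the explained variance. A minimal sketch of that kind of analysis in Python, with simulated data and hypothetical variable names (not the authors' code or the VAMOS data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Simulated stand-ins: one row per scene image.
rng = np.random.default_rng(0)
n_images = 500
valence = rng.uniform(1, 9, n_images)      # mean evoked valence rating per image
arousal = rng.uniform(1, 9, n_images)      # mean evoked arousal rating per image
memorability = (0.60 + 0.010 * arousal - 0.005 * valence
                + rng.normal(0, 0.10, n_images))  # e.g., hit-rate-based score

X = np.column_stack([valence, arousal])
model = LinearRegression().fit(X, memorability)
r_squared = model.score(X, memorability)   # proportion of variance explained
print(f"R^2 from valence and arousal alone: {r_squared:.3f}")
# The paper reports a non-monotonic pattern (slightly negative valence plus high
# arousal being most memorable), so quadratic terms could be added to X to capture it.
```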
{"title":"Predicting image memorability from evoked feelings.","authors":"Cheyenne Wakeland-Hart, Mariam Aly","doi":"10.3758/s13428-024-02510-4","DOIUrl":"10.3758/s13428-024-02510-4","url":null,"abstract":"<p><p>While viewing a visual stimulus, we often cannot tell whether it is inherently memorable or forgettable. However, the memorability of a stimulus can be quantified and partially predicted by a collection of conceptual and perceptual factors. Higher-level properties that represent the \"meaningfulness\" of a visual stimulus to viewers best predict whether it will be remembered or forgotten across a population. Here, we hypothesize that the feelings evoked by an image, operationalized as the valence and arousal dimensions of affect, significantly contribute to the memorability of scene images. We ran two complementary experiments to investigate the influence of affect on scene memorability, in the process creating a new image set (VAMOS) of hundreds of natural scene images for which we obtained valence, arousal, and memorability scores. From our first experiment, we found memorability to be highly reliable for scene images that span a wide range of evoked arousal and valence. From our second experiment, we found that both valence and arousal are significant but weak predictors of image memorability. Scene images were most memorable if they were slightly negatively valenced and highly arousing. Images that were extremely positive or unarousing were most forgettable. Valence and arousal together accounted for less than 8% of the variance in image memorability. These findings suggest that evoked affect contributes to the overall memorability of a scene image but, like other singular predictors, does not fully explain it. Instead, memorability is best explained by an assemblage of visual features that combine, in perhaps unintuitive ways, to predict what is likely to stick in our memory.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 1","pages":"58"},"PeriodicalIF":4.6,"publicationDate":"2025-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142982532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A note on using random starting values in small sample SEM.
Behavior Research Methods, 57(1), 57. Pub Date: 2025-01-14. DOI: 10.3758/s13428-024-02543-9
Julie De Jonckere, Yves Rosseel
Model estimation for SEM analyses in commonly used software typically involves iterative optimization procedures, which can lead to nonconvergence issues. In this paper, we propose using random starting values as an alternative to the current default strategies. By drawing from uniform distributions within data-driven lower and upper bounds (see De Jonckere et al. (2022) Structural Equation Modeling: A Multidisciplinary Journal, 29(3), 412-427), random starting values are generated for each (free) parameter in the model. Through three small simulation studies, we demonstrate that incorporating such bounded random starting values significantly reduces the nonconvergence rate, resulting in increased convergence rates ranging between 87% and 96% in the first two studies. In essence, bounded random starting values seem to offer a promising alternative to the default starting values that are currently used in most software packages.
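The core idea is to draw each free parameter's starting value from a uniform distribution between data-driven lower and upper bounds, and to rerun the optimizer when it fails to converge. A generic sketch is below; the bounds and the toy objective are illustrative, and the authors' actual implementation (e.g., within lavaan's estimation routines) will differ.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(123)

def fit_with_random_starts(objective, bounds, max_tries=10):
    """Retry an iterative fit using bounded random starting values.

    objective : callable mapping a parameter vector to the fit criterion
    bounds    : list of (lower, upper) tuples, one per free parameter
    """
    for attempt in range(1, max_tries + 1):
        # Draw each starting value uniformly within its data-driven bounds
        start = np.array([rng.uniform(lo, hi) for lo, hi in bounds])
        result = minimize(objective, start, method="L-BFGS-B", bounds=bounds)
        if result.success:
            return result, attempt
    return result, max_tries

# Toy quadratic "discrepancy function" standing in for an SEM fit criterion
target = np.array([0.7, 0.3, 1.2])

def objective(theta):
    return float(np.sum((theta - target) ** 2))

bounds = [(0.0, 1.0), (0.0, 1.0), (0.0, 5.0)]  # e.g., loadings in [0, 1], variance >= 0
result, n_tries = fit_with_random_starts(objective, bounds)
print(result.x, "converged on attempt", n_tries)
```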
{"title":"A note on using random starting values in small sample SEM.","authors":"Julie De Jonckere, Yves Rosseel","doi":"10.3758/s13428-024-02543-9","DOIUrl":"10.3758/s13428-024-02543-9","url":null,"abstract":"<p><p>Model estimation for SEM analyses in commonly used software typically involves iterative optimization procedures, which can lead to nonconvergence issues. In this paper, we propose using random starting values as an alternative to the current default strategies. By drawing from uniform distributions within data-driven lower and upper bounds (see De Jonckere et al. (2022) Structural Equation Modeling: A Multidisciplinary Journal, 29(3), 412-427), random starting values are generated for each (free) parameter in the model. Through three small simulation studies, we demonstrate that incorporating such bounded random starting values significantly reduces the nonconvergence rate, resulting in increased convergence rates ranging between 87% and 96% in the first two studies. In essence, bounded random starting values seem to offer a promising alternative to the default starting values that are currently used in most software packages.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 1","pages":"57"},"PeriodicalIF":4.6,"publicationDate":"2025-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142982530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ROAR-CAT: Rapid Online Assessment of Reading ability with Computerized Adaptive Testing.
Behavior Research Methods, 57(1), 56. Pub Date: 2025-01-14. DOI: 10.3758/s13428-024-02578-y
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11732908/pdf/
Wanjing Anya Ma, Adam Richie-Halford, Amy K Burkhardt, Klint Kanopka, Clementine Chou, Benjamin W Domingue, Jason D Yeatman
The Rapid Online Assessment of Reading (ROAR) is a web-based lexical decision task that measures single-word reading abilities in children and adults without a proctor. Here we study whether item response theory (IRT) and computerized adaptive testing (CAT) can be used to create a more efficient online measure of word recognition. To construct an item bank, we first analyzed data taken from four groups of students (N = 1960) who differed in age, socioeconomic status, and language-based learning disabilities. The majority of item parameters were highly consistent across groups (r = .78-.94), and six items that functioned differently across groups were removed. Next, we implemented a JavaScript CAT algorithm and conducted a validation experiment with 485 students in grades 1-8 who were randomly assigned to complete trials of all items in the item bank in either (a) a random order or (b) a CAT order. We found that, to achieve reliability of 0.9, CAT improved test efficiency by 40%: 75 CAT items produced the same standard error of measurement as 125 items in a random order. Subsequent validation in 32 public school classrooms showed that an approximately 3-min ROAR-CAT can achieve high correlations (r = .89 for first grade, r = .73 for second grade) with alternative 5-15-min individually proctored oral reading assessments. Our findings suggest that ROAR-CAT is a promising tool for efficiently and accurately measuring single-word reading ability. Furthermore, our development process serves as a model for creating adaptive online assessments that bridge research and practice.
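Adaptive item selection of the kind described (administer, at each step, the bank item that is most informative at the provisional ability estimate) is commonly implemented with Fisher information under a two-parameter logistic (2PL) IRT model. A minimal sketch of that generic CAT logic in Python, with a hypothetical item bank rather than the authors' JavaScript implementation:

```python
import numpy as np

def p_correct(theta, a, b):
    """2PL probability of a correct lexical decision at ability theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * p * (1 - p)."""
    p = p_correct(theta, a, b)
    return a**2 * p * (1.0 - p)

def select_next_item(theta_hat, a, b, administered):
    """Index of the unadministered item with maximum information at theta_hat."""
    info = item_information(theta_hat, a, b)
    info[list(administered)] = -np.inf
    return int(np.argmax(info))

# Hypothetical item bank: discrimination (a) and difficulty (b) per word item
rng = np.random.default_rng(1)
a = rng.uniform(0.8, 2.5, size=200)
b = rng.normal(0.0, 1.0, size=200)

theta_hat, administered = 0.0, set()
print("first item to administer:", select_next_item(theta_hat, a, b, administered))
```

As a side note, under the usual unit-variance scaling of the latent trait, the reliability criterion of 0.9 used above corresponds to a standard error of measurement of roughly sqrt(1 - 0.9) ≈ 0.32.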
{"title":"ROAR-CAT: Rapid Online Assessment of Reading ability with Computerized Adaptive Testing.","authors":"Wanjing Anya Ma, Adam Richie-Halford, Amy K Burkhardt, Klint Kanopka, Clementine Chou, Benjamin W Domingue, Jason D Yeatman","doi":"10.3758/s13428-024-02578-y","DOIUrl":"10.3758/s13428-024-02578-y","url":null,"abstract":"<p><p>The Rapid Online Assessment of Reading (ROAR) is a web-based lexical decision task that measures single-word reading abilities in children and adults without a proctor. Here we study whether item response theory (IRT) and computerized adaptive testing (CAT) can be used to create a more efficient online measure of word recognition. To construct an item bank, we first analyzed data taken from four groups of students (N = 1960) who differed in age, socioeconomic status, and language-based learning disabilities. The majority of item parameters were highly consistent across groups (r = .78-.94), and six items that functioned differently across groups were removed. Next, we implemented a JavaScript CAT algorithm and conducted a validation experiment with 485 students in grades 1-8 who were randomly assigned to complete trials of all items in the item bank in either (a) a random order or (b) a CAT order. We found that, to achieve reliability of 0.9, CAT improved test efficiency by 40%: 75 CAT items produced the same standard error of measurement as 125 items in a random order. Subsequent validation in 32 public school classrooms showed that an approximately 3-min ROAR-CAT can achieve high correlations (r = .89 for first grade, r = .73 for second grade) with alternative 5-15-min individually proctored oral reading assessments. Our findings suggest that ROAR-CAT is a promising tool for efficiently and accurately measuring single-word reading ability. Furthermore, our development process serves as a model for creating adaptive online assessments that bridge research and practice.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 1","pages":"56"},"PeriodicalIF":4.6,"publicationDate":"2025-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11732908/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142982534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust estimation of the latent trait in graded response models.
Behavior Research Methods, 57(1), 55. Pub Date: 2025-01-14. DOI: 10.3758/s13428-024-02574-2
Audrey Filonczuk, Ying Cheng
Aberrant responses (e.g., careless responses, miskeyed items, etc.) often contaminate psychological assessments and surveys. Previous robust estimators for dichotomous IRT models have produced more accurate latent trait estimates with data containing response disturbances. However, for widely used Likert-type items with three or more response categories, a robust estimator for estimating latent traits does not exist. We propose a robust estimator for the graded response model (GRM) that can be applied to Likert-type items. Two weighting mechanisms for downweighting "suspicious" responses are considered: the Huber and the bisquare weight functions. Simulations reveal the estimator reduces bias for various test lengths, numbers of response categories, and types of response disturbances. The reduction in bias and stable standard errors suggests that the robust estimator for the GRM is effective in counteracting the harmful effects of response disturbances and providing more accurate scores on psychological assessments. The robust estimator is then applied to data from the Big Five Inventory-2 (Ober et al., 2021) to demonstrate its use. Potential applications and implications are discussed.
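The two downweighting schemes mentioned are standard robust weight functions. Applied to a standardized residual u with tuning constant k, they can be written as follows (generic textbook definitions with common tuning constants, not necessarily the authors' exact settings):

```python
import numpy as np

def huber_weight(u, k=1.345):
    """Huber weight: 1 inside the threshold, k/|u| outside (gentle downweighting)."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= k, 1.0, k / np.abs(u))

def bisquare_weight(u, k=4.685):
    """Tukey bisquare weight: smooth decline to 0, so extreme residuals are ignored."""
    u = np.asarray(u, dtype=float)
    w = (1.0 - (u / k) ** 2) ** 2
    return np.where(np.abs(u) <= k, w, 0.0)

residuals = np.array([0.2, 1.0, 2.5, 6.0])
print(huber_weight(residuals))     # approx. 1, 1, 0.538, 0.224
print(bisquare_weight(residuals))  # approx. 0.996, 0.911, 0.512, 0
```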
{"title":"Robust estimation of the latent trait in graded response models.","authors":"Audrey Filonczuk, Ying Cheng","doi":"10.3758/s13428-024-02574-2","DOIUrl":"10.3758/s13428-024-02574-2","url":null,"abstract":"<p><p>Aberrant responses (e.g., careless responses, miskeyed items, etc.) often contaminate psychological assessments and surveys. Previous robust estimators for dichotomous IRT models have produced more accurate latent trait estimates with data containing response disturbances. However, for widely used Likert-type items with three or more response categories, a robust estimator for estimating latent traits does not exist. We propose a robust estimator for the graded response model (GRM) that can be applied to Likert-type items. Two weighting mechanisms for downweighting \"suspicious\" responses are considered: the Huber and the bisquare weight functions. Simulations reveal the estimator reduces bias for various test lengths, numbers of response categories, and types of response disturbances. The reduction in bias and stable standard errors suggests that the robust estimator for the GRM is effective in counteracting the harmful effects of response disturbances and providing more accurate scores on psychological assessments. The robust estimator is then applied to data from the Big Five Inventory-2 (Ober et al., 2021) to demonstrate its use. Potential applications and implications are discussed.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 1","pages":"55"},"PeriodicalIF":4.6,"publicationDate":"2025-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142982536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A response time-based mixture item response theory model for dynamic item-response strategies.
Behavior Research Methods, 57(1), 54. Pub Date: 2025-01-09. DOI: 10.3758/s13428-024-02555-5
Sijia Huang, Jinwen Luo, Minjeong Jeon
Educational researchers have a long-lasting interest in the strategies examinees employ when responding to items in an assessment. Mixture item response theory (IRT) modeling is a popular class of approaches to studying examinees' item-response strategies. In the present study, we introduce a response time (RT)-based mixture IRT model for flexible modeling of examinee-and-item-specific item-response strategies. We posit that examinees may alternate between ability-based and non-ability-based strategies across different test items. Our proposed model identifies such within-examinee strategy switches without the need to predefine the non-ability-based strategies. Instead, our proposed approach allows for inferring the nature of these strategies from model parameter estimates. We illustrated the proposed approach using empirical data from PISA 2018 Science test and evaluated it through simulation studies. We concluded the article with discussions of limitations and future research directions.
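One way to picture the class of model described is as a data-generating process in which each examinee-item encounter is governed either by an ability-based strategy (IRT-driven responses, slower response times) or by a non-ability-based strategy (e.g., chance-level responses, faster response times). The toy simulation below illustrates that idea only; the parameter values and the guessing-style non-ability process are assumptions, not the authors' specification.

```python
import numpy as np

rng = np.random.default_rng(7)
n_persons, n_items = 1000, 20

theta = rng.normal(0, 1, n_persons)   # person abilities
b = rng.normal(0, 1, n_items)         # item difficulties
pi_ability = 0.85                     # probability an encounter is ability-based

# Latent indicator: 1 = ability-based strategy, 0 = non-ability-based strategy
z = rng.binomial(1, pi_ability, size=(n_persons, n_items))

# Responses: Rasch-type probability when ability-based, chance level otherwise
p_irt = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
p = np.where(z == 1, p_irt, 0.25)
y = rng.binomial(1, p)

# Response times: lognormal, with a faster location for non-ability-based encounters
log_rt = np.where(z == 1,
                  rng.normal(1.5, 0.4, size=z.shape),   # median ~ 4.5 s
                  rng.normal(0.5, 0.4, size=z.shape))   # median ~ 1.6 s
rt = np.exp(log_rt)

print(f"overall accuracy: {y.mean():.2f}")
print(f"mean RT, ability-based: {rt[z == 1].mean():.2f} s; "
      f"non-ability-based: {rt[z == 0].mean():.2f} s")
```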
{"title":"A response time-based mixture item response theory model for dynamic item-response strategies.","authors":"Sijia Huang, Jinwen Luo, Minjeong Jeon","doi":"10.3758/s13428-024-02555-5","DOIUrl":"10.3758/s13428-024-02555-5","url":null,"abstract":"<p><p>Educational researchers have a long-lasting interest in the strategies examinees employ when responding to items in an assessment. Mixture item response theory (IRT) modeling is a popular class of approaches to studying examinees' item-response strategies. In the present study, we introduce a response time (RT)-based mixture IRT model for flexible modeling of examinee-and-item-specific item-response strategies. We posit that examinees may alternate between ability-based and non-ability-based strategies across different test items. Our proposed model identifies such within-examinee strategy switches without the need to predefine the non-ability-based strategies. Instead, our proposed approach allows for inferring the nature of these strategies from model parameter estimates. We illustrated the proposed approach using empirical data from PISA 2018 Science test and evaluated it through simulation studies. We concluded the article with discussions of limitations and future research directions.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 1","pages":"54"},"PeriodicalIF":4.6,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142943472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Author Correction: Analytical power calculations for structural equation modeling: A tutorial and Shiny app.
Behavior Research Methods, 57(1), 53. Pub Date: 2025-01-09. DOI: 10.3758/s13428-024-02571-5
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11717876/pdf/
Suzanne Jak, Terrence D Jorgensen, Mathilde G E Verdam, Frans J Oort, Louise Elffers
{"title":"Author Correction: Analytical power calculations for structural equation modeling: A tutorial and Shiny app.","authors":"Suzanne Jak, Terrence D Jorgensen, Mathilde G E Verdam, Frans J Oort, Louise Elffers","doi":"10.3758/s13428-024-02571-5","DOIUrl":"10.3758/s13428-024-02571-5","url":null,"abstract":"","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 1","pages":"53"},"PeriodicalIF":4.6,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11717876/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142943473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Testing measurement invariance in a conditional likelihood framework by considering multiple covariates simultaneously.
Behavior Research Methods, 57(1), 50. Pub Date: 2025-01-08. DOI: 10.3758/s13428-024-02551-9
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11711259/pdf/
Clemens Draxler, Andreas Kurz
This article addresses the problem of measurement invariance in psychometrics. In particular, its focus is on the invariance assumption of item parameters in a class of models known as Rasch models. It suggests a mixed-effects or random-intercept model for binary data, together with a conditional likelihood approach for estimating and testing the effects of multiple covariates simultaneously. The procedure can also be viewed as a multivariate multiple regression analysis, which can be applied in longitudinal designs to investigate effects of covariates over time or across different experimental conditions. This work also derives four statistical tests based on asymptotic theory, as well as a parameter-free test suitable for small-sample scenarios. Finally, it outlines generalizations to categorical data with more than two categories. All procedures are illustrated on real-data examples from behavioral research and on a hypothetical data example related to clinical research in a longitudinal design.
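For background on the conditional likelihood machinery involved: in the Rasch model, person parameters can be conditioned out via the raw score, so the conditional probability of a response pattern given its sum score depends only on the item parameters, through elementary symmetric functions. The sketch below shows that textbook computation; it does not reproduce the authors' covariate extension.

```python
import numpy as np

def elementary_symmetric(eps):
    """gamma[r] = sum over all size-r item subsets of the product of their epsilons."""
    gamma = np.zeros(len(eps) + 1)
    gamma[0] = 1.0
    for e in eps:                              # standard summation algorithm
        gamma[1:] = gamma[1:] + e * gamma[:-1]
    return gamma

def conditional_loglik(x, beta):
    """Rasch log conditional likelihood of response pattern x given its raw score:
    log P(x | r) = -sum_i x_i * beta_i - log gamma_r(eps), with eps_i = exp(-beta_i)."""
    x, beta = np.asarray(x), np.asarray(beta)
    eps = np.exp(-beta)
    r = int(x.sum())
    gamma = elementary_symmetric(eps)
    return float(-(x * beta).sum() - np.log(gamma[r]))

beta = np.array([-0.5, 0.0, 0.8, 1.2])   # item difficulties
x = np.array([1, 1, 0, 0])               # response pattern with raw score r = 2
print(conditional_loglik(x, beta))       # depends only on item parameters, not theta
```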
{"title":"Testing measurement invariance in a conditional likelihood framework by considering multiple covariates simultaneously.","authors":"Clemens Draxler, Andreas Kurz","doi":"10.3758/s13428-024-02551-9","DOIUrl":"10.3758/s13428-024-02551-9","url":null,"abstract":"<p><p>This article addresses the problem of measurement invariance in psychometrics. In particular, its focus is on the invariance assumption of item parameters in a class of models known as Rasch models. It suggests a mixed-effects or random intercept model for binary data together with a conditional likelihood approach of both estimating and testing the effects of multiple covariates simultaneously. The procedure can also be viewed as a multivariate multiple regression analysis which can be applied in longitudinal designs to investigate effects of covariates over time or different experimental conditions. This work also derives four statistical tests based on asymptotic theory and a parameter-free test suitable in small sample size scenarios. Finally, it outlines generalizations for categorical data in more than two categories. All procedures are illustrated on real-data examples from behavioral research and on a hypothetical data example related to clinical research in a longitudinal design.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 1","pages":"50"},"PeriodicalIF":4.6,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11711259/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142943479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Quantifying sighting dominance using on-display projections of monocular and binocular views.
Behavior Research Methods, 57(1), 52. Pub Date: 2025-01-08. DOI: 10.3758/s13428-024-02512-2
Giuseppe Notaro, Uri Hasson
Sighting dominance is an important behavioral property that has been difficult to measure quantitatively with high precision. We developed a measurement method, grounded in a two-camera model, that meets this need. Using a simple alignment task, this method quantifies sighting ocular dominance during binocular viewing, identifying each eye's relative contribution to binocular vision. The method involves placing a physical target between the viewer and a display. The viewer indicates the perceived target's projection on the display with both eyes open and with only one eye open. The relative location of the binocular projection in relation to the two monocular projections is the index of dominance. The method produces a continuous variable with robust test-retest reliability (ICC = 0.96). The unit of measurement for the computed quantity is physiologically grounded: it is proportional to the distance between the monocular projections, which we show is predictable from interpupillary distance and phoria. Comparisons with the classic 'hole-in-card' sighting dominance test show good agreement, but also hint at a potential bias toward right-eye dominance in the latter. Interestingly, we find that some individuals systematically demonstrate nearly balanced vision, a phenomenon previously construed as mixed dominance or noisy responses. We also present ways to quantify and mitigate sources of random noise in this measurement. Overall, this new method allows for precise estimation of sighting dominance during binocular viewing. We expect it will allow a more effective understanding of the neural basis of dominance and improved effectiveness when using sighting dominance as a covariate in more complex analyses.
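The dominance index described, i.e., the location of the binocular projection relative to the two monocular projections, can be written as a simple ratio. The formulation below is a hypothetical illustration with made-up coordinates; the variable names and scaling are not the authors' exact definition.

```python
def dominance_index(x_left_eye, x_right_eye, x_both_eyes):
    """Place the binocular projection on the segment between the two monocular ones.

    x_left_eye / x_right_eye : on-display coordinates of the target projection
                               indicated with only that eye open
    x_both_eyes              : coordinate indicated with both eyes open
    Returns ~0 if the binocular projection coincides with the left-eye projection,
    ~1 if it coincides with the right-eye projection, ~0.5 for balanced vision.
    """
    return (x_both_eyes - x_left_eye) / (x_right_eye - x_left_eye)

# Example: the binocular report falls close to the right-eye projection
print(dominance_index(x_left_eye=120.0, x_right_eye=80.0, x_both_eyes=86.0))  # 0.85
```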
{"title":"Quantifying sighting dominance using on-display projections of monocular and binocular views.","authors":"Giuseppe Notaro, Uri Hasson","doi":"10.3758/s13428-024-02512-2","DOIUrl":"10.3758/s13428-024-02512-2","url":null,"abstract":"<p><p>Sighting dominance is an important behavioral property which has been difficult to measure quantitatively with high precision. We developed a measurement method that is grounded in a two-camera model that satisfies these aims. Using a simple alignment task, this method quantifies sighting ocular dominance during binocular viewing, identifying each eye's relative contribution to binocular vision. The method involves placing a physical target between the viewer and a display. The viewer indicates the perceived target's projection on the display with both eyes open and with only one eye open. The relative location of the binocular projection in relation to the two monocular projections is the index of dominance. The method produces a continuous variable with robust test-retest reliability (ICC = 0.96). The unit of measurement for the computed quantity is physiologically grounded: it is proportional to the distance between the monocular projections, which we show is predictable from interpupillary distance and phoria. Comparisons with the classic 'hole in card' sighting dominance test show good agreement, but also hint at potential bias for determining right-eye dominance in the latter. Interestingly, we find that some individuals systematically demonstrate nearly balanced vision, a phenomenon previously construed as mixed dominance or noisy responses. We also present ways to quantify and mitigate sources of random noise in this measurement. Overall, this new method allows for precise estimation of sighting dominance during binocular viewing. We expect it will allow a more effective understanding of the neural basis of dominance and improved effectiveness when using sighting dominance as a covariate in more complex analyses.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 1","pages":"52"},"PeriodicalIF":4.6,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142943475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Semantic alignment: A measure to quantify the degree of semantic equivalence for English-Chinese translation equivalents based on distributional semantics.
Behavior Research Methods, 57(1), 51. Pub Date: 2025-01-08. DOI: 10.3758/s13428-024-02527-9
Yufeng Liu, Shifa Chen, Yi Yang
The degree of semantic equivalence of translation pairs is typically measured by asking bilinguals to rate their semantic similarity or by comparing the number and meaning of dictionary entries. Such measures are subjective, labor-intensive, and unable to capture the fine-grained variation in the degree of semantic equivalence. Thompson et al. (in Nature Human Behaviour, 4(10), 1029-1038, 2020) propose a computational method to quantify the extent to which translation equivalents are semantically aligned by measuring their contextual use across languages. Here, we refine this method to quantify the semantic alignment of English-Chinese translation equivalents using word2vec, based on the proposal that the degree of similarity between the contexts associated with a word and those of its multiple translations varies continuously. We validate our measure using semantic alignment from GloVe and fastText, and data from two behavioral datasets. The consistency of semantic alignment induced across different models confirms the robustness of our method. We demonstrate that semantic alignment not only reflects human semantic similarity judgments of translation equivalents but also captures bilinguals' usage frequency of translations. We also show that our method is more cognitively plausible than Thompson et al.'s method. Furthermore, the correlations between semantic alignment and key psycholinguistic factors mirror those between human-rated semantic similarity and these variables, indicating that computed semantic alignment reflects the degree of semantic overlap of translation equivalents in the bilingual mental lexicon. We further provide the largest English-Chinese translation equivalent dataset to date, encompassing 50,088 translation pairs for 15,734 English words, their dominant Chinese translation equivalents, and their semantic alignment Rc values.
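In spirit, the measure compares how a word relates to a set of context words in one language with how its translation relates to those context words' translations in the other language; the correlation of the two similarity profiles indexes alignment. The sketch below illustrates that logic with tiny made-up vectors and a hypothetical helper; real use would load trained word2vec (or GloVe/fastText) embeddings, and the authors' Rc measure may be computed differently in detail.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def semantic_alignment(word_en, word_zh, context_pairs, vec_en, vec_zh):
    """Correlate the English word's similarity profile over English context words
    with the Chinese word's profile over those words' Chinese translations."""
    profile_en = [cosine(vec_en[word_en], vec_en[e]) for e, _ in context_pairs]
    profile_zh = [cosine(vec_zh[word_zh], vec_zh[c]) for _, c in context_pairs]
    return float(np.corrcoef(profile_en, profile_zh)[0, 1])

# Tiny toy "embeddings" (3-d); real vectors would come from trained models.
vec_en = {"dog": np.array([0.9, 0.1, 0.0]), "cat": np.array([0.8, 0.2, 0.1]),
          "bark": np.array([0.7, 0.0, 0.3]), "tree": np.array([0.1, 0.9, 0.2])}
vec_zh = {"狗": np.array([0.85, 0.15, 0.05]), "猫": np.array([0.75, 0.25, 0.10]),
          "吠": np.array([0.65, 0.05, 0.35]), "树": np.array([0.15, 0.85, 0.25])}

context_pairs = [("cat", "猫"), ("bark", "吠"), ("tree", "树")]
print(semantic_alignment("dog", "狗", context_pairs, vec_en, vec_zh))  # close to 1
```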
{"title":"Semantic alignment: A measure to quantify the degree of semantic equivalence for English-Chinese translation equivalents based on distributional semantics.","authors":"Yufeng Liu, Shifa Chen, Yi Yang","doi":"10.3758/s13428-024-02527-9","DOIUrl":"10.3758/s13428-024-02527-9","url":null,"abstract":"<p><p>The degree of semantic equivalence of translation pairs is typically measured by asking bilinguals to rate the semantic similarity of them or comparing the number and meaning of dictionary entries. Such measures are subjective, labor-intensive, and unable to capture the fine-grained variation in the degree of semantic equivalence. Thompson et al. (in Nature Human Behaviour, 4(10), 1029-1038, 2020) propose a computational method to quantify the extent to which translation equivalents are semantically aligned by measuring the contextual use across languages. Here, we refine this method to quantify semantic alignment of English-Chinese translation equivalents using word2vec based on the proposal that the degree of similarity between the contexts associated with a word and those of its multiple translations vary continuously. We validate our measure using semantic alignment from GloVe and fastText, and data from two behavioral datasets. The consistency of semantic alignment induced across different models confirms the robustness of our method. We demonstrate that semantic alignment not only reflects human semantic similarity judgment of translation equivalents but also captures bilinguals' usage frequency of translations. We also show that our method is more cognitively plausible than Thompson et al.'s method. Furthermore, the correlations between semantic alignment and key psycholinguistic factors mirror those between human-rated semantic similarity and these variables, indicating that computed semantic alignment reflects the degree of semantic overlap of translation equivalents in the bilingual mental lexicon. We further provide the largest English-Chinese translation equivalent dataset to date, encompassing 50,088 translation pairs for 15,734 English words, their dominant Chinese translation equivalents, and their semantic alignment Rc values.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 1","pages":"51"},"PeriodicalIF":4.6,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142943478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}