Scoring story recall for individual differences research: Central details, peripheral details, and automated scoring.
Author: David Martinez
Journal: Behavior Research Methods, pp. 8362-8378 (Impact Factor 4.6; JCR Q1, Psychology, Experimental)
Publication date: 2024-12-01 (Epub 2024-08-07)
DOI: https://doi.org/10.3758/s13428-024-02480-7
Citation count: 0
Abstract
Story recall is an episodic memory paradigm that is popular among researchers interested in the effects of aging, disease, and/or injury on memory functioning; it is less popular among individual-differences researchers studying neurotypical young adults. One reason differential psychologists may favor other episodic memory paradigms is that the prospect of scoring story recall is daunting, as it typically requires manually scoring hundreds or thousands of freely recalled narratives. In this study, I investigated two questions related to scoring story recall for individual differences research. First, I investigated whether there is anything to gain by scoring story recall separately for memory of central and peripheral details, or whether a single score is sufficient. Second, I investigated whether scoring can be automated using computational methods, namely BERTScore and GPT-4. A total of 235 individuals participated in this study. At the latent variable level, central and peripheral factors were highly correlated (r = .99), and the two factors correlated similarly with external factors (viz., fluid intelligence, crystallized intelligence, and working memory capacity). Regarding automated scoring, both BERTScore- and GPT-4-derived scores were strongly correlated with manually derived scores (r ≥ .97); additionally, factors estimated from the various scoring methods all showed a similar pattern of correlations with the external factors. Thus, differential psychologists may be able to streamline scoring by disregarding detail type and by using automated approaches. Further research is needed, particularly on the automated approaches, as both BERTScore- and GPT-4-derived scores were occasionally leptokurtic while manual scores were not.
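The two checks named in the abstract, agreement between automated and manual scores and the shape of the score distributions, can be illustrated with a minimal sketch. This is not the author's analysis, which was carried out at the latent variable level; it only shows an observed-score Pearson correlation and a population excess-kurtosis statistic (positive values indicate a leptokurtic distribution). The function names and the toy score lists are hypothetical and invented purely for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def excess_kurtosis(x):
    """Population excess kurtosis: m4 / m2^2 - 3 (0 for a normal distribution)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((a - mean) ** 2 for a in x) / n
    m4 = sum((a - mean) ** 4 for a in x) / n
    return m4 / (m2 ** 2) - 3.0


# Hypothetical toy data, NOT taken from the study:
# manual detail counts and an automated similarity score per participant.
manual_scores = [12, 18, 9, 22, 15, 17, 11, 20]
automated_scores = [0.61, 0.74, 0.55, 0.83, 0.68, 0.71, 0.58, 0.79]

if __name__ == "__main__":
    print("r(manual, automated):", round(pearson_r(manual_scores, automated_scores), 3))
    print("excess kurtosis (manual):", round(excess_kurtosis(manual_scores), 3))
    print("excess kurtosis (automated):", round(excess_kurtosis(automated_scores), 3))
```

In practice, a statistics package would be the more likely choice for these quantities; the standard-library version here only makes the arithmetic explicit.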
About the journal:
Behavior Research Methods publishes articles concerned with the methods, techniques, and instrumentation of research in experimental psychology. The journal focuses particularly on the use of computer technology in psychological research. An annual special issue is devoted to this field.