Can Graduate Students Rate Personality Reliably in Psychoanalytic Treatments Using the Shedler-Westen Assessment Procedure (SWAP-200)?
Daniel S Spina, Katie Aafjes-van Doorn, Sarah J Horne, Francesco Gazzillo, Federica Genova, Bernard S Gorman, Karl Stukenberg, Sherwood Waldron
Psychodynamic Psychiatry, 53(1), 102-120. Journal Article. Published 2025-03-01.
DOI: 10.1521/pdps.2025.53.1.102 (https://doi.org/10.1521/pdps.2025.53.1.102)
Citations: 0
Abstract
Introduction: Psychotherapy researchers often recruit students to code psychotherapy process and outcome variables on individual therapy sessions. To do so, graduate students read verbatim psychotherapy transcripts, listen to audio or watch video of psychotherapy sessions, and then rate process and outcome measures on these sessions. Although prior studies have investigated the reliability and validity of graduate student ratings of personality dysfunction, no study to date has investigated the reliability and validity of such ratings when they are made on the basis of psychotherapy transcripts and audio. In this study, we evaluated the degree to which graduate students can reliably and validly code observer-rated personality assessments on the Shedler-Westen Assessment Procedure-200 (SWAP-200). Methods: We compared graduate students' SWAP-200 scores with those of an experienced clinician-researcher in an existing dataset of 27 patients undergoing psychoanalytic psychotherapy, rated at early and late phases of treatment. Results: Using truth and bias multilevel models, we found that graduate students tended to underestimate personality pathology at the early phase of treatment and overestimate pathology at the late phase of treatment. These biases resulted in smaller effect sizes in the graduate student ratings, such that patients' personality functioning did not appear to change following treatment. By contrast, the experienced clinician's scores supported moderate to large effect sizes in personality change. Discussion: These deviations from expert judgment may have significant ramifications for evaluating pre-post effect sizes in psychotherapy process studies that use graduate student coding. Suggestions for research and practice are discussed.
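The arithmetic behind the Results can be illustrated with a minimal sketch: if raters underestimate pathology early and overestimate it late, the apparent early-to-late change shrinks. This is not the truth and bias multilevel model used in the study. The scores, the size of the bias, and the function name `paired_effect_size` are all hypothetical, chosen only to show the direction of the effect.

```python
from statistics import mean, stdev


def paired_effect_size(early, late):
    """Standardized mean change: mean of the early-minus-late differences
    divided by the standard deviation of those differences."""
    diffs = [e - l for e, l in zip(early, late)]
    return mean(diffs) / stdev(diffs)


# Hypothetical pathology scores (higher = more pathology). NOT study data.
expert_early = [62.0, 58.0, 65.0, 60.0, 63.0]
expert_late = [55.0, 54.0, 57.0, 56.0, 54.0]

# Assumed directional bias: early pathology is underestimated and
# late pathology is overestimated by the same amount.
bias = 2.5
student_early = [x - bias for x in expert_early]
student_late = [x + bias for x in expert_late]

d_expert = paired_effect_size(expert_early, expert_late)
d_student = paired_effect_size(student_early, student_late)

print(f"expert effect size:  {d_expert:.2f}")
print(f"student effect size: {d_student:.2f}")
```

Because the assumed bias is constant across patients, it leaves the spread of the difference scores unchanged and lowers only their mean, so the biased effect size is smaller by exactly twice the bias divided by that spread. Real rater bias would vary by patient and rater, which is what a multilevel model is meant to capture.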