The Inventory of Callous-Unemotional Traits (ICU) is a widely used measure of callous-unemotional (CU) traits that may aid in the assessment of the diagnostic specifier "with limited prosocial emotions," which has been added to diagnostic criteria for conduct disorder. Though there is substantial support for use of the ICU total score, the scale's factor structure has been highly debated. Inconsistencies in past factor analyses may be largely attributed to failure to control for method variance due to item wording (i.e., half of the items being worded in the callous direction and half worded in the prosocial direction). Thus, the present study used a multitrait-multimethod confirmatory factor analytic approach that models both trait and method variance to test the factor structure of the ICU self-report in a clinically relevant, high-risk sample of justice-involved male adolescents (N = 1,216). When comparing the fit of empirical and theoretical models, goodness of fit indices (χ² = 1105.877, df = 190, root-mean-square error of approximation = .063, comparative fit index = .916, Tucker-Lewis index = .878, standardized root-mean-square residual = .051) provided support for a hierarchical four-factor model (i.e., one overarching callous-unemotional factor, four latent trait factors) when accounting for method variance (i.e., covarying positively worded items). This factor structure is consistent with the way the ICU was constructed and with criteria for the limited prosocial emotions specifier. In addition, measurement invariance of this factor structure across age, race, and ethnicity was supported, and the predictive validity of the ICU was supported across these demographic groups in predicting self-reported antisocial behavior and rearrests over a 5-year period following an adolescent's first arrest. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
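As a reading aid for the fit indices reported above, the following minimal sketch recomputes the root-mean-square error of approximation from the reported chi-square, degrees of freedom, and sample size. It assumes the conventional single-group point estimate with N - 1 in the denominator; the abstract does not state the estimator or software used, so the function name `rmsea` and the choice of formula are illustrative assumptions rather than the authors' procedure.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Single-group RMSEA point estimate (assumed formula, N - 1 in the denominator)."""
    # The noncentrality estimate is floored at zero so that a model with
    # chi-square below its degrees of freedom yields an RMSEA of 0.
    noncentrality = max(chi_square - df, 0.0)
    return math.sqrt(noncentrality / (df * (n - 1)))


# Values taken from the abstract: chi-square = 1105.877, df = 190, N = 1,216.
value = rmsea(1105.877, 190, 1216)
print(round(value, 3))  # 0.063
```

Under these assumptions the result rounds to the reported value of .063. Robust or scaled estimators would apply a correction that this sketch omits.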
Numerous studies have examined differences in the experience of disorders and symptoms of psychopathology in adolescents across racial or ethnic groups and sex. Though there is substantial research exploring potential factors that may influence these differences, few studies have considered the potential contribution of measurement properties to these differences. Therefore, this study examined whether there are differences across racial or ethnic groups and sex in the measurement of psychopathology, assessed in mother-reported behavior of 9- to 11-year-old youth from the Adolescent Brain Cognitive Development study sample using updated Child Behavior Checklist scales (CBCL; Achenbach & Rescorla, 2001). Tests of measurement invariance of the CBCL utilized the higher order factor structure identified by Michelini et al. (2019) using this same Adolescent Brain Cognitive Development cohort. The dimensions include internalizing, somatoform, detachment, externalizing, and neurodevelopmental problems. The configural model had a good-to-excellent fit on all subscales of the CBCL across racial or ethnic groups and sex. The metric and scalar models fit just as well as the configural models, indicating that the scales are measuring the same constructs across racial or ethnic groups and sex and are not influenced by measurement properties of items on the CBCL, although some high-severity response options were not endorsed for youth in all racial or ethnic groups. These findings support the use of the CBCL in research examining psychopathology in racially or ethnically diverse samples of youth. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
Recent changes to diagnostic criteria for serious conduct problems in children and adolescents have included the presence of elevated callous-unemotional traits to define etiologically and clinically important subgroups of youth with a conduct problem diagnosis. The Clinical Assessment of Prosocial Emotions (CAPE) is an intensive assessment of the symptoms of this limited prosocial emotions specifier that uses a structured professional judgment method of scoring, which may make it useful in clinical settings when diagnoses may require more information than that provided by behavior rating scales. The present study adds to the limited tests of the CAPE's reliability and validity, using a sample of clinic-referred children aged 6-17 years, all of whom were administered the CAPE by trained clinicians. The mean age of the sample was 10.13 years (SD = 2.64); 54% of the sample identified as male and 46% identified as female; and 67% of participants identified as White, 29% identified as Black, and 52% identified as another race/ethnicity (i.e., Asian, Hispanic/Latinx, or other). The findings indicated that CAPE scores demonstrated strong interrater reliability. The scores also were associated with measures of conduct problems and aggression, even when controlling for behavior ratings of callous-unemotional traits. Further, when children with conduct problem diagnoses were divided into groups based on the presence of the limited prosocial emotions specifier from the CAPE, the subgroup with the specifier showed more severe conduct problems and aggression. The results support cautious clinical use of the CAPE, its further development and testing, and research into ways to make its use feasible in many clinical settings. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
Comparing self-reported symptom scores across time requires longitudinal measurement invariance (LMI), a psychometric property that means the measure is functioning identically across all time points. Despite the prominence of the Patient Health Questionnaire-9 depression module (PHQ-9) as a measure of depression symptom severity in both research and health care, LMI has yet to be firmly established for it, particularly over the course of antidepressant pharmacotherapy. Accordingly, the objective of this study was to assess LMI of the PHQ-9 during pharmacotherapy for major depressive disorder. This was a secondary analysis of data collected during a randomized controlled trial. A total of 1,944 veterans began antidepressant monotherapy and completed the PHQ-9 six times over 24 weeks of treatment. LMI was assessed using a series of four confirmatory factor analysis models that included all six time points, with estimated parameters increasingly constrained across models to test for different aspects of invariance. Values of the root-mean-square error of approximation of the chi-square difference test below 0.06 were taken to indicate the presence of LMI. Exploratory LMI analyses were also performed for separate sex, age, and race subgroups. The root-mean-square error of approximation of the chi-square difference test showed minimal change in model fit during invariance testing (≤ 0.06 for all steps), supporting full LMI for the PHQ-9. LMI was also supported for all tested veteran subgroups. As such, PHQ-9 sum scores can be compared across extended pharmacotherapy treatment durations. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
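For readers unfamiliar with the index used above, one common single-group form of the root-mean-square error of approximation for a chi-square difference test is shown below. This is a sketch under stated assumptions: the symbols (Δχ² for the chi-square difference between two nested models, Δdf for the difference in their degrees of freedom, N for the sample size) are introduced here for illustration, and the abstract does not confirm that this exact expression or the N - 1 denominator was used.

```latex
\mathrm{RMSEA}_D = \sqrt{\max\!\left(\frac{\Delta\chi^2 - \Delta df}{\Delta df\,(N - 1)},\ 0\right)}
```

Read this way, a small value means that the added equality constraints at a given invariance step worsen fit very little per constrained parameter.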
The triarchic model posits that distinct trait constructs of boldness, meanness, and disinhibition underlie psychopathy. The triarchic model traits are conceptualized as biobehavioral dimensions that can be assessed using different sets of indicators from alternative measurement modalities; as such, the triarchic model would hypothesize that these traits are not confined to any one item set. The present study tested whether the triarchic model dimensions would emerge from a hierarchical-structural analysis of the facet scales of the Elemental Psychopathy Assessment (EPA), an inventory designed to comprehensively index psychopathy according to the five-factor personality model. Study participants (Ns = 811, 170) completed the EPA and three different scale sets assessing the triarchic traits along with criterion measures of antisocial/externalizing behaviors. Bass-ackwards modeling of the EPA facet scales revealed a four-level structure, with factors at the third level appearing similar to the triarchic trait dimensions. An analysis in which scores for the Level-3 EPA factors were regressed onto corresponding latent-trait dimensions defined using the different triarchic scale sets revealed extremely high convergence (βs = .84-.91). The Level-3 EPA factors also evidenced validity in relation to relevant criteria, approximating and sometimes exceeding that evident for the Level-4 EPA factors. Together, these results indicate that the triarchic trait constructs are embedded in a psychopathy inventory designed to align with a general personality model and effectively predict pertinent external criteria. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
Assessment tools for depression and anxiety usually inquire about the frequency of symptoms. However, evidence suggests that different question framings might trigger different responses. Our aim was to test whether asking about symptoms' context, ability, duration, and botherment adds validity to the Patient Health Questionnaire-9, the Generalized Anxiety Disorder-7, and the Patient-Reported Outcomes Measurement Information System depression and anxiety measures. Participants came from two cross-sectional convenience-sampled surveys (N = 1,871) of adults (66% female, aged 33.4 ± 13.2), weighted to approximate the state-level population. We examined measurement invariance across the different question frames, estimated whether framing affected mean scores, and tested their independent validity using covariate-adjusted and sample-weighted structural equation models. Validity was tested using tools assessing general disability, alcohol use, loneliness, well-being, grit, and frequency-based questions from depression and anxiety questionnaires. A bifactor model was applied to test the internal consistency of the question frames under the presence of a general factor (i.e., depression or anxiety). Measurement invariance was supported across the different frames. Framing questions as ability (i.e., "How easily …") produced a higher score, compared with framing by context (i.e., "In which daily situations …"). Construct and criterion validity analyses demonstrated that the variance explained using multiple question frames was similar to that explained using only one. We detected a strong overarching factor for each instrument, with little variance left to be explained by the question frame. Therefore, it is unlikely that using different adverbial phrasings can help clinicians and researchers to improve their ability to detect depression or anxiety. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
This article illustrates novel quantitative methods to estimate classification consistency in machine learning models used for screening measures. Screening measures are used in psychology and medicine to assign individuals to diagnostic classifications. In addition to achieving high accuracy, it is ideal for the screening process to have high classification consistency, which means that respondents would be classified into the same group every time if the assessment were repeated. Although machine learning models are increasingly being used to predict a screening classification based on individual item responses, methods to describe the classification consistency of machine learning models have not yet been developed. This article addresses this gap by describing methods to estimate classification inconsistency in machine learning models arising from two different sources: sampling error during model fitting and measurement error in the item responses. These methods use data resampling techniques such as the bootstrap and Monte Carlo sampling. These methods are illustrated using three empirical examples predicting a health condition/diagnosis from item responses. R code is provided to facilitate the implementation of the methods. This article highlights the importance of considering classification consistency alongside accuracy when studying screening measures and provides the tools and guidance necessary for applied researchers to obtain classification consistency indices in their machine learning research on diagnostic assessments. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
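The bootstrap idea for the first source of inconsistency (sampling error during model fitting) can be sketched as follows. This is not the article's R code: the simulated item responses, the toy cut-score "model" standing in for a machine learning classifier, and the function names `simulate`, `fit_cut_score`, and `bootstrap_consistency` are all hypothetical, and the second source (measurement error in the item responses) is not covered.

```python
import random


def simulate(n=150, n_items=5, seed=1):
    """Hypothetical data: binary items whose endorsement rate depends on true status."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        has_condition = rng.random() < 0.3
        p_endorse = 0.7 if has_condition else 0.25
        items = tuple(int(rng.random() < p_endorse) for _ in range(n_items))
        rows.append((items, int(has_condition)))
    return rows


def fit_cut_score(rows):
    """Toy stand-in for a classifier: the sum-score cut-off with the best training accuracy."""
    best_cut, best_acc = 0, -1.0
    max_score = max(sum(items) for items, _ in rows)
    for cut in range(max_score + 2):
        hits = sum((sum(items) >= cut) == bool(label) for items, label in rows)
        acc = hits / len(rows)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut


def bootstrap_consistency(rows, n_boot=200, seed=0):
    """Per-respondent share of bootstrap-fitted models that agree with the majority class."""
    rng = random.Random(seed)
    positive_votes = [0] * len(rows)
    for _ in range(n_boot):
        # Refit the model on a resample drawn with replacement.
        resample = [rng.choice(rows) for _ in rows]
        cut = fit_cut_score(resample)
        # Classify every original respondent with the refitted model.
        for i, (items, _) in enumerate(rows):
            positive_votes[i] += sum(items) >= cut
    return [max(v, n_boot - v) / n_boot for v in positive_votes]


rows = simulate()
consistency = bootstrap_consistency(rows)
print(min(consistency), sum(consistency) / len(consistency))
```

Each value lies between 0.5 (the refitted models split evenly on that respondent) and 1.0 (every refitted model gives the same classification), so respondents near the decision boundary are the ones expected to show low consistency.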
This study evaluates the use of the crosswalk between the PTSD Checklist-Civilian (PCL-C) and PTSD Checklist for DSM-5 (PCL-5) designed by Moshier et al. (2019) in a sample of service members and veterans (SM/V; N = 298) who had sustained a traumatic brain injury (TBI) and were receiving inpatient rehabilitation. The PCL-C and PCL-5 were completed at the same time. Predicted PCL-5 scores for the sample were obtained according to the crosswalk developed by Moshier et al. We used three measures of agreement: intraclass correlation coefficient (ICC), mean difference between predicted and observed scores, and Cohen's κ to determine the performance of the crosswalk in this sample. Subgroups relevant to those who have sustained a TBI, such as TBI severity, were also examined. There was strong agreement between the predicted and observed PCL-5 scores (ICC = .95). The overall mean difference between predicted and observed PCL-5 scores was 0.07 and not statistically significant (SD = 8.29, p = .89). Significant mean differences between predicted and observed PCL-5 scores calculated between subgroups were seen in Black participants (MD = -4.09, SD = 8.41, p = .01) and those in the Year 5 follow-up group (MD = 1.77, SD = 7.14, p = .03). Cohen's κ across subgroups had a mean of κ = 0.76 (.57-1.0), suggesting that there was moderate to almost perfect diagnostic agreement. Our results suggest the crosswalk created by Moshier et al. can be applied to SM/V who have suffered a TBI. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
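Two of the agreement measures named above, the mean difference between predicted and observed scores and Cohen's κ for diagnostic agreement, can be illustrated with a minimal sketch. The scores and the cut-off of 33 are hypothetical values chosen for illustration and are not taken from the study; the intraclass correlation coefficient is omitted because the abstract does not say which form was computed.

```python
from statistics import mean


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two categorical classifications."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each classification's marginal proportions.
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical crosswalk-predicted and directly observed total scores.
predicted = [10, 25, 34, 40, 52, 31, 18, 45, 36, 29]
observed = [12, 22, 35, 38, 55, 35, 15, 47, 30, 27]
CUT_OFF = 33  # hypothetical screening cut-off, for illustration only

mean_difference = mean(p - o for p, o in zip(predicted, observed))
kappa = cohens_kappa(
    [p >= CUT_OFF for p in predicted],
    [o >= CUT_OFF for o in observed],
)
print(mean_difference, round(kappa, 2))  # 0.4 0.6
```

In this toy data the two classifications agree for 8 of 10 people while chance agreement is 0.5, which gives κ = 0.6.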
The onset of depressive episodes is preceded by changes in mean levels of affective experiences, which can be detected using the exponentially weighted moving average procedure on experience sampling method (ESM) data. Applying the exponentially weighted moving average procedure requires sufficient baseline data from the person under study in healthy times, which is needed to calculate a control limit for monitoring incoming ESM data. It is, however, not trivial to obtain sufficient baseline data from a single person. We therefore investigate whether historical ESM data from healthy individuals can help establish an adequate control limit for the person under study via multilevel modeling. Specifically, we focus on the case in which very little baseline data is available from the person under study (i.e., up to 7 days). This multilevel approach is compared with the traditional, person-specific approach, where estimates are obtained using the person's available baseline data. Predictive performance in terms of Matthews correlation coefficient did not differ much between the approaches; however, the multilevel approach was more sensitive at detecting mean changes. This implies that for low-cost and nonharmful interventions, the multilevel approach may prove particularly beneficial. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
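The monitoring step described above can be sketched with a textbook one-sided exponentially weighted moving average chart. The baseline mean and standard deviation are taken as given inputs here; in the study they would come from either the person-specific or the multilevel estimates. The smoothing weight `lam`, the limit width `width`, the function name `ewma_chart`, and the data are hypothetical choices for illustration, not values from the study.

```python
import math


def ewma_chart(observations, baseline_mean, baseline_sd, lam=0.2, width=3.0):
    """Return (ewma, upper_limit, alarm) per observation for a one-sided upper chart."""
    z = baseline_mean  # the EWMA statistic starts at the baseline mean
    results = []
    for t, x in enumerate(observations, start=1):
        # Blend the new observation with the previous EWMA value.
        z = lam * x + (1 - lam) * z
        # Time-varying control limit; it widens toward its asymptote as t grows.
        limit = baseline_mean + width * baseline_sd * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        )
        results.append((z, limit, z > limit))
    return results


# Hypothetical negative-affect ratings: stable at first, then a shift in mean level.
ratings = [2.1, 1.8, 2.3, 1.9, 2.0, 4.0, 4.2, 3.9, 4.1, 4.0]
chart = ewma_chart(ratings, baseline_mean=2.0, baseline_sd=1.0)
for z, limit, alarm in chart:
    print(round(z, 3), round(limit, 3), alarm)
```

Because the statistic smooths over past observations, the alarm flag in this toy series turns on a few observations after the mean shift rather than immediately. A smaller `lam` smooths more strongly and responds more slowly to a change in level.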