Correlations estimated in single-source data provide uninterpretable estimates of the empirical overlap between scales. We describe a model that adjusts correlations for errors and biases using test–retest and multi-rater data, and we compare adjusted correlations among individual items with their human-rated semantic similarity (SS). We expected adjusted correlations to predict SS better than unadjusted correlations and to exceed SS in absolute magnitude. Although unadjusted and adjusted correlations predicted SS rankings equally well across all items, adjusted correlations were superior where items were judged most semantically redundant. Retest- and agreement-adjusted correlations were usually higher than SS, whereas unadjusted correlations often underestimated it. We discuss uses of test–retest and multi-rater data for identifying construct redundancy and argue that SS often underestimates variables’ empirical overlap.
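The general idea of adjusting an observed correlation for measurement error can be illustrated with the classical Spearman correction for attenuation, which divides the observed correlation by the square root of the product of the two variables' reliabilities (e.g., test–retest coefficients). This is only a sketch of the simplest such adjustment, not the specific model described above; all numbers below are illustrative.

```python
import math

def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Spearman correction for attenuation: adjust an observed
    correlation r_xy for unreliability in both measures, where
    rel_x and rel_y are reliability estimates (e.g., retest r)."""
    return r_xy / math.sqrt(rel_x * rel_y)

# Illustrative values: observed inter-item r = .45,
# retest reliabilities of .70 and .75 for the two items.
adjusted = disattenuate(0.45, 0.70, 0.75)
print(round(adjusted, 2))  # the adjusted estimate exceeds the observed .45
```

The adjusted value is always at least as large as the observed correlation, which is consistent with the abstract's finding that unadjusted correlations tend to underestimate empirical overlap.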
We examined two forms of self/observer agreement (correlational and mean-level) in personality using a Dutch university student sample (N = 5,405) with self-reports and observer (informant) reports from parents, siblings, friends, and partners/spouses. Correlational self/observer agreement was strong across all HEXACO-PI-R scales and across relationship types (r ≥ .59, highest for partners). Regarding mean-level self/observer agreement, the alleged positive bias in self-reports was not observed. Only Openness showed higher means for self-reports than for observer reports across all relationship types (d = 0.37). Mean observer-report scores varied by relationship: people perceived their children as more honest and less anxious, and their siblings as less agreeable, than other observers did. Partner reports showed the closest mean-level agreement with self-reports.
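The mean-level comparisons above are expressed as standardized mean differences (Cohen's d). A minimal sketch of that statistic, using the pooled standard deviation, is given below; the group means, SDs, and sample sizes are hypothetical and not taken from the study.

```python
import math

def cohens_d(m1: float, m2: float, sd1: float, sd2: float,
             n1: int, n2: int) -> float:
    """Cohen's d: standardized difference between two group means,
    using the pooled standard deviation."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                          / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical scale means for self-reports vs. observer reports.
d = cohens_d(3.6, 3.3, 0.8, 0.8, 200, 200)
print(round(d, 3))  # 0.375
```

A positive d here indicates higher self-report means, the pattern the abstract reports only for Openness.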

