Joseph H Grochowalski, Ezgi Ayturk, Amy Hendrickson
We introduce a new method for estimating the degree of nonadditivity in a one-facet generalizability theory design. One-facet G-theory designs have only one observation per cell, such as persons answering items in a test, and assume that there is no interaction between facets. When interaction is present, the model becomes nonadditive, and G-theory variance estimates and reliability coefficients are likely biased. We introduce a multidimensional method for detecting interaction and nonadditivity in G-theory that has less bias and smaller error variance than methods based on Tukey's one-degree-of-freedom test for nonadditivity. The proposed method is more flexible and detects a greater variety of interactions than the formulation based on Tukey's test. Further, the proposed method is descriptive and illustrates the nature of the facet interaction using profile analysis, giving insight into potential sources of interaction such as rater biases, differential item functioning (DIF), threats to test security, and other possible sources of systematic construct-irrelevant variance. We demonstrate the accuracy of our method using a simulation study and illustrate its descriptive profile features with a real data analysis of neurocognitive test scores. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
{"title":"Multidimensional nonadditivity in one-facet g-theory designs: A profile analytic approach.","authors":"Joseph H Grochowalski, Ezgi Ayturk, Amy Hendrickson","doi":"10.1037/met0000452","DOIUrl":"https://doi.org/10.1037/met0000452","url":null,"abstract":"<p><p>We introduce a new method for estimating the degree of nonadditivity in a one-facet generalizability theory design. One-facet G-theory designs have only one observation per cell, such as persons answering items in a test, and assume that there is no interaction between facets. When there is interaction, the model becomes nonadditive, and G-theory variance estimates and reliability coefficients are likely biased. We introduce a multidimensional method for detecting interaction and nonadditivity in G-theory that has less bias and smaller error variance than methods that use the one-degree of freedom method based on Tukey's test for nonadditivity. The method we propose is more flexible and detects a greater variety of interactions than the formulation based on Tukey's test. Further, the proposed method is descriptive and illustrates the nature of the facet interaction using profile analysis, giving insight into potential interaction like rater biases, DIF, threats to test security, and other possible sources of systematic construct-irrelevant variance. We demonstrate the accuracy of our method using a simulation study and illustrate its descriptive profile features with a real data analysis of neurocognitive test scores. (PsycInfo Database Record (c) 2023 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":"28 3","pages":"651-663"},"PeriodicalIF":7.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10002274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haley E Yaremych, Kristopher J Preacher, Donald Hedeker
The topic of centering in multilevel modeling (MLM) has received substantial attention from methodologists, as different centering choices for lower-level predictors present important ramifications for the estimation and interpretation of model parameters. However, the centering literature has focused almost exclusively on continuous predictors, with little attention paid to whether and how categorical predictors should be centered, despite their ubiquity across applied fields. Alongside this gap in the methodological literature, a review of applied articles showed that researchers center categorical predictors infrequently and inconsistently. Algebraically and statistically, continuous and categorical predictors behave the same, but researchers using them do not, and for many, interpreting the effects of categorical predictors is not intuitive. Thus, the goals of this tutorial article are twofold: to clarify why and how categorical predictors should be centered in MLM, and to explain how multilevel regression coefficients resulting from centered categorical predictors should be interpreted. We first provide algebraic support showing that uncentered coding variables result in a conflated blend of the within- and between-cluster effects of a multicategorical predictor, whereas appropriate centering techniques yield level-specific effects. Next, we provide algebraic derivations to illuminate precisely how the within- and between-cluster effects of a multicategorical predictor should be interpreted under dummy, contrast, and effect coding schemes. Finally, we provide a detailed demonstration of our conclusions with an empirical example. Implications for practice, including relevance of our findings to categorical control variables (i.e., covariates), interaction terms with categorical focal predictors, and multilevel latent variable models, are discussed. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
{"title":"Centering categorical predictors in multilevel models: Best practices and interpretation.","authors":"Haley E Yaremych, Kristopher J Preacher, Donald Hedeker","doi":"10.1037/met0000434","DOIUrl":"https://doi.org/10.1037/met0000434","url":null,"abstract":"<p><p>The topic of centering in multilevel modeling (MLM) has received substantial attention from methodologists, as different centering choices for lower-level predictors present important ramifications for the estimation and interpretation of model parameters. However, the centering literature has focused almost exclusively on continuous predictors, with little attention paid to whether and how categorical predictors should be centered, despite their ubiquity across applied fields. Alongside this gap in the methodological literature, a review of applied articles showed that researchers center categorical predictors infrequently and inconsistently. Algebraically and statistically, continuous and categorical predictors behave the same, but researchers using them do not, and for many, interpreting the effects of categorical predictors is not intuitive. Thus, the goals of this tutorial article are twofold: to clarify why and how categorical predictors should be centered in MLM, and to explain how multilevel regression coefficients resulting from centered categorical predictors should be interpreted. We first provide algebraic support showing that uncentered coding variables result in a conflated blend of the within- and between-cluster effects of a multicategorical predictor, whereas appropriate centering techniques yield level-specific effects. Next, we provide algebraic derivations to illuminate precisely how the within- and between-cluster effects of a multicategorical predictor should be interpreted under dummy, contrast, and effect coding schemes. Finally, we provide a detailed demonstration of our conclusions with an empirical example. Implications for practice, including relevance of our findings to categorical control variables (i.e., covariates), interaction terms with categorical focal predictors, and multilevel latent variable models, are discussed. (PsycInfo Database Record (c) 2023 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":"28 3","pages":"613-630"},"PeriodicalIF":7.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9646799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data can bring enormous benefits to psychology. However, many psychological researchers are skeptical about undertaking Big Data research. Psychologists often do not consider Big Data when developing their research projects because they have difficulty imagining how Big Data could help in their specific field of research, have difficulty imagining themselves as "Big Data scientists," or lack specific knowledge. This article provides an introductory guide to conducting Big Data research for psychologists who are considering this approach and want a general idea of its processes. Taking the Knowledge Discovery from Databases (KDD) steps as the fil rouge, we provide guidance for finding data suitable for psychological investigations, describe how these data can be preprocessed, and list some techniques for analyzing them, along with the programming languages (R and Python) through which all these steps can be carried out. In doing so, we explain the concepts using the field's terminology and take examples from psychology. For psychologists, becoming familiar with the language of data science is important because it may appear difficult and esoteric at first. As Big Data research is often multidisciplinary, this overview helps build a general understanding of the research steps and a common language, facilitating collaboration across different fields. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
{"title":"An introductory guide for conducting psychological research with big data.","authors":"Michela Vezzoli, Cristina Zogmaister","doi":"10.1037/met0000513","DOIUrl":"https://doi.org/10.1037/met0000513","url":null,"abstract":"<p><p>Big Data can bring enormous benefits to psychology. However, many psychological researchers show skepticism in undertaking Big Data research. Psychologists often do not take Big Data into consideration while developing their research projects because they have difficulties imagining how Big Data could help in their specific field of research, imagining themselves as \"Big Data scientists,\" or for lack of specific knowledge. This article provides an introductory guide for conducting Big Data research for psychologists who are considering using this approach and want to have a general idea of its processes. By taking the Knowledge Discovery from Database steps as the <i>fil rouge</i>, we provide useful indications for finding data suitable for psychological investigations, describe how these data can be preprocessed, and list some techniques to analyze them and programming languages (R and Python) through which all these steps can be realized. In doing so, we explain the concepts with the terminology and take examples from psychology. For psychologists, familiarizing with the language of data science is important because it may appear difficult and esoteric at first approach. As Big Data research is often multidisciplinary, this overview helps build a general insight into the research steps and a common language, facilitating collaboration across different fields. (PsycInfo Database Record (c) 2023 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":"28 3","pages":"580-599"},"PeriodicalIF":7.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9728336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maximilian Linde, Jorge N Tendeiro, Ravi Selker, Eric-Jan Wagenmakers, Don van Ravenzwaaij
Some important research questions require the ability to find evidence for two conditions being practically equivalent. This is impossible to accomplish within the traditional frequentist null hypothesis significance testing framework; hence, other methodologies must be utilized. We explain and illustrate three approaches for finding evidence for equivalence: The frequentist two one-sided tests procedure, the Bayesian highest density interval region of practical equivalence procedure, and the Bayes factor interval null procedure. We compare the classification performances of these three approaches for various plausible scenarios. The results indicate that the Bayes factor interval null approach compares favorably to the other two approaches in terms of statistical power. Critically, compared with the Bayes factor interval null procedure, the two one-sided tests and the highest density interval region of practical equivalence procedures have limited discrimination capabilities when the sample size is relatively small: Specifically, in order to be practically useful, these two methods generally require over 250 cases within each condition when rather large equivalence margins of approximately .2 or .3 are used; for smaller equivalence margins even more cases are required. Because of these results, we recommend that researchers rely more on the Bayes factor interval null approach for quantifying evidence for equivalence, especially for studies that are constrained on sample size. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
{"title":"Decisions about equivalence: A comparison of TOST, HDI-ROPE, and the Bayes factor.","authors":"Maximilian Linde, Jorge N Tendeiro, Ravi Selker, Eric-Jan Wagenmakers, Don van Ravenzwaaij","doi":"10.1037/met0000402","DOIUrl":"https://doi.org/10.1037/met0000402","url":null,"abstract":"<p><p>Some important research questions require the ability to find evidence for two conditions being practically equivalent. This is impossible to accomplish within the traditional frequentist null hypothesis significance testing framework; hence, other methodologies must be utilized. We explain and illustrate three approaches for finding evidence for equivalence: The frequentist two one-sided tests procedure, the Bayesian highest density interval region of practical equivalence procedure, and the Bayes factor interval null procedure. We compare the classification performances of these three approaches for various plausible scenarios. The results indicate that the Bayes factor interval null approach compares favorably to the other two approaches in terms of statistical power. Critically, compared with the Bayes factor interval null procedure, the two one-sided tests and the highest density interval region of practical equivalence procedures have limited discrimination capabilities when the sample size is relatively small: Specifically, in order to be practically useful, these two methods generally require over 250 cases within each condition when rather large equivalence margins of approximately .2 or .3 are used; for smaller equivalence margins even more cases are required. Because of these results, we recommend that researchers rely more on the Bayes factor interval null approach for quantifying evidence for equivalence, especially for studies that are constrained on sample size. (PsycInfo Database Record (c) 2023 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":"28 3","pages":"740-755"},"PeriodicalIF":7.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10002258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This note contains a corrective and a generalization of results by Borsboom et al. (2008), based on Heesen and Romeijn (2019). It highlights the relevance of insights from psychometrics beyond the context of psychological testing. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
{"title":"Measurement invariance, selection invariance, and fair selection revisited.","authors":"Remco Heesen, Jan-Willem Romeijn","doi":"10.1037/met0000491","DOIUrl":"https://doi.org/10.1037/met0000491","url":null,"abstract":"<p><p>This note contains a corrective and a generalization of results by Borsboom et al. (2008), based on Heesen and Romeijn (2019). It highlights the relevance of insights from psychometrics beyond the context of psychological testing. (PsycInfo Database Record (c) 2023 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":"28 3","pages":"687-690"},"PeriodicalIF":7.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9644365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Coefficient α, although ubiquitous in the research literature, is frequently criticized for being a poor estimate of test reliability. In this note, we consider the range of α and prove that it has no lower bound (i.e., α ∈ (−∞, 1]). While outlining our proofs, we present algorithms for generating data sets that will yield any fixed value of α in its range. We also prove that for some data sets, even those with appreciable item correlations, α is undefined. Although α is a putative estimate of the correlation between parallel forms, it is not a correlation, as α can assume any value below −1 (and α values below 0 are nonsensical reliability estimates). In the online supplemental materials, we provide R code for replicating our empirical findings and for generating data sets with user-defined α values. We hope that researchers will use this code to better understand the limitations of α as an index of scale reliability. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
{"title":"What are the mathematical bounds for coefficient α?","authors":"Niels Waller, William Revelle","doi":"10.1037/met0000583","DOIUrl":"https://doi.org/10.1037/met0000583","url":null,"abstract":"<p><p>Coefficient α, although ubiquitous in the research literature, is frequently criticized for being a poor estimate of test reliability. In this note, we consider the range of α and prove that it has no lower bound (i.e., α ∈ ( - ∞, 1]). While outlining our proofs, we present algorithms for generating data sets that will yield any fixed value of α in its range. We also prove that for some data sets-even those with appreciable item correlations-α is undefined. Although α is a putative estimate of the correlation between parallel forms, it is not a correlation as α can assume any value below-1 (and α values below 0 are nonsensical reliability estimates). In the online supplemental materials, we provide R code for replicating our empirical findings and for generating data sets with user-defined α values. We hope that researchers will use this code to better understand the limitations of α as an index of scale reliability. (PsycInfo Database Record (c) 2023 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.0,"publicationDate":"2023-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9892839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The number of available factor analytic techniques has increased in recent decades. However, the lack of clear guidelines and of exhaustive comparison studies among the techniques might prevent these valuable methodological advances from making their way into applied research. The present paper evaluates the performance of confirmatory factor analysis (CFA), CFA with sequential model modification using modification indices and the Saris procedure, exploratory factor analysis (EFA) with different rotation procedures (Geomin, target, and objectively refined target matrix), Bayesian structural equation modeling (BSEM), and a new set of procedures that, after fitting an unrestricted model (i.e., EFA, BSEM), identify and retain only the relevant loadings to provide a parsimonious CFA solution (ECFA, BCFA). By means of an exhaustive Monte Carlo simulation study and a real data illustration, it is shown that CFA and BSEM are overly rigid and, consequently, do not appropriately recover the structure of slightly misspecified models. EFA usually provides the most accurate parameter estimates, although the choice of rotation procedure is of major importance, especially depending on whether the latent factors are correlated or not. Finally, ECFA might be a sound option whenever an a priori structure cannot be hypothesized and the latent factors are correlated. Moreover, it is shown that the pattern of results of a factor analytic technique can be partly predicted from its position in the confirmatory-exploratory continuum. Applied recommendations are given for selecting the most appropriate technique under different representative scenarios by means of a detailed flowchart. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
{"title":"Is exploratory factor analysis always to be preferred? A systematic comparison of factor analytic techniques throughout the confirmatory-exploratory continuum.","authors":"Pablo Nájera, Francisco J Abad, Miguel A Sorrel","doi":"10.1037/met0000579","DOIUrl":"https://doi.org/10.1037/met0000579","url":null,"abstract":"<p><p>The number of available factor analytic techniques has been increasing in the last decades. However, the lack of clear guidelines and exhaustive comparison studies between the techniques might hinder that these valuable methodological advances make their way to applied research. The present paper evaluates the performance of confirmatory factor analysis (CFA), CFA with sequential model modification using modification indices and the Saris procedure, exploratory factor analysis (EFA) with different rotation procedures (Geomin, target, and objectively refined target matrix), Bayesian structural equation modeling (BSEM), and a new set of procedures that, after fitting an unrestrictive model (i.e., EFA, BSEM), identify and retain only the relevant loadings to provide a parsimonious CFA solution (ECFA, BCFA). By means of an exhaustive Monte Carlo simulation study and a real data illustration, it is shown that CFA and BSEM are overly stiff and, consequently, do not appropriately recover the structure of slightly misspecified models. EFA usually provides the most accurate parameter estimates, although the rotation procedure choice is of major importance, especially depending on whether the latent factors are correlated or not. Finally, ECFA might be a sound option whenever an a priori structure cannot be hypothesized and the latent factors are correlated. Moreover, it is shown that the pattern of the results of a factor analytic technique can be somehow predicted based on its positioning in the confirmatory-exploratory continuum. Applied recommendations are given for the selection of the most appropriate technique under different representative scenarios by means of a detailed flowchart. (PsycInfo Database Record (c) 2023 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.0,"publicationDate":"2023-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9876148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Egamaria Alacam, Craig K Enders, Han Du, Brian T Keller
Composite scores are an exceptionally important psychometric tool for behavioral science research applications. A prototypical example occurs with self-report data, where researchers routinely use questionnaires with multiple items that tap into different features of a target construct. Item-level missing data are endemic to composite score applications. Many studies have investigated this issue, and the near-universal theme is that item-level missing data treatment is superior because it maximizes precision and power. However, item-level missing data handling can be challenging because missing data models become very complex and suffer from the same "curse of dimensionality" problem that plagues the estimation of psychometric models. A good deal of recent missing data literature has focused on advancing factored regression specifications that use a sequence of regression models to represent the multivariate distribution of a set of incomplete variables. The purpose of this paper is to describe and evaluate a factored specification for composite scores with incomplete item responses. We used a series of computer simulations to compare the proposed approach to gold standard multiple imputation and latent variable modeling approaches. Overall, the simulation results suggest that this new approach can be very effective, even under extreme conditions where the number of items is very large relative to (or even exceeds) the sample size. A real data analysis illustrates the application of the method using software available on the internet. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
{"title":"A factored regression model for composite scores with item-level missing data.","authors":"Egamaria Alacam, Craig K Enders, Han Du, Brian T Keller","doi":"10.1037/met0000584","DOIUrl":"https://doi.org/10.1037/met0000584","url":null,"abstract":"<p><p>Composite scores are an exceptionally important psychometric tool for behavioral science research applications. A prototypical example occurs with self-report data, where researchers routinely use questionnaires with multiple items that tap into different features of a target construct. Item-level missing data are endemic to composite score applications. Many studies have investigated this issue, and the near-universal theme is that item-level missing data treatment is superior because it maximizes precision and power. However, item-level missing data handling can be challenging because missing data models become very complex and suffer from the same \"curse of dimensionality\" problem that plagues the estimation of psychometric models. A good deal of recent missing data literature has focused on advancing factored regression specifications that use a sequence of regression models to represent the multivariate distribution of a set of incomplete variables. The purpose of this paper is to describe and evaluate a factored specification for composite scores with incomplete item responses. We used a series of computer simulations to compare the proposed approach to gold standard multiple imputation and latent variable modeling approaches. Overall, the simulation results suggest that this new approach can be very effective, even under extreme conditions where the number of items is very large (or even exceeds) the sample size. A real data analysis illustrates the application of the method using software available on the internet. (PsycInfo Database Record (c) 2023 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.0,"publicationDate":"2023-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9892840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scores on self-report questionnaires are often used in statistical models without accounting for measurement error, leading to bias in estimates related to those variables. While measurement error corrections exist, their broad application is limited by their simplicity (e.g., Spearman's correction for attenuation), which complicates their inclusion in specialized analyses, or complexity (e.g., latent variable modeling), which necessitates large sample sizes and can limit the analytic options available. To address these limitations, a flexible multiple imputation-based approach, called true score imputation, is described, which can accommodate a broad class of statistical models. By augmenting copies of the original dataset with sets of plausible true scores, the resulting set of datasets can be analyzed using widely available multiple imputation methodology, yielding point estimates and confidence intervals calculated with respect to the estimated true score. A simulation study demonstrates that the method yields a large reduction in bias compared to treating scores as measured without error, and a real-world data example is further used to illustrate the benefit of the method. An R package implements the proposed method via a custom imputation function for an existing, commonly used multiple imputation library (mice), allowing true score imputation to be used alongside multiple imputation for missing data, yielding a unified framework for accounting for both missing data and measurement error. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
{"title":"A true score imputation method to account for psychometric measurement error.","authors":"Maxwell Mansolf","doi":"10.1037/met0000578","DOIUrl":"10.1037/met0000578","url":null,"abstract":"<p><p>Scores on self-report questionnaires are often used in statistical models without accounting for measurement error, leading to bias in estimates related to those variables. While measurement error corrections exist, their broad application is limited by their simplicity (e.g., Spearman's correction for attenuation), which complicates their inclusion in specialized analyses, or complexity (e.g., latent variable modeling), which necessitates large sample sizes and can limit the analytic options available. To address these limitations, a flexible multiple imputation-based approach, called <i>true score imputation</i>, is described, which can accommodate a broad class of statistical models. By augmenting copies of the original dataset with sets of plausible true scores, the resulting set of datasets can be analyzed using widely available multiple imputation methodology, yielding point estimates and confidence intervals calculated with respect to the estimated true score. A simulation study demonstrates that the method yields a large reduction in bias compared to treating scores as measured without error, and a real-world data example is further used to illustrate the benefit of the method. An R package implements the proposed method via a custom imputation function for an existing, commonly used multiple imputation library (mice), allowing true score imputation to be used alongside multiple imputation for missing data, yielding a unified framework for accounting for both missing data and measurement error. (PsycInfo Database Record (c) 2023 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.0,"publicationDate":"2023-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10674037/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9707324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mirka Henninger, Rudolf Debelak, Yannick Rothacher, Carolin Strobl
In recent years, machine learning methods have become increasingly popular tools for prediction in psychology. At the same time, psychological researchers are typically not only interested in making predictions about the dependent variable, but also in learning which predictor variables are relevant, how they influence the dependent variable, and which predictors interact with each other. However, most machine learning methods are not directly interpretable. Interpretation techniques that support researchers in describing how the machine learning technique came to its prediction may be a means to this end. We present a variety of interpretation techniques and illustrate the opportunities they provide for interpreting the results of two widely used black box machine learning methods that serve as our examples: random forests and neural networks. At the same time, we illustrate potential pitfalls and risks of misinterpretation that may occur in certain data settings. We show how correlated predictors affect interpretations of the relevance or shape of predictor effects, and in which situations interaction effects may or may not be detected. We use simulated didactic examples throughout the article, as well as an empirical data set for illustrating an approach to objectify the interpretation of visualizations. We conclude that, when critically reflected upon, interpretable machine learning techniques may provide useful tools for describing complex psychological relationships. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
{"title":"Interpretable machine learning for psychological research: Opportunities and pitfalls.","authors":"Mirka Henninger, Rudolf Debelak, Yannick Rothacher, Carolin Strobl","doi":"10.1037/met0000560","DOIUrl":"10.1037/met0000560","url":null,"abstract":"<p><p>In recent years, machine learning methods have become increasingly popular prediction methods in psychology. At the same time, psychological researchers are typically not only interested in making predictions about the dependent variable, but also in learning which predictor variables are relevant, how they influence the dependent variable, and which predictors interact with each other. However, most machine learning methods are not directly interpretable. Interpretation techniques that support researchers in describing how the machine learning technique came to its prediction may be a means to this end. We present a variety of interpretation techniques and illustrate the opportunities they provide for interpreting the results of two widely used black box machine learning methods that serve as our examples: random forests and neural networks. At the same time, we illustrate potential pitfalls and risks of misinterpretation that may occur in certain data settings. We show in which way correlated predictors impact interpretations with regard to the relevance or shape of predictor effects and in which situations interaction effects may or may not be detected. We use simulated didactic examples throughout the article, as well as an empirical data set for illustrating an approach to objectify the interpretation of visualizations. We conclude that, when critically reflected, interpretable machine learning techniques may provide useful tools when describing complex psychological relationships. (PsycInfo Database Record (c) 2023 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.0,"publicationDate":"2023-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9876144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}