A potentially overlooked contributor to the reproducibility crisis in psychology is the choice of statistical software used for factor analysis. Although the open science movement promotes transparency by advocating open access to data and statistical methods, this alone is insufficient to address the crisis. It is commonly assumed that different statistical programs produce equivalent results when conducting the same analysis; however, this is not necessarily the case. Programs often yield disparate outcomes even when identical data and factor analytic procedures are used, which can lead to inconsistent interpretations of results. Factor analysis plays a critical role in determining the underlying theory of cognitive ability instruments and guides how those instruments should be scored and interpreted, yet psychology is grappling with a reproducibility crisis in this area, as independent researchers and test publishers frequently report divergent factor analytic results. This study examines the phenomenon by conducting exploratory factor analyses on two tests of cognitive ability, the WISC-V and the MEZURE, using four different statistical programs. The results revealed significant variation in structural outcomes across the programs. These findings highlight the importance of using multiple statistical programs, ensuring transparency by sharing analysis code, and recognizing the potential for divergent outcomes when interpreting factor analytic results. Addressing these issues is important for advancing scientific integrity and mitigating the reproducibility crisis in psychology, particularly in relation to the structural validity of cognitive ability measures.
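As a concrete illustration of why the "same" exploratory factor analysis can diverge across packages, the minimal sketch below runs an EFA in Python with the key implementation choices spelled out rather than left to hidden defaults. It is not the analysis code used in this study: scikit-learn's FactorAnalysis estimator, the file name subtest_scores.csv, the five-factor solution, and the varimax rotation are all illustrative assumptions.

```python
# Minimal EFA sketch with implementation choices made explicit.
# Extraction method, rotation, and convergence settings are exactly the
# kind of buried defaults that differ across statistical programs and can
# produce divergent structural results from identical data.
import pandas as pd
from sklearn.decomposition import FactorAnalysis

# Hypothetical data: one row per examinee, one column per subtest score.
scores = pd.read_csv("subtest_scores.csv")

efa = FactorAnalysis(
    n_components=5,        # assumed number of factors to extract
    rotation="varimax",    # orthogonal rotation supported by scikit-learn
    max_iter=1000,         # iteration limit made explicit
    tol=1e-2,              # convergence tolerance made explicit
)
efa.fit(scores)

# Loadings matrix: rows are subtests, columns are extracted factors.
loadings = pd.DataFrame(
    efa.components_.T,
    index=scores.columns,
    columns=[f"Factor{i + 1}" for i in range(efa.components_.shape[0])],
)
print(loadings.round(2))
```

Sharing a script like this alongside the data is what transparency with analysis code means in practice: every choice that could cause one program's solution to differ from another's is visible and rerunnable.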