Gearboxes are critical components that often operate under demanding conditions, with continuous exposure to variable loads and speeds. In condition monitoring, the available data come mostly from normal operation, with far fewer instances of faulty conditions, which results in imbalanced datasets. To address this disparity, researchers have proposed various solutions for improving the performance of classification models. One such solution is to balance the dataset before training through oversampling. In this study, we used a Sparse Autoencoder for data augmentation and Support Vector Machine (SVM) and Random Forest (RF) algorithms for classification. We conducted four experiments to evaluate the impact of data imbalance on classifier performance: (1) using the original dataset without data augmentation, (2) applying partial data augmentation, (3) applying full data augmentation, and (4) balancing the dataset and applying Kernel Principal Component Analysis (KPCA) for dimensionality reduction. Both algorithms achieved accuracies exceeding 90% even on the original, non-augmented data. With partial data augmentation, both algorithms achieved accuracies above 98%, and full data augmentation yielded slightly better results than partial augmentation. After KPCA reduced the number of dimensions from 18 to 11, both classifiers maintained robust performance: SVM achieved an overall accuracy of 98.72%, while RF achieved 96.06%.
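
As a rough illustration of the fourth experiment, the sketch below reduces an 18-feature dataset to 11 KPCA components and then trains SVM and RF classifiers with scikit-learn. It is not the implementation used in the study: the data are synthetic and already balanced (a stand-in for the augmented gearbox dataset), and the RBF kernel, the standardization step, the train/test split, and all hyperparameters are assumptions chosen only for illustration. The Sparse Autoencoder augmentation step is not reproduced here.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import KernelPCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, balanced stand-in for the augmented 18-feature dataset (assumption).
X, y = make_classification(
    n_samples=600,
    n_features=18,
    n_informative=10,
    n_redundant=2,
    weights=[0.5, 0.5],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def build_pipeline(classifier):
    """Standardize, reduce 18 -> 11 dimensions with KPCA, then classify."""
    return make_pipeline(
        StandardScaler(),
        KernelPCA(n_components=11, kernel="rbf"),  # kernel choice is an assumption
        classifier,
    )


models = {
    "SVM": build_pipeline(SVC(kernel="rbf")),
    "RF": build_pipeline(RandomForestClassifier(n_estimators=100, random_state=0)),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name} accuracy after KPCA: {scores[name]:.4f}")

# Shape of the test data after scaling and KPCA (all steps except the classifier).
reduced = models["SVM"][:-1].transform(X_test)
print("Reduced feature matrix shape:", reduced.shape)
```

Placing KPCA inside the pipeline means the projection is fitted on the training split only, so the test accuracy is not influenced by information from the test samples. The accuracies printed by this sketch depend on the synthetic data and are not comparable to the figures reported above.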