Terrorist activity worldwide continues to pose a significant threat to global security and stability, and the unpredictable nature of attacks calls for advanced analytical approaches to improve prevention and response strategies. This study examines undetectable word extensions across multiple datasets, using terrorism-related datasets as a case study, with the aim of overcoming limitations in current models for predicting terrorist attacks. While many studies have used the Global Terrorism Database (GTD) to predict terrorist attacks worldwide, this study goes beyond the GTD by evaluating a corpus of terrorism incidents to enhance predictive analysis through lexical usage. Integrating multiple datasets reduces dependence on the GTD alone. The study evaluates several machine learning algorithms, including Decision Tree (DT), Bootstrap Aggregating (BA), Random Forest (RF), Extra Trees (ET) and XGBoost (XG). RF performs best on the GTD, with 90.20% accuracy in predicting worldwide terrorist attacks, while DT achieves 90.40% accuracy on the TF–IDF dataset. XG performs best across various aggregation settings and feature sets, reaching 95.77% accuracy in forecasting worldwide terrorist attacks. This consistent performance across contexts makes XG the preferred algorithm for predictive research on global terrorist attacks using the available datasets. The findings underscore the importance of incorporating diverse datasets to improve both the understanding of terrorist activities and predictive capability.