Identifying the Most Significant Features for Stress Prediction of Automobile Drivers: A Comprehensive Study
May Y. Al-Nashashibi, Nuha El-Khalili, Wael Hadi, Abedal-Kareem Al-Banna, Ghassan Issa
Journal of Information & Knowledge Management, published 2023-11-08. DOI: 10.1142/s0219649223500648 (https://doi.org/10.1142/s0219649223500648)
Abstract
Objective: This paper applied three feature selection methods to a dataset of Jordanian automobile drivers to identify the features that matter most to the performance of stress prediction algorithms. The dataset contains “stress” and “no-stress” classes with 30 features, categorised into physiological and contextual subsets.
Methods: Eighteen classifiers from six prediction algorithm categories were evaluated: Rule-based, Tree-based, Ensemble-based, Function-based, Naïve Bayes-based and Lazy-based. Three Feature Subset Selection (FSS) methods were used: Gain Ratio, Chi-square and feature separation. Eight evaluation measures were used: F1, Accuracy, Specificity, Sensitivity, Kappa statistic, Mean Absolute Error (MAE), Area Under Curve (AUC) and Precision Recall Curve Area (PRCA).
Results: Among the classifiers, the Lazy-based LocalKNN performed significantly well on F1, Accuracy, Kappa and MAE, while the Naïve Bayes-based Bayesian Network excelled on the remaining measures. The original dataset with all features yielded the best overall performance, followed by the physiological-only subset. The Gain Ratio and Chi-square FSS methods also showed promising results, though these were not significant.
Conclusion: Four physiological features (EMG, EMG Amplitude, Heart rate, Respiration Amplitude) and seven contextual features (time range of driving, gender, age, driving skills, general accidents, last year’s accidents, stress frequency) contributed to the best prediction outcomes. The study highlights the importance of proper feature selection and identifies the best-performing algorithms for specific measures.
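As a rough illustration of two ingredients named in the abstract, the sketch below computes the Gain Ratio of a discrete feature and several of the listed evaluation measures from a binary confusion matrix. It is not the authors' implementation: the function names, the toy feature values and the confusion counts are all hypothetical, and continuous signals such as EMG or heart rate would first need to be discretised before a Gain Ratio of this form applies.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature, labels):
    """Gain Ratio = information gain / split information for one discrete feature."""
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A feature with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


def binary_measures(tp, fp, tn, fn):
    """Accuracy, Sensitivity, Specificity, F1 and Cohen's kappa.

    Assumes every denominator is non-zero (both classes are present
    and at least one positive prediction was made).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # recall on the "stress" class
    specificity = tn / (tn + fp)   # recall on the "no-stress" class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Chance agreement between predicted and actual class marginals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical toy data: four drivers, two candidate features.
labels = ["stress", "stress", "no-stress", "no-stress"]
emg_level = ["high", "high", "low", "low"]   # separates the classes perfectly
gender = ["m", "f", "m", "f"]                # carries no class information here

ranking = sorted(
    {"emg_level": gain_ratio(emg_level, labels), "gender": gain_ratio(gender, labels)}.items(),
    key=lambda item: item[1],
    reverse=True,
)

# Hypothetical confusion counts for one classifier.
measures = binary_measures(tp=40, fp=10, tn=45, fn=5)
```

Ranking features by a score like this and keeping the top ones is one common way to form a feature subset; the measures dictionary can then be compared across classifiers trained on the full set and on each subset.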
About the journal:
JIKM is a refereed journal published quarterly by World Scientific and dedicated to the exchange of the latest research and practical information in the field of information processing and knowledge management. The journal publishes original research and case studies by academic, business and government contributors on all aspects of information processing, information management, knowledge management, tools, techniques and technologies, knowledge creation and sharing, best practices, policies and guidelines. JIKM is an international journal aimed at providing quality information to subscribers around the world. Managed by an international editorial board, JIKM positions itself as one of the leading scholarly journals in the field of information processing and knowledge management. It is a good reference for both information and knowledge management professionals.
The journal covers key areas in the field of information and knowledge management. Research papers, practical applications, working papers, and case studies are invited in the following areas:
- Business intelligence and competitive intelligence
- Communication and organizational culture
- e-Learning and lifelong learning
- Electronic records and document management
- Information processing and information management
- Information organization, taxonomies and ontology
- Intellectual capital
- Knowledge creation, retention, sharing and transfer
- Knowledge discovery, data and text mining
- Knowledge management and innovations
- Knowledge management education
- Knowledge management tools and technologies
- Knowledge management measurements
- Knowledge professionals and leadership
- Learning organization and organizational learning
- Practical implementations of knowledge management