Identifying the Most Significant Features for Stress Prediction of Automobile Drivers: A Comprehensive Study

Journal of Information & Knowledge Management (IF 0.9, Q3, Information Science & Library Science). Pub Date: 2023-11-08. DOI: 10.1142/s0219649223500648

May Y. Al-Nashashibi, Nuha El-Khalili, Wael Hadi, Abedal-Kareem Al-Banna, Ghassan Issa

Abstract

Objective: This paper applied three feature selection methods to a dataset of Jordanian automobile drivers to identify the features that matter most to the performance of stress prediction algorithms. The dataset contains "stress" and "no-stress" classes and 30 features, categorised into physiological and contextual subsets.

Methods: Eighteen classifiers from six categories of prediction algorithm were evaluated: Rule-based, Tree-based, Ensemble-based, Function-based, Naïve Bayes-based and Lazy-based. Three Feature Subset Selection (FSS) methods were used: Gain Ratio, Chi-square and feature separation. Eight evaluation measures were used: F1, Accuracy, Specificity, Sensitivity, Kappa statistic, Mean Absolute Error (MAE), Area Under Curve (AUC) and Precision-Recall Curve Area (PRCA).

Results: Among the classifiers, the Lazy-based LocalKNN performed significantly well on F1, Accuracy, Kappa and MAE, while the Naïve Bayes-based Bayesian Network excelled on the other measures. The original dataset with all features yielded the best overall performance, followed by the physiological-only subset. The Gain Ratio and Chi-square FSS methods also showed promising results, though these were not significant.

Conclusion: Four physiological features (EMG, EMG Amplitude, Heart rate, Respiration Amplitude) and seven contextual features (time range of driving, gender, age, driving skills, general accidents, last year's accidents, stress frequency) contributed to the best prediction outcomes. The study highlights the importance of proper feature selection and identifies the best-performing algorithms for specific measures.
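To make two of the ingredients named in the abstract concrete, the following is a minimal sketch of the Gain Ratio score for a single discrete feature and of several of the listed evaluation measures computed from a binary confusion matrix. It is not the authors' implementation: the abstract does not say which toolkit was used, the function names (`entropy`, `gain_ratio`, `binary_measures`) are hypothetical, and the toy values are made up purely for illustration. It also assumes features are already discretised and that "stress" is treated as the positive class.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of a discrete feature divided by its split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Class entropy remaining after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0:  # feature takes a single value, so it carries no information
        return 0.0
    return (entropy(labels) - conditional) / split_info


def binary_measures(tp, fp, fn, tn):
    """Measures from a 2x2 confusion matrix; assumes no denominator is zero."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # recall on the positive ("stress") class
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Agreement expected by chance, used by Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


# Toy, made-up data: a hypothetical discretised feature against the class label.
toy_feature = ["high", "high", "low", "low"]
toy_labels = ["stress", "stress", "no-stress", "no-stress"]
print(gain_ratio(toy_feature, toy_labels))
print(binary_measures(tp=40, fp=10, fn=10, tn=40))
```

Ranking all 30 features by such a score and keeping the top-ranked ones is one common way a Gain Ratio filter is applied; the cut-off used in the paper is not stated in the abstract.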


Source journal

Journal of Information & Knowledge Management (Information Science & Library Science)

CiteScore: 2.40

Self-citation rate: 25.00%

About the journal: JIKM is a refereed journal published quarterly by World Scientific and dedicated to the exchange of the latest research and practical information in the field of information processing and knowledge management. The journal publishes original research and case studies by academic, business and government contributors on all aspects of information processing, information management, knowledge management, tools, techniques and technologies, knowledge creation and sharing, best practices, policies and guidelines. JIKM is an international journal aimed at providing quality information to subscribers around the world. Managed by an international editorial board, JIKM positions itself as one of the leading scholarly journals in the field of information processing and knowledge management. It is a good reference for both information and knowledge management professionals. The journal covers key areas in the field of information and knowledge management. Research papers, practical applications, working papers, and case studies are invited in the following areas:
- Business intelligence and competitive intelligence
- Communication and organizational culture
- e-Learning and lifelong learning
- Electronic records and document management
- Information processing and information management
- Information organization, taxonomies and ontology
- Intellectual capital
- Knowledge creation, retention, sharing and transfer
- Knowledge discovery, data and text mining
- Knowledge management and innovations
- Knowledge management education
- Knowledge management tools and technologies
- Knowledge management measurements
- Knowledge professionals and leadership
- Learning organization and organizational learning
- Practical implementations of knowledge management