Pub Date: 2024-08-16 DOI: 10.1186/s40537-024-00978-7
Bubryur Kim, K. R. Sri Preethaa, Sujeen Song, R. R. Lukacs, Jinwoo An, Zengshun Chen, Euijung An, Sungho Kim
The construction industry substantially contributes to the economic growth of a country. However, it records a large number of workplace injuries and fatalities annually due to its hesitant adoption of automated safety monitoring systems. To address this critical concern, this study presents a real-time monitoring approach that uses the Internet of Things and ensemble learning. This study leverages wearable sensor technology, such as photoplethysmography and electroencephalography sensors, to continuously track the physiological parameters of construction workers. The sensor data is processed using an ensemble learning approach called the ChronoEnsemble Fatigue Analysis System (CEFAS), comprising deep autoregressive and temporal fusion transformer models, to accurately predict potential physical and mental fatigue. Comprehensive evaluation metrics, including mean square error, mean absolute scaled error, and symmetric mean absolute percentage error, demonstrated the superior prediction accuracy and reliability of the proposed model compared to standalone models. The ensemble learning model exhibited remarkable precision in predicting physical and mental fatigue, as evidenced by the mean square errors of 0.0008 and 0.0033, respectively. The proposed model promptly recognizes potential hazards and irregularities, considerably enhancing worker safety and reducing on-site risks.
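The three error metrics named in the abstract can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the simple averaging of two forecasts is an assumption (the abstract does not say how CEFAS combines its deep autoregressive and temporal fusion transformer outputs), and the sample numbers are made up.

```python
from typing import List, Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean square error between observed and forecast values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mase(actual: Sequence[float], predicted: Sequence[float],
         train: Sequence[float]) -> float:
    """Mean absolute scaled error.

    The forecast MAE is divided by the in-sample MAE of a naive
    one-step forecast (each training value predicted by the previous one).
    """
    naive_mae = sum(abs(train[i] - train[i - 1])
                    for i in range(1, len(train))) / (len(train) - 1)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mae / naive_mae


def smape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Symmetric mean absolute percentage error, on a 0 to 200 scale."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        if denom > 0:  # a term with both values zero contributes no error
            total += 2 * abs(p - a) / denom
    return 100 * total / len(actual)


def average_ensemble(forecast_a: Sequence[float],
                     forecast_b: Sequence[float]) -> List[float]:
    """ASSUMPTION: combine two model forecasts by an unweighted mean."""
    return [(a + b) / 2 for a, b in zip(forecast_a, forecast_b)]


# Made-up fatigue scores, for illustration only.
history = [0.2, 0.4, 0.3, 0.5]
observed = [0.5, 0.6]
model_a = [0.4, 0.7]  # stand-in for one base model's forecast
model_b = [0.6, 0.7]  # stand-in for the other base model's forecast
combined = average_ensemble(model_a, model_b)

print("MSE  ", mse(observed, combined))
print("MASE ", mase(observed, combined, history))
print("sMAPE", smape(observed, combined))
```

A MASE below 1 means the forecast beats the naive previous-value baseline on the training series, which makes it easier to compare across signals with different scales than MSE.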
Title: Internet of things and ensemble learning-based mental and physical fatigue monitoring for smart construction sites. Journal: Journal of Big Data. URL: https://doi.org/10.1186/s40537-024-00978-7
Pub Date: 2024-08-12 DOI: 10.1186/s40537-024-00979-6
Samia Loucif, Murad Al-Rajab, Raed Abu Zitar, Mahmoud Rezk
This paper presents a comprehensive approach to harmonizing lunar calendars across different global regions, addressing the long-standing challenge of variations in new crescent Moon sightings that mark the beginning of lunar months. We propose a machine learning (ML)-based framework to predict the visibility of the new crescent Moon, representing a significant advancement toward a globally unified lunar calendar. Our study used a dataset covering various countries globally, making it the first to analyze all 12 lunar months over a span of 13 years. We applied a wide array of ML algorithms and techniques, including feature selection, hyperparameter tuning, ensemble learning, and region-based clustering, all aimed at maximizing the model’s performance. The overall results reveal that the gradient boosting (GB) model surpasses all other models, achieving the highest F1 score of 0.882469 and an area under the curve (AUC) of 0.901009. However, with features selected through the ANOVA F-test and optimized parameters, the Extra Trees model exhibited the best performance, with an F1 score of 0.887872 and an AUC of 0.906242. We expanded our analysis to explore ensemble models, aiming to understand how a combination of models might boost predictive accuracy. The Ensemble Model exhibited a slight improvement, with an F1 score of 0.888058 and an AUC of 0.907482. Additionally, the geographical segmentation of the dataset enhanced predictive performance in certain areas, such as Africa and Asia. In conclusion, ML techniques can provide an efficient and reliable tool for predicting new crescent Moon visibility, supporting decisions on marking the beginning of new lunar months.
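For readers unfamiliar with the two scores reported above, the following sketch computes an F1 score and an AUC from scratch. It is illustrative only: the function names, the 0.5 decision threshold, and the labels and scores are assumptions, not taken from the paper's dataset or code.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one,
    with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = crescent seen, 0 = not seen.
labels = [1, 1, 0, 0, 1]
visibility_scores = [0.9, 0.4, 0.5, 0.2, 0.8]
# ASSUMPTION: classify as "seen" when the score is at least 0.5.
predictions = [1 if s >= 0.5 else 0 for s in visibility_scores]

print("F1 ", f1_score(labels, predictions))
print("AUC", roc_auc(labels, visibility_scores))
```

F1 depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why abstracts such as this one usually report both.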
Title: Toward a globally lunar calendar: a machine learning-driven approach for crescent moon visibility prediction. Journal: Journal of Big Data. URL: https://doi.org/10.1186/s40537-024-00979-6
The k-Nearest Neighbors (kNN) method, established in 1951, has since evolved into a pivotal tool in data mining, recommendation systems, and Internet of Things (IoT), among other areas. This paper presents a comprehensive review and performance analysis of modifications made to enhance the exact kNN techniques, particularly focusing on kNN Search and kNN Join for high-dimensional data. We delve deep into 31 kNN search methods and 12 kNN join methods, providing a methodological overview and analytical insight into each, emphasizing their strengths, limitations, and applicability. An important feature of our study is the provision of the source code for each of the kNN methods discussed, fostering ease of experimentation and comparative analysis for readers. Motivated by the rising significance of kNN in high-dimensional spaces and a recognized gap in comprehensive surveys on exact kNN techniques, our work seeks to bridge this gap. Additionally, we outline existing challenges and present potential directions for future research in the domain of kNN techniques, offering a holistic guide that amalgamates, compares, and dissects existing methodologies in a coherent manner.
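As a point of reference for the two operations surveyed, the sketch below shows the brute-force baseline for exact kNN search and kNN join. It is not one of the 31 search or 12 join methods reviewed in the paper; the function names and sample points are hypothetical, and Euclidean distance is an assumption.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


def knn_search(query: Point, points: Sequence[Point],
               k: int) -> List[Tuple[float, int]]:
    """Exact kNN search by linear scan.

    Returns up to k (distance, index) pairs, nearest first. Every point
    is compared with the query, so the result is exact, at a cost of
    O(n * d) distance work per query.
    """
    candidates = ((math.dist(query, p), i) for i, p in enumerate(points))
    return heapq.nsmallest(k, candidates)


def knn_join(r: Sequence[Point], s: Sequence[Point],
             k: int) -> Dict[int, List[Tuple[float, int]]]:
    """Exact kNN join: for each point in r, its k nearest neighbors in s."""
    return {i: knn_search(q, s, k) for i, q in enumerate(r)}


# Made-up 2-D points for illustration.
dataset = [(0.0, 0.0), (1.0, 1.0), (3.0, 3.0), (5.0, 5.0)]
queries = [(0.0, 0.0), (5.0, 4.0)]

print(knn_search((1.2, 1.0), dataset, k=2))
print(knn_join(queries, dataset, k=1))
```

The linear scan makes the cost of exactness visible: a join over sets of size m and n performs m * n distance computations, which is the work that indexing and pruning techniques for high-dimensional data aim to reduce.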