Pub Date: 2024-09-01 DOI: 10.1016/j.mlwa.2024.100583
Wei Lun Koh, James Boon Yong Koh, Bing Tian Dai
We propose a framework for a cloud-based image classification system that is highly accessible, maintains data confidentiality, and is robust to incorrect training labels. The end-to-end system is implemented using Amazon Web Services (AWS), with a detailed guide provided for replication, expanding the ways in which researchers can collaborate with a community of users for mutual benefit. A front-end web application allows users across the world to securely log in, contribute labelled training images conveniently via a drag-and-drop approach, and use that same application to query an up-to-date model that has knowledge of images from the community of users. The resulting system demonstrates that theory can be effectively interlaced with practice, with various considerations addressed by our architecture. Users will have access to an image classification model that can be updated and automatically deployed within minutes, both benefiting from and contributing to the community of users. At the same time, researchers, who will act as administrators, will be able to conveniently and securely engage a large number of users with their respective machine learning models and build up a labelled database over time, paying only variable costs that are proportional to utilization.
Title: Robust image classification system via cloud computing, aligned multimodal embeddings, centroids and neighbours. Journal: Machine learning with applications, vol. 17, Article 100583. PDF: https://www.sciencedirect.com/science/article/pii/S2666827024000598/pdfft?md5=e2b0096db158ce87d429b6d8deb1da6e&pid=1-s2.0-S2666827024000598-main.pdf
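The abstract does not spell out the classification method; the paper's title mentions embeddings and centroids. As a rough illustration only, the sketch below shows a generic nearest-centroid classifier over embedding vectors and how averaging can dampen the effect of one incorrect label. The function names, the two-dimensional vectors, and the class labels are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def classify(query, labelled):
    """Return the label whose class centroid is most similar to the query."""
    centroids = {label: centroid(vecs) for label, vecs in labelled.items()}
    return max(centroids, key=lambda label: cosine(query, centroids[label]))


# Hypothetical toy embeddings; the third "cat" vector is a mislabelled dog.
labelled = {
    "cat": [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]],
    "dog": [[0.1, 0.9], [0.2, 0.8], [0.15, 0.85]],
}

prediction = classify([0.85, 0.15], labelled)
print(prediction)
```

In this toy case the single mislabelled vector shifts the "cat" centroid but not enough to change the prediction for a clearly cat-like query; real embeddings would be high-dimensional and the degree of robustness would depend on the data.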
Bike-sharing systems have grown in popularity in metropolitan areas, providing a convenient and environmentally friendly transportation choice for commuters and visitors alike. As demand for bike-sharing programs grows, efficient capacity planning becomes critical to ensuring a good user experience and system sustainability. This study used a random forest model to predict bike-sharing station demand; random forests are a strong ensemble learning approach that can capture complicated nonlinear correlations and interactions between input variables. The study employed data from the Smart Location Database (SLD) to test the model's accuracy in estimating station demand and used an explainable artificial intelligence (XAI) technique to further understand the machine learning (ML) prediction outcomes, given the black-box tendencies of ML models. Vehicle Miles of Travel (VMT) and Greenhouse Gas (GHG) emissions were the most important features in predicting docking station demand individually but not holistically based on the datasets. The percentage of zero-car households, gross residential density, road network density, aggregate frequency of transit service, and gross activity density were found to have a moderate influence on the prediction model. Further, there may be a better prediction model generating sensible results for every type of explanatory variable, but their contributions to the prediction outcome are minimal. By measuring each feature's contribution to demand prediction during feature engineering, bike-sharing operators can acquire a better understanding of bike-sharing station capacity and forecast future demand during planning. At the same time, ML models will need further assessment before a holistic conclusion can be drawn.
Title: Prediction of bike-sharing station demand using explainable artificial intelligence. Authors: Frank Ngeni, Boniphace Kutela, Tumlumbe Juliana Chengula, Cuthbert Ruseruka, Hannah Musau, Norris Novat, Debbie Aisiana Indah, Sarah Kasomi. Pub Date: 2024-08-09 DOI: 10.1016/j.mlwa.2024.100582. Journal: Machine learning with applications, vol. 17, Article 100582. PDF: https://www.sciencedirect.com/science/article/pii/S2666827024000586/pdfft?md5=bf46aecfa5d4b69f24c5a8d196610032&pid=1-s2.0-S2666827024000586-main.pdf
Pub Date: 2024-08-05 DOI: 10.1016/j.mlwa.2024.100580
Tumlumbe Juliana Chengula, Judith Mwakalonge, Gurcan Comert, Methusela Sulle, Saidi Siuhi, Eric Osei
The recent advancements in Advanced Driver Assistance Systems (ADAS) have significantly contributed to road safety and driving comfort. An integral aspect of these systems is the detection of driver anomalies such as drowsiness, distraction, and impairment, which are crucial for preventing accidents. Building upon previous studies that utilized ensemble model learning (XGBoost) with deep learning models (ResNet50, DenseNet201, and InceptionV3) for anomaly detection, this study introduces a comprehensive feature importance analysis using the SHAP (SHapley Additive exPlanations) technique. The technique is implemented through explainable artificial intelligence (XAI). The primary objective is to unravel the complex decision-making process of the ensemble model, which has previously demonstrated near-perfect performance metrics in classifying driver behaviors using in-vehicle cameras. By applying SHAP, the study aims to identify and quantify the contribution of each feature – such as facial expressions, head position, yawning, and sleeping – in predicting driver states. This analysis offers insights into the model’s inner workings and guides the enhancement of feature engineering for more precise and reliable anomaly detection. The findings of this study are expected to impact the development of future ADAS technologies significantly. By pinpointing the most influential features and understanding their dynamics, a model can be optimized for various driving scenarios, ensuring that ADAS systems are robust, accurate, and tailored to real-world conditions. Ultimately, this study contributes to the overarching goal of enhancing road safety through technologically advanced, data-driven approaches.
Title: Enhancing advanced driver assistance systems through explainable artificial intelligence for driver anomaly detection. Journal: Machine learning with applications, vol. 17, Article 100580. PDF: https://www.sciencedirect.com/science/article/pii/S2666827024000562/pdfft?md5=f80e9f1f7d3e0f04778d0e4fbebf27c0&pid=1-s2.0-S2666827024000562-main.pdf
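The SHAP technique named in the abstract above is built on Shapley values, which attribute a model's output to its input features by averaging each feature's marginal contribution over all feature subsets. The sketch below computes exact Shapley values by brute-force enumeration for a made-up scoring function. It is an illustration of the underlying idea only: the SHAP library uses faster approximations, and the feature names and scores here are hypothetical rather than results from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float.

    Enumerates every subset, so it is only practical for a handful of features.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_score(present):
    """Hypothetical drowsiness score given which features are available."""
    score = 0.0
    if "yawning" in present:
        score += 0.3
    if "head_position" in present:
        score += 0.2
    if "yawning" in present and "head_position" in present:
        score += 0.1  # interaction term, shared equally by Shapley values
    return score


phi = shapley_values(toy_score, ["yawning", "head_position", "facial_expression"])
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the score with all features and the score with none, which is the additivity property that makes such values convenient for ranking feature influence.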
Pub Date: 2024-08-05 DOI: 10.1016/j.mlwa.2024.100581
Elham Afzali, Saman Muthukumarana, Liqun Wang
The Gradient-Free Kernel Conditional Stein Discrepancy (GF-KCSD), presented in our prior work, represents a significant advancement in goodness-of-fit testing for conditional distributions. This method offers a robust alternative to previous gradient-based techniques, especially when the gradient calculation is intractable or computationally expensive. In this study, we explore previously unexamined aspects of GF-KCSD, with a particular focus on critical values and test power, which are essential components of effective hypothesis testing. We also present a novel investigation of the impact of measurement errors on the performance of GF-KCSD in comparison to established benchmarks, enhancing our understanding of its resilience to these errors. Through controlled experiments using synthetic data, we demonstrate GF-KCSD’s superior ability to control type-I error rates and maintain high statistical power, even in the presence of measurement inaccuracies. Our empirical evaluation extends to real-world datasets, including brain MRI data. The findings confirm that GF-KCSD performs comparably to KCSD in hypothesis testing effectiveness while requiring significantly less computational time. This demonstrates GF-KCSD’s capability as an efficient tool for analyzing complex data, enhancing its value for scenarios that demand rapid and robust statistical analysis.
Title: Navigating interpretability and alpha control in GF-KCSD testing with measurement error: A Kernel approach. Journal: Machine learning with applications, vol. 17, Article 100581. PDF: https://www.sciencedirect.com/science/article/pii/S2666827024000574/pdfft?md5=64343827db2919a23c638fc11f2df65c&pid=1-s2.0-S2666827024000574-main.pdf
Pub Date: 2024-07-24 DOI: 10.1016/j.mlwa.2024.100576
George Obaido, Ibomoiye Domor Mienye, Oluwaseun F. Egbelowo, Ikiomoye Douglas Emmanuel, Adeola Ogunleye, Blessing Ogbuokiri, Pere Mienye, Kehinde Aruleba
Drug discovery and development is a time-consuming process that involves identifying, designing, and testing new drugs to address critical medical needs. In recent years, machine learning (ML) has played a vital role in technological advancements and has shown promising results in various drug discovery and development stages. ML can be categorized into supervised, unsupervised, semi-supervised, and reinforcement learning. Supervised learning is the most used category, helping organizations solve several real-world problems. This study presents a comprehensive survey of supervised learning algorithms in drug design and development, focusing on their learning process and succinct mathematical formulations, which are lacking in the literature. Additionally, the study discusses widely encountered challenges in applying supervised learning for drug discovery and potential solutions. This study will be beneficial to researchers and practitioners in the pharmaceutical industry as it provides a simplified yet comprehensive review of the main concepts, algorithms, challenges, and prospects in supervised learning.
Title: Supervised machine learning in drug discovery and development: Algorithms, applications, challenges, and prospects. Journal: Machine learning with applications, vol. 17, Article 100576. PDF: https://www.sciencedirect.com/science/article/pii/S2666827024000525/pdfft?md5=96d7cf4045bc85637f8a61b12ff40fe9&pid=1-s2.0-S2666827024000525-main.pdf
Breast cancer (BC) is a prevalent malignancy worldwide, posing a significant public health burden due to its high incidence rate. Accurate detection is crucial for improving survival rates, and pathological diagnosis through biopsy is essential for detailed BC detection. Convolutional Neural Network (CNN)-based methods have been proposed to support this detection, utilizing patches from Whole Slide Imaging (WSI) combined with sophisticated CNNs. In this research, we introduced DECENN, a novel deep learning architecture designed to overcome the limitations of single CNN models under fixed pre-trained parameter transfer learning settings. DECENN employs an ensemble of VGG16 and DenseNet121, integrated with innovative modules such as Multi-Scale Feature Extraction, Heterogeneous Convolution Enhancement, Feature Harmonization and Fusion, and Feature Integration Output. Through progressive stages – from baseline models, intermediate DCNN and DCNN+ models, to the fully integrated DECENN model – significant performance improvements were observed in experiments using 5-fold cross-validation on the Patch Camelyon (PCam) dataset. DECENN achieved an AUC of 99.70% ± 0.12%, an F-score of 98.93% ± 0.06%, and an Accuracy of 98.92% ± 0.06% (p < 0.001). These results highlight DECENN’s potential to significantly enhance the automated detection and diagnostic accuracy of BC metastasis in biopsy specimens.
Title: Detection of presence or absence of metastasis in WSI patches of breast cancer using the dual-enhanced convolutional ensemble neural network. Authors: Ruigang Ge, Guoyue Chen, Kazuki Saruta, Yuki Terata. Pub Date: 2024-07-23 DOI: 10.1016/j.mlwa.2024.100579. Journal: Machine learning with applications, vol. 17, Article 100579. PDF: https://www.sciencedirect.com/science/article/pii/S2666827024000550/pdfft?md5=5deed5e31245e0c9768d99b90207c738&pid=1-s2.0-S2666827024000550-main.pdf
Pub Date: 2024-07-22 DOI: 10.1016/j.mlwa.2024.100577
Tahsin Reasat, Asif Sushmit, David S. Smith
Deep learning (DL) based diagnostic systems can provide accurate and robust quantitative analysis in digital pathology. These algorithms require large amounts of annotated training data, which is impractical to obtain in pathology due to the high resolution of histopathological images. Hence, self-supervised methods have been proposed to learn features using ad-hoc pretext tasks. The self-supervised training process uses a large unlabeled dataset, which makes the learning process time-consuming. In this work, we propose a new method for actively sampling informative members from the training set using a small proxy network, decreasing the sample requirement by 93% and training time by 62% while maintaining the same performance as the traditional self-supervised learning method. The code is available on GitHub.
Title: Data efficient contrastive learning in histopathology using active sampling. Journal: Machine learning with applications, vol. 17, Article 100577. PDF: https://www.sciencedirect.com/science/article/pii/S2666827024000537/pdfft?md5=3a2df5b10799802c10eef16a87edcda2&pid=1-s2.0-S2666827024000537-main.pdf
Pub Date: 2024-07-19 DOI: 10.1016/j.mlwa.2024.100578
Willem Roux Moore, Jan H. van Vuuren
By offering clients attractive credit terms on sales, a company may increase its turnover, but granting credit also incurs the cost of money tied up in accounts receivable (AR), increased administration and a heightened probability of incurring bad debt. The management of credit sales, although eminently important to any business, is often performed manually, which may be time-consuming, expensive and inaccurate. Such an administrative workload becomes increasingly cumbersome as the number of credit sales increases. As a result, a new approach towards proactively identifying invoices from AR accounts that are likely to be paid late, or not at all, has recently been proposed in the literature, with the aim of employing intervention strategies more effectively. Several computational techniques from the credit scoring literature and particularly techniques from the realms of survival analysis or machine learning have been embedded in the aforementioned approach. This body of work is, however, lacking due to the limited guidance provided during the data preparation phase of the model development process and because survival analytic and machine learning techniques have not yet been ensembled. In this paper, we propose a generic framework for modelling invoice payment predictions with the aim of facilitating the process of preparing transaction data for analysis, generating relevant features from past customer behaviours, and selecting and ensembling suitable models for predicting the time to payment associated with invoices. We also introduce a new sequential ensembling approach, called the Survival Boost algorithm. The rationale behind this method is that features generated by a survival analytic model can enhance the efficacy of a machine learning classification algorithm.
Title: A framework for modelling customer invoice payment predictions. Journal: Machine learning with applications, vol. 17, Article 100578. PDF: https://www.sciencedirect.com/science/article/pii/S2666827024000549/pdfft?md5=bf6f5de7c56807d70f8402b9b60468a6&pid=1-s2.0-S2666827024000549-main.pdf
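The abstract above does not specify how the Survival Boost algorithm generates its survival features. As one plausible illustration of the general idea, the sketch below fits a Kaplan-Meier estimator to invoice payment times and reads off the estimated probability that an invoice is still unpaid at a given day, a quantity that could be passed to a classifier as an extra feature. The Kaplan-Meier estimator is a standard survival-analysis tool chosen here for illustration; the data values and function names are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve as a list of (time, S(time)) pairs.

    durations: days until payment, or until the invoice was last seen unpaid.
    observed:  True if payment was observed, False if the record is censored.
    """
    data = sorted(zip(durations, observed))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Group all records that share the same time t.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if events:
            surv *= 1.0 - events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def survival_at(curve, t):
    """Estimated probability that an invoice is still unpaid at day t."""
    s = 1.0
    for time, value in curve:
        if time <= t:
            s = value
        else:
            break
    return s


# Hypothetical payment history for one customer segment.
durations = [10, 20, 20, 30, 45]
observed = [True, True, False, True, False]

curve = kaplan_meier(durations, observed)
unpaid_at_25 = survival_at(curve, 25)  # candidate feature for a classifier
print(curve, unpaid_at_25)
```

Censored records (invoices not yet paid when the data were extracted) still count towards the number at risk until their last observed day, which is what distinguishes this estimate from a simple fraction of paid invoices.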
Pub Date : 2024-07-14 DOI: 10.1016/j.mlwa.2024.100575
Amit Kumar Sah, Muhammad Abulaish
This paper presents DeepCKID, a Multi-Head Attention (MHA)-based deep learning model that exploits statistical and semantic knowledge corresponding to documents across different classes in the datasets to improve the model’s ability to detect minority class instances in imbalanced text classification. For each document, DeepCKID extracts (i) word-level statistical and semantic knowledge, namely the class correlation and class similarity of each word, based on its association with the different classes in the dataset, and (ii) class-level knowledge from the document, using n-grams and relation triplets corresponding to the classwise keywords present, which are identified using cosine similarity with Transformers-based Pre-trained Language Models (PLMs). DeepCKID encodes the word-level and class-level features using deep convolutional networks, which can learn meaningful patterns from them. DeepCKID first combines the semantically meaningful Sentence-BERT document embeddings with the word-level feature matrix to give the final document representation, which it then fuses with the different classwise encoded representations to strengthen feature propagation. DeepCKID then passes the encoded document representation and its different classwise representations through an MHA layer to identify the important features at different positions of the feature subspaces, resulting in a latent dense vector accentuating its association with a particular class. Finally, DeepCKID passes the latent vector to the softmax layer to learn the corresponding class label. We evaluate DeepCKID over six publicly available Amazon reviews datasets using four Transformers-based PLMs. We compare DeepCKID with three approaches and four ablation-like baselines. Our study suggests that in most cases, DeepCKID outperforms all the comparison approaches, including baselines.
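The keyword-identification step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each word and each class is already represented by an embedding vector, and that a class's keywords are the words whose vectors have the highest cosine similarity to the class vector. The function names, the `top_k` parameter, and the toy two-dimensional vectors are hypothetical stand-ins for PLM embeddings; the abstract does not specify how DeepCKID performs this step in detail.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classwise_keywords(word_vectors, class_vector, top_k=2):
    """Return the top_k words most similar to the class vector (hypothetical helper)."""
    ranked = sorted(
        word_vectors,
        key=lambda word: cosine_similarity(word_vectors[word], class_vector),
        reverse=True,
    )
    return ranked[:top_k]


# Toy 2-D vectors, invented for illustration only; real PLM embeddings
# have hundreds of dimensions.
toy_vectors = {
    "refund": [0.9, 0.1],
    "great": [0.1, 0.9],
    "broken": [0.8, 0.3],
    "love": [0.2, 0.8],
}
complaint_class = [1.0, 0.0]  # hypothetical vector for a minority "complaint" class

print(classwise_keywords(toy_vectors, complaint_class))  # ['refund', 'broken']
```

Because cosine similarity ignores vector length, a word is ranked by the direction of its embedding relative to the class vector and not by its magnitude.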
{"title":"DeepCKID: A Multi-Head Attention-Based Deep Neural Network Model Leveraging Classwise Knowledge to Handle Imbalanced Textual Data","authors":"Amit Kumar Sah, Muhammad Abulaish","doi":"10.1016/j.mlwa.2024.100575","journal":{"name":"Machine learning with applications","volume":"17","pages":"Article 100575"},"publicationDate":"2024-07-14","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666827024000513/pdfft?md5=8efb9f85f258bdd00899e0b78ef5e189&pid=1-s2.0-S2666827024000513-main.pdf"}
Scooters have gained widespread popularity in recent years due to their accessibility and affordability, but safety concerns persist due to the vulnerability of riders. Researchers are actively investigating the safety implications associated with scooters, given their relatively new status as transportation options. However, analyzing scooter safety presents a unique challenge due to the complexity of determining safe riding environments. This study presents a comprehensive analysis of scooter crash risk within various buffer zones, utilizing the Extreme Gradient Boosting (XGBoost) machine learning algorithm. The core objective was to unravel the multifaceted factors influencing scooter crashes and to assess the predictive model’s performance across different buffers, that is, different spatial proximities to crash sites. After evaluating the model’s accuracy, sensitivity, and specificity across buffer distances ranging from 5 ft to 250 ft with the scooter crash as a reference point, a discernible trend emerged: as the buffer distance decreases, the model’s sensitivity increases, although at the expense of accuracy and specificity, which exhibit a gradual decline. Notably, at the widest buffer of 250 ft, the model achieved a high accuracy of 97% and specificity of 99%, but with a lower sensitivity of 31%. In contrast, at the closest buffer of 5 ft, sensitivity peaked at 95%, albeit with slightly reduced accuracy and specificity. Feature importance analysis highlighted the most significant predictor across all buffer distances, emphasizing the impact of vehicle interactions on scooter crash likelihood. Explainable Artificial Intelligence through SHAP value analysis provided deeper insights into each feature’s contribution to the predictive model, revealing that passenger vehicle types significantly escalated crash risks. 
Intriguingly, specific vehicular maneuvers, notably stopping in traffic lanes, alongside the absence of Traffic Control Devices (TCDs), were identified as the major contributors to increased crash occurrences. Road conditions, particularly wet and dry, also emerged as substantial risk factors. Furthermore, the study highlights the significance of road design, where elements like junction types and horizontal alignments – specifically 4- and 5-legged intersections and curves – are closely associated with heightened crash risks. These findings articulate a complex and spatially detailed framework of factors impacting scooter crashes, offering vital insights for urban planning and policymaking.
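The trade-off between accuracy, sensitivity, and specificity reported above follows from how the three metrics are computed from a binary confusion matrix. The sketch below uses the standard definitions; the function name and the counts are hypothetical and are not taken from the study's data. They only show how a model can be highly accurate on an imbalanced dataset while still missing most crashes.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (recall), and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of actual crash cases that the model flags.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of actual non-crash cases that the model clears.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical counts for an imbalanced test set: 100 crash cases and
# 1,900 non-crash cases. These are NOT the study's numbers.
metrics = classification_metrics(tp=30, fp=20, tn=1880, fn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts, accuracy is 0.955 and specificity about 0.989, yet sensitivity is only 0.300, because the many correctly classified non-crash cases dominate the accuracy figure.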
{"title":"Spatial instability of crash prediction models: A case of scooter crashes","authors":"Tumlumbe Juliana Chengula, Boniphace Kutela, Norris Novat, Hellen Shita, Abdallah Kinero, Reuben Tamakloe, Sarah Kasomi","doi":"10.1016/j.mlwa.2024.100574","journal":{"name":"Machine learning with applications","volume":"17","pages":"Article 100574"},"publicationDate":"2024-07-14","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666827024000501/pdfft?md5=c3afb02a60606c22b0434ac053d3571a&pid=1-s2.0-S2666827024000501-main.pdf"}