Pub Date: 2024-07-14, DOI: 10.1016/j.mlwa.2024.100575
Amit Kumar Sah, Muhammad Abulaish
This paper presents DeepCKID, a Multi-Head Attention (MHA)-based deep learning model that exploits statistical and semantic knowledge corresponding to documents across different classes in the datasets to improve the model's ability to detect minority class instances in imbalanced text classification. In this process, corresponding to each document, DeepCKID extracts (i) word-level statistical and semantic knowledge, namely class correlation and class similarity corresponding to each word, based on its association with different classes in the dataset, and (ii) class-level knowledge from the document using n-grams and relation triplets corresponding to the classwise keywords present, identified using cosine similarity utilizing Transformers-based Pre-trained Language Models (PLMs). DeepCKID encodes the word-level and class-level features using deep convolutional networks, which can learn meaningful patterns from them. First, DeepCKID combines the semantically meaningful Sentence-BERT document embeddings and the word-level feature matrix to give the final document representation, which it further fuses with the different classwise encoded representations to strengthen feature propagation. DeepCKID then passes the encoded document representation and its different classwise representations through an MHA layer to identify the important features at different positions of the feature subspaces, resulting in a latent dense vector accentuating its association with a particular class. Finally, DeepCKID passes the latent vector to the softmax layer to learn the corresponding class label. We evaluate DeepCKID over six publicly available Amazon reviews datasets using four Transformers-based PLMs, and compare it with three existing approaches and four ablation-like baselines. Our study suggests that in most cases, DeepCKID outperforms all the comparison approaches, including the baselines.
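The classwise-keyword matching step can be illustrated with a minimal sketch: given word and class-keyword embeddings (which DeepCKID obtains from Transformers-based PLMs; the toy 3-d vectors, names, and threshold below are illustrative assumptions, not the paper's values), words are assigned to classes by cosine similarity.

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two 1-D vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_class_keywords(word_vecs, class_keyword_vecs, threshold=0.5):
    """For each word embedding, list the classes whose keyword
    embeddings it matches with cosine similarity >= threshold."""
    matches = {}
    for w, wv in word_vecs.items():
        matches[w] = [c for c, kv in class_keyword_vecs.items()
                      if cosine_sim(wv, kv) >= threshold]
    return matches

# Toy 3-d stand-ins for PLM embeddings.
words = {"refund": np.array([0.9, 0.1, 0.0]),
         "great": np.array([0.0, 1.0, 0.1])}
classes = {"negative": np.array([1.0, 0.0, 0.0]),
           "positive": np.array([0.0, 1.0, 0.0])}
print(match_class_keywords(words, classes))
```

In the actual model, the vectors would be contextual PLM embeddings and the matched keywords would feed the n-gram and relation-triplet extraction described above.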
Title: DeepCKID: A Multi-Head Attention-Based Deep Neural Network Model Leveraging Classwise Knowledge to Handle Imbalanced Textual Data. Machine Learning with Applications, vol. 17, Article 100575.
Scooters have gained widespread popularity in recent years due to their accessibility and affordability, but safety concerns persist due to the vulnerability of riders. Researchers are actively investigating the safety implications associated with scooters, given their relatively new status as transportation options. However, analyzing scooter safety presents a unique challenge due to the complexity of determining safe riding environments. This study presents a comprehensive analysis of scooter crash risk within various buffer zones, utilizing the Extreme Gradient Boosting (XGBoost) machine learning algorithm. The core objective was to unravel the multifaceted factors influencing scooter crashes and assess the predictive model's performance across different buffers, i.e., spatial proximities to crash sites. After evaluating the model's accuracy, sensitivity, and specificity across buffer distances ranging from 5 ft to 250 ft with the scooter crash as a reference point, a discernible trend emerged: as the buffer distance decreases, the model's sensitivity increases, although at the expense of accuracy and specificity, which exhibit a gradual decline. Notably, at the widest buffer of 250 ft, the model achieved a high accuracy of 97% and specificity of 99%, but with a lower sensitivity of 31%. By contrast, at the closest buffer of 5 ft, sensitivity peaked at 95%, albeit with slightly reduced accuracy and specificity. Feature importance analysis highlighted vehicle interactions as the most significant predictor across all buffer distances, emphasizing their impact on scooter crash likelihood. Explainable Artificial Intelligence through SHAP value analysis provided deeper insights into each feature's contribution to the predictive model, revealing that passenger vehicle types significantly escalate crash risks. Intriguingly, specific vehicular maneuvers, notably stopping in traffic lanes, alongside the absence of Traffic Control Devices (TCDs), were identified as major contributors to increased crash occurrences. Road conditions, both wet and dry, also emerged as substantial risk factors. Furthermore, the study highlights the significance of road design, where elements like junction types and horizontal alignments (specifically 4- and 5-legged intersections and curves) are closely associated with heightened crash risks. These findings articulate a complex and spatially detailed framework of factors impacting scooter crashes, offering vital insights for urban planning and policymaking.
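The buffer-wise comparison of accuracy, sensitivity, and specificity reduces to confusion-matrix arithmetic; a minimal sketch follows (the per-buffer labels below are made up, and in the actual study a fitted XGBoost model per buffer would supply the predictions).

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for a binary
    crash/no-crash prediction, the three numbers compared
    across buffer distances in the study."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity

# Hypothetical labels/predictions for two buffer distances (ft).
buffers = {250: ([1, 1, 1, 0, 0, 0], [0, 0, 1, 0, 0, 0]),
           5:   ([1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 0, 0])}
for dist, (yt, yp) in buffers.items():
    acc, sens, spec = binary_metrics(yt, yp)
    print(dist, acc, sens, spec)
```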
Title: Spatial instability of crash prediction models: A case of scooter crashes. Authors: Tumlumbe Juliana Chengula, Boniphace Kutela, Norris Novat, Hellen Shita, Abdallah Kinero, Reuben Tamakloe, Sarah Kasomi. Machine Learning with Applications, vol. 17, Article 100574. Pub Date: 2024-07-14, DOI: 10.1016/j.mlwa.2024.100574.
Pub Date: 2024-07-09, DOI: 10.1016/j.mlwa.2024.100571
Hiroshi Kage
Pattern detection is one of the essential technologies in computer vision. Solving pattern detection problems requires a vast amount of computational resources. To train a multilayer perceptron or a convolutional neural network, the gradient descent method is commonly used, and it consumes considerable computation. To reduce the amount of computation, we propose a two-dimensional pattern detection algorithm based on the Echo State Network (ESN). The training rule of an ESN is based on one-shot ridge regression, which enables us to avoid gradient descent. The ESN is a kind of recurrent neural network (RNN) that is often used to embed temporal signals inside the network but is rarely used for embedding static patterns. In our prior work (Kage, 2023), we found that static patterns can be embedded in an ESN by associating the training patterns with its stable states, or attractors. Using the same training procedure as in our prior work, we confirmed that each training patch image can be associated with the desired output vector. The resulting performance of a single ESN classifier is, however, relatively poor. To overcome this, we introduced an ensemble learning framework that combines multiple ESN weak classifiers. To evaluate the performance, we used the CMU-MIT frontal face images (CMU DB). We trained eleven ESN-based classifiers using six CMU DB training images and evaluated the performance on a CMU DB test image, succeeding in reducing false positives down to 0.0515%.
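The one-shot ridge-regression training that replaces gradient descent can be sketched in a few lines; the reservoir sizes, scaling constants, and toy patterns below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny echo state network: a fixed random reservoir whose readout is
# trained by one-shot ridge regression instead of gradient descent.
n_in, n_res = 4, 50
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius 0.9

def reservoir_state(u, steps=20):
    """Drive the reservoir with a static input pattern until it settles
    near an attractor, and return the settled state."""
    x = np.zeros(n_res)
    for _ in range(steps):
        x = np.tanh(W_in @ u + W @ x)
    return x

# One-shot ridge readout: W_out = Y S^T (S S^T + lam*I)^(-1)
patterns = rng.uniform(-1, 1, (10, n_in))            # toy training patterns
targets = np.eye(3)[rng.integers(0, 3, 10)]          # one-hot class labels
S = np.stack([reservoir_state(u) for u in patterns]).T  # states, (n_res, 10)
Y = targets.T
lam = 1e-4
W_out = Y @ S.T @ np.linalg.inv(S @ S.T + lam * np.eye(n_res))

preds = np.argmax(W_out @ S, axis=0)                 # training-set predictions
```

The readout weights are obtained in a single linear solve, which is the computational advantage over iterative gradient descent that the abstract describes.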
Title: An algorithm for two-dimensional pattern detection by combining Echo State Network-based weak classifiers. Machine Learning with Applications, vol. 17, Article 100571.
Pub Date: 2024-07-09, DOI: 10.1016/j.mlwa.2024.100572
Asif Ahmed Neloy, Maxime Turgeon
Unsupervised anomaly detection (UAD) is a diverse research area explored across various application domains. Over time, numerous anomaly detection techniques, including clustering-, generative-, and variational inference-based methods, have been developed to address specific drawbacks and advance the state of the art. Deep learning and generative models have recently played a significant role in identifying unique challenges and devising advanced approaches. Auto-encoders (AEs) represent one such powerful technique, combining generative and probabilistic variational modeling with deep architectures. An auto-encoder aims to learn the underlying data distribution in order to generate meaningful sample data. This concept of data generation and the adoption of generative modeling have spurred extensive research and many variations in auto-encoder design, particularly for unsupervised representation learning. This study systematically reviews 11 auto-encoder architectures, categorized into three groups, aiming to differentiate their reconstruction ability, sample generation, latent space visualization, and accuracy in classifying anomalous data using the Fashion-MNIST (FMNIST) and MNIST datasets. Additionally, we closely examined reproducibility under different training parameters: we conducted reproducibility experiments utilizing similar model setups and hyperparameters and generated comparative results to identify the scope of improvement for each auto-encoder. We conclude by analyzing the experimental results, identifying the efficiency and trade-offs among auto-encoders and providing valuable insights into their performance and applicability in unsupervised anomaly detection.
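Reconstruction-error scoring, the mechanism the reviewed auto-encoders share, can be sketched with a linear stand-in: a rank-k PCA projection behaves like a linear auto-encoder with tied weights (the data, dimensionality, and k below are illustrative assumptions, not from the study).

```python
import numpy as np

def fit_pca(X_train, k=2):
    """Fit the mean and top-k principal directions of the training data
    (a linear auto-encoder with tied weights is equivalent to PCA)."""
    mu = X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
    return mu, Vt[:k]

def anomaly_score(X, mu, Vk):
    """Distance between each sample and its reconstruction from the
    k-dimensional latent space; large scores flag anomalies."""
    Xc = np.atleast_2d(X) - mu
    recon = Xc @ Vk.T @ Vk
    return np.linalg.norm(Xc - recon, axis=1)

rng = np.random.default_rng(1)
# Normal data: tight cluster in 5-d; one far-away outlier to score.
X_train = rng.normal(0.0, 0.1, (200, 5)) + np.array([1, 2, 3, 4, 5])
mu, Vk = fit_pca(X_train, k=2)
normal_scores = anomaly_score(X_train, mu, Vk)
outlier_score = anomaly_score(np.array([[10, -10, 10, -10, 10]]), mu, Vk)
```

A deep auto-encoder replaces the linear projection with a learned nonlinear encoder/decoder, but the thresholding of reconstruction error is the same.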
Title: A comprehensive study of auto-encoders for anomaly detection: Efficiency and trade-offs. Machine Learning with Applications, vol. 17, Article 100572.
Pub Date: 2024-07-08, DOI: 10.1016/j.mlwa.2024.100573
Michael Schäfer, Ulrike Faltings, Björn Glaser
There has been remarkable progress in the fields of Deep Learning and Computer Vision, but there is a lack of freely available labeled data, especially when it comes to data for specific industrial applications. However, large volumes of structured, semi-structured, and unstructured data are generated in industrial environments, from which meaningful representations can be learned. The effort required for manual labeling is extremely high and can often only be carried out by domain experts. Self-supervised methods have proven their effectiveness in recent years in a wide variety of areas, such as natural language processing and computer vision, yet in contrast to supervised methods, self-supervised techniques are rarely used in real industrial applications. In this paper, we present a self-supervised contrastive learning approach that outperforms existing supervised approaches on the scrap dataset used in this work. We use different types of augmentations to extract the fine-grained structures that are typical of such images of intrinsically unordered items. This extracts a wider range of features and encodes more aspects of the input image. The approach makes it possible to learn characteristics from images that are common in industrial applications, such as quality control. In addition, we show that this self-supervised learning approach can be successfully applied to scene-like images for classification.
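A common training objective for such contrastive approaches is the NT-Xent (InfoNCE) loss over pairs of augmented views of the same image; the abstract does not spell out the authors' loss, so the following numpy sketch is a generic formulation, not their exact implementation.

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent contrastive loss over a batch of paired augmentations:
    z1[i] and z2[i] are embeddings of two augmented views of the same
    image, and every other embedding in the batch acts as a negative."""
    z = np.vstack([z1, z2])
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-normalize
    sim = z @ z.T / tau                                # cosine / temperature
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    n = len(z1)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logits = sim - sim.max(axis=1, keepdims=True)      # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()

# Matched views score a lower loss than mismatched ones.
loss_matched = nt_xent(np.eye(3), np.eye(3))
loss_shuffled = nt_xent(np.eye(3), np.roll(np.eye(3), 1, axis=0))
```

Minimizing this loss pulls the two augmented views of each image together while pushing apart the rest of the batch, which is how augmentations drive the fine-grained feature extraction described above.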
Title: CLRiuS: Contrastive Learning for intrinsically unordered Steel Scrap. Machine Learning with Applications, vol. 17, Article 100573.
Pub Date: 2024-07-01, DOI: 10.1016/j.mlwa.2024.100570
Charles Cao, Feiyi Wang, Lisa Lindley, Zejiang Wang
This paper presents an empirical study on the application of Large Language Model (LLM)-based AI agents for automating server management tasks in Linux environments. We aim to evaluate the effectiveness, efficiency, and adaptability of LLM-based AI agents in handling a wide range of server management tasks, and to identify the potential benefits and challenges of employing such agents in real-world scenarios. In our study, a GPT-based AI agent autonomously executes 150 unique tasks across 9 categories, ranging from file management and editing to program compilation. The agent operates in a Dockerized Linux sandbox, interpreting task descriptions and generating appropriate commands or scripts. Our findings reveal the agent's proficiency in executing tasks autonomously and adapting to feedback, demonstrating the potential of LLMs in simplifying complex server management for users with varying technical expertise. This study contributes to the understanding of LLM applications in server management scenarios and lays the foundation for future research in this domain.
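The observe-generate-execute loop of such an agent can be sketched as follows; `fake_llm` is a stub standing in for the GPT-4 call, its task table is invented for illustration, and a real deployment would run the command inside the Dockerized sandbox rather than the local shell.

```python
import subprocess

def fake_llm(task, history):
    """Stand-in for an LLM call: maps a task description to a shell
    command. A real agent would prompt GPT-4 with the task and the
    history of previous attempts and their errors."""
    table = {"print hello": "echo hello",
             "show kernel version": "uname -r"}
    return table.get(task, "echo unsupported")

def run_task(task, max_retries=2):
    """Ask the model for a command, run it in a shell, and feed any
    failure back to the model for another attempt."""
    history = []
    for _ in range(max_retries + 1):
        cmd = fake_llm(task, history)
        result = subprocess.run(cmd, shell=True, capture_output=True,
                                text=True, timeout=10)
        history.append((cmd, result.returncode, result.stderr))
        if result.returncode == 0:
            return cmd, result.stdout.strip()
    return cmd, None

print(run_task("print hello"))
```

The retry loop with accumulated error history is the "adapting to feedback" behavior the findings describe.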
Title: Managing Linux servers with LLM-based AI agents: An empirical evaluation with GPT4. Machine Learning with Applications, vol. 17, Article 100570.
Intense tropical cyclones (TCs) cause significant damage to human societies. Forecasting the multiple stages of TC intensity change is crucial yet challenging. The difficulty arises from imbalanced data distribution and the need for ordinal multi-class classification. While existing classification methods, such as linear discriminant analysis, have been utilized to predict rare rapidly intensifying (RI) stages based on features related to TC intensity changes, they are limited to binary classification distinguishing between RI and non-RI stages. In this paper, we introduce a novel methodology to tackle the challenges of imbalanced ordinal multi-class classification. We extend the Area Under the Curve maximization technique with inter-instance/class cross-hinge losses and inter-class distance-based slack variables. The proposed loss function, implemented within a deep learning framework, demonstrates its effectiveness on real sequence data of multi-stage TC intensity changes, including satellite infrared images and environmental variables observed in the western North Pacific.
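A pairwise, distance-weighted hinge surrogate for multi-class AUC can be illustrated in a simplified form; this sketch weights each misranked pair by the ordinal class distance and is a stand-in for, not a reproduction of, the paper's exact loss.

```python
def pairwise_auc_hinge(scores, labels, margin=1.0):
    """Pairwise hinge surrogate for multi-class AUC maximization:
    for every pair of instances whose ordinal labels differ, the
    higher-stage instance should be scored above the lower-stage one,
    weighted by the class distance (a simplified stand-in for the
    inter-class distance-based slack variables)."""
    loss, pairs = 0.0, 0
    for i in range(len(labels)):
        for j in range(len(labels)):
            d = labels[i] - labels[j]
            if d > 0:  # instance i is from a higher intensity stage
                loss += d * max(0.0, margin - (scores[i] - scores[j]))
                pairs += 1
    return loss / pairs if pairs else 0.0

# Correctly ordered scores incur zero loss; reversed scores are penalized.
ordered_loss = pairwise_auc_hinge([0.0, 2.0, 4.0], [0, 1, 2])
reversed_loss = pairwise_auc_hinge([4.0, 2.0, 0.0], [0, 1, 2])
```

Because every cross-class pair contributes, minority stages such as RI influence the loss through all of their pairings with majority-stage instances, which is what makes AUC-style objectives attractive for imbalanced data.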
Title: Multi-class AUC maximization for imbalanced ordinal multi-stage tropical cyclone intensity change forecast. Authors: Hirotaka Hachiya, Hiroki Yoshida, Udai Shimada, Naonori Ueda. Machine Learning with Applications, vol. 17, Article 100569. Pub Date: 2024-07-01, DOI: 10.1016/j.mlwa.2024.100569.
Pub Date: 2024-06-26, DOI: 10.1016/j.mlwa.2024.100568
Mirza Raquib, Mohammad Amzad Hossain, Md Khairul Islam, Md Sipon Miah
Automated character recognition is currently highly popular due to its wide range of applications. Bengali handwritten character recognition (BHCR) is an extremely difficult problem because of the nature of the script. Very few handwritten character recognition (HCR) models are capable of accurately classifying all the different sorts of Bangla characters. Recently, image recognition, video analytics, and natural language processing have all found great success using convolutional neural networks (CNNs), due to their ability to extract and classify features in novel ways. In this paper, we introduce the VashaNet model for recognizing Bangla handwritten basic characters. The suggested VashaNet model employs a 26-layer deep convolutional neural network (DCNN) architecture consisting of nine convolutional layers, six max pooling layers, two dropout layers, five batch normalization layers, one flattening layer, two dense layers, and one output layer. The experiment was performed on two datasets, a primary dataset of 5750 images and CMATERdb 3.1.2, for the purpose of training and evaluating the model. The suggested character recognition model performed very well, with test accuracy rates of 94.60% on the primary dataset and 94.43% on the CMATERdb 3.1.2 dataset. These remarkable outcomes demonstrate that the proposed VashaNet outperforms other existing methods and offers improved suitability for different character recognition tasks. The proposed approach is thus a viable candidate for a highly efficient, practical automatic BHCR system.
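How a stack of convolution and max-pooling layers like the one above shrinks the feature map reaching the dense layers can be checked with simple shape arithmetic; the input resolution, kernel sizes, strides, and layer order below are hypothetical, as the abstract does not give them.

```python
def conv2d_out(size, kernel=3, stride=1, padding=0):
    """Output height/width of a convolutional layer on a square input."""
    return (size + 2 * padding - kernel) // stride + 1

def maxpool_out(size, kernel=2, stride=2):
    """Output height/width of a max-pooling layer."""
    return (size - kernel) // stride + 1

size = 64  # hypothetical input resolution
for layer in ["conv", "conv", "pool", "conv", "pool"]:
    size = conv2d_out(size) if layer == "conv" else maxpool_out(size)
print(size)  # side length of the feature map reaching the dense layers
```

This kind of bookkeeping determines the input dimension of the flattening layer and hence the parameter count of the first dense layer.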
{"title":"VashaNet: An automated system for recognizing handwritten Bangla basic characters using deep convolutional neural network","authors":"Mirza Raquib , Mohammad Amzad Hossain , Md Khairul Islam , Md Sipon Miah","doi":"10.1016/j.mlwa.2024.100568","DOIUrl":"https://doi.org/10.1016/j.mlwa.2024.100568","url":null,"abstract":"<div><p>Automated character recognition is currently highly popular due to its wide range of applications. Bengali handwritten character recognition (BHCR) is an extremely difficult issue because of the nature of the script. Very few handwritten character recognition (HCR) models are capable of accurately classifying all different sorts of Bangla characters. Recently, image recognition, video analytics, and natural language processing have all found great success using convolutional neural network (CNN) due to its ability to extract and classify features in novel ways. In this paper, we introduce a VashaNet model for recognizing Bangla handwritten basic characters. The suggested VashaNet model employs a 26-layer deep convolutional neural network (DCNN) architecture consisting of nine convolutional layers, six max pooling layers, two dropout layers, five batch normalization layers, one flattening layer, two dense layers, and one output layer. The experiment was performed over 2 datasets consisting of a primary dataset of 5750 images, CMATERdb 3.1.2 for the purpose of training and evaluating the model. The suggested character recognition model worked very well, with test accuracy rates of 94.60% for the primary dataset, 94.43% for CMATERdb 3.1.2 dataset. These remarkable outcomes demonstrate that the proposed VashaNet outperforms other existing methods and offers improved suitability in different character recognition tasks. The proposed approach is a viable candidate for the high efficient practical automatic BHCR system. 
The proposed approach is a more powerful candidate for the development of an automatic BHCR system for use in practical settings.</p></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"17 ","pages":"Article 100568"},"PeriodicalIF":0.0,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666827024000446/pdfft?md5=5c72d6c025c7e6abd41097207c352a7c&pid=1-s2.0-S2666827024000446-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141487633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
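The abstract above enumerates VashaNet's 26 layers by type. The per-type counts below come from that text, but the ordering of the blocks is purely a hypothetical sketch (the paper does not give the sequence here); it can still be written down and sanity-checked in plain Python:

```python
from collections import Counter

def vashanet_layer_spec():
    """One hypothetical ordering of the 26 layers listed in the abstract:
    9 conv, 6 max-pool, 5 batch-norm, 2 dropout, 1 flatten, 2 dense, 1 output.
    Only the counts are from the paper; the block structure is an assumption."""
    spec = []
    for _ in range(4):  # four assumed conv blocks
        spec += ["conv", "conv", "batchnorm", "maxpool"]
    # a final conv block with extra pooling to reach the stated counts
    spec += ["conv", "batchnorm", "maxpool", "maxpool"]
    # classifier head: dropout, flatten, two dense layers, output
    spec += ["dropout", "flatten", "dense", "dropout", "dense", "output"]
    return spec

if __name__ == "__main__":
    spec = vashanet_layer_spec()
    print(len(spec), Counter(spec))
```

Summing the stated counts (9 + 6 + 5 + 2 + 1 + 2 + 1) does confirm the 26-layer total claimed in the abstract.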
Pub Date : 2024-06-24DOI: 10.1016/j.mlwa.2024.100566
Jessica Hauschild , Kent Eskridge
Natural language processing offers multiple methods for translating written text or spoken words into numerical representations called word embeddings. Some of these embedding methods, such as Bag of Words, assume words are independent of one another. Other embedding methods, such as Bidirectional Encoder Representations from Transformers (BERT) and Word2Vec, capture the relationships between words in various ways. In this paper, we compare methods that treat words as independent with methods that capture the relationships between words by examining the effect each has on the classification of fake news. Using various classification methods, we compare the word embedding processes based on their effects on accuracy, precision, sensitivity, and specificity.
{"title":"Word embedding and classification methods and their effects on fake news detection","authors":"Jessica Hauschild , Kent Eskridge","doi":"10.1016/j.mlwa.2024.100566","DOIUrl":"https://doi.org/10.1016/j.mlwa.2024.100566","url":null,"abstract":"<div><p>Natural language processing contains multiple methods of translating written text or spoken words into numerical information called word embeddings. Some of these embedding methods, such as Bag of Words, assume words are independent of one another. Other embedding methods, such as Bidirectional Encoder Representations from Transformers and Word2Vec, capture the relationship between words in various ways. In this paper, we are interested in comparing methods treating words as independent and methods capturing the relationship between words by looking at the effect these methods have on the classification of fake news. Using various classification methods, we compare the word embedding processes based on their effects on accuracy, precision, sensitivity, and specificity.</p></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"17 ","pages":"Article 100566"},"PeriodicalIF":0.0,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666827024000422/pdfft?md5=ca2f2864023899f08c1f4e9adba5d1ef&pid=1-s2.0-S2666827024000422-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141487632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
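The independence assumption this abstract attributes to Bag of Words can be made concrete with a minimal pure-Python vectorizer (a generic illustration, not code from the paper): each word is counted on its own, so word order and context contribute nothing to the representation.

```python
def bag_of_words(docs):
    """Build a shared vocabulary and a count vector per document.
    Each word is counted independently of its neighbours, which is the
    independence assumption contrasted with Word2Vec/BERT embeddings."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0] * len(vocab)
        for w in d.lower().split():
            v[index[w]] += 1
        vectors.append(v)
    return vocab, vectors

if __name__ == "__main__":
    vocab, vecs = bag_of_words(["fake news spreads fast", "real news spreads"])
    print(vocab, vecs)
```

Note that reordering the words of either document leaves its vector unchanged, which is exactly the information a contextual embedding such as BERT would preserve.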
The study addresses customer churn, a major issue in service-oriented sectors like telecommunications, where it refers to the discontinuation of subscriptions. The research emphasizes the importance of recognizing customer satisfaction for retaining clients, focusing specifically on early churn prediction as a key strategy. Previous approaches mainly used generalized classification techniques for churn prediction but often neglected the aspect of interpretability, vital for decision-making. This study introduces explainer models to address this gap, providing both local and global explanations of churn predictions. Various classification models, including the standout Gradient Boosting Machine (GBM), were used alongside visualization techniques like Shapley Additive Explanations plots and scatter plots for enhanced interpretability. The GBM model demonstrated superior performance with an 81% accuracy rate. A Wilcoxon signed-rank test confirmed GBM’s effectiveness over other models, with the p-value indicating significant performance differences.
{"title":"Explaining customer churn prediction in telecom industry using tabular machine learning models","authors":"Sumana Sharma Poudel , Suresh Pokharel , Mohan Timilsina","doi":"10.1016/j.mlwa.2024.100567","DOIUrl":"https://doi.org/10.1016/j.mlwa.2024.100567","url":null,"abstract":"<div><p>The study addresses customer churn, a major issue in service-oriented sectors like telecommunications, where it refers to the discontinuation of subscriptions. The research emphasizes the importance of recognizing customer satisfaction for retaining clients, focusing specifically on early churn prediction as a key strategy. Previous approaches mainly used generalized classification techniques for churn prediction but often neglected the aspect of interpretability, vital for decision-making. This study introduces explainer models to address this gap, providing both local and global explanations of churn predictions. Various classification models, including the standout Gradient Boosting Machine (GBM), were used alongside visualization techniques like Shapley Additive Explanations plots and scatter plots for enhanced interpretability. The GBM model demonstrated superior performance with an 81% accuracy rate. A Wilcoxon signed rank test confirmed GBM’s effectiveness over other models, with the <span><math><mi>p</mi></math></span>-value indicating significant performance differences. 
The study concludes that GBM is notably better for churn prediction, and the employed visualization techniques effectively elucidate key churn factors in the telecommunications sector.</p></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"17 ","pages":"Article 100567"},"PeriodicalIF":0.0,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666827024000434/pdfft?md5=18da470f5a20f71eeb29e96078ff9ca6&pid=1-s2.0-S2666827024000434-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141487634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
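The abstract reports a Wilcoxon signed-rank test for comparing GBM against the other models. A minimal pure-Python sketch of that test (two-sided, normal approximation, zero differences dropped) is shown below; it is a generic textbook implementation, not the authors' exact procedure, and the normal approximation is only reasonable for roughly 20 or more paired samples.

```python
import math

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test via the normal approximation.
    Drops zero differences and uses average ranks for tied |differences|.
    Returns (W+, approximate two-sided p-value)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    abs_sorted = sorted(abs(d) for d in diffs)

    def avg_rank(v):
        lo = abs_sorted.index(v) + 1          # first rank of this value
        hi = lo + abs_sorted.count(v) - 1     # last rank of this value
        return (lo + hi) / 2                  # average rank for ties

    w_plus = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return w_plus, p

if __name__ == "__main__":
    # Hypothetical paired scores where x beats y on every sample.
    x = [v + 1 for v in range(1, 21)]
    y = list(range(1, 21))
    print(wilcoxon_signed_rank(x, y))
```

In practice one would use `scipy.stats.wilcoxon`, which also provides exact p-values for small samples; the sketch is only meant to show what the reported p-value measures.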