Machine learning with applications: latest articles, page 7

An algorithm for two-dimensional pattern detection by combining Echo State Network-based weak classifiers

Machine learning with applications

Pub Date : 2024-07-09 DOI: 10.1016/j.mlwa.2024.100571

Hiroshi Kage

Pattern detection is one of the essential technologies in computer vision. Solving pattern detection problems requires a vast amount of computational resources. Multilayer perceptrons and convolutional neural networks are commonly trained with gradient descent, which is computationally expensive. To reduce the amount of computation, we propose a two-dimensional pattern detection algorithm based on the Echo State Network (ESN). The training rule of an ESN is based on one-shot ridge regression, which enables us to avoid gradient descent. An ESN is a kind of recurrent neural network (RNN) that is often used to embed temporal signals inside the network but rarely used to embed static patterns. In our prior work (Kage, 2023), we found that static patterns can be embedded in an ESN by associating the training patterns with its stable states, or attractors. Using the same training procedure as in our prior work, we confirmed that each training patch image can be associated with the desired output vector. The resulting performance of a single ESN classifier is, however, relatively poor. To overcome this, we introduced an ensemble learning framework that combines multiple ESN weak classifiers. To evaluate the performance, we used CMU-MIT frontal face images (CMU DB). We trained eleven ESN-based classifiers on six CMU DB training images and evaluated them on a CMU DB test image. We succeeded in reducing false positives in the CMU DB test image to 0.0515%.
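
The one-shot ridge regression readout mentioned in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the reservoir size, weight ranges, ridge value, toy "patches", and all function names (`settle`, `train_readout`, `readout`) are assumptions made for illustration, and the ensemble step is not shown. The static input is held fixed while the reservoir state is iterated, and the readout weights are then obtained in closed form without gradient descent.

```python
import math
import random

def solve(a, b):
    """Solve a @ x = b (b is a matrix) by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [ra[:] + rb[:] for ra, rb in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]

def settle(w_in, w_res, u, steps=30):
    """Hold the static input u fixed and iterate x <- tanh(W_in u + W_res x)."""
    drive = [sum(w * ui for w, ui in zip(row, u)) for row in w_in]
    x = [0.0] * len(w_res)
    for _ in range(steps):
        x = [math.tanh(d + sum(w * xj for w, xj in zip(row, x)))
             for d, row in zip(drive, w_res)]
    return x

def train_readout(states, targets, ridge=1e-3):
    """One-shot ridge regression: W_out = (S^T S + ridge * I)^(-1) S^T T."""
    n, k = len(states[0]), len(targets[0])
    gram = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    rhs = [[sum(s[i] * t[c] for s, t in zip(states, targets)) for c in range(k)]
           for i in range(n)]
    return solve(gram, rhs)

def readout(w_out, state):
    """Linear readout: one output value per class."""
    return [sum(x * row[c] for x, row in zip(state, w_out))
            for c in range(len(w_out[0]))]

# Hypothetical toy setup (sizes and ranges are assumptions, not from the paper).
random.seed(0)
n_in, n_res = 4, 30
w_in = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_res)]
w_res = [[random.uniform(-0.1, 0.1) for _ in range(n_res)] for _ in range(n_res)]

patches = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # stand-ins for patch images
targets = [[1, 0], [1, 0], [0, 1], [0, 1]]                          # desired output vectors

states = [settle(w_in, w_res, p) for p in patches]
w_out = train_readout(states, targets)
predicted = [max(range(2), key=lambda c: readout(w_out, s)[c]) for s in states]
```

Only the readout matrix `w_out` is trained; the input and reservoir weights stay at their random initial values, which is why a single linear solve suffices.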

Volume 17, Article 100571

Citations: 0

A comprehensive study of auto-encoders for anomaly detection: Efficiency and trade-offs

Machine learning with applications

Pub Date : 2024-07-09 DOI: 10.1016/j.mlwa.2024.100572

Asif Ahmed Neloy , Maxime Turgeon

Unsupervised anomaly detection (UAD) is a diverse research area explored across various application domains. Over time, numerous anomaly detection techniques, including clustering, generative, and variational inference-based methods, have been developed to address specific drawbacks and advance state-of-the-art techniques. Deep learning and generative models have recently played a significant role in identifying unique challenges and devising advanced approaches. Auto-encoders (AEs) represent one such powerful technique that combines generative and probabilistic variational modeling with deep architecture. An auto-encoder aims to learn the underlying data distribution in order to generate consequential sample data. This concept of data generation and the adoption of generative modeling have prompted extensive research on, and many variations of, auto-encoder design, particularly in unsupervised representation learning. This study systematically reviews 11 auto-encoder architectures categorized into three groups, aiming to differentiate their reconstruction ability, sample generation, latent space visualization, and accuracy in classifying anomalous data using the Fashion-MNIST (FMNIST) and MNIST datasets. Additionally, we closely observed the scope for reproducibility under different training parameters. We conducted reproducibility experiments using similar model setups and hyperparameters and attempted to generate comparative results to identify the scope for improvement of each auto-encoder. We conclude this study by analyzing the experimental results, which guide us in identifying the efficiency and trade-offs among auto-encoders and provide valuable insights into their performance and applicability in unsupervised anomaly detection.

Volume 17, Article 100572

Citations: 0

CLRiuS: Contrastive Learning for intrinsically unordered Steel Scrap

Machine learning with applications

Pub Date : 2024-07-08 DOI: 10.1016/j.mlwa.2024.100573

Michael Schäfer , Ulrike Faltings , Björn Glaser

There has been remarkable progress in the field of Deep Learning and Computer Vision, but there is a lack of freely available labeled data, especially for specific industrial applications. However, large volumes of structured, semi-structured and unstructured data are generated in industrial environments, from which meaningful representations can be learned. The effort required for manual labeling is extremely high, and labeling can often only be carried out by domain experts. Self-supervised methods have proven their effectiveness in recent years in a wide variety of areas such as natural language processing and computer vision. In contrast to supervised methods, self-supervised techniques are rarely used in real industrial applications. In this paper, we present a self-supervised contrastive learning approach that outperforms existing supervised approaches on the scrap dataset used. We use different types of augmentations to extract the fine-grained structures that are typical of this type of image of intrinsically unordered items. This extracts a wider range of features and encodes more aspects of the input image. This approach makes it possible to learn characteristics from images that are common in industrial applications, such as quality control. In addition, we show that this self-supervised learning approach can be successfully applied to scene-like images for classification.
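
To make the idea of contrastive learning over augmented views concrete, here is a minimal sketch of InfoNCE, one widely used contrastive objective. The abstract does not say which loss CLRiuS uses, so the choice of InfoNCE, the temperature, and the two-dimensional embeddings below are all assumptions for illustration. The loss is low when an anchor embedding is closer to its augmented view than to embeddings of other images.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE for one anchor: negative log-softmax of the positive pair's similarity."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum - logits[0]

# Hypothetical embeddings: view_a and view_b stand for two augmentations of one image.
view_a = [1.0, 0.1]
view_b = [0.9, 0.2]
others = [[-1.0, 0.3], [0.0, -1.0]]  # embeddings of different images

aligned = info_nce(view_a, view_b, others)
misaligned = info_nce(view_a, others[0], [view_b, others[1]])
```

Treating the true augmented view as the positive yields a much smaller loss than treating an unrelated image as the positive, which is the signal an encoder would be trained on.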

Volume 17, Article 100573

Citations: 0

Managing Linux servers with LLM-based AI agents: An empirical evaluation with GPT4

Machine learning with applications

Pub Date : 2024-07-01 DOI: 10.1016/j.mlwa.2024.100570

Charles Cao , Feiyi Wang , Lisa Lindley , Zejiang Wang

This paper presents an empirical study on the application of Large Language Model (LLM)-based AI agents for automating server management tasks in Linux environments. We aim to evaluate the effectiveness, efficiency, and adaptability of LLM-based AI agents in handling a wide range of server management tasks, and to identify the potential benefits and challenges of employing such agents in real-world scenarios. In the study, a GPT-based AI agent autonomously executes 150 unique tasks across 9 categories, ranging from file management to editing to program compilation. The agent operates in a Dockerized Linux sandbox, interpreting task descriptions and generating appropriate commands or scripts. Our findings reveal the agent’s proficiency in executing tasks autonomously and adapting to feedback, demonstrating the potential of LLMs in simplifying complex server management for users with varying technical expertise. This study contributes to the understanding of LLM applications in server management scenarios and lays the foundation for future research in this domain.

Volume 17, Article 100570

Citations: 0

Multi-class AUC maximization for imbalanced ordinal multi-stage tropical cyclone intensity change forecast

Machine learning with applications

Pub Date : 2024-07-01 DOI: 10.1016/j.mlwa.2024.100569

Hirotaka Hachiya , Hiroki Yoshida , Udai Shimada , Naonori Ueda

Intense tropical cyclones (TCs) cause significant damage to human societies. Forecasting the multiple stages of TC intensity changes is crucial yet challenging. This difficulty arises from imbalanced data distribution and the need for ordinal multi-class classification. While existing classification methods, such as linear discriminant analysis, have been utilized to predict rare rapidly intensifying (RI) stages based on features related to TC intensity changes, they are limited to binary classification distinguishing between RI and non-RI stages. In this paper, we introduce a novel methodology to tackle the challenges of imbalanced ordinal multi-class classification. We extend the Area Under the Curve maximization technique with inter-instance/class cross-hinge losses and inter-class distance-based slack variables. The proposed loss function, implemented within a deep learning framework, demonstrates its effectiveness on real sequence data of multi-stage TC intensity changes, including satellite infrared images and environmental variables observed in the western North Pacific.
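
As background for the AUC maximization technique that the paper extends, the following sketch shows the binary case only: the empirical AUC and a pairwise hinge surrogate that can be minimized in its place. It does not reproduce the proposed multi-class cross-hinge loss or the slack variables; the scores, the margin, and the function names are assumptions for illustration.

```python
def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))

def pairwise_hinge(pos_scores, neg_scores, margin=1.0):
    """Hinge surrogate: penalize every pair whose score gap is below the margin."""
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            total += max(0.0, margin - (p - n))
    return total / (len(pos_scores) * len(neg_scores))

# Hypothetical model scores: two rare-class instances against four common-class instances.
rare = [2.0, 0.5]
common = [-1.0, 0.0, 0.2, 1.0]

auc = empirical_auc(rare, common)     # 7 of 8 pairs are ordered correctly
loss = pairwise_hinge(rare, common)   # mean hinge penalty over the 8 pairs
```

Because both quantities average over pairs rather than over instances, a rare class contributes as much to the objective as a common one, which is why AUC-style objectives are attractive for imbalanced data. With a margin of 1, the hinge loss is an upper bound on the pairwise misranking rate, 1 - AUC.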

Volume 17, Article 100569

Citations: 0

VashaNet: An automated system for recognizing handwritten Bangla basic characters using deep convolutional neural network

Machine learning with applications

Pub Date : 2024-06-26 DOI: 10.1016/j.mlwa.2024.100568

Mirza Raquib , Mohammad Amzad Hossain , Md Khairul Islam , Md Sipon Miah

Automated character recognition is currently highly popular due to its wide range of applications. Bengali handwritten character recognition (BHCR) is an extremely difficult problem because of the nature of the script. Very few handwritten character recognition (HCR) models are capable of accurately classifying all the different sorts of Bangla characters. Recently, image recognition, video analytics, and natural language processing have all found great success using convolutional neural networks (CNNs) due to their ability to extract and classify features in novel ways. In this paper, we introduce the VashaNet model for recognizing Bangla handwritten basic characters. The proposed VashaNet model employs a 26-layer deep convolutional neural network (DCNN) architecture consisting of nine convolutional layers, six max pooling layers, two dropout layers, five batch normalization layers, one flattening layer, two dense layers, and one output layer. The experiments used two datasets for training and evaluating the model: a primary dataset of 5750 images and CMATERdb 3.1.2. The proposed character recognition model worked very well, with test accuracy rates of 94.60% on the primary dataset and 94.43% on the CMATERdb 3.1.2 dataset. These outcomes demonstrate that the proposed VashaNet outperforms other existing methods and offers improved suitability for different character recognition tasks. The proposed approach is a viable candidate for a highly efficient automatic BHCR system for use in practical settings.
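
As a quick consistency check on the architecture description, the layer counts listed in the abstract can be tallied. The dictionary keys simply restate the abstract; nothing about layer order or hyperparameters is assumed.

```python
# Layer counts exactly as listed in the abstract.
layers = {
    "convolutional": 9,
    "max pooling": 6,
    "dropout": 2,
    "batch normalization": 5,
    "flattening": 1,
    "dense": 2,
    "output": 1,
}

total_layers = sum(layers.values())  # matches the stated 26-layer architecture
```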

Volume 17, Article 100568

Citations: 0

Word embedding and classification methods and their effects on fake news detection

Machine learning with applications

Pub Date : 2024-06-24 DOI: 10.1016/j.mlwa.2024.100566

Jessica Hauschild , Kent Eskridge

Natural language processing contains multiple methods of translating written text or spoken words into numerical information called word embeddings. Some of these embedding methods, such as Bag of Words, assume words are independent of one another. Other embedding methods, such as Bidirectional Encoder Representations from Transformers and Word2Vec, capture the relationship between words in various ways. In this paper, we are interested in comparing methods treating words as independent and methods capturing the relationship between words by looking at the effect these methods have on the classification of fake news. Using various classification methods, we compare the word embedding processes based on their effects on accuracy, precision, sensitivity, and specificity.
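
The two ingredients named in the abstract can be sketched briefly: a Bag of Words representation, which ignores word order, and the four evaluation metrics computed from a binary confusion matrix. The example sentences, the labels (1 for fake, 0 for real), and the function names are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter

def bag_of_words(text):
    """Count word occurrences; word order and context are discarded."""
    return Counter(text.lower().split())

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity and specificity for labels in {0, 1}.

    Assumes every denominator is nonzero (both classes occur and are predicted).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

# Two sentences with different meanings receive identical Bag of Words vectors.
same_bag = bag_of_words("dog bites man") == bag_of_words("man bites dog")

# Hypothetical labels and predictions for ten articles.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

The `same_bag` comparison shows what the independence assumption costs: information carried by word order is lost, which is the gap that context-aware embeddings such as Word2Vec and BERT aim to close.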

Citations: 0

Explaining customer churn prediction in telecom industry using tabular machine learning models

Machine learning with applications

Pub Date : 2024-06-24 DOI: 10.1016/j.mlwa.2024.100567

Sumana Sharma Poudel , Suresh Pokharel , Mohan Timilsina

The study addresses customer churn, a major issue in service-oriented sectors like telecommunications, where it refers to the discontinuation of subscriptions. The research emphasizes the importance of recognizing customer satisfaction for retaining clients, focusing specifically on early churn prediction as a key strategy. Previous approaches mainly used generalized classification techniques for churn prediction but often neglected interpretability, which is vital for decision-making. This study introduces explainer models to address this gap, providing both local and global explanations of churn predictions. Various classification models, including the standout Gradient Boosting Machine (GBM), were used alongside visualization techniques such as Shapley Additive Explanations plots and scatter plots for enhanced interpretability. The GBM model demonstrated superior performance with an 81% accuracy rate. A Wilcoxon signed-rank test confirmed GBM’s effectiveness over other models, with the $p$-value indicating significant performance differences. The study concludes that GBM is notably better for churn prediction, and the employed visualization techniques effectively elucidate key churn factors in the telecommunications sector.
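
The Wilcoxon signed-rank comparison mentioned above can be sketched in plain Python for small samples. The per-fold accuracies below are hypothetical (only the 81% figure appears in the abstract), and the exact p-value is obtained by enumerating all sign assignments, which is practical only for a handful of paired observations. The function names are assumptions.

```python
from itertools import product

def signed_ranks(diffs):
    """Drop zero differences and rank the rest by absolute value (ties get average ranks)."""
    nonzero = [d for d in diffs if d != 0]
    order = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * len(nonzero)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return nonzero, ranks

def wilcoxon_exact(a, b):
    """Return (W+, exact two-sided p-value) for paired samples a and b."""
    nonzero, ranks = signed_ranks([x - y for x, y in zip(a, b)])
    w_plus = sum(r for d, r in zip(nonzero, ranks) if d > 0)
    total = sum(ranks)
    observed = min(w_plus, total - w_plus)
    extreme = 0
    # Under the null hypothesis every sign pattern is equally likely.
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for s, r in zip(signs, ranks) if s)
        if min(w, total - w) <= observed:
            extreme += 1
    return w_plus, extreme / 2 ** len(ranks)

# Hypothetical accuracies in percent over six folds (not taken from the paper).
gbm_acc = [81, 80, 83, 79, 82, 81]
other_acc = [78, 79, 79, 77, 77, 75]
w_plus, p_value = wilcoxon_exact(gbm_acc, other_acc)
```

In this toy data GBM wins on every fold, so the statistic takes its most extreme value and the exact two-sided p-value is 2/64. For larger samples, a statistics library with a normal approximation would be the practical choice.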

Volume 17, Article 100567

Citations: 0

Ensemble prediction of RRC session duration in real-world NR/LTE networks

Machine learning with applications

Pub Date : 2024-06-06 DOI: 10.1016/j.mlwa.2024.100564

Roopesh Kumar Polaganga , Qilian Liang

In the rapidly evolving realm of telecommunications, Machine Learning (ML) stands as a key driver for intelligent 6G networks, leveraging diverse datasets to optimize real-time network parameters. This transition extends seamlessly from 4G LTE and 5G NR to 6G, drawing on ML insights from existing networks, specifically in predicting RRC session durations. This work introduces a novel use of a weighted ensemble approach, built with the AutoGluon library, that employs multiple base models for accurate prediction of user session durations in real-world LTE and NR networks. Comparative analysis reveals superior accuracy in LTE, with 'Data Volume' as a crucial feature due to its direct impact on network load and user experience. Notably, NR sessions, marked by extended durations, reflect unique patterns attributed to Fixed Wireless Access (FWA) devices. An ablation study underscores the weighted ensemble's superior performance. This study highlights the need for techniques like data categorization to enhance prediction accuracy for evolving technologies, providing insights for enhanced adaptability in ML-based prediction models for the next network generation.
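To make the idea of a weighted ensemble concrete, here is a minimal sketch in plain Python. It is a generic illustration, not AutoGluon's internal procedure and not the authors' pipeline: the weighting rule (inverse validation MAE), the model names, and the session durations are all assumptions chosen for the example.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def inverse_error_weights(y_val, val_preds, eps=1e-12):
    """Weight each base model by 1 / validation MAE, normalised to sum to 1.

    This weighting rule is an assumption of the sketch; other schemes
    (for example greedy ensemble selection) are equally valid.
    """
    inverse = {
        name: 1.0 / (mean_absolute_error(y_val, preds) + eps)
        for name, preds in val_preds.items()
    }
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def weighted_predict(preds, weights):
    """Combine per-model predictions into one weighted average per sample."""
    n_samples = len(next(iter(preds.values())))
    return [
        sum(weights[name] * preds[name][i] for name in preds)
        for i in range(n_samples)
    ]


# Hypothetical session durations in seconds (illustration only).
y_val = [10.0, 20.0, 30.0]
val_preds = {
    "model_a": [11.0, 21.0, 31.0],  # overestimates by 1 s
    "model_b": [8.0, 18.0, 28.0],   # underestimates by 2 s
}

weights = inverse_error_weights(y_val, val_preds)
ensemble = weighted_predict(val_preds, weights)
print(weights)
print(mean_absolute_error(y_val, ensemble))
```

In this toy case the two base models err in opposite directions, so the weighted average lands closer to the targets than either model alone. That cancellation is the usual motivation for ensembling, although real gains depend on how correlated the base models' errors are.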

Citations: 0

Identifying losers: Automatic identification of growth-stunted salmon in aquaculture using computer vision

Machine learning with applications

Pub Date : 2024-06-01 DOI: 10.1016/j.mlwa.2024.100562

Kana Banno , Filipe Marcel Fernandes Gonçalves , Clara Sauphar , Marianna Anichini , Aline Hazelaar , Linda Helen Sperre , Christian Stolz , Grete Hansen Aas , Lars Christian Gansel , Ricardo da Silva Torres

During the production of salmonids in aquaculture, it is common to observe growth-stunted individuals. The cause of the so-called “loser fish syndrome” is unclear and needs further investigation. Here, we present and compare computer vision systems for the automatic detection and classification of loser fish in Atlantic salmon images taken in sea cages. We evaluated two end-to-end approaches (combined detection and classification) based on YoloV5 and YoloV7, and a two-stage approach based on transfer learning for detection and an ensemble of classifiers (e.g., linear perceptron, Adaline, C-support vector, K-nearest neighbours, and multi-layer perceptron) for classification. To our knowledge, an ensemble of established classifiers from the literature has not been applied to this problem before. Classification entailed assigning every fish to either a healthy or a loser class. The results of the automatic classification were compared to the reliability of human classification. The best-performing computer vision approach was based on YoloV7, which reached a precision score of 86.30%, a recall score of 71.75%, and an F1 score of 78.35%. YoloV5 presented a precision of 79.7%, while the two-stage approach reached a precision of 66.05%. Human classification had a substantial agreement strength (Fleiss’ Kappa score of 0.68), highlighting that evaluation by a human is subjective. Our proposed automatic detection and classification system will enable farmers and researchers to follow the abundance of losers throughout the production period. We provide our dataset of annotated salmon images for further research.
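The two summary statistics quoted above can be reproduced with a few lines of Python. The sketch below shows how an F1 score follows from precision and recall, and how Fleiss' kappa is computed from a table of rater counts. The precision and recall values are the ones reported in the abstract; the rating table is hypothetical illustration data, and the function names are assumptions.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def fleiss_kappa(counts):
    """Fleiss' kappa for a table counts[i][j] = number of raters who put
    subject i into category j. Every subject must have the same number
    of raters (at least two).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number of ratings")

    # Observed agreement per subject, then averaged over subjects.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Agreement expected by chance from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)
    return (p_observed - p_expected) / (1 - p_expected)


# Precision and recall reported for YoloV7 in the abstract.
print(round(100 * f1_score(0.8630, 0.7175), 2))

# Hypothetical table: 4 fish, 3 raters, columns = (healthy, loser).
ratings = [[2, 1], [1, 2], [3, 0], [0, 3]]
print(fleiss_kappa(ratings))
```

The F1 value computed from the rounded precision and recall comes out at roughly 78.36%, which agrees with the reported 78.35% up to rounding of the inputs.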

Citations: 0