Multimedia Tools and Applications: Latest Articles, Page 2

Laplacian nonlinear logistic stepwise and gravitational deep neural classification for facial expression recognition

IF 3.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Multimedia Tools and Applications

Pub Date : 2024-09-18 DOI: 10.1007/s11042-024-20079-0

Binthu Kumari M, Sivagami B

Facial expression recognition is a central part of non-verbal communication and one of the most frequent channels of human communication. However, handling the variety of facial expressions while attaining high accuracy remains a major issue. Laplacian Non-linear Logistic Regression and Gravitational Deep Learning (LNLR-GDL) is proposed for facial expression recognition; it selects the relevant features from face image data, via feature selection, to achieve high performance in minimum time. The proposed method is split into three stages: preprocessing, feature selection, and classification. In the first stage, the face recognition dataset is preprocessed with the Unsharp Masking Laplacian Non-linear Filter model to obtain noise-reduced face images. Second, computationally efficient relevant features are selected from the preprocessed face images using a Logistic Stepwise Regression-based feature selection model. Finally, the Gravitational Deep Neural Classification model is applied to the selected features for robust recognition of facial expressions. The proposed method is compared with existing methods using three evaluation metrics: facial expression recognition accuracy, facial expression recognition time, and peak signal-to-noise ratio (PSNR). The reported results indicate that the proposed LNLR-GDL method outperforms the state-of-the-art methods.
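PSNR, one of the three metrics named in this abstract, has a standard definition that is easy to state in code. The sketch below is only an illustration of that standard formula, not the authors' evaluation code; the function name `psnr`, the nested-list image representation, and the 8-bit peak value of 255 are assumptions made here.

```python
import math


def psnr(reference, processed, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale images.

    Images are given as nested lists of pixel intensities (an assumption of
    this sketch); max_value is the largest possible pixel value.
    """
    ref = [p for row in reference for p in row]
    out = [p for row in processed for p in row]
    if not ref or len(ref) != len(out):
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the processed image is closer to the reference, which is why it is commonly reported for denoising and filtering stages.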

Citations: 0

Potato leaf disease classification using fusion of multiple color spaces with weighted majority voting on deep learning architectures

IF 3.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Multimedia Tools and Applications

Pub Date : 2024-09-18 DOI: 10.1007/s11042-024-20173-3

Samaneh Sarfarazi, Hossein Ghaderi Zefrehi, Önsen Toygar

Early identification of potato leaf disease is challenging due to variations in crop species, disease symptoms, and environmental conditions. Existing methods for detecting crop species and diseases are limited, as they rely on models trained and evaluated solely on plant leaf images from specific regions. This study proposes a novel approach utilizing a Weighted Majority Voting strategy combined with multiple color space models to diagnose potato leaf diseases. The initial detection stage employs deep learning models such as AlexNet, ResNet50, and MobileNet. Our approach aims to identify Early Blight, Late Blight, and healthy potato leaf images. The proposed detection model is trained and tested on two datasets: the PlantVillage dataset and the PLD dataset. The novel fusion and ensemble method achieves an accuracy of 98.38% on the PlantVillage dataset and 98.27% on the PLD dataset with the MobileNet model. An ensemble of all models and color spaces using Weighted Majority Voting significantly increases classification accuracies to 98.61% on the PlantVillage dataset and 97.78% on the PLD dataset. Our contributions include a novel fusion method of color spaces and deep learning models, improving disease detection accuracy beyond the state-of-the-art.
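The Weighted Majority Voting step mentioned in this abstract can be illustrated with a short sketch. This is a generic version of the technique, not the authors' implementation: the abstract does not say how the weights are derived, so the function name `weighted_majority_vote` and the example weights are hypothetical.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per (model, color space) combination.
    weights: one non-negative weight per prediction (hypothetical here;
             for example, each combination's validation accuracy).
    Ties go to the label that appears first in `predictions`.
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)
```

With illustrative weights, `weighted_majority_vote(["early_blight", "late_blight", "late_blight"], [0.9, 0.4, 0.4])` picks `"early_blight"`, because one strongly weighted voter outweighs two weak ones; with equal weights the plain majority wins.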

Citations: 0

Multimodal emotion recognition based on a fusion of audiovisual information with temporal dynamics

IF 3.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Multimedia Tools and Applications

Pub Date : 2024-09-18 DOI: 10.1007/s11042-024-20227-6

José Salas-Cáceres, Javier Lorenzo-Navarro, David Freire-Obregón, Modesto Castrillón-Santana

In the Human-Machine Interactions (HMI) landscape, understanding user emotions is pivotal for elevating user experiences. This paper explores Facial Expression Recognition (FER) within HMI, employing a distinctive multimodal approach that integrates visual and auditory information. Recognizing the dynamic nature of HMI, where situations evolve, this study emphasizes continuous emotion analysis. This work assesses various fusion strategies that involve the addition to the main network of different architectures, such as autoencoders (AE) or an Embracement module, to combine the information of multiple biometric cues. In addition to the multimodal approach, this paper introduces a new architecture that prioritizes temporal dynamics by incorporating Long Short-Term Memory (LSTM) networks. The final proposal, which integrates different multimodal approaches with the temporal focus capabilities of the LSTM architecture, was tested across three public datasets: RAVDESS, SAVEE, and CREMA-D. It showcased state-of-the-art accuracy of 88.11%, 86.75%, and 80.27%, respectively, and outperformed other existing approaches.

Citations: 0

Improvised method for analysis and synthesis of NUFB for Speech and ECG signal applications

IF 3.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Multimedia Tools and Applications

Pub Date : 2024-09-18 DOI: 10.1007/s11042-024-20211-0

B. Keerthana, N. Raju

This article presents a rapidly converging optimization technique using a single parameter for designing non-uniform cosine modulated filter banks (CMFBs). The non-uniform cosine modulated filter banks are derived from closed-form uniform cosine modulated filter banks by merging the relevant bandpass filters based on given decimation factors. In the proposed method, the cut-off frequency of the prototype filter is varied with an analytically calculated step size, using control parameters, so that the filter coefficients at the quadrature frequency are approximately equal to 0.707 and the formulated objective function is satisfied within the prescribed tolerance. Simulation results demonstrate that the proposed algorithm achieves superior performance, with amplitude distortion levels significantly lower than those of existing methods in the literature, reaching as low as 2.4483 × 10⁻⁴. For the prototype filter design, a constrained equiripple finite impulse response (FIR) digital filter is employed, with the roll-off factor and error ratio chosen based on a stopband attenuation, a passband attenuation, and a filter order. The results highlight the proposed algorithm’s effectiveness for high-quality reconstruction of speech signals, particularly in speech coding and enhancement, as well as ECG signals. This makes the method versatile and suitable for various practical applications, including sub-band coding of real-time and near real-time signals.

Citations: 0

Template-based text field segmentation for ID documents using dynamic squeezeboxes packing

IF 3.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Multimedia Tools and Applications

Pub Date : 2024-09-18 DOI: 10.1007/s11042-024-20162-6

Michael Zingerenko, Elena Limonova, Vladimir V. Arlazarov

In this paper, we focus on the problem of text field segmentation in identity documents. These documents, characterized by their fixed layouts, present an opportunity to apply computationally efficient template-based algorithms. We consider the Dynamic Squeezeboxes Packing method and demonstrate its integration into document recognition systems, utilizing a single sample per document type. We benchmark text field segmentation on the MIDV-2019 public dataset using the standard intersection-over-union metric and our custom intersection-over-template metric, while also measuring processing time. We demonstrate that Dynamic Squeezeboxes Packing maintains competitive quality compared to text-in-the-wild methods (EAST, CRAFT) and a named-entity recognition method (LayoutLMv2). A significant advantage of this method is its processing speed, averaging 9 ms per image on the x86_64 platform, which is substantially faster than EAST (980 ms), CRAFT (2030 ms), and LayoutLMv2 (2210 ms). The obtained results suggest that the considered method has strong potential in document image analysis, particularly for processing identity documents.
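The standard intersection-over-union metric used in this benchmark can be sketched as follows for axis-aligned boxes. The `(x1, y1, x2, y2)` box format and the function name `iou` are assumptions of this sketch; the authors' custom intersection-over-template metric is not defined in the abstract and is deliberately not reproduced here.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; clamp to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For example, two 2×2 boxes offset by one unit in each direction overlap in a 1×1 square, giving an IoU of 1/7.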

Citations: 0

Enhancement of single foggy image using feature based fusion technique

IF 3.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Multimedia Tools and Applications

Pub Date : 2024-09-18 DOI: 10.1007/s11042-024-20181-3

Pooja Pandey, Rashmi Gupta, Nidhi Goel

Fog and haze are very common natural phenomena that reduce the visibility of outdoor images. Poor visibility creates problems in many applications, such as tracking and surveillance. In this paper, an efficient feature-based fusion technique is used to enhance a single foggy image at the transmission level. Fusion at this level retains the most significant features of the foggy image, and the defogged output image is computed from this single fused transmission-level input. The proposed methodology overcomes the shortcomings of the existing Dark Channel Prior and Bright Channel Prior methods. The output of the proposed method shows promising results on datasets that vary in fog density as well as in size. The main advantage of this method is that it requires no pre-processing or post-processing and is therefore very simple to implement.

Citations: 0

Integration of Blockchain and IPFS: healthcare data management & sharing for IoT Environment

IF 3.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Multimedia Tools and Applications

Pub Date : 2024-09-17 DOI: 10.1007/s11042-024-20092-3

Rajiv Kumar Mishra, Rajesh Kumar Yadav, Prem Nath

The immense volume of data generated and collected by smart devices has significantly enhanced various aspects of our daily lives. However, safeguarding the sensitive information shared among these devices is crucial. Ensuring the security of the Internet of Things (IoT) ecosystem from unauthorized access is imperative. Blockchain technology emerges as a promising solution to address these security concerns. Nevertheless, the effectiveness of Blockchain in handling the extensive data generated by smart devices is challenged by the rapid pace of IoT data generation and the slower transaction validation speed within Blockchain networks. This research aims to resolve these issues by integrating Blockchain with the Inter-Planetary File System (IPFS), creating a robust framework for secure data recording on a distributed storage network while enabling authorized access to the stored data. The proposed mechanism involves defining and recording access policies and cryptographic hash content on the Blockchain network, while storing the actual IoT-generated data on IPFS to enhance the confidentiality, integrity, and availability (CIA) triad. Performance assessments of the proposed scheme demonstrate its security and practicality, validating its potential for real-world application.
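The split described in this abstract, with access policies and content hashes on the ledger and the bulk data stored off-chain, can be sketched with in-memory stand-ins. Everything named below (`OffChainStore`, `Ledger`, `read_payload`, the reader names) is hypothetical and does not use a real Blockchain or IPFS client; plain SHA-256 hex digests stand in for IPFS content identifiers, which are encoded differently in practice.

```python
import hashlib


class OffChainStore:
    """In-memory stand-in for content-addressed storage such as IPFS."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


class Ledger:
    """In-memory stand-in for the blockchain: append-only list of small records."""

    def __init__(self):
        self._records = []

    def record(self, digest: str, allowed_readers) -> int:
        self._records.append({"digest": digest, "allowed": frozenset(allowed_readers)})
        return len(self._records) - 1

    def lookup(self, record_id: int) -> dict:
        return self._records[record_id]


def read_payload(ledger, store, record_id, user):
    """Check the on-ledger policy, fetch the data off-chain, verify its hash."""
    entry = ledger.lookup(record_id)
    if user not in entry["allowed"]:
        raise PermissionError(f"{user} is not authorized for record {record_id}")
    data = store.get(entry["digest"])
    if hashlib.sha256(data).hexdigest() != entry["digest"]:
        raise ValueError("integrity check failed: stored data does not match ledger hash")
    return data
```

Only the fixed-size digest and the policy reach the ledger, so the ledger's slower validation is not spent on the raw sensor data, while the hash comparison lets a reader detect tampering with the off-chain copy.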

Citations: 0

Improving agility in projects using machine learning algorithm

IF 3.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Multimedia Tools and Applications

Pub Date : 2024-09-17 DOI: 10.1007/s11042-024-19909-y

Janani Varun, R A Karthika

Every software product needs testing to ensure its quality and accuracy. Testers' work becomes much easier in the Agile era when they can optimize the effort they spend and predict defects in upcoming modules. This paper discusses predicting defects using the Random Forest algorithm. Predictive analytics draws on information from the past to forecast the outcomes of future events. Product teams often have difficulty delivering the product on schedule: in Agile development the requirements keep changing, and the team is unsure about upcoming releases. Prediction helps the team focus on the complex and error-prone modules in upcoming releases. The predictive analytics model designed here predicts defects with an accuracy of 88% using historical data. With these predictions, testers can focus on the modules for which the model predicts more defects and shift delivery left.

Citations: 0

Machine learning-driven IoT device for women’s safety: a real-time sexual harassment prevention system

IF 3.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Multimedia Tools and Applications

Pub Date : 2024-09-17 DOI: 10.1007/s11042-024-20228-5

Md Reazul Islam, Khondokar Oliullah, Mohsin Kabir, Ashifur Rahman, M. F. Mridha, Muhammed Fayyaz Khan, Nilanjan Dey

Sexual harassment is a pervasive problem that affects individuals in diverse environments, including educational institutions, workplaces, and public areas. Despite increased awareness and advocacy efforts, many women continue to face harassment daily, especially on the Indian sub-continent, with underreporting and impunity exacerbating the problem. As technology advances, there is a growing opportunity to use innovative solutions to address this problem. In recent years, the Internet of Things (IoT) and machine learning have emerged as promising technologies for developing systems that can detect and prevent sexual harassment in real time. This study presents a novel approach for real-time sexual harassment monitoring using a machine learning-based IoT system. The system incorporates nine force-sensitive resistors strategically embedded in women’s dresses to capture relevant data. It is portable and can be affixed to any type of dress. If the user wishes to change their attire, the system can be easily removed from the current dress and attached to another dress of choice. This flexibility allows users to adapt the system to suit various clothing preferences and styles. The sensor data are transmitted to the cloud via the NodeMCU, enabling continuous monitoring. In the cloud, a pre-trained machine learning model, specifically the AdaBoost classifier, is employed to classify incoming data in real time. We applied four ML methods: RF with GridSearchCV, Bagging Classifier, XGBoost, and AdaBoost Classifier. The AdaBoost classifier performed best, with an accuracy of 99.3% on a dataset prepared by our lab, which consists of 1048 instances collected from 50 students. If a sexual harassment event is detected, an alert is generated through a mobile application and promptly sent to the appropriate authorities for immediate action to save the victim. By integrating wearable sensors, IoT technology, and machine learning, this system offers a proactive and efficient approach, especially in uncertain situations, to detect and address sexual harassment incidents and enhance safety and security in various settings.

Citations: 0

Enhancing multi-target tracking stability using knowledge graph integration within the Gaussian Mixture Probability Hypothesis Density Filter

IF 3.6 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Multimedia Tools and Applications

Pub Date : 2024-09-17 DOI: 10.1007/s11042-024-20180-4

Ali Mehrizi, Hadi Sadoghi Yazdi

This paper proposes a novel approach to enhancing multi-target tracking of vehicles in videos with frequent camera occlusions. Our method integrates prior knowledge about vehicle behavior into a Gaussian Mixture Probability Hypothesis Density (GMPHD) filter framework. This knowledge, extracted as a knowledge graph from historical vehicle trajectories, allows the tracker to maintain persistence even during significant interruptions. The knowledge graph models expected movement patterns and generates pseudo-observations during occlusions, similar to how time series analysis leverages historical data for forecasting. We evaluate the proposed method on both simulated and real-world video datasets using the Optimal Sub Pattern Assignment (OSPA) metric, which assesses tracking accuracy. The results show a 19.5% improvement for simulated data and a 16.5% improvement for real-world video data under fully occluded conditions, demonstrating a significant enhancement in performance.
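The OSPA metric named in the abstract can be made concrete with a short sketch. The abstract does not state the cutoff or order used, so the `cutoff` and `order` defaults below are assumptions, as are the function name `ospa` and the example coordinates. The optimal assignment is found by brute force over permutations, which is only practical for a handful of targets; a real evaluation would more likely use the Hungarian algorithm.

```python
import itertools
import math


def ospa(truth, estimate, cutoff=10.0, order=2):
    """OSPA distance between two finite sets of points (tuples of floats).

    cutoff caps each per-target distance and is also the penalty charged
    for every missing or spurious target; order is the p-norm exponent.
    """
    if not truth and not estimate:
        return 0.0
    # Assign the smaller set into the larger one.
    small, large = sorted((truth, estimate), key=len)
    m, n = len(small), len(large)
    # Cheapest assignment cost over all injective mappings small -> large.
    best = min(
        sum(min(math.dist(p, large[j]), cutoff) ** order
            for p, j in zip(small, perm))
        for perm in itertools.permutations(range(n), m)
    )
    # Localisation cost plus a cutoff penalty per cardinality mismatch.
    return ((best + cutoff ** order * (n - m)) / n) ** (1.0 / order)


# Two true vehicles; the tracker loses one of them (e.g. during occlusion).
truth = [(0.0, 0.0), (10.0, 0.0)]
lost_one = [(0.0, 0.0)]
kept_both = [(0.0, 0.0), (10.0, 1.0)]

penalty_lost = ospa(truth, lost_one, cutoff=5.0, order=1)
penalty_kept = ospa(truth, kept_both, cutoff=5.0, order=1)
```

With these example values, dropping a track costs more than keeping it with a small position error. That cardinality penalty is why pseudo-observations that keep an occluded target alive, as the abstract describes, can lower the OSPA score.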

Citations: 0