Cubixel: a novel paradigm in image processing using three-dimensional pixel representation
Pub Date: 2024-09-09 | DOI: 10.1007/s11042-024-20081-6
Sanad Aburass
This paper introduces the concept of the Cubixel, a three-dimensional representation of the traditional pixel, alongside a derived metric, the Volume of the Void (VoV), which measures spatial disparities within images. By converting pixels into Cubixels, we can analyze an image's 3D properties, enriching image processing and computer vision tasks. Using Cubixels, we develop algorithms for image segmentation, edge detection, texture analysis, and feature extraction, yielding a deeper understanding of image content. Experimental results on benchmark images and datasets demonstrate the applicability of these concepts. We further discuss future applications of Cubixels and VoV in various domains, particularly medical imaging, where they have the potential to enhance diagnostic processes. By interpreting images as complex 'urban landscapes', we envision a new frontier for deep learning models that simulate and learn from diverse environmental conditions. Integrating Cubixels into deep learning architectures offers a pathway towards more intelligent, context-aware artificial intelligence systems. With this work, we aim to inspire future research that unlocks the full potential of image data, in both theoretical understanding and practical applications. Our code is available at https://github.com/sanadv/Cubixel.
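The abstract does not spell out how a pixel becomes a Cubixel or how the Volume of the Void is computed, so the following is only a minimal sketch of one plausible reading: each pixel is treated as a unit-base cuboid whose height is its intensity, and VoV is the unfilled volume below a common intensity ceiling. The function names and the 255 ceiling are illustrative assumptions; the authors' exact definitions are in the linked repository.

```python
import numpy as np

def cubixel_heights(image: np.ndarray) -> np.ndarray:
    """Interpret a grayscale image as a field of unit-base cuboids, one per pixel,
    whose heights are the pixel intensities."""
    return image.astype(np.float64)

def volume_of_void(image: np.ndarray, ceiling: float = 255.0) -> float:
    """Total empty volume between each cubixel's top face and a common ceiling
    (one plausible reading of VoV; an assumption, not the paper's definition)."""
    heights = cubixel_heights(image)
    return float(np.sum(ceiling - heights))

# Example: VoV of a random 8-bit image versus a flat white image (whose VoV is 0).
img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
print(volume_of_void(img), volume_of_void(np.full((64, 64), 255, dtype=np.uint8)))
```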
{"title":"Cubixel: a novel paradigm in image processing using three-dimensional pixel representation","authors":"Sanad Aburass","doi":"10.1007/s11042-024-20081-6","DOIUrl":"https://doi.org/10.1007/s11042-024-20081-6","url":null,"abstract":"<p>This paper introduces the innovative concept of the Cubixel—a three-dimensional representation of the traditional pixel—alongside the derived metric, Volume of the Void (VoV), which measures spatial disparities within images. By converting pixels into Cubixels, we can analyze the image’s 3D properties, thereby enriching image processing and computer vision tasks. Utilizing Cubixels, we’ve developed algorithms for advanced image segmentation, edge detection, texture analysis, and feature extraction, yielding a deeper comprehension of image content. Our empirical experimental results on benchmark images and datasets showcase the applicability of these concepts. Further, we discuss future applications of Cubixels and VoV in various domains, particularly in medical imaging, where they have the potential to significantly enhance diagnostic processes. By interpreting images as complex ‘urban landscapes’, we envision a new frontier for deep learning models that simulate and learn from diverse environmental conditions. The integration of Cubixels into deep learning architectures promises to revolutionize the field, providing a pathway towards more intelligent, context-aware artificial intelligence systems. With this groundbreaking work, we aim to inspire future research that will unlock the full potential of image data, transforming both theoretical understanding and practical applications. Our code is available at https://github.com/sanadv/Cubixel.</p>","PeriodicalId":18770,"journal":{"name":"Multimedia Tools and Applications","volume":"13 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Image decomposition based segmentation of retinal vessels
Pub Date: 2024-09-09 | DOI: 10.1007/s11042-024-20171-5
Anumeha Varma, Monika Agrawal
Retinal vessel segmentation has various applications in the biomedical field, including early disease detection, biometric authentication using retinal scans, and disease classification. Many of these applications rely critically on an accurate and efficient segmentation technique. In the existing literature, much work has been done to improve segmentation accuracy, but it relies heavily on the amount of training data available as well as the quality of the captured images. Another gap concerns the computational resources consumed by these heavily trained algorithms. This paper addresses these gaps with a resource-efficient unsupervised technique that increases the accuracy of retinal vessel segmentation using the Fourier decomposition method (FDM) together with the Gabor transform for image signals. The proposed method achieves accuracies of 97.39%, 97.62%, 95.34%, and 96.57% on the DRIVE, STARE, CHASE_DB1, and HRF datasets, respectively, with sensitivities of 88.36%, 88.51%, 90.37%, and 79.07%. A separate section gives a detailed comparison of the proposed method with several well-known methods and analyzes its efficiency. The proposed method proves efficient in terms of time and resource requirements.
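The FDM step is specific to the paper, but the Gabor side of such a pipeline can be illustrated with a standard multi-orientation filter bank whose maximum response highlights elongated, vessel-like structures. This is a generic sketch, not the authors' method; kernel size, wavelength, and threshold are illustrative assumptions.

```python
import numpy as np
import cv2

def gabor_vessel_response(gray: np.ndarray, n_orientations: int = 8) -> np.ndarray:
    """Maximum response of a small Gabor filter bank over orientations;
    elongated, vessel-like structures respond strongly at their own orientation."""
    gray = gray.astype(np.float32) / 255.0
    responses = []
    for k in range(n_orientations):
        theta = k * np.pi / n_orientations
        kernel = cv2.getGaborKernel((15, 15), 3.0, theta, 8.0, 0.5, 0).astype(np.float32)
        kernel -= kernel.mean()                      # zero-mean so flat regions give no response
        responses.append(cv2.filter2D(gray, cv2.CV_32F, kernel))
    return np.max(np.stack(responses), axis=0)

# vessels = gabor_vessel_response(cv2.imread("retina.png", cv2.IMREAD_GRAYSCALE)) > 0.1
```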
{"title":"Image decomposition based segmentation of retinal vessels","authors":"Anumeha Varma, Monika Agrawal","doi":"10.1007/s11042-024-20171-5","DOIUrl":"https://doi.org/10.1007/s11042-024-20171-5","url":null,"abstract":"<p>Retinal vessel segmentation has various applications in the biomedical field. This includes early disease detection, biometric authentication using retinal scans, classification and others. Many of these applications rely critically on an accurate and efficient segmentation technique. In the existing literature, a lot of work has been done to improve the accuracy of the segmentation task, but it relies heavily on the amount of data available for training as well as the quality of the images captured. Another gap is observed in terms of the resources used in these heavily trained algorithms. This paper aims to address these gaps by using a resource-efficient unsupervised technique and also increasing the accuracy of retinal vessel segmentation using the Fourier decomposition method (FDM) along with the Gabor transform for image signals. The proposed method has an accuracy of 97.39%, 97.62%, 95.34%, and 96.57% on DRIVE, STARE, CHASE_DB1, and HRF datasets, respectively. The sensitivities were found to be 88.36%, 88.51%, 90.37%, and 79.07%, respectively. A separate section makes a detailed comparison of the proposed method with several well-known methods and an analysis of the efficiency of the proposed method. The proposed method proves to be efficient in terms of time and resource requirements.</p>","PeriodicalId":18770,"journal":{"name":"Multimedia Tools and Applications","volume":"9 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improvement of Blowfish encryption algorithm based on Galois ring for electro-optical satellite images
Pub Date: 2024-09-09 | DOI: 10.1007/s11042-024-20169-z
Muhammad Javid, Majid Khan, Muhammad Amin
The search for new algebraic structures is a continuing area of research in the design and development of new information confidentiality mechanisms. A confidentiality mechanism keeps information secure from unauthorized users by transforming it into a ciphered format that cannot be easily deciphered. The key components of data security are encryption and decryption algorithms, which depend on the mathematical structures used for the substitution box (S-box) and key scheduling. In the present research, a Galois ring-based S-box is used in the Blowfish confidentiality technique and practically implemented for ciphering electro-optical satellite images. The ciphering results are compared with those of the Blowfish cipher using the standard S-box. Ciphered images are evaluated with standard tests such as histogram analysis, randomness, correlation, and differential attack analysis. These tests show that data ciphered by Blowfish with the Galois ring-based S-box is more secure, and that adequate security for electro-optical satellite images is already achieved in the fourth round. Test results also show that the proposed scheme attains better satellite image ciphering in the fourth round with less execution time. In addition, subsequent steps of the existing algorithm were modified, adding further robustness to the protected digital information. Security analysis of the tested mechanism confirms the improved confidentiality of digital images, and the results of the proposed ciphering scheme demonstrate better performance than the standard Blowfish algorithm for satellite images.
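The Galois-ring S-box and the modified Blowfish rounds cannot be reproduced from the abstract, but the statistical checks it mentions are standard. A minimal sketch of two of them, Shannon entropy and horizontal adjacent-pixel correlation, is shown below; a well-ciphered 8-bit image should score close to 8 bits/pixel and close to 0, respectively.

```python
import numpy as np

def shannon_entropy(img: np.ndarray) -> float:
    """Entropy of the 8-bit intensity histogram; ~8 bits/pixel for a good cipher."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def horizontal_correlation(img: np.ndarray) -> float:
    """Correlation of horizontally adjacent pixels; close to 0 for a well-ciphered image."""
    x = img[:, :-1].ravel().astype(np.float64)
    y = img[:, 1:].ravel().astype(np.float64)
    return float(np.corrcoef(x, y)[0, 1])

cipher_like = np.random.randint(0, 256, size=(256, 256), dtype=np.uint8)
print(shannon_entropy(cipher_like), horizontal_correlation(cipher_like))
```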
{"title":"Improvement of Blowfish encryption algorithm based on Galois ring for electro-optical satellite images","authors":"Muhammad Javid, Majid Khan, Muhammad Amin","doi":"10.1007/s11042-024-20169-z","DOIUrl":"https://doi.org/10.1007/s11042-024-20169-z","url":null,"abstract":"<p>The desire for new algebraic structures is always an interesting area of research for the design and development of new information confidentiality mechanisms. Confidentiality mechanism keeps information secure from unauthentic users and transforms information into a ciphered format that cannot be easily deciphered. The key components of data security are encryption/decryption algorithms. Encryption/Decryption Algorithms depend on mathematical structures used for substitution box (S-box) and key scheduling. In the present research, the Galois ring-based S-box is used in the Blowfish confidentiality technique and practically implement for electro-optical satellite image ciphering. The results of data ciphering are compared with Blowfish cipher with the standard S-box. Ciphered images have been evaluated by standard tests such as histogram equalization, randomness, correlation, and differential assaults analysis. These tests depict that the ciphered data by Blowfish based on Galois ring-based S-box is more secure and the security is achieved in the fourth round for electro-optical satellite images. Test results for satellite images illustrate that by less execution time found better satellite image ciphering in the fourth ciphering round by using the proposed scheme. With this addition, subsequent steps of the existing algorithm were modified which add more robustness to our digital information. Security analysis of this tested mechanism added more robustness to the confidentiality of digital images. The results of the anticipated ciphering scheme clarify the better performance as compared to the standard Blowfish algorithm for satellite images.</p>","PeriodicalId":18770,"journal":{"name":"Multimedia Tools and Applications","volume":"7 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhancing classification of lung diseases by optimizing training hyperparameters of the deep learning network
Pub Date: 2024-09-09 | DOI: 10.1007/s11042-024-20085-2
Hardeep Saini, Davinder Singh Saini
The COVID-19 pandemic was triggered by the SARS-CoV-2 virus, which caused multiple ill-health conditions in infected individuals, and many cases culminated in death. Chest X-ray imaging became a proven method for spotting thoracic ailments, and the resulting availability of huge public datasets of chest X-ray images has great potential for deep-learning-based lung ailment detection. This paper presents a classification approach that uses a metaheuristic algorithm to acquire optimal hyperparameters for training various pre-trained CNNs. The experimental results show that HSAGWO (Hybrid Simulated Annealing Grey Wolf Optimization) outperforms other contemporary models for optimizing training hyperparameters in the ResNet50 network. The accuracy, precision, sensitivity (recall), specificity, and F1-score values obtained are 98.78%, 98.10%, 99.31%, and 98.64%, respectively, which are significantly better than the values obtained with existing methods. The objective of this work is to improve classification accuracy and reduce false negatives while keeping computational time to a minimum.
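The abstract does not specify how simulated annealing and grey wolf optimization are hybridized, so the sketch below shows only one common pattern: standard GWO position updates combined with an SA-style acceptance test when a wolf's candidate position is worse than its current one. The `evaluate` objective, the two tuned hyperparameters (learning rate and dropout), and all bounds are placeholders standing in for a real validation run of a pre-trained CNN such as ResNet50.

```python
import numpy as np

rng = np.random.default_rng(0)
LOW, HIGH = np.array([1e-5, 0.0]), np.array([1e-1, 0.7])   # bounds: [learning rate, dropout]

def evaluate(x):
    """Placeholder objective (higher is better); stands in for validation accuracy."""
    lr, drop = x
    return -(np.log10(lr) + 3.0) ** 2 - (drop - 0.3) ** 2

def hsagwo(n_wolves=10, iters=50, t0=1.0):
    wolves = rng.uniform(LOW, HIGH, size=(n_wolves, 2))
    fitness = np.array([evaluate(w) for w in wolves])
    for t in range(iters):
        alpha, beta, delta = wolves[np.argsort(fitness)[::-1][:3]]
        a = 2.0 * (1 - t / iters)                     # GWO exploration coefficient, decays to 0
        temp = t0 * (1 - t / iters) + 1e-9            # SA temperature, decays with iterations
        for i in range(n_wolves):
            cand = np.zeros(2)
            for leader in (alpha, beta, delta):       # standard GWO position update
                r1, r2 = rng.random(2), rng.random(2)
                A, C = 2 * a * r1 - a, 2 * r2
                cand += leader - A * np.abs(C * leader - wolves[i])
            cand = np.clip(cand / 3.0, LOW, HIGH)
            f = evaluate(cand)
            # SA-style acceptance: keep improvements, occasionally accept worse moves
            if f > fitness[i] or rng.random() < np.exp((f - fitness[i]) / temp):
                wolves[i], fitness[i] = cand, f
    best = int(np.argmax(fitness))
    return wolves[best], fitness[best]

# best_params, best_score = hsagwo()
```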
{"title":"Enhancing classification of lung diseases by optimizing training hyperparameters of the deep learning network","authors":"Hardeep Saini, Davinder Singh Saini","doi":"10.1007/s11042-024-20085-2","DOIUrl":"https://doi.org/10.1007/s11042-024-20085-2","url":null,"abstract":"<p>The COVID-19 pandemic was triggered by the SARS-CoV-2 virus which caused multiple ill-health conditions in infected individuals. There were many cases that culminated in death. Chest X-ray images became a proven method for spotting thoracic ailments. The resultant availability of huge public datasets of chest X-ray images has great potential in deep learning for lung ailment detection. This paper presents a classification that aims at acquiring the optimal hyperparameters using the metaheuristic algorithm for various pre-trained CNN training processes. The experimental results show that HSAGWO (Hybrid Simulated Annealing Grey Wolf Optimization) outperforms the other contemporary models for optimizing training hyperparameters in the ResNet50 network. The accuracy, precision, sensitivity (recall), specificity, and F1-score values obtained are 98.78%, 98.10%, 99.31%, and 98.64%, respectively, which are significantly better than the values obtained for the existing methods. The objective of this work is to improve classification accuracy and reduce false negatives while keeping computational time to a minimum.</p>","PeriodicalId":18770,"journal":{"name":"Multimedia Tools and Applications","volume":"106 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An optimized congestion control protocol in cellular network for improving quality of service
Pub Date: 2024-09-09 | DOI: 10.1007/s11042-024-20126-w
Sandhya S. V, S. M. Joshi
In recent decades, cellular networks (CN) have been used broadly in communication technologies. The most critical challenge in a CN is congestion control, owing to the distributed mobile environment. Approaches such as mobile edge computing, congestion control systems, machine learning, and heuristic models have failed to prevent congestion in CNs, largely because they lack a continuous monitoring function at every time interval. In this study, a novel Golden Eagle-based Primal-dual Congestion Management (GEbPDCM) scheme is developed for the Long-Term Evolution (LTE) Ad hoc On-demand Distance Vector (AODV) network, in which the Golden Eagle function provides continuous monitoring of data congestion. The main objective of this research is therefore to improve Quality of Service (QoS) by optimizing congestion control, where QoS is measured by delay, packet delivery ratio (PDR), throughput, packet loss, and energy consumption. Initially, the nodes were created in the MATLAB environment, and GEbPDCM was activated to predict the data load and estimate the node density to determine node status. High data overload was then migrated to a free-status node to control congestion. Finally, the efficiency of the proposed model was measured in terms of delay, PDR, throughput, packet loss, and energy consumption. The proposed model achieved a high throughput of 97.1 Mbps and a PDR of 97.1%, while reducing delay to 67.4 ms and energy consumption to 50.6 mJ. Hence, the present model is suitable for LTE networks.
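The golden-eagle monitoring and primal-dual components of GEbPDCM are not specified in the abstract; the sketch below only illustrates the final control step it describes, migrating excess load from a congested node to a free-status node. The `Node` fields and the 0.8 utilization threshold are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Node:
    node_id: int
    capacity: float          # maximum load the node can serve (e.g., packets/s)
    load: float = 0.0        # current offered load

def migrate_overload(nodes: List[Node], threshold: float = 0.8) -> None:
    """Move excess load from congested nodes to the least-utilized free-status node."""
    for n in nodes:
        if n.load > threshold * n.capacity:                      # congested node
            excess = n.load - threshold * n.capacity
            free = min((m for m in nodes
                        if m is not n and m.load < threshold * m.capacity),
                       key=lambda m: m.load / m.capacity, default=None)
            if free is not None:
                shift = min(excess, threshold * free.capacity - free.load)
                n.load -= shift
                free.load += shift

# nodes = [Node(0, 100.0, 95.0), Node(1, 100.0, 20.0)]; migrate_overload(nodes)
```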
{"title":"An optimized congestion control protocol in cellular network for improving quality of service","authors":"Sandhya S. V, S. M. Joshi","doi":"10.1007/s11042-024-20126-w","DOIUrl":"https://doi.org/10.1007/s11042-024-20126-w","url":null,"abstract":"<p>In recent decades, Cellular Networks (CN) have been used broadly in communication technologies. The most critical challenge in the CN was congestion control due to the distributed mobile environment. Some approaches, like mobile edge computing, congesting controlling systems, machine learning, and heuristic models, have failed to prevent congestion in CN. The reason for this problem is the lack of continuous monitoring function at every time interval. So, in this present study, a novel Golden Eagle-based Primal–dual Congestion Management (GEbPDCM) has been developed for the Long-Term Evolution (LTE) Ad hoc On-demand Vector (AODV) network. Here, the Golden Eagle function features will afford the continuous monitoring function to monitor data congestion. Hence, the main objective of this research is to improve the Quality of service (QoS) by optimizing congestion controls. Here, the QoS is measured by different metrics, such as delay, packet delivery ratio (PDR), throughput, packet loss, and energy consumption. Initially, the nodes were created in the MATLAB environment, and the GEbPDCM was activated to predict the data load and estimate the node density to measure the node status. Then, the high data overload was migrated to another free status node to control congestion. Finally, the proposed model efficiency was measured regarding delay, packet delivery ratio (PDR), throughput, packet loss, and energy consumption. The proposed model has scored high throughput at 97.1 Mbps and 97.1 PDR, reducing delay to 67.4 ms and 50.6 mJ energy consumption. Hence, the present model is suitable for the LTE network.</p>","PeriodicalId":18770,"journal":{"name":"Multimedia Tools and Applications","volume":"7 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Augmented reality without SLAM
Pub Date: 2024-09-06 | DOI: 10.1007/s11042-024-20154-6
Aminreza Gholami, Behrooz Nasihatkon, Mohsen Soryani
Most augmented reality (AR) pipelines involve computing the camera's pose in each frame, followed by the 2D projection of virtual objects. Camera pose estimation is commonly implemented as a SLAM (Simultaneous Localisation and Mapping) algorithm. However, SLAM systems are often limited to scenarios where the camera intrinsics remain fixed or are known in all frames. This paper presents an initial effort to circumvent the pose estimation stage altogether and directly compute 2D projections using epipolar constraints. To achieve this, we first calculate the fundamental matrices between the keyframes and each new frame. The 2D locations of objects can then be triangulated by finding the intersection of epipolar lines in the new frame. We propose a robust algorithm that can handle situations where some of the fundamental matrices are entirely erroneous. Most notably, we introduce a depth-buffering algorithm that relies solely on the fundamental matrices, eliminating the need to compute 3D point locations in the target view. By utilizing fundamental matrices, our method remains effective even when all intrinsic camera parameters vary over time. The proposed approach achieves sufficient accuracy even with more degrees of freedom in the solution space.
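A minimal sketch of the central idea, projecting a point into the new frame by intersecting the epipolar lines induced by the fundamental matrices, is shown below. It uses a plain least-squares intersection; the paper's robust handling of erroneous fundamental matrices and its depth-buffering scheme are not reproduced here.

```python
import numpy as np

def project_by_epipolar_intersection(fund_mats, keyframe_pts):
    """fund_mats[i]: 3x3 fundamental matrix mapping keyframe-i points to epipolar
    lines in the new frame; keyframe_pts[i]: (x, y) location of the same object
    point in keyframe i. Returns the least-squares intersection (u, v) of the
    epipolar lines in the new frame (needs at least two keyframes)."""
    A, b = [], []
    for F, (x, y) in zip(fund_mats, keyframe_pts):
        line = F @ np.array([x, y, 1.0])              # epipolar line a*u + b*v + c = 0
        line /= np.linalg.norm(line[:2]) + 1e-12      # normalize so residuals are point-line distances
        A.append(line[:2])
        b.append(-line[2])
    sol, *_ = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)
    return sol
```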
{"title":"Augmented reality without SLAM","authors":"Aminreza Gholami, Behrooz Nasihatkon, Mohsen Soryani","doi":"10.1007/s11042-024-20154-6","DOIUrl":"https://doi.org/10.1007/s11042-024-20154-6","url":null,"abstract":"<p>Most augmented reality (AR) pipelines typically involve the computation of the camera’s pose in each frame, followed by the 2D projection of virtual objects. The camera pose estimation is commonly implemented as SLAM (Simultaneous Localisation and Mapping) algorithm. However, SLAM systems are often limited to scenarios where the camera intrinsics remain fixed or are known in all frames. This paper presents an initial effort to circumvent the pose estimation stage altogether and directly computes 2D projections using epipolar constraints. To achieve this, we initially calculate the fundamental matrices between the keyframes and each new frame. The 2D locations of objects can then be triangulated by finding the intersection of epipolar lines in the new frame. We propose a robust algorithm that can handle situations where some of the fundamental matrices are entirely erroneous. Most notably, we introduce a depth-buffering algorithm that relies solely on the fundamental matrices, eliminating the need to compute 3D point locations in the target view. By utilizing fundamental matrices, our method remains effective even when all intrinsic camera parameters vary over time. Notably, our proposed approach achieved sufficient accuracy, even with more degrees of freedom in the solution space.</p>","PeriodicalId":18770,"journal":{"name":"Multimedia Tools and Applications","volume":"33 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predicting breast cancer molecular subtypes from H&E-stained histopathological images using a spatial-transcriptomics-based patch filter
Pub Date: 2024-09-06 | DOI: 10.1007/s11042-024-20160-8
Yuqi Chen, Juan Liu, Lang Wang, Peng Jiang, Baochuan Pang, Dehua Cao
The molecular subtype of breast cancer plays an important role in patient prognosis and guides physicians in developing scientific therapeutic regimes. In clinical practice, physicians classify molecular subtypes of breast cancer with immunohistochemistry (IHC), which requires a long diagnostic cycle and therefore delays effective treatment of patients with breast cancer. To speed up diagnosis, we propose a machine learning method that predicts molecular subtypes of breast cancer from H&E-stained histopathological images. Although some molecular subtype prediction methods have been suggested, they are noisy and lack clinical evidence. To address these issues, we introduce a patch filter-based molecular subtype prediction (PFMSP) method using spatial transcriptomics data: a patch filter is first trained with spatial transcriptomics data, and the trained filter is then used to select valuable patches for molecular subtype prediction in other H&E-stained histopathological images. These valuable patches express one or more of the genes ESR1, ESR2, PGR, and ERBB2. We evaluated the performance of our method on the spatial transcriptomics (ST) dataset and the TCGA-BRCA dataset; the patches selected by the patch filter achieved accuracies of 80% and 73.91% in predicting molecular subtypes on the ST and TCGA-BRCA datasets, respectively. Experimental results show that using the trained patch filter to select patches improves precision in predicting molecular subtypes of breast cancer.
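The trained patch filter itself cannot be reconstructed from the abstract, but its selection criterion, keeping patches whose matched spatial-transcriptomics spots express at least one of ESR1, ESR2, PGR, or ERBB2, can be sketched as a simple rule. The dataframe layout and count threshold below are illustrative assumptions.

```python
import pandas as pd

MARKER_GENES = ["ESR1", "ESR2", "PGR", "ERBB2"]

def select_informative_spots(expr: pd.DataFrame, min_counts: int = 1) -> pd.Index:
    """expr: spots x genes count matrix; returns the ids of spots that express
    at least one of the marker genes at or above min_counts."""
    present = [g for g in MARKER_GENES if g in expr.columns]
    keep = (expr[present] >= min_counts).any(axis=1)
    return expr.index[keep]
```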
{"title":"Predicting breast cancer molecular subtypes from H &E-stained histopathological images using a spatial-transcriptomics-based patch filter","authors":"Yuqi Chen, Juan Liu, Lang Wang, Peng Jiang, Baochuan Pang, Dehua Cao","doi":"10.1007/s11042-024-20160-8","DOIUrl":"https://doi.org/10.1007/s11042-024-20160-8","url":null,"abstract":"<p>The molecular subtype of breast cancer plays an important role in the prognosis of patients and guides physicians to develop scientific therapeutic regimes. In clinical practice, physicians classify molecular subtypes of breast cancer with immunohistochemistry(IHC) technology, which requires a long cycle for diagnosis, resulting in a delay in effective treatment of patients with breast cancer. To improve the diagnostic rate, we proposed a machine learning method that predicted molecular subtypes of breast cancer from H&E-stained histopathological images. Although some molecular subtype prediction methods have been suggested, they are noisy and lack clinical evidence. To address these issues, we introduced a patch filter-based molecular subtype prediction (PFMSP) method using spatial transcriptomics data, training a patch filter with spatial transcriptomics data first, and then the trained filter was used to select valuable patches for molecular subtype prediction in other H&E-stained histopathological images. These valuable patches contained one or more genes expressed of ESR1, ESR2, PGR, and ERBB2. We evaluated the performance of our method on the spatial transcriptomics(ST) dataset and the TCGA-BRCA dataset, and the patches filtered by the patch filter achieved accuracies of 80% and 73.91% in predicting molecular subtypes on the ST and TCGA-BRCA datasets, respectively. Experimental results showed that the use of the trained patch filter to filter patches was beneficial to improving precision in predicting molecular subtypes of breast cancer.</p>","PeriodicalId":18770,"journal":{"name":"Multimedia Tools and Applications","volume":"96 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving breast cancer classification with mRMR + SS0 + WSVM: a hybrid approach
Pub Date: 2024-09-06 | DOI: 10.1007/s11042-024-20146-6
Abrar Yaqoob, Navneet Kumar Verma, Rabia Musheer Aziz
Detecting breast cancer from histopathological images is time-consuming because of their volume and complexity, so speeding up early detection is crucial for timely medical intervention. Accurately classifying microarray data is challenging because of its dimensionality and noise, and researchers use gene selection techniques to address this issue. Additional techniques such as pre-processing, ensembling, and normalization aim to improve data quality; they can also benefit classification approaches by helping to resolve overfitting and data imbalance, and a more refined selection pipeline can boost classification accuracy while reducing overfitting. Recent technological advances have driven automated breast cancer diagnosis. This research introduces a method using Salp Swarm Optimization (SSO) and Support Vector Machines (SVMs) for gene selection and breast tumor classification. The process involves two stages: mRMR preselects genes based on their relevance and distinctiveness, followed by an SSO-integrated weighted SVM (WSVM) for classification. The WSVM, aided by SSO, trims redundant genes and assigns weights that reflect gene significance, and SSO also fine-tunes kernel parameters based on the gene weights. Experimental results show the effectiveness of the mRMR-SSO-WSVM method, which achieves high accuracy, precision, recall, and F1-score on breast gene expression datasets: specifically, an accuracy of 99.62%, precision of 100%, recall of 100%, and an F1-score of 99.10%. Comparative analysis demonstrates the superiority of the approach, with a 4% improvement in accuracy and a 3.5% increase in F1-score over traditional SVM-based methods. In conclusion, this study demonstrates the potential of the proposed mRMR-SSO-WSVM methodology for breast cancer classification, offering significant improvements in performance metrics and effectively addressing challenges such as overfitting and data imbalance.
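As a point of reference for the first stage, a greedy mRMR-style preselection can be sketched with scikit-learn's mutual-information estimators: pick genes with high mutual information with the class label and low average mutual information with the genes already chosen. This is a generic sketch, not the authors' implementation, and the SSO-weighted SVM stage is omitted.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr_select(X: np.ndarray, y: np.ndarray, k: int = 50) -> list:
    """Greedy mRMR: maximize relevance to the label minus mean redundancy
    with already-selected genes. X: samples x genes, y: class labels."""
    relevance = mutual_info_classif(X, y, random_state=0)
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best_gene, best_score = None, -np.inf
        for j in (j for j in range(X.shape[1]) if j not in selected):
            redundancy = mutual_info_regression(X[:, selected], X[:, j], random_state=0).mean()
            score = relevance[j] - redundancy        # mRMR criterion
            if score > best_score:
                best_gene, best_score = j, score
        selected.append(best_gene)
    return selected
```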
{"title":"Improving breast cancer classification with mRMR + SS0 + WSVM: a hybrid approach","authors":"Abrar Yaqoob, Navneet Kumar Verma, Rabia Musheer Aziz","doi":"10.1007/s11042-024-20146-6","DOIUrl":"https://doi.org/10.1007/s11042-024-20146-6","url":null,"abstract":"<p>Detecting breast cancer through histopathological images is time-consuming due to their volume and complexity. Speeding up early detection is crucial for timely medical intervention. Accurately classifying microarray data faces challenges from its dimensionality and noise. Researchers use gene selection techniques to address this issue. Additional techniques like pre-processing, ensemble, and normalization procedures aim to improve image quality. These can also impact classification approaches, helping resolve overfitting and data balance issues. A more sophisticated version could potentially boost classification accuracy while reducing overfitting. Recent technological advances have driven automated breast cancer diagnosis. This research introduces a novel method using Salp Swarm Optimization (SSO) and Support Vector Machines (SVMs) for gene selection and breast tumor classification. The process involves two stages: mRMR preselects genes based on their relevance and distinctiveness, followed by SSO-integrated WSVM for classification. WSVM, aided by SSO, trims redundant genes and assigns weights, enhancing gene significance. SSO also fine-tunes kernel parameters based on gene weights. Experimental results showcase the effectiveness of the mRMR-SSO-WSVM method, achieving high accuracy, precision, recall, and F1-score on breast gene expression datasets. Specifically, our approach achieved an accuracy of 99.62%, precision of 100%, recall of 100%, and an F1-score of 99.10%. Comparative analysis with existing methods demonstrates the superiority of our approach, with a 4% improvement in accuracy and a 3.5% increase in F1-score over traditional SVM-based methods. In conclusion, this study demonstrates the potential of the proposed mRMR-SSO-WSVM methodology in advancing breast cancer classification, offering significant improvements in performance metrics and effectively addressing challenges such as overfitting and data imbalance.</p>","PeriodicalId":18770,"journal":{"name":"Multimedia Tools and Applications","volume":"55 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An efficient crack detection and leakage monitoring in liquid metal pipelines using a novel BRetN and TCK-LSTM techniques
Pub Date: 2024-09-06 | DOI: 10.1007/s11042-024-20170-6
Praveen Sankarasubramanian
Nowadays, pipelines are the safest, most economical, and most efficient means of transporting petroleum products and other chemical fluids, but faults in pipelines cause resource wastage and environmental pollution. Most existing work focuses either on surface Crack Detection (CD) or on Leakage Detection (LD) of pipes, with limited features. Hence, efficient crack detection and leakage monitoring are proposed based on Acoustic Emission (AE) signal and AE image features, using new Berout Retina Net (BRetN) and Tent Chaotic Kaiming-centric Long Short Term Memory (TCK-LSTM) methodologies. The process starts with gathering the input data, followed by preprocessing. Cracks are then detected using BRetN, and features are extracted from the AE signals. In parallel, the AE signal is transformed into an AE image using the Continuous Wavelet Transform (CWT), AE image features are extracted, and the AE signal and AE image features are integrated. The optimal features are then chosen using the Gorilla Troops Optimizer (GTO). Finally, the TCK-LSTM model is used to detect the leakage level of the pipeline. The experimental outcomes show that the proposed framework detects crack and leakage levels with 98.14% accuracy, 95.37% precision, and 98.84% specificity when compared with existing techniques.
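The AE-signal-to-image step relies on the continuous wavelet transform, which can be sketched with PyWavelets as below: the magnitude of the CWT coefficients forms a scales-by-time scalogram that is rescaled to an 8-bit image. The Morlet wavelet and scale range are illustrative choices; BRetN and TCK-LSTM themselves are not reproduced.

```python
import numpy as np
import pywt

def ae_signal_to_scalogram(signal: np.ndarray, fs: float, n_scales: int = 64) -> np.ndarray:
    """Return a 2-D scalogram (scales x time) of |CWT| coefficients, scaled to [0, 255]."""
    scales = np.arange(1, n_scales + 1)
    coeffs, _ = pywt.cwt(signal, scales, "morl", sampling_period=1.0 / fs)
    mag = np.abs(coeffs)
    mag = 255.0 * (mag - mag.min()) / (np.ptp(mag) + 1e-12)
    return mag.astype(np.uint8)

# img = ae_signal_to_scalogram(np.random.randn(4096), fs=1_000_000)
```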
{"title":"An efficient crack detection and leakage monitoring in liquid metal pipelines using a novel BRetN and TCK-LSTM techniques","authors":"Praveen Sankarasubramanian","doi":"10.1007/s11042-024-20170-6","DOIUrl":"https://doi.org/10.1007/s11042-024-20170-6","url":null,"abstract":"<p>Nowadays, the pipeline system has the safest, most economical, and most efficient means of transporting petroleum products and other chemical fluids. But, the faults in pipelines cause resource wastage and environmental pollution. Most of the existing works focused either on the surface Crack Detection (CD) or Leakage Detection (LD) of pipes with limited features. Hence, efficient crack detection and leakage monitoring are proposed based on the Acoustic Emission (AE) signal and AE image features using a new Berout Retina Net (BRetN) and Tent Chaotic Kaiming-centric Long Short Term Memory (TCK-LSTM) methodologies. The process initiates from the gathering of input data, followed by preprocessing. Then, the cracks are detected by utilizing Berout Retina Net (BRetN), and the features of AE signals are retrieved. On the other hand, the AE signal is transformed into an AE image using Continuous Wavelet Transform (CWT). Further, the AE image features are extracted, followed by the integration of both the AE signal and AE image features. Further, the optimal features are chosen by using Gorilla Troops Optimizer (GTO). Eventually, the TCK-LSTM model is used for detecting the leakage level of the pipeline. The experimental outcomes illustrated that the proposed framework detected crack and leakage levels with 98.14% accuracy, 95.37% precision, and 98.84% specificity when analogizing over the existing techniques.</p>","PeriodicalId":18770,"journal":{"name":"Multimedia Tools and Applications","volume":"4 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deception detection with multi-scale feature and multi-head attention in videos
Pub Date: 2024-09-06 | DOI: 10.1007/s11042-024-20124-y
Shusen Yuan, Guanqun Zhou, Hongbo Xing, Youjun Jiang, Yewen Cao, Mingqiang Yang
Detecting deception in videos is a challenging task, especially in real-world situations. In this study, we extract the facial action units from micro-expressions and then calculate the frequency and the number of occurrences of each action unit. To capture information at different scales, we propose a combination of a Multi-Scale Feature (MSF) model and Multi-Head Attention (MHA). The MSF model consists of two CNNs with different convolution kernels, with GELU used as the activation function. The MHA model divides the input features into different subspaces and generates attention for each subspace to make the features more effective. We evaluated the proposed method on the Real-life Trial dataset and achieved an accuracy of 87.81%. The results show that the MSF and MHA models increase the accuracy of the deception detection task, and comparative experiments demonstrate the effectiveness of the proposed method.
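A hedged PyTorch sketch of the MSF + MHA combination described above is given below: two 1-D convolution branches with different kernel sizes and GELU activations extract multi-scale features from the per-frame action-unit sequence, and multi-head self-attention mixes them before a binary classifier. Layer widths, kernel sizes, and the input layout are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class MSFMHADetector(nn.Module):
    def __init__(self, n_action_units: int = 35, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        # Two branches with different receptive fields (the "multi-scale" part).
        self.branch_small = nn.Conv1d(n_action_units, d_model // 2, kernel_size=3, padding=1)
        self.branch_large = nn.Conv1d(n_action_units, d_model // 2, kernel_size=7, padding=3)
        self.act = nn.GELU()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, 2)             # truthful vs. deceptive

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_action_units) action-unit activations per frame
        x = x.transpose(1, 2)                         # -> (batch, channels, time) for Conv1d
        feats = torch.cat([self.act(self.branch_small(x)),
                           self.act(self.branch_large(x))], dim=1).transpose(1, 2)
        attended, _ = self.attn(feats, feats, feats)  # self-attention over time steps
        return self.head(attended.mean(dim=1))        # pool over time, then classify

# logits = MSFMHADetector()(torch.randn(8, 120, 35))
```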
{"title":"Deception detection with multi-scale feature and multi-head attention in videos","authors":"Shusen Yuan, Guanqun Zhou, Hongbo Xing, Youjun Jiang, Yewen Cao, Mingqiang Yang","doi":"10.1007/s11042-024-20124-y","DOIUrl":"https://doi.org/10.1007/s11042-024-20124-y","url":null,"abstract":"<p>Detecting deception in videos has been a challenging task, especially in real world situations. In this study, we extracted the facial action units from the micro-expression, and then calculated the frequency and the number of occurrences of each action unit. To get more information on different scales, we proposed a combination scheme of Multi-Scale Feature (MSF) model and Multi-Head Attention (MHA). The MSF model consists of two CNN with different convolution kernels and GELU is used as the active function. The MHA model was designed to divide the input features into different subspaces and generate attention for each subspace to make the features more effective. We evaluated our proposed method on the Real-life Trial dataset and achieved an accuracy of 87.81%. The results show that the MSF and MHA model could increase the accuracy of deception detection task. And the comparative experiment demonstrates the effectiveness of our proposed method.</p>","PeriodicalId":18770,"journal":{"name":"Multimedia Tools and Applications","volume":"24 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}