Pub Date: 2024-09-03 DOI: 10.1007/s11042-024-20070-9
Ahmed Boussihmed, Khalid El Makkaoui, Ibrahim Ouahbi, Yassine Maleh, Abdelaziz Chetouani
This paper studies the feasibility of implementing deep learning on resource-restricted IoT devices for real-world applications. We introduce a TinyML model for sidewalk obstacle detection, designed to assist people with visual impairments, who are often hindered by urban navigation challenges. Our investigation focuses on adapting traditionally computationally intensive deep learning models to the constraints of IoT systems, where both memory and processing power are markedly limited. With a footprint of just 1.93 MB and a mean average precision (mAP) of 50%, the proposed model is well suited to lightweight IoT devices. We demonstrate an inference time of 96.2 milliseconds on a standard CPU, a substantial step toward real-time processing in assistive technologies. The results emphasize TinyML's potential to bridge the gap between advanced machine learning capabilities and the accessibility demands of assistive devices for visually impaired individuals.
Title: A TinyML model for sidewalk obstacle detection: aiding the blind and visually impaired people. Journal: Multimedia Tools and Applications.
Pub Date: 2024-09-02 DOI: 10.1007/s11042-024-20073-6
Simranjit Singh, Mohit Sajwan, Sonal Kukreja
In the past few years, with the increase in population and health concerns, there has been a need for efficient health monitoring solutions that can help patients monitor their health consistently to be aware of any health risks at the initial stage. The advancement in sensing and smart technologies helps monitor human behaviors to predict health risks. In this work, a dynamic decision-based activity prediction system is proposed using Random Forest, SVM, Decision Trees, Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) on an edge device. We train the models using features from the MHealth dataset, such as acceleration, rate of turn, and magnetic field, to predict activities such as standing, climbing, running, and jogging, collected from various sensors. Our framework dynamically selects between machine learning (ML) and deep learning (DL) algorithms based on real-time data size and edge device capabilities, ensuring optimal performance and resource utilization. The results for the proposed models are compared and analyzed. The experimental results indicate that among all machine learning methods, Random Forest achieves the highest overall accuracy at 98%, while in deep learning algorithms, both LSTM and GRU reach a maximum accuracy of 98%.
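The dynamic choice between ML and DL models described in the abstract can be pictured with a small rule-based selector. This is only a sketch under stated assumptions: the `EdgeDevice` fields, the function name `select_model_family`, and both thresholds are hypothetical and do not come from the paper, which does not give its decision rule here.

```python
from dataclasses import dataclass


@dataclass
class EdgeDevice:
    """Hypothetical description of an edge device's resources."""
    free_memory_mb: float
    has_accelerator: bool


def select_model_family(n_samples: int, device: EdgeDevice,
                        min_dl_samples: int = 10_000,
                        min_dl_memory_mb: float = 512.0) -> str:
    """Return "DL" (e.g. LSTM/GRU) or "ML" (e.g. Random Forest, SVM).

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    enough_data = n_samples >= min_dl_samples
    enough_hardware = (device.has_accelerator
                       or device.free_memory_mb >= min_dl_memory_mb)
    # Deep models are chosen only when data volume and hardware both allow it.
    return "DL" if enough_data and enough_hardware else "ML"


small_board = EdgeDevice(free_memory_mb=128.0, has_accelerator=False)
print(select_model_family(50_000, small_board))
```

In practice the thresholds would be tuned per device, for example from measured inference latency and memory use.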
Title: Decision-based framework to facilitate EDGE computing in smart health care. Journal: Multimedia Tools and Applications.
Pub Date: 2024-09-02 DOI: 10.1007/s11042-024-20106-0
Cheng-Hsing Yang, Chi-Yao Weng, Chia-Ling Hung, Shiuh-Jeng WANG
Reversible data hiding in encrypted images (RDHEI) has attracted growing attention because it can be used for both information protection and image encryption. Many RDHEI studies embed confidential information by inverting the most significant bit (MSB), but they may be subject to errors when extracting the hidden information. This paper improves the MSB-inversion approach and proposes a new RDHEI technique. Our approach first hides in the image the positions of the blocks that would be misinterpreted during recovery of the original image, and then encrypts the image. The MSB inversion strategy is then applied to embed the secret messages in the encrypted image. Since the location information of the error blocks is pre-hidden in the image, the secret message can be correctly extracted and the image fully recovered. We also created a multi-regular block complexity formula to determine the secret bits hidden in a block and recover the original block. In addition, we extended the design to four methods that cover various segmentation strategies and complexity calculation methods. According to the experimental results, our method can successfully extract the secret message and recover the original image intact after the encrypted image is embedded with the secret message. Across 39 test images of different sizes, we achieve an average PSNR of 40.633 dB and an average embedding capacity of 46,298.46 bits.
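The general idea of MSB inversion with a block complexity test can be illustrated with a simplified sketch. It is not the authors' scheme: the XOR stream encryption, the checkerboard flip pattern, and the sum-of-absolute-differences complexity measure are all assumptions chosen for illustration, and the error-block pre-hiding step is omitted. One bit is embedded per block by flipping the MSBs of half the pixels, and the receiver takes the smoother of the two candidate blocks as the original.

```python
import numpy as np


def complexity(block: np.ndarray) -> int:
    """Sum of absolute differences between neighbouring pixels (assumed measure)."""
    b = block.astype(np.int64)
    return int(np.abs(np.diff(b, axis=0)).sum() + np.abs(np.diff(b, axis=1)).sum())


def checker_mask(size: int) -> np.ndarray:
    """0x80 on checkerboard positions, 0 elsewhere (assumed flip pattern)."""
    i, j = np.indices((size, size))
    return np.where((i + j) % 2 == 0, 0x80, 0).astype(np.uint8)


def embed(encrypted: np.ndarray, bits: list, size: int) -> np.ndarray:
    """Embed one bit per block: flip the masked MSBs when the bit is 1."""
    out = encrypted.copy()
    mask = checker_mask(size)
    k = 0
    for r in range(0, out.shape[0], size):
        for c in range(0, out.shape[1], size):
            if k < len(bits) and bits[k]:
                out[r:r + size, c:c + size] ^= mask
            k += 1
    return out


def extract_and_recover(decrypted: np.ndarray, size: int):
    """Pick the smoother candidate per block; return (bits, recovered image)."""
    img = decrypted.copy()
    mask = checker_mask(size)
    bits = []
    for r in range(0, img.shape[0], size):
        for c in range(0, img.shape[1], size):
            block = img[r:r + size, c:c + size]
            flipped = block ^ mask
            if complexity(flipped) < complexity(block):
                img[r:r + size, c:c + size] = flipped
                bits.append(1)
            else:
                bits.append(0)
    return bits, img


# Smooth synthetic 8x8 image; image sides must be multiples of the block size.
i, j = np.indices((8, 8))
image = (100 + (i + j) % 4).astype(np.uint8)
key = np.random.default_rng(0).integers(0, 256, size=image.shape, dtype=np.uint8)
secret = [1, 0, 1, 1]

marked = embed(image ^ key, secret, size=4)           # embed in the encrypted domain
bits, recovered = extract_and_recover(marked ^ key, size=4)
```

On textured blocks the smoothness test can pick the wrong candidate, which is the kind of extraction error the abstract says the pre-hidden block positions are meant to prevent.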
Title: Efficient reversible data hiding in encrypted images using Block Complexity and most significant bit inversion strategy. Journal: Multimedia Tools and Applications.
Pub Date: 2024-09-02 DOI: 10.1007/s11042-024-20058-5
Shang Zhuge, Zhiheng Zhou, Wenlue Zhou, Jiangfeng Wu, Ming Deng, Ming Dai
The central challenge in noisy image segmentation is how to effectively suppress or remove noise while preserving important features, thereby achieving accurate image segmentation. Active contour models are widely used for these tasks. Nevertheless, they are unable to remove strong noise while segmenting images with weak edges. To mitigate the adverse effects of intensity non-uniformity on segmentation while preserving image details, we introduce an adaptive fractional differential active contour image segmentation method. Our method adaptively defines the fractional order using the proposed entropy, which enhances the edge extraction ability of image entropy in the presence of intensity inhomogeneity and noise; different orders are applied to different pixels. The introduced entropy is resilient to significant noise, thereby enhancing the model's capacity to delineate boundaries accurately and smoothly. Empirical evaluations on various test images substantiate the model's efficacy in addressing intensity inhomogeneity and achieving high segmentation accuracy.
Title: Noisy image segmentation utilizing entropy-adaptive fractional differential-driven active contours. Journal: Multimedia Tools and Applications.
Pub Date: 2024-09-02 DOI: 10.1007/s11042-024-20065-6
Nirmaladevi P, Asokan Ramasamy
An efficient fusion-based speckle denoising algorithm is proposed in this paper to improve the edge and detail preservation of ultrasound (US) images. This is accomplished by integrating complementary information from two wavelet-despeckled source images. Of the two source images, one denoises the coefficients above the threshold to improve noise removal, and the other denoises the coefficients below the threshold to preserve fine details. For fusion, a two-stage algorithm is proposed that uses a novel fusion rule exploiting the inter-scale and intra-scale dependency of the wavelet coefficients. The first stage performs an inter-scale activity-based fusion, and the second stage performs an intra-scale dependency-based fusion of the detail subbands of the two images. The approximation coefficients are fused with a maximum rule. The resulting fused image performs better than existing wavelet-based approaches and other fusion techniques in terms of Peak Signal-to-Noise Ratio (PSNR), Mean Square Error (MSE), Structural Similarity Index Measure (SSIM), Equivalent Number of Looks (ENL), and Edge Preservation Index (EPI).
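Three of the quality measures named in the abstract have standard textbook definitions, sketched below with NumPy. The function names and the 8-bit peak value of 255 are assumptions for illustration; the paper's own evaluation code is not shown in the source. ENL is computed as squared mean over variance and is normally measured on a visually homogeneous region.

```python
import numpy as np


def mse(reference: np.ndarray, test: np.ndarray) -> float:
    """Mean square error between two images of equal shape."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    return float(np.mean(diff ** 2))


def psnr(reference: np.ndarray, test: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(reference, test)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))


def enl(region: np.ndarray) -> float:
    """Equivalent number of looks: mean^2 / variance of a homogeneous region."""
    r = region.astype(np.float64)
    return float(r.mean() ** 2 / r.var())


clean = np.zeros((4, 4), dtype=np.uint8)
noisy = np.full((4, 4), 10, dtype=np.uint8)
print(mse(clean, noisy), psnr(clean, noisy))
```

A higher ENL indicates stronger speckle suppression in the chosen region, while PSNR and MSE require a reference image.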
Title: An undecimated wavelet based adaptive fusion filtering for ultrasound despeckling. Journal: Multimedia Tools and Applications.
Pub Date: 2024-09-02 DOI: 10.1007/s11042-023-16777-w
Fatma Khallaf, Walid El-Shafai, El-Sayed M. El-Rabaie, Fathi E. Abd El-Samie
In recent years, smart devices and associated technologies, such as the Internet of Things (IoT), Industrial Internet of Things (IIoT), and Internet of Medical Things (IoMT), have proliferated substantially. However, the limited processing power and storage capacity of smart devices make them vulnerable to cyberattacks, rendering traditional security and cryptography techniques inadequate. To address these challenges, blockchain (BC) technology has emerged as a promising solution. This study introduces an efficient framework for the Internet of Healthcare Things (IoHT), presenting a novel cryptosystem for color medical images using BC technology in conjunction with the IoT, Secure Hash Algorithm 256-bit (SHA256), shuffling, and bitwise XOR operations. The encryption scheme is specifically designed for an IIoT grid network computing system and relies on diffusion and confusion principles. The strength of the proposed cryptosystem is evaluated against differential attacks with several comprehensive metrics. Simulation results and theoretical analysis demonstrate the cryptosystem's effectiveness, showing that it provides high levels of security and immunity to data leakage. The proposed cryptosystem offers a versatile range of technical solutions and strategies that are adaptable to various scenarios. The evaluation metrics, with approximate values of 99.61% for Number of Pixels Change Rate (NPCR), 33.46% for Unified Average Changed Intensity (UACI), and 8 for information entropy, closely align with the ideal outcomes. Consequently, this paper contributes to the advancement of secure and private systems for medical image encryption based on BC technology, potentially mitigating the risks associated with cyberattacks on smart medical devices.
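The three metrics quoted above have standard definitions for 8-bit images, which the following NumPy sketch implements. The function names are illustrative assumptions, and the random arrays stand in for two cipher images whose plain images differ in one pixel; they are not the paper's data. For ideal 8-bit cipher images the expected values are roughly 99.61% (NPCR), 33.46% (UACI), and 8 bits (entropy), which matches the figures in the abstract.

```python
import numpy as np


def npcr(c1: np.ndarray, c2: np.ndarray) -> float:
    """Number of Pixels Change Rate: percentage of positions that differ."""
    return float(100.0 * np.mean(c1 != c2))


def uaci(c1: np.ndarray, c2: np.ndarray) -> float:
    """Unified Average Changed Intensity: mean |c1 - c2| / 255, in percent."""
    diff = np.abs(c1.astype(np.int16) - c2.astype(np.int16))
    return float(100.0 * np.mean(diff / 255.0))


def entropy(img: np.ndarray) -> float:
    """Shannon entropy in bits per pixel of an 8-bit image."""
    counts = np.bincount(img.ravel(), minlength=256)
    p = counts[counts > 0] / img.size
    return float(-(p * np.log2(p)).sum())


# Random stand-ins for two cipher images (assumption for illustration only).
rng = np.random.default_rng(42)
cipher_a = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)
cipher_b = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)
print(npcr(cipher_a, cipher_b), uaci(cipher_a, cipher_b), entropy(cipher_a))
```

For color images, the same functions can be applied per channel or to the flattened array.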
Title: Blockchain-based color medical image cryptosystem for industrial Internet of Healthcare Things (IoHT). Journal: Multimedia Tools and Applications.
Pub Date: 2024-09-02 DOI: 10.1007/s11042-024-20109-x
Shankar M. Patil, Bhawana S. Dakhare, Shilpa M. Satre, Shivaji D. Pawar
Blockchain, a distributed ledger technology utilizing cryptographic methods, offers promising solutions for enhancing security and privacy in smart healthcare big data (HBD) management systems. However, scalability remains a significant challenge, as the decentralized nature of blockchain networks often leads to performance bottlenecks and increased transaction costs, especially when managing large volumes of healthcare data. This paper presents a Blockchain-Based Privacy Preservation Framework (PPF) designed to mitigate cyber threats in smart HBD management systems. The framework integrates blockchain technology with privacy-preserving mechanisms, including singular public key cryptography for off-chain data encryption and a private data storage system built on linked ring signatures based on elliptic curve cryptography without certificates. To protect the ecosystem from cyber-attacks targeting data storage facilities and service providers, secure multiparty computation is employed. The proposed solution is evaluated using Python for analysis. Results show an average delay of 27 s for a 2 ms block time and 53 s for a 250 ms block time. For a file size of 45 MB, the response time is notably low at 9.5 s. The findings demonstrate the framework's viability, employing Hyperledger smart contracts to achieve the required level of security while improving system efficiency compared to existing solutions.
Title: Blockchain-based privacy preservation framework for preventing cyberattacks in smart healthcare big data management systems. Journal: Multimedia Tools and Applications.
Pub Date: 2024-09-02 DOI: 10.1007/s11042-024-19864-8
Noorah Jaber Faisal Jaber, Ayhan Akbas
Skin cancer has garnered significant attention from the scientific community worldwide, with melanoma being the most lethal, though uncommon, form of the disease. Melanoma occurs due to the uncontrolled growth of melanocyte cells, which are responsible for imparting color to the skin. If left untreated, melanoma can spread throughout the body and cause death. Early detection of melanoma can lower its mortality rate. In this study, we propose a robust Convolutional Neural Network (CNN)-based method for classifying melanoma images as healthy or non-healthy. To train and test the model, we used public datasets from the International Skin Imaging Collaboration (ISIC). Additionally, we compared our method with other classification techniques, including Support Vector Machine (SVM), Decision Tree, and K-Nearest Neighbors (K-NN), using the Harris Hawks Optimization algorithm. Our method outperformed the other approaches.
Title: Melanoma skin cancer detection based on deep learning methods and binary Harris Hawk optimization. Journal: Multimedia Tools and Applications.
Pub Date: 2024-09-02 DOI: 10.1007/s11042-024-20163-5
Vijayameenakshi T. M, Swapna T. R
Music genre classification is one of the most interesting topics in digital music. Classifying genres is inherently subjective, and different listeners may perceive genres in various ways. Furthermore, it can be difficult to classify some songs accurately because they belong to several genres. Genres are wide and ill-defined categories, so genre-based measures are inherently inaccurate and coarse, and not every piece of music fits cleanly into a particular genre. Many papers based on deep neural networks perform sound recognition and classification with input images of audio, which do not affect the time–frequency representation of a signal. The traditional method adds waveform augmentation to the audio signal, thereby increasing the network's training speed. This paper considers music genre classification with the convolution temporal pooling framework and explores the impact of adding the SpecAugment method to augment the spectrogram itself. The augmented spectrogram is then fed into a convolutional temporal pooling network. In this model, the temporal and pooling layers identify the genre pattern and classify the songs by genre. The model also predicts the duplication that will occur in the given sample. We apply this model to the GTZAN dataset, a widely used dataset for music genre classification. This method improves the identification of Rock and Pop songs and also eliminates the replication of songs. The trained model reports an accuracy of 0.75 when trained on 30-s audio files.
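The spectrogram masking at the core of SpecAugment can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's pipeline: it applies one frequency mask and one time mask, omits SpecAugment's time-warping step, and the function name, mask widths, and fill value are assumptions.

```python
import numpy as np


def spec_augment(spec: np.ndarray, rng: np.random.Generator,
                 max_freq_width: int = 8, max_time_width: int = 20,
                 fill: float = 0.0) -> np.ndarray:
    """Return a copy of a (n_freq, n_time) spectrogram with one frequency
    band and one time span replaced by `fill`. Widths are drawn uniformly
    from 0 up to the given maxima (illustrative defaults)."""
    out = spec.copy()
    n_freq, n_time = out.shape

    # Frequency mask: blank a random band of consecutive frequency bins.
    f = int(rng.integers(0, min(max_freq_width, n_freq) + 1))
    f0 = int(rng.integers(0, n_freq - f + 1))
    out[f0:f0 + f, :] = fill

    # Time mask: blank a random span of consecutive frames.
    t = int(rng.integers(0, min(max_time_width, n_time) + 1))
    t0 = int(rng.integers(0, n_time - t + 1))
    out[:, t0:t0 + t] = fill
    return out


rng = np.random.default_rng(0)
spectrogram = np.ones((64, 100))           # stand-in for a mel spectrogram
augmented = spec_augment(spectrogram, rng)
```

Because the masks are redrawn on every call, each training epoch sees a differently occluded version of the same clip.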
Title: Music genre classification using convolution temporal pooling network. Journal: Multimedia Tools and Applications.
Pub Date : 2024-09-02 DOI: 10.1007/s11042-024-20141-x
Samira Mavaddati
Accurate brain tumor classification using magnetic resonance imaging (MRI) is crucial for guiding patient treatment decisions. However, differentiating tumor types can be challenging due to subtle variations in texture. This study investigates the potential of deep learning, specifically a 50-layer ResNet architecture, for improved brain tumor classification from MRI scans. Transfer learning is used to enhance model performance, and the resulting model is compared with other deep learning architectures, such as CNN and RNN, and with a dictionary learning-based classifier. The results demonstrate that the ResNet-50 model achieves superior accuracy, sensitivity, and robustness compared to the evaluated methods. This highlights the novelty of our work: combining a deep residual network (ResNet-50) with transfer learning for brain tumor classification. With an accuracy rate of over 99.85%, this approach offers a promising avenue for improved diagnostic accuracy and potentially better patient outcomes in a clinical setting. The experimental results show that the proposed approach has significant potential to improve the accuracy of brain tumor classification using MRI and medical knowledge. Additionally, combining deep learning structures with transfer learning yields a novel and effective solution for brain tumor classification.
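The accuracy and sensitivity figures used to compare classifiers can be made concrete with a small sketch. It assumes a multi-class confusion matrix in which entry `[i][j]` counts samples of true class `i` predicted as class `j`, and it computes overall accuracy and per-class sensitivity (recall). The function name `accuracy_and_sensitivity` and the three-class counts are hypothetical values chosen for illustration; they are not results from the paper.

```python
def accuracy_and_sensitivity(confusion):
    """Compute overall accuracy and per-class sensitivity from a confusion matrix.

    confusion[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    accuracy = correct / total

    # Sensitivity of class i = true positives / all samples of true class i.
    sensitivity = []
    for i, row in enumerate(confusion):
        row_total = sum(row)
        sensitivity.append(row[i] / row_total if row_total else 0.0)
    return accuracy, sensitivity


# Hypothetical counts for three tumor classes (illustrative only).
toy_confusion = [
    [48, 1, 1],
    [2, 45, 3],
    [0, 2, 48],
]
acc, sens = accuracy_and_sensitivity(toy_confusion)
```

Reporting sensitivity per class alongside accuracy shows whether one tumor type is missed more often than the others, which a single accuracy figure can hide.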
{"title":"Brain tumors classification using deep models and transfer learning","authors":"Samira Mavaddati","doi":"10.1007/s11042-024-20141-x","DOIUrl":"https://doi.org/10.1007/s11042-024-20141-x","journal":{"name":"Multimedia Tools and Applications"},"PeriodicalIF":3.6,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","isOpenAccess":false}