
Latest publications: 2021 IEEE International Conference on Signal and Image Processing Applications (ICSIPA)

An Adaptive Data Processing Framework for Cost-Effective COVID-19 and Pneumonia Detection
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576805
Kin Wai Lee, R. Chin
Medical imaging modalities have shown great potential for faster and more efficient disease transmission control and containment. In this paper, we propose a cost-effective COVID-19 and pneumonia detection framework using CT scans acquired from several hospitals. To this end, we incorporate a novel data processing framework that utilizes 3D and 2D CT scans to diversify the trainable inputs in a resource-limited setting. Moreover, we empirically demonstrate the significance of several data processing schemes for our COVID-19 and pneumonia detection network. Experimental results show that our proposed pneumonia detection network is comparable to other imaging-based pneumonia detection systems, with a 93% mean AUC and an 85.22% mean accuracy score on generalized datasets. Additionally, our proposed data processing framework can easily be adapted to other applications of the CT modality, especially in cost-effective and resource-limited scenarios such as breast cancer detection and pulmonary nodule diagnosis.
Citations: 6
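The 3D-plus-2D input scheme described in the abstract can be illustrated with a minimal sketch: decomposing a 3D CT volume into 2D axial slices to multiply the trainable inputs from a single scan. The volume shape, stride parameter, and function name here are illustrative assumptions, not details from the paper:

```python
import numpy as np

def volume_to_slices(volume, stride=1):
    """Decompose a 3D CT volume of shape (depth, H, W) into 2D axial slices.

    Taking every `stride`-th slice is one cheap way to diversify the pool
    of 2D training inputs obtained from a single 3D scan.
    """
    return [volume[i] for i in range(0, volume.shape[0], stride)]

# A synthetic 64-slice "scan" stands in for a real CT volume.
scan = np.zeros((64, 512, 512), dtype=np.float32)
slices = volume_to_slices(scan, stride=2)
print(len(slices))  # 32 slices from one volume
```

In a resource-limited setting, the 2D slices can then be mixed with full 3D volumes (or sub-volumes) in the same training pool, which is the spirit of the diversification the abstract describes.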
A Simplified Skeleton Joints Based Approach For Human Action Recognition
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576770
N. Malik, S. Abu-Bakar, U. U. Sheikh
Growing technological development in computer vision in general, and human action recognition (HAR) in particular, has attracted an increasing number of researchers from various disciplines. Among the many challenges in HAR, a major issue is complex modelling: models with many parameters are troublesome to train and require heavily configured machines for real-time recognition. There is therefore a need for a simplified method that reduces complexity without compromising accuracy. To address this issue, this paper proposes an approach that extracts the mean, variance and median of the skeleton joint locations and uses them directly in the classification process. The system extracts 2D skeleton features from the MCAD dataset with the help of the OpenPose technique, which recovers skeleton features from 2D images rather than from 3D images or a depth sensor. Hence, neither the RGB images nor the skeleton images themselves are used in the recognition process. The method shows promising performance, with an accuracy of 73.8% when tested on the MCAD dataset.
Citations: 2
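The feature extraction step described above (mean, variance and median of skeleton joint locations) can be sketched as follows. The joint array shape and the population-variance convention are illustrative assumptions; OpenPose typically yields 18 or 25 keypoints per person:

```python
import numpy as np

def joint_statistics(joints):
    """Compute the mean, variance and median of 2D skeleton joint
    coordinates, yielding a compact fixed-length feature vector.

    joints: array of shape (num_joints, 2) holding (x, y) positions,
    e.g. keypoints produced by OpenPose for one frame.
    """
    joints = np.asarray(joints, dtype=np.float64)
    return np.concatenate([
        joints.mean(axis=0),       # mean x, mean y
        joints.var(axis=0),        # variance of x, variance of y
        np.median(joints, axis=0), # median x, median y
    ])

# Three toy joints; a real pipeline would feed per-frame features to a classifier.
feats = joint_statistics([[0, 0], [2, 2], [4, 4]])
```

The appeal of such statistics is that the feature length is independent of the number of joints detected, keeping the downstream classifier small.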
Image to Image Translation Networks using Perceptual Adversarial Loss Function
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576815
Saleh Altakrouri, S. Usman, N. Ahmad, Taghreed Justinia, N. Noor
Image-to-image translation based on deep learning models is a subject of immense importance in Artificial Intelligence (AI) and Computer Vision (CV). A variety of traditional tasks, such as image colorization, image denoising and image inpainting, are categorized as typical paired image translation tasks. Within computer vision, super-resolution reconstruction is a particularly important field. We propose an improved algorithm to mitigate the issues that arise during super-resolution reconstruction with generative adversarial networks, which are difficult to train. The generated images and the corresponding ground-truth images should share the same fundamental structure in order to produce the required output images. However, the shared basic structure between the input and the corresponding output image is not as close as paired image translation tasks assume, which can greatly degrade the generative model's performance. Traditional GAN-based models for image-to-image translation tasks use a pre-trained classification network; such networks perform better on classification than on image translation because they were trained on features that serve classification. We propose the perceptual-loss-based efficient net Generative Adversarial Network (PL-E-GAN) for super-resolution tasks. Unlike other state-of-the-art image translation models, PL-E-GAN offers a generic architecture for image super-resolution tasks. PL-E-GAN consists of two convolutional neural networks (CNNs): a generator network Gn and a discriminator network Dn. PL-E-GAN employs both a generative adversarial loss and a perceptual adversarial loss as its objective function. Combining these loss functions yields adversarial training in which the networks Gn and Dn are trained alternately. The feasibility and benefits of PL-E-GAN over several image translation models are shown in studies and tested on many image-to-image translation tasks.
Citations: 2
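A generic sketch of how an adversarial term and a perceptual (feature-space) term might be combined in a generator objective, in the spirit of PL-E-GAN. The non-saturating loss form, the weight `lam`, and the use of plain arrays in place of extractor feature maps are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def combined_generator_loss(d_fake, feat_fake, feat_real, lam=0.1):
    """Generator objective combining an adversarial term with a perceptual term.

    d_fake:    discriminator scores in (0, 1) for generated images
    feat_fake: feature maps of the generated images from a fixed extractor
    feat_real: feature maps of the corresponding ground-truth images
    lam:       weight of the perceptual term (illustrative value)
    """
    adv = -np.mean(np.log(d_fake + 1e-8))              # non-saturating GAN loss
    perceptual = np.mean((feat_fake - feat_real) ** 2)  # feature-space MSE
    return adv + lam * perceptual

# When the features already match, only the adversarial term remains.
loss = combined_generator_loss(np.array([0.5]), np.zeros(4), np.zeros(4))
```

During alternating training, the discriminator would be updated with its own adversarial objective while the generator minimizes a sum like this one.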
A Robust Segmentation of Malaria Parasites Detection using Fast k-Means and Enhanced k-Means Clustering Algorithms
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576799
T. A. Aris, A. Nasir, Z. Mohamed
Image segmentation is a crucial stage in image analysis, since it represents the first step towards extracting important information from the image. In summary, this paper presents several clustering approaches for obtaining fully segmented malaria parasite cell images of the Plasmodium falciparum and Plasmodium vivax species in thick smear images. Although k-means is a renowned clustering approach, its effectiveness is still unreliable due to several weaknesses, which motivates the need for a better approach. Specifically, fast k-means and enhanced k-means are adaptations of the existing k-means. Fast k-means eliminates the requirement to retrain cluster centres, reducing the time taken to train image cluster centres, while enhanced k-means introduces the idea of variance and a revised transfer method for clustered members to help distribute data to the appropriate centre throughout the clustering process. The goal of this study is therefore to explore the efficacy of the k-means, fast k-means and enhanced k-means algorithms in achieving a cleanly segmented image in which the whole parasite region of thick smear images is correctly segmented. In practice, about 100 thick blood smear images were analyzed, and the results demonstrate that segmentation via the fast k-means clustering algorithm has splendid segmentation performance, with an accuracy of 99.91%, sensitivity of 75.75%, and specificity of 99.93%.
Citations: 1
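As a baseline for the clustering comparison above, plain k-means on pixel intensities can be sketched as below; the fast and enhanced variants described in the abstract modify how the cluster centres are trained. Initialization scheme and iteration count are illustrative choices:

```python
import numpy as np

def kmeans_segment(pixels, k=2, iters=20, seed=0):
    """Plain k-means on 1-D pixel intensities, returning labels and centres.

    A toy baseline for clustering-based parasite segmentation: with k=2,
    one cluster captures background and the other stained foreground.
    """
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels, dtype=np.float64).ravel()
    centres = rng.choice(pixels, size=k, replace=False)
    for _ in range(iters):
        # Assign each pixel to the nearest centre, then recompute the centres.
        labels = np.argmin(np.abs(pixels[:, None] - centres[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = pixels[labels == j].mean()
    return labels, centres

# Two well-separated intensity groups should split cleanly into two clusters.
img = np.array([10, 12, 11, 200, 205, 198])
labels, centres = kmeans_segment(img, k=2)
```

Fast k-means avoids re-deriving centres for every new image, and enhanced k-means changes the member-transfer rule, but both share this assign-then-update core.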
Malaria Parasite Detection using Residual Attention U-Net
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576814
Chiang Kang Tan, C. M. Goh, S. Aluwee, Siak Wang Khor, C. M. Tyng
Malaria is a life-threatening disease caused by Plasmodium parasites and remains a serious health concern worldwide. It is, however, curable if diagnosed early. Because access to diagnostic expertise is often lacking, particularly in poorly developed and remote areas, an automated yet accurate diagnostic solution is sought. In Malaysia, five types of malaria parasites exist. As an initial proof of concept, automated segmentation of one of these types, Plasmodium falciparum, on thin blood smears was carried out using our proposed Residual Attention U-Net, a type of convolutional neural network used in deep learning systems. Results showed an accuracy of 0.9687 and a precision of 0.9691 when the trained system was applied to verified test data.
Citations: 1
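Attention U-Net variants such as the one above typically weight skip-connection features with an additive attention gate before they are concatenated into the decoder. A minimal sketch of such a gate with illustrative shapes and randomly chosen projections; this is a generic formulation, not the paper's exact architecture:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_gate(x, g, w_x, w_g, psi):
    """Additive attention gate in the style of Attention U-Net.

    x:        skip-connection features, shape (H, W, C)
    g:        gating signal from the coarser decoder level, resampled to (H, W, C)
    w_x, w_g: (C, C_int) projection matrices; psi: (C_int,) scoring vector
    Returns x scaled per pixel by an attention map in (0, 1).
    """
    inter = np.maximum(x @ w_x + g @ w_g, 0.0)  # ReLU(W_x x + W_g g)
    alpha = sigmoid(inter @ psi)                # (H, W) attention coefficients
    return x * alpha[..., None]

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4, 3))
g = rng.normal(size=(4, 4, 3))
out = attention_gate(x, g, rng.normal(size=(3, 2)), rng.normal(size=(3, 2)),
                     rng.normal(size=2))
```

The gate lets the decoder suppress irrelevant skip features (e.g. non-parasite regions) while the residual connections of the backbone ease training of the deeper network.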
An Evaluation of State-of-the-Art Object Detectors for Pornography Detection
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576796
Sui Lyn Hor, H. A. Karim, Mohd Haris Lye Abdullah, Nouar Aldahoul, Sarina Mansor, M. F. A. Fauzi, John See, A. Wazir
Detection of pornographic and nudity content in videos is gaining importance as the Internet grows into a source of exposure to such content. Recent literature has addressed pornography recognition using deep learning techniques such as convolutional neural networks, object detection models and recurrent neural networks, as well as combinations of these methods. In this paper, three pretrained object detection models (YOLOv3, EfficientDet-d7x and Faster R-CNN with a ResNet50 backbone) were tested to compare their performance in detecting pornographic content. Video frames of real humans from the public NPDI dataset were used to form four categories of target content (female breast, female lower body, male lower body and nude human) by cropping the relevant image regions and augmenting them. Results demonstrated that the COCO-pretrained EfficientDet-d7x model achieved the highest overall detection accuracy of 75.61%. Interestingly, YOLOv3's human detection may depend on image quality and/or the presence of external body parts that belong only to humans.
Citations: 4
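Detector evaluations like the one above conventionally match predictions to ground truth by intersection-over-union (IoU): a detection counts as correct when its IoU with a ground-truth box exceeds a threshold, commonly 0.5. A minimal IoU implementation for corner-format boxes (the box format is a standard convention, not taken from the paper):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 0.333... (50 overlap / 150 union)
```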
Camera-Based Remote Photoplethysmography for Physiological Monitoring in Neonatal Intensive Care
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576779
B. Sweely, Ziming Liu, T. Wyatt, Katherine Newnam, H. Qi, Xiaopeng Zhao
The ability to measure heart rate (HR) noninvasively is important in both hospital and home settings because of the role this vital sign plays in health and wellbeing. Despite great advancements and improvements in recent years, safety remains a challenging issue in the neonatal intensive care unit (NICU): the traditional sensors found in NICU incubators require adhesives and wires. The objective of this article was to develop a wireless, noncontact monitoring system that measures multiple physiological parameters from human faces at a distance using a camera and a single-board computer. Experiments were conducted to estimate heart rate. Current practice for measuring HR involves collecting electrocardiogram (ECG) signals from adhesive electrodes placed on various parts of the body, or using a pulse oximeter (PO) typically placed on the ear lobe or a finger. We developed a monitoring system and compared its results to those from a PO. The system is low-cost at under $200, and no comparable system has been reported in the literature, making this a novel implementation. In conclusion, we were able to estimate HR from a distance using a camera-based system. The developed system may have many useful applications in both clinical and home health settings.
Citations: 1
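The core of camera-based photoplethysmography is recovering the pulse frequency from a skin-region intensity trace. A minimal sketch that takes the dominant FFT peak in a plausible heart-rate band; the band limits and the green-channel choice are common rPPG conventions, not necessarily this paper's exact pipeline:

```python
import numpy as np

def estimate_heart_rate(signal, fps):
    """Estimate heart rate (BPM) from a per-frame skin-intensity trace.

    signal: average green-channel intensity over the skin ROI, per frame
    fps:    camera frame rate
    The dominant spectral peak in the 0.7-4.0 Hz band (42-240 BPM) is
    taken as the pulse frequency.
    """
    signal = np.asarray(signal, dtype=np.float64)
    signal = signal - signal.mean()                     # remove the DC level
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)   # frequency bins in Hz
    power = np.abs(np.fft.rfft(signal)) ** 2
    band = (freqs >= 0.7) & (freqs <= 4.0)
    return 60.0 * freqs[band][np.argmax(power[band])]

# Synthetic 1.2 Hz (72 BPM) pulse sampled at 30 fps for 10 s.
t = np.arange(300) / 30.0
trace = 0.5 * np.sin(2 * np.pi * 1.2 * t) + 100.0
print(estimate_heart_rate(trace, fps=30))  # 72.0
```

Real traces need face/ROI tracking and detrending before this step, but the band-limited spectral peak is the standard final stage.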
Rewritable Data Embedding in Image based on Improved Coefficient Recovery
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576791
A. Sii, Simying Ong, M. Wee, Koksheik Wong
Nowadays, most images are stored and transmitted in compressed form according to various coding standards. Usually the image is transformed, e.g., by the discrete cosine transform, so coefficients make up a large proportion of the compressed bit stream. However, these coefficients may be corrupted or completely lost due to transmission errors or damage to the storage device. In this work, we therefore aim to improve a conventional coefficient recovery method. Specifically, instead of the Otsu method adopted in the conventional approach, an adaptive segmentation method is utilized to split the image into background and foreground regions, forming non-overlapping patches. Missing coefficients in these non-overlapping patches are recovered independently. In addition, a rewritable data embedding method is put forward that judiciously selects patches in which to embed data. Experiments were carried out to verify the basic performance of the proposed methods. In the best case, an improvement of 31.32% in CPU time is observed, while up to 7149 bits of external data can be embedded into the image.
Citations: 0
Personal Protective Equipment Detection with Live Camera
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576811
Ng Wei Bhing, P. Sebastian
With the recent outbreak and rapid transmission of COVID-19, medical personal protective equipment (PPE) detection has gained significant importance in the domains of computer vision and deep learning. The need to wear face masks in public is ever increasing, and research has shown that proper usage of face masks and PPE can significantly reduce the transmission of COVID-19. In this paper, a computer vision approach based on deep learning is proposed to develop a medical PPE detection algorithm with real-time video feed capability. This paper uses the YOLO object detection algorithm to perform one-stage object detection and classification, identifying three different states of face mask usage and detecting the presence of medical PPE. At present, there is no publicly available PPE dataset for object detection; thus, this paper also aims to establish a medical PPE dataset for future applications and development. The YOLO model achieved 84.5% accuracy on the established PPE dataset, which comprises seven classes in more than 1300 images, the largest dataset for evaluating medical PPE detection in the wild.
{"title":"Personal Protective Equipment Detection with Live Camera","authors":"Ng Wei Bhing, P. Sebastian","doi":"10.1109/ICSIPA52582.2021.9576811","DOIUrl":"https://doi.org/10.1109/ICSIPA52582.2021.9576811","url":null,"abstract":"With the recent outbreak and rapid transmission of COVID-19, medical personal protective equipment (PPE) detection has seen significant importance in the domain of computer vision and deep learning. The need for the public to wear face masks in public is ever increasing. Research has shown that proper usage of face masks and PPE can significantly reduce transmission of COVID-19. In this paper, a computer vision with a deep-learning approach is proposed to develop a medical PPE detection algorithm with real-time video feed capability. This paper aims to use the YOLO object detection algorithm to perform one-stage object detection and classification to identify the three different states of face mask usage and detect the presence of medical PPE. At present, there is no publicly available PPE dataset for object detection. Thus, this paper aims to establish a medical PPE dataset for future applications and development. The YOLO model achieved 84.5% accuracy on our established PPE dataset comprising seven classes in more than 1300 images, the largest dataset for evaluating medical PPE detection in the wild.","PeriodicalId":326688,"journal":{"name":"2021 IEEE International Conference on Signal and Image Processing Applications (ICSIPA)","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114257263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
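One-stage detectors such as YOLO emit many overlapping candidate boxes per object and rely on non-maximum suppression (NMS) to keep a single detection per face or PPE item. A minimal sketch of that post-processing step follows; the class labels are illustrative, since the abstract does not enumerate the dataset's seven classes:

```python
import numpy as np

# Hypothetical labels for the three face-mask usage states (illustrative only)
CLASSES = ["mask_worn", "mask_improper", "no_mask"]

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        order = np.array([j for j in rest if iou(boxes[i], boxes[j]) < thresh])
    return keep

# two candidates on the same face plus one distant detection
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
kept = nms(boxes, np.array([0.9, 0.8, 0.7]))
```

In practice the detector's own framework supplies NMS; this sketch only shows the logic behind the one-stage pipeline described above.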
Citations: 1
Monaural Speech Enhancement using Deep Neural Network with Cross-Speech Dataset
Pub Date : 2021-09-13 DOI: 10.1109/ICSIPA52582.2021.9576789
N. Jamal, N. Fuad, Shahnoor Shanta, M. N. A. Sha'abani
The Deep Neural Network (DNN)-based mask estimation approach is an emerging algorithm in monaural speech enhancement. It enhances speech signals against a noisy background by estimating whether speech or noise is dominant in each frame of the noisy signal, and it can construct complex models for nonlinear processing. However, a limitation of the DNN-based mask algorithm is generalization beyond the targeted population: past research works focused on their own target datasets because of the time consumed by audio recording sessions. Thus, in this work, different recording conditions were used to study the performance of the DNN-based mask estimation approach. The findings revealed that a test dataset in a different language, as well as different recording conditions, may not have a large impact on speech enhancement performance, since the algorithm learns only the noise information. Nevertheless, the performance of speech enhancement is promising when the trained model is designed properly, especially when there is little sample variation in the input dataset used during training.
{"title":"Monaural Speech Enhancement using Deep Neural Network with Cross-Speech Dataset","authors":"N. Jamal, N. Fuad, Shahnoor Shanta, M. N. A. Sha'abani","doi":"10.1109/ICSIPA52582.2021.9576789","DOIUrl":"https://doi.org/10.1109/ICSIPA52582.2021.9576789","url":null,"abstract":"Deep Neural Network (DNN)-based mask estimation approach is an emerging algorithm in monaural speech enhancement. It is used to enhance speech signals from the noisy background by calculating either speech or noise dominant in a particular frame of the noisy speech signal. It can construct complex models for nonlinear processing. However, the limitation of the DNN-based mask algorithm is a generalization of the targeted population. Past research works focused on their target dataset because of time consumption for the audio recording session. Thus, in this work, different recording conditions were used to study the performance of the DNN-based mask estimation approach. The findings revealed that different language test dataset, as well as different conditions, may not give large impact in speech enhancement performance since the algorithm only learn the noise information. But, the performance of speech enhancement is promising when the trained model has been designed properly, especially given the less sample variations in the input dataset involved during the training session.","PeriodicalId":326688,"journal":{"name":"2021 IEEE International Conference on Signal and Image Processing Applications (ICSIPA)","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123241541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
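In mask-based enhancement, the DNN is trained to predict a time-frequency mask that attenuates noise-dominant bins and preserves speech-dominant ones. A minimal NumPy sketch of the oracle ratio mask such a network commonly approximates is given below; the abstract does not state the paper's exact mask type, so this target and the toy spectrograms are illustrative:

```python
import numpy as np

def ideal_ratio_mask(speech_mag, noise_mag, eps=1e-8):
    """Oracle time-frequency mask the DNN learns to approximate:
    close to 1 where speech dominates a bin, close to 0 where noise does."""
    return speech_mag / (speech_mag + noise_mag + eps)

def apply_mask(noisy_mag, mask):
    """Pointwise masking of the noisy magnitude spectrogram."""
    return noisy_mag * mask

# toy 2-frame, 3-bin magnitude spectrograms (rows = frames, cols = freq bins)
speech = np.array([[1.0, 0.1, 2.0],
                   [0.5, 0.0, 1.5]])
noise = np.array([[0.1, 1.0, 0.2],
                  [0.1, 0.9, 0.1]])
mask = ideal_ratio_mask(speech, noise)
enhanced = apply_mask(speech + noise, mask)   # recovers approximately `speech`
```

At inference time the mask is predicted from the noisy spectrogram alone, which is why generalization across recording conditions, as studied above, matters.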
Citations: 0