
Latest publications: 2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)

C-ESRGAN: Synthesis of super-resolution images by image classification
Pub Date: 2022-12-05 | DOI: 10.1109/IPAS55744.2022.10053050
Jingan Liu, N. P. Chandrasiri
With the development of deep learning, super-resolution techniques for enhancing low-resolution images have advanced remarkably. However, mainstream algorithms focus on improving the average quality of the entire image, which can blur fine details. In this paper, we propose three key components for synthesizing super-resolution images that reflect the fine details of an image, guided by image classification. First, neural network weights learned from images of the same category were used when synthesizing super-resolution images; for this purpose, image classification was performed with a ResNet fine-tuned by transfer learning. Second, SENet was applied to the generator in our proposed method to capture detailed image information. Finally, the feature extraction network was changed from VGG to ResNet to extract more informative features. As a result, we achieved better image evaluation scores (PSNR, NIQE) for super-resolution images of dogs and cats than previous studies, and the images generated on the benchmark dataset appear more natural.
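The channel-attention idea the abstract mentions can be made concrete with a short sketch. Below is a minimal PyTorch squeeze-and-excitation (SE) block of the kind SENet introduces, applied to generator feature maps; it is an illustrative sketch, not the authors' implementation, and all shapes and names are assumptions.

```python
# Minimal SE (squeeze-and-excitation) channel attention, as SENet defines it.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Reweight feature-map channels using globally pooled statistics."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze: B x C x 1 x 1
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                            # per-channel gates in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # excitation: rescale channels

feats = torch.randn(4, 64, 32, 32)                   # assumed generator feature maps
print(SEBlock(64)(feats).shape)                      # torch.Size([4, 64, 32, 32])
```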
Citations: 0
Hyperspectral Brain Tissue Classification using a Fast and Compact 3D CNN Approach
Pub Date: 2022-12-05 | DOI: 10.1109/IPAS55744.2022.10053044
Hamail Ayaz, D. Tormey, Ian McLoughlin, Muhammad Ahmad, S. Unnikrishnan
Glioblastoma (GB) is a malignant brain tumor that requires surgical resection. Although complete resection of GB improves prognosis, supratotal resection may cause neurological abnormalities. Therefore, intraoperative tissue classification techniques are needed to delineate tumor regions and prevent recurrence. To delineate the affected regions, surgeons mostly rely on traditional magnetic resonance imaging (MRI), which often lacks accuracy and precision due to the brain-shift phenomenon. Hyperspectral Imaging (HSI) is a noninvasive advanced optical technique with the potential to classify tissue cells accurately. However, HSI tumor classification is challenging due to overlapping regions, high interclass similarity, and homogeneous information. Additionally, 2D Convolutional Neural Network (CNN) models for HSI work only with spectral information and discard spatial features, while hybrid 3D-followed-by-2D models lack abstract-level spatial information. Therefore, in this study, we used a minimal-layer 3D CNN model to distinguish the GB tumor region from normal tissues using an intraoperative VivoHSI dataset. The HSI data contain normal tissue (NT), tumor tissue (TT), hypervascularized tissue or blood vessels (BV), and background (BG) tissue cells. The proposed 3D CNN model consists of only two 3D layers and uses limited training samples (20% of the data, divided evenly into 50% for training and 50% for validation), with blind testing on the remaining 80%. This study outperformed the state-of-the-art hybrid architecture, achieving an overall accuracy of 99.99%.
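As a rough illustration of such a compact spectral-spatial classifier, the sketch below defines a two-layer 3D CNN over hyperspectral patches in PyTorch. It is a minimal sketch under assumed patch sizes and the four classes named above (NT/TT/BV/BG), not the paper's exact architecture.

```python
# Minimal two-layer 3D CNN over B x 1 x bands x H x W spectral-spatial patches.
import torch
import torch.nn as nn

class Compact3DCNN(nn.Module):
    def __init__(self, n_classes: int = 4):                  # NT / TT / BV / BG
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3)), nn.ReLU(inplace=True),
            nn.Conv3d(8, 16, kernel_size=(5, 3, 3)), nn.ReLU(inplace=True),
        )
        self.head = nn.Sequential(nn.AdaptiveAvgPool3d(1), nn.Flatten(),
                                  nn.Linear(16, n_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))

patches = torch.randn(8, 1, 25, 11, 11)   # 8 patches, 25 bands, 11x11 window (assumed)
print(Compact3DCNN()(patches).shape)      # torch.Size([8, 4])
```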
Citations: 0
Analysis of Real-Time Hostile Activity Detection from Spatiotemporal Features Using Time Distributed Deep CNNs, RNNs and Attention-Based Mechanisms
Pub Date: 2022-12-05 | DOI: 10.1109/IPAS55744.2022.10053001
Labib Ahmed Siddique, Rabita Junhai, Tanzim Reza, Salman Khan, Tanvir Rahman
Real-time video surveillance through CCTV camera systems has become essential for ensuring public safety, which is a priority today. Although CCTV cameras help greatly in increasing security, these systems require constant human interaction and monitoring. To address this issue, intelligent surveillance systems can be built using deep learning video classification techniques that automate surveillance and detect violence as it happens, and in this research we explore such techniques. Traditional image classification techniques fall short when classifying videos because they classify each frame separately, causing the predictions to flicker. Therefore, many researchers have proposed video classification techniques that consider spatiotemporal features during classification. However, deploying deep learning models that rely on skeleton points obtained through pose estimation or optical flow obtained through depth sensors is not always practical in an IoT environment: although these techniques ensure higher accuracy, they are computationally heavier. Keeping these constraints in mind, we experimented with various video classification and action recognition techniques such as ConvLSTM, LRCN (with both custom CNN layers and VGG-16 as the feature extractor), CNN-Transformer, and C3D. We achieved a test accuracy of 80% with ConvLSTM, 83.33% with CNN-BiLSTM, 70% with VGG16-BiLSTM, 76.76% with CNN-Transformer, and 80% with C3D.
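The LRCN idea (a time-distributed CNN feeding a recurrent layer) can be sketched compactly. The PyTorch snippet below is a hedged illustration with assumed layer sizes, clip length, and a binary violent/non-violent output; it is not the authors' model.

```python
# LRCN-style sketch: apply a small CNN to every frame, then an LSTM over time.
import torch
import torch.nn as nn

class LRCN(nn.Module):
    def __init__(self, n_classes: int = 2, hidden: int = 128):
        super().__init__()
        self.cnn = nn.Sequential(                    # applied to each frame
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:  # B x T x 3 x H x W
        b, t = clips.shape[:2]
        f = self.cnn(clips.flatten(0, 1)).view(b, t, -1)     # time-distributed CNN
        out, _ = self.lstm(f)
        return self.fc(out[:, -1])                           # classify from last step

clips = torch.randn(2, 16, 3, 64, 64)                # 2 clips of 16 frames (assumed)
print(LRCN()(clips).shape)                           # torch.Size([2, 2])
```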
Citations: 1
DONEX: Real-time occupancy grid based dynamic echo classification for 3D point cloud
Pub Date: 2022-12-05 | DOI: 10.1109/IPAS55744.2022.10053064
Niklas Stralau, Chengxuan Fu
For driving assistance and autonomous driving systems, it is important to differentiate between dynamic objects such as moving vehicles and static objects such as guard rails. Among the sensor modalities, RADAR and FMCW LiDAR can provide information about the motion state of the raw measurement data. Perception pipelines using measurement data from ToF LiDAR, on the other hand, can typically only differentiate between dynamic and static states at the object level. In this work, a new algorithm called DONEX was developed to classify the motion state of 3D LiDAR point cloud echoes using an occupancy grid approach. Through algorithmic improvements, e.g. a 2D grid approach, it was possible to reduce the runtime. Scenarios in which the measuring sensor is located in a moving vehicle were also considered.
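To make the occupancy-grid idea concrete, the NumPy sketch below accumulates per-cell occupancy evidence across scans and flags echoes that land in cells without a stable occupancy history as dynamic. This is a loose, hedged illustration of the general approach only; DONEX's actual update rules, thresholds, and ego-motion compensation are not reproduced here, and all sizes are assumptions.

```python
# Toy 2D occupancy grid: cells seen occupied in several past scans count as static.
import numpy as np

RES, SIZE = 0.5, 200                      # 0.5 m cells, 100 m x 100 m grid (assumed)
hits = np.zeros((SIZE, SIZE), np.int32)   # how many past scans saw each cell occupied

def classify_scan(points_xy: np.ndarray, min_static_hits: int = 3) -> np.ndarray:
    """Return a boolean 'dynamic' flag per echo, then update the grid."""
    idx = np.clip((points_xy / RES + SIZE // 2).astype(int), 0, SIZE - 1)
    dynamic = hits[idx[:, 0], idx[:, 1]] < min_static_hits
    np.add.at(hits, (idx[:, 0], idx[:, 1]), 1)       # accumulate occupancy evidence
    return dynamic

scan = np.random.uniform(-40, 40, size=(1000, 2))    # fake x/y echoes in metres
print(classify_scan(scan).sum(), "echoes flagged dynamic on the first scan")
```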
Citations: 0
Bioacoustic augmentation of Orcas using TransGAN
Pub Date: 2022-12-05 | DOI: 10.1109/IPAS55744.2022.10052983
Nishant Yella, Manisai Eppakayala, Tauqir Pasha
The Southern Resident Killer Whale (Orcinus orca) is an apex predator in the oceans. It is currently listed as an endangered species, and its numbers have slowly declined over the past two decades. Audio recordings of killer whale vocalizations are scarce, which in itself makes acquiring labelled audio sets a demanding task. The vocalizations of orcas are usually categorized into two groups: whistles and pulsed calls. Audio sets for these two types of vocalization are especially scarce, which creates a challenge in addressing the lack of data. Data augmentation methods have proven over the years to be very effective at generating synthetic data for the labelled training of a given feed-forward neural network. The Transformer-based Generative Adversarial Network (TransGAN) has performed phenomenally well on tasks pertaining to visual perception. In this paper, we demonstrate the use of TransGAN on audio datasets to perform bioacoustic augmentation of killer whale vocalizations obtained from existing open-source libraries, generating a substantial amount of synthetic audio data for tasks pertaining to audio perception. To validate the TransGAN-generated audio against the original killer whale vocalization samples, we implemented a time-sequence-based algorithm called Dynamic Time Warping (DTW), which measures the similarity between the two audio samples.
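The DTW validation step follows a standard dynamic-programming recurrence, sketched below for two 1-D feature sequences. The paper's exact features and normalisation are not given here, so the inputs are stand-ins.

```python
# Self-contained DTW distance between two 1-D sequences (lower = more similar).
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])          # local distance
            D[i, j] = cost + min(D[i - 1, j],        # insertion
                                 D[i, j - 1],        # deletion
                                 D[i - 1, j - 1])    # match
    return float(D[n, m])

real = np.sin(np.linspace(0.0, 10.0, 200))           # stand-in feature envelopes
synth = np.sin(np.linspace(0.3, 10.3, 180))
print(f"DTW distance: {dtw_distance(real, synth):.3f}")
```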
Citations: 0
Glacier-surface velocities in Gangotri from Landsat8 satellite imagery
Pub Date: 2022-12-05 | DOI: 10.1109/IPAS55744.2022.10052898
Reem Klaib, Hajer Alabdouli, Mritunjay Kumar Singh
A glacier's mass balance and dynamics are regulated by changes in ice velocity. As a result, estimating glacier flow velocity is a crucial part of monitoring a glacier's health over time, its response to climate change, and its contribution to sea-level rise. In this study, we estimated the Gangotri glacier surface velocities from 2014 to 2021. We used remote sensing-based techniques to estimate the Gangotri surface velocity, since they provide such measurements regularly over a vast geographical area. Sub-pixel correlation of Landsat 8 imagery, performed with the COSI-Corr (co-registration of optically sensed images and correlation) tool, was used to determine surface velocities over the Gangotri glacier. Our derived velocity values match the ground-truth values comparatively well. Gangotri surface velocities vary across the regions of the glacier and from year to year. Our study indicated that the middle region of the ablation zone and the accumulation zone had higher velocities across all the years, while the boundary regions of the glacier show lower speeds. Average velocities range from ∼13 m/year to ∼22 m/year in the accumulation zone and from ∼11 m/year to ∼18 m/year in the ablation zone. The average surface velocity decreased markedly, by 26%, from 2014 to 2021. Overall, surface velocities in Gangotri vary from ∼8 m/year to ∼61 ± 1.9 m/year over 2014 to 2021.
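The underlying measurement principle (a sub-pixel image offset converted to a velocity) can be sketched without COSI-Corr itself. The snippet below is a hedged illustration using scikit-image's phase_cross_correlation to recover a sub-pixel shift between two co-registered chips; the pixel size and time gap are assumptions, not values from the paper.

```python
# Estimate a sub-pixel offset between two image chips, then convert to m/year.
import numpy as np
from skimage.registration import phase_cross_correlation

rng = np.random.default_rng(0)
ref = rng.random((64, 64))
moving = np.roll(ref, shift=(2, 1), axis=(0, 1))     # fake glacier motion of 2 px / 1 px

shift, error, _ = phase_cross_correlation(ref, moving, upsample_factor=20)
PIXEL_M, DT_YEARS = 15.0, 1.0                        # assumed pixel size and 1-year gap
velocity = np.hypot(*shift) * PIXEL_M / DT_YEARS     # displacement magnitude -> m/year
print(f"offset {shift} px -> {velocity:.1f} m/year")
```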
Citations: 0
A Light Weight Approach for Real-time Background Subtraction in Camera Surveillance Systems
Pub Date: 2022-12-05 | DOI: 10.1109/IPAS55744.2022.10053028
Ege Ince, Sevdenur Kutuk, Rayan Abri, Sara Abri, S. Cetin
Real-time image processing tasks such as motion detection and suspicious-object detection require processing the background repeatedly. In this field, background subtraction solutions can overcome the limitations imposed by real-time constraints, and different background subtraction methods have been investigated for this goal. Although many background subtraction methods provide the required accuracy, they do not yield a real-time solution in a camera surveillance environment. In this paper, we propose a model for background subtraction using four different traditional algorithms: ViBe, Mixture of Gaussians V2 (MOG2), Two Points, and Pixel Based Adaptive Segmenter (PBAS). The presented model is a lightweight real-time architecture for surveillance cameras, in which dynamic programming logic is used during preprocessing of the frames. The CDnet 2014 dataset is used to assess the model's accuracy, and the findings show that the proposed combination outperforms the individual traditional methods, with values of 61.31 frames per second (fps), 0.552 F1 score, and 0.430 intersection over union (IoU).
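A per-pixel combination of background subtractors can be sketched with OpenCV. Note the hedge: OpenCV ships MOG2 and KNN, but ViBe, Two Points, and PBAS (used in the paper) are not in stock OpenCV, so two built-ins stand in here purely for illustration, and the input file name is hypothetical.

```python
# Combine per-pixel foreground masks from several subtractors by voting.
import cv2
import numpy as np

subtractors = [cv2.createBackgroundSubtractorMOG2(),
               cv2.createBackgroundSubtractorKNN()]

def ensemble_mask(frame: np.ndarray) -> np.ndarray:
    # apply() returns 0/127/255 masks; >0 counts shadows as foreground in this sketch
    masks = [(s.apply(frame) > 0).astype(np.uint8) for s in subtractors]
    votes = np.sum(masks, axis=0)
    # keep pixels that at least half the subtractors flag as foreground
    return (votes * 2 >= len(subtractors)).astype(np.uint8) * 255

cap = cv2.VideoCapture("surveillance.mp4")           # hypothetical input file
ok, frame = cap.read()
while ok:
    fg = ensemble_mask(frame)                        # binary foreground mask per frame
    ok, frame = cap.read()
cap.release()
```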
Citations: 0
A Novel Resource-Constrained Insect Monitoring System based on Machine Vision with Edge AI
Pub Date: 2022-12-05 | DOI: 10.1109/IPAS55744.2022.10052895
Amin Kargar, Mariusz P. Wilk, Dimitrios Zorbas, Michael T. Gaffney, Brendan O'Flynn
Effective insect pest monitoring is a vital component of Integrated Pest Management (IPM) strategies. It helps to support crop productivity while minimising the need for plant protection products. In recent years, many researchers have considered integrating intelligence into such systems in the context of the Smart Agriculture research agenda. This paper describes the development of a smart pest monitoring system designed around the specific requirements of the agricultural sector. The proposed system is a low-cost smart insect trap, for use in orchards, that detects specific insect species detrimental to fruit quality. The system helps to identify the invasive insect Brown Marmorated Stink Bug (BMSB), or Halyomorpha halys (HH), using a Microcontroller Unit-based edge device comprising an Internet of Things-enabled, resource-constrained image acquisition and processing system. The device executes our proposed lightweight image analysis algorithm and Convolutional Neural Network (CNN) model for insect detection and classification, respectively. The prototype device is currently deployed in an orchard in Italy. Preliminary experimental results show over 70 percent accuracy in BMSB classification on our custom-built dataset, demonstrating the proposed system's feasibility and effectiveness in monitoring this invasive insect species.
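For a sense of what "resource-constrained" implies for the classifier, the sketch below defines a deliberately tiny binary CNN (BMSB vs. other) of the scale one might quantise and deploy on an MCU. It is a hedged stand-in with assumed input size and layer widths, not the paper's actual model.

```python
# Tiny binary classifier sized for a microcontroller-class edge device.
import torch
import torch.nn as nn

tiny_cnn = nn.Sequential(
    nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 2),                                # BMSB / not-BMSB
)

crop = torch.randn(1, 3, 96, 96)                     # assumed insect crop from the trap
print(tiny_cnn(crop).shape)                          # torch.Size([1, 2])
print(sum(p.numel() for p in tiny_cnn.parameters()), "parameters")  # ~1.4k
```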
Citations: 1
Drought Stress Segmentation on Drone captured Maize using Ensemble U-Net framework
Pub Date: 2022-12-05 | DOI: 10.1109/IPAS55744.2022.10052939
N. Tejasri, G. U. Sai, P. Rajalakshmi, B. BalajiNaik, U. B. Desai
Water is essential for any crop production, and a lack of sufficient water supply causes abiotic stress in crops. Accurate identification of the crops affected by drought is required for achieving sustainable agricultural yield, and image data play a crucial role in studying the crop's response. Recent developments in aerial imaging methods allow us to capture RGB maize data by integrating an RGB camera with a drone. In this work, we propose a pipeline to collect data rapidly, pre-process the data, and apply deep-learning-based models to segment drought-stressed and healthy RGB maize crops grown under controlled water conditions. We develop an ensemble framework based on the U-Net and U-Net++ architectures for the drought-stress segmentation task. The ensemble follows a stacking approach: the predictions of the fine-tuned U-Net and U-Net++ models are averaged to generate the output mask. The experimental results showed that the ensemble framework performed better than the individual U-Net and U-Net++ models on the test set, with a mean IoU of 0.71 and a Dice coefficient of 0.74.
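The averaging step described above is simple to sketch: run each model, average the sigmoid probability maps, and threshold. The snippet below is a minimal illustration in PyTorch with stand-in models; any U-Net / U-Net++ implementations with matching output shapes would slot in.

```python
# Average sigmoid outputs of several segmentation models, then threshold.
import torch

def ensemble_predict(models, image: torch.Tensor, thr: float = 0.5) -> torch.Tensor:
    """image: B x C x H x W; returns a binary stress mask of shape B x 1 x H x W."""
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(m(image)) for m in models]).mean(dim=0)
    return (probs > thr).float()

def iou(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> float:
    inter = (pred * target).sum()
    union = pred.sum() + target.sum() - inter
    return float((inter + eps) / (union + eps))

dummy = [torch.nn.Conv2d(3, 1, 1), torch.nn.Conv2d(3, 1, 1)]  # stand-in "models"
batch = torch.randn(2, 3, 64, 64)
mask = ensemble_predict(dummy, batch)
print(mask.shape, iou(mask, mask))                   # torch.Size([2, 1, 64, 64]) 1.0
```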
Citations: 2
Animal Video Retrieval System using Image Recognition and Relationships Between Concepts of Animal Families and Species
Pub Date: 2022-12-05 | DOI: 10.1109/IPAS55744.2022.10052995
Chinatsu Watanabe, Mayu Kaneko, N. P. Chandrasiri
In recent years, video streaming services have become increasingly popular. In general, the search function of a video sharing site evaluates the relevance of a search query to the title, tags, description, and other metadata supplied by the creator of the video, and the results with the highest relevance are displayed. Therefore, if a video is given a title that does not match its content, videos with low actual relevance may be retrieved. In this research, (1) we built a new system that retrieves animal videos relevant to their content using image recognition. (2) By describing the relationships between the concepts of animal families and species and incorporating them into the retrieval system, it becomes possible to retrieve animal videos by family name; adding retrieval by family name enabled us to find species that had not been learned. (3) We confirmed the usefulness of our video retrieval system using trained neural networks, GoogLeNet and ResNet50, as animal species classifiers.
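The family-species relationship idea reduces to mapping each recognised species label to its family so that a family-name query also matches species-level predictions. The toy sketch below illustrates this; the taxonomy entries and video labels are illustrative stand-ins, not the paper's ontology or data.

```python
# Toy retrieval: a family-name query matches videos labelled with member species.
SPECIES_TO_FAMILY = {
    "golden retriever": "Canidae",
    "siberian husky": "Canidae",
    "persian cat": "Felidae",
}

videos = [  # (video id, species labels predicted by the image classifier)
    ("v1", {"golden retriever"}),
    ("v2", {"persian cat"}),
    ("v3", {"siberian husky"}),
]

def retrieve(query: str) -> list[str]:
    q = query.lower()
    hits = []
    for vid, labels in videos:
        families = {SPECIES_TO_FAMILY.get(lbl, "") for lbl in labels}
        if q in labels or q.capitalize() in families:
            hits.append(vid)
    return hits

print(retrieve("canidae"))   # ['v1', 'v3'] — family query finds both dog species
```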
Citations: 0