
Latest publications from 2020 Digital Image Computing: Techniques and Applications (DICTA)

An improved method for pylon extraction and vegetation encroachment analysis in high voltage transmission lines using LiDAR data
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363391
Nosheen Munir, M. Awrangjeb, Bela Stantic
The maintenance of high-voltage power line rights-of-way against vegetation intrusion is important for electric power distribution companies to ensure the safe and secure delivery of electricity. However, monitoring becomes more challenging when the power line corridor (PLC) lies in a complex environment such as mountainous terrain or forest. To overcome these challenges, this paper provides an automated method for extracting individual pylons and monitoring vegetation near the PLC in hilly terrain. The proposed method starts by dividing the large dataset into small, manageable datasets. A voxel grid is formed on each dataset to separate power lines from pylons and vegetation. The power line points are converted into a binary image to obtain the individual spans. These span points are used to find nearby vegetation and pylon points, and individual pylons and vegetation are further separated using statistical analysis. Finally, the height and location of the extracted vegetation with reference to the power lines are estimated and classified into danger and clearance zones. Experiments on two large Australian datasets show that the proposed method achieves high completeness and correctness of 96.5% and 99%, respectively, for pylons. Moreover, growing vegetation beneath and around the PLC that could harm the power lines is identified.
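The voxel-grid step is easy to picture in code. Below is a minimal sketch, assuming a synthetic point cloud and an arbitrary 1 m voxel resolution (the paper does not specify its parameters): LiDAR returns are binned into voxels, and (x, y) columns with many occupied voxels stacked vertically are flagged as pylon candidates.

```python
from collections import defaultdict

import numpy as np

# Hypothetical LiDAR tile: 10,000 synthetic (x, y, z) returns in metres.
points = np.random.rand(10000, 3) * np.array([200.0, 200.0, 40.0])

voxel_size = 1.0  # assumed resolution; the paper does not state one

# Bin every point into a voxel by integer-dividing its coordinates.
voxel_idx = np.floor(points / voxel_size).astype(np.int64)
occupied = set(map(tuple, voxel_idx))

# Count occupied voxels stacked in each (x, y) column: a tall column of
# returns suggests a vertical structure (pylon) rather than a thin,
# near-horizontal run of power-line points or low vegetation.
column_heights = defaultdict(set)
for ix, iy, iz in occupied:
    column_heights[(ix, iy)].add(iz)

pylon_candidates = [c for c, zs in column_heights.items() if len(zs) > 15]
print(f"{len(pylon_candidates)} candidate pylon columns")
```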
Cited by: 6
Evaluation of U-Net CNN Approaches for Human Neck MRI Segmentation
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363385
A. Suman, Yash Khemchandani, Md. Asikuzzaman, A. Webb, D. Perriman, M. Tahtali, M. Pickering
The segmentation of neck muscles is useful for the diagnosis and planning of medical interventions for neck pain-related conditions such as whiplash and cervical dystonia. Neck muscles are tightly grouped, have a similar appearance to each other, and display large anatomical variability between subjects. They also exhibit low contrast with background organs in magnetic resonance (MR) images. These characteristics make the segmentation of neck muscles a challenging task. Due to the significant success of the U-Net architecture for deep learning-based segmentation, numerous versions of this approach have emerged for the task of medical image segmentation. This paper presents an evaluation of 10 U-Net CNN approaches, 6 direct (U-Net, CRF-Unet, A-Unet, MFP-Unet, R2Unet and U-Net++) and 4 modified (R2A-Unet, R2A-Unet++, PMS-Unet and MS-Unet). The modifications are inspired by recent multi-scale and multi-stream techniques for deep learning algorithms. T1-weighted axial MR images of the neck, at the distal end of the C3 vertebra, from 45 subjects with real-time data augmentation were used in our evaluation of neck muscle segmentation approaches. The analysis of our numerical results indicates that the R2Unet architecture achieves the best accuracy.
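As background for the comparison, here is a minimal, hypothetical two-level U-Net in PyTorch; the layer widths, depth and single-channel input are illustrative choices, not the configuration evaluated in the paper. It shows the encoder-decoder structure with a skip connection that all ten variants build on.

```python
import torch
import torch.nn as nn

class DoubleConv(nn.Module):
    """Two 3x3 conv + ReLU layers, the basic U-Net building block."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))
    def forward(self, x):
        return self.block(x)

class TinyUNet(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.enc1, self.enc2 = DoubleConv(1, 32), DoubleConv(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = DoubleConv(64, 32)            # 64 = 32 skip + 32 upsampled
        self.head = nn.Conv2d(32, n_classes, 1)  # per-pixel class logits
    def forward(self, x):
        s1 = self.enc1(x)
        s2 = self.enc2(self.pool(s1))
        d = self.dec(torch.cat([self.up(s2), s1], dim=1))  # skip connection
        return self.head(d)

logits = TinyUNet(n_classes=5)(torch.randn(1, 1, 128, 128))
print(logits.shape)  # torch.Size([1, 5, 128, 128])
```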
Cited by: 0
SL3D - Single Look 3D Object Detection based on RGB-D Images
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363404
G. Erabati, Helder Araújo
We present SL3D, a Single Look 3D object detection approach that detects 3D objects from an RGB-D image pair. The approach is a proposal-free, single-stage 3D object detection method for RGB-D images that leverages multi-scale feature fusion of RGB and depth feature maps, and multi-layer predictions. The method takes a pair of RGB and depth images as input and outputs predicted 3D bounding boxes. The SL3D neural network comprises two modules: multi-scale feature fusion and multi-layer prediction. The multi-scale feature fusion module fuses the multi-scale features from the RGB and depth feature maps, which are later used by the multi-layer prediction module for 3D object detection. Each location of a prediction layer is attached to a set of predefined 3D prior boxes to account for the varying shapes of 3D objects. The output of the network regresses the predicted 3D bounding boxes as offsets to the set of 3D prior boxes, and duplicate 3D bounding boxes are removed by applying 3D non-maximum suppression. The network is trained end-to-end on the publicly available SUN RGB-D dataset. The SL3D approach with ResNeXt50 achieves 31.77 mAP on the SUN RGB-D test dataset with an inference speed of approximately 4 fps, and with MobileNetV2 it achieves approximately 15 fps with a reduction of around 2 mAP. The quantitative results show that the proposed method achieves performance competitive with state-of-the-art methods on the SUN RGB-D dataset at near real-time inference speed.
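The fusion module's core pattern, concatenating RGB and depth feature maps at matching scales and mixing them back to a shared width, can be sketched as below. The channel counts, spatial sizes and the 1x1-convolution fusion operator are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class FuseRGBD(nn.Module):
    """Fuse RGB and depth feature maps at one scale: concat + 1x1 conv."""
    def __init__(self, ch):
        super().__init__()
        self.mix = nn.Conv2d(2 * ch, ch, kernel_size=1)
    def forward(self, f_rgb, f_depth):
        return torch.relu(self.mix(torch.cat([f_rgb, f_depth], dim=1)))

# Hypothetical per-scale feature maps from two backbone streams.
scales = [(64, 80, 80), (128, 40, 40), (256, 20, 20)]
fused = []
for ch, h, w in scales:
    f_rgb, f_d = torch.randn(1, ch, h, w), torch.randn(1, ch, h, w)
    fused.append(FuseRGBD(ch)(f_rgb, f_d))

# Each fused map would feed a prediction layer that regresses offsets
# to the 3D prior boxes anchored at that scale.
print([f.shape for f in fused])
```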
Cited by: 0
CNN to Capsule Network Transformation
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363395
Takumi Sato, K. Hotta
The Capsule Network has recently been proposed and outperforms CNNs on specific tasks. Due to the architectural differences between Capsule Networks and CNNs, Capsule Networks cannot use transfer learning, which is used very frequently with CNNs. In this paper, we propose a transfer learning method that can easily transform a CNN into a Capsule Network. We achieve this by stacking pre-trained CNNs and using the proposed capsule random transformer to make the individual CNNs interact with each other, forming a Capsule Network. We applied this method to U-net and created a capsule-based method with accuracy similar to U-net's. We show results on a cell segmentation dataset. Our capsule network achieves higher accuracy than other Capsule Network-based semantic segmentation methods.
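To make the CNN-to-capsule idea concrete, the sketch below regroups an ordinary CNN feature map into capsule vectors and applies the standard squash non-linearity from the capsule literature. The 8-capsule by 8-dimension grouping is an assumed example; it does not reproduce the paper's capsule random transformer.

```python
import torch

def squash(v, dim=-1, eps=1e-8):
    """Capsule squash non-linearity: keeps direction, maps norm into [0, 1)."""
    sq = (v ** 2).sum(dim=dim, keepdim=True)
    return (sq / (1.0 + sq)) * v / torch.sqrt(sq + eps)

# Hypothetical CNN feature map: batch 1, 64 channels, 8x8 spatial grid.
feat = torch.randn(1, 64, 8, 8)
B, C, H, W = feat.shape

# Regroup the 64 channels into 8 capsules of dimension 8 per location,
# so each capsule vector can encode pose-like properties of one entity.
n_caps, cap_dim = 8, 8
caps = feat.view(B, n_caps, cap_dim, H * W).permute(0, 1, 3, 2)
caps = squash(caps)  # (B, n_caps, H*W, cap_dim), all norms below 1

print(caps.shape, caps.norm(dim=-1).max())
```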
Cited by: 1
Dual image and mask synthesis with GANs for semantic segmentation in optical coherence tomography
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363402
J. Kugelman, D. Alonso-Caneiro, Scott A. Read, Stephen J. Vincent, F. Chen, M. Collins
In recent years, deep learning-based OCT segmentation methods have addressed many of the limitations of traditional segmentation approaches and are capable of performing rapid, consistent and accurate segmentation of the chorio-retinal layers. However, robust deep learning methods require a sufficiently large and diverse dataset for training, which is not always feasible in many biomedical applications. Generative adversarial networks (GANs) have demonstrated the capability of producing realistic and diverse high-resolution images for a range of modalities and datasets, including for data augmentation, a powerful application of GAN methods. In this study we propose the use of a StyleGAN-inspired approach to generate chorio-retinal optical coherence tomography (OCT) images with a high degree of realism and diversity. We utilize the method to synthesize image and segmentation mask pairs that can be used to train a deep learning semantic segmentation method for subsequent delineation of three chorio-retinal layer boundaries. By pursuing a dual-output solution rather than a mask-to-image translation solution, we remove an unnecessary constraint on the generated images and enable the synthesis of new unseen area mask labels. The results are encouraging, with near-comparable performance observed when training on purely synthetic data compared to real data. Moreover, training using a combination of real and synthetic data results in zero measurable performance loss, further demonstrating the reliability of this technique and the feasibility of data augmentation in future work.
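The dual-output idea, one generator emitting an image and a mask from a shared trunk instead of translating masks into images, can be illustrated with a toy PyTorch generator. The depths, sizes and four-class mask head below are assumptions for the sketch, not the StyleGAN-inspired architecture used in the study.

```python
import torch
import torch.nn as nn

class DualHeadGenerator(nn.Module):
    """Toy generator with two heads: one for the OCT image, one for the
    layer-segmentation mask, both produced from a shared trunk."""
    def __init__(self, z_dim=64, n_layers=4):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 128, 4), nn.ReLU(True),                   # 4x4
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(True), # 8x8
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(True))  # 16x16
        self.image_head = nn.Sequential(nn.Conv2d(32, 1, 3, padding=1), nn.Tanh())
        self.mask_head = nn.Conv2d(32, n_layers, 3, padding=1)  # class logits

    def forward(self, z):
        h = self.trunk(z)
        return self.image_head(h), self.mask_head(h)

g = DualHeadGenerator()
img, mask_logits = g(torch.randn(2, 64, 1, 1))
print(img.shape, mask_logits.shape)  # (2, 1, 16, 16) and (2, 4, 16, 16)
```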
Cited by: 4
Base-Package Recommendation Framework Based on Consumer Behaviours in IPTV Platform
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363400
Kuruparan Shanmugalingam, Ruwinda Ranganayake, Chanaka Gunawardhana, Rajitha Navarathna
Internet Protocol TeleVision (IPTV) provides many services such as live television streaming, time-shifted media, and Video On Demand (VOD). However, many customers do not engage properly with their subscribed packages due to a lack of knowledge and poor guidance. Many customers fail to identify the proper IPTV service package for their needs and to utilise their current package to the maximum. In this paper, we propose a base-package recommendation model with a novel customer scoring-meter based on customer behaviour. First, our paper describes an algorithm to measure customer engagement scores, which illustrates a novel approach to tracking customer engagement with the IPTV service provider. Next, the content-based recommendation system, which uses vector representations of subscribers and base-package details, is described. We qualitatively show the significance of our approach using a local IPTV service provider's dataset. The proposed approach can significantly improve user retention, long-term revenue and customer satisfaction.
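As a toy illustration of content-based matching between a subscriber vector and base-package vectors, consider the sketch below; the genre space, numbers and package names are all invented, and the paper's actual feature construction and scoring-meter are not reproduced.

```python
import numpy as np

# Axis labels for the vectors below: hours watched per genre.
genres = ["sports", "news", "movies", "kids", "music"]
subscriber = np.array([12.0, 3.0, 9.0, 0.0, 1.0])

# Hypothetical base packages described in the same genre space.
packages = {
    "Sports+": np.array([1.0, 0.2, 0.3, 0.0, 0.1]),
    "Family":  np.array([0.1, 0.3, 0.6, 1.0, 0.2]),
    "Cinema":  np.array([0.2, 0.1, 1.0, 0.1, 0.2]),
}

def cosine(a, b):
    """Cosine similarity between two usage/content vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Rank packages by similarity between the usage vector and each package.
ranking = sorted(packages, key=lambda p: cosine(subscriber, packages[p]),
                 reverse=True)
print(ranking)  # ['Sports+', 'Cinema', 'Family'] for these numbers
```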
Cited by: 0
Recurrent Motion Neural Network for Low Resolution Drone Detection
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363377
Hamish Pratt, B. Evans, T. Rowntree, I. Reid, S. Wiederman
Drones are becoming increasingly prevalent in everyday use, with many commercial applications in fields such as construction work and agricultural surveying. Despite their common commercial use, drones have recently been used with malicious intent, such as the airline disruptions at Gatwick Airport. With the emerging safety concerns for the public and other airspace users, detecting and monitoring active drones in an area is crucial. This paper introduces a recurrent convolutional neural network (CNN) specifically designed for drone detection. This CNN can detect drones in down-sampled images by exploiting the temporal information of drones in flight, and it outperforms a state-of-the-art conventional object detector. Due to the lightweight and low-resolution nature of this network, it can be mounted on a small processor and run at near real-time speeds.
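One common way to exploit temporal information in low-resolution clips is to extract per-frame CNN features and pass them through a recurrent unit. The sketch below shows that pattern under assumed sizes; it is not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class RecurrentDetector(nn.Module):
    """Per-frame CNN features fed through a GRU so motion across the
    low-resolution sequence informs the final drone/no-drone score."""
    def __init__(self, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(True),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(True),
            nn.AdaptiveAvgPool2d(1))            # -> (B*T, 32, 1, 1)
        self.gru = nn.GRU(32, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, clip):                    # clip: (B, T, 1, H, W)
        B, T = clip.shape[:2]
        f = self.cnn(clip.flatten(0, 1)).flatten(1)  # (B*T, 32)
        out, _ = self.gru(f.view(B, T, 32))
        return self.head(out[:, -1])            # score from last time step

scores = RecurrentDetector()(torch.randn(2, 8, 1, 32, 32))
print(scores.shape)  # torch.Size([2, 1])
```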
Cited by: 0
Visualizing and Understanding Inherent Image Features in CNN-based Glaucoma Detection
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363369
Dhaval Vaghjiani, Sajib Saha, Yann Connan, Shaun Frost, Y. Kanagasingam
Convolutional neural network (CNN)-based methods have achieved state-of-the-art performance in glaucoma detection. Despite this, these methods are often criticized for offering no opportunity to understand how classification decisions are made. In this paper, we develop an innovative visualization strategy that allows the inherent image features contributing to glaucoma detection at different CNN layers to be understood. We also develop a set of interpretable notions to better comprehend the contributing image features involved in the disease detection process. Extensive experiments are conducted on publicly available glaucoma datasets. Results show that the optic cup is the most influential ocular component for glaucoma detection (overall Intersection over Union (IoU) score of 0.18), followed by the neuro-retinal rim (NR) with an IoU score of 0.17. With an overall IoU score of 0.16, vessels in the photograph also play a considerable role in disease detection.
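The IoU scores quantify the overlap between the region a network attends to and an anatomical structure. A minimal sketch of that measurement, using a random saliency map, an assumed top-10% activation threshold and a hypothetical optic-cup mask (the paper's visualization method itself is not reproduced):

```python
import numpy as np

def iou(a, b):
    """Intersection over Union of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

# Hypothetical saliency map from a CNN layer and an optic-cup mask.
rng = np.random.default_rng(0)
saliency = rng.random((64, 64))
cup_mask = np.zeros((64, 64), dtype=bool)
cup_mask[20:40, 25:45] = True

# Threshold the saliency map to the region the network attends to,
# then score its overlap with the anatomical structure.
attended = saliency > np.quantile(saliency, 0.9)  # top 10% activations
print(f"IoU with optic cup: {iou(attended, cup_mask):.3f}")
```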
Cited by: 5
Attentive Inception Module based Convolutional Neural Network for Image Enhancement
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363375
Purbaditya Bhattacharya, U. Zölzer
In this paper, the problem of image enhancement, in the form of single-image super-resolution and compression artifact reduction, is addressed by proposing a convolutional neural network with an inception module containing an attention mechanism. The inception module in the network contains parallel branches of convolution layers employing filters with multiple receptive fields via filter dilation. The aggregated multi-scale features are subsequently filtered via an attention mechanism that allows learned feature-map weighting in order to reduce redundancy. Additionally, a long attentive skip connection is introduced to process the penultimate feature layer of the proposed network. The addition of the aforementioned attention modules introduces a dynamic nature to the model, which would otherwise consist of static trained filters. Experiments are performed with multiple network depths and architectures in order to assess their contributions. The final network is evaluated on benchmark datasets for the aforementioned tasks, and the results indicate very good performance.
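The general pattern, parallel dilated branches whose concatenated output is re-weighted channel-wise before fusion, can be sketched as follows. The branch dilations, channel widths and squeeze-and-excitation style gate are assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn

class AttentiveInception(nn.Module):
    """Parallel 3x3 branches with different dilations (hence different
    receptive fields), concatenated and re-weighted channel-wise by a
    squeeze-and-excitation style attention gate before fusion."""
    def __init__(self, ch, branches=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=d, dilation=d) for d in branches)
        n = ch * len(branches)
        self.attn = nn.Sequential(                # learned per-channel weights
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(n, n // 4, 1), nn.ReLU(True),
            nn.Conv2d(n // 4, n, 1), nn.Sigmoid())
        self.fuse = nn.Conv2d(n, ch, 1)

    def forward(self, x):
        multi = torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)
        return self.fuse(multi * self.attn(multi))  # weighted aggregation

y = AttentiveInception(32)(torch.randn(1, 32, 48, 48))
print(y.shape)  # torch.Size([1, 32, 48, 48])
```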
Cited by: 1
Network-based structure flow estimation
Pub Date : 2020-11-29 DOI: 10.1109/DICTA51227.2020.9363398
Shu Liu, Nick Barnes, R. Mahony, Haolei Ye
Structure flow is a novel three-dimensional motion representation that differs from scene flow in that it is directly associated with image change. Due to its close connection with both optical flow and divergence in images, it is well suited to estimation from monocular vision. To acquire an accurate measurement of structure flow, we design a method that employs a spatial pyramid structure and a network-based approach. We investigate the current motion-field datasets and validate the performance of our method by comparing the two-dimensional component of its motion field with previous works. Our experiments support two conclusions: 1) our motion estimator employs only RGB images and outperforms previous work that utilizes RGB-D images; 2) the estimated structure flow map is a more effective representation of the motion field than the widely accepted scene flow via monocular vision.
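The link between structure flow and image divergence can be shown numerically: for a synthetic, radially expanding 2D motion field (an assumed toy example, not the paper's data), the divergence is positive and uniform, the expansion cue associated with motion toward the camera.

```python
import numpy as np

# Hypothetical dense 2D motion field u(x, y), v(x, y) on a 64x64 grid:
# a radially expanding pattern, as produced by motion toward the camera.
ys, xs = np.mgrid[-32:32, -32:32].astype(float)
u, v = 0.05 * xs, 0.05 * ys

# Divergence of the flow field: du/dx + dv/dy. Positive values indicate
# expansion, the image-change cue that structure flow couples with
# translational optical flow.
div = np.gradient(u, axis=1) + np.gradient(v, axis=0)
print(div.mean())  # 0.1 everywhere for this uniform expansion
```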
Cited by: 0