
Latest publications: Proceedings of the ACM Multimedia Asia

WaveCSN
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3366574
Hai Wu, Hongtao Xie, Fanchao Lin, Sicheng Zhang, Jun Sun, Yongdong Zhang
Landmark detection in hip X-ray images plays a critical role in the diagnosis of Developmental Dysplasia of the Hip (DDH) and in Total Hip Arthroplasty (THA) surgery. Regression- and heatmap-based convolutional network techniques can obtain reasonable results, but they are limited in either robustness or precision given the complexity and intensity inhomogeneities of hip X-ray images. In this paper, we propose a Wave-like Cascade Segmentation Network (WaveCSN) that improves the accuracy of landmark detection by transforming it into area segmentation. The WaveCSN consists of three basic sub-networks, each composed of a U-net module, an indicate module, and a max-MSER module. The U-net generates masks, while the indicate module is trained to distinguish the generated masks from the ground truth. The two modules are trained in turns, a process in which the generated masks are supervised to become increasingly similar to the ground truth. The max-MSER module ensures that landmarks can be extracted precisely from the generated masks. We present two professional datasets (DDH and THA) for the first time and evaluate the WaveCSN on them. Our results show that the WaveCSN improves landmark accuracy by at least 2.66 and 4.11 pixels on these two datasets compared to other methods, achieving the state of the art for landmark detection in hip X-ray images.
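The final step — reducing a predicted segmentation mask to a single landmark coordinate — can be illustrated with a minimal sketch. The paper uses a max-MSER module for this; as a simpler stand-in (an assumption, not the paper's algorithm), the sketch below thresholds the mask and takes the centroid of the responding region:

```python
import numpy as np

def mask_to_landmark(mask, threshold=0.5):
    """Reduce a predicted probability mask to one (row, col) landmark.

    Simplified stand-in for WaveCSN's max-MSER extraction: threshold the
    mask and return the centroid of the responding pixels.
    """
    binary = mask >= threshold
    if not binary.any():
        # Fall back to the global maximum if nothing crosses the threshold.
        return np.unravel_index(np.argmax(mask), mask.shape)
    rows, cols = np.nonzero(binary)
    return (rows.mean(), cols.mean())

# A synthetic 5x5 mask with a bright 2x2 blob in the lower-right corner.
mask = np.zeros((5, 5))
mask[3:5, 3:5] = 0.9
print(mask_to_landmark(mask))  # -> (3.5, 3.5)
```

A real MSER-based extractor would instead select the maximally stable extremal region before taking its center.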
Citations: 1
Artistic Text Stylization for Visual-Textual Presentation Synthesis
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3372211
Shuai Yang
In this research, we study the task of visual-textual presentation synthesis, in which artistic text is generated and embedded in a background photo. This art form is widely used in graphic design, such as posters, billboards, and trademarks, and is therefore of high application value. We propose a new framework for the task. First, the shape of the target text is adjusted and its textures are rendered to match a reference style image, generating the artistic text. Next, the layout in which the artistic text is placed is determined by considering both aesthetics and seamlessness. Finally, the artistic text is blended with the background photo to obtain the visual-textual presentation. Experimental results demonstrate the effectiveness of the proposed framework in creating professionally designed visual-textual presentations.
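The final blending step can be sketched as plain alpha compositing; the paper does not specify its blending operator here, so treat this as an illustrative assumption:

```python
import numpy as np

def blend(text_rgb, alpha, background):
    """Composite a rendered artistic-text layer onto a background photo.

    Plain alpha compositing: each pixel mixes the text layer and the
    background according to the text layer's opacity.
    """
    return alpha * text_rgb + (1.0 - alpha) * background

bg = np.full((2, 2, 3), 100.0)     # flat grey background photo
text = np.full((2, 2, 3), 200.0)   # stylized text layer
alpha = np.zeros((2, 2, 1))
alpha[0, 0] = 1.0                  # text covers only the top-left pixel
out = blend(text, alpha, bg)
print(out[0, 0, 0], out[1, 1, 0])  # -> 200.0 100.0
```

Seamlessness in the paper's sense would additionally constrain *where* the text is placed, not just how it is mixed.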
Citations: 2
Gradient Guided Image Deblocking Using Convolutional Neural Networks
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3368258
Cheolkon Jung, Jiawei Feng, Zhu Li
Block-based transform coding by its nature causes blocking artifacts, which severely degrade picture quality, especially at high compression rates. Although convolutional neural networks (CNNs) achieve good performance in image restoration tasks, existing methods focus mainly on deep or efficient network architectures. The gradient of a compressed image differs from the original gradient in that pixel values change dramatically along block boundaries. Motivated by this observation, we propose gradient-guided image deblocking based on CNNs. Guided by the gradient information of the blocky input image, the proposed network preserves textural edges while reducing blocky edges, and thus restores the original clean image from compression degradation. Experimental results demonstrate that the gradient information in the input compressed image contributes to blocking artifact reduction, and that the proposed method achieves a significant improvement in both visual quality and objective measurements.
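The guidance signal itself — the gradient of the blocky input, which spikes at block boundaries — can be computed with simple finite differences. This sketch shows only that guidance computation, not the paper's CNN:

```python
import numpy as np

def image_gradient(img):
    """Horizontal and vertical forward differences of a grayscale image.

    Zero-padded at the right/bottom borders; block boundaries show up as
    large isolated responses, which is what guides the deblocking network.
    """
    gx = np.zeros_like(img, dtype=float)
    gy = np.zeros_like(img, dtype=float)
    gx[:, :-1] = img[:, 1:] - img[:, :-1]
    gy[:-1, :] = img[1:, :] - img[:-1, :]
    return gx, gy

# An 8-pixel row with an abrupt jump at a (simulated) block boundary.
row = np.array([[10, 10, 10, 10, 60, 60, 60, 60]], dtype=float)
gx, gy = image_gradient(row)
print(gx[0])  # large response only at the boundary column
```

In the paper this gradient map is fed to the network alongside the compressed image rather than inspected directly.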
Citations: 0
An LSTM based Rate and Distortion Prediction Method for Low-delay Video Coding
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3366630
Feiyang Liu, G. Cao, Daiqin Yang, Yiyong Zha, Yunfei Zhang, Xin Liu
In this paper, we propose an LSTM-based rate-distortion (R-D) prediction method for low-delay video coding. Unlike traditional rate control algorithms, an LSTM is introduced to learn the latent pattern of the R-D relationship during video coding. Temporal information, hierarchical coding structure information, and the content of the frame to be encoded are used to achieve more accurate prediction. Based on the proposed network, a new R-D model parameter prediction method is proposed and evaluated on the Versatile Video Coding (VVC) test model. Experimental results show that, compared with the state-of-the-art method used in VVC, the proposed method achieves better performance.
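For context, the kind of R-D model whose parameters such a predictor would output can be sketched with the classic hyperbolic form D = C·R^(−K). The paper's exact parametrization is not given in this abstract, so the model choice and the least-squares recovery below are assumptions for illustration:

```python
import numpy as np

def fit_hyperbolic_rd(rates, distortions):
    """Fit the hyperbolic R-D model D = C * R**(-K).

    Recovers (C, K) from sample points by linear least squares in the
    log domain: log D = log C - K * log R.
    """
    A = np.stack([np.ones_like(rates), -np.log(rates)], axis=1)
    coeff, *_ = np.linalg.lstsq(A, np.log(distortions), rcond=None)
    return np.exp(coeff[0]), coeff[1]   # C, K

# Synthetic R-D samples generated from C=100, K=1.5.
R = np.array([0.5, 1.0, 2.0, 4.0])
D = 100.0 * R ** -1.5
C, K = fit_hyperbolic_rd(R, D)
print(round(C, 2), round(K, 2))  # -> 100.0 1.5
```

An LSTM-based predictor would replace this curve fit with a learned mapping from temporal and coding-structure features to the model parameters.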
Citations: 1
IKDMM
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3366607
Zhaoyi Liu, Yuexian Zou
Microphone array beamforming has proven to be an effective method for suppressing adverse interference. Recently, acoustic beamformers that employ neural networks (NN) to estimate the time-frequency (T-F) mask, termed TFMask-BF, have received tremendous attention and have shown great benefit as a front-end for noise-robust Automatic Speech Recognition (ASR). However, our preliminary ASR experiments with TFMask-BF show that a mask model trained on simulated data does not perform well in real environments because of a data mismatch problem. In this study, we adopt the knowledge distillation learning framework to exploit real-recording data together with simulated data during training, reducing the impact of this mismatch. Moreover, we systematically develop a novel iterative knowledge distillation mask model (IKDMM) training scheme. Specifically, two bi-directional long short-term memory (BLSTM) models are designed: a teacher mask model (TMM) and a student mask model (SMM). At each iteration, the TMM is trained with simulated data and then used to generate soft mask labels for both the simulated and the real-recording data. The simulated and real-recording data, together with their generated soft mask labels, form the new training set for the SMM at that iteration. The proposed approach is evaluated as an ASR front-end on the six-channel CHiME-4 corpus. Experimental results show that our IKDMM reduces the data mismatch problem, yielding a 5% relative Word Error Rate (WER) reduction over conventional TFMask-BF on real-recording data under noisy conditions.
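The soft-label generation step of one IKDMM iteration can be sketched as follows. The toy sigmoid "teacher" is a stand-in for the paper's trained BLSTM TMM; only the data-assembly logic (teacher labels both simulated and real data, and the union trains the student) follows the abstract:

```python
import numpy as np

def build_student_data(teacher, simulated, real):
    """One IKDMM-style iteration of soft-label generation.

    The teacher (trained on simulated data) labels BOTH simulated and
    real recordings with soft T-F masks in [0, 1]; the union becomes the
    student's training set for this iteration.
    """
    inputs = np.concatenate([simulated, real], axis=0)
    soft_labels = np.clip(teacher(inputs), 0.0, 1.0)
    return inputs, soft_labels

toy_teacher = lambda x: 1.0 / (1.0 + np.exp(-x))   # stand-in mask model
sim = np.array([[0.0, 2.0]])     # one simulated "frame" of features
real = np.array([[-2.0, 0.0]])   # one real-recording "frame"
X, y = build_student_data(toy_teacher, sim, real)
print(X.shape, y.shape)  # -> (2, 2) (2, 2)
```

In the full scheme, the student trained on (X, y) becomes the next iteration's reference and the teacher is retrained, so the soft labels are regenerated each round.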
Citations: 0
Self-balance Motion and Appearance Model for Multi-object Tracking in UAV
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3366561
Hongyang Yu, Guorong Li, Weigang Zhang, H. Yao, Qingming Huang
Under the tracking-by-detection framework, multi-object tracking methods connect object detections to target trajectories through an association policy. Most methods represent objects by their appearance and motion, and judge associations by fusing appearance similarity with motion consistency. However, the fusion ratio between appearance and motion is often set subjectively. In this paper, we propose a novel self-balance method that fuses appearance similarity and motion consistency. Extensive experimental results on public benchmarks demonstrate the effectiveness of the proposed method in comparison with several state-of-the-art trackers.
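The fusion being balanced can be written as a convex combination of the two cues. The paper's actual balancing mechanism is not detailed in this abstract; the sketch below makes the standard assumption of a single learnable logit squashed to (0, 1), so the weight could be optimized rather than hand-set:

```python
import numpy as np

def fused_score(app_sim, motion_cons, w):
    """Fuse appearance similarity and motion consistency into one score.

    `w` is a logit; sigmoid(w) is the appearance weight, so the balance
    between the two cues is a single trainable scalar instead of a
    subjectively fixed ratio.
    """
    alpha = 1.0 / (1.0 + np.exp(-w))       # self-balancing weight in (0, 1)
    return alpha * app_sim + (1.0 - alpha) * motion_cons

# w = 0 reproduces the naive fixed 50/50 fusion a manual setting might pick.
print(round(fused_score(0.8, 0.4, w=0.0), 3))  # -> 0.6
```

Detections and trajectories would then be matched (e.g. by the Hungarian algorithm) on the fused score matrix.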
Citations: 6
Tumor Tissue Segmentation for Histopathological Images
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3372210
Xiansong Huang, Hong-Ju He, Pengxu Wei, Chi Zhang, Juncen Zhang, Jie Chen
Histopathological image analysis is considered a gold standard for cancer identification and diagnosis. Tumor segmentation for histopathological images is one of the most important research topics, and its performance directly affects doctors' diagnostic judgment of cancer category and stage. With the remarkable development of deep learning, many methods have been proposed for tumor segmentation. However, little research has analyzed the full tumor segmentation pipeline, and few studies have examined hard example mining for tumor segmentation in detail. To bridge this gap, this study first summarizes a specific tumor segmentation pipeline and then explores hard example mining for the task. Finally, experiments evaluating segmentation performance demonstrate the effects of our method and of hard example mining.
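The abstract does not state which hard example mining policy is used, so the sketch below shows the common OHEM-style variant (an assumption): keep only the top fraction of samples by per-sample loss when forming the next training batch:

```python
import numpy as np

def mine_hard_examples(losses, keep_ratio=0.25):
    """Select the hardest training examples by per-sample loss.

    OHEM-style policy: sort by loss descending and keep the top
    `keep_ratio` fraction (at least one sample), whose indices are then
    used to build the next training batch.
    """
    k = max(1, int(len(losses) * keep_ratio))
    return np.argsort(losses)[::-1][:k]

losses = np.array([0.1, 0.9, 0.3, 0.7])
print(mine_hard_examples(losses, keep_ratio=0.5))  # -> [1 3]
```

For segmentation, the same idea is often applied per pixel rather than per image, mining the hardest pixels within each tile.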
Citations: 4
Session details: Poster Session
Pub Date : 2019-12-15 DOI: 10.1145/3379196
Cong Bai
Citations: 0
Multi-scale Features for Weakly Supervised Lesion Detection of Cerebral Hemorrhage with Collaborative Learning
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3372209
Zhiwei Chen, Rongrong Ji, Jipeng Wu, Yunhang Shen
Deep networks have recently been applied to computer-assisted medical diagnosis. The brain is the largest and most complex structure in the central nervous system, and it is likewise complex in medical images such as computed tomography (CT) scans. While reading a CT image, radiologists generally search across the image to find lesions, characterize and measure them, and then describe them in the radiological report. To automate this process, we quantitatively analyze a cerebral hemorrhage dataset and propose a Multi-scale Feature with Collaborative Learning (MFCL) strategy for Weakly Supervised Lesion Detection (WSLD), which both adapts to the characteristics of small-lesion detection and introduces a global constraint classification objective in training. Specifically, a multi-scale feature branch network and a collaborative learning scheme are designed to locate the lesion area. Experimental results demonstrate that the proposed method is effective on the cerebral hemorrhage dataset, on which a new WSLD baseline is established.
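A generic multi-scale descriptor in the spirit of the multi-scale feature branch can be sketched by pooling one feature map at several resolutions and concatenating the results. The actual branch is a CNN; this numpy version (an illustrative assumption) only shows why finer scales preserve the small-lesion signal that coarse pooling averages away:

```python
import numpy as np

def multiscale_features(fmap, scales=(1, 2, 4)):
    """Average-pool a 2-D feature map at several scales and concatenate.

    scale s pools non-overlapping s x s blocks; s=1 keeps full resolution,
    larger s gives increasingly global context.
    """
    feats = []
    for s in scales:
        h, w = fmap.shape[0] // s, fmap.shape[1] // s
        pooled = fmap[:h * s, :w * s].reshape(h, s, w, s).mean(axis=(1, 3))
        feats.append(pooled.ravel())
    return np.concatenate(feats)

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(multiscale_features(fmap).shape)  # -> (21,)  i.e. 16 + 4 + 1 values
```

A detection head consuming this vector sees both the local peak of a small lesion (scale 1) and the global context needed for the classification constraint (scale 4).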
Citations: 0
Feature fusion adversarial learning network for liver lesion classification
Pub Date : 2019-12-15 DOI: 10.1145/3338533.3366577
Peng Chen, Yuqing Song, Deqi Yuan, Zhe Liu
The amount of training data is the key bottleneck to achieving good results in medical image analysis, especially with deep learning. Given small medical training sets, deep learning models often fail to mine useful features and suffer serious over-fitting. In this paper, we propose a clean and effective feature fusion adversarial learning network to mine useful features and relieve over-fitting. First, we train a fully convolutional autoencoder with unsupervised learning to mine useful feature maps from our liver lesion data. These feature maps are then transferred to our adversarial SENet network for liver lesion classification. Experiments on liver lesion classification in CT show an average accuracy of 85.47% compared with the baseline training scheme, demonstrating that the proposed method can mine useful features and relieve over-fitting. It can assist physicians in the early detection and treatment of liver lesions.
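The "SE" in the classifier refers to Squeeze-and-Excitation channel recalibration, which can be sketched on its own (the adversarial training and the specific weight values below are outside this sketch and chosen only for illustration):

```python
import numpy as np

def se_block(fmap, w1, w2):
    """Squeeze-and-Excitation channel reweighting on a (C, H, W) map.

    Squeeze: global average pool per channel. Excitation: a two-layer
    bottleneck (ReLU then sigmoid) produces one gate per channel, which
    rescales that channel of the feature map.
    """
    squeeze = fmap.mean(axis=(1, 2))                 # (C,) channel summary
    hidden = np.maximum(0.0, w1 @ squeeze)           # ReLU bottleneck
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # per-channel gates
    return fmap * scale[:, None, None]

C = 4
fmap = np.ones((C, 2, 2))   # stand-in for transferred autoencoder features
w1 = np.eye(2, C)           # bottleneck from C=4 down to 2 units
w2 = np.zeros((C, 2))       # zero weights -> every gate is sigmoid(0) = 0.5
print(se_block(fmap, w1, w2)[0, 0, 0])  # -> 0.5
```

With trained weights, the gates learn to emphasize the autoencoder channels that discriminate lesion classes and suppress the rest.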
Citations: 4