
2022 7th International Conference on Multimedia and Image Processing: Latest Publications

Effective Speckle reduction and structure enhancement method for retinal OCT image based on VID and Retinex
Pub Date : 2022-01-14 DOI: 10.1145/3517077.3517084
Biyuan Li, Yu Wang, Jun Zhang
Improving image quality is one of the key tasks in Optical Coherence Tomography (OCT) imaging. Low contrast and speckle noise are the two major factors affecting the accuracy of OCT measurement. In this paper, an effective speckle reduction and structure enhancement method is proposed based on variational image decomposition (VID) and multi-scale Retinex (MSR). Specifically, we propose a new variational image decomposition model, BL-G-BM3D, that decomposes the OCT image into a background part, a structure part, and noise. The structure part is then enhanced by MSR, while the background part is used to generate a filter mask via the fuzzy c-means clustering algorithm. Experimental results show that the proposed method performs well in speckle reduction and structure enhancement, achieving better SNR, CNR, and ENL quality metrics and better fine-detail retention than the shearlet transform and BM3D methods.
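The decomposition-then-enhance pipeline above hinges on the multi-scale Retinex step. Below is a minimal MSR sketch using Gaussian surrounds; the BL-G-BM3D decomposition itself is not reproduced, and the scales and weights are illustrative assumptions rather than the paper's settings.

```python
# Minimal multi-scale Retinex (MSR) sketch for enhancing the structure
# component of an OCT image. Scales/weights are illustrative assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter

def multi_scale_retinex(img, sigmas=(15, 80, 250), weights=None):
    """img: 2-D float array in [0, 1]; returns the MSR-enhanced image."""
    img = img.astype(np.float64) + 1e-6              # avoid log(0)
    weights = weights or [1.0 / len(sigmas)] * len(sigmas)
    out = np.zeros_like(img)
    for w, s in zip(weights, sigmas):
        surround = gaussian_filter(img, sigma=s) + 1e-6
        out += w * (np.log(img) - np.log(surround))  # single-scale Retinex
    # rescale to [0, 1] for display
    return (out - out.min()) / (out.max() - out.min() + 1e-12)
```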
Citations: 0
A Novel Screening Framework for Lymph Node Metastasis in Colorectal Cancer Based on Deep Learning Approaches
Pub Date : 2022-01-14 DOI: 10.1145/3517077.3517082
Yeming Liu, Fulong Li, Haitao Yu, Zhiyong Zhang, Huiyan Li, Chunxiao Han
As a diagnostic criterion for cancer, histopathology image analysis is critical for patients' subsequent therapeutic treatment. At present, diagnosis depends mainly on manual inspection, which is imprecise and low in accuracy. To address this problem, we propose a novel screening framework combining image preprocessing and AI approaches for the automatic detection of lymph node metastasis in colorectal cancer. The framework first calculates the Histogram of Oriented Gradients (HOG) and the Gray Level Co-occurrence Matrix (GLCM) of high-resolution digital images converted from pathological sections. Statistical analysis shows that a Support Vector Machine (SVM) can be used to automatically identify cancerous areas. We further introduce a deep learning model, a Convolutional Neural Network (CNN), into our framework, taking the preprocessed images as inputs. The screening results demonstrate that the CNN achieves the highest overlap ratio with manually annotated areas, 93.09%, while the SVM approach achieves an accuracy of 83.75%. The combination of image preprocessing and deep learning can effectively improve the efficiency of lymph node metastasis screening in colorectal cancer and is of great significance for the further development of Computer Aided Diagnosis (CAD) systems.
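As a rough illustration of the HOG/GLCM-plus-SVM branch of this framework, the sketch below extracts both feature types from a grayscale patch and hands them to an SVM. Patch size, GLCM distances and angles, and the RBF kernel are assumptions, not the paper's settings; the function names follow scikit-image 0.19 or later.

```python
# Hedged sketch: HOG + GLCM features per patch, classified with an SVM.
import numpy as np
from skimage.feature import hog, graycomatrix, graycoprops
from sklearn.svm import SVC

def patch_features(patch_u8):
    """patch_u8: 2-D uint8 grayscale patch (e.g., 128x128)."""
    hog_vec = hog(patch_u8, orientations=9, pixels_per_cell=(16, 16),
                  cells_per_block=(2, 2))
    glcm = graycomatrix(patch_u8, distances=[1],
                        angles=[0, np.pi / 2], levels=256, normed=True)
    texture = [graycoprops(glcm, p).mean()
               for p in ("contrast", "homogeneity", "energy", "correlation")]
    return np.concatenate([hog_vec, texture])

# X: stacked feature vectors; y: 1 = cancerous patch, 0 = normal patch
# clf = SVC(kernel="rbf").fit(X, y)
```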
Citations: 1
Feature extraction of Motion-onset visual evoked potential based on CSP and FBCSP
Pub Date : 2022-01-14 DOI: 10.1145/3517077.3517101
Xinglin He, Li Zhao, Tongning Meng, Zhiwen Zhang
Motion-onset visual evoked potential (mVEP) has gradually been applied in brain-computer interface (BCI) systems because of its large amplitude and small inter-subject variability. In this paper, three feature extraction algorithms, namely the downsampling stack-averaging algorithm, common spatial patterns (CSP), and filter bank common spatial patterns (FBCSP), were used to extract mVEP features. The experimental results show that the average classification accuracies of the CSP and FBCSP algorithms in mVEP-BCI are 89.0% and 91.2% respectively, which are 3.8% and 6% higher than that of the downsampling stack-averaging algorithm. This indicates that both the CSP and FBCSP algorithms are suitable for motion-onset visual evoked potential brain-computer interface systems, and that FBCSP is the more effective of the two in the feature extraction stage.
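For concreteness, here is a minimal CSP sketch under the usual two-class formulation (FBCSP applies the same procedure per frequency band and then selects features). The filter count and the small ridge term are illustrative assumptions.

```python
# Minimal common spatial pattern (CSP) sketch for two-class EEG trials.
import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_a, trials_b, n_pairs=3):
    """trials_*: arrays of shape (n_trials, n_channels, n_samples)."""
    def mean_cov(trials):
        return np.mean([np.cov(t) for t in trials], axis=0)
    ca, cb = mean_cov(trials_a), mean_cov(trials_b)
    ridge = 1e-6 * np.eye(ca.shape[0])          # numerical stabilizer
    vals, vecs = eigh(ca, ca + cb + ridge)      # generalized eigenproblem
    order = np.argsort(vals)
    picks = np.concatenate([order[:n_pairs], order[-n_pairs:]])
    return vecs[:, picks].T                     # (2*n_pairs, n_channels)

def csp_features(trial, filters):
    z = filters @ trial                         # spatially filtered trial
    var = z.var(axis=1)
    return np.log(var / var.sum())              # log-variance features
```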
Citations: 0
Fatigue Driving Vigilance Detection Using Convolutional Neural Networks and Scalp EEG Signals
Pub Date : 2022-01-14 DOI: 10.1145/3517077.3517099
Y. Fang, Chunxiao Han, Jing Liu, Fengjuan Guo, Yingmei Qin, Y. Che
Fatigue driving is one of the important causes of traffic accidents. To address this problem, this paper proposes a classification model based on a conventional convolutional neural network (CNN) to distinguish vigilance states. First, the raw electroencephalogram (EEG) signals are converted into two-dimensional spectrograms by the short-time Fourier transform (STFT). Then, the CNN model is used for automatic feature extraction and classification from these spectrograms. Finally, the performance of the trained CNN model is evaluated. The average area under the ROC curve (AUC) was 1, the sensitivity was 91.4%, the average false prediction rate (FPR) was 0.02/h, and the accuracy was as high as 97%. These evaluation results verify the effectiveness of the CNN model.
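The STFT preprocessing step can be sketched as below: each EEG segment becomes a 2-D log-magnitude spectrogram that the CNN takes as input. The sampling rate and window length are assumptions for illustration.

```python
# Sketch of the STFT step that turns an EEG segment into a spectrogram.
import numpy as np
from scipy.signal import stft

fs = 200                                    # assumed sampling rate (Hz)
eeg = np.random.randn(10 * fs)              # placeholder 10-second segment
f, t, Z = stft(eeg, fs=fs, nperseg=fs)      # 1 s windows, 50% overlap
spectrogram = np.log1p(np.abs(Z))           # log-magnitude image for the CNN
print(spectrogram.shape)                    # (freq_bins, time_frames)
```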
Citations: 0
Research on Capsule Leakage Detection Based on Linear Array Camera
Pub Date : 2022-01-14 DOI: 10.1145/3517077.3517094
L. Li, Genghuang Yang, Baoli Wang
The common method for detecting capsule leakage is to place oil-blotting paper on the capsule and observe whether the paper remains clean after a set time. This method is inexpensive but time-consuming. This paper proposes a method based on a linear array camera to detect whether capsule leakage has occurred. First, capsule images are captured by the linear array camera and processed on a computer. Second, the Adaptive Histogram Equalization (AHE) and Sobel Operator (SO) algorithms are used to sharpen the images and highlight the positions of leaking parts. Finally, the leakage positions are determined by comparing the gray-value differences across regions of the images. A large number of experiments show that, for real-time detection, the error rate of capsule leakage detection is reduced from 10% to 1.5% when a line-scan camera captures images of a capsule illuminated by a laser with a wavelength of 638 nm and the images are processed by the above algorithms. Meanwhile, with the same number of comparison experiments, the detection task can be completed seven days earlier. The capsule detection method proposed in this paper can therefore greatly improve accuracy and efficiency.
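The enhancement-and-compare chain can be sketched with scikit-image as below: AHE (here its contrast-limited variant) followed by Sobel gradients, then a per-region mean gray-value comparison. The region grid and threshold are assumptions, not the paper's calibrated values.

```python
# Hedged sketch of AHE + Sobel enhancement and region-wise gray comparison.
import numpy as np
from skimage import exposure, filters

def leak_candidates(img, grid=(8, 8), thresh=0.15):
    """img: 2-D float image in [0, 1] from the line-scan camera."""
    eq = exposure.equalize_adapthist(img)        # adaptive hist. equalization
    edges = filters.sobel(eq)                    # Sobel gradient magnitude
    h, w = edges.shape
    gh, gw = h // grid[0], w // grid[1]
    mean_all = edges.mean()
    flags = np.zeros(grid, dtype=bool)
    for i in range(grid[0]):
        for j in range(grid[1]):
            block = edges[i * gh:(i + 1) * gh, j * gw:(j + 1) * gw]
            flags[i, j] = abs(block.mean() - mean_all) > thresh
    return flags                                 # True = suspected leak region
```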
Citations: 0
Frequency Domain Filtering Based Compressed Sensing Applied on Sparse-angle CT Image Reconstruction
Pub Date : 2022-01-14 DOI: 10.1145/3517077.3517089
Jian Dong, Hao Chen, Xiaoxia Yang
In CT scanning, multi-angle projection data must be acquired through a large number of projection operations, which exposes the scanned individual to a high radiation dose. To address this problem, CT image reconstruction from sparse projection data has been proposed as a new type of solution. Previous research has shown that good-quality reconstructed images can be obtained from sparse projection data by CT reconstruction based on the nonlinear sparsity transformation of compressed sensing. However, the heavy time cost of image reconstruction is a practical problem that urgently needs to be solved. This study optimizes the nonlinear filtering process in the regularization term of the original scheme and proposes a novel method that replaces the original nonlinear filter with a low-pass frequency-domain filter. This strategy effectively exploits the properties of low-pass frequency-domain filtering in image processing: high efficiency and low time complexity for image smoothing. Simulation results show that, in CT image reconstruction with the compressed sensing algorithm, the low-pass frequency-domain filtering of the new scheme greatly reduces the time required to reconstruct sparse projection data while keeping image quality assured.
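The substituted smoothing step can be illustrated as below: a Gaussian low-pass mask applied in the FFT domain, which is what keeps the per-iteration cost low. The cutoff value is an illustrative assumption, and the full iterative compressed-sensing reconstruction loop is not reproduced here.

```python
# Minimal sketch of the low-pass frequency-domain smoothing step.
import numpy as np

def lowpass_fft(img, sigma=0.1):
    """Attenuate high spatial frequencies of a 2-D image (sigma in cycles/px)."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    mask = np.exp(-(fx ** 2 + fy ** 2) / (2 * sigma ** 2))  # Gaussian low-pass
    return np.real(np.fft.ifft2(np.fft.fft2(img) * mask))
```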
Citations: 0
Graph Theoretical Analysis Of Complex Networks In The Alzheimer Brain Using Navie-Bayes Classifier: An EEG And MRI Study
Pub Date : 2022-01-14 DOI: 10.1145/3517077.3517079
Ruofan Wang, Y. Yin, Haodong Wang, Lianshuan Shi
To investigate changes in local brain regions and differences between the functional and structural networks of patients with Alzheimer's disease (AD), coherence-based functional networks and structural networks were constructed from EEG signals and MRI images of AD patients and normal controls, respectively. The brain was then divided into five regions (frontal, parietal, occipital, temporal, and central), and seven network topological features were extracted from each region. One-way ANOVA of these features showed that the EEG and MRI networks of the AD brain yielded consistent results: several features differed significantly, and the two groups differed significantly in the frontal lobe region. To further analyze the abnormal topological changes of brain structural and functional networks, single features and combinations of features from the brain regions were used as input to a Naive Bayes classifier. The classification results showed that, compared with single features, combining EEG and MRI network features significantly improved classification accuracy, with best accuracies of 0.9565 and 0.9621, respectively. This method can effectively distinguish the AD group from the control group and provides effective support for the study of the AD brain.
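A sketch of the feature-then-classify stage is given below: a binarized connectivity matrix yields graph-topology features that feed a Gaussian Naive Bayes classifier. Only three representative features are shown, not the paper's exact seven, and the binarization threshold is an assumption.

```python
# Hedged sketch: graph features from a connectivity matrix + Naive Bayes.
import numpy as np
import networkx as nx
from sklearn.naive_bayes import GaussianNB

def network_features(adj, thresh=0.5):
    """adj: (n_regions, n_regions) coherence/connectivity matrix."""
    a = (adj > thresh).astype(int)
    np.fill_diagonal(a, 0)                      # drop self-connections
    g = nx.from_numpy_array(a)
    return [
        np.mean([d for _, d in g.degree()]),    # mean degree
        nx.average_clustering(g),               # clustering coefficient
        nx.global_efficiency(g),                # global efficiency
    ]

# X = np.array([network_features(a) for a in adjacency_matrices])
# clf = GaussianNB().fit(X, y)                  # y: 1 = AD, 0 = control
```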
Citations: 0
A preliminary study of challenges in extracting purity videos from the AV Speech Benchmark
Pub Date : 2022-01-14 DOI: 10.1145/3517077.3517091
Haoran Yan, Huijun Lu, Dunbo Cai, Tao Hang, Ling Qian
Recently reported deep audiovisual models have shown promising results on the cocktail party problem and are attracting new studies. Audiovisual datasets are an important basis for this research. Here we investigate the AVSpeech dataset [1], a popular dataset released by the Google team for training deep audiovisual models for multi-talker speech separation. Our goal is to derive a special kind of video, called purity video, from the dataset: a purity video clip contains continuous image frames of the same person's face over a span of time. A natural question is how to extract as many purity videos as possible from the AVSpeech dataset. This paper presents the tools and methods we used, the problems we encountered, and the purity videos we obtained. Our main contributions are as follows: 1) We propose a solution for extracting a derived subset of the AVSpeech dataset that is of high quality and larger than the existing publicly available training sets. 2) We implemented this solution in experiments on the AVSpeech dataset and obtained insightful results. 3) We also evaluated our solution on our manually labeled dataset, called VTData. Experiments show that our solution is effective and robust. We hope this work helps the community exploit the AVSpeech dataset for other video understanding tasks.
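The core purity criterion can be illustrated with the sketch below: keep maximal frame runs in which exactly one face is detected. This uses an off-the-shelf OpenCV Haar cascade as a stand-in detector and ignores identity tracking, so it is a simplification of the pipeline described in the paper.

```python
# Hedged sketch: find frame runs with exactly one detected face.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def purity_runs(video_path, min_len=25):
    cap, runs, start, idx = cv2.VideoCapture(video_path), [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = cascade.detectMultiScale(gray, 1.1, 5)
        if len(faces) == 1:
            start = idx if start is None else start
        else:
            if start is not None and idx - start >= min_len:
                runs.append((start, idx))        # [start, end) frame span
            start = None
        idx += 1
    if start is not None and idx - start >= min_len:
        runs.append((start, idx))                # flush trailing run
    cap.release()
    return runs
```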
Citations: 0
Multi-Focus Image Fusion Based on Improved CNN
Pub Date : 2022-01-14 DOI: 10.1145/3517077.3517093
Lixia Zhang
To avoid the limitations of hand-crafted feature extraction, a CNN model is adopted to extract image features through big-data-driven adaptive learning, which improves feature accuracy. To avoid the loss of spatial information, an improved CNN model based on up-sampling is proposed, consisting of six stacked small-convolution layers. The multi-layer design not only expands the receptive field but also reduces the number of training parameters and improves running speed. A fusion method based on the improved CNN model is proposed for multi-focus images. The improved CNN model divides the input image into focused and unfocused regions and forms a decision map. Following the decision map optimized by GFF, the focused regions are integrated with a pixel-by-pixel weighted fusion strategy to obtain the fused image. Experimental results show that the fusion results of the proposed method are clear in detail, complete in structure, free of contrast distortion, and free of artifacts. The method effectively avoids grayscale discontinuity, artifacts, and other problems, and outperforms the classical methods we selected.
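The final pixel-by-pixel weighted fusion can be sketched as below. Since the CNN-derived decision map is not reproduced here, a simple Laplacian focus measure stands in for it as an assumption; the fusion arithmetic itself is the same weighted blend.

```python
# Minimal sketch of decision-map-weighted multi-focus fusion.
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def fuse(img_a, img_b, win=9):
    """img_a, img_b: registered 2-D float images of the same scene."""
    fa = uniform_filter(laplace(img_a) ** 2, win)   # local focus energy A
    fb = uniform_filter(laplace(img_b) ** 2, win)   # local focus energy B
    d = fa / (fa + fb + 1e-12)                      # soft decision map in [0,1]
    return d * img_a + (1 - d) * img_b              # pixel-wise weighted blend
```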
Citations: 0
Establishment of Speaker Recognition Corpus for Intelligent Attendance System
Pub Date : 2022-01-14 DOI: 10.1145/3517077.3517118
Shuxi Chen, Yiyang Sun
With the rapid development of information technology, student attendance has shifted from paper-based to machine-based methods such as taking photos, scanning QR codes, and positioning. These methods either require turning on the camera to take photos, which is somewhat inefficient, or turning on positioning services, which many people consider an infringement of personal privacy. We therefore need a more efficient attendance method that does not infringe on personal privacy. Voice, a signal that can be acquired quickly and carries a variety of information, can be used to take class attendance. A speaker recognition corpus is the basis of speaker recognition research, and a diversified, large-scale, high-quality corpus plays an important role in improving the performance of speaker recognition systems. At present, although many standardized corpora exist, few target student attendance scenarios. This work therefore studies the speaker's speech feature parameters and selects appropriate Chinese phrases to build a speaker corpus.
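As one plausible instance of the speech feature parameters mentioned above, the sketch below extracts MFCCs from an utterance; the file name, sampling rate, and coefficient count are illustrative assumptions.

```python
# Hedged sketch: MFCC extraction for a speaker-recognition corpus entry.
import librosa

y, sr = librosa.load("utterance.wav", sr=16000)      # hypothetical recording
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # shape: (13, n_frames)
print(mfcc.shape)
```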
Citations: 0