Pub Date: 2022-12-06 | DOI: 10.1109/BIBM55620.2022.9994950
Manna Xiao, Hulin Kuang, Jin Liu, Yan Zhang, Yizhen Xiang, Jianxin Wang
Resting-state functional magnetic resonance imaging (rs-fMRI) has been widely used for the diagnosis of schizophrenia. With rs-fMRI, most existing schizophrenia diagnostic methods have revealed the functional abnormalities of schizophrenia at three scales: regional neural activity alterations, functional connectivity abnormalities, and brain network dysfunctions. However, many schizophrenia diagnosis methods do not consider fusing features from all three scales. In this study, we propose a schizophrenia diagnostic method based on multi-scale feature representation and ensemble learning. First, features at the three scales (region, connectivity, and network) are extracted from rs-fMRI images using the Brainnetome atlas. For each scale, feature selection with the least absolute shrinkage and selection operator (LASSO), tuned by a grid search, identifies effective sub-features related to schizophrenia classification. The selected sub-features of each scale are then input to a linear-kernel support vector machine to classify schizophrenia patients and healthy controls. To further improve diagnostic performance, an ensemble learning framework named E-RCN averages the probabilities obtained by the classifiers of the three scales at the decision level. Under leave-one-out cross-validation on the Center for Biomedical Research Excellence (COBRE) dataset, the proposed method achieves encouraging diagnostic performance, outperforming several state-of-the-art methods. In addition, ranking brain regions by their occurrence frequency across the leave-one-out cross-validation experiments identifies regions related to schizophrenia, including the thalamus and middle temporal gyrus, as well as important subregions, i.e., Tha_L_8_8, MTG_L_4_4, and MTG_R_4_4.
{"title":"Integrating Multi-scale Feature Representation and Ensemble Learning for Schizophrenia Diagnosis","authors":"Manna Xiao, Hulin Kuang, Jin Liu, Yan Zhang, Yizhen Xiang, Jianxin Wang","doi":"10.1109/BIBM55620.2022.9994950","DOIUrl":"https://doi.org/10.1109/BIBM55620.2022.9994950","url":null,"abstract":"Resting-state functional magnetic resonance imaging (rs-fMRI) images have been widely used for diagnosis of schizophrenia. With rs-fMRI, most existing schizophrenia diagnostic methods have revealed schizophrenia’s functional abnormalities from the following three scales, i.e., regional neural activity alterations, functional connectivity abnormalities and brain network dysfunctions. However, many schizophrenia diagnosis methods do not consider the fusion of features from the three scales. In this study, we propose a schizophrenia diagnostic method based on multi-scale feature representation and ensemble learning. Firstly, features including the three scales (region, connectivity and network) are extracted from rs-fMRI images using the brainnetome atlas. For each scale, feature selection, i.e., least absolute shrinkage and selection operator, is applied to identify effective sub-features related to schizophrenia classification by a grid search. Then the selected sub-features of each scale are input to support vector machine with linear kernel to classify schizophrenia patients and healthy controls respectively. To further improve the schizophrenia diagnostic performance, an ensemble learning framework named E-RCN is proposed to average the probabilities obtained by the classifiers of each scale in decision level. By leave-one-out cross-validation on the center for biomedical research excellence dataset (COBRE), our proposed method achieves encouraging diagnosis performance, outperforming several state-of-the-art methods. 
In addition, ranked by the occurence frequency of each brain region within the leave-one-out cross-validation experiments, some brain regions related to schizophrenia, i.e., thalamus and middle temporal gyrus, and important elaborate subregions, i.e., Tha_L_8_8, MTG_L_4_4 and MTG_R_4_4, are found.","PeriodicalId":210337,"journal":{"name":"2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117268811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-12-06 | DOI: 10.1109/BIBM55620.2022.9994999
Xu Zhao, Yuxin Kang, Hansheng Li, Jiayu Luo, Lei Cui, Jun Feng, Lin Yang
Deep convolutional neural networks (DCNNs) have significantly improved the performance of medical image segmentation. Nevertheless, medical images frequently exhibit distribution discrepancies, so trained models fail to remain robust when applied to unseen clinical data. To address this problem, domain generalization methods have been proposed to enhance the generalization ability of DCNNs. Feature-space data augmentation methods have proven effective for domain generalization. However, existing methods still mainly rely on prior knowledge or assumptions, which limits their ability to enrich the diversity of source domain data. In this paper, we propose a random feature augmentation (RFA) method to diversify source domain data at the feature level without prior knowledge. Specifically, we explore the effectiveness of random convolution at the feature level for the first time and show experimentally that it can adequately preserve domain-invariant information while perturbing domain-specific information. Furthermore, to capture the same domain-invariant information from the augmented features of RFA, we present a domain-invariant consistent learning strategy that enables DCNNs to learn a more generalized representation. Our proposed method achieves state-of-the-art performance on two medical image segmentation tasks: optic cup/disc segmentation on fundus images and prostate segmentation on MRI images.
{"title":"A Random Feature Augmentation for Domain Generalization in Medical Image Segmentation","authors":"Xu Zhao, Yuxin Kang, Hansheng Li, Jiayu Luo, Lei Cui, Jun Feng, Lin Yang","doi":"10.1109/BIBM55620.2022.9994999","DOIUrl":"https://doi.org/10.1109/BIBM55620.2022.9994999","url":null,"abstract":"Deep convolutional neural networks (DCNNs) significantly improve the performance of medical image segmentation. Nevertheless, medical images frequently experience distribution discrepancies, which fails to maintain their robustness when applying trained models to unseen clinical data. To address this problem, domain generalization methods were proposed to enhance the generalization ability of DCNNs. Feature space-based data augmentation methods have proven their effectiveness to improve domain generalization. However, existing methods still mainly rely on certain prior knowledge or assumption, which has limitations in enriching the diversity of source domain data. In this paper, we propose a random feature augmentation (RFA) method to diversify source domain data at the feature level without prior knowledge. Specifically, we explore the effectiveness of random convolution at the feature level for the first time and prove experimentallyt hat itc an adequately preserve domain-invariant information while perturbing domainspecific information. Furthermore, tocapture the same domain-invariant information from the augmented features of RFA, we present a domain-invariant consistent learning strategy to enable DCNNs to learn a more generalized representation. 
Our proposed method achieves state-of-the-art performance on two medical image segmentation tasks, including optic cup/disc segmentation on fundus images and prostate segmentation on MRI images.","PeriodicalId":210337,"journal":{"name":"2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123223379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
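The core idea — perturbing intermediate features with a freshly sampled random convolution — can be sketched roughly as below. Note the assumptions: the paper uses random convolution generically, while this sketch uses a random 1x1 channel-mixing matrix as a stand-in, and the `strength` interpolation parameter is mine, not the paper's.

```python
import numpy as np

def random_feature_augment(feats, rng, strength=0.5):
    """Perturb intermediate feature maps with a freshly sampled random
    1x1 convolution (a random channel-mixing matrix). feats: (C, H, W).
    Interpolating with the original features keeps content recognizable."""
    c = feats.shape[0]
    w = rng.standard_normal((c, c)) / np.sqrt(c)   # random kernel, variance-scaled
    mixed = np.einsum('oc,chw->ohw', w, feats)     # mix channels at every pixel
    return (1 - strength) * feats + strength * mixed

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))      # toy feature map: 8 channels, 4x4 spatial
x_aug = random_feature_augment(x, rng)
print(x_aug.shape)  # (8, 4, 4)
```

In a consistency-learning setup, the network would be asked to produce the same prediction for `x` and `x_aug`, pushing it toward domain-invariant features.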
Pub Date: 2022-12-06 | DOI: 10.1109/BIBM55620.2022.9995079
Lu Zhao, Liming Yuan, Zhenliang Li, Xianbin Wen
Multi-Instance Learning (MIL) is a weakly supervised learning paradigm in which every training example is a labeled bag of unlabeled instances. In typical MIL applications, instances describe the features of regions or parts of a whole object, e.g., regional patches/lesions in an eye-fundus image. However, for a (semantically) complex part, the standard MIL formulation puts a heavy burden on the representation ability of the corresponding instance. To alleviate this pressure, we still represent each example as a bag of instances, but extract from each instance a set of representations using $1\times 1$ convolutions. The advantages of this tactic are two-fold: i) this set of representations can be regarded as multi-view representations of an instance; ii) compared to building multi-view representations directly from scratch, extracting them automatically using $1\times 1$ convolutions is more economical, and may be more effective since $1\times 1$ convolutions can be embedded into the whole network. Furthermore, we apply two consecutive multi-instance pooling operations on the reconstituted bag, which has effectively become a bag of sets of multi-view representations. We have conducted extensive experiments on several canonical MIL datasets from different application domains. The experimental results show that the proposed framework outperforms the standard MIL formulation in terms of classification performance and has good interpretability.
{"title":"Multi-View Representation Learning for Multi-Instance Learning with Applications to Medical Image Classification","authors":"Lu Zhao, Liming Yuan, Zhenliang Li, Xianbin Wen","doi":"10.1109/BIBM55620.2022.9995079","DOIUrl":"https://doi.org/10.1109/BIBM55620.2022.9995079","url":null,"abstract":"Multi-Instance Learning (MIL) is a weakly supervised learning paradigm, in which every training example is a labeled bag of unlabeled instances. In typical MIL applications, instances are often used for describing the features of regions/parts in a whole object, e.g., regional patches/lesions in an eye-fundus image. However, for a (semantically) complex part the standard MIL formulation puts a heavy burden on the representation ability of the corresponding instance. To alleviate this pressure, we still adopt a bag-of-instances as an example in this paper, but extract from each instance a set of representations using $1 times1$ convolutions. The advantages of this tactic are two-fold: i) This set of representations can be regarded as multi-view representations for an instance; ii) Compared to building multi-view representations directly from scratch, extracting them automatically using $1 times1$ convolutions is more economical, and may be more effective since $1 times1$ convolutions can be embedded into the whole network. Furthermore, we apply two consecutive multi-instance pooling operations on the reconstituted bag that has actually become a bag of sets of multi-view representations. We have conducted extensive experiments on several canonical MIL data sets from different application domains. 
The experimental results show that the proposed framework outperforms the standard MIL formulation in terms of classification performance and has good interpretability.","PeriodicalId":210337,"journal":{"name":"2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121965445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
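The two consecutive pooling steps over a bag of multi-view instances can be sketched as below. This is a schematic, not the paper's implementation: the per-view linear maps stand in for $1\times 1$ convolutions, and max-pooling is one plausible choice of multi-instance pooling operator (the paper does not fix it here).

```python
import numpy as np

def bag_representation(instances, view_weights):
    """instances: (n, d) bag of n instances; view_weights: (V, d, d') --
    one 1x1-conv-style linear map per view. Two consecutive max-pooling
    steps: first over views, then over instances, yielding a bag vector."""
    views = np.einsum('nd,vde->nve', instances, view_weights)  # (n, V, d')
    per_instance = views.max(axis=1)   # pool the V views of each instance
    return per_instance.max(axis=0)    # pool over instances -> (d',)

rng = np.random.default_rng(1)
bag = rng.standard_normal((5, 16))    # 5 instances, 16-dim features
W = rng.standard_normal((3, 16, 8))   # 3 views, projected to 8 dims
z = bag_representation(bag, W)
print(z.shape)  # (8,)
```

Because the view maps are linear, they can sit inside the network and be trained end-to-end, which is exactly the economy argument made above.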
Pub Date: 2022-12-06 | DOI: 10.1109/BIBM55620.2022.9995366
Jianyu Shi, Xiaohong Liu, Guoxing Yang, Guangyu Wang
Computed tomography (CT) is one of the most widely used imaging methods for locating lesions such as nodules, tumors, and cysts, and for making a primary diagnosis. To image anatomy or lesions more clearly, contrast-enhanced CT (CECT) scans are acquired by injecting a contrast agent into the patient during the examination. However, iodine contrast injection has restrictions, so CECT is not as convenient as non-contrast-enhanced CT (NECT). Recently, deep learning models have produced impressive results in computer vision, including image translation. We therefore apply image translation methods to generate CECT images from the more accessible NECT images and evaluate the effect of the generated images on detection tasks. In this study, we propose a cross-modal enhancement training strategy for thyroid anatomy detection, which employs CycleGAN to translate non-contrast-enhanced CT images into enhanced-CT-style images with content preserved. The experiments are conducted on thyroid CT images with anatomy object annotations. The results show that adding translated images to the training dataset effectively improves thyroid anatomy detection. We achieve a best mAP of 82.5%, compared to 73.2% when training on non-contrast-enhanced CT alone.
{"title":"Enhanced CT Image Generation by GAN for Improving Thyroid Anatomy Detection","authors":"Jianyu Shi, Xiaohong Liu, Guoxing Yang, Guangyu Wang","doi":"10.1109/BIBM55620.2022.9995366","DOIUrl":"https://doi.org/10.1109/BIBM55620.2022.9995366","url":null,"abstract":"Computed tomography (CT) is one of the most imaging methods widely used to locate lesions such as nodules, tumors, and cysts, and make primary diagnosis. For clearer imaging of anatomical or lesions, contrast-enhanced CT (CECT) scans are imaging with injecting a contrast agent into a patient during examination. But there are limits to iodine contrast injections so that CECT scans are not convenient like non-contrast enhanced CT (NECT). Recently, deep learning models bring impressive results in computer vision, including image translation. So, we would like to apply image translation methods to generate CECT images from the more accessible NECT images, and evaluate the effects of generated images on image detection tasks. In this study, we propose a method called cross-modal enhancement training strategy for thyroid anatomy detection, which employs CycleGAN to translate non-constrast enhanced CT images to enhanced CT style images with content reserved. The experiments are conducted on thyroid CT images with anatomy object annotation. The experimental results show that by adding translated images into the training dataset, the performance of thyroid anatomy detection can be effectively improved. 
We achieve the best mAP of 82.5% compared to 73.2% in the along non-contrast enhanced CT training.","PeriodicalId":210337,"journal":{"name":"2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116991912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
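The augmentation side of the training strategy — extending the detector's NECT training pool with CECT-style translations that share the original annotations — can be sketched as follows. Everything named here is hypothetical scaffolding: `build_training_set`, `mix_ratio`, and the placeholder generator are illustrations, not the paper's API.

```python
import numpy as np

def build_training_set(nect_images, translator, mix_ratio=1.0):
    """Cross-modal enhancement (sketch): extend the NECT training pool with
    CECT-style translations of a fraction of its images. `translator` stands
    in for the trained CycleGAN NECT->CECT generator; because the translation
    preserves content, the original anatomy annotations carry over unchanged."""
    n_extra = int(len(nect_images) * mix_ratio)
    translated = [translator(img) for img in nect_images[:n_extra]]
    return nect_images + translated

# Placeholder generator: a trivial intensity shift standing in for CycleGAN.
fake_gan = lambda img: img * 1.2 + 0.1
imgs = [np.zeros((4, 4)) for _ in range(10)]
train = build_training_set(imgs, fake_gan, mix_ratio=0.5)
print(len(train))  # 15
```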
Pub Date: 2022-12-06 | DOI: 10.1109/BIBM55620.2022.9995466
Peng Zhang, Shikui Tu
Predicting the synergistic effects of drug combinations can accelerate the identification of novel potential combination therapies for clinical studies. Although extensive efforts have been made in the field, the problem remains challenging due to the high sparsity of drug combination synergy data and the existence of false positive combinations resulting from experimental noise. In this paper, we develop a Knowledge Graph Embedding-based method for predicting the synergistic effects of Drug Combinations, namely KGE-DC, which fully extracts the features of drug combinations. First, a large-scale knowledge graph including drugs, targets, enzymes, and transporters is constructed, which reduces the sparsity of the drug combination data and increases its reliability. Then, knowledge graph embeddings, which are capable of capturing the complex semantic information of the entities in the knowledge graph, are adopted to learn low-dimensional representations of the drugs and cell lines. Finally, the synergy scores of drug combinations are predicted from the drug and cell line embeddings. Extensive experiments on a benchmark dataset with four different synergy types demonstrate that KGE-DC outperforms state-of-the-art methods on both the regression task (predicting the synergy scores of drug combinations) and the classification task (predicting whether drug combinations are synergistic). Our results indicate that KGE-DC is a valuable tool for facilitating the discovery of novel combination therapies for cancer treatment. The implemented code and experimental dataset are available online at https://github.com/yushenshashen/KGE-DC.
{"title":"A knowledge graph embedding-based method for predicting the synergistic effects of drug combinations","authors":"Peng Zhang, Shikui Tu","doi":"10.1109/BIBM55620.2022.9995466","DOIUrl":"https://doi.org/10.1109/BIBM55620.2022.9995466","url":null,"abstract":"Predicting the synergistic effects of drug combinations can accelerate the identification process of novel potential combination therapies for clinical studies. Although extensive efforts have been made in the field, the problem is still challenging due to the high sparsity of drug combinations’ synergy data and the existence of false positive combinations resulted from the noise in experiments. In this paper, we develop a Knowledge Graph Embedding-based method for predicting the synergistic effects of Drug Combinations, namely KGE-DC, which fully extracts the features of drug combinations. Firstly, a largescale knowledge graph including drugs, targets, enzymes and transporters is constructed, therefore, the sparsity of the drug combinations’ data is reduced and the reliability of the data is increased. Then, knowledge graph embedding, which are capable of capturing complex semantic information of various entities in the knowledge graph, is adopted for learning low-dimensional representations for the drugs and cell lines. Finally, the synergy scores of drug combinations are predicted based on the drug and cell line embeddings of the drug combinations’ synergy data. Extensive experiments on benchmark dataset with four different synergy types demonstrate that KGE-DC outperforms state-of the-art methods on both the regression and classification tasks, namely predicting the synergy scores of drug combinations and predicting whether the drug combinations are synergistic combinations. Our results indicate that KGE-DC is a valuable tool to facilitate the discovery of novel combination therapies for cancer treatment. 
The implemented code and experimental dataset are available online at https://github.com/yushenshashen/KGE-DC.","PeriodicalId":210337,"journal":{"name":"2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124729181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
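The abstract does not specify which embedding model KGE-DC uses, so as a generic illustration of knowledge-graph-embedding scoring, here is a TransE-style triple scorer: entities and relations live in the same vector space, and a triple (head, relation, tail) is plausible when head + relation lands near tail.

```python
import numpy as np

def transe_score(h, r, t):
    """TransE-style plausibility score for a (head, relation, tail) triple:
    a smaller ||h + r - t|| (here, negated L1 norm) means a more plausible
    fact, e.g. (drug, targets, protein)."""
    return -np.linalg.norm(h + r - t, ord=1)

rng = np.random.default_rng(2)
dim = 32
drug = rng.standard_normal(dim)
targets_rel = rng.standard_normal(dim)
# A "true" tail sits almost exactly at drug + targets_rel; a random entity does not.
true_target = drug + targets_rel + 0.01 * rng.standard_normal(dim)
rand_entity = rng.standard_normal(dim)
print(transe_score(drug, targets_rel, true_target) >
      transe_score(drug, targets_rel, rand_entity))  # True
```

Embeddings trained this way give every drug and cell line a dense vector that a downstream regressor can consume to predict synergy scores.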
Pub Date: 2022-12-06 | DOI: 10.1109/BIBM55620.2022.9995528
Yupei Zhang, Yunan Xu, Rui An, Yuxin Li, Shuhui Liu, Xuequn Shang
This paper proposes a representation learning model that identifies task-state fMRIs for knowledge-concept recognition, which has the potential to model the human cognitive expression system. A CNN-LSTM is traditionally employed to learn deep features from fMRIs, where the CNN extracts the spatial structure and the LSTM accounts for the temporal structure. However, the manifold smoothness of the latent features induced by the fMRI sequence is often ignored, leading to unsteady data representations. In this paper, we model the latent features as a hidden Markov chain and introduce a Markov-guided Spatio-Temporal Network (MSTNet) for brain image representation. Concretely, MSTNet has three parts: a CNN that learns latent features from 3D fMRI frames, where a Markov regularization encourages neighboring frames to have similar features; an LSTM that integrates all frames of an fMRI sequence into a feature vector; and a fully connected network (FCN) that performs the brain image classification. The model is trained to minimize the cross-entropy (CE) loss. Our experiments are conducted on brain fMRI datasets acquired by scanning college students while they learned five concepts of computer science. The results show that the proposed MSTNet benefits from the introduced Markov regularization and thus improves brain activity classification. This study not only presents an effective fMRI classification model with Markov regularization but also offers the potential to understand brain intelligence and help patients with language disabilities.
{"title":"Markov Guided Spatio-Temporal Networks for Brain Image Classification*","authors":"Yupei Zhang, Yunan Xu, Rui An, Yuxin Li, Shuhui Liu, Xuequn Shang","doi":"10.1109/BIBM55620.2022.9995528","DOIUrl":"https://doi.org/10.1109/BIBM55620.2022.9995528","url":null,"abstract":"This paper proposes a representation learning model to identify task-state fMRIs for knowledge-concept recognition, which has the potential to model the human cognitive expression system. The traditional CNN-LSTM is usually employed to learn deep features from fMRIs, where CNN aims at extracting the spatial structure and LSTM accounts for the temporal structure. However, the manifold smoothness of the latent features caused by the fMRI sequence is often ignored, leading to unsteady data representation. In this paper, we model latent features as a hidden Markov chain and introduce a Markov-guided Spatio-Temporal Network (MSTNet) for brain image representation. Concretely, MSTNet has three parts: CNN that aims to learn latent features from 3D fMRI frames where a Markov Regularization enforces the neighborhood frames to have similar features, LSTM integrates all frames of an fMRI sequence into a feature vector and fully connected network (FCN) that is to implement the brain image classification. Our model is trained towards minimizing the cross entropy (CE) loss. Our experiment is conducted on the brain fMRI datasets achieved by scanning college students when they were learning five concepts of computer science. The results show that the proposed MSTNet can benefit from the introduced Markov regularization and thus result in improved performance on the brain activity classification. 
This study not only shows an effective fMRI classification model with Markov regularization but also provides the potential to understand brain intelligence and help patients with language disabilities.","PeriodicalId":210337,"journal":{"name":"2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124730893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
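The Markov regularization described above amounts to a smoothness penalty on consecutive frame features, added to the CE loss. A minimal sketch, assuming a squared-difference penalty and a weight lambda (the abstract does not give the exact form):

```python
import numpy as np

def markov_regularizer(latent_seq):
    """Smoothness penalty on per-frame latent features: under the Markov
    chain view, neighboring fMRI frames should have similar features.
    latent_seq: (T, d) array of CNN features, one row per frame.
    Total training loss would be CE + lambda * this term."""
    diffs = latent_seq[1:] - latent_seq[:-1]
    return float((diffs ** 2).sum() / max(len(latent_seq) - 1, 1))

# A smoothly drifting sequence is penalized far less than a jumpy one.
smooth = np.linspace(0, 1, 10)[:, None] * np.ones((1, 8))
jumpy = np.random.default_rng(3).standard_normal((10, 8))
print(markov_regularizer(smooth) < markov_regularizer(jumpy))  # True
```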
Pub Date: 2022-12-06 | DOI: 10.1109/BIBM55620.2022.9995126
F. Milicchio, Marco Oliva, Mattia C. F. Prosperi
Advances in next-generation sequencing (NGS) have not only increased the overall throughput of genomic content (e.g., Illumina NovaSeq, up to 6,000 GB) but also miniaturized the technology (e.g., Oxford Nanopore MinION), enabling real-time, mobile experiments. Single Instruction/Multiple Data (SIMD) hardware acceleration is increasingly used to improve the performance of NGS data processing tools, while generic template programming libraries make it easier to adapt to rapid changes in sequencing and computing platforms. Here we present a novel k-mer parser written in ISO C++ that exploits an interleaved, non-sequential, hardware-accelerated SIMD implementation within a generic programming framework called libseq. We benchmarked our k-mer parser on different NGS experimental datasets, comparing it with two other popular k-mer counting tools (DSK and KMC3). On an Intel machine with AVX2 (quad-core Intel Core i5 CPU, 32 GB RAM), using simulated in-memory reads, DSK and KMC3 were on average 3.6x and 1.03x slower than our parser across k values of 35-63. On real sequencing experiments, DSK and KMC3 were on average 8.3x and 28.8x slower than ours in file/read parsing and k-mer building. Since our tool uses generic programming, other methods that rely on k-mers (e.g., de Bruijn graphs) can directly benefit from its SIMD acceleration. Our k-mer parser and libseq 2.0 are released under the BSD license and available at https://zenodo.org/record/7015294.
{"title":"An interleaved hardware-accelerated k-mer parser","authors":"F. Milicchio, Marco Oliva, Mattia C. F. Prosperi","doi":"10.1109/BIBM55620.2022.9995126","DOIUrl":"https://doi.org/10.1109/BIBM55620.2022.9995126","url":null,"abstract":"Advances in next-generation sequencing (NGS) have not only increased the overall throughput of genomic content (e.g. Illumina NovaSeq up to 6, 000GB), but also provided technology miniaturization (e.g. Oxford Nanopore MinION) enabling real-time, mobile experiments. Single Instruction/Multiple Data (SIMD) hardware acceleration is increasingly used to improve performance of NGS data processing tools, while generic template programming libraries are advantageous to adapt to the fast changes in sequencing and computing platforms. We here present a novel k-mer parser written in ISO C++ that exploits an interleaved, non-sequential, hardware accelerated SIMD implementation within a generic programming framework called libseq. We benchmarked our k-mer parser using different NGS experimental datasets comparing with other two popular k-mer counting tools (DSK and KMC3). On an Intel machine with AVX2 (Quad-Core Intel Core i5 CPU, 32 GB RAM), using simulated in-memory reads, DSK and KMC3 were on average 3. 6x and 1. 03x times slower than our parser across k value ranges of 35-63. On real sequencing experiments, DSK and KMC3 were on average 8. 3x and 28. 8x times slower in file/read parsing and k-mer building than ours. Since our tool uses generic programming, other methods that rely on k-mers (e.g. de Bruijn graphs) can directly benefit from its SIMD acceleration. 
Our k-mer parser and libseq 2.0 are released under the BSD license and available at https://zenodo.org/record/7015294.","PeriodicalId":210337,"journal":{"name":"2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129419395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
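For readers unfamiliar with k-mer parsing, the scalar logic that such tools vectorize is a rolling 2-bit encoding: each base appends 2 bits and a mask drops the oldest base. The sketch below shows only this scalar logic in Python, not the paper's interleaved SIMD C++ implementation.

```python
from collections import Counter

ENC = {'A': 0, 'C': 1, 'G': 2, 'T': 3}

def count_kmers(read, k):
    """Rolling 2-bit k-mer parser: shift in 2 bits per base, mask off the
    oldest base, and count each k-mer's integer code once the window fills."""
    mask = (1 << (2 * k)) - 1
    counts, code = Counter(), 0
    for i, base in enumerate(read):
        code = ((code << 2) | ENC[base]) & mask
        if i >= k - 1:
            counts[code] += 1
    return counts

c = count_kmers("ACGTACGT", 4)
print(sum(c.values()))  # 5 k-mers in an 8-base read
acgt = (0 << 6) | (1 << 4) | (2 << 2) | 3  # integer code for "ACGT"
print(c[acgt])  # ACGT occurs twice -> 2
```

The SIMD version processes many such rolling windows in parallel lanes; the per-window arithmetic is exactly the shift-or-mask above.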
Pub Date: 2022-12-06 | DOI: 10.1109/BIBM55620.2022.10113358
Lin Xi, Xiangyang Yuan, Jing Liu, X. Tang
With the aim of screening prognostic genes for breast cancer (BRCA) and exploring the possible mechanisms and clinical value of these genes in the growth and regression stages of the disease, we study gene expression data from the public Gene Expression Omnibus (GEO) dataset GSE22820 and The Cancer Genome Atlas (TCGA). To obtain high-confidence gene candidates for BRCA, we present a hybrid gene and module analysis pipeline that strategically combines data mining on the different datasets. Ultimately, four gene candidates, i.e., PLIN1, GPD1, LIPE, and CHRDL1, are targeted for BRCA. Afterwards, Kaplan-Meier survival analysis performed on these genes for verification reveals that the overall survival time of patients with low expression of these genes is shorter than that of patients with high expression (P < 0.05). Moreover, to study the role of these genes in the mechanisms and functionality related to cytoplasmic lipid metabolism, functional enrichment and pathway analysis are implemented. The results indicate that the expression of the four discovered genes plays an adverse role in BRCA development, and that they could serve as effective biomarkers for predicting the formation and progression of BRCA.
{"title":"Four potential prognostic markers for breast cancer identified by hybrid gene and module expression analysis","authors":"Lin Xi, Xiangyang Yuan, Jing Liu, X. Tang","doi":"10.1109/BIBM55620.2022.10113358","DOIUrl":"https://doi.org/10.1109/BIBM55620.2022.10113358","url":null,"abstract":"With the aim of screening the prognostic genes for breast cancer (BRCA) and exploring the possible mechanism and clinical value of these genes in the growth and regression stage of disease, we study the genes in the public gene expression omnibus (GEO) GSE22820 and the cancer genome atlas (TCGA). To achieve high-confidence gene candidates for BRCA, we present a hybrid gene and module analysis pipeline that strategically considers data mining on different datasets. Ultimately, four gene candidates, i.e., PLIN1, GPD1, LIPE, and CHRDL1, are targeted for BRCA. Afterwards, Kaplan-Meier survival analysis is performed on these genes for verification, revealing that the overall survival time of patients with low expression of these genes was shorter than that of patients with high expression (with P<0.05). Moreover, in order to study the role of these genes in the mechanisms and functionality related to cytoplasmic lipid metabolism, functional enrichment and pathway analysis are implemented. 
The results indicate that the expression of the four discovered genes plays an adverse role in BRCA development and could serve as effective biomarkers for predicting the formation and progression of BRCA.","PeriodicalId":210337,"journal":{"name":"2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129872611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
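The Kaplan-Meier verification step used above rests on a simple product-limit estimator: at each observed death time, the running survival probability is multiplied by (1 - d/n), where d is the deaths at that time among the n patients still at risk. A minimal sketch with toy data (real analyses would use a survival library and a log-rank test for the P-value):

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve (product-limit estimator).
    times: follow-up in months; events: 1 = death observed, 0 = censored.
    Censored patients leave the risk set without reducing survival."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    n, surv, curve = len(times), 1.0, []
    for idx in order:
        if events[idx]:
            surv *= 1 - 1 / n
            curve.append((times[idx], round(surv, 4)))
        n -= 1
    return curve

curve = kaplan_meier([5, 8, 12, 20, 30], [1, 1, 0, 1, 0])
print(curve)  # [(5, 0.8), (8, 0.6), (20, 0.3)]
```

Comparing two such curves (low- vs. high-expression groups) with a log-rank test yields the P < 0.05 result reported above.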
Pub Date : 2022-12-06DOI: 10.1109/BIBM55620.2022.9995108
Thi Thuy Duong Vu, Jaehee Jung
The Gene Ontology (GO) database contains approximately 40,000 classes of terms arranged in a hierarchical relationship. These terms mainly define protein functions and are used in bioinformatics to automatically predict protein functions using their sequences. Recently, several models have been studied, such as ProtBert and ProteinBERT, which predict protein functions by fine-tuning a pretrained model of the nucleotide sequence using a self-supervised deep method. We proposed two models to predict GO using protein features extracted by the ProtBert model to annotate proteins with their GO terms. Additionally, we customized the ProteinBERT model and fine-tuned it to predict GO terms. The experiment showed that protein embeddings created using pretrained transformer models can be used as a source of data for tasks involving sequence prediction, with a focus on protein functions. The suggested models allow flexible sequence lengths and provide improved performance compared to other comparison methods.
{"title":"Gene Ontology based protein functional annotation using pretrained embeddings","authors":"Thi Thuy Duong Vu, Jaehee Jung","doi":"10.1109/BIBM55620.2022.9995108","DOIUrl":"https://doi.org/10.1109/BIBM55620.2022.9995108","url":null,"abstract":"The Gene Ontology (GO) database contains approximately 40,000 classes of terms arranged in a hierarchical relationship. These terms mainly define protein functions and are used in bioinformatics to automatically predict protein functions using their sequences. Recently, several models have been studied, such as ProtBert and ProteinBERT, which predict protein functions by fine-tuning a pretrained model of the nucleotide sequence using a self-supervised deep method. We proposed two models to predict GO using protein features extracted by the ProtBert model to annotate proteins with their GO terms. Additionally, we customized the ProteinBERT model and fine-tuned it to predict GO terms. The experiment showed that protein embeddings created using pretrained transformer models can be used as a source of data for tasks involving sequence prediction, with a focus on protein functions. The suggested models allow flexible sequence lengths and provide improved performance compared to other comparison methods.","PeriodicalId":210337,"journal":{"name":"2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128311954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-12-06DOI: 10.1109/BIBM55620.2022.9995565
Guang Zheng, Ye Lv, Junmei Zhao, Hongtao Guo
In China, the integration of traditional Chinese medicine (TCM) and western medicine against rheumatoid arthritis (RA) delivers both low-level inflammation indexes and less side effects. Thus, exploring the mechanism of TCM against RA might help to explore the pathology of RA. As the diagnosis of TCM is based on syndrome differentiation and deficiency of liver and kidney is the leading one, then, explore associated Chinese medicines against RA may shed light on the therapeutic regulation network. In this study, network pharmacology analysis was carried out based on bioactive compounds, targeted proteins and protein interactions towards RA. As a result, the regulation network against RA delivered by the top five Chinese medicines was constructed. Further bioinformatics analysis on participant genes not only elucidate the relationships to RA and immune response, but also the reduced side effects e.g. osteoporosis. Validation of the therapeutic effect on RA patients was done via check indexes on C-reactive protein and erythrocyte sedimentation rate. Potential effects of the delivered regulation network were demonstrated with heatmap on the microarray data of RA synovium tissue.
{"title":"The Regulation Networks of Chinese Medicines Against Rheumatoid Arthritis with Syndrome of Deficiency of Liver and Kidney","authors":"Guang Zheng, Ye Lv, Junmei Zhao, Hongtao Guo","doi":"10.1109/BIBM55620.2022.9995565","DOIUrl":"https://doi.org/10.1109/BIBM55620.2022.9995565","url":null,"abstract":"In China, the integration of traditional Chinese medicine (TCM) and western medicine against rheumatoid arthritis (RA) delivers both low-level inflammation indexes and less side effects. Thus, exploring the mechanism of TCM against RA might help to explore the pathology of RA. As the diagnosis of TCM is based on syndrome differentiation and deficiency of liver and kidney is the leading one, then, explore associated Chinese medicines against RA may shed light on the therapeutic regulation network. In this study, network pharmacology analysis was carried out based on bioactive compounds, targeted proteins and protein interactions towards RA. As a result, the regulation network against RA delivered by the top five Chinese medicines was constructed. Further bioinformatics analysis on participant genes not only elucidate the relationships to RA and immune response, but also the reduced side effects e.g. osteoporosis. Validation of the therapeutic effect on RA patients was done via check indexes on C-reactive protein and erythrocyte sedimentation rate. 
Potential effects of the delivered regulation network were demonstrated with heatmap on the microarray data of RA synovium tissue.","PeriodicalId":210337,"journal":{"name":"2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128389814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}