Interdisciplinary Sciences: Computational Life Sciences, Latest Articles, Page 8

Deep Canonical Correlation Fusion Algorithm Based on Denoising Autoencoder for ASD Diagnosis and Pathogenic Brain Region Identification

IF 4.8, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2024-04-04 DOI: 10.1007/s12539-024-00625-y

Huilian Zhang, Jie Chen, Bo Liao, Fang-xiang Wu, Xia-an Bi

Autism Spectrum Disorder (ASD) is a neurodevelopmental condition characterized by atypical neural activity. Early intervention is key to managing the progression of ASD, and current research primarily focuses on using either structural magnetic resonance imaging (sMRI) or resting-state functional magnetic resonance imaging (rs-fMRI) for diagnosis. Moreover, the use of autoencoders for disease classification has not been sufficiently explored. In this study, we introduce a new autoencoder-based framework, the Deep Canonical Correlation Fusion algorithm based on Denoising Autoencoder (DCCF-DAE), which proves effective in handling high-dimensional data. The framework first extracts features efficiently from different types of data with an advanced autoencoder, then fuses these features through the DCCF model, and finally uses the fused features for disease classification. DCCF integrates functional and structural data to help accurately diagnose ASD and identify critical Regions of Interest (ROIs) in disease mechanisms. We compare the proposed framework with other methods on the Autism Brain Imaging Data Exchange (ABIDE) database, and the results demonstrate its outstanding performance in ASD diagnosis. The superiority of DCCF-DAE highlights its potential as a crucial tool for early ASD diagnosis and monitoring.
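
The abstract does not describe the internals of the DCCF model, so the following is only a minimal sketch of the classical idea that canonical correlation methods build on: choose one projection per data view so that the two projected signals are as correlated as possible. The toy feature values, the weight vectors, and the function names are assumptions made for illustration; they are not taken from the paper.

```python
import math

def project(rows, weights):
    """Project each feature vector onto a weight vector (dot product)."""
    return [sum(x * w for x, w in zip(row, weights)) for row in rows]

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def canonical_objective(view_a, view_b, w_a, w_b):
    """Correlation between the two projected views; CCA searches for the
    weights that maximize this quantity."""
    return pearson(project(view_a, w_a), project(view_b, w_b))

# Hypothetical per-subject features (4 subjects, 2 features per view).
functional = [[1.0, 0.2], [2.0, 0.1], [3.0, 0.4], [4.0, 0.3]]
structural = [[2.0, 5.0], [4.0, 1.0], [6.0, 3.0], [8.0, 2.0]]

# Projections that pick out the shared signal correlate strongly ...
aligned = canonical_objective(functional, structural, [1.0, 0.0], [1.0, 0.0])
# ... while projections onto the unrelated features correlate weakly.
misaligned = canonical_objective(functional, structural, [0.0, 1.0], [0.0, 1.0])
```

A deep variant would replace the fixed linear projections with learned networks, but the quantity being maximized stays a correlation of this kind.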

Citations: 0

DeepPI: Alignment-Free Analysis of Flexible Length Proteins Based on Deep Learning and Image Generator

IF 4.8, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2024-04-03 DOI: 10.1007/s12539-024-00618-x

Mingeun Ji, Yejin Kan, Dongyeon Kim, Seungmin Lee, Gangman Yi

With the rapid development of NGS technology, the number of protein sequences has increased exponentially. Computational methods have been introduced in protein functional studies because the analysis of large numbers of proteins through biological experiments is costly and time-consuming. In recent years, new approaches based on deep learning have been proposed to overcome the limitations of conventional methods. Although deep learning-based methods effectively utilize features of protein function, they are limited to fixed-length sequences and consider only information from adjacent amino acids. Therefore, new protein analysis tools are required that extract functional features from proteins of flexible length and train models on them. We introduce DeepPI, a deep learning-based tool for analyzing proteins in large-scale databases. The proposed model uses Global Average Pooling, so it can be applied to proteins of flexible length and loses less information than existing algorithms that use fixed sizes. The image generator converts a one-dimensional sequence into a distinct two-dimensional structure, from which common parts of various shapes can be extracted. Finally, filtering techniques automatically detect representative data from the entire database and ensure coverage of large protein databases. We demonstrate that DeepPI has been successfully applied to large databases such as the Pfam-A database. Comparative experiments on four types of image generators illustrate the impact of structure on feature extraction. The filtering performance was verified by varying the parameter values and proved to be applicable to large databases. Compared to existing methods, DeepPI achieves higher family classification accuracy for protein function inference.
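
To illustrate why Global Average Pooling suits inputs of flexible length, here is a minimal sketch: each channel of a feature map is reduced to its mean, so the output length depends only on the number of channels and not on the spatial size of the input. The feature-map values and the function name are hypothetical; the actual DeepPI architecture is not specified in the abstract.

```python
def global_average_pool(feature_maps):
    """Collapse each channel (a 2-D grid of any size) to its mean value."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled

# Two hypothetical inputs with the same number of channels (2)
# but different spatial sizes, as short and long proteins would produce.
short_protein = [
    [[1.0, 3.0]],                        # channel 0: 1 x 2 grid
    [[2.0, 2.0]],                        # channel 1: 1 x 2 grid
]
long_protein = [
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],  # channel 0: 2 x 3 grid
    [[0.0, 0.0, 0.0], [3.0, 3.0, 3.0]],  # channel 1: 2 x 3 grid
]

pooled_short = global_average_pool(short_protein)
pooled_long = global_average_pool(long_protein)
```

Both results are vectors of length 2, so the same classifier layer can follow the pooling step regardless of input size.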

Citations: 0

PPSNO: A Feature-Rich SNO Sites Predictor by Stacking Ensemble Strategy from Protein Sequence-Derived Information.

IF 4.8, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2024-03-01 Epub Date: 2024-01-11 DOI: 10.1007/s12539-023-00595-7

Lun Zhu, Liuyang Wang, Zexi Yang, Piao Xu, Sen Yang

Protein S-nitrosylation (SNO) is a significant post-translational modification that affects the stability, activity, cellular localization, and function of proteins. Highly accurate prediction of SNO sites therefore aids in understanding biological function mechanisms. In this paper, we construct a predictor, named PPSNO, that predicts protein SNO sites using stacking ensemble learning. PPSNO integrates multiple machine learning techniques into an ensemble model, enhancing its predictive accuracy. First, we established benchmark datasets by collecting SNO sites from various sources, including literature, databases, and other predictors. Second, various feature extraction techniques are applied to derive characteristics from protein sequences, which are then combined and fed into the PPSNO predictor for training. Five-fold cross-validation experiments show that PPSNO outperformed existing predictors, such as PSNO, PreSNO, pCysMod, DeepNitro, RecSNO, and Mul-SNO. The PPSNO predictor achieved an accuracy of 92.8%, an area under the curve (AUC) of 96.1%, a Matthews correlation coefficient (MCC) of 81.3%, an F1-score of 85.6%, a sensitivity (SN) of 79.3%, a specificity (SP) of 97.7%, and an average precision (AP) of 92.2%. We also employed ROC curves, PR curves, and radar plots to show the superior performance of PPSNO. Our study shows that fused protein sequence features and two-layer stacked ensemble models can improve the accuracy of predicting SNO sites, which can aid in comprehending cellular processes and disease mechanisms. The codes and data are available at https://github.com/serendipity-wly/PPSNO .
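
Several of the reported metrics (accuracy, SN, SP, F1-score, MCC) follow directly from the four confusion-matrix counts. The sketch below shows the standard formulas; the counts are hypothetical and chosen only for illustration, they are not the paper's data. AUC and AP are omitted because they require ranked prediction scores rather than counts.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # SN, also called recall
    specificity = tn / (tn + fp)   # SP
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }

# Hypothetical counts for an imbalanced site-prediction task (not from the paper).
metrics = binary_metrics(tp=8, fp=5, tn=85, fn=2)
```

With imbalanced classes like these, accuracy can look high while sensitivity and MCC are noticeably lower, which is one reason abstracts such as this one report several metrics side by side.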

Pages: 192-217

Citations: 0

Cervical Cancer Classification From Pap Smear Images Using Deep Convolutional Neural Network Models.

IF 4.8, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2024-03-01 Epub Date: 2023-11-14 DOI: 10.1007/s12539-023-00589-5

Sher Lyn Tan, Ganeshsree Selvachandran, Weiping Ding, Raveendran Paramesran, Ketan Kotecha

As one of the most common female cancers, cervical cancer often develops years after a prolonged and reversible pre-cancerous stage. Traditional classification algorithms used for detection of cervical cancer often require cell segmentation and feature extraction techniques, while convolutional neural network (CNN) models demand a large dataset to mitigate over-fitting and poor generalization problems. To this end, this study aims to develop deep learning models for automated cervical cancer detection that do not rely on segmentation methods or custom features. Due to limited data availability, transfer learning was employed with pre-trained CNN models to directly operate on Pap smear images for a seven-class classification task. Thorough evaluation and comparison of 13 pre-trained deep CNN models were performed using the publicly available Herlev dataset and the Keras package in Google Colaboratory. In terms of accuracy and performance, DenseNet-201 is the best-performing model. The pre-trained CNN models studied in this paper produced good experimental results and required little computing time.

Pages: 16-38. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10881721/pdf/

Citations: 0

Application of Semi-supervised Fuzzy Clustering Based on Knowledge Weighting and Cluster Center Learning to Mammary Molybdenum Target Image Segmentation.

IF 4.8, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2024-03-01 Epub Date: 2023-07-24 DOI: 10.1007/s12539-023-00580-0

Peng Peng, Danping Wu, Li-Jun Huang, Jianqiang Wang, Li Zhang, Yue Wu, Yizhang Jiang, Zhihua Lu, Khin Wee Lai, Kaijian Xia

Breast cancer is commonly diagnosed with mammography. Using image segmentation algorithms to separate lesion areas in mammograms can facilitate diagnosis by doctors and reduce their workload, which has important clinical significance. Because large, accurately labeled medical image datasets are difficult to obtain, traditional clustering algorithms are widely used in medical image segmentation as unsupervised models. Traditional unsupervised clustering algorithms, however, have little knowledge to learn from. Moreover, some semi-supervised fuzzy clustering algorithms cannot fully mine the information in labeled samples, which results in insufficient supervision. When faced with complex mammography images, these algorithms cannot accurately segment lesion areas. To address this, a semi-supervised fuzzy clustering algorithm based on knowledge weighting and cluster center learning (WSFCM_V) is presented. Based on prior knowledge, three learning modes are proposed: a knowledge weighting method for cluster centers, Euclidean distance weights for unlabeled samples, and learning from the cluster centers of labeled sample sets. These strategies improve the clustering performance. On real mammography (molybdenum target) images, the WSFCM_V algorithm is compared with currently popular semi-supervised and unsupervised clustering algorithms, and it has the best evaluation index values. Experimental results demonstrate that WSFCM_V has a higher segmentation accuracy than the existing clustering algorithms, both for larger lesion regions such as tumor areas and for smaller lesion areas such as calcification points.
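
For orientation, the sketch below implements plain unsupervised fuzzy c-means on one-dimensional pixel intensities, the kind of baseline that semi-supervised variants extend. It is not WSFCM_V: the knowledge weighting, the distance weights for unlabeled samples, and the labeled-center learning described in the abstract are not reproduced here. The intensity values, initial centers, and function names are assumptions for illustration.

```python
def update_memberships(points, centers, m=2.0):
    """Standard FCM membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The point coincides with one or more centers: share membership among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            row = [h / sum(hits) for h in hits]
        else:
            row = [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]
        memberships.append(row)
    return memberships

def update_centers(points, memberships, m=2.0):
    """Each center is the mean of the points weighted by membership^m."""
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centers

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iterations):
        memberships = update_memberships(points, centers, m)
        centers = update_centers(points, memberships, m)
    return centers, update_memberships(points, centers, m)

# Hypothetical normalized pixel intensities: dark background vs. bright lesion.
intensities = [0.10, 0.12, 0.15, 0.80, 0.85, 0.90]
centers, memberships = fuzzy_c_means(intensities, centers=[0.0, 1.0])
```

Each pixel ends up with a soft membership in every cluster; a segmentation mask can then be obtained by assigning each pixel to the cluster with the largest membership.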

Pages: 39-57

Citations: 0

Computer-Aided Diagnosis of Complications After Liver Transplantation Based on Transfer Learning.

IF 4.8, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2024-03-01 Epub Date: 2023-10-25 DOI: 10.1007/s12539-023-00588-6

Ying Zhang, Chenyuan Shangguan, Xuena Zhang, Jialin Ma, Jiyuan He, Meng Jia, Na Chen

Liver transplantation is one of the most effective treatments for acute liver failure, cirrhosis, and even liver cancer. The prediction of postoperative complications is of great significance for liver transplantation. However, the existing prediction methods based on traditional machine learning are often unavailable or unreliable due to the insufficient amount of real liver transplantation data. Therefore, we propose a new framework to increase the accuracy of computer-aided diagnosis of complications after liver transplantation with transfer learning, which can handle small-scale but high-dimensional data problems. Furthermore, since data samples are often high dimensional in the real world, capturing key features that influence postoperative complications can help make the correct diagnosis for patients. So, we also introduce the SHapley Additive exPlanation (SHAP) method into our framework for exploring the key features of postoperative complications. We used data obtained from 425 patients with 456 features in our experiments. Experimental results show that our approach outperforms all compared baseline methods in predicting postoperative complications. In our work, the average precision, the mean recall, and the mean F1 score reach 91.22%, 91.70%, and 91.18%, respectively.
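
SHAP attributes a prediction to individual features using Shapley values from cooperative game theory. As a minimal sketch of that underlying idea, the code below computes exact Shapley values by enumerating every coalition of features for a tiny hypothetical scoring function. It does not use the SHAP library and does not reflect the paper's model; the scoring function and all names are assumptions, and exact enumeration is only feasible for a handful of features, not the 456 used in the study.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    values = []
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        phi = 0.0
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                # Marginal contribution of feature i when added to the coalition.
                phi += weight * (value_fn(coalition | {i}) - value_fn(coalition))
        values.append(phi)
    return values

def toy_risk_score(present):
    """Hypothetical model output when only the features in `present` are known."""
    score = 0.0
    if 0 in present:
        score += 2.0
    if 1 in present:
        score += 1.0
    if 0 in present and 1 in present:
        score += 1.0  # interaction between features 0 and 1
    return score

phi = shapley_values(toy_risk_score, n_features=3)
```

The interaction term is split evenly between features 0 and 1, the unused feature 2 receives zero, and the attributions sum to the difference between the full-model score and the empty-coalition score.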

Pages: 123-140

Citations: 0

A Deep Neural Network for Predicting Synergistic Drug Combinations on Cancer.

IF 4.8, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2024-03-01 Epub Date: 2024-01-06 DOI: 10.1007/s12539-023-00596-6

Shiyu Yan, Ding Zheng

The exploration of drug combinations presents an opportunity to amplify therapeutic effectiveness while alleviating undesirable side effects. Nevertheless, the extensive array of potential combinations makes experimental screening costly and time-consuming. Thus, it is crucial to narrow down the search space. Deep learning approaches have gained widespread popularity in predicting synergistic drug combinations for specific cell lines in vitro. In the present study, we introduce a novel method termed GTextSyn, which integrates gene expression data and chemical structure information to predict synergistic effects in drug combinations. GTextSyn employs a sentence classification model from the domain of Natural Language Processing (NLP), wherein drugs and cell lines are regarded as entities possessing biochemical relevance. Meanwhile, combinations of drug pairs and cell lines are construed as sentences with biochemical relational significance. To assess the efficacy of GTextSyn, we conduct a comparative analysis with alternative deep learning approaches using a standard benchmark dataset. The results from a five-fold cross-validation demonstrate that GTextSyn reduces Mean Square Error (MSE) by 49.5%, surpassing the next best method in the regression task. Furthermore, we conduct a comprehensive literature survey on the predicted novel drug combinations and find substantial support from prior experimental studies for many of the combinations identified by GTextSyn.
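
The abstract does not spell out the baseline for the 49.5% figure. Assuming it is a relative reduction with respect to the next best method, the arithmetic can be sketched as follows. The synergy scores and predictions are invented toy numbers, not values from the paper.

```python
def mean_square_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def relative_reduction(baseline_error, new_error):
    """Fraction by which the new error is lower than the baseline error."""
    return (baseline_error - new_error) / baseline_error

# Hypothetical synergy scores and two sets of predictions (toy values only).
true_scores = [10.0, 20.0, 30.0, 40.0]
baseline_predictions = [14.0, 16.0, 34.0, 36.0]   # every prediction off by 4
improved_predictions = [12.0, 18.0, 32.0, 38.0]   # every prediction off by 2

baseline_mse = mean_square_error(true_scores, baseline_predictions)
improved_mse = mean_square_error(true_scores, improved_predictions)
reduction = relative_reduction(baseline_mse, improved_mse)
```

Under this reading, a 49.5% reduction would mean the improved model's MSE is 0.505 times the baseline MSE. Because errors are squared, halving every individual error, as in the toy data, yields a 75% reduction rather than 50%.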

Pages: 218-230

Citations: 0

Tumour Growth Mechanisms Determine Effectiveness of Adaptive Therapy in Glandular Tumours.

IF 4.8 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2024-03-01 Epub Date: 2023-09-30 DOI: 10.1007/s12539-023-00586-8

Rui Zhen Tan

In cancer treatment, adaptive therapy holds promise for delaying recurrence by regulating the competition between drug-sensitive and drug-resistant cells. Adaptive therapy has been studied in well-mixed models, which assume free mixing of all cells, and in spatial models, which consider the interactions of single cells with their immediately adjacent cells. Neither type of model reflects the spatial structure of glandular tumours, where intra-gland cellular interaction is high while inter-gland interaction is limited. Here, we use mathematical modelling to study the effects of adaptive therapy on glandular tumours that expand by either glandular fission or invasive growth. A two-dimensional, lattice-based model of sites containing sensitive and resistant cells within individual glands is developed to study the evolution of glandular tumour cells under continuous and adaptive therapies. We found that although both growth models benefit from adaptive therapy's ability to prevent recurrence, invasive growth benefits more than fission growth. This difference is due to the migration of daughter cells into neighbouring glands, which is absent in fission but present in invasive growth. The migration results in greater mixing of cells, enhancing the competition induced by adaptive therapy. By varying the initial spatial spread and location of the resistant cells within the tumour, we found that modifying the conditions within the glands containing resistant cells affects both fission and invasive growth, whereas modifying the conditions surrounding these glands affects invasive growth only. Our work reveals the interplay between growth mechanism and tumour topology in modulating the effectiveness of cancer therapy.
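The contrast between continuous and adaptive therapy can be sketched with a toy simulation. Note that this is not the paper's two-dimensional lattice model of glands: it is a well-mixed, two-population competition model of the kind the abstract mentions as prior work, and every parameter value, the 50% on/off dosing rule, and the 120% progression threshold are assumptions chosen purely for illustration.

```python
# Minimal well-mixed sketch (assumed model, not the paper's lattice model).
# Sensitive (s) and resistant (r) cells share one carrying capacity of 1.0.

def time_to_progression(adaptive, dt=0.1, t_max=3000.0):
    """Return the time at which total burden first reaches 120% of baseline.

    adaptive=False: the drug is always on (continuous therapy).
    adaptive=True:  the drug is withdrawn once burden falls to 50% of
                    baseline and resumed when burden returns to baseline.
    Returns None if no progression occurs before t_max.
    """
    s, r = 0.74, 0.01                 # assumed initial fractions
    growth_s, growth_r = 0.035, 0.027  # assumed rates; resistance has a cost
    kill = 0.06                       # assumed drug-induced death rate of s
    baseline = s + r
    drug_on = True
    t = 0.0
    while t < t_max:
        burden = s + r
        if burden >= 1.2 * baseline:
            return t
        if adaptive:
            if drug_on and burden <= 0.5 * baseline:
                drug_on = False
            elif not drug_on and burden >= baseline:
                drug_on = True
        free_space = 1.0 - burden     # shared competition term
        ds = growth_s * s * free_space - (kill * s if drug_on else 0.0)
        dr = growth_r * r * free_space
        s += dt * ds                  # forward Euler step
        r += dt * dr
        t += dt
    return None


ttp_continuous = time_to_progression(adaptive=False)
ttp_adaptive = time_to_progression(adaptive=True)
print("continuous:", ttp_continuous, "adaptive:", ttp_adaptive)
```

In this sketch adaptive therapy delays progression because keeping sensitive cells alive leaves less free space for resistant cells to grow into. The paper's point is that in glandular tumours the strength of this competition depends on how much cells mix between glands, which a well-mixed model cannot capture.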

Citations: 0

Comprehensive scRNA-seq Model Reveals Artery Endothelial Cell Heterogeneity and Metabolic Preference in Human Vascular Disease.

IF 4.8 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2024-03-01 Epub Date: 2023-11-17 DOI: 10.1007/s12539-023-00591-x

Liping Zeng, Yunchang Liu, Xiaoping Li, Xue Gong, Miao Tian, Peili Yang, Qi Cai, Gengze Wu, Chunyu Zeng

Vascular disease is one of the major causes of death worldwide. Endothelial cells are important components of the vascular structure. A better understanding of how endothelial cells change during the development of vascular disease may provide new targets for clinical treatment strategies. Single-cell RNA sequencing (scRNA-seq) is a powerful tool for exploring transcription patterns and cell type identity. Our current study is based on comprehensive scRNA-seq data from several types of human vascular disease datasets, analysed with a deep-learning-based algorithm. A gene set scoring system, built on cell clustering, may help to identify the relative stage of vascular disease development. Metabolic preference patterns were estimated using a graph neural network model. Overall, our study may provide potential treatment targets for preserving normal endothelial function under pathological conditions.
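A gene set score of the kind mentioned above can be sketched as follows. The abstract does not specify how the authors compute their score, so this uses one common and simple convention as an assumption: standardise each gene across cells, then average the standardised values of the genes in the set. The gene names and expression values are made up for illustration.

```python
# Minimal sketch, assuming a per-cell score defined as the mean z-score of
# the genes in the set. Gene names and values are hypothetical.
from statistics import mean, pstdev


def gene_set_scores(expression, gene_set):
    """Score every cell for one gene set.

    expression: dict mapping gene name -> list of per-cell expression values
                (all lists have the same length, one entry per cell).
    gene_set:   iterable of gene names; genes missing from the data are skipped.
    """
    present = [gene for gene in gene_set if gene in expression]
    if not present:
        raise ValueError("no gene from the set is present in the data")
    n_cells = len(expression[present[0]])
    totals = [0.0] * n_cells
    for gene in present:
        values = expression[gene]
        mu, sd = mean(values), pstdev(values)
        for i, value in enumerate(values):
            # A gene with zero variance contributes nothing to the score
            totals[i] += (value - mu) / sd if sd > 0 else 0.0
    return [total / len(present) for total in totals]


expression = {
    "GENE_A": [0.1, 0.2, 2.0, 2.2],
    "GENE_B": [0.0, 0.3, 1.8, 2.5],
    "GENE_C": [1.0, 1.1, 0.9, 1.0],
}
scores = gene_set_scores(expression, ["GENE_A", "GENE_B", "GENE_MISSING"])
print(scores)
```

Cells with higher scores express the set more strongly than the population average, which is the sense in which such a score could order cells or clusters along a relative disease stage.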

Citations: 0

The Dynamical Biomarkers in Functional Connectivity of Autism Spectrum Disorder Based on Dynamic Graph Embedding.

IF 4.8 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2024-03-01 Epub Date: 2023-12-07 DOI: 10.1007/s12539-023-00592-w

Yanting Liu, Hao Wang, Yanrui Ding

Autism spectrum disorder (ASD) is a neurological and developmental disorder, and its early diagnosis is a challenging task. The dynamic brain network (DBN) offers a wealth of information for the diagnosis and treatment of ASD. Mining the spatio-temporal characteristics of the DBN is critical for finding dynamic communication across brain regions and, ultimately, for identifying diagnostic biomarkers of ASD. We propose the dgEmbed-KNN and Aggregation-SVM diagnostic models, which use the spatio-temporal information from the DBN and the interactive information among brain regions represented by dynamic graph embedding. The classification accuracies show that the dgEmbed-KNN model performs slightly better than traditional machine learning and deep learning methods, while the Aggregation-SVM model diagnoses ASD very well using aggregated brain network connections as features. We discovered over- and under-connections in ASD at the level of dynamic connections, involving the postcentral gyrus, the insula, the cerebellum, the caudate nucleus, and the temporal pole. We also found abnormal dynamic interactions associated with ASD within and between functional subnetworks, including the default mode network, visual network, auditory network, and salience network. These findings can provide potential DBN biomarkers for ASD identification.
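To make the notion of a dynamic brain network concrete, here is a small sketch of one common way to build one: sliding-window Pearson correlation between regional time series, giving one connectivity matrix per window. The abstract does not say how the authors construct their DBN, so the windowing scheme, the function names, and the toy signals are assumptions, and the dynamic graph embedding and classifiers are not shown.

```python
# Minimal sketch, assuming a sliding-window correlation construction.
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def dynamic_connectivity(series, window, step):
    """Return one region-by-region correlation matrix per time window.

    series: list of per-region time series, all of the same length.
    """
    length = len(series[0])
    n_regions = len(series)
    networks = []
    for start in range(0, length - window + 1, step):
        segment = [s[start:start + window] for s in series]
        matrix = [
            [1.0 if i == j else pearson(segment[i], segment[j])
             for j in range(n_regions)]
            for i in range(n_regions)
        ]
        networks.append(matrix)
    return networks


# Toy signals: region 2 tracks region 0 in the first half, then reverses.
series = [
    [0, 1, 2, 3, 4, 5, 6, 7],
    [0, 2, 4, 6, 8, 10, 12, 14],
    [0, 1, 2, 3, 3, 2, 1, 0],
]
networks = dynamic_connectivity(series, window=4, step=4)
print(networks[0][0][2], networks[1][0][2])
```

The connection between regions 0 and 2 changes sign from one window to the next, which is the kind of time-varying information a static, whole-scan correlation matrix would average away.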

Citations: 0