Interdisciplinary Sciences: Computational Life Sciences: Latest Articles (Page 3)

BEST: Basic Embedding Search Tool Enhancing Discovery of Novel Enzyme.

IF 3.9, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2026-03-01 Epub Date: 2025-08-11 DOI: 10.1007/s12539-025-00753-z

Yuxuan Wu, Xiao Yi, Yang Tan, Huiqun Yu, Guisheng Fan, Gaowei Zheng

The identification of protein homologs in large databases is critical for biological advancements. Traditional methods, such as protein sequence alignment, often miss remote homologs. To address this limitation, we present the Basic Embedding Search Tool (BEST), a fast and sensitive approach that employs protein language models to create sequence embeddings enriched with evolutionary and structural information. In addition, we introduce a segmented distillation pruning technique to accelerate sequence encoding and develop a multi-layer acceleration structure that achieves a 4290.86-fold speedup in the access and retrieval of dense vectors. Extensive experiments on real datasets demonstrate that BEST increases sensitivity by over 20% compared to prior methods while maintaining precision and recall. It runs 23.41 times faster than traditional tools such as PSI-BLAST and 3.92 times faster than Foldseek, while also detecting homologous sequences that conventional methods miss. BEST and its open-access web server (http://pm2s.cpolar.top/best1/) are poised to significantly aid enzyme mining and advance biological research. The code is publicly available at https://github.com/SkyTai-W/ProteinMiningEvaluator.
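The retrieval step described in this abstract, ranking database proteins by the similarity of their embeddings to a query embedding, can be illustrated with a minimal brute-force sketch. Everything below is an assumption made for illustration: the function names, the toy 3-dimensional vectors, and the use of cosine similarity are not taken from BEST, whose multi-layer acceleration structure is not reproduced here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k(query, database, k=2):
    """Return the k (protein_id, score) pairs most similar to the query."""
    scored = [(pid, cosine(query, emb)) for pid, emb in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]

# Hypothetical toy embeddings; real protein language model embeddings
# typically have hundreds or thousands of dimensions.
db = {
    "protA": [1.0, 0.0, 0.0],
    "protB": [0.9, 0.1, 0.0],
    "protC": [0.0, 1.0, 0.0],
}
hits = top_k([1.0, 0.05, 0.0], db, k=2)
for pid, score in hits:
    print(pid, round(score, 4))
```

A brute-force scan like this is linear in database size, which is the cost that an indexed vector search structure is meant to avoid.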

Pages: 101-121

Citations: 0

scMSDA: A Novel Multi-View Fusion Framework for Single-Cell RNA-seq Data Clustering with Semantic and Distribution Alignment.

IF 3.9, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2026-02-13 DOI: 10.1007/s12539-025-00801-8

Congcong Jiang, Wenlan Chen, Yanyan Tan, Hai Zhong, Cheng Liang

Single-cell RNA sequencing (scRNA-seq) technology has improved the resolution of cellular heterogeneity, but downstream analysis faces challenges such as high dimensionality, sparsity, and technical noise. Existing methods often treat all negative samples equally, ignoring local structures that are essential for capturing meaningful semantic relationships within the data. In this paper, we propose scMSDA, a novel multi-view fusion framework for scRNA-seq data clustering, which leverages semantic consistency and distribution alignment to learn robust representations for downstream tasks. Our model first performs data augmentation on the original data by introducing dropout regularization. Then, we perform global feature aggregation on two latent representations obtained from encoders with non-shared parameters. To further alleviate the representation conflict problem in traditional contrastive learning, we propose a distance-guided adaptive-negative contrastive learning strategy, which dynamically adjusts the contribution of negative sample pairs through a neighborhood-aware weight matrix. In addition, our method enhances intra-cluster compactness while maximizing inter-cluster separation through an iterative centroid refinement process guided by pseudo-labels. Finally, the optimal transport (OT)-based cross-view alignment explicitly minimizes transport costs between semantically related instances and target clusters, effectively enforcing distribution alignment across views. We evaluate our model on 17 publicly available datasets, and the experimental results show that it outperforms 10 baseline methods across various clustering metrics. The source code of scMSDA is freely available at https://github.com/LiangSDNULab/scMSDA.
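The idea of down-weighting negative pairs that are close to the anchor can be sketched as a weighted InfoNCE loss. This is a minimal illustration, not the scMSDA loss: the weight function `1 - exp(-d^2 / sigma)`, the parameter names `sigma` and `tau`, and the toy two-dimensional "cell" vectors are all assumptions introduced here.

```python
import math

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def negative_weights(view, sigma=1.0):
    """Assumed weighting: w[i][j] grows with distance, so nearby cells
    (likely the same cell type) contribute less as negatives."""
    n = len(view)
    return [[0.0 if i == j else 1.0 - math.exp(-sq_dist(view[i], view[j]) / sigma)
             for j in range(n)] for i in range(n)]

def weighted_info_nce(view_a, view_b, weights, tau=0.5):
    """InfoNCE where each negative term is scaled by weights[i][j]."""
    n = len(view_a)
    total = 0.0
    for i in range(n):
        pos = math.exp(dot(view_a[i], view_b[i]) / tau)
        neg = sum(weights[i][j] * math.exp(dot(view_a[i], view_b[j]) / tau)
                  for j in range(n) if j != i)
        total += -math.log(pos / (pos + neg))
    return total / n

# Two toy views of three cells; cells 0 and 1 are near neighbours.
view_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
view_b = [[0.95, 0.05], [0.85, 0.15], [0.05, 0.95]]
n = len(view_a)
uniform = [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
adaptive = negative_weights(view_a)
loss_uniform = weighted_info_nce(view_a, view_b, uniform)
loss_adaptive = weighted_info_nce(view_a, view_b, adaptive)
print(loss_uniform, loss_adaptive)
```

With uniform weights the near neighbour is pushed away as hard as a distant cell; with the distance-guided weights its contribution to the denominator is almost removed.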

Citations: 0

Multi-Network Co-expression Analysis Enhances Biological Insights from Single-Cell Gene Expression.

IF 3.9, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2026-02-06 DOI: 10.1007/s12539-025-00806-3

Alicia Gómez-Pascual, Araks Martirosyan, Katja Hebestreit, Andrew Kottick, Michelle Mighdoll, Victor Hanson-Smith, José Luis Mellina-Andreu, Alejandro Cisterna, Matthew G Holt, Grant Belgard, Sebastian Guelfi, Juan A Botía

Citations: 0

IIC-DTI: A Contrastive Learning Enhanced Inter-Intra Molecular Fusing Framework for Drug-Target Interaction Prediction.

IF 3.9, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2026-02-03 DOI: 10.1007/s12539-025-00799-z

Fei Wang, Dacheng Ruan, Yang Zhang, Yue Chen, Xiujuan Lei, Fang-Xiang Wu, Yansen Su, Chunhou Zheng

Purpose: Predicting drug-target interactions (DTIs) is a practical demand in drug development and drug repositioning. Therefore, developing accurate and efficient DTI prediction methods has significant application value. Current models focus on the features of either drugs or targets independently, and concatenate them together for downstream prediction. They ignore the hidden associations between drugs and targets, which may affect the implementation of DTIs.

Methods: In this work, we design IIC-DTI, a contrastive learning model that fuses intramolecular and intermolecular features of drugs and targets. The intramolecular features focus on drug chemical structures and target amino acid sequences, which are generated separately. The intermolecular features focus on drug-target pairs and are extracted by a multi-head cross-attention network. For the two embeddings of a drug or a target in the two views, a contrastive learning module updates the embedding of one view by fusing information from the other view. These new embeddings are concatenated and fed into a neural network with three hidden layers to predict potential DTIs.

Results: Multiple comparative experiments show that our proposed model outperforms nine state-of-the-art methods, including two pre-trained large language models, according to several evaluation metrics on four benchmark datasets. In a case study, 16 out of 20 drug-target pairs were verified by literature evidence. Moreover, IIC-DTI successfully identified related interactions of a given drug and target. This indicates that IIC-DTI has potential for identifying DTIs under realistic conditions.
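The cross-attention mechanism named in the Methods can be illustrated with a single-head, projection-free sketch in which drug features act as queries and target features act as keys and values. The abstract describes a multi-head network with learned parameters; the simplification to one head, the omission of learned projections, and the toy feature vectors are assumptions made here for brevity.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys."""
    d = len(keys[0])
    attn = [softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                     for k in keys])
            for q in queries]
    out = [[sum(w * v[c] for w, v in zip(row, values))
            for c in range(len(values[0]))]
           for row in attn]
    return out, attn

# Hypothetical features: 2 drug substructures, 3 target residues.
drug = [[1.0, 0.0], [0.0, 1.0]]
target = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
fused, attn = cross_attention(drug, target, target)
print(fused)
```

Each output row is a convex combination of target features, so a drug substructure ends up represented by the residues it attends to most strongly.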

Citations: 0

Multi-Modal Fusion with Supervised Contrastive Learning Model for Early Alzheimer's Disease Diagnosis and Multi-Modal Biomarker Identification.

IF 3.9, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2026-02-03 DOI: 10.1007/s12539-025-00805-4

Xiaofeng Xie, Peng Xue, Yihao Guo, Huijuan Chen, Li Fan, Rongnian Tang, Zhenkai Xu, Xuanqi Wang, Tao Liu, Feng Chen

Early and accurate diagnosis of mild cognitive impairment (MCI), a prodromal stage of Alzheimer's disease (AD), is critical for timely intervention and management. Nevertheless, effectively integrating heterogeneous multi-modal data for AD diagnosis remains worthy of further investigation. Therefore, we propose a supervised contrastive learning framework that integrates single nucleotide polymorphisms (SNPs), plasma proteomics, and T1-weighted structural magnetic resonance imaging (sMRI) from a biologically informed perspective, with SNPs influencing protein structure or gene expression levels, ultimately altering brain structure. Through a supervised contrastive learning mechanism, we construct a cross-modal feature space and introduce a similarity-based symmetrical attention mechanism to capture intermodal interactions and mitigate modality heterogeneity. We validate the proposed method on the Alzheimer's Disease Neuroimaging Initiative dataset, and experimental results demonstrate accuracies of 96.1%, 86.2%, and 86.1% for the AD-NC, MCI-NC, and AD-MCI tasks, respectively. In addition, applying explainable methods to our model identified multi-modal biomarkers related to AD diagnosis. The experimental results validate the effectiveness of our model in the diagnosis of AD and MCI.
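A supervised contrastive objective pulls together embeddings that share a diagnostic label and pushes apart those that do not. The sketch below implements the widely used generic form of that loss on toy vectors; it is not claimed to be the exact loss of this paper, and the temperature `tau`, the labels, and the two-dimensional embeddings are illustrative assumptions.

```python
import math

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def supcon_loss(embeddings, labels, tau=0.1):
    """Generic supervised contrastive loss, averaged over anchors
    that have at least one same-label positive."""
    z = [normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        sims = [sum(a * b for a, b in zip(z[i], z[j])) / tau for j in range(n)]
        log_denom = math.log(sum(math.exp(sims[a]) for a in range(n) if a != i))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Toy fused embeddings: the first two and the last two form tight groups.
emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supcon_loss(emb, ["AD", "AD", "NC", "NC"])
loss_mismatched = supcon_loss(emb, ["AD", "NC", "AD", "NC"])
print(loss_matching, loss_mismatched)
```

The loss is low when the label groups coincide with the geometric groups and high when they do not, which is the signal that shapes the shared feature space during training.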

Citations: 0

FHGNet: A Feature-Centric Hierarchical Network with Graph Attention Layer for Supraventricular Tachycardia Classification.

IF 3.9, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2026-02-03 DOI: 10.1007/s12539-025-00802-7

Xiaolin Ju, Tao Liu, Bowen Luo, Heling Cao, Zhan Gao, Haiyan Pan

Automated electrocardiogram (ECG) classification plays a critical role in arrhythmia diagnosis. However, current deep learning-based methods frequently fail to account for physiological rhythms and clinical diagnostic reasoning, which compromises their reliability and interpretability. This study proposes a clinically inspired multi-lead oscillatory Transformer framework, named FHGNet, to enhance the precision and interpretability of classifying ventricular tachycardia (VT) and supraventricular tachycardia (SVT). The proposed architecture integrates R-peak detection for heartbeat segmentation and adaptive-length patch extraction with R-wave positional encoding to enhance temporal awareness. It employs a convolutional neural network (CNN) to capture intra-beat morphological features (QRS morphology), a Transformer with FANLayer to model inter-beat rhythmic patterns, and a graph attention network (GAT) to fuse multi-lead dependencies. Additionally, a two-stage classifier is designed to enhance the detection of rare arrhythmia classes. Experimental evaluations on the MIT-BIH Supraventricular Arrhythmia dataset demonstrate that FHGNet achieves a macro F1-score of 91.35%, outperforming the baselines. Ablation studies reveal that removing the GAT reduces F1 by 2.42% in multi-lead scenarios, while the two-stage design improves minority-class recall by 5.82%. Attention visualization confirms that the model focuses on clinically relevant features, such as ST-T segment energy ratios and inter-lead phase differences, in line with established diagnostic criteria. The interpretability of FHGNet is further enhanced in two ways: 1) physiological priors (e.g., RR interval variability, intra-beat positional information) are explicitly integrated into dynamic feature engineering, which lets the model align with clinicians' rhythm analysis logic; 2) the two-stage classifier strictly follows the clinical diagnostic workflow (first screening for abnormalities, then subclassifying), which makes the decision-making process traceable. This work provides an interpretable, clinically adaptive framework for high-accuracy ECG classification, potentially reducing reliance on invasive electrophysiological studies.
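The front end of such a pipeline (finding R peaks, deriving RR intervals, and cutting the signal into variable-length beats) can be sketched as follows. This is a deliberately simple threshold detector on a synthetic signal, not the detector used in FHGNet; the threshold, the refractory period, the midpoint-to-midpoint segmentation rule, and the sampling rate are all assumptions made for illustration.

```python
def detect_r_peaks(signal, threshold, refractory):
    """Indices of local maxima above threshold, at least `refractory` samples apart."""
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if signal[i] >= threshold and is_local_max:
            if not peaks or i - peaks[-1] >= refractory:
                peaks.append(i)
    return peaks

def rr_intervals(peaks, fs):
    """RR intervals in seconds for a sampling rate of fs Hz."""
    return [(b - a) / fs for a, b in zip(peaks, peaks[1:])]

def segment_beats(signal, peaks):
    """Variable-length beats: each spans from the midpoint before an
    R peak to the midpoint after it (assumed segmentation rule)."""
    bounds = [0] + [(a + b) // 2 for a, b in zip(peaks, peaks[1:])] + [len(signal)]
    return [signal[s:e] for s, e in zip(bounds, bounds[1:])]

# Synthetic 3-second trace at 100 Hz: three R spikes and one small bump.
fs = 100
ecg = [0.0] * 300
for idx in (50, 130, 230):
    ecg[idx] = 1.0
ecg[90] = 0.2  # low-amplitude bump that should not count as an R peak

peaks = detect_r_peaks(ecg, threshold=0.5, refractory=20)
rr = rr_intervals(peaks, fs)
beats = segment_beats(ecg, peaks)
print(peaks, rr, [len(b) for b in beats])
```

Because beat length follows the local RR interval, downstream layers receive patches whose size tracks the rhythm rather than a fixed window.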

Citations: 0

GATRSyn: Advancing Anticancer Drug Synergy Prediction Through Graph Attention Networks and Transformer-based Feature Re-embedding.

IF 3.9, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2026-02-03 DOI: 10.1007/s12539-025-00798-0

Sile Li, Ziyu Li, Chenyang Dong, Yushuang Li

Citations: 0

A Deep Learning Framework with Multi-perspective Feature Fusion for Transcription Factor Binding Site Prediction.

IF 3.9, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2026-02-02 DOI: 10.1007/s12539-025-00807-2

Tong Wang, Zhendong Liu

Citations: 0

ESM-PsyPred: Leveraging Protein Language Models for Accurate Prediction of Psychrophilic Proteins.

IF 3.9, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2026-01-29 DOI: 10.1007/s12539-025-00810-7

Chong Peng, Yarui Bian, Chengwu Yuan, Yuying Chen, Dingkuo Liu, Fuping Lu, Fufeng Liu, Yihan Liu

Psychrophilic proteins, which maintain high activity and stability in low-temperature environments, hold significant potential for industrial and ecological research. However, existing predictive tools predominantly focus on thermophilic proteins, while psychrophilic protein prediction models remain constrained by data scarcity and subtle sequence variations, resulting in suboptimal performance. To overcome these barriers, this study introduces ESM-PsyPred, a computational framework that integrates the evolutionary-scale protein language model ESM-2 with a support vector machine (SVM). By extracting high-dimensional semantic features from protein sequences via ESM-2 and employing an SVM classifier, the model achieves independent test accuracies of 88.9% and 83.9% in binary (psychrophilic vs. mesophilic) and ternary (psychrophilic, mesophilic, thermophilic) classification tasks, respectively, significantly outperforming existing methods. Visualization analyses demonstrate the model's ability to identify critical cold-adaptation signatures. Furthermore, the construction of the high-quality datasets PMTTer and PNPBin, alongside cross-dataset validation, underscores the framework's robust generalization capabilities. The open-source availability of the code (https://github.com/tust-lamee/ESM-PsyPred) establishes ESM-PsyPred as an efficient tool for the rational design and industrial development of cold-adapted proteins.
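The two-stage shape of this pipeline, turning a variable-length protein into one fixed-length feature vector and then classifying that vector, can be sketched with standard-library Python. Two substitutions are made here and should be read as assumptions: hand-written toy per-residue vectors stand in for ESM-2 outputs, and a nearest-centroid rule stands in for the SVM. Mean pooling over residues is also an assumed choice; the abstract does not say how features are aggregated.

```python
def mean_pool(rows):
    """Average equal-length vectors into one vector (one value per dimension)."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_centroids(vectors, labels):
    """Stand-in classifier: one mean vector per class."""
    groups = {}
    for vec, label in zip(vectors, labels):
        groups.setdefault(label, []).append(vec)
    return {label: mean_pool(vecs) for label, vecs in groups.items()}

def predict(centroids, vec):
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(centroids, key=lambda label: sq_dist(centroids[label], vec))

# Toy per-residue "embeddings" for proteins of different lengths.
train = [
    ([[1.0, 0.0], [0.8, 0.2]], "psychrophilic"),
    ([[0.9, 0.1], [1.0, 0.0], [0.8, 0.2]], "psychrophilic"),
    ([[0.0, 1.0], [0.2, 0.8]], "mesophilic"),
    ([[0.1, 0.9]], "mesophilic"),
]
features = [mean_pool(residues) for residues, _ in train]
centroids = fit_centroids(features, [label for _, label in train])
predicted = predict(centroids, mean_pool([[0.7, 0.3], [0.9, 0.1]]))
print(predicted)
```

In a real setting the pooled vectors would come from the language model and the classifier would be a trained SVM; the pooling step is what lets proteins of any length share one feature space.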

Citations: 0

HRD-Informed Digital Histology Model for Predicting Platinum Chemo-Response and Prognosis in High-Grade Serous Ovarian Cancer.

IF 3.9, Zone 2 (Biology), Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Interdisciplinary Sciences: Computational Life Sciences

Pub Date : 2026-01-29 DOI: 10.1007/s12539-025-00809-0

Zijian Yang, Liujin Zhang, Luyuan Li, Yan Song, Jie Sun, Fanling Meng

Homologous recombination deficiency (HRD) is a critical biomarker of clinical benefit from platinum-based chemotherapy and poly(ADP-ribose) polymerase (PARP) inhibitors in high-grade serous ovarian cancer, but molecular testing is costly, time-consuming, and limited by tissue requirements. In this study, we introduce dPathHRD (digital Pathological assessment of Homologous Recombination Deficiency), a deep learning model designed to predict HRD status and platinum chemotherapy response directly from routine hematoxylin and eosin-stained whole-slide images. By integrating a pre-trained transformer-based pathology foundation model with an attention-based multiple-instance learning architecture, dPathHRD predicts HRD status with an area under the curve of 0.920 in the discovery cohort and 0.766 in the validation cohort. The digital scores generated by dPathHRD were significantly correlated with established HRD-related genomic and transcriptomic features. Furthermore, dPathHRD predicted therapeutic response to platinum chemotherapy: the HRD-like group showed higher complete response rates and longer progression-free and recurrence-free survival than the homologous recombination proficient (HRP)-like group across all three cohorts. Interpretation analysis via attention mapping confirmed the model's reliance on biologically relevant histopathological features in tumor and stromal regions. In conclusion, dPathHRD offers a promising, cost-effective alternative to molecular testing, leveraging widely available digital pathology images to inform personalized treatment strategies. Further prospective validation is warranted to confirm its clinical applicability and predictive power.
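
The abstract describes an attention-based multiple-instance learning architecture but gives no implementation detail. The following is a minimal sketch of one common form of attention pooling, not the dPathHRD code: each tile of a slide has a feature vector, a small scoring network assigns each tile a weight, and the slide-level vector is the weighted sum. All names (`attention_mil_pool`, `V`, `w`, `beta`), the dimensions, and the random weights are illustrative assumptions; in practice the tile features would come from a pathology foundation model and the weights would be learned.

```python
import math
import random


def attention_mil_pool(tiles, V, w):
    """Pool per-tile feature vectors into one slide-level vector.

    tiles: list of K feature vectors, each of length D
    V:     projection matrix given as a list of L rows of length D
    w:     scoring vector of length L
    Returns (slide_vector, attention_weights).
    """
    scores = []
    for h in tiles:
        # Score each tile: w^T tanh(V h)
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))

    # Numerically stable softmax over tiles, so the weights sum to 1
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Slide-level representation: attention-weighted sum of tile features
    dim = len(tiles[0])
    slide = [sum(a * h[d] for a, h in zip(attn, tiles)) for d in range(dim)]
    return slide, attn


def predict_probability(slide, beta, bias=0.0):
    """Logistic head mapping the slide vector to a score in (0, 1)."""
    logit = sum(b * x for b, x in zip(beta, slide)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy data: K tiles with D features each (random stand-ins, not real data)
random.seed(0)
D, L, K = 8, 4, 5
tiles = [[random.gauss(0, 1) for _ in range(D)] for _ in range(K)]
V = [[random.gauss(0, 0.5) for _ in range(D)] for _ in range(L)]
w = [random.gauss(0, 0.5) for _ in range(L)]
beta = [random.gauss(0, 0.5) for _ in range(D)]

slide, attn = attention_mil_pool(tiles, V, w)
prob = predict_probability(slide, beta)
print("attention weights:", [round(a, 3) for a in attn])
print("slide-level score:", round(prob, 3))
```

Because each attention weight belongs to one tile, the weights can be mapped back to tile positions on the slide, which is the general idea behind attention-map interpretation of this kind of model.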



Citations: 0