miss-SNF: a multimodal patient similarity network integration approach to handle completely missing data sources.

Jessica Gliozzo, Mauricio A Soto Gomez, Arturo Bonometti, Alex Patak, Elena Casiraghi, Giorgio Valentini
Journal: Bioinformatics (Oxford, England). Publication date: 2025-03-29. Publication type: Journal Article. DOI: 10.1093/bioinformatics/btaf150 (https://doi.org/10.1093/bioinformatics/btaf150). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12011365/pdf/

Abstract

Motivation: Precision medicine leverages patient-specific multimodal data to improve the prevention, diagnosis, prognosis, and treatment of diseases. Advancing precision medicine requires the non-trivial integration of complex, heterogeneous, and potentially high-dimensional data sources, such as multi-omics and clinical data. Several approaches have been proposed in the literature to manage missing data, but they are usually limited to recovering subsets of features for a subset of patients. A largely overlooked problem is the integration of multiple data sources when one or more of them are completely missing for a subset of patients, a relatively common condition in clinical practice.

Results: We propose miss-Similarity Network Fusion (miss-SNF), a novel general-purpose data integration approach designed to manage completely missing data in the context of patient similarity networks. miss-SNF integrates incomplete unimodal patient similarity networks by leveraging a non-linear message-passing strategy borrowed from the SNF algorithm. miss-SNF is able to recover missing patient similarities and is "task agnostic", in the sense that it can integrate partial data for both unsupervised and supervised prediction tasks. Experimental analyses on nine cancer datasets from The Cancer Genome Atlas (TCGA) demonstrate that miss-SNF achieves state-of-the-art results in recovering similarities and in identifying patient subgroups that are enriched in clinically relevant variables and show differential survival. Moreover, amputation experiments, in which data are deliberately removed, show that supervised prediction of cancer clinical outcomes and Alzheimer's disease diagnosis with miss-SNF on completely missing data achieves results comparable to those obtained when all the data are available.
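
The message-passing strategy mentioned above comes from Similarity Network Fusion (SNF). As a rough illustration only, the sketch below shows a simplified cross-diffusion update in the spirit of standard SNF, in dependency-free Python. It assumes every patient is present in every network, so it does not implement the way miss-SNF handles completely missing sources. The function names, the plain row normalization, and the defaults for `k` and `iterations` are assumptions made for this example and are not taken from the authors' R code.

```python
def row_normalize(M):
    """Scale each row so that it sums to 1."""
    out = []
    for row in M:
        total = sum(row)
        out.append([x / total for x in row])
    return out


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def mean_matrices(mats):
    """Element-wise mean of equally sized square matrices."""
    n = len(mats[0])
    return [[sum(M[i][j] for M in mats) / len(mats) for j in range(n)]
            for i in range(n)]


def local_kernel(W, k):
    """Keep each patient's k most similar other patients, then row-normalize."""
    n = len(W)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: W[i][j], reverse=True)[:k]
        for j in neighbours:
            S[i][j] = W[i][j]
    return row_normalize(S)


def snf(similarities, k=3, iterations=10):
    """Fuse complete patient similarity networks by simplified cross-diffusion."""
    if len(similarities) < 2:
        raise ValueError("need at least two similarity networks")
    P = [row_normalize(W) for W in similarities]      # global (full) kernels
    S = [local_kernel(W, k) for W in similarities]    # local (k-NN) kernels
    for _ in range(iterations):
        P_next = []
        for v in range(len(P)):
            # Each network is updated using the average state of the others.
            others = mean_matrices([P[u] for u in range(len(P)) if u != v])
            diffused = matmul(matmul(S[v], others), transpose(S[v]))
            P_next.append(row_normalize(diffused))
        P = P_next
    fused = mean_matrices(P)
    n = len(fused)
    # Symmetrize the averaged network before returning it.
    return [[(fused[i][j] + fused[j][i]) / 2 for j in range(n)] for i in range(n)]
```

Calling `snf([A, B], k=2, iterations=5)` with two symmetric, strictly positive similarity matrices over the same patients returns one fused symmetric matrix. The update only makes sense when every network contains every patient, which is exactly the restriction the abstract says miss-SNF is designed to lift.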

Availability and implementation: miss-SNF code, implemented in R, is available at https://github.com/AnacletoLAB/missSNF.

Supplementary information: Supplementary information is available at Bioinformatics online.