Simplicity within biological complexity
Natasa Przulj, Noel Malod-Dognin
arXiv - QuanBio - Other Quantitative Biology, published 2024-05-15
DOI: arxiv-2405.09595 (https://doi.org/arxiv-2405.09595)
Citations: 0
Abstract
Heterogeneous, interconnected, systems-level molecular data have become
increasingly available and are now key to precision medicine. We need to use them
to better stratify patients into risk groups, discover new biomarkers and
targets, repurpose known drugs and discover new ones, and so personalize medical
treatment. Existing methodologies are limited, and a paradigm shift is needed to
achieve quantitative and qualitative breakthroughs. In this perspective paper,
we survey the literature and argue for the development of a comprehensive,
general framework for embedding multi-scale molecular network data that
would enable their explainable exploitation in precision medicine in linear
time. Network embedding methods map nodes to points in a low-dimensional space,
so that proximity in the learned space reflects the network's topology-function
relationships. They have recently achieved unprecedented performance on hard
problems that utilize only a few omic data types in various biomedical applications.
However, research thus far has been limited to special variants of the problems
and data, with the performance depending on the underlying topology-function
network biology hypotheses, the biomedical applications and evaluation metrics.
The availability of multi-omic data, modern graph embedding paradigms and
compute power calls for the creation and training of efficient, explainable and
controllable models that exhibit no potentially dangerous, unexpected behaviour
and that deliver a qualitative breakthrough. We propose to develop a general,
comprehensive embedding framework for multi-omic network data, from models to
efficient and scalable software implementation, and to apply it to biomedical
informatics. It will lead to a paradigm shift in computational and biomedical
understanding of data and diseases that will open up ways of solving some of
the major bottlenecks in precision medicine and other domains.
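
To make the abstract's description of network embedding concrete, the following is a minimal sketch of how nodes can be mapped to points so that proximity reflects topology. It is not the framework the authors propose: it uses a classical spectral embedding (the Fiedler vector of the graph Laplacian, found by shifted power iteration) as a stand-in, and maps each node to a single coordinate. The function name `fiedler_embedding`, its parameters, and the toy graph of two triangles joined by one edge are all assumptions made for illustration.

```python
import math


def fiedler_embedding(n, edges, iters=500):
    """Illustrative 1-D spectral embedding of an undirected graph.

    Assumes a connected graph with nodes labelled 0..n-1 (n >= 2).
    Returns one coordinate per node: the Fiedler vector of L = D - A.
    """
    # Build adjacency lists and node degrees.
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    deg = [len(neighbours) for neighbours in adj]

    # 2 * max degree is an upper bound on the largest Laplacian eigenvalue,
    # so (shift * I - L) is positive semidefinite.
    shift = 2 * max(deg)

    def center_and_normalize(x):
        # Remove the constant component (eigenvalue 0 of L), then rescale.
        mean = sum(x) / n
        x = [xi - mean for xi in x]
        norm = math.sqrt(sum(xi * xi for xi in x))
        return [xi / norm for xi in x]

    # Deterministic, non-constant start vector.
    x = center_and_normalize([float(i) for i in range(n)])
    for _ in range(iters):
        # y = (shift * I - L) x  with  L = D - A
        y = [(shift - deg[i]) * x[i] + sum(x[j] for j in adj[i])
             for i in range(n)]
        x = center_and_normalize(y)
    return x


# Toy network: two triangles {0, 1, 2} and {3, 4, 5} joined by edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
coords = fiedler_embedding(6, edges)
for node, c in enumerate(coords):
    print(node, round(c, 3))
```

On this toy graph, the nodes of each triangle receive coordinates of the same sign and the two triangles land on opposite sides of zero, so distance in the embedding separates the two densely connected groups. Embeddings used in practice have more dimensions and, as the abstract notes, aim to capture topology-function relationships rather than topology alone.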