DriverMEDS: Cancer driver gene identification using mutual exclusivity from embeded features and driver mutation scoring
Authors: Sichen Yi, Minzhu Xie
Journal: Methods, pp. 22-29 (published 2025-03-18)
DOI: 10.1016/j.ymeth.2025.03.010 (https://doi.org/10.1016/j.ymeth.2025.03.010)
Citation count: 0
Abstract
Efficiently identifying cancer driver genes plays a key role in understanding cancer development and in improving diagnosis and treatment. Current unsupervised driver gene identification methods typically integrate multi-omics data into gene function networks and employ network embedding algorithms to learn gene features. They also treat mutual exclusivity and mutation frequency as crucial signals for identifying driver genes. However, existing approaches neglect the potentially important implications of mutual exclusivity in the embedding space, and they simply assume that all driver genes exhibit high mutation frequencies. We explored the mutual exclusivity embedded in the learned features and verified that the Euclidean distances between learned features are strongly related to mutual exclusivity and can reveal additional information about it. Based on this idea and a novel driver mutation scoring strategy, we designed DriverMEDS, an unsupervised framework for predicting driver genes. First, a feature clustering algorithm generates gene modules. Within each module, the Euclidean distances between learned features are used to calculate a module importance score for each gene based on the related mutual exclusivity. Then, because most driver genes have intermediate mutation frequencies, a driver mutation scoring function is designed for each gene to improve on the existing mutation frequency scoring strategy. Finally, the weighted sum of the module importance score and the driver mutation score is used to prioritize the genes. Experimental results and analysis show that DriverMEDS can detect novel cancer driver genes and relevant functional modules, and that it outperforms five other state-of-the-art methods for cancer driver identification.
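The three scoring steps described in the abstract (a per-module importance score from Euclidean distances, a mutation score that favors intermediate frequencies, and a weighted sum used for ranking) can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything concrete here is an assumption made for illustration: the mean-distance form of `module_importance`, the Gaussian shape and the `peak` and `width` parameters of `mutation_score`, the weight `alpha`, and all function names are hypothetical, not DriverMEDS's own definitions. Gene modules are taken as given input rather than produced by the paper's clustering algorithm.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def module_importance(features, module):
    """Assumed form: mean distance from each gene to the other genes in its
    module, scaled to [0, 1] by the largest value in the module.
    The paper's real score is not specified in the abstract."""
    raw = {}
    for gene in module:
        others = [h for h in module if h != gene]
        if others:
            total = sum(euclidean(features[gene], features[h]) for h in others)
            raw[gene] = total / len(others)
        else:
            raw[gene] = 0.0
    top = max(raw.values(), default=0.0)
    return {g: (s / top if top > 0 else 0.0) for g, s in raw.items()}


def mutation_score(freq, peak=0.05, width=0.05):
    """Assumed form: a Gaussian bump that is highest at an intermediate
    mutation frequency (`peak`) and decays for rarer or more frequent genes."""
    return math.exp(-((freq - peak) ** 2) / (2 * width ** 2))


def prioritize(features, modules, freqs, alpha=0.5):
    """Rank genes by alpha * module importance + (1 - alpha) * mutation score."""
    scores = {}
    for module in modules:
        for gene, importance in module_importance(features, module).items():
            scores[gene] = alpha * importance + (1 - alpha) * mutation_score(freqs[gene])
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (made up): three genes in a single module.
toy_features = {"A": [0.0, 0.0], "B": [3.0, 4.0], "C": [0.0, 0.0]}
toy_freqs = {"A": 0.05, "B": 0.05, "C": 0.5}
ranking = prioritize(toy_features, [["A", "B", "C"]], toy_freqs)
print(ranking)
```

In this toy example, gene C is penalized for its high mutation frequency, which mirrors the abstract's point that a high frequency alone should not push a gene to the top of the ranking.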
About the journal:
Methods focuses on rapidly developing techniques in the experimental biological and medical sciences.
Each topical issue, organized by a guest editor who is an expert in the area covered, consists solely of invited quality articles by specialist authors, many of them reviews. Issues are devoted to specific technical approaches with emphasis on clear detailed descriptions of protocols that allow them to be reproduced easily. The background information provided enables researchers to understand the principles underlying the methods; other helpful sections include comparisons of alternative methods giving the advantages and disadvantages of particular methods, guidance on avoiding potential pitfalls, and suggestions for troubleshooting.