{"title":"Employing Machine Learning Techniques to Detect Protein Function: A Survey, Experimental, and Empirical Evaluations","authors":"Kamal Taha","doi":"10.1109/TCBB.2024.3427381","DOIUrl":null,"url":null,"abstract":"This review article delves deeply into the various machine learning (ML) methods and algorithms employed in discerning protein functions. Each method discussed is assessed for its efficacy, limitations, potential improvements, and future prospects. We present an innovative hierarchical classification system that arranges algorithms into intricate categories and unique techniques. This taxonomy is based on a tri-level hierarchy, starting with the methodology category and narrowing down to specific techniques. Such a framework allows for a structured and comprehensive classification of algorithms, assisting researchers in understanding the interrelationships among diverse algorithms and techniques. The study incorporates both empirical and experimental evaluations to differentiate between the techniques. The empirical evaluation ranks the techniques based on four criteria. The experimental assessments rank: (1) individual techniques under the same methodology sub-category, (2) different sub-categories within the same category, and (3) the broad categories themselves. Integrating the innovative methodological classification, empirical findings, and experimental assessments, the article offers a well-rounded understanding of ML strategies in protein function identification. The paper also explores techniques for multi-task and multi-label detection of protein functions, in addition to focusing on single-task methods. Moreover, the paper sheds light on the future avenues of ML in protein function determination.","PeriodicalId":13344,"journal":{"name":"IEEE/ACM Transactions on Computational Biology and Bioinformatics","volume":"21 6","pages":"1965-1986"},"PeriodicalIF":3.6000,"publicationDate":"2024-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE/ACM Transactions on Computational Biology and Bioinformatics","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10598224/","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
This review article delves deeply into the various machine learning (ML) methods and algorithms employed in discerning protein functions. Each method discussed is assessed for its efficacy, limitations, potential improvements, and future prospects. We present an innovative hierarchical classification system that arranges algorithms into intricate categories and unique techniques. This taxonomy is based on a tri-level hierarchy, starting with the methodology category and narrowing down to specific techniques. Such a framework allows for a structured and comprehensive classification of algorithms, assisting researchers in understanding the interrelationships among diverse algorithms and techniques. The study incorporates both empirical and experimental evaluations to differentiate between the techniques. The empirical evaluation ranks the techniques based on four criteria. The experimental assessments rank: (1) individual techniques under the same methodology sub-category, (2) different sub-categories within the same category, and (3) the broad categories themselves. Integrating the innovative methodological classification, empirical findings, and experimental assessments, the article offers a well-rounded understanding of ML strategies in protein function identification. The paper also explores techniques for multi-task and multi-label detection of protein functions, in addition to focusing on single-task methods. Moreover, the paper sheds light on the future avenues of ML in protein function determination.
这篇综述文章深入探讨了用于辨别蛋白质功能的各种机器学习(ML)方法和算法。文章对所讨论的每种方法的功效、局限性、潜在改进和未来前景进行了评估。我们提出了一种创新的分层分类系统,将算法分为复杂的类别和独特的技术。这种分类法基于三级层次结构,从方法类别开始,逐渐缩小到具体技术。这种框架可以对算法进行结构化的全面分类,帮助研究人员了解各种算法和技术之间的相互关系。研究结合了经验评估和实验评估来区分不同的技术。经验评估根据四项标准对技术进行排名。实验评估将:(1) 同一方法子类别下的单项技术;(2) 同一类别中的不同子类别;(3) 大类别本身。综合创新方法分类、经验发现和实验评估,文章提供了对蛋白质功能鉴定中的 ML 策略的全面理解。除了关注单任务方法,本文还探讨了蛋白质功能的多任务和多标签检测技术。此外,文章还揭示了 ML 在蛋白质功能鉴定中的未来发展方向。
期刊介绍:
IEEE/ACM Transactions on Computational Biology and Bioinformatics emphasizes the algorithmic, mathematical, statistical and computational methods that are central in bioinformatics and computational biology; the development and testing of effective computer programs in bioinformatics; the development of biological databases; and important biological results that are obtained from the use of these methods, programs and databases; the emerging field of Systems Biology, where many forms of data are used to create a computer-based model of a complex biological system