Qiyuan Bai, Hao Chen, Wenshuo Li, Lei Li, Junhao Li, Zhen Gao, Yuan Li, Xuhua Li, Bing Song
Computer Methods and Programs in Biomedicine, Volume 258, Article 108514. Published 2024-11-12. DOI: 10.1016/j.cmpb.2024.108514. Impact factor: 4.9; JCR: Q1 (Computer Science, Interdisciplinary Applications). https://www.sciencedirect.com/science/article/pii/S0169260724005078
DeepForest-HTP: A novel deep forest approach for predicting antihypertensive peptides
Hypertension is a major preventable risk factor for cardiovascular disease, affecting over 1.5 billion adults worldwide. Antihypertensive peptides (AHTPs) have gained attention as a natural therapeutic option with minimal side effects. This study proposes a Deep Forest-based machine learning framework for AHTP prediction, leveraging a multi-granularity cascade structure to enhance classification accuracy. We integrated data from BIOPEP-UWM and three previously used datasets, totaling 2000 peptide sequences, and introduced novel feature extraction methods to build a comprehensive dataset for model training.
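The abstract does not spell out how the multi-granularity structure processes a peptide, so the following is only a minimal sketch of the general idea behind multi-grained scanning: sliding windows of several sizes over a sequence and encoding each window as a feature vector. The function names, the window sizes, the amino acid composition encoding, and the input sequence are all assumptions chosen for illustration, not the authors' feature extraction method. In a full Deep Forest, each window vector would then be passed to forests whose class-probability outputs feed the cascade; that step is omitted here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(fragment):
    """Return the 20-dimensional amino acid composition of a fragment."""
    counts = Counter(fragment)
    return [counts[aa] / len(fragment) for aa in AMINO_ACIDS]


def multi_grained_windows(sequence, window_sizes=(2, 3)):
    """Slide windows of several sizes over a peptide.

    Returns a dict mapping each window size to a list of feature
    vectors, one per window position. Window sizes are hypothetical.
    """
    features = {}
    for size in window_sizes:
        if size > len(sequence):
            continue  # peptide is shorter than this grain; skip it
        windows = [sequence[i:i + size] for i in range(len(sequence) - size + 1)]
        features[size] = [composition(w) for w in windows]
    return features


# Illustrative input only; not taken from the paper's dataset.
scanned = multi_grained_windows("IPP")
for size, vectors in scanned.items():
    print(f"window size {size}: {len(vectors)} window(s)")
```

Using several window sizes lets short motifs and longer stretches of the same peptide contribute separate feature vectors, which is the intuition behind scanning at more than one granularity.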
This study represents the first application of Deep Forest for AHTP identification, demonstrating substantial classification performance advantages over traditional methods (e.g., SVM, CNN, and XGBoost) as well as recent mainstream prediction models (Ensemble-AHTPpred, CNN-SVM Ensemble, and mAHTPred). Requiring no complex manual feature engineering, the model adapts flexibly to various data needs, offering a novel perspective for efficient AHTP prediction and promising utility in hypertension management.
On the benchmark dataset, the model achieved high accuracy, sensitivity, and AUC, providing a robust tool for identifying safe and effective AHTPs. However, future efforts should incorporate larger and more diverse independent validation datasets to further improve the model and enhance its generalizability. Additionally, the model's predictive accuracy relies on known AHTP targets and sequence features, potentially limiting its ability to detect AHTPs with uncharacterized or atypical properties.
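The three metrics named above can be computed from true labels and predicted scores as in the sketch below. The labels, scores, and the 0.5 decision threshold are made-up toy values used only to show the arithmetic; they are not results from the paper. AUC is computed here as the fraction of positive-negative pairs in which the positive example receives the higher score, with ties counted as half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def sensitivity(y_true, y_pred):
    """True positive rate: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking of positives vs negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = AHTP, 0 = non-AHTP.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(f"accuracy    = {accuracy(y_true, y_pred):.3f}")
print(f"sensitivity = {sensitivity(y_true, y_pred):.3f}")
print(f"AUC         = {auc(y_true, scores):.3f}")
```

Accuracy and sensitivity depend on the chosen threshold, while AUC depends only on how the scores rank positives against negatives, which is why the three are usually reported together.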
About the journal:
To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; to support the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine.
Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.