Simon Ngao Mule, Evaristo Villalba Alemán, Livia Rosa-Fernandes, Joyce S. Saad, Gilberto Santos de Oliveira, Deivid Martins, Claudia Blanes Angeli, Deborah Brandt-Almeida, Mauro Cortez, Martin Røssel Larsen, Jeffrey J. Shaw, Marta M. G. Teixeira, Giuseppe Palmisano
Evolutionary relationships among parasites of the subfamily Leishmaniinae, which includes the pathogenic agents of leishmaniasis, were inferred from differential protein expression profiles derived from mass spectrometry-based quantitative data using the PhyloQuant method. Evolutionary distances were estimated for 11 species from six Leishmaniinae genera after protein and peptide abundances were identified and quantified with Proteome Discoverer and MaxQuant. The results clustered all dixenous species of the genus Leishmania, subgenera L. (Leishmania), L. (Viannia), and L. (Mundinia), as sister to the dixenous species of the genera Endotrypanum and Porcisia. Placed basal to the assemblage formed by all these parasites were the species of the genera Zelonia, Crithidia, and Leptomonas, so far described as monoxenous parasites of insects although occasionally reported from humans. Inferences based on protein expression profiles were congruent with the currently established phylogeny based on DNA sequences. Our results reinforce PhyloQuant as a valuable approach for inferring evolutionary relationships within Leishmaniinae, which comprises very closely related trypanosomatids that are just beginning to be phylogenetically unraveled. In addition to evolutionary history, mapping species-specific protein expression is paramount to understanding differences in infection processes, tissue tropisms, and the potential to jump from insects to vertebrates, including humans, and to identifying targets for species-specific diagnostics and drug development.
Title: Leishmaniinae: Evolutionary inferences based on protein expression profiles (PhyloQuant) congruent with phylogenetic relationships among Leishmania, Endotrypanum, Porcisia, Zelonia, Crithidia, and Leptomonas. Journal: Proteomics (impact factor 3.4). Published: 2024-06-08. DOI: 10.1002/pmic.202100313.
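The idea of estimating evolutionary distances from quantitative protein profiles can be illustrated with a small sketch. The abstract does not state which distance measure or tree-building method PhyloQuant uses, so the log2 transform, the Euclidean distance, the taxon names, and the abundance values below are all assumptions chosen for illustration, not the authors' procedure or data.

```python
import math


def log2_profile(abundances):
    """Log2-transform raw abundances; 1 is added so that zeros stay defined."""
    return [math.log2(a + 1.0) for a in abundances]


def euclidean(p, q):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def distance_matrix(profiles):
    """Return (sorted taxon names, pairwise distance matrix) for a dict of profiles."""
    names = sorted(profiles)
    lengths = {len(profiles[n]) for n in names}
    if len(lengths) > 1:
        raise ValueError("all profiles must quantify the same set of proteins")
    logged = {n: log2_profile(profiles[n]) for n in names}
    matrix = [[euclidean(logged[a], logged[b]) for b in names] for a in names]
    return names, matrix


def closest_pair(names, matrix):
    """Return the two taxa with the smallest distance (first step of agglomerative clustering)."""
    best = None
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if best is None or matrix[i][j] < best[0]:
                best = (matrix[i][j], names[i], names[j])
    return best[1], best[2]


# Made-up abundances for four hypothetical taxa across four proteins.
TOY_PROFILES = {
    "taxon_A": [1023.0, 511.0, 0.0, 63.0],
    "taxon_B": [1023.0, 255.0, 0.0, 63.0],
    "taxon_C": [15.0, 511.0, 255.0, 0.0],
    "taxon_D": [7.0, 511.0, 511.0, 0.0],
}

if __name__ == "__main__":
    taxa, dist = distance_matrix(TOY_PROFILES)
    print(taxa)
    print(closest_pair(taxa, dist))
```

A distance matrix of this kind could then be passed to any standard clustering or tree-inference routine; which one is appropriate depends on the published method, which should be consulted directly.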
Quang H. Nguyen, Thanh-Hoang Nguyen-Vo, Trang T. T. Do, Binh P. Nguyen
Short antimicrobial peptides (AMPs) have been shown to have enhanced antimicrobial activity against a wide spectrum of microbes. Exploring novel and promising short AMPs is therefore essential for developing various types of antimicrobial drugs and treatments. In addition to experimental approaches, computational methods have been developed to improve screening efficiency. Although existing computational methods have achieved satisfactory performance, there is still much room for improvement. In this study, we propose iAMP-DL, an efficient hybrid deep learning architecture for predicting short AMPs. The model was constructed using two well-known deep learning architectures: long short-term memory and convolutional neural networks. To assess its performance fairly, we compared our model with existing state-of-the-art methods on the same independent test set. Our comparative analysis shows that iAMP-DL outperformed the other methods. Furthermore, to assess the robustness and stability of the model, the experiments were repeated 10 times to observe the variation in prediction performance. The results demonstrate that iAMP-DL is an effective, robust, and stable framework for detecting promising short AMPs. A further comparative study of different negative data sampling methods also confirms the effectiveness of our method and demonstrates that it can be used to develop a robust model for predicting AMPs in general. The proposed framework was also deployed as an online web server with a user-friendly interface to support the research community in identifying short AMPs.
Title: An efficient hybrid deep learning architecture for predicting short antimicrobial peptides. Journal: Proteomics (impact factor 3.4). Published: 2024-06-04. DOI: 10.1002/pmic.202300382. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/pmic.202300382
Extracellular vesicles (EVs) are membrane-surrounded vesicles released by various cell types into the extracellular microenvironment. Although EVs vary in size, biological function, and components, their importance in cancer progression and the potential of EV molecular species to serve as novel cancer biomarkers have become increasingly evident. Cancer cells actively release EVs into surrounding tissues, where they play vital roles in cancer progression and metastasis, including invasion and immune modulation. EVs released by cancer cells are therefore often chosen as a gateway in the search for cancer biomarkers. In this review, we focus mainly on molecular profiling of EV protein constituents from breast cancer, emphasizing mass spectrometry (MS)-based proteomic approaches. To further examine the potential of EVs as a source of breast cancer biomarkers, we discuss the use of these proteins as predictive marker candidates. We also summarize the key characteristics of EVs as potential therapeutic targets in breast cancer and describe their implications for breast cancer development and progression. This review may help readers follow recent progress in EV biology and the potential role of EVs as new noninvasive biomarkers, as well as emerging therapeutic opportunities and associated challenges.
Authors: Raju Bandu, Jae Won Oh, Kwang Pyo Kim. Title: Extracellular vesicle proteins as breast cancer biomarkers: Mass spectrometry-based analysis. Journal: Proteomics (impact factor 3.4). Published: 2024-06-03. DOI: 10.1002/pmic.202300062. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/pmic.202300062
Van-An Duong, Altai Enkhbayar, Nobel Bhasin, Lakmini Senavirathna, Eva C Preisner, Kristi L Hoffman, Richa Shukla, Robert R Jenq, Kai Cheng, Mary P Bronner, Daniel Figeys, Robert A Britton, Sheng Pan, Ru Chen
The human gut microbiome plays a vital role in preserving individual health and is intricately involved in essential functions. Imbalances or dysbiosis within the microbiome can significantly impact human health and are associated with many diseases. Several metaproteomics platforms are currently available to study microbial proteins within complex microbial communities. In this study, we attempted to develop an integrated pipeline to provide deeper insights into both the taxonomic and functional aspects of the cultivated human gut microbiomes derived from clinical colon biopsies. We combined a rapid peptide search by MSFragger against the Unified Human Gastrointestinal Protein database and the taxonomic and functional analyses with Unipept Desktop and MetaLab-MAG. Across seven samples, we identified and matched nearly 36,000 unique peptides to approximately 300 species and 11 phyla. Unipept Desktop provided gene ontology, InterPro entries, and enzyme commission number annotations, facilitating the identification of relevant metabolic pathways. MetaLab-MAG contributed functional annotations through Clusters of Orthologous Genes and Non-supervised Orthologous Groups categories. These results unveiled functional similarities and differences among the samples. This integrated pipeline holds the potential to provide deeper insights into the taxonomy and functions of the human gut microbiome for interrogating the intricate connections between microbiome balance and diseases.
Title: A complementary metaproteomic approach to interrogate microbiome cultivated from clinical colon biopsies. Journal: Proteomics (impact factor 3.4). Published: 2024-06-02. DOI: 10.1002/pmic.202400078.
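Taxonomic analysis of identified peptides, as performed by tools such as Unipept, rests on assigning each peptide to the lowest common ancestor (LCA) of all organisms in which it occurs. The following minimal sketch shows that step in isolation. The peptide strings and the simplified four-rank lineages are hypothetical, and the function names are illustrative rather than part of any tool's API.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all lineages.

    Each lineage is a root-to-leaf list of taxon names. Returns None when
    no lineages are given or when they share no rank at all.
    """
    if not lineages:
        return None
    shared = None
    for ranks in zip(*lineages):
        if all(r == ranks[0] for r in ranks):
            shared = ranks[0]
        else:
            break
    return shared


def assign_peptides(peptide_to_lineages):
    """Map each peptide to the LCA of the lineages it was matched to."""
    return {
        peptide: lowest_common_ancestor(lineages)
        for peptide, lineages in peptide_to_lineages.items()
    }


# Simplified lineages (domain, phylum, genus, species) for illustration only.
B_FRAGILIS = ["Bacteria", "Bacteroidota", "Bacteroides", "Bacteroides fragilis"]
B_THETA = ["Bacteria", "Bacteroidota", "Bacteroides", "Bacteroides thetaiotaomicron"]
E_COLI = ["Bacteria", "Pseudomonadota", "Escherichia", "Escherichia coli"]

# Hypothetical peptide sequences and matches.
TOY_MATCHES = {
    "PEPTIDEA": [B_FRAGILIS],
    "PEPTIDEB": [B_FRAGILIS, B_THETA],
    "PEPTIDEC": [B_FRAGILIS, B_THETA, E_COLI],
}

if __name__ == "__main__":
    for peptide, taxon in assign_peptides(TOY_MATCHES).items():
        print(peptide, "->", taxon)
```

A peptide found in a single species is taxon-specific at species level, whereas a peptide shared across phyla is only informative at domain level, which is why counts of unique peptides per species and per phylum can differ so widely.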
Zahoor Ahmed, Kiran Shahzadi, Yanting Jin, Rui Li, Biffon Manyura Momanyi, Hasan Zulfiqar, Lin Ning, Hao Lin
RNA-dependent liquid-liquid phase separation (LLPS) proteins play critical roles in cellular processes such as stress granule formation, DNA repair, RNA metabolism, germ cell development, and protein translation regulation. The abnormal behavior of these proteins is associated with various diseases, particularly neurodegenerative disorders like amyotrophic lateral sclerosis and frontotemporal dementia, making their identification crucial. However, conventional biochemistry-based methods for identifying these proteins are time-consuming and costly. Addressing this challenge, our study developed a robust computational model for their identification. We constructed a comprehensive dataset containing 137 RNA-dependent and 606 non-RNA-dependent LLPS protein sequences, which were then encoded using amino acid composition, composition of K-spaced amino acid pairs, Geary autocorrelation, and conjoined triad methods. Through a combination of correlation analysis, mutual information scoring, and incremental feature selection, we identified an optimal feature subset. This subset was used to train a random forest model, which achieved an accuracy of 90% when tested against an independent dataset. This study demonstrates the potential of computational methods as efficient alternatives for the identification of RNA-dependent LLPS proteins. To enhance the accessibility of the model, a user-centric web server has been established and can be accessed via the link: http://rpp.lin-group.cn.
Title: Identification of RNA-dependent liquid-liquid phase separation proteins using an artificial intelligence strategy. Journal: Proteomics (impact factor 3.4). Published: 2024-06-02. DOI: 10.1002/pmic.202400044 (https://doi.org/10.1002/pmic.202400044).
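Two of the sequence encodings named in the abstract, amino acid composition and composition of K-spaced amino acid pairs, have simple standard definitions that can be sketched in a few lines. The code below follows those general definitions only. The gap values the authors used, their handling of non-standard residues, and the example sequence are not given in the abstract, so the choices here are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, alphabetical by letter
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def amino_acid_composition(seq):
    """Return 20 values: the fraction of the sequence made up by each residue."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def cksaap(seq, k):
    """Return 400 values: frequencies of residue pairs separated by k residues.

    k = 0 gives adjacent pairs. Pairs containing a non-standard letter are
    skipped (an assumption; other conventions exist).
    """
    seq = seq.upper()
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for i in range(len(seq) - k - 1):
        pair = seq[i] + seq[i + k + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in PAIRS]


def encode(seq, gaps=(0, 1, 2)):
    """Concatenate both encodings into one feature vector (gap range is an assumption)."""
    features = amino_acid_composition(seq)
    for k in gaps:
        features.extend(cksaap(seq, k))
    return features


if __name__ == "__main__":
    vector = encode("ACAC")  # toy sequence, not from the study's dataset
    print(len(vector))
```

Vectors of this kind would then go through feature selection and into a classifier such as a random forest; those later steps are omitted here because they depend on library choices the abstract does not specify.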
Fernando J. Peña, Francisco Eduardo Martín-Cano, Laura Becerro-Rey, Cristina Ortega-Ferrusola, Gemma Gaitskell-Phillips, Eva da Silva-Álvarez, María Cruz Gil
The mammalian ejaculate is very well suited to proteomics studies. As such, research concerning sperm proteomics is offering a huge amount of new information on the biology of spermatozoa. Among domestic animals, horses represent a species of special interest, in which reproductive technologies and a sizeable market of genetic material have grown exponentially in the last decade. Studies using proteomic approaches have been conducted in recent years, showing that proteomics is a potent tool to dig into the biology of the stallion spermatozoa. The aim of this review is to present an overview of the research conducted, and how these studies have improved our knowledge of stallion sperm biology. The main outcomes of the research conducted so far have been an improved knowledge of metabolism, and its importance in sperm functions, the impact of different technologies on the sperm proteome, and the identification of potential biomarkers. Moreover, proteomics of seminal plasma and phosphoproteomics are identified as areas of major interest.
Title: Proteomics is advancing the understanding of stallion sperm biology. Journal: Proteomics (impact factor 3.4). Published: 2024-05-29. DOI: 10.1002/pmic.202300522. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/pmic.202300522
Philippe Charlier, Virginie Bourdin, Didier N'Dah, Mélodie Kielbasa, Olivier Pible, Jean Armengaud
The palace of King Ghezo in Abomey, capital of the ancient kingdom of Dahomey (present-day Benin), houses two sacred huts which are specific funerary structures. It is claimed that the binder in their walls is made of human blood. In the study presented here, we devised an original strategy to analyze the proteins present in minute amounts of the cladding sampled from the inner facade of the cenotaph wall and to establish their origin. The extracted proteins were proteolyzed and the resulting peptides were characterized by high-resolution tandem mass spectrometry. Over 6397 distinct molecular entities were identified using cascading searches. Starting from searches of an extended generic database made without a priori assumptions, the peptide repertoire was narrowed down to the most representative organisms, identified by means of taxon-specific peptides. A wide diversity of bacteria, fungi, plants, and animals was detected in the available protein material. This inventory was used to archaeologically reconstruct the voodoo rituals of consecration and maintenance of vitality. Several indicators attested to the presence of traces of human and poultry blood in the sampled material. This study shows the essential advantages of paleoproteomics and metaproteomics for the study of ancient residues from archaeological excavations or historical monuments.
Title: Metaproteomic analysis of King Ghezo tomb wall (Abomey, Benin) confirms 19th century voodoo sacrifices. Journal: Proteomics (impact factor 3.4). Published: 2024-05-29. DOI: 10.1002/pmic.202400048. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/pmic.202400048
Dashleen Kaur, Akanksha Arora, Palani Vigneshwar, Gajendra P. S. Raghava
Peptide hormones serve as genome-encoded signal transduction molecules that play essential roles in multicellular organisms, and their dysregulation can lead to various health problems. In this study, we propose a method for predicting hormonal peptides with high accuracy. The dataset used for training, testing, and evaluating our models consisted of 1174 hormonal and 1174 non-hormonal peptide sequences. Initially, we developed similarity-based methods using BLAST and MERCI software. Although these similarity-based methods provided a high probability of correct prediction, they had limitations, such as returning no hits or covering only a limited number of sequences. To overcome these limitations, we further developed machine learning and deep learning-based models. Our logistic regression-based model achieved a maximum AUROC of 0.93 with an accuracy of 86% on an independent validation dataset. To combine the strengths of similarity-based and machine learning-based models, we developed an ensemble method that achieved an AUROC of 0.96 with an accuracy of 89.79% and a Matthews correlation coefficient (MCC) of 0.8 on the validation set. To help researchers predict and design hormone peptides, we developed a web-based server called HOPPred. This server offers a unique feature that allows the identification of hormone-associated motifs within hormone peptides. The server can be accessed at: https://webs.iiitd.edu.in/raghava/hoppred/.
Title: Prediction of peptide hormones using an ensemble of machine learning and similarity-based methods. Journal: Proteomics (impact factor 3.4). Published: 2024-05-27. DOI: 10.1002/pmic.202400004.
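The three metrics reported for the hormone peptide models (accuracy, Matthews correlation coefficient, and AUROC) can be computed from labels and predictions as sketched below. The formulas are the standard definitions. The label and score lists are made-up toy values for illustration and have no connection to the study's data or results.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 1 (positive) and 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auroc(y_true, scores):
    """AUROC as the fraction of positive-negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels and predictions, invented for this example.
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 0, 0, 0, 1]
    counts = confusion_counts(labels, predicted)
    print(accuracy(*counts), mcc(*counts))
```

MCC is worth reporting alongside accuracy because it uses all four cells of the confusion matrix and stays near zero for a classifier that simply predicts the majority class.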