Direct conversion of peptides into diverse peptidomimetics using a transformer-based chemical language model
Authors: Atsushi Yoshimori, Jürgen Bajorath
Journal: European Journal of Medicinal Chemistry Reports, volume 13, Article 100249
Published: 2025-01-27
DOI: 10.1016/j.ejmcr.2025.100249
URL: https://www.sciencedirect.com/science/article/pii/S2772417425000056
Abstract
The design of pharmaceutically relevant compounds that mimic bioactive peptides or secondary structure elements in proteins is an important task in medicinal chemistry. Over time, various chemical strategies have been developed to convert natural peptide ligands into so-called peptidomimetics. This process is supported by computational approaches that identify peptidomimetic candidate compounds or design templates mimicking active peptide conformations. However, generating peptidomimetics remains challenging. Chemical language models (CLMs) offer new opportunities for molecular design. We have therefore revisited the computational design of peptidomimetics from a different perspective and devised a CLM that directly transforms input peptides into peptidomimetic candidates without requiring intermediate states. A critically important aspect of the approach was the generation of training data for effective learning. Data generation was guided by a quantitative measure of peptide-likeness, so that the CLM could implicitly capture transitions from peptides or peptide-like molecules to compounds with reduced or eliminated peptide character. Herein, we introduce the CLM for peptidomimetics design and establish proof-of-principle for the approach. For given input peptides, both the general model and a version fine-tuned for a specific application produced a spectrum of candidate compounds with varying similarity, gradually changing chemical features, and diminishing peptide-likeness. The CLM and data are provided as part of our study.