Francesca Peccati, Cristina M. Segovia, Reyes Núñez-Franco, Gonzalo Jiménez-Osés
The ability to computationally predict changes in protein thermostability upon mutation is crucial for advancing protein design and engineering, with applications ranging from therapeutics to biocatalysis. This review provides a comprehensive overview of the significant challenges and diverse computational strategies for predicting protein stability and understanding epistatic interactions across protein variants. A primary obstacle to this goal is the scarcity of high-quality, large-scale thermodynamic datasets, which are often biased toward single-point, destabilizing mutations and lack standardized experimental metrics. This limitation directly impacts the performance and generalizability of data-driven methods, from early machine learning approaches to modern deep learning architectures such as ThermoMPNN and protein language models. Physics-based approaches, such as those employing Rosetta and FoldX energy functions, offer valuable insights but are often limited by their reliance on static structures and oversimplified representations of the unfolded state. While molecular dynamics simulations can capture the critical role of protein flexibility and dynamics in thermostabilization, their computational cost restricts their application in high-throughput screening. Accurately predicting the effects of multiple mutations is further complicated by epistasis, where nonadditive interactions can significantly alter stability and function. Overcoming these hurdles requires a synergistic approach, integrating AI-driven predictions with physics-based simulations and accurate conformational sampling methods. Promising future directions include the development of more comprehensive and unbiased datasets, and improved modeling of epistasis and the (un)folded states and their ensembles. 
Such advancements are essential for enhancing the reliability of thermostability predictions and navigating the complex stability–activity trade-offs inherent in protein optimization and design.
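As a minimal sketch of what "nonadditive" means for a pair of mutations, epistasis can be quantified as the deviation of the double mutant's stability change from the sum of the two single-mutant changes. The function names, the 0.5 kcal/mol tolerance, and the numerical values below are hypothetical and chosen only for illustration; they are not taken from the review. The sign convention for ΔΔG is likewise left open as an assumption, since it varies between datasets.

```python
def pairwise_epistasis(ddg_a: float, ddg_b: float, ddg_ab: float) -> float:
    """Deviation from additivity for two mutations A and B.

    epsilon = ddG_AB - (ddG_A + ddG_B)

    All inputs are stability changes relative to the same wild type,
    in the same units (e.g. kcal/mol) and with the same sign convention.
    """
    return ddg_ab - (ddg_a + ddg_b)


def is_additive(epsilon: float, tolerance: float = 0.5) -> bool:
    """Treat the pair as additive if |epsilon| is within a tolerance.

    The default tolerance is an illustrative assumption, standing in for
    experimental or model uncertainty.
    """
    return abs(epsilon) <= tolerance


if __name__ == "__main__":
    # Hypothetical values: single mutants change stability by 1.0 and 0.5,
    # while the double mutant changes it by 2.5 (same units throughout).
    eps = pairwise_epistasis(ddg_a=1.0, ddg_b=0.5, ddg_ab=2.5)
    print(f"epsilon = {eps:.2f}, additive: {is_additive(eps)}")
```

A value of epsilon close to zero means the single-mutant effects simply add up; a larger magnitude indicates that the two positions interact, which is the case additive predictors built from single-point data cannot capture.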
Computation of Protein Thermostability and Epistasis. Wiley Interdisciplinary Reviews: Computational Molecular Science, volume 15, issue 5. Published 2025-09-18. DOI: 10.1002/wcms.70045. PDF: https://wires.onlinelibrary.wiley.com/doi/epdf/10.1002/wcms.70045