Pub Date: 2026-01-22 DOI: 10.1021/acs.jctc.5c01483
Luis Vasquez, Maxim F Gelin, Sebastian Pios, Lipeng Chen, Zhenggang Lan, Wolfgang Domcke
We present an efficient numerical implementation of the quasi-classical doorway-window approximation, specifically designed for on-the-fly simulations of time-resolved nonlinear spectroscopic signals, in the Julia package WaveMixings.jl. The code takes as input quasi-classical trajectory data (energies of electronic states and transition dipole moments between electronic states) as a function of discretized time, which may be provided by any nonadiabatic quasi-classical dynamics package. The output of WaveMixings.jl is raw spectroscopic data, that is, spectral intensities as functions of one or more frequencies and the pump-probe delay time. The package contains modules that facilitate standard tasks such as input/output data handling, data filtering, and postprocessing, among others. WaveMixings.jl includes implementations of various signals, including integral and dispersed transient absorption pump-probe signals, time- and frequency-resolved fluorescence spectra, and two-dimensional electronic spectra. By developing WaveMixings.jl, we aim to create a versatile platform to perform simulations and develop new methodologies within the quasi-classical doorway-window framework. WaveMixings.jl differs from other existing codes for the simulation of nonlinear time-resolved spectra by the explicit inclusion of the shapes and durations of the laser pulses.
Title: WaveMixings.jl: A Julia Package for Computing Time-Resolved Nonlinear Electronic Spectra from on-the-Fly Quasi-Classical Trajectories
Journal: Journal of Chemical Theory and Computation
Pub Date: 2026-01-21 DOI: 10.1021/acs.jctc.5c01766
Zhixuan Zhong, Linbo Ma, Jian Jiang
Coarse-grained (CG) modeling simplifies molecular systems by mapping groups of atoms into representative particles, but traditional approaches depend on fixed rules and struggle to handle diverse chemical structures. Supervised CG methods further suffer from limited labeled datasets and the inability to control mapping resolution, which is essential for multiscale modeling. To overcome these limitations, we propose MolCluster, an unsupervised model that integrates graph neural networks and a community detection algorithm to extract CG representations. Additionally, a predefined group pair loss ensures the preservation of target groups, and a bisection strategy enables precise, customizable resolution across different molecular systems. On downstream tasks, evaluations on the MARTINI2 dataset demonstrate that MolCluster, benefiting from its label-free pretraining strategy, outperforms both traditional clustering and supervised models in CG mapping and bead type prediction. Overall, these results highlight the potential of MolCluster as a base model for customizable and chemically consistent CG mapping, with future applications extending to polymers, proteins, and other complex multiscale systems.
Title: MolCluster: An Unsupervised Framework for Multiscale Molecular Representations with Physically Consistent Resolution Control
Journal: Journal of Chemical Theory and Computation
Pub Date: 2026-01-21 DOI: 10.1021/acs.jctc.5c01500
Carlos A Martins, Daniela A Damasceno, Keat Yung Hue, Caetano Rodrigues Miranda, Erich A Müller, Rodrigo A Vargas-Hernández
Coarse-grained (CG) force field models are extensively utilized in material simulations because of their scalability. Ordinarily, these models are parametrized using hybrid strategies that sequentially integrate top-down and bottom-up approaches. However, this combination restricts the capacity to jointly optimize all parameters. Although Bayesian optimization (BO) has been explored as an alternative search strategy to identify well-optimized CG parameters, its application has conventionally been limited to low-dimensional scenarios. This has contributed to the assumption that BO is unsuitable for more complex CG models, which often involve a large number of parameters. In this study, we challenge this assumption by successfully extending BO, using the tree-structured Parzen estimator (TPE) model, to optimize a high-dimensional CG model. Specifically, we show that a 41-parameter CG model of Pebax-1657, a copolymer composed of alternating polyamide and polyether segments, can be effectively parametrized using BO, resulting in a model that accurately reproduces the key physical properties of its parent atomistic representation. Our optimization framework simultaneously targets structural and thermodynamic properties, namely, density, radius of gyration, and glass transition temperature. Compared to traditional search algorithms, BO-TPE not only converges faster but also delivers consistent improvements over more standard parametrization approaches.
Title: Bayesian Optimization for High-Dimensional Coarse-Grained Model Parameterization: A Case Study on Pebax Polymer
Journal: Journal of Chemical Theory and Computation
Pub Date: 2026-01-21 DOI: 10.1021/acs.jctc.5c01587
Adam Coxson, Ömer H. Omar, Marcos del Cueto, Alessandro Troisi
We present a machine learning approach (ΔML) capable of enhancing the accuracy of semiempirical excited-state energy calculations to a level close to that of Time-Dependent Density Functional Theory (TDDFT). Using a data set of 7600 organic π-conjugated molecules calculated at the ZINDO and M06-2X/3-21G* TDDFT computational levels, we trained a set of models to learn the systematic errors of the low-level method and correct it toward higher-level accuracy values. The best-performing model improved the correlation of ZINDO S1 energy predictions from 0.77 to 0.96 on a 9500-molecule test set of TDDFT target energies. Our ΔML-ZINDO model adds a negligible cost (∼2 ms per molecule) to a standard ZINDO calculation (∼2 s per molecule), enabling the computational screening of large data sets of molecules. Critical to the performance of the model is the AttentiveFP Message-Passing Neural Network with added electronic information derived from ZINDO calculations, such as particle-hole densities. We also investigate the utility of the Morgan fingerprint and a novel descriptor designed to capture the electronic structure of molecules: a molecular orbital-weighted radial distribution function. The ΔML framework is retrainable to other low- and high-level calculation pairs, achieving an improvement in correlation from 0.88 to 0.99 on a test set of 24,000 molecules from the QCDGE data set, when mapping ZINDO to ωB97X-D/6-31G* energies. We also adapt ΔML-ZINDO for S1 oscillator strength prediction, improving ZINDO predictions from a correlation of 0.524 to 0.839 on our M06-2X/3-21G* target test set, thus enabling the identification of emissive molecules.
Title: Predicting S1 TDDFT Energies from ZINDO Calculations Using Message-Passing ΔML with Electronically Informed Descriptors
Journal: Journal of Chemical Theory and Computation
Pub Date: 2026-01-20 DOI: 10.1021/acs.jctc.5c01809
Joachim Galiana, Stefano M Cavaletto, Gilbert Grell, Francisco Fernández-Villoria, Alicia Palacios, Jesús González-Vázquez, Fernando Martín
Recent advances in the generation of ultrashort, few-femtosecond laser pulses in the ultraviolet-visible domain are now enabling the coherent excitation of several electronic states in neutral molecules, with new opportunities for the manipulation of molecular dynamics on ultrafast time scales. Current time-resolved pump-probe experiments can monitor the ensuing coupled electron-nuclear dynamics with ultrashort resolution. Computational modeling of the observables measured in such experiments can be very challenging for medium-sized and large molecules because of (i) the nontrivial treatment of pump-generated coherences with mixed quantum-classical methods and (ii) the high computational cost of probe-step calculations, which cannot be afforded when many different pump pulses have to be considered, e.g., in control schemes. In this work, we present two trajectory-surface-hopping approaches that include, a posteriori, the effect of the pump-generated coherences on the ensuing coupled electron-nuclear dynamics, thus avoiding a separate coupled electron-nuclear dynamics calculation for every individual pump pulse. The effectiveness of both approaches is exemplified in glycine molecules excited by short ultraviolet pump pulses. We compare the results of both approaches with those obtained by including pump-generated coherences from the very beginning, showing excellent agreement and confirming the important role of such initial coherences in the early nonadiabatic dynamics. Our results pave the way for both accurate and flexible simulations of pump-probe experiments or control studies in molecules excited by broadband laser sources.
Title: Accounting for Electronic Coherences Induced by Broadband Pulses by Using Pulse-Independent Trajectories
Journal: Journal of Chemical Theory and Computation
Pub Date: 2026-01-20 DOI: 10.1021/acs.jctc.5c01695
I-O Stan, T P Straatsma, R Broer, C de Graaf, X López
Key electronic processes related to molecular excitonic states of finite stacks of indolonaphthyridine molecules are analyzed via the non-orthogonal configuration interaction with fragments (NOCI-F) method. Indolonaphthyridine is an organic chromophore that can undergo several electronic photoexcitation-related intermolecular processes, such as exciton and electron transfer. The structures studied here are noncrystalline arrangements built as either ordered stacks of indolonaphthyridine or stacks extracted from molecular dynamics simulations including thermal disorder. Taking dimers or trimers from either model, we performed CASSCF and NOCI-F calculations to quantify the intermolecular electronic couplings governing singlet fission, excited singlet and triplet diffusion, and hole and electron diffusion processes. Also, comparing the results for the different models, we studied the effect of structural disorder and distortion on these couplings. Finally, we present a newly developed, advanced postanalysis tool. It takes the NOCI-F data as input to carry out a multifragment full Hamiltonian procedure that involves the complete stack, providing physical information not available from the dimer/trimer models, hence giving access to additional insight into the material's properties.
Title: NOCI-F Electronic Couplings in Assemblies of Indolonaphthyridine Molecules: From Dimers to the Full Stack
Journal: Journal of Chemical Theory and Computation
Pub Date: 2026-01-20 DOI: 10.1021/acs.jctc.6c00041
Feng Long Gu, Daoling Peng, Liang Peng, Weitao Yang
Title: Correction to “Electronic Excitation Energy Calculations with Configuration Interaction Based on Nonorthogonal Localized Molecular Orbitals”
Journal: Journal of Chemical Theory and Computation
Pub Date: 2026-01-19 DOI: 10.1021/acs.jctc.6c00042
Liang Peng, Daoling Peng, Feng Long Gu, Weitao Yang
Title: Correction to “Regularized Localized Molecular Orbitals in a Divide-and-Conquer Approach for Linear Scaling Calculations”
Journal: Journal of Chemical Theory and Computation
Pub Date: 2026-01-19 DOI: 10.1021/acs.jctc.5c01773
Margherita Mele, Raffaele Fiorentini, Thomas Tarenzi, Giovanni Mattiotti, Raffaello Potestio
The choice of structural resolution is a fundamental aspect of protein modeling, determining the balance between descriptive power and interpretability. Although atomistic simulations provide maximal detail, much of this information is redundant for understanding the relevant large-scale motions and conformational states. Here, we introduce an unsupervised information-theoretic framework that determines the minimal number of atoms required to retain a maximally informative description of the configurational space sampled by a protein. This framework quantifies the informativeness of coarse-grained representations obtained by systematically decimating atomic degrees of freedom and evaluating the resulting clustering of sampled conformations. Application to molecular dynamics trajectories of dynamically diverse proteins shows that the optimal number of retained atoms scales linearly with system size, averaging about four heavy atoms per residue, remarkably consistent with the resolution of well-established coarse-grained models, such as MARTINI and SIRAH. Furthermore, the analysis shows that the optimal number of retained atoms depends not only on molecular size but also on the extent of conformational exploration, decreasing for systems dominated by collective motions. The proposed method establishes a general criterion to identify the minimal structural detail that preserves the essential configurational information, thereby offering a new viewpoint on the structure-dynamics-function relationship in proteins and guiding the construction of parsimonious yet informative multiscale models.
Title: Determining the Optimal Structural Resolution of Proteins through an Information-Theoretic Analysis of Their Conformational Ensemble
Journal: Journal of Chemical Theory and Computation