In this work, we present two alternative computational strategies to determine the populations of nonbonded aggregates. One approach extracts these populations from molecular dynamics (MD) simulations, while the other employs quantum mechanical partition functions for the most relevant minima of the multimolecular potential energy surfaces (PESs), identified by automated conformational sampling. In both cases, we adopt a common graph-theory-based framework, introduced in this work, for identifying aggregate conformations, which enables a consistent comparative assessment of the two methodologies and provides insight into the underlying approximations. We apply both strategies to investigate phenol aggregates, up to the tetramer, at different concentrations in phenol/carbon tetrachloride mixtures. Subsequently, we simulate the concentration-dependent OH stretching region of the infrared (IR) spectrum by averaging the harmonic IR spectra of the aggregates, weighted by the populations predicted by each strategy. Our results indicate that the populations extracted from MD trajectories yield OH stretching signals that closely follow the experimental trends, outperforming the spectra from populations obtained by systematic conformational searches. We attribute the better performance of MD to its better description of the entropic contributions. Moreover, the proposed protocol not only successfully addresses a very challenging problem but also offers a benchmark to assess the accuracy of intermolecular force fields.
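The abstract above does not spell out its graph-theory-based framework, so the following is only a minimal sketch of the general idea, under stated assumptions: each molecule is reduced to a single site (for example its oxygen atom), two molecules are linked when their sites are closer than a hypothetical distance cutoff, and aggregates are taken as the connected components of the resulting graph. The function names, the cutoff value, and the toy coordinates are illustrative and not taken from the paper.

```python
from collections import Counter
from math import dist


def find_aggregates(positions, cutoff):
    """Return aggregates as connected components of a distance-based contact graph."""
    n = len(positions)
    # An edge joins two molecules whose sites are closer than the cutoff
    # (assumed criterion; a real hydrogen-bond test would also use angles).
    neighbors = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(positions[i], positions[j]) < cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)
    # Depth-first search collects each connected component once.
    seen, aggregates = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nb in neighbors[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        aggregates.append(sorted(component))
    return aggregates


def size_populations(frames, cutoff):
    """Fraction of molecules found in aggregates of each size, pooled over frames."""
    counts, total = Counter(), 0
    for positions in frames:
        for aggregate in find_aggregates(positions, cutoff):
            counts[len(aggregate)] += len(aggregate)
            total += len(aggregate)
    return {size: count / total for size, count in sorted(counts.items())}


# Toy trajectory: hypothetical site coordinates in angstrom, one point per molecule.
frames = [
    [(0.0, 0.0, 0.0), (2.8, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.8, 0.0, 0.0), (5.6, 0.0, 0.0), (20.0, 0.0, 0.0)],
]
populations = size_populations(frames, cutoff=3.5)
print(populations)
```

In this toy case the first frame contains one dimer and two monomers, and the second a trimer and one monomer. Populations of this kind could then serve as weights when averaging per-aggregate spectra.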
We use unsupervised machine learning to construct a phase diagram of a simple 2D rose water model. The method combines dimensionality reduction with clustering algorithms. Two different data sets from the same simulations serve as input: angular distribution functions and a set of thermodynamic, dynamic, and structural properties. To evaluate the efficiency of the method, we compare the machine learning results with manually determined phase diagrams. We show that the method successfully predicts the phase diagram of the rose water model. Furthermore, the phase diagrams obtained from the two data sets are in semiquantitative agreement with each other. Four different solid phases, one liquid phase, and one gaseous phase were identified. The method is straightforward, easy to implement, and requires almost no prior knowledge of the system to obtain a phase diagram. It can also distinguish between parts of the same phase that have different properties or a sufficiently different structure, and in this way find local differences and anomalies.
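The abstract above does not name the specific dimensionality reduction or clustering algorithms, so the sketch below uses principal component analysis and k-means purely as stand-ins. It assumes each row of the feature matrix describes one simulated state point (for instance a temperature and pressure pair) and each column one measured property. The synthetic data, the function names, and the farthest-point initialization are illustrative choices, not the authors' procedure.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Standardize the features and project them onto the leading principal components."""
    x = features - features.mean(axis=0)
    std = x.std(axis=0)
    x = x / np.where(std > 0, std, 1.0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return x @ vt[:n_components].T


def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add the farthest one."""
    centers = [points[0]]
    while len(centers) < k:
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    return np.array(centers)


def kmeans(points, k, n_iter=100):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-in for simulation output: three groups of state points with
# distinct property profiles (hypothetical values, 20 state points per group).
rng = np.random.default_rng(0)
prototypes = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 0.0], [0.0, 5.0, 5.0]])
features = np.vstack([p + 0.1 * rng.normal(size=(20, 3)) for p in prototypes])

labels = kmeans(pca_project(features), k=3)
print(labels)
```

Plotting the labels against the thermodynamic coordinates of the state points would give a candidate phase diagram. The number of clusters is set by hand here, whereas a real application would have to choose or validate it.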
Achieving both robust extrapolation and physical interpretability in machine learning interatomic potentials (ML-IPs) for atomistic simulation remains a significant challenge, particularly in data-scarce areas such as chemical reactions or complex, multicomponent materials at extreme conditions. Here, we present a pairwise-decomposed physics-informed neural network (P2Net) that parametrizes an analytical bond-order potential (BOP) layer to decouple the energy contributions of atomic pairs. By leveraging fundamental physical principles, P2Net extrapolates well beyond its training regime and accurately captures molecular geometries far from equilibrium. The pairwise energy decomposition further enables bond analyses for deprotonation and SN2 reactions, which are difficult with most ML-IPs. The atomic pair energies provide a way to elucidate the evolution of interatomic interactions as a reaction proceeds. Our methodology offers enhanced data efficiency in building ML-IPs and facilitates more informative postsimulation analysis, thereby broadening the applicability of ML-IPs to complex and reactive systems.
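To illustrate what a pairwise energy decomposition looks like in practice, the sketch below evaluates a generic Morse-like bond-order form, in which each pair contributes a repulsive term minus a bond-order-scaled attractive term. This is not the P2Net functional form, which the abstract does not give. In P2Net the pair parameters would come from the neural network, whereas here they are fixed hypothetical numbers, and all names are illustrative.

```python
from itertools import combinations
from math import dist, exp


def pair_energy(r, d_e, alpha, r_e, bond_order):
    """Morse-like pair term: repulsion minus bond-order-scaled attraction."""
    repulsive = d_e * exp(-2.0 * alpha * (r - r_e))
    attractive = 2.0 * d_e * exp(-alpha * (r - r_e))
    return repulsive - bond_order * attractive


def decompose_energy(positions, params):
    """Return the per-pair energies and their sum, the total energy."""
    pairs = {}
    for i, j in combinations(range(len(positions)), 2):
        r = dist(positions[i], positions[j])
        pairs[(i, j)] = pair_energy(r, **params[(i, j)])
    return pairs, sum(pairs.values())


# Hypothetical three-atom chain (angstrom) with hand-picked parameters that
# stand in for the output of a trained network.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
bonded = {"d_e": 4.5, "alpha": 2.0, "r_e": 1.0, "bond_order": 1.0}
nonbonded = {"d_e": 4.5, "alpha": 2.0, "r_e": 1.0, "bond_order": 0.0}
params = {(0, 1): bonded, (1, 2): bonded, (0, 2): nonbonded}

pairs, total = decompose_energy(positions, params)
for pair, energy in pairs.items():
    print(pair, round(energy, 4))
print("total", round(total, 4))
```

Because the total is an explicit sum over pairs, the contribution of a forming or breaking bond can be followed along a reaction coordinate by tracking the corresponding entry of `pairs` from frame to frame.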