Introduction

The development of ML methods for molecular modelling has opened the possibility to simulate large length scales and long time scales with ab initio-level accuracy1,2,3. ML methods have been employed successfully to model isolated molecules4,5,6,7,8, where the directional intra-molecular interactions dominate, as well as inorganic solids and liquids, where the interactions are homogeneous9,10,11,12,13,14,15,16,17,18,19,20,21,22. The molecular condensed phase presents unique challenges owing to a large separation of scales between intra- and inter-molecular interactions23,24,25,26,27,28. Molecular mixtures, in particular, present further complications owing to the heterogeneity of the inter-molecular environments.

Many important properties of the molecular liquids such as: density, viscosity and dielectric constant; depend specifically, and critically on inter-molecular forces. It is therefore crucial that ML models obtain good accuracies on both intra- and inter-scales. Numerous studies have demonstrated ML potentials for water and aqueous solutions29,30,31,32,33, methane26, ionic liquids28 and, more recently, electrolyte solutions34,35,36. The standard approach to modelling these molecular systems is to create separate force fields for the intra- and inter- contributions2,37,38,39. This approach is inspired by classical force fields and neatly solves the problem of scale separation. However the approximation breaks down in reactive systems where the molecular identity changes during the course of the simulation.

In this work, we intentionally choose to fit ML models on total interactions, without an explicit separation of scales, so our potential are entirely general and could, in principle, capture reactivity40,41. We aim to answer the overarching question: can we develop an accurate and robust ML force field for molecular liquids mixtures without making the scale separation explicit? In particular, can these models generate long and stable molecular dynamics trajectories and reproduce the thermodynamic properties of the reference method? As shown in other studies, even reproducing the density of a molecular system is not a trivial task42 and merits a meticulous investigation. Here we show that a good fit on the total loss function results in good intra-, but poor inter- relative accuracies. In practice, this makes it difficult to fit the inter- contribution which underlies the thermodynamic properties of the liquid state.

Specifically, this paper develops an ML force field for the binary solvent EC:EMC (3:7 M) of the standard LP57 electrolyte (1M LiPF6 in EC:EMC (3:7 M) < 10 ppm H2O, BASF). This formulation comprises a mixture of cyclic and linear carbonates because of their complementary properties: cyclic carbonates have large dipole moments and can efficiently dissociate the salt, while linear carbonates have higher mobility and improve ion diffusion. The solvation structure and ion transport properties of mixed liquid carbonate electrolytes have been extensively explored using classical dynamics in conjunction with quantum calculations43,44,45, polarizable force fields46,47,48, reactive force fields49, short ab initio dynamics47,50 and more recently neural network potentials36. Lower-accuracy methods mischaracterize the ion solvation structure, while higher-accuracy methods are too expensive to capture the slow dynamics and, ultimately, the ion transport. The recent work in ref. 36 demonstrates an ML potential which captures long time scale ion transport with high accuracy, reproducing experimental diffusivities. In their model, however, the inter-molecular interaction is largely captured by a Coulomb interaction based on learned atomic charges and a purely geometry dependent empirical dispersion term.

Recent efforts in ML modelling have developed strategies to deal with the long-range electrostatics51,52,53,54,55,56,57 which is important at interfaces32, in molecular clusters and in the vapor phase58. Here, we find that short-range models are sufficiently accurate to describe our neutral solvent liquid. This result is in line with local molecular-field theory (LMFT)59 which states that homogeneous bulk systems can be modelled with sort-range potentials. The absence of an analytic baseline (e.g. Coulomb, Lennard-Jones) makes our ML modelling more challenging because the force fields must correctly describe the subtle inter-molecular interactions to reproduce the underlying dynamics. However the models we obtain remain general and versatile. We show that by crafting a diverse training set through iterative training and carefully testing intra- and inter-molecular performance, we can fit a general potential that describes the EC:EMC binary solvent liquid for a wide range of compositions. This paves the way for a more general fully reactive force field in the future. More importantly, the conclusions we draw here, likely generalize to other molecular systems as well.

Results

We present our results for fitting a Gaussian Approximation Potential (GAP)60 with Smooth Overlap of Atomic Positions (SOAP) descriptors61 for the EC:EMC molecular solvent. Our main goal is to generate stable dynamics in the NPT ensemble and correctly predict the density of the reference method. For a finite-size system, constrained volume simulation ensembles (NVE, NVT) can mask the poor performance of ML potentials in describing the weak inter-molecular interactions. In the NPT ensemble the density becomes an observable that is highly sensitive to small errors in the inter-molecular interactions.

In the section 'OPLS training sets', we demonstrate that fitting ML models on fixed training-sets results in unstable potentials. In the section 'Iterative training', we show that by iteratively updating the training data set, we are able to continuously improve our GAP models and obtain stable GAP-MD trajectories on the PBE-D2 potential energy surface (PES). We explore different XC functionals (Solvent dimer interactions), retrain our converged GAP model with PBE-D3 data (Comparison to Ab initio MD) and compare the PBE-D3/GAP-MD density prediction to experimental results (Comparison to Experiment). In the sections that follow (Intra-/Inter- decomposition–Model optimization), we present one-by-one the key challenges for fitting ML potentials in multi-component molecular systems, as well as the solutions and tests we developed to address them: the intra-/inter- error decomposition, the rigid-molecular volume-scans, the transferability across molecular compositions and the final model optimizations.

Quantum-mechanical calculations are central to this paper and are used in several different contexts: (i) to provide the energies, forces and virials that build up the training set, (ii) for the testing of XC functionals using molecular complexes, and (iii) to provide ab initio molecular dynamics (AIMD) simulation data as a reference for comparison with the ML-derived results.

OPLS training sets

Obtaining an ML potential without iterative training would be ideal in many applications, but does it work? We test this approach by fitting a series of ML models on a fixed data set without using iterative training. We employed the classical OPLS force field62 to sample diverse molecular configurations for EC:EMC (3:7 M) at a wide range of densities (0.4–1.3 g cm−3) and temperatures (300–1200 K). At all densities we monitor the mean square displacement (MSD) and choose temperatures at which all ns-long OPLS trajectories remain fluid. The high entropy of the liquid allows the system to explore a diverse distribution of molecular environments. We construct training sets of different sizes (ranging from 200 to 1000 configurations) by selecting uncorrelated samples at intervals of 10 to 50 ps, and recompute their energy, forces and virials at the PBE-D2 level of theory.

We train a range of different GAP models extensively exploring the hyper-parameter space, and obtain force root mean square errors (RMSEs) comparable to the level of force errors inherent in the approximation due to the short range cutoff (up to about 6 Å), as revealed by the “locality test” introduced in ref. 9 (see the SI). However, we find that all these models invariably fail when performing NPT dynamics. Figure 1 shows four such GAP models which all behave similarly: shortly after the start of the dynamics, spontaneous bubbles appear and grow within the liquid, which causes the unphysical collapse of the density. This instability reveals a poor description of the inter-molecular degrees of freedom (DoFs).

Fig. 1: Bubble instabilities.
figure 1

A, B Show the potential energy and temperature of GAP-MD simulations in the NVE and NPT ensembles, respectively. C Shows the evolution of density in the NPT ensemble for four different GAP models, which reveals trajectories are unstable and form bubbles (green shaded volume). The problem persists for all GAP models trained on OPLS-generated training sets, independent of the model hyper-parameter choice.

Interestingly, all models yield stable dynamics in the NVT and NVE ensembles, an example of which is shown in Fig. 1. Molecules remain intact during the simulation which indicates all models capture the intra-molecular DoFs reasonably well. Due to equipartition, most thermal energy is stored in the more numerous intra- modes, which explains why potential and kinetic energies remain stable despite the poor description of the inter- modes. Since inter- modes are soft, they are unlikely to cause integration problems at the 1 fs time step, therefore the dynamics can appear stable in the fixed-volume ensembles. The density observable in the NPT ensemble is much more sensitive to the poor description of the inter-molecular interactions. In the thermodynamic limit, all ensembles are equivalent. Therefore, bubble instability will eventually form in the NVE/NVT simulations as well, but might require significantly larger system sizes.

Iterative training

Iterative training solves the inherent self-consistency problem of ML fitting: good potentials require not only a training set that is representative of the probability distribution of the DFT PES, but the potential needs to be reasonably accurate further away too (for configuration never encountered by a simulation on the true PES) otherwise it will not generate the appropriate distribution when used as a sampler.

We use an iterative training protocol that simultaneously improves our potentials and the diversity of the training data sets generated by the potentials. At each iteration, we expand the training set and we fit a new model. Training is performed on PBE-D2 data and all GAP models employ double SOAP (DS) or double Turbo SOAP (DTS) descriptors63 (details in the Methods section), however only DS models are used to generate new samples in the iterative protocol. Unphysical density fluctuations observed in NPT dynamics are used as a proxy for collecting new configurations and extending the training set (details in the methods section and the SI). As part of the protocol, we address key challenges specific to mixed molecular liquids underpinned by the imbalance between intra- and inter-molecular interactions. In particular, we add multiple molecular compositions, isolated molecules, as well as rigid-molecule volume scans to the training data. In addition, we analyse the errors split into intra- and inter- errors and use these metrics to optimize the SOAP descriptors. We test the potentials using 48-molecule NPT GAP-MD and monitor the density fluctuations at different temperatures and molecular compositions. We also check reproducibility by running independent trajectories at the same temperature and composition, but starting from different configurations. Based on OPLS dynamics and experimental results, we expect the density should monotonically decrease with composition: highest for pure EC to lowest for pure EMC.

Figure 2 shows the systematic improvement of our GAP models across multiple generations of iterative training. After a few iterations, at Gen5/DS, the densities of the target molecular composition, EC:EMC (3:7 M), remain relatively stable over 100 ps of trajectory, even at temperatures as high as 500 K. This model is markedly better than those trained on OPLS data alone and does not form bubbles. Interestingly, the potential is not transferable to different molecular compositions away from the 3:7 M target. At the same time, on the 100 ps time-scale, the model performs poorly with respect to the reproducibility test. These issues may be related to one another: the quality of the energy prediction for various local environments differs substantially based on molecular composition. Since these local environments persist over long time-scales, simulations starting from different initial conditions are dominated by different systematic errors and yield different densities.

Fig. 2: Iterative training and GAP-MD density.
figure 2

Successive iterations (left to right) show increasingly stable 48-molecule GAP-MD trajectories for the reference molecular composition at different temperatures (A) and for a range of molecular compositions at 400 K (B). The curved arrows on top indicate the most important additions between these key generations of the model. C Shows the distribution of densities extracted from the last 80 ps of each GAP-MD trajectory across different models. This shows reproducibility (top) at identical thermodynamic conditions but different starting configurations, temperature dependence (middle) and dependence on molecular compositions (bottom).

After Gen5/DS, we extend the training set with configurations across all molecular compositions, ranging from pure EC to pure EMC and by Gen10/DS we find stable dynamics at all compositions. However, the density ordering with respect to composition remains wrong. In particular, the equilibrium density of pure EC is too low compared to other compositions. On the target composition, we find the model performs better on the reproducibility test, however, the densities at all temperatures are higher than final values obtained with Gen16/DTS.

Starting with Gen11/DS, we add isolated molecules and rigid-molecule volume scans to the training set (details in the sections below), which significantly improves the description of the inter-molecular interactions. The resulting model at Gen13/DS generates more physical dynamics, it solves the density ordering problem and brings the densities closer to final converged values. However, unphysical density fluctuations persist at higher temperatures (500 K), owing to a still insufficiently accurate description of the inter-molecular interactions. We refine our model hyper-parameters and replace the SOAP descriptors with Turbo SOAP, which uses a better radial basis63. With these changes, at Gen16/DTS we achieve converged NPT dynamics across all compositions and temperatures, demonstrating a robust GAP model for the EC:EMC liquid solvent.

Solvent dimer interactions

The iterative training employs PBE-D2 to construct the final training set. In this section we explore the accuracy of other XC functionals and dispersion correction schemes with respect to gold-standard PNO-LCCSD(T)-F12/AVTZ-F12 coupled-cluster theory by comparing the interaction energy of different solvent dimers as shown in Fig. 3. Coupled-cluster calculations show that (EC)2 is most binding (−0.45 eV), (EMC)2 is least binding (−0.23 eV), while the mixed (EC)1(EMC)1 lies in between (−0.31 eV). All XC functionals tested here follow the same trend as the coupled-cluster reference. Without dispersion correction, all XC functionals underestimate the interaction energies which is consistent with the consensus that GGA and Hybrid DFT functionals require such corrections to describe the weak dispersion interactions present in our system. After including dispersion, we observe a considerable improvement for (EC)2 and (EC)1(EMC)1. In the case of (EMC)2, the interaction energy is consistently overestimated due to the non-polar character of the EMC molecule.

Fig. 3: Ab initio dimer interaction energies.
figure 3

A Shows interaction energy curves for EC2, EMC2 and EC1EMC1 dimers at the PNO-LCCSD(T)-F12 level of theory. BD Show interaction energies for EC2, EMC2 and EC1EMC1 dimers, respectively, computed with four different XC functionals, with and without dispersion correction, as well as the reference PNO-LCCSD(T)-F12 result.

In choosing an XC functional to model our system we also consider computational cost. In the case of organic electrolytes, only a few AIMD simulations exist because of their intrinsic structural complexity64. Very often, such simulations are carried out with GGA functionals due to the large number of atoms required to mitigate finite-size effects. From our dimer scans, we observe that among the selection of functionals in Fig. 3, PBE-D3 (GGA) and PBE0-D3 (hybrid-GGA) exhibit similar accuracy for describing the interaction energies of the solvent components. We chose the less expensive PBE-D3 to validate the training strategy developed in this study.

Comparison to Ab initio MD

We test our best GAP models against ab initio MD (AIMD) results for the target composition (EC:EMC (3:7 M)) at 500 K and show that both potentials perform well. In particular, double Turbo SOAP (Gen16/DTS) gives the best agreement with the DFT density and Equation of States (EoS). All AIMD simulations were carried out at the PBE-D3 level. The original PBE-D2 training set was recalculated with PBE-D3 and both Gen16/DS and Gen16/DTS models were retrained on the new data. Notably, we find good model transferability between the different dispersion schemes (D2 and D3) and no additional iterative training was required.

In general, converging NPT-AIMD simulations with small boxes sizes remains a challenging due to large volume fluctuations, e.g., leading to requiring very dense k-space sampling and large plane wave cutoffs65. We circumvent this issue by running multiple NVT simulations where we monitor the pressure expectation value and interpolate it to find the equilibrium density. Each trajectory contains 48-molecules and spans 20 ps. Each AIMD simulation took 0.5 M core-hours, while the equivalent GAP-MD simulation took 250 core-hours, which corresponds to a speed-up of 2000 × .

Figure 4A–C compares a selection of radial distribution functions (RDFs) for the three methods (see SI for details). We compute partial RDFs, separated into intra- and inter- contributions, for each atomic element pair (ab):

$$\begin{array}{ll}{g}_{{{{\rm{intra}}}}}^{ab}(r)\,=\,\frac{V}{4\pi N{r}^{2}}\left\langle \mathop{\sum }\limits_{i}^{{N}_{a}}\mathop{\sum }\limits_{j}^{{N}_{b}}{h}_{ij}\delta (r-{r}_{ij})\right\rangle \\ {g}_{{{{\rm{inter}}}}}^{ab}(r)\,=\,\frac{V}{4\pi N{r}^{2}}\left\langle \mathop{\sum }\limits_{i}^{{N}_{a}}\mathop{\sum }\limits_{j}^{{N}_{b}}(1-{h}_{ij})\delta (r-{r}_{ij})\right\rangle \\ \end{array}$$
(1)

where

$${h}_{ij}=\left\{\begin{array}{l}1{\quad}i,j\,{{\mbox{same molecule}}}\,\\ 0{\quad}i,j\,{{\mbox{different molecules}}}\,\end{array}\right.$$
(2)

such that total RDF is simply:

$$g(r)=\mathop{\sum}\limits_{a,b}{g}_{{{{\rm{intra}}}}}^{ab}(r)+{g}_{{{{\rm{inter}}}}}^{ab}(r)$$
(3)
Fig. 4: Comparison of GAP-MD to AIMD for EC:EMC (3:7 M), at 500 K.
figure 4

A, B Show selected intra-molecular RDFs (C-O and H-C) at ρ = 0.95 g cm−3. B Zooms in on the subtle structure at longer-range. C Shows the C-O inter-molecular RDFs at all three densities. D Shows the VACFs up to 0.5 ps, at ρ = 0.95 g cm−3. The VDoS (obtained by FT from VACF) is shown in E, F - the higher frequency peak originating form the C-H bond is shown separately. G Shows the pressure distributions obtained in the NVT ensemble for the target molecular composition, at three different fixed densities and 500 K. H Shows the equilibrium pressure as function of density for the three models, as well as a Murnaghan-Tait inspired EoS fitted to the data. The zero-pressure intercept is marked with vertical dashed lines, whereas the equilibrium pressures obtained from NPT GAP-MD simulations are marked with crosses. I Shows the corresponding energy distributions obtained from MD of the liquid state (analogous to G). J Shows the specific heat at constant volume CV computed from the energy fluctuations.

Both GAP models agree well with the AIMD results. At short range, the intra-RDF curves are indistinguishable from each-other, indicating the intra-molecular geometries are well reproduced. We report the longer-range intra-RDF (4–8 Å interval) separately to demonstrate that agreement is excellent even for this subtle structure of the RDF. This portion of the signal characterizes the EMC molecule which is longer than EC, and is sensitive to the populations of the EMC conformers. The curves are well reproduced by both surrogate models even at distances larger than Rcut = 6Å, indicating the correct population of EMC isomers is achieved. Finally, the inter-RDFs are in good agreement as well, demonstrating that both models correctly capture the more subtle inter-molecular interactions.

We also compare the self-averaged velocity autocorrelation functions (VACFs) computed as:

$$z(t)={\left\langle \mathop{\sum}\limits_{i}{{{{\bf{v}}}}}_{i}(\tau )\cdot {{{{\bf{v}}}}}_{i}(\tau +t)\right\rangle }_{\tau }$$
(4)

and the vibrational density of states (VDoS):

$$\hat{z}(\omega )={\rm{Re}} \left\{{{{\rm{FT}}}}\left[z(t)-\bar{z}\right]\right\}$$
(5)

where z(t) was sampled every 4 fs, computed up to 1 ps and zero-padded up to 4 ps for smoothing (ωNyquist = 4170 cm−1, δω = 33 cm−1, δωinterp = 8 cm−1). Both models reproduce the AIMD vibrational dynamics at high- and low-frequency indicating that intra- and inter-molecular modes are both well described. Gen16/DTS agrees slightly better with AIMD for the high-frequency CH vibration (Fig. 4F).

Figure 4 also shows the pressure distribution obtained with NVT dynamics at three different densities. We compute the expectation value for the pressure and its standard error by averaging over 5 ps trajectory windows. We interpolate these values using the power law P(ρ) = aρn + b, inspired by the Murnaghan-Tait model EoS and mark the equilibrium values of the density at zero pressure. Gen16/DTS density at normal conditions is in good agreement with the AIMD density. Moreover, the pressure distributions at all densities overlap with the AIMD result (Fig. 4H), and so does the fitted EoS. The Gen16/DS model predicts a higher density at normal conditions, but agrees with AIMD within the error tolerance. The agreement with AIMD for the overall EoS is poorer, therefore, we conclude Gen16/DTS is a better model.

Finally, panels I and J of Fig. 4 show the potential energy distributions obtained from the MD simulation as well as the specific heat at constant volume \({C}_{{{{\rm{V}}}}}={{{\rm{Var}}}}\left[E\right]{({N}_{{{{\rm{mol}}}}}{k}_{{{{\rm{B}}}}}T)}^{-1}\)36. Both GAP models predict CV values in agreement with AIMD. Interestingly, energy distributions obtained through GAP-MD are systematically shifted by 2–4 meV atom−1 to lower energies with respect to AIMD. Note that errors in energy per atom are constant with systems size, while fluctuations in this quantity decrease as \(E/\sqrt{N}\). For large enough system sizes (here 640 atoms) this error appears large compared to the size of the fluctuations. Moreover, the shift is systematically towards lower energies in GAP because the dynamics biases the energy distribution towards lower energy values. Importantly, this constant offset in energy does not affect the physics.

Comparison to experiment

Figure 5 shows 1000-atom GAP-MD simulations obtained with Gen16/DTS fitted using the PBE-D3 XC functional. Trajectories are stable for one nanosecond at all four compositions. Density distributions are Gaussian and we observe no unphysical density fluctuations.

Fig. 5: Comparison of PBE-D3 to experiment.
figure 5

A, B Show 1000-atom, 1 nanosecond GAP-MD obtained with Gen16/DTS for all molecular composition at 300 K and 350 K, respectively. C Shows experimental density values at different compositions and temperatures from ref. 79 (pluses) and this work (triangles). Ref 79 gives concentrations by weight, converted here to molar compositions. The curves are third degree polynomials fitted to the data. Dotted vertical lines indicate the exact molecular compositions of the GAP-MD simulations (colors consistent with the other panels) where the interpolated data is recorded for D. D Compares the GAP-MD densities (circles) to the interpolated experimental densities from C (pluses and triangles) and the additional data for pure EC (squares) from ref. 80. We note that the experimental melting temperature of pure EC is 310 K and ref. 80 data starts in the solid phase. In our simulations pure EC is in the liquid phase at 300 K.

Given the convergence results in this work, the density predictions of the ML potential represent the results corresponding to DFT. Thus we can state that PBE-D3 densities for EMC-rich simulations are in good agreement with the experiment, whereas the densities of EC-rich compositions are under-estimated. We also note that in our simulations pure EC remains liquid at 300 K, whereas experimentally EC is found to be solid below the 310 K melting temperature. The slope of density with respect to temperature which determines the thermal expansion coefficient is well approximated by PBE-D3 for all compositions.

Intra-/Inter- decomposition

Intra-molecular forces and virials are an order of magnitude larger than inter-molecular contributions and they dominate the ML loss function. At the same time, the thermodynamic and kinetic observables of the EC:EMC liquid such as density and self-diffusivity, are primarily determined by the weaker inter-molecular contributions. It is therefore instructive to test the performance of our GAP models separately on the intra- and inter-components. In one approach, we start from liquid configurations where every atomic coordinate can be expressed as:

$${{{{\bf{r}}}}}_{i}={{{{\boldsymbol{\tau }}}}}_{j}+{{{{\boldsymbol{\Theta }}}}}_{j}{{{{\boldsymbol{\xi }}}}}_{i}$$
(6)

and where τj and Θj are the translation and rotation of molecule j, and ξi represent the internal coordinates of atom i in the reference frame of molecule j. We deconstruct the simulation box into individual molecules. For each molecule we keep the original geometry ξi and recompute all of its properties in isolation, obtaining the purely intra-molecular contributions for the whole configuration as:

$$\begin{array}{l}{E}^{{{{\rm{intra}}}}}({{{{\boldsymbol{\tau }}}}}_{j}+{{{{\boldsymbol{\Theta }}}}}_{j}{{{{\boldsymbol{\xi }}}}}_{i})\,=\,\mathop{\sum}\limits_{j}{E}_{j}^{{{{\rm{intra}}}}}({{{{\boldsymbol{\xi }}}}}_{i})\\ {F}^{{{{\rm{intra}}}}}({{{{\boldsymbol{\tau }}}}}_{j}+{{{{\boldsymbol{\Theta }}}}}_{j}{{{{\boldsymbol{\xi }}}}}_{i})\,=\,\mathop{\bigcup}\limits_{j}{{{{\boldsymbol{\Theta }}}}}_{j}^{T}{F}_{j}^{{{{\rm{intra}}}}}({{{{\boldsymbol{\xi }}}}}_{i})\end{array}$$
(7)

where intra-molecular energies are invariant with respect to τj and Θj, while intra- forces and virials are invariant with respect to τj and equivariant with Θj. We compute the inter-molecular contributions as the difference:

$$\begin{array}{ll}{E}^{\rm{inter}}({{{{\boldsymbol{\tau }}}}}_{j}+{{{{\boldsymbol{\Theta }}}}}_{j}{{{{\boldsymbol{\xi }}}}}_{i})= {E}^{{{{\rm{total}}}}}({{{{\boldsymbol{\tau }}}}}_{j}+{{{{\boldsymbol{\Theta }}}}}_{j}{{{{\boldsymbol{\xi }}}}}_{i})-\mathop{\sum}\limits_{j}{E}_{j}^{{{{\rm{intra}}}}}({{{{\boldsymbol{\xi }}}}}_{i})\end{array}$$
(8)

for both GAP and DFT values.

Another method we employ for separating the intra-/inter- errors is based on the molecular DoFs. We decompose atomic forces into translational, rotational and vibrational parts66, using:

$$\begin{array}{l}{{{{\bf{f}}}}}_{i}^{{{{\rm{trans}}}}}\,=\,\frac{{m}_{i}}{{M}_{j}}\mathop{\sum}\limits_{k\in j}{{{{\bf{f}}}}}_{k}\\ {{{{\bf{f}}}}}_{i}^{{{{\rm{rot}}}}}\,=\,{m}_{i}{{{{\bf{r}}}}}_{i}\times {\left({I}_{j}^{\alpha \beta }\right)}^{-1}\mathop{\sum}\limits_{k\in j}{{{{\bf{f}}}}}_{k}\times {{{{\bf{r}}}}}_{k}\\ {{{{\bf{f}}}}}_{i}^{{{{\rm{vib}}}}}\,=\,{{{{\bf{f}}}}}_{i}-{{{{\bf{f}}}}}_{i}^{{{{\rm{trans}}}}}-{{{{\bf{f}}}}}_{i}^{{{{\rm{rot}}}}}\end{array}$$
(9)

where

$$\begin{array}{l}{M}_{j}\,=\,\mathop{\sum}\limits_{k\in j}{m}_{k}\\ {I}_{j}^{\alpha \beta }\,=\,\mathop{\sum}\limits_{k\in j}{m}_{k}\left(| | {{{{\bf{r}}}}}_{k}| {| }^{2}{\delta }^{\alpha \beta }-{r}_{k}^{\alpha }{r}_{k}^{\beta }\right)\end{array}$$
(10)

are the mass and moment of inertia tensor of each molecule j, ri are the atomic positions with respect to the molecular center of mass, the dummy index k runs over all atoms in molecule j, and α, β label the Cartesian directions. This decomposition has the advantage that it can be done in place, without additional calculations on the isolated molecules, however it only works for forces and not for energies and virials.

We quantify the errors by using RMSEs and relative RMSEs (RRMSEs) as defined bellow:

$$\begin{array}{l}{{{\rm{RMSE}}}}=\sqrt{\frac{1}{N}\mathop{\sum}\limits_{i}{\left({F}_{i}^{{{{\rm{DFT}}}}}-{F}_{i}^{{{{\rm{GAP}}}}}\right)}^{2}}\\ {{{\rm{RRMSE}}}}=\sqrt{\frac{{\sum }_{i}{\left({F}_{i}^{{{{\rm{DFT}}}}}-{F}_{i}^{{{{\rm{GAP}}}}}\right)}^{2}}{{\sum }_{i}{\left({F}_{i}^{{{{\rm{DFT}}}}}-\overline{{F}^{{{{\rm{DFT}}}}}}\right)}^{2}}}\end{array}$$
(11)

In Fig. 6 we show GAP-DFT force correlation plots for total quantities as well as for intra-/inter- components separately. We used a test set that contains 40 liquid configurations collected from GAP-MD runs with unphysical density fluctuations. Both Gen10/DS and Gen11/DS models yield good total-force RMSEs which are roughly 3% of the maximum force values ( ~ 6 eV Å−1).

Fig. 6: Separation of intra-/inter- errors.
figure 6

A Illustrates both decomposition procedures: inter-molecular contributions are the difference between intra- and total- components. Total forces are also the sum of translational, rotational and vibrational components. B, C Show the DFT-GAP force correlation plots separated into intra-/inter- contributions for two different GAP models: Gen10/DS and Gen11/DS (additional data in the SI).

Gen10/DS, which does not include isolated molecules in the training set, performs poorly on both the intra-force and inter-force components. This is a consequence of the limited extrapolation power of the model which was trained only on liquid configurations and cannot accurately predict isolated molecule properties (intra- component). The large error on the inter- is simply propagated from the intra- error and is not indicative of the real accuracy of the model. This conclusion is supported by the DoF-based decomposition of the forces, where both models perform similarly well on all force components.

In Gen11/DS we include isolated molecules in the training set and obtain a meaningful decomposition of the intra-/inter- error. This model, yields significantly better RMSEs for both intra- and inter-forces, even though the total RMSE is about the same as in Gen10/DS. In this case, the two RMSE components are predictive of the model accuracy on two separate interaction scales. Although, the absolute RMSEs for the two contributions are similar, in relative terms (RRMSE) the inter-molecular error is significantly larger than the intra-: 1.26 vs 0.09 because inter-forces (~0.5 eVÅ−1) are an order of magnitude smaller than intra-forces (~5 eVÅ−1). Reducing the inter-RMSE (or translational and rotational in the DoF decomposition) is the main challenge in this work and, more generally, in all molecular condensed phase systems where we aim to obtain stable dynamics and properties that depend on inter-molecular forces.

Rigid-molecule volume scans

The rigid-molecule volume scan illustrated in Fig. 7B, was introduced as an additional test to quantify the purely inter-molecular interactions along a special path on the PES:

$${{{{\bf{r}}}}}_{i}(s)=s{{{{\boldsymbol{\tau }}}}}_{j}+{{{{\boldsymbol{\Theta }}}}}_{j}{{{{\boldsymbol{\xi }}}}}_{i}$$
(12)

where internal molecular DoFs and molecular rotations are frozen, and the volume is scaled by a factor s around the starting value by adjusting the center of mass of all molecules. As the starting point for the scans we use liquid configurations from GAP-MD trajectories.

Fig. 7: Rigid-molecule volume scan.
figure 7

A Shows the K-PCA of the Gen16 training data set labeled by type of molecular configuration: liquid configurations (pink), volume scans (yellow), and isolated molecules (light green). B Illustrates the volume scan technique. C Shows a selected energy-volume curve for EC:EMC (3:7 M) obtained with DFT and three different GAP generations. The volume scan reported here was not part of the training set (additional data in the SI).

The resulting energy curves are purely inter-molecular in nature:

$${E}^{{{{\rm{total}}}}}(s)={E}^{{{{\rm{inter}}}}}(s)+{E}^{{{{\rm{intra}}}}}{| }_{{{{\boldsymbol{\xi }}}} = {{{\rm{fixed}}}}}$$
(13)

and manifest a steep repulsive part as the volume is compressed beyond its equilibrium value (s < 1) and an attractive part in the opposite direction (s > 1). At long-range (s 1), the leading contributions to the inter-molecular energy in our system is electrostatic. Specifically, since all molecules are electrically neutral, the main contribution is the dipole-dipole interaction which decays as 1/r3. The remaining dispersion-induction interactions have a 1/r6 dependence and vanish at shorter range. In a two-body approximation, the energy along this curve can be understood as the sum over many different molecular dimer curves averaged over multiple orientations. In practice we find the performance on these tests correlates well with the model stability in GAP-MD simulations.

Figure 7C shows the energy curves along such a volume scan for three different GAP models: Gen10/DS trained only on liquid configurations, Gen11/DS trained on liquid and isolated molecule configurations and Gen12/DS where some volume scans are also included in the training set. These predictions are compared to DFT results. Gen10/DS reproduces the ab initio reference near the equilibrium volume (s ≈ 1) but overestimates both the repulsive and attractive regimes. This over-binding reduces the density fluctuations in Gen10/DS compared to earlier models, however, it also leads to density over-estimation (Fig. 2C) and reduced flow in MD simulations. We find that including isolated molecules in Gen11/DS improves the prediction of the energy curve towards the zero-density regime, where the inter-energy vanishes and the total energy becomes purely intra-.

Folding some of the configurations along the volume scans back into the training set (Gen12/DS) improves the predicted energy curves throughout, including on unseen test curves (Fig. 7C). Volume scan configurations interpolate between liquid environments and isolated molecules in configuration space as can be seen in the kernel principal component analysis (K-PCA) plot (Fig. 7A). These special configurations help improve the sampling of the PES along the shallow-curvature of the inter-molecular modes which can be poorly sampled during MD. On one hand, a better estimation of the energy in the s < 1 regime means the repulsive core of atoms is better captured which helps avoid unphysical close-approaches during MD. On the other hand, the improvement in energy prediction in the attractive regime means that Gen12/DS correctly captures the medium-range inter-molecular interactions in the bulk, despite being a finite-range model. With this model, the predicted NPT densities are in better agreement with converged values for all molecular compositions (Fig. 2C).

Molecular compositions

The inter-molecular interactions are difficult to capture because they are weaker, but also because the relevant geometries are more diverse than for intra-molecular interactions. With n the number of molecules and N the average number of atoms per molecule, there are 3n(N − 2) intra-molecular DoFs and 6n inter-molecular ones. In the EC:EMC (3:7 M) system this equates to 34n intra- and just 6n inter- DoFs. However, with just two molecule types in the liquid, most intra- environments are identical to each other, whereas inter- environments vary significantly owing to the continuous reorientation of the molecules in the liquid state and the additional randomness due to mixing. As a result, capturing the richness of inter-molecular environment is a significantly harder task for the ML models.

Models trained only on the target molecular composition \({{{{\mathcal{C}}}}}_{{{{\rm{t}}}}}=\) 3:7 M (up to Gen5/DS) are not transferable to other compositions \({{{\mathcal{C}}}}\,\ne \,{{{{\mathcal{C}}}}}_{{{{\rm{t}}}}}\). This manifests as poor stability of GAP-MD simulations at \({{{\mathcal{C}}}}\,\ne \,{{{{\mathcal{C}}}}}_{{{{\rm{t}}}}}\) (Fig. 2B), which are plagued by the formation of bubbles, and it is also apparent in volume scan tests. The poor transferability can be understood by examining the K-PCA plot of a single atom type (SI for details) shown in Fig. 8A. Local atomic environments originating from different liquid compositions \({{{\mathcal{C}}}}\) occupy distinct regions in configuration space, meaning models trained only on \({{{{\mathcal{C}}}}}_{{{{\rm{t}}}}}\) operate in an extrapolative regime when tested on other compositions.

Fig. 8: Molecular compositions.
figure 8

A Shows the K-PCA of C1/EC carbon atoms (atom labels in SI) in the Gen16 training set, colored based on global molecular composition \({{{\mathcal{C}}}}\). B Shows \(f(c| {{{{\mathcal{C}}}}}_{{{{\rm{t}}}}})\) -- a typical distribution of local environments at the target composition \({{{{\mathcal{C}}}}}_{{{{\rm{t}}}}}\), computed from a GAP-MD trajectory. C, D Show a series of volume scans at two different compositions, obtained with DFT and three different GAP models. Gen5+IM/DS and Gen6+IM/DS were fitted on training sets Gen5 and Gen6 augmented with OPLS isolated molecules. Insets illustrate the corresponding molecular compositions: EC (orange) and EMC (cyan).

Figure 8 shows a collection of volume scans for two molecular compositions: 3:7 M and 7:3 M, obtained with different GAP models. Gen5/DS which was trained only on the target composition, correctly reproduces the energy of the 3:7 M composition at the equilibrium density but grossly under/overestimates the energy at 7:3 M, by hundreds of meV atom−1. Adding isolated molecules to the training set improves the 7:3 M prediction and the new model, Gen5+IM/DS, correctly reproduces the total energy in the limit s → , where only intra-molecular energy terms survive.

The error in the inter-molecular energy of the 7:3 M liquid state (s ≈ 1) is reduced by about 50 meV atom−1, however adding isolated molecules is not sufficient to fix this problem. In Gen6+IM/DS we further extend the training set by adding configurations across the full range of molecular compositions. As shown in Fig. 8D, this new model performs significantly better and volume scan energy predictions at s ≈ 1 are within a few meV atom−1 of the DFT values. Importantly, GAP-MD trajectories obtained with models above Gen6/DS are stable at all compositions (Fig. 2).

Augmenting the training sets with the full range of compositions, significantly improves the representation of all local molecular environments c and these models perform better on \({{{{\mathcal{C}}}}}_{{{{\rm{t}}}}}\) as well. At a fixed \({{{\mathcal{C}}}}\), the typical GAP-MD trajectory explores a wide range of environments c defined as the number of molecules of each type within a cutoff radius R of a center molecule:

$$c=\{{n}_{{{{\rm{EC}}}}}^{R},{n}_{{{{\rm{EMC}}}}}^{R}\}$$
(14)

The probability distribution of such environments depends on the global composition \(f(c| {{{\mathcal{C}}}})\) and some environments are more likely than others. Figure 8B shows an example distribution from GAP-MD at \({{{{\mathcal{C}}}}}_{{{{\rm{t}}}}}\). While the most prevalent environment is \(c\equiv {{{{\mathcal{C}}}}}_{{{{\rm{t}}}}}\), some atypical environments like c = {3, 0} still occur with non-zero probability. In models up to Gen5/DS (trained only on \({{{{\mathcal{C}}}}}_{{{{\rm{t}}}}}\)) such low-probability environments are poorly represented in the training set and therefore yield large errors. Extending the training sets to the full range of compositions \({{{\mathcal{C}}}}\) improves the coverage of all local molecular environments c and reduces the overall errors, including at the target composition \({{{{\mathcal{C}}}}}_{{{{\rm{t}}}}}\). This is reflected in more stable and reproducible GAP-MD trajectories at \({{{{\mathcal{C}}}}}_{{{{\rm{t}}}}}\).

Model optimization

All GAP models above Gen11/DS were trained on isolated molecules, volume scans and multiple compositions. These special configurations significantly improve the versatility of the training sets and the description of inter-molecular interactions. The density collapse problem is largely solved, however models up to Gen15/DS continue to exhibit reversible, but unphysical density fluctuations in GAP-MD, especially at higher temperatures (Figs. 9A, 2A). Iterative training does not improve these issues, suggesting further model optimization is required.

Fig. 9: GAP model optimization.
figure 9

A Shows a 12-molecule Gen15/DS-MD simulation at 500 K and target composition. B Zooms in on an unphysical density fluctuation. Both intra- and inter- energies were recomputed with DFT and Gen15/DTS - a double turbo soap model. C Shows the DFT-GAP inter-molecular energy and force correlation plots for both Gen15/DS and Gen15/DTS on a 40-config test set (includes 15 configs from B). D Shows the inter-molecular center of mass force experienced by one molecule in a volume scan at the target composition.

On reevaluating selected points along such a density fluctuations with DFT (Fig. 9B), we find that Gen15/DS reproduces the intra-energy with good accuracy, whereas the inter-energy is systematically under-estimated by around 4–9 meV atom−1. Surprisingly, the 40% drop in density only changes the DFT inter-energy by about 10 meV atom−1, whereas the Gen15/DS model under-estimates this energy barrier by about 5 meV atom−1. This suggests tight inter-energy accuracies are required (order 1 meV atom−1) to keep NPT simulations stable. We find that by tuning the model hyper-parameters we can improve these errors. Specifically, increasing Nsparse improves the flexibility of the model which is required to describe complex inter-molecular environments. A larger σatom makes the atomic density smoother and radial functions broader which decreases the resolution but improves model regularity (SI for details).

We find that using Turbo SOAP63 instead of the original SOAP61, significantly improves the inter- errors on both forces and energies, as well as the smoothness of the model. Figure 9C compares Gen15/DS to Gen15/DTS (double Turbo SOAP) on a test set comprising 40 liquid configurations selected form GAP-MD simulations. Gen15/DTS yields inter-energy RMSEs of around 1 meV atom−1 and 40% smaller inter-force RMSEs. Moreover, the Turbo SOAP model produces smoother volume scan curves as illustrated by the molecular center of mass forces shown in Fig. 9D. This improvement is largely accounted for by the quality of the poly3gauss radial basis, most other parameters being unchanged (SI for more details).

These optimizations for both SOAP and Turbo SOAP, result in improved performance at Gen16. GAP-MD dynamics obtained with both Gen16/DS and Gen16/DTS are stable at all compositions at temperatures up to 500 K (Fig. 2 and SI).

Discussion

In this work we demonstrate a robust ML training protocol for the EC:EMC binary solvent and identify the key challenges in developing converged ML models. We validate our best models against extensive ab initio MD results and compare the density predictions to experimental values. Below we summarize our key findings.

ML potentials fitted on fixed data sets are not MD-stable, especially in the NPT ensemble where density fluctuations are sensitive to an accurate description of the inter-molecular interactions. Iterative training is essential for obtaining a robust model and unphysical density fluctuations can be used as a proxy for extending the training sets.

Intra- and inter-molecular interactions have different scales and dimensionalities which complicates ML modelling. The key difficulties in training our models stem from the large imbalance between intra- and inter-molecular interactions and are likely general in all molecular condensed phase systems. The intra-component is easier to fit from total energies and forces because it dominates the loss function. However, the significantly weaker inter-component determines the thermodynamic properties of the liquid and therefore a good model must describe this component of the interactions with great accuracy. We develop a range of techniques such as rigid-molecule volume scans and intra-/inter- error separation. We employ these techniques to evaluate the potential and to improve its performance on the relevant inter-molecular scale.

Multi-component molecular liquids present additional challenges to ML modelling because they feature an increased diversity of local molecular environments. We demonstrate that a potential intended to model a target molecular composition must still be trained on all compositions. Extending the data in this way, ensures all local molecular environments are uniformly represented in the training set.

We further show that when using SOAP descriptors, the hyper-parameters need to be selected carefully to capture both intra- and inter-molecular interactions. We optimize the radial basis and the sparse set to achieve the required trade-off which performs well on both scales.

We expect our learning protocol to generalize to more complex systems relevant for batteries, such as the electrolyte (solvent + salt) and the solvent-electrode interphase (SEI). In both cases, long-range electrostatics51,52,53,54,55,56,57 will likely be required to correctly capture ion transport and interface chemistry. More generally, the lessons we learned here will help guide future ML modelling in these systems.

Methods

GAP models

We fit and test our potentials using the standard implementation of GAP2,60 from the QUIP library67. The SOAP descriptors are computed from the atomic density of each atom i of chemical species a:

$$\begin{array}{ll}{\rho}^{ia}(r)\,=\, \mathop{\sum}\limits_{j}{\delta }_{ab}\exp \left(\frac{-| r-{r}_{ij}{| }^{2}}{2{\sigma }_{{{{\rm{atom}}}}}^{2}}\right){f}_{{{{\rm{cut}}}}}({r}_{ij}\, < \,{R}_{{{{\rm{cut}}}}})\\\qquad\quad=\, \mathop{\sum }\limits_{n}^{{N}_{\max }}\mathop{\sum }\limits_{lm}^{{L}_{\max }}{c}_{nlm}^{ia}{R}_{n}(r){Y}_{lm}(\hat{{{{\bf{r}}}}})\end{array}$$
(15)

where b is the chemical species of atom j, σatom is the width of the Gaussian smoothing and Rcut is the cut-off of the potential. The density is expanded in a radial-angular basis (controlled by \({N}_{\max }\) and \({L}_{\max }\)) and the expansion coefficients \({c}_{nlm}^{ia}\) are used to construct the rotationally-invariant power spectrum:

$$\xi ={p}_{n{n}^{{\prime} }l}^{ia{a}^{{\prime} }}=\frac{1}{\sqrt{2l+1}}\mathop{\sum}\limits_{m}{c}_{nlm}^{ia}{c}_{{n}^{{\prime} }lm}^{i{a}^{{\prime} }}$$
(16)

These feature vectors enter a polynomial kernel:

$$k(\xi ,{\xi }^{{\prime} })={(\xi \cdot {\xi }^{{\prime} })}^{\zeta }$$
(17)

with ζ = 4 in this work. We use a double SOAP (DS) model with separate kernels:

$$\begin{array}{ll}E={{{\Delta }}}_{1}\mathop{\sum }\limits_{m=1}^{{N}_{{{{\rm{sparse}}}}}^{(1)}}{c}_{m}^{(1)}k({\xi }^{(1)},{\xi }_{m}^{(1)})\\ \quad \quad +\,{{{\Delta }}}_{2}\mathop{\sum }\limits_{m=1}^{{N}_{{{{\rm{sparse}}}}}^{(2)}}{c}_{m}^{(2)}k({\xi }^{(2)},{\xi }_{m}^{(2)})\end{array}$$
(18)

where the first SOAP is higher resolution, shorter range and the second SOAP is lower resolution, longer range: σatom,2 = 2 × σatom,1 and Rcut,2 = 2 × Rcut,1. Scaling parameters are the same for both SOAPs (Δ1 = Δ2 = 0.1), cut-offs are Rcut,1 = 3Å and Rcut,2 = 6Å, while the rest of the parameters (reported in Table 1) were adjusted at different generations of the potential during iterative training.

Table 1 SOAP/GAP hyper-parameters.

The loss function is minimized on total energy (E), forces (F) and stress virials (V):

$$\begin{array}{ll}{{{\mathscr{L}}}}=\mathop{\sum }\limits_{n}^{{N}_{{{{\rm{conf}}}}}}\frac{{({E}^{{{{\rm{DFT}}}}}-{E}^{{{{\rm{GAP}}}}})}^{2}}{{N}_{{{{\rm{atom}}}},n}{\sigma }_{{{{\rm{E}}}}}^{2}}\\ \qquad\,+\,\mathop{\sum }\limits_{n}^{{N}_{{{{\rm{conf}}}}}}\mathop{\sum }\limits_{i}^{{N}_{{{{\rm{atom}}}},n}}\frac{\parallel {F}_{i}^{{{{\rm{DFT}}}}}-{F}_{i}^{{{{\rm{GAP}}}}}{\parallel }^{2}}{{\sigma }_{{{{\rm{F}}}}}^{2}}\\ \qquad\,+\,\mathop{\sum }\limits_{n}^{{N}_{{{{\rm{conf}}}}}}\frac{\parallel {V}^{{{{\rm{DFT}}}}}-{V}^{{{{\rm{GAP}}}}}{\parallel }^{2}}{{N}_{{{{\rm{atom}}}},n}{\sigma }_{{{{\rm{V}}}}}^{2}}\\ \qquad\,+\,\mathop{\sum }\limits_{m=1}^{{N}_{{{{\rm{sparse}}}}}}{c}_{m}k({\xi }_{m},{\xi }_{m}^{{\prime} }){c}_{m}^{{\prime} }\end{array}$$
(19)

with regularization parameters: σE = 1 meV atom−1 (up to Gen10 models) and σE = 2 meV atom−1 (from Gen11), σF = 0.1 eV Å−1, σV = 0.01 eV. At Gen15 (see SI) and Gen16 we also employ Turbo SOAP descriptors63 which are computed from the atomic density:

$$\begin{array}{l}{\rho}^{ia}(r)=\mathop{\sum}\limits_{j}{\delta }_{ab}\exp \left(\frac{-| r-{r}_{ij}{| }^{2}}{2{\sigma }_{{{{\rm{atom}}}},\parallel }^{2}}\right)\\ \qquad \,\,\,\,\,\,\,\,\,\,\times \,\exp \left(\frac{-{r}_{\perp ,ij}^{2}}{2{\sigma }_{{{{\rm{atom}}}},\perp }^{2}}\right){f}_{{{{\rm{cut}}}}}({r}_{ij}\, < \,{R}_{{{{\rm{cut}}}}})\end{array}$$
(20)

where the radial () and angular () parts are separated and the σatom parameters have linear radial dependence:

$$\begin{array}{ll}&{\sigma }_{{{{\rm{atom}}}},\parallel }={\sigma }_{{{{\rm{atom}}}},\parallel ,0}+{\alpha }_{\parallel }r\\ &{\sigma }_{{{{\rm{atom}}}},\perp }={\sigma }_{{{{\rm{atom}}}},\perp ,0}+{\alpha }_{\perp }r\end{array}$$
(21)

In this work we set σatom, = σatom,. We explored different single Turbo SOAP models with varying σatom (SI), however, in the Results section here, we only report double Turbo SOAP (DTS) with α = α = 0 which is our best model. The Turbo SOAP models use a different radial basis which yields smoother potential energy surfaces (check ref. 63 for details). All descriptors are implemented in the QUIP library, see SI for detail input strings.

Iterative Training

We start from Gen0 with a small subset (50 liquid configurations) chosen from the OPLS data-set. At each subsequent iteration, we train a new GAP model on the current training set and use it to run NPT dynamics at various temperatures, pressures, compositions, cell sizes as detailed in Table 2. We typically run multiple GAP-MD simulations spanning 20 to 100 ps, staring from different coordinate seeds. From each GAP-MD trajectory, we choose a single new configuration (typically around unphysical density fluctuations). This results in a few new configurations per generation (depending on the number of MD trajectories, see Table 2), we compute their ab initio properties, and augment our training set for the next generation of the potential.

Table 2 Training set construction.

The sampling for the iterative procedure was carried out exclusively using DS models (see Table 1), while DTS models were employed for additional testing and production simulations.

In addition to these iterations, we add a range of special configurations to our training sets: at Gen11 we add isolated molecules (obtained by deconstructing the 50 OPLS configs from Gen0) and at Gen12 we add volume scans. Up to Gen5 we train only on the target composition, while from Gen6 we expand the data sets to all molecular compositions ranging from pure EC to pure EMC.

The system sizes we use for the iterative procedure are smaller than those employed in the testing/production runs to limit the cost of the DFT calculations. Up to Gen13, we use 12 molecule systems which means different molecular compositions have different system sizes, with the EC molecule being smaller than EMC (10 atoms vs 15). From Gen14 onward, we employ systems with roughly the same number of atoms for each molecular composition, such that the cost of DFT and GAP-MD is more homogeneous across all configurations. This also ensures the finite-size effects are similar for all compositions.

The diversity of molecular environments increases substantially with iterative training. We adjust the number of sparse points (Table 1) to improve the model flexibility and capture the increasing complexity of the data.

Electronic structure calculations

The molecular dimers used for bench-marking (Fig. 3) were geometry-optimized with different DFT XC functionals (BLYP, PBE, B3LYP and PBE0) and with D3BJ68 dispersion corrections, using the def2-TZVP basis sets. We check the stability of the local minimum using vibrational frequency analysis. The distance between molecular centre of masses is used to investigate the strength of intermolecular interactions. All dimer DFT calculations (with and without dispersion corrections69) were performed with Orca 5.070. The PNO-LCCSD(T)-F12/AVTZ-F1271 approach as implemented in MOLPRO 2021.272 was used with the “tight" settings to obtain the reference interaction energies at the DFT optimized geometries.

For iterative training (including volume scans and isolated molecules), we obtain the total energies, forces, and virials from spin-unpolarized periodic electronic structure calculations at the Γ point. We used the plane wave implementation of CASTEP73, with the PBE XC functional and D2 dispersion correction (keyword: G06)74, an energy cutoff of 800 eV and a Monkhorst-Pack 1 x 1 x 1 k-point grid. Standard CASTEP ultrasoft pseudopotentials were used to model the core electrons.

The final training set (Gen16) was recomputed with the mixed Gaussian/plane wave method implemented in CP2K 8.275. We used the PBE-D3 functional with a TZV2P basis set and a density cutoff of 800 Ry. Core electrons were modelled with norm-conserving GTH pseudopotentials.

MD Simulations

OPLS-MD and GAP-MD simulations were performed in LAMMPS66. We obtain the OPLS62 parameters with CM1A charges76 using the online generator LibParGen77. The GAP potentials are loaded in LAMMPS using the pair_style quip command. The equations of motion were integrated using the velocity-Verlet algorithm with a time step of 1 fs. For iterative training we used the Nose-Hoover thermostat with chain length 3 (LAMMPS default) and temperature damping set to 100 fs, and the Nose-Hoover barostat (isotropic pressure) with chain length 3 (LAMMPS default) and pressure damping set to 1 ps. The settings of the simulation (number of molecules, compositions, temperatures, densities) are reported in Table 2.

At selected generations of the potential (Fig. 2) we perform 100 ps NPT GAP-MD tests with larger system sizes (48 molecules), at different compositions and temperatures. The different compositions were modelled using 48 EC molecules (pure EC), 32 EC + 16 EMC (7:3 M), 16 EC + 32 EMC (3:7 M) and 48 EMC (pure EMC).

We also perform 1000 atoms NPT simulations (Fig. 5) with the final versions of the potential (Gen16/DTS) retrained on the PBE-D3 functional. For these simulations, we model four different compositions (Fig. 5) using: 100 EC molecules (pure EC), 57 EC + 29 EMC (7:3 M), 25 EC + 50 EMC (3:7 M) and 67 EMC molecules (pure EMC). All NPT simulations used the barostat, thermostat, integration settings.

AIMD simulations were performed with QUICKSTEP as implemented in the CP2K code75. The wave function extrapolation (between MD steps) was obtained with the always stable predictor-corrector method (ASPC) (extrapolation order 3). We used the NVT ensemble and the CSVR thermostat78 with 100 fs damping to keep the temperature constant at 500 K. We obtained three independent 48-molecule (12 EC + 36 EMC) trajectories at liquid densities ρ = (0.85, 0.95 and 1.08 g cm−3) with cell dimensions L = (19.358, 20.235 and 21.253 Å). The cutoff radius of the dispersion interactions was set to 10 Å. The time step was set to 0.5 fs, and for each box, a 20 ps MD trajectory was generated after a 5 ps equilibration run. For a direct and fair comparison to GAP-MD we ran the same NVT simulations using the LAMMPS implementation of the CSVR thermostat with 100 fs damping, and an integration time step of 0.5 fs. The GAP potentials (Gen16/DS and Gen16/DTS) were retrained with the same electronic structure settings as used in the AIMD.

Experimental details

Mixtures of ethylene carbonate and ethyl methyl carbonate were prepared by weighing EC and EMC in the desired ratio; gentle heating to 60 °C (333 K) was required to fully dissolve EC. The prepared solvent was degassed using three freeze-pump-thaw degas cycles and dried over 4 Å activated molecular sieves. The water content of the mixtures was found to be less than 50 ppm by Karl-Fischer titration. The exact molar ratio was determined by integrating the EC and EMC signals in the 1H NMR spectra. NMR spectra were recorded on a Bruker 300 MHz Ascend spectrometer. Measured ratios were: 69.3 EC to 30.7 EMC by NMR (7:3 M) and 29.9 EC to 70.1 EMC by NMR (3:7 M).

Density measurements were performed at 22 C (295 K) using an Anton Paar DMA 48 instrument, which was calibrated with water, ethanol and propylene carbonate before measuring the mixtures. 5 ml of each liquid was used for the measurements.