1 Introduction

The extraction of metal ions from aqueous solution into organic phases by organic ligands is of great importance in the separation and purification of metals [1]. Quantitative studies of the extraction process require a knowledge of the water–solvent partition of the organic ligand, and so any method of estimating or predicting partition coefficients of organic ligands would be a considerable help in the design of new extraction systems. We have set out a systematic method for the determination of properties or ‘descriptors’ of molecules [2,3,4,5], mostly using experimental values of water to solvent partition coefficients. These descriptors, known as Abraham or Absolv descriptors, can then be used to estimate partition coefficients for other water–solvent systems, as well as numerous physicochemical, environmental and biological properties. We have already used this method to determine descriptors for organophosphorus extractants [6]. Another well-known class of extracting agents is based on pentane-2,4-dione (acetylacetone) and its derivatives. Although we have preliminary descriptors for acetylacetone and some derivatives [7, 8], these were based on limited data, and so we have re-determined descriptors for acetylacetone itself, and have obtained new descriptors for 20 derivatives that have been used as extraction agents.

In solution, acetylacetone and its derivatives exist as a mixture of keto and enol forms, the proportion of which depends on the solvent. If the keto–enol equilibrium constant is known in a number of solvents for which the corresponding water–solvent partition coefficients are known, then it is possible to determine descriptors separately for the keto and enol forms [9]. For many of the compounds that we consider in this work, the keto–enol equilibrium constants are not known, and so we use the experimental water–solvent partition coefficients to obtain descriptors for the keto–enol mixture. These obtained descriptors can then be used to predict further experimental partition coefficients into a wide range of solvents.

2 Methodology

We start with our well-known linear free energy relationships, LFERs, Eqs. 1 and 2 [2,3,4,5], for the partition of neutral molecules (non-electrolytes) from water to another solvent or solvent system,

$$ \log_{10} P = c + e\varvec{E} + s\varvec{S} + a\varvec{A} + b\varvec{B} + v\varvec{V} $$
(1)
$$ \log_{10} K = c + e\varvec{E} + s\varvec{S} + a\varvec{A} + b\varvec{B} + l\varvec{L} $$
(2)

In Eq. 1, the dependent variable is log10 P, where P is the water to solvent partition coefficient for a series of non-electrolytes in a given water to solvent system. In Eq. 2, the dependent variable is log10 K, where K is the gas phase to solvent system partition coefficient. The independent variables are descriptors as described previously [2,3,4,5]. E is the non-electrolyte (or solute) excess molar refractivity in units of (cm3·mol−1)/10, S is the solute dipolarity/polarizability, A and B are the overall or summation solute hydrogen bond acidity and basicity, V is the solute McGowan characteristic volume in units of (cm3·mol−1)/100, and L is log10 K16, where K16 is the gas to hexadecane partition coefficient at 298 K. The use of Eqs. 1 and 2 has been reviewed [2,3,4,5]; the review of Clarke and Mallon [5] is particularly exhaustive.
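As an illustration of how Eqs. 1 and 2 are applied once the coefficients for a solvent system and the descriptors for a solute are both in hand, the following minimal Python sketch evaluates the two equations. All numerical values in it are invented placeholders chosen for easy arithmetic; they are not entries from Table 1 or Table 2, and the names `Descriptors`, `log10_p` and `log10_k` are ours.

```python
from typing import NamedTuple


class Descriptors(NamedTuple):
    """Solute descriptors as defined for Eqs. 1 and 2."""
    E: float
    S: float
    A: float
    B: float
    V: float
    L: float


def log10_p(coeffs, d):
    """Eq. 1: water-to-solvent partition; coeffs = (c, e, s, a, b, v)."""
    c, e, s, a, b, v = coeffs
    return c + e * d.E + s * d.S + a * d.A + b * d.B + v * d.V


def log10_k(coeffs, d):
    """Eq. 2: gas-to-solvent partition; coeffs = (c, e, s, a, b, l)."""
    c, e, s, a, b, l_coeff = coeffs
    return c + e * d.E + s * d.S + a * d.A + b * d.B + l_coeff * d.L


# Placeholder numbers only: hypothetical solute and hypothetical solvent coefficients.
example_solute = Descriptors(E=0.40, S=0.80, A=0.00, B=0.60, V=0.85, L=3.00)
example_eq1 = (0.10, 0.50, -1.00, 0.00, -3.50, 4.00)
example_eq2 = (-0.20, -0.20, 0.60, 3.50, 1.40, 0.85)

print(round(log10_p(example_eq1, example_solute), 3))
print(round(log10_k(example_eq2, example_solute), 3))
```

Both equations are linear in the descriptors, which is what allows the descriptors to be recovered from a set of measured partition coefficients.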

In order to obtain descriptors we first need LFERs, based on Eq. 1, for partition from water to various solvent systems. The coefficients in Eq. 1 for partition from water into wet (water-saturated) solvents are in Table 1 [6]. Then, if we have log10 P values for a given solute in systems for which we have equation coefficients, we can determine values of the descriptors in Eq. 1 by solution of a set of simultaneous equations. We usually have more equations than unknowns (i.e. the descriptors). In this case, Microsoft ‘Solver’ is a very convenient way of solving the set of equations by trial-and-error. The solution is the set of descriptors that gives the best fit of the dependent variable. The solution is greatly helped if we have prior knowledge of some of the descriptors. The E-descriptor can be obtained from a refractive index at 293 K (for liquid solutes), or can be calculated from an estimated refractive index [10]. Both the available software programs [7, 8] for descriptors give calculated values of E. The V-descriptor can easily be calculated from the molecular formula of the solute [2, 11]. Thus we have three descriptors in Eq. 1 to determine (S, A and B). However, we can convert all values of log10 P into corresponding gas–solvent partition coefficients, as log10 K, through Eq. 3, where Kw is the gas to water partition coefficient, all partition coefficients being at 298 K. Note that Kw has no units. We take log10 Kw as another unknown ‘descriptor’ and use both Eqs. 1 and 2 in our set of simultaneous equations. Coefficients in Eq. 2 are given in Table 1. Then even if we have a limited number of log10 P values, say only five, we have five equations in log10 P and five equations in log10 K. We also have two equations for log10 Kw, see Table 1, making a total of 12 equations from which to derive five unknowns (S, A, B, L and log10 Kw). This is the procedure we use for the determination of descriptors.

$$ \log_{10} P = \log_{10} K - \log_{10} K_{\text{w}} $$
(3)

Table 1 Coefficients in Eqs. 1 and 2 for partition of solutes from water and the gas phase to wet organic solvents at 298 K
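Because Eqs. 1, 2 and 3 are all linear in the five unknowns (S, A, B, L and log10 Kw), the set of simultaneous equations can also be written as an over-determined linear system. The Python sketch below is one possible way to assemble and solve such a system; it substitutes an unweighted closed-form least-squares solution (normal equations) for the trial-and-error fitting done with Solver, so it is not the procedure used in this work. It assumes that the two log10 Kw equations take the Eq. 1 form (with V) and the Eq. 2 form (with L); the function names and the input layout are hypothetical.

```python
def least_squares(rows, rhs):
    """Solve the over-determined system rows . x ~= rhs via the normal equations."""
    n = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(rows, rhs))] for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def build_system(E, V, solvents, water):
    """Assemble equations for the unknowns [S, A, B, L, log10 Kw].

    solvents: list of (eq1_coeffs, eq2_coeffs, observed_log10_P);
    water: (eq1_form_coeffs, eq2_form_coeffs) for the gas-to-water partition.
    E and V are treated as known, as described in the text.
    """
    rows, rhs = [], []
    for (c, e, s, a, b, v), (c2, e2, s2, a2, b2, l2), log_p in solvents:
        # Eq. 1: log10 P = c + eE + sS + aA + bB + vV
        rows.append([s, a, b, 0.0, 0.0])
        rhs.append(log_p - c - e * E - v * V)
        # Eq. 2 combined with Eq. 3: log10 P + log10 Kw = c + eE + sS + aA + bB + lL
        rows.append([s2, a2, b2, l2, -1.0])
        rhs.append(log_p - c2 - e2 * E)
    (c, e, s, a, b, v), (c2, e2, s2, a2, b2, l2) = water
    # log10 Kw written in the Eq. 1 form (with V) and in the Eq. 2 form (with L).
    rows.append([s, a, b, 0.0, -1.0])
    rhs.append(-(c + e * E + v * V))
    rows.append([s2, a2, b2, l2, -1.0])
    rhs.append(-(c2 + e2 * E))
    return rows, rhs
```

Calling `build_system` with five solvents returns the 12 equations described above, and `least_squares(rows, rhs)` then returns the best-fit values in the order S, A, B, L, log10 Kw. Real use would require the Table 1 coefficients, and any weighting or scaling of individual equations would have to be added.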

3 Results

The values of log10 P that we need to initiate the calculation of descriptors are known for acetylacetone and for a number of substituted compounds. Fortunately, Leo [12] has collected these log10 P values, many of which are scattered over the literature, and lists them in his software program ‘BioLoom’. The log10 P values that we use are nearly all from BioLoom. We start with acetylacetone itself. A value of E = 0.412 from an experimental value of the refractive index [13] is available, and V = 0.8445 [2, 11]. Partition coefficients into no fewer than 26 solvents are available [12]. Values of log10 P into hexane, decane and butyl acetate were well out of line and were not used, leaving 23 data points. We also have 23 corresponding values of log10 K, two equations in log10 Kw and one equation for the NIST Kovats GC retention index, GCRI, leading to a total set of 49 equations. A value of GCRI = 790 for acetylacetone is listed in ChemSpider [13]. We have used NIST Kovats GC values to obtain Eq. 4:

$$ \text{GCRI} = 69.6 + 12.1\varvec{E} + 76.3\varvec{S} + 200.0\varvec{L} $$
(4)
$$ N = 286,\; SD = 46.4,\; R^{2} = 0.992,\; F = 12079,\; \text{PRESS} = 634316,\; Q^{2} = 0.992,\; PSD = 47.7 $$

Here and elsewhere, N is the number of data points (compounds), SD is the regression standard deviation, R is the correlation coefficient, F is the F-statistic, PRESS and Q2 are the leave-one-out statistics and PSD is the predictive standard deviation [14]. In order not to bias the results, we use GCRI/100 in the set of simultaneous equations.
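A short Python sketch of Eq. 4 may help to show how the GCRI equation enters the calculation. It evaluates the retention index from E, S and L, gives the GCRI/100 form used in the simultaneous equations, and rearranges Eq. 4 for L. The function names are ours, and the value of S in the usage line is a hypothetical placeholder rather than a fitted descriptor.

```python
def gcri(E, S, L):
    """Eq. 4: Kovats retention index from the descriptors E, S and L."""
    return 69.6 + 12.1 * E + 76.3 * S + 200.0 * L


def gcri_scaled(E, S, L):
    """Eq. 4 divided by 100, the form used alongside the log10 P and log10 K equations."""
    return gcri(E, S, L) / 100.0


def l_from_gcri(gcri_value, E, S):
    """Eq. 4 rearranged for L, given a measured retention index and values of E and S."""
    return (gcri_value - 69.6 - 12.1 * E - 76.3 * S) / 200.0


# E = 0.412 and GCRI = 790 are the acetylacetone values quoted in the text;
# S = 0.8 is an assumed placeholder used only to exercise the function.
print(round(l_from_gcri(790.0, E=0.412, S=0.8), 3))
```

Since L carries a coefficient of 200, the regression SD of 46.4 index units corresponds to an uncertainty of roughly 0.2 in an L-value obtained from GCRI alone, so the rearranged form is best regarded as a rough guide.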

The 49 simultaneous equations were solved to yield the descriptors given in Table 2 with SD = 0.149 log10 units. The 23 observed and calculated values of log10 P are in Table 3. For this set of data the Absolute Error (AE) = 0.018 and SD = 0.146 log10 units. The descriptors for acetylacetone are not unusual, except that A = 0. It might be expected that acetylacetone would have some hydrogen bond acidity through the ‘active’ CH2 group, but we have enough data, with 49 equations, to be reasonably certain that A = 0.

Table 2 Descriptors for pentane-2,4-dione and some of its derivatives
Table 3 Calculated and observed values of log10 P for water–solvent partition of 2,4-pentanedione (acac)

For hexane-2,4-dione, log10 P values into six solvents are known, and yield six corresponding log10 K values. There are also two further equations in log10 Kw and an equation in GCRI (890), giving a total of 15 equations. From a known refractive index, E = 0.381 and V = 0.9854. The 15 equations were solved to yield the descriptors in Table 2 with an SD = 0.122 log10 units.
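The V-values quoted for these compounds can be reproduced from the molecular formula alone. The Python sketch below assumes the usual McGowan scheme, in which atomic increments are summed and 6.56 cm3·mol−1 is subtracted for every bond, with the number of bonds taken as (number of atoms − 1 + number of rings). The increments listed are the commonly tabulated ones and are not given in this paper, so they should be checked against refs. [2, 11] before use; the function name is ours.

```python
import re

# McGowan atomic increments in cm3/mol (assumed standard values, not taken from this paper).
ATOM_VOLUME = {"C": 16.35, "H": 8.71, "O": 12.43, "F": 10.48, "S": 22.91}
BOND_DECREMENT = 6.56  # subtracted once per bond, irrespective of bond order


def mcgowan_v(formula, rings=0):
    """McGowan characteristic volume in (cm3/mol)/100 from a formula such as 'C5H8O2'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    n_atoms = sum(counts.values())
    n_bonds = n_atoms - 1 + rings
    total = sum(ATOM_VOLUME[sym] * n for sym, n in counts.items())
    return (total - BOND_DECREMENT * n_bonds) / 100.0


print(round(mcgowan_v("C5H8O2"), 4))   # pentane-2,4-dione
print(round(mcgowan_v("C6H10O2"), 4))  # hexane-2,4-dione
```

Under these assumptions the sketch returns the V-values given above for pentane-2,4-dione and hexane-2,4-dione; cyclic or aromatic compounds need the `rings` argument set accordingly.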

There are only four partition coefficients available for heptane-2,4-dione, but these still yield 11 equations (GCRI = 989). A value for E (0.385) can be obtained from a known refractive index and V = 1.1263 and the 11 equations solved with SD = 0.089 to give the descriptors in Table 2.

In the case of octane-2,4-dione, log10 P values are known only for partition into heptane and tetrachloromethane, leading to a total of seven equations. E was estimated as 0.380, A was taken as zero and V calculated as 1.2672. The seven equations were solved with SD = 0.031 to give the descriptors in Table 2.

A log10 P value into tetrachloromethane is all that is available for nonane-2,4-dione. With GCRI = 1188 [13] we have only five equations. E was estimated as 0.380, A was taken as zero and V calculated as 1.4081. This is just enough to obtain the descriptors in Table 2.

Seven log10 P values are listed for heptane-3,5-dione, and lead to 17 equations (GCRI = 989). An experimental value of the refractive index leads to E = 0.389, V is calculated as 1.1263, and the 17 equations can be solved with SD = 0.143 to give the descriptors in Table 2.

For octane-3,5-dione only two partition coefficients were available, so that we have seven equations. E was estimated as 0.380, A was taken as 0.00 and V calculated as 1.2672. The equations were solved to give the descriptors in Table 2 with an SD of 0.063 log10 units.

There is more data for nonane-4,6-dione. We used seven log10 P values, which translated into 17 equations (GCRI = 1188). A known refractive index gave E = 0.399 and V = 1.4081. The set of equations was solved with an SD = 0.167, to give the descriptors in Table 2.

For undecane-5,7-dione we used five partition coefficients, leading to 12 equations. We estimated E as 0.37, calculated V as 1.6899 and solved the set of equations to yield the descriptors in Table 2 with an SD value of 0.150 log10 units.

As for undecane-5,7-dione we had only five partition coefficients for tridecane-6,8-dione. Taking E = 0.37 and V = 1.9717 we solved the set of 12 equations to obtain the descriptors in Table 2 with a rather large SD of 0.178 log10 units.
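The equation totals quoted in this section follow a simple pattern: each log10 P value contributes one equation in log10 P and one in log10 K, to which are added the two equations in log10 Kw and, where a retention index is available, one equation in GCRI. A minimal Python sketch of this bookkeeping is given below; the function name is ours.

```python
def equation_count(n_log_p, has_gcri=False, n_log_kw=2):
    """Number of simultaneous equations available for one solute.

    Each log10 P value gives one Eq. 1 equation and, through Eq. 3, one Eq. 2
    equation; the log10 Kw equations and an optional GCRI equation are added.
    """
    return 2 * n_log_p + n_log_kw + (1 if has_gcri else 0)


for name, n_values, with_gcri in [
    ("pentane-2,4-dione", 23, True),
    ("hexane-2,4-dione", 6, True),
    ("heptane-2,4-dione", 4, True),
    ("undecane-5,7-dione", 5, False),
]:
    print(name, equation_count(n_values, has_gcri=with_gcri))
```

The totals of 12 equations for undecane-5,7-dione and tridecane-6,8-dione are consistent with no GCRI equation having been used for these two compounds.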

There are also a number of branched chain alkyl derivatives for which partition coefficients are available [12]. For 5,5-dimethylhexane-2,4-dione partition coefficients are known into five solvents, and with GCRI (1004) we have 13 equations. With E = 0.38 and V = 1.2672 we solved the set of equations to obtain the descriptors in Table 2; SD = 0.128 log10 units.

Partition coefficients are available for 2,6-dimethylheptane-3,5-dione and with GCRI = 1060 we had 15 equations. The given experimental value [12] for partition into octanol, log10 P = 2.22, was well out of line and was omitted. The resulting 13 equations with E = 0.38 and V = 1.4081 gave the descriptors in Table 2 with SD = 0.162 log10 units.

The only partition coefficient that we could find for 2,8-dimethylnonane-4,6-dione was that of log10 P = 4.05 for partition into benzene [14], but a value of 1258 was available for GCRI. These yielded only five equations. We estimated E as 0.37, we know that V = 1.6899, and in order to solve the equations we also estimated that B = 0.72. Then solution of the equations gave SD = 0.076 and the remaining descriptors as shown in Table 2.

There are two other alkyl derivatives of acac that have been used as complexing agents, 3-methylpentane-2,4-dione and 2,2,6,6-tetramethylheptane-3,5-dione. There is insufficient data on these compounds to yield a set of equations that can be solved to get descriptors, but from the results we have for the other alkyl derivatives, we estimate the descriptors as shown in Table 2.

A number of other derivatives of acetylacetone have been widely used as complexing agents; for several of these compounds, numerous values of log10 P are known [12]. We start with benzoylacetone (1-phenylbutane-1,3-dione), for which partition into 20 solvents has been studied. A value of 1364 for GCRI is available [13] and so we have no fewer than 43 equations on the lines of Eqs. 1 and 2. We took E = 1.00 from addition of fragments and also from calculations [7, 8], and V = 1.3114. The equations were solved to yield the descriptors in Table 2 with a very small value of SD = 0.086 log10 units. The observed and calculated values of log10 P are in Table 4. For the 20 solvents, AE = 0.01 and SD = 0.086 log10 units.

Table 4 Calculated and observed values of log10 P for water–solvent partition of benzoylacetone

Partitions into 11 solvents are known for 1,1,1-trifluorobenzoylacetone. The value of log10 P into trichloromethane was quite out of line (obs. 2.73, calc. 3.28) and if this is left out we have 23 equations (GCRI = 1198). We estimated E = 0.69 from values for pentane-2,4-dione, 1,1,1-trifluoropentane-2,4-dione and benzoylacetone, and calculated V = 1.3645. The set of equations was solved to give the descriptors in Table 2 with SD = 0.126 log10 units.

For thenoylacetone (1-(2-thienyl)butane-1,3-dione) we have log10 P values into hexane and benzene. Together with a value of 1385 for GCRI these gave seven equations. We estimated E = 1.10 by addition of fragments, calculated V as 1.2361 and solved the equations to give the descriptors in Table 2 with SD = 0.039 log10 units.

There are a large number of log10 P values available for trifluorothenoylacetone (4,4,4-trifluoro-1-(2-thienyl)butane-1,3-dione). These include values for partition into numerous esters for which we have no coefficients in Eqs. 1 and 2. For partition into ethyl acetate and butyl acetate, however, the observed values of log10 P are so far away from our calculated values that we suggest all the given log10 P values into esters be used with caution. We were left with 12 values of log10 P, together with a value of 1199 for GCRI, leading to 27 equations. A calculated refractive index [10] leads to E = 0.524, close to a calculated value for E of 0.53 [8]. We used the latter value and our calculated value of V = 1.2892, and solved the 27 equations to give the descriptors in Table 2 with an SD of 0.101 log10 units.

There are also log10 P values for 2-furoyltrifluoroacetone and pivaloyltrifluoroacetone, but we could not obtain any reasonable set of descriptors for these two compounds.

Finally we deal with trifluoroacetylacetone (1,1,1-trifluoropentane-2,4-dione) and hexafluoroacetylacetone (1,1,1,5,5,5-hexafluoropentane-2,4-dione). For trifluoroacetylacetone we have log10 P values into 15 solvents. The value of log10 P into trichloromethane was considerably out of line (calc. 0.94, obs. 0.33) and was left out. With GCRI = 624 this leaves 31 equations to solve. An experimental refractive index of 1.3890 [13] leads to E = 0.106 and with V = 0.8976 we obtained the descriptors in Table 2 with an SD of 0.125 log10 units. The calculated and observed values of log10 P are in Table 5, and yield AE = 0.011 and SD = 0.123 log10 units (omitting trichloromethane). It is noteworthy that the A-descriptor is not zero, but with a set of 31 equations, we can be reasonably confident about this descriptor.

Table 5 Calculated and observed values of log10 P for water–solvent partition of trifluoroacetylacetone

The position with hexafluoroacetylacetone is not straightforward. We have four values [15] of log10 P into trichloromethane (−1.75), tetrachloromethane (−1.92), hexane (−2.04) and benzene (−1.91), and also a value of GCRI (459), leading to eleven equations. The solution of this set of simultaneous equations yields completely unreasonable values for the descriptors. Stokely [16] has shown that hexafluoroacetylacetone decomposes in water. He measured a value for log10 P into benzene of −0.42 (in contrast to the value of −1.91 [15]), and found that the partition coefficient decreased with time. We can obtain a value of −0.217 for E from the refractive index and we can calculate V = 0.9507, but there are still not enough data to obtain a full set of descriptors. We can deduce that B = 0.80 and L = 2.340 by comparison to other compounds in Table 2, and from Absolv calculations [7]. Then with S = 0.07 and A = 0.32 we can reproduce Stokely’s [16] value of −0.42 for log10 P into benzene, and the associated values of log10 K into benzene and log10 Kw, with the descriptors in Table 2. However, we caution that these results must be regarded as provisional only.

4 Discussion

We have managed to obtain descriptors for acetylacetone and 21 of its derivatives, as shown in Table 2. These can be combined with the equation coefficients in Table 1 to yield estimates of partition coefficients from water and the gas phase into all the listed solvents, and (hypothetical) partition coefficients into a large number of dry solvents for which we have also determined equation coefficients [17,18,19]. In addition we have determined equation coefficients for partition into water–ethanol [20, 21] and water–methanol mixtures [22], and so values of log10 P and log10 K into these solvent mixtures can also be estimated. In addition to the usual organic solvents, we have also studied ionic liquids [23], and partitions into these solvents can be estimated for the various acetylacetonates. Partitions or permeations in biological systems [24,25,26] can also be estimated from the descriptors listed in Table 2.

Inspection of the descriptors themselves shows that all the acetylacetonates are quite polar, with substantial values of the S-descriptor, and, as expected from the presence of the two carbonyl groups, are quite strong hydrogen bond bases, with B-values almost double those for simple aliphatic ketones, which have B-values around 0.45 [7, 8]. Perhaps unexpectedly, the alkyl-substituted acetylacetonates all have zero hydrogen bond acidity, as do some of the trifluoro derivatives. Only with trifluoroacetylacetonate and with hexafluoroacetylacetonate are significant values of the A-descriptor found.

The L-values form a very regular series, and can be taken to show the internal consistency of our set of descriptors. For the acetylacetonates with linear alkyl substituents we find Eq. 5, where CN is the number of carbon atoms.

$$ \varvec{L} = 0.7132 + 0.5069\,\text{CN} $$
(5)
$$ N = 10,\; SD = 0.059,\; R^{2} = 0.998,\; F = 3734.0 $$
$$ \text{PRESS} = 0.06235,\; Q^{2} = 0.995,\; PSD = 0.089 $$

The branched chain substituents behave remarkably similarly to the linear chain substituents, and for all the alkyl substituted acetylacetonates we find Eq. 6.

$$ \varvec{L} = 0.7516 + 0.5015\,\text{CN} $$
(6)
$$ N = 15,\; SD = 0.053,\; R^{2} = 0.998,\; F = 6169.3 $$
$$ \text{PRESS} = 0.0616440,\; Q^{2} = 0.997,\; PSD = 0.069 $$

The plots of L against CN are excellent, as shown in Fig. 1. Equation 5 or especially Eq. 6 could be used to estimate an L-value for any alkyl-substituted acetylacetonate.
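As a worked illustration, Eqs. 5 and 6 can be coded directly. The Python sketch below assumes that CN is the total number of carbon atoms in the molecule (so that CN = 5 for pentane-2,4-dione itself); the function names are ours.

```python
def l_linear_alkyl(cn):
    """Eq. 5: L for acetylacetonates with linear alkyl substituents."""
    return 0.7132 + 0.5069 * cn


def l_all_alkyl(cn):
    """Eq. 6: L for all alkyl-substituted acetylacetonates, linear or branched."""
    return 0.7516 + 0.5015 * cn


# Estimates for CN = 5 (pentane-2,4-dione) up to CN = 13 (tridecane-6,8-dione).
for cn in (5, 9, 13):
    print(cn, round(l_linear_alkyl(cn), 3), round(l_all_alkyl(cn), 3))
```

Both equations give an increment of about 0.5 in L per additional carbon atom, and over this range of CN the two estimates differ by less than 0.04, which is within the SD of either regression.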

Fig. 1 Plot of the descriptor L against the number of carbon atoms, CN, in alkylpentane-2,4-diones: open circles, linear alkylpentane-2,4-diones; filled circles, branched-chain alkylpentane-2,4-diones

Once we have descriptors for the acetylacetonates, we can then deduce the corresponding water–octanol partition coefficients, as log10 P. These partition coefficients are of considerable interest, as they are often used as a measure of hydrophobicity of solutes, and they are the most commonly estimated of all water–solvent partition coefficients. We can compare our own calculated values with those calculated through four very common methods: the ClogP program of Leo [12], the EPI Suite™ [27], the ACD program in ChemSketch [10] and the ACD program that is part of the Absolv ADME Suite [7]. In addition, we can compare all the calculated values with the (few) observed values. Details are in Table 6. There are eight compounds for which observed values are available, and a comparison of these with the various calculated values, in terms of the average error and standard deviation, is in Table 7.

Table 6 Observed and calculated values of log10 P for acetylacetonates
Table 7 Comparison of observed and calculated values of log10 P for acetylacetonates

Inspection of Tables 6 and 7 suggests that where our descriptors are available, they yield estimates of water–octanol partition coefficients, as log10 P, that are at least as good as those from standard calculation methods [7, 10, 12, 27]. In addition, use of our descriptors has the advantage that water–solvent partition coefficients can be estimated for a very large number of organic solvents. The deviations in observed and calculated values of log10 P for the eight acetylacetonates are quite similar, thus indicating that the errors in the descriptors for the various acetylacetonates are also quite close.

5 Conclusions

We have been able to obtain Abraham or Absolv descriptors for pentane-2,4-dione and 21 of its derivatives. These descriptors encode important chemical properties, and show that pentane-2,4-dione itself has no hydrogen bond acidity, but that the trifluoro- and hexafluoro-derivatives have substantial hydrogen bond acidity. The descriptors for the 22 compounds enable partition coefficients to be estimated for partition from water to a very large number of organic solvents. In the case of water–octanol partition coefficients we show that estimations through our descriptors are at least as good as the best calculations through well-known calculation programs.