CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE Determination of solvation parameters using MarvinSketch Paul Laffort*, Pierre Héricourt CNRS, Centre Européen des Sciences du Goût, 21000 Dijon, France *http://paul.laffort.free.fr.


CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE

Determination of solvation parameters using MarvinSketch

Paul Laffort*, Pierre Héricourt

CNRS, Centre Européen des Sciences du Goût, 21000 Dijon, France * http://paul.laffort.free.fr

Conceptual definition of solvation parameters

(previously called “solubility factors” by P. Laffort and co-authors)

SP: experimental matrix of a solubility property, e.g. retention indices in GLC

B: solvents

If SP = A*B, then A and B are respectively the matrices of solute and solvent solvation parameters.
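The relation SP = A*B can be illustrated with a minimal sketch. All numbers below are hypothetical and chosen only to show the shapes involved: A has one row per solute, B has one column per solvent, and each entry of SP is the sum of products over the parameters. Two parameters are used here for brevity instead of five.

```python
# Hypothetical values, for illustration only (not experimental data).
# A: 2 solutes x 2 solute parameters
A = [
    [1.0, 0.5],
    [2.0, 0.0],
]
# B: 2 solvent parameters x 3 solvents (stationary phases)
B = [
    [100.0, 200.0, 300.0],
    [10.0, 20.0, 30.0],
]


def matmul(a, b):
    """Plain matrix product: result[i][j] = sum over k of a[i][k] * b[k][j]."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# SP: 2 solutes x 3 solvents, the modelled solubility property
SP = matmul(A, B)
for row in SP:
    print(row)
```

Each row of SP corresponds to one solute and each column to one solvent, which is the layout of the retention index matrices described in the following slides.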

Experimental definition of solvation parameters 1

The first tool needed is a solid database SP of a solubility property.

In 1972-1987, together with Andrew Dravnieks, we used unpublished GLC retention indices by W.O. McReynolds, from Celanese Chem. Co., Bishop, Texas: a matrix of 75 solutes x 25 stationary phases (i.e. solvents).

In 2005 we used a very accurate matrix of 133 solutes x 10 stationary phases, by Ervin Kováts and co-authors, from five papers (1990-1995).

Experimental definition of solvation parameters 2

The 2nd tool needed is a suitable statistical analysis: the MMA algorithm

[Diagram: the MMA algorithm. INPUT: A and R (experimental retention indices). OUTPUT: A, B (solvent parameters), A*B (predicted retention indices), the standard error of R and the Pearson correlation coefficients of A.]

First application of the MMA algorithm: the number of terms

[Figure: plot against the number of terms (horizontal axis, 0 to 10); vertical axis from 0 to 14.]

Nature of the five solute solubility parameters

The authors currently working on solvation parameters agree that five solute parameters and five solvent parameters are necessary and sufficient to account for solubility phenomena.

The five solute parameters are:

DISPER: dispersion

ORIENT: orientation or polarity

POLARIZ: polarizability/induction

ACID: acidity (proton donor)

BASIC: basicity (proton acceptor)

[Slide annotations beside this list: “related to the molar volume” and “independent of the molar volume”.]

The nature of the solvent parameters is not yet completely identified.

Experimental definition of solvation parameters 3

The 3rd tool needed is an INPUT set of values of the solute parameters, drawn from theoretical or empirical considerations, that is as close as possible to the OUTPUT values obtained by combining the MMA algorithm with an accurate GLC set of experimental retention indices (here, by Kováts and co-authors).

Among all published values, we tested only those involving five solute parameters, including our own previous studies (1976 and 1982). In addition to the good correlation between INPUT and OUTPUT values already mentioned, two further criteria were considered:

1. A good independence of the solute parameters (poor mutual correlation).

2. An OUTPUT set of solvent parameters without negative values, which are difficult to interpret in physico-chemical terms.
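The three criteria can be sketched as simple checks on the matrices. This is only an illustrative sketch, not the MMA algorithm itself: the function names are hypothetical, the use of the absolute value of r and the threshold of 0.5 for "poor mutual correlation" are assumptions, and the small matrices are invented for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def columns(matrix):
    """Return the columns of a row-major matrix as lists."""
    return [list(col) for col in zip(*matrix)]


def input_output_correlations(a_in, a_out):
    """Criterion 0: r between each INPUT parameter and its OUTPUT counterpart."""
    return [pearson(p, q) for p, q in zip(columns(a_in), columns(a_out))]


def count_correlated_pairs(a, threshold=0.5):
    """Criterion 1: number of parameter pairs with |r| above the threshold."""
    return sum(
        1 for p, q in combinations(columns(a), 2) if abs(pearson(p, q)) > threshold
    )


def count_negative(b):
    """Criterion 2: number of negative values in the solvent matrix B."""
    return sum(1 for row in b for value in row if value < 0)


# Invented 4-solute x 3-parameter matrix: columns 0 and 1 are nearly collinear.
A_out = [[1.0, 2.0, 0.3], [2.0, 4.1, 0.1], [3.0, 5.9, 0.4], [4.0, 8.2, 0.2]]
B_out = [[1.0, -2.0], [3.0, -4.0], [0.5, 0.0]]

print(count_correlated_pairs(A_out))  # one pair (columns 0 and 1)
print(count_negative(B_out))          # two negative entries
```

A candidate INPUT set would be preferred when the first function returns values close to 1 and the two counters return small numbers.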

A good independence of the INPUT solute parameters

Among the five published data sets tested, the set by Michael Abraham (1993) presents the best mutual independence of the solute parameters, after an internal rearrangement of the original values via two simple equations.

[Tables: mutual correlation coefficients between the solute parameters DISPER, ORIENT, POLARIZ, ACID and BASIC for the data of Abraham (1993), N = 314, expressed with the descriptors log L16, π2H, R2, Σα2H and Σβ2H. The coefficients are given below in transcript order; their positions in the tables are not preserved.

Original data set: 0.45, 0.06, 0.52, 0.14, 0.06, 0.61, 0.32, 0.31, 0.15, -0.14

Slightly modified data set (modified according to Laffort et al., 2005): 0.14, 0.24, 0.08, -0.02, 0.27, 0.15, 0.06, 0.24, -0.14, 0.14]

A first set of updated solute solvation parameters

The rearranged data of Michael Abraham (1993):

provide, as we will now see, a good INPUT matrix for the MMA algorithm and the experimental retention indices of Kováts and co-authors, for 133 compounds;

also provide a first set of updated solute solvation parameters for 314 compounds.

Experimental definition of solvation parameters: 1 + 2 + 3

The rearranged Abraham (1993) data appear to be a reasonably good INPUT data set. The version according to Laffort et al. (2005) has been chosen as the best INPUT, generating updated solute parameters for 133 compounds.

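MODELS: INPUT / OUTPUT correlations (r) for DISPER, ORIENT, POLARIZ, ACID and BASIC, followed by the number of r > 0.5 in OUTPUT A and the number of negative values in matrix B. The INPUT descriptor is named in parentheses where the transcript preserves it.

random numbers: DISPER 0.04; ORIENT 0.10; POLARIZ 0.00; ACID 0.12; BASIC 0.07; r > 0.5 in OUTPUT A: 3; negative values in B: zero

Abraham (original): DISPER 1.00 (log L16); ORIENT 0.99 (π2H); POLARIZ 0.91 (R2); ACID 0.98 (Σα2H); BASIC 0.98 (Σβ2H); r > 0.5 in OUTPUT A: 2; negative values in B: 15 (param. 4 & 5)

Abraham (modified): DISPER 0.99; ORIENT 0.97; POLARIZ 0.91 (R2); ACID 0.98 (Σα2H); BASIC 0.98 (Σβ2H); r > 0.5 in OUTPUT A: zero; negative values in B: 6 (param. 4)

Laffort et al., 2005: DISPER 0.98 (fnVb); ORIENT 0.97; POLARIZ 0.92 (R96); ACID 0.98 (Σα2H); BASIC 0.98 (Σβ2H); r > 0.5 in OUTPUT A: zero; negative values in B: zero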

Getting optimized values for more solutes

Three ways are now available to obtain further solute solvation parameters:

1. A 100% experimental procedure using GLC with five columns (open tubular if possible, rather than packed) containing two apolar phases of different molecular weight, a strongly fluorinated phase, a classical polyether and an alcoholic phase (e.g. diglycerol), after “learning” the set for 133 compounds.

2. A rewriting of the numerous data published by Michael Abraham and co-authors (Colin Poole, Alan Katritzky, Andreas Klamt, William Acree Jr. and many others), using the two equations of internal rearrangement already mentioned, plus a third unpublished equation when these authors use Vx (the molar volume) in place of L16 (the air-hexadecane partition coefficient).

3. A simplified molecular topology (SMT), which principally takes into account, for each atom of a molecule, its nature, the nature of its bonds and, in some cases, the nature of its first neighbors. The SMT algorithm is based on the MarvinSketch program and other Java functionalities of ChemAxon Ltd. The learning is based on a pool of the two subsets of solubility parameters already mentioned (314 + 133), giving a total of 369 defined compounds.

Principle and examples of the SMT

Structural elements; Bonds; Topological features; Subcategories

Carbon; ≤ 4; C0, C1, C11, C111, C1111, C2, C12, C112, C22, C3, C13

Oxygen; ≤ 2; O0, O11

Oxygen; ≤ 2; O1; linked to C1, C11, C111, C1111, C112

Oxygen; ≤ 2; O2; linked to C12, C112, others
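The labelling of topological features can be sketched as follows. This is a hypothetical illustration, not the authors' SMT implementation and it does not call MarvinSketch. It assumes that the digits after the element symbol list the orders of the atom's bonds to non-hydrogen neighbors (so C112 is a carbon with two single bonds and one double bond, and C0 or O0 is an atom with no such neighbor), which is a reading of the feature names above rather than a stated definition.

```python
from collections import Counter


def smt_label(element, bond_orders):
    """Build a label such as 'C112' from an element symbol and the orders
    of its bonds to non-hydrogen neighbors (assumed naming convention)."""
    if not bond_orders:
        return f"{element}0"
    return element + "".join(str(order) for order in sorted(bond_orders))


def smt_counts(atoms, bonds):
    """Count topological features in a hydrogen-suppressed molecular graph.

    atoms: list of element symbols, indexed by position
    bonds: list of (atom_index, atom_index, bond_order) tuples
    """
    orders_per_atom = [[] for _ in atoms]
    for i, j, order in bonds:
        orders_per_atom[i].append(order)
        orders_per_atom[j].append(order)
    return Counter(
        smt_label(element, orders)
        for element, orders in zip(atoms, orders_per_atom)
    )


# Acetone, CH3-C(=O)-CH3, written by hand as a hydrogen-suppressed graph.
acetone_atoms = ["C", "C", "O", "C"]
acetone_bonds = [(0, 1, 1), (1, 2, 2), (1, 3, 1)]
acetone_features = smt_counts(acetone_atoms, acetone_bonds)
print(sorted(acetone_features.items()))
```

Under this reading, acetone contains two C1 carbons, one C112 carbon and one O2 oxygen linked to C112, which matches the subcategory listed for O2 above.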

The index of polarizability/induction 1: the model

POLARIZABILITY

Features: coefficient (partial F ratio)

Constant: 0.300

C1: -0.150 (515)

C111: 0.150 (182)

C1111: 0.318 (140)

C12: 0.055 (182)

C112: 0.222 (607)

N111: 0.250 (35)

F1: -0.237 (561)

Br1: 0.152 (73)

I1: 0.482 (262)

S tot: 0.267 (127)

O2 x C112: -0.158 (72)
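How such a model could be evaluated is sketched below, under stated assumptions: the index is taken to be the constant plus the sum of each coefficient multiplied by the number of occurrences of its feature, the coefficients are paired with the features in the order in which both are listed on the slide, and the feature counts in the example are hypothetical. The function name is illustrative only.

```python
# Coefficients from the slide, paired with features in listing order
# (the pairing is an assumption based on that order).
CONSTANT = 0.300
COEFFICIENTS = {
    "C1": -0.150,
    "C111": 0.150,
    "C1111": 0.318,
    "C12": 0.055,
    "C112": 0.222,
    "N111": 0.250,
    "F1": -0.237,
    "Br1": 0.152,
    "I1": 0.482,
    "S tot": 0.267,
    "O2 x C112": -0.158,
}


def polarizability_index(feature_counts):
    """Assumed additive model: constant + sum(coefficient * feature count).

    Features absent from the model are rejected rather than silently ignored.
    """
    unknown = set(feature_counts) - set(COEFFICIENTS)
    if unknown:
        raise ValueError(f"features not in the model: {sorted(unknown)}")
    return CONSTANT + sum(
        COEFFICIENTS[name] * count for name, count in feature_counts.items()
    )


# Hypothetical counts: one C1, five C12 and one C112 carbon.
example_counts = {"C1": 1, "C12": 5, "C112": 1}
index = polarizability_index(example_counts)
print(round(index, 3))  # prints 0.647
```

The arithmetic for the example is 0.300 - 0.150 + 5 x 0.055 + 0.222 = 0.647. Features that occur in a molecule but carry no coefficient in the model, such as C11, would simply be left out of the counts.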

The index of polarizability/induction 2: the validation

[Figure: scatter plot of the POLARIZABILITY index against experimental values, both axes from -1.5 to 2.5; r = 0.96, F = 369.]

Conclusion and perspectives

1. A 100% experimental procedure using GLC with five columns is certainly one of the ways to be pursued.

2. A Simplified Molecular Topology (SMT) based on the MarvinSketch program and other Java functionalities of ChemAxon Ltd. also deserves to be pursued, and perhaps refined with the help of more values of experimental origin.

3. For the moment, theoretical approaches are not as precise as empirical ones, but that could change in the near future.

_____________

More details can be found in:

Laffort, P. et al., 2005, J. Chromatogr. A, 1100, 90-107

Laffort, P., Héricourt, P., 2006, J. Chem. Inf. Model., 46, 1723-1734