Treatment Process Modeling by QSAR Approach

TREATMENT PROCESS MODELING
BY QSAR APPROACH BIODEGRADATION
Sung Kyu (Andrew) Maeng
Contents
• QSAR Introduction
• QSBR Introduction
• Results and discussion
• Current QSAR project at UNESCO-IHE
Introduction to the (Q)SAR concept
• Chemicals with similar molecular structures have similar effects in physical and biological systems → qualitative model (SAR)
• The extent of an effect varies in a systematic way with variations in molecular structure → quantitative model (QSAR)
Biodegradation index = 4.066 - 0.007 MW - 0.314 H/C
r = 0.866, r² = 0.750, Sig. < 0.005, n = 156
Activity depends on chemical structure
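As a quick illustration of how such a QSAR equation is applied, the sketch below evaluates the regression quoted above for one compound. The function name and the example compound are hypothetical, and the units (MW in g/mol, H/C as an atomic ratio) are assumptions, since the slide does not state them.

```python
def biodegradation_index(mw: float, h_to_c: float) -> float:
    """Evaluate the slide's regression: 4.066 - 0.007*MW - 0.314*(H/C)."""
    return 4.066 - 0.007 * mw - 0.314 * h_to_c


# Hypothetical compound (not from the slides): MW = 300, H/C = 1.0
example_index = biodegradation_index(300.0, 1.0)
print(round(example_index, 3))

# Consistency check on the reported statistics: r = 0.866 gives r^2 of about 0.750
print(round(0.866 ** 2, 3))
```

Because both coefficients are negative, the equation predicts a lower index for heavier compounds and for compounds with a higher H/C ratio.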
SAR vs QSAR
• SAR is based on the "similarity" principle.
• The principle is assumed, but in reality it does not always hold:
- Similarity of structures
- Similarity of descriptors
• The validity depends on the type of relationship between descriptors (numerical representations of chemicals) and activity.
• The type of relationship should be known (or derived).
SAR vs. QSAR
How can we say there is a difference?
Up to this point the two methods have three things in common:
• Both methods use a numerical representation of chemical compounds;
• Both methods need to decide which representation to use;
• Both methods need to derive the relationship between the numerical representation (descriptors, etc.) and activity.
QSAR in water treatment processes
Results obtained from valid qualitative or quantitative structure-activity relationship models can help predict the removal of PhACs (pharmaceutically active compounds) in drinking water treatment and support process selection for target compounds. QSAR results may be used instead of testing if they are derived from a QSAR model whose scientific validity has been established.
QSAR in water treatment processes
• In principle, QSARs can be used to:
- provide information for prioritising treatments for target compounds
- guide the experimental design of a test or testing strategy
- improve the evaluation of existing test data
- provide mechanistic information (e.g. to support the grouping of chemicals into categories)
- fill a data gap needed for classification
OECD Principles for QSAR Validation
• A QSAR should be associated with the following information:
- a defined endpoint
- an unambiguous algorithm
- a defined domain of applicability
- appropriate measures of goodness-of-fit, robustness and predictivity
- a mechanistic interpretation, if possible
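To make "goodness-of-fit" and "predictivity" concrete, here is a minimal sketch for a single-descriptor linear model. R² is computed on the training data and Q² by leave-one-out cross-validation; the function names and the data are invented for illustration and are not taken from the slides or from any OECD guidance text.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(xs, ys):
    """Goodness-of-fit: 1 - SS_res / SS_tot on the training data."""
    a, b = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Predictivity: leave-one-out Q2 = 1 - PRESS / SS_tot."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict the held-out compound.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - press / ss_tot


# Invented data: one descriptor (e.g. log Kow) against a biodegradability score
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [2.9, 2.7, 2.2, 2.1, 1.6, 1.5]
print(r_squared(descriptor, activity), q_squared_loo(descriptor, activity))
```

For an ordinary least-squares fit, Q² never exceeds R², so a large gap between the two is a common warning sign of overfitting.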
QSBR
• Development of Quantitative Structure-Biodegradation Relationships (QSBRs)
- QSBRs have been developed to predict the biodegradability of chemicals released to natural systems from their structure-activity relationships (SAR)
- The development of QSBRs has been relatively slow compared with the proliferation of QSARs because of the nature of the biodegradability endpoint
- Developing a QSBR is very complex because biodegradability depends on:
1. Chemical structure
2. Environmental conditions
3. Bioavailability of the chemical
QSBR
- Limitations often encountered in developing QSBRs
1. Models are valid only within congeneric series of chemicals
2. Standardised and uniform biodegradation databases are absent
- In recent years, new and better qualitative and quantitative biodegradability models have been developed intensively
- How many QSBRs have been developed?
A literature search on QSBR found more than 84 models
- However, only a few models provided an acceptable level of agreement between estimated and experimental data
QSBR
- All QSBR models published until 1994 were reviewed by several researchers for their applicability
1. Group contribution methods (OECD, PLS, BIOWIN, MultiCASE)
2. Chemometric methods (CART)
3. Expert systems (BESS, CATABOL, TOPKAT)
- According to previous studies, the group contribution method appears to be the most widely applied and most successful way of modeling biodegradation
Group Contribution Method
• OECD hierarchical model approach
• Multivariate partial least squares (PLS) model
• BIOWIN
• MultiCASE anaerobic program
What Does the BIOWIN Model Do?
• Provides estimates of biodegradability useful in chemical screening under aerobic conditions (BIOWIN1, 2, 5 and 6)
• Provides the approximate time required to biodegrade in a stream (BIOWIN3 and 4)
• BIOWIN was recently updated and can now estimate anaerobic biodegradation potential (BIOWIN7)
BIOWIN has 7 models (U.S. EPA, 2007)
- BIOWIN1 (linear) and BIOWIN2 (non-linear): based on regressions against 36 preselected chemical structures plus molecular weight of experimental biodegradation data for 295 compounds (BIODEG)
- BIOWIN3 (ultimate) and BIOWIN4 (primary): based on regressions of biodegradability estimates from a survey of experts for a suite of 200 organic chemicals against the same chemical substructures plus molecular weight
- BIOWIN5 (linear) and BIOWIN6 (non-linear): based on regressions of data from the Japanese MITI database against a modified set of chemical substructures plus molecular weight
- BIOWIN7: based on the BIOWIN fragment contribution approach
Materials and method
• Finding molecular descriptors
Software: Delft Chemtech, Dragon, Chem3D, etc.
• Selection of molecular descriptors
1. PCA (SPSS)
2. Genetic Algorithm-Variable Subset Selection (MobyDigs)
Principal Component Analysis (PCA)
• Variables: MW, MV, log Kow, dipole, length, width, depth, equiv width, % HL surface, polar surface area
• Assessment of the suitability of the data for PCA
- KMO ≥ 0.6 (observed: KMO = 0.6), Bartlett's test of sphericity p < 0.05 (observed: p < 0.005)
• Determination of the number of factors by the Kaiser criterion, scree plot and Monte Carlo parallel analysis
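The slides ran these checks in SPSS. As a rough sketch of what Bartlett's test of sphericity computes, the code below evaluates the standard statistic χ² = -(n - 1 - (2p + 5)/6) · ln|R| with p(p - 1)/2 degrees of freedom, where R is the p × p correlation matrix and n the number of samples. The helper names and the example matrix are hypothetical.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def bartlett_sphericity(corr, n_samples):
    """Return (chi-square statistic, degrees of freedom) for correlation matrix corr."""
    p = len(corr)
    chi2 = -(n_samples - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    dof = p * (p - 1) // 2
    return chi2, dof


# Hypothetical 2 x 2 correlation matrix (r = 0.5) from 20 samples
statistic, dof = bartlett_sphericity([[1.0, 0.5], [0.5, 1.0]], 20)
print(statistic, dof)
```

The statistic is compared against a chi-square distribution with `dof` degrees of freedom. A small p-value means the variables are correlated enough for PCA to be worthwhile, which is what the "< 0.05" criterion above expresses.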
Structure Matrix (Components 1 and 2)
Variables: MW, eqwidth, width, MV, REJ, depth, length, dipole, HL_surf, log_Kow, po_surf, Biowin3
Loadings (in extraction order; cell positions lost): .952, .905, .900, .879, -.395, .780, .723, .714, .898, .379, -.855, .444, .720, -.358, .713
Extraction Method: Principal Component Analysis.
Rotation Method: Oblimin with Kaiser Normalization.
Classification of PhACs - PCA
Groups: HL-neu, HL-ion, HP-neu, HP-ion (hydrophilic or hydrophobic, neutral or ionic)
The two-component solution explained a total of 67% of the variance, with Component 1 contributing 46% and Component 2 contributing 21%. Component 1 represents size and Component 2 hydrophobicity/hydrophilicity.
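The link between the Kaiser criterion and the variance figures can be sketched as follows. For standardized variables each component's share of variance is its eigenvalue divided by the sum of all eigenvalues, and the Kaiser criterion keeps components with an eigenvalue above 1. The eigenvalues below are hypothetical, chosen only so that the first two shares reproduce the 46% and 21% reported here; they are not the study's actual values.

```python
def kaiser_retained(eigenvalues):
    """Kaiser criterion: keep components whose eigenvalue exceeds 1."""
    return [ev for ev in eigenvalues if ev > 1.0]


def explained_share(eigenvalues):
    """Share of total variance per component (eigenvalue / sum of eigenvalues)."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


# Hypothetical eigenvalues for 10 standardized variables (they sum to 10)
eigenvalues = [4.6, 2.1, 0.9, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1, 0.1]

retained = kaiser_retained(eigenvalues)
shares = explained_share(eigenvalues)
cumulative = sum(shares[: len(retained)])
print(len(retained), [round(s, 2) for s in shares[:2]], round(cumulative, 2))
```

In practice the Kaiser criterion tends to retain too many components, which is presumably why it is cross-checked here against the scree plot and parallel analysis.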
Biodegradation (Aerobic)
Dependent variable: BIOWIN3
Independent variables (indices, chemical descriptors): MW, MV, log Kow, dipole, length, width, depth, equiv width, % HL surface, polar surface area
Group | R2 | Std. error | Sig. (p) | Rej. range (%) | BIOWIN3 range | Equation to predict biodegradation
HL | 0.76 | 0.21 | < 0.05 | 6.70-98.5 (75) | 1.86-3.60 (2.8) | 2.842-0.168logKow-0.008MV+1.039length; (-59+170.06width)
HP [1] | - | - | - | 37.5-99.1 (86) | 1.52-2.96 (2.5) | 198+7.53log_Kow-42.75length-94.09eqwidth
HL-ionic | 0.55 | 0.25 | < 0.05 | 74.8-96.9 (91) | 1.86-3.03 (2.6) | 3.536-0.009MW+0.934length; (138.81-5.04logKow-13.84length-94.09HL_surf)
HP-ionic [1] | - | - | < 0.05 | 74.8-99.1 (95) | 2.16-2.96 (2.7) | (198.38+7.53logKow-42.57length-94.09eqwidth)
HL-neutral | 0.84 | 0.19 | < 0.05 | 6.7-98.5 (60) | 2.28-3.59 (2.9) | 3.323-1.88logKow-0.004MV; (-119.89+4.53logKow+27704eqwidth)
HP-neutral | 0.35 | 0.23 | < 0.05 | 37.5-98.1 (79.7) | 1.52-2.68 (2.3) | 3.493-4.30logKow; (122.38-32.16logKow+109.73eqwidth0.78HL_surf)
[1] No equation could be derived for HP and HP-ionic compounds because of collinearity among the variables (violation of MLR assumptions).
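To show how the group-specific equations would be used, the sketch below encodes the unparenthesised BIOWIN3 equations from the table above as coefficient dictionaries and evaluates one of them. The coefficients are transcribed as they appear in the slide. The dictionary layout, the function name and the example descriptor values are hypothetical, and the descriptor units are not given in the source.

```python
# Coefficients transcribed from the table; keys other than "intercept"
# are descriptor names as written on the slide.
BIOWIN3_EQUATIONS = {
    "HL": {"intercept": 2.842, "logKow": -0.168, "MV": -0.008, "length": 1.039},
    "HL-ionic": {"intercept": 3.536, "MW": -0.009, "length": 0.934},
    "HL-neutral": {"intercept": 3.323, "logKow": -1.88, "MV": -0.004},
    "HP-neutral": {"intercept": 3.493, "logKow": -4.30},
}


def predict_biowin3(group, descriptors):
    """Evaluate the linear equation for one group; raises KeyError if the
    group has no equation or a required descriptor is missing."""
    coeffs = BIOWIN3_EQUATIONS[group]
    return coeffs["intercept"] + sum(
        weight * descriptors[name]
        for name, weight in coeffs.items()
        if name != "intercept"
    )


# Hypothetical HL-ionic compound: MW = 250, length = 1.2 (units assumed)
estimate = predict_biowin3("HL-ionic", {"MW": 250.0, "length": 1.2})
print(round(estimate, 4))
```

The HP and HP-ionic groups are deliberately left out of the dictionary, since the footnote states that no BIOWIN3 equation could be derived for them.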
Innovative system for removal of micropollutants – RBF and NF membrane
[Figure: RBF and membrane; time-scale labels: days, days - weeks, weeks, weeks - months, months, longer]
QSAR Models Decision Support Framework
[Diagram labels: Organic micropollutants; QSAR (BIOWIN, Kow, MW); Biological treatment (RBF/DUNE, ARR); Physical/Chemical treatment (GAC, Membrane: NF, RO; AOP; Cl2; O3); Process selection and comparative performance assessment]
Current QSAR project
2008
GIST
Analysis of PhACs
LC-MS / AUTO SPE
QSAR Tools
Selection of Target compounds
Physical-chemical characteristics
Vs. Water treatments
2009
Selection of Water Treatments
Selected water Treatments
PhACs removal using selected
water treatments by GIST
PhACs removal using selected
water treatments by
UNESCO-IHE
Classification, Database, Model development
2010
A decision support tool for PhACs removal for water utilities