ICSAS Method Structural Alerts


Wednesday May 19, 2010 QSAR - Duluth

MULTICASE Inc.

Machine Intelligence in the Design of Safer Chemicals.

By

Gilles Klopman

Chemistry Department, Case Western Reserve University

and

MULTICASE Inc.

Cleveland, OHIO 44122, U.S.A.

Scientific Problems in Computational Toxicology

• Target is unknown or poorly understood
– e.g. Carcinogenicity, Reproductive Toxicity, Neurotoxicity etc.
• Hard to identify the responsible functionality
• Leading, or competing metabolism
• Solubility and Transport properties

Structure-Activity Challenges

Learning set of Experimental Data (Congeneric / Diverse)

How reliable is the data?

Is the learning set domain well defined?

Model Builder (Continuous Descriptors, Fragment descriptors, Molecular Geometry)

How good is the model?

Predicted Values

How good are the predictions?

Is the test set in the same domain as the learning set?


Knowledge-based Systems

META

CASE

Fragments based methodology


MCASE

MCASE: Builds expert systems for each type of activity

META

[Set of biophores]

Use of biophores in mechanistic studies

Activity of new chemicals can be predicted

BIOPHORES

• Linear chains of 2 to 10 heavy atoms.
– May include a side chain.
– Example : -CH2-N-CH2- <2-NO>
– Remark : May combine to form larger groups

• Expanded fragments
– Documented valid variation of the Biophore
– Example : -CH2-N-CH - <2-NO>

• 2D Distance fragments
– Distance between heteroatoms [7.8A]
– Example : OH <----------------------> Cl
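As a rough illustration of the linear-chain definition, the sketch below enumerates every simple path of 2 to 10 heavy atoms in a small molecular graph. The adjacency representation, the function name `linear_fragments` and the toy molecule are assumptions made for this example. It is not the CASE/MULTICASE fragment generator, which also tracks bond types, hydrogens, side chains and expanded fragments.

```python
from typing import Dict, List, Set, Tuple


def linear_fragments(atoms: Dict[int, str],
                     bonds: List[Tuple[int, int]],
                     min_len: int = 2,
                     max_len: int = 10) -> Set[Tuple[int, ...]]:
    """Return all simple paths of min_len..max_len heavy atoms.

    Each path is a tuple of atom indices. A path and its reverse are
    the same chain, so only one orientation is stored.
    """
    neighbours: Dict[int, List[int]] = {i: [] for i in atoms}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    found: Set[Tuple[int, ...]] = set()

    def extend(path: List[int]) -> None:
        if len(path) >= min_len:
            # Keep the lexicographically smaller orientation.
            found.add(min(tuple(path), tuple(reversed(path))))
        if len(path) == max_len:
            return
        for nxt in neighbours[path[-1]]:
            if nxt not in path:  # simple paths only, no atom revisited
                extend(path + [nxt])

    for start in atoms:
        extend([start])
    return found


# Toy heavy-atom skeleton of dimethylnitrosamine: C-N(-N=O)-C
atoms = {0: "C", 1: "N", 2: "C", 3: "N", 4: "O"}
bonds = [(0, 1), (1, 2), (1, 3), (3, 4)]

frags = linear_fragments(atoms, bonds)
labels = sorted("-".join(atoms[i] for i in path) for path in frags)
print(len(frags), labels)
```

The C-N-C chain found here corresponds to the backbone of the -CH2-N-CH2- <2-NO> example, where the NO group would be recorded as a side chain on atom 2 rather than as part of the chain.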


MODULATORS

1. Linear fragments similar to Biophores.
2. Partition Coefficient
3. Water Solubility
4. HOMO/LUMO energies
5. Charge densities located on atoms of the Biophore
6. Location, with respect to the Biophore, of:
– Hydrogen donors
– Hydrogen acceptors
– Lipophilic centers

BIOPHORES (toxicophores )

Biophore               Tot.   Inac.   Act.
NH2-Ar                   52      12     40
NO2-Ar                  101       3     98
R-O-Ar                   17       1     16
R-S-CH=CH                17       0     17
R-CH2-Cl                 31       1     30
R-CH2-Br                  5       0      5
R-CH2-O-CH-R             19       2     17
OH-NH-R                  12       0     12
CH2-N-CH2 <2-NO>         51       2     49
CH3-N-CH2 <2-NO>          3       0      3
R-CO-NR-CH3               9       1      8
(structure drawing)     182      20    162
(structure drawing)      12       1     11

The last two biophores were shown as structure drawings; only the atom labels N N survive in the transcript.
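To make the biophore statistics easier to compare, the short sketch below recomputes the share of active molecules for each named biophore, using the Tot./Inac./Act. counts from the slide. The row layout and the variable names are choices made for this example; the two biophores that were shown only as drawings are left out.

```python
# (biophore, total, inactive, active) as read from the slide
rows = [
    ("NH2-Ar", 52, 12, 40),
    ("NO2-Ar", 101, 3, 98),
    ("R-O-Ar", 17, 1, 16),
    ("R-S-CH=CH", 17, 0, 17),
    ("R-CH2-Cl", 31, 1, 30),
    ("R-CH2-Br", 5, 0, 5),
    ("R-CH2-O-CH-R", 19, 2, 17),
    ("OH-NH-R", 12, 0, 12),
    ("CH2-N-CH2 <2-NO>", 51, 2, 49),
    ("CH3-N-CH2 <2-NO>", 3, 0, 3),
    ("R-CO-NR-CH3", 9, 1, 8),
]

percent_active = {}
for name, total, inactive, active in rows:
    # Sanity check: the three columns must be consistent.
    assert inactive + active == total
    percent_active[name] = 100.0 * active / total
    print(f"{name:<18} {active:>3}/{total:<3} {percent_active[name]:5.1f}% active")

grand_total = sum(total for _, total, _, _ in rows)
grand_active = sum(active for _, _, _, active in rows)
print(f"all named biophores: {grand_active}/{grand_total}")
```

Note that a biophore seen in very few molecules (for example 3 of 3) gives a high percentage with little statistical weight, which is why the program output also reports a confidence level.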

Test of o-Nitroaniline

[Structure: o-nitroaniline, with NH2 and NO2 groups]

The molecule contains the Biophore (nr.occ.= 1):
NH2 -C \\ C
*** 15 out of the known 19 molecules ( 79%) containing such Biophore are ChrAb active with an average activity of 30. (conf.level=100%)
Constant is 40.0
Log partition coeff.= 1.01 ; LogP contribution is -1.5
** The probability that this molecule is ChrAb active is 80.0%
** The activity is predicted to be MODERATE, activity= 39

FDA Collaborators in developing new Modules

FDA’s Center for Drug Evaluation and Research (CDER)
Office of Pharmaceutical Science (OPS)
Informatics and Computational Safety Analysis Staff (ICSAS)
Office of Testing and Research (OTR)

OPS / ICSAS
Joseph F. Contrera, Ph.D.
Edwin J. Matthews, Ph.D.*
R. Daniel Benz, Ph.D.
Naomi L. Kruhlak, Ph.D.

OPS / OTR
James L. Weaver, Ph.D.
Joseph P. Hanig, Ph.D.
P. Scott Pine

PhRMA ---- FDA / CDER / OPS / ICSAS ---- E. Matthews ---- 05-16-02


CASE DIRECTORY

CARCINOGENICITY

A0C - Rod. carc - Rodents
A0D - Rat carcin. - Rats
A0E - Mouse carcin - Mice

MUTATION

A20 - Salm.mutagen - Salmonella
A2D - Drosoph mut - Drosophila somatic mutation

IRRITATION

A30 - Sens irritnt - Sensory Irritant mice
A31 - Eye irritnt - rabbit
A33 - Cont.Allergn - Allergic Contact Dermatitis - 1034

Sources and database sizes for the modules above, in the order extracted from the slide: Gold, Gold, Gold, [NTP]; 433, 745, 633, 1354, 289, 226, Avon & Carpenter, 207

TERATOGENESIS

A49 - teratogen - Misc. Chemicals - FDA/TERIS
A50 - MMTD - Mouse Maximum Tolerated Dose
A55 - minnow toxic - Minnow Toxicity

Database sizes, in the order extracted from the slide: 323, 321, 479

OTHER GENOTOXIC SHORT TERM ASSAYS

A60 - SiChEx act - Sister Chromatid Exchange [NTP]
A61 - ChrAb act - Chromosomal Aberration
A62 - MicrN Ind - Induction Micronuclei
A64 - UDS Induc - UDS Induction
A65 - SOS Chromot - SOS Chromotest

Source and database sizes, in the order extracted from the slide: [NTP]; 233, 233, 238, 299, 462

ENVIRONMENTAL

AU2 - biodegrad. - Miscellaneous (BOD after 5 days incubation) - 509
AUA - MICROTOX - Microb. toxic - 83

Example of MC4PC Inactivity Prediction

A2H- mutagenic - all classes, salmo.typh.overall assa
----------------------------------------------------------------------------
Now Processing... Linalool (Molecule # 1)
This molecule already exists as nr.3955 of activity 10 (CASE units), under the name :Linalool
MC calculated Water Solubility is: 0.83 [in log(mol/m**3)]
MC calculated Log(Octanol/Water) Partition Coef.is: 2.97
Molecule satisfies the rule of 5,(bioavailable)
MC Calculated Human Intestinal Absorption is: 90.4%

MCASE-3 Prediction
-----------------
** The molecule does not contain any known Biophore **
** The probability that this molecule is mutagenic is 21% **
*** The molecule is known to be INACTIVE ***
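The report line about the rule of 5 can be illustrated with the commonly cited Lipinski criteria (molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The sketch below is only an assumption about what such a check looks like; the slides do not say how the program implements it. The logP value for linalool is the one printed in the report, while the molecular weight (154.25 for C10H18O) and the donor and acceptor counts are supplied here for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MolProps:
    mol_weight: float   # g/mol
    log_p: float        # log(octanol/water) partition coefficient
    h_donors: int       # hydrogen-bond donors (OH, NH)
    h_acceptors: int    # hydrogen-bond acceptors (N, O)


def rule_of_5_violations(p: MolProps) -> List[str]:
    """List which of the four Lipinski criteria a molecule violates."""
    violations = []
    if p.mol_weight > 500:
        violations.append("molecular weight > 500")
    if p.log_p > 5:
        violations.append("logP > 5")
    if p.h_donors > 5:
        violations.append("more than 5 H-bond donors")
    if p.h_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations


# logP from the report above; the other values are assumptions for the example.
linalool = MolProps(mol_weight=154.25, log_p=2.97, h_donors=1, h_acceptors=1)
violations = rule_of_5_violations(linalool)
print("satisfies the rule of 5" if not violations else violations)
```

In practice a single violation is often tolerated; returning the list of violations rather than a yes/no flag leaves that threshold to the caller.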

MCASE Prediction

AF1- Carcinogenici- Male Rats (non-proprietary) #1179 1.80
-----------------------------------------------------------------------------
MULTICASE-3 Prediction
---------------------
The molecule contains the Biophore (nr.occ.= 1):
O -c. =cH -c =c.
The ICSAS Alert Index for this Biophore is 325
*** 5 out of the known 5 molecules (100%) containing such Biophore are Carcinogenici with an average activity of 65. (conf.level= 97%)
*** QSAR Contribution : Constant is 123.19
** The following Modulator(s) is/are also present:
( 1) OH -c = Inactivating -13.37
( 2) CO -c. = Inactivating -11.89
Electronegativity = -0.04 ; Its contribution is 1.82
Hard/Soft index is = 0.43 ; Its contribution is -44.05
-----
** Total projected QSAR activity (in CASE units) is equal to 55.71

CONCLUSIONS:
-----------
** The projected Carcinogenici activity is 56.0 CASE units **
** The compound is predicted to be VERY active **
** The probability that this molecule is Carcinogenici is 85% **
-----------------------------------------------------------------------------
The Molecules containing fragment : O -c. =cH -c =c. are :
1 in molecule 20 (0.91) acronycine
1 in molecule 27 (0.78) aflatoxin B1
1 in molecule 81 (0.61) Aristolochic acid II
1 in molecule 1011 (0.82) sterigmatocystin
Activities, in the order extracted from the slide: 70, 75, 69, 69, 43
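The numbers in this report are consistent with a simple additive reading: the projected activity is the biophore constant plus the listed modulator and descriptor contributions. The sketch below checks that arithmetic. The function name and the dictionary layout are assumptions made for this example, and the additive form is inferred from the printed values, not taken from a documented formula. The same reading is applied to the earlier o-nitroaniline report, taking 40.0 as the constant and -1.5 as the LogP contribution.

```python
from typing import Dict


def projected_activity(constant: float, contributions: Dict[str, float]) -> float:
    """Sum the biophore constant and all listed contributions (CASE units)."""
    return constant + sum(contributions.values())


# Male-rat carcinogenicity example (values from the report)
male_rat = projected_activity(
    123.19,
    {
        "modulator OH -c =": -13.37,
        "modulator CO -c. =": -11.89,
        "electronegativity": 1.82,
        "hard/soft index": -44.05,
    },
)
print(f"projected activity: {male_rat:.2f}")  # report prints 55.71

# o-Nitroaniline chromosomal aberration example
o_nitroaniline = projected_activity(40.0, {"LogP": -1.5})
print(f"projected activity: {o_nitroaniline:.1f}")  # report prints 39

# Average activity of the five molecules that contain the biophore
activities = [70, 75, 69, 69, 43]
mean_activity = sum(activities) / len(activities)
print(f"average activity: {mean_activity:.1f}")  # report prints 65
```

The small difference between the recomputed sum and the printed 55.71 is what one would expect if the program adds unrounded contributions and rounds only for display.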

List of molecules

[Figure: molecule structures with the biophore atoms (.c, c, cH, .c, O) labelled]

Example of molecules seen as being outside the domain of validity of the module

Acefurtiamine

AZ2- MUTAGENS - public domain and pharma #2892 1.56
-----------------------------------------------------------------------------
MC calculated Water Solubility is: 0.07 [in log(mol/m**3)]
MC calculated Log(Water/Octanol) Partition Coef.is: 0.90
Molecule satisfies the rule of 5 (bioavailable)
MC Calculated Human Intestinal Absorption is: 93.7
The molecule is a detergent

** WARNING ** The following functionalities are UNKNOWN to me:
*** S -CO -c =
*** CO -S -C =
*** COH-N -C =

MULTICASE-3 Prediction
---------------------
** The molecule does not contain any known Biophore
*** The probability that this molecule is mutagenic is 21%
** The results are QUESTIONABLE due to the presence of UNKNOWN functionalities **

CONCLUSIONS:
-----------
** The results are INCONCLUSIVE

ICSAS METHOD CONCLUSIONS:
------------------------
ICSAS Method Expert Call: Negative
Coverage: 3w
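The domain-of-validity warning can be pictured as a set comparison: any fragment of the test molecule that never occurs in the learning set is reported as unknown, and the prediction is then flagged. The sketch below is a minimal version of that idea. The function names, the decision rule in `overall_call` and the contents of `learning_set_fragments` are hypothetical; only the three unknown fragments come from the report. It mirrors the program's INCONCLUSIVE conclusion, not the separate ICSAS expert call.

```python
from typing import Iterable, List, Set


def unknown_functionalities(test_fragments: Iterable[str],
                            learning_set_fragments: Set[str]) -> List[str]:
    """Fragments of the test molecule that the learning set never saw."""
    return sorted(f for f in set(test_fragments)
                  if f not in learning_set_fragments)


def overall_call(has_biophore: bool, unknown: List[str]) -> str:
    """Assumed rule: any unknown functionality makes the result inconclusive."""
    if unknown:
        return "INCONCLUSIVE"
    return "ACTIVE" if has_biophore else "INACTIVE"


# Hypothetical fragments known from the learning set
learning_set_fragments = {"c =c", "CH2-CH2", "N -CH2"}

# Fragments of the test molecule; the last three are the ones the report flags
test_fragments = ["c =c", "N -CH2", "S -CO -c =", "CO -S -C =", "COH-N -C ="]

unknown = unknown_functionalities(test_fragments, learning_set_fragments)
call = overall_call(has_biophore=False, unknown=unknown)

for fragment in unknown:
    print("*** UNKNOWN:", fragment)
print("Result:", call)
```

Without the unknown fragments the same molecule would simply be called inactive for lack of a biophore, which is exactly the situation the warning is meant to prevent from passing silently.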

Improving AMES mutagenicity model

[Chart: % Concordance, % Sensitivity, % Specificity and % Coverage (vertical axis 70% to 100%) for modules A2I, A2H and AZ2, with the numbers of Inactives, Actives and Total molecules. Counts as extracted: 925 1097 1116 1211 1424 1652 2180 2541 2768]

A2I - Ames Salmonella: GENETOX + NTP + FDA (original)
A2H - Ames Salmonella: GENETOX + NTP + FDA +Zeiger (rebuilt)
AZ2 - Ames Salmonella: Public domain and propr. pharmaceuticals
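The four measures named on this chart are usually computed from a validation run as shown in the sketch below. The slides do not define them, so the formulas here are the standard ones and the way coverage is counted (molecules that received a call divided by all molecules tested) is an assumption. The counts in the example are hypothetical and are not taken from the chart.

```python
from typing import Dict


def performance(tp: int, fn: int, tn: int, fp: int, not_covered: int) -> Dict[str, float]:
    """Validation statistics in percent.

    tp/fn: experimentally active molecules predicted active/inactive
    tn/fp: experimentally inactive molecules predicted inactive/active
    not_covered: molecules for which no call could be made
    """
    predicted = tp + fn + tn + fp
    total = predicted + not_covered
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "concordance": 100.0 * (tp + tn) / predicted,
        "coverage": 100.0 * predicted / total,
    }


# Hypothetical validation counts, for illustration only
stats = performance(tp=80, fn=20, tn=90, fp=10, not_covered=50)
for name, value in stats.items():
    print(f"{name:<12} {value:5.1f}%")
```

Reporting coverage next to concordance matters because a model can raise its apparent accuracy simply by declining to predict the difficult molecules.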

I-Case, The Next Generation of the MultiCase Program

Why do we need a new generation of the program?

• There is a need for the capability to generate expert system modules from larger databases. Currently the largest database we can handle is 8000 molecules.

• Most computers will have multiple-core processors in the future. The current MULTICASE program does not support multi-core computation. We can achieve significant performance benefits if we support multi-core processing.

• Biological properties are being reported for chemicals that contain elements outside the organic set (C, H, N, S, O, P, Cl, Br, F, I). We needed support for other elements as well.

• We needed enhancements for quantitative prediction of activity, mainly for databases with continuous activity.

• We needed enhancements so that the program can generate customized reports when chemicals are tested for activity.

What is I-CASE? A new program designed to replace MULTICASE

• Support for building models using larger databases (on the order of 50,000 molecules and larger), which enabled us to build models for anticancer activity from the NCI-60 cancer screenings.

• Support for computers with multiple-core processors. The result is a significant increase in speed for building models, running cross validations and testing molecules.

• A totally revamped user interface that supports multitasking. Now we can build several models simultaneously, run several validations simultaneously and run tests simultaneously. The interface has several new features as well.

• The new program now supports all the elements of the full periodic table.

• Several new continuous descriptors have been added for building local QSARs for each biophore, e.g. Molar Refractivity, Vapor Pressure, E-State descriptors. The result is an improvement in quantitative predictions.

• Improvements in the models for calculating water solubility, human intestinal absorption and logP. Addition of several pharmacokinetic models is in progress, e.g. blood brain barrier permeation, plasma protein binding etc.

• New features have been added to the interface for generating customized test reports for chemicals.

EPA Against FDA

• FDA data is based on per unit molecule (per mole) while the majority of EPA data is based on bulk (per milligram or per gram).

• Toxicity is generally considered to be the result of the presence of certain molecular structural features; it is not a bulk property. Therefore molecules declared INACTIVE in EPA testing results can still be toxic at the molecular level, whereas ACTIVE molecules in EPA data can be considered active.

• FDA data is suitable for QSAR modeling whereas a certain portion of EPA data cannot be used in developing QSAR models.

• Most of the time EPA data cannot be transformed to the per mole type because the reported activity is of the ACTIVE/INACTIVE type.

• FDA data makes sense from a QSAR point of view while EPA data makes sense from a practical point of view.

• Mixing FDA and EPA data in a single model results in the deterioration of the model quality.
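The per-mole versus per-mass distinction can be made concrete with a unit conversion: a dose in mg/kg divided by the molecular weight in g/mol gives the dose in mmol/kg. The sketch below uses two hypothetical compounds and a hypothetical dose, chosen only to show how the same bulk dose corresponds to very different numbers of molecules.

```python
def mg_to_mmol(dose_mg_per_kg: float, mol_weight: float) -> float:
    """Convert a mass-based dose (mg/kg) to a molar dose (mmol/kg).

    mol_weight is in g/mol, so mg / (g/mol) = mmol.
    """
    return dose_mg_per_kg / mol_weight


# Hypothetical compounds tested at the same bulk dose of 50 mg/kg
dose = 50.0
light = mg_to_mmol(dose, mol_weight=100.0)   # small molecule
heavy = mg_to_mmol(dose, mol_weight=500.0)   # large molecule

print(f"light compound: {light:.2f} mmol/kg")
print(f"heavy compound: {heavy:.2f} mmol/kg")
print(f"ratio of molecules delivered: {light / heavy:.1f}x")
```

A heavy compound reported INACTIVE at a fixed bulk dose has been tested at a fraction of the molar dose of a light one, which is the reason given above for treating such inactives with caution. When only an ACTIVE/INACTIVE label is reported, this conversion cannot be applied after the fact.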