Computational Discovery of Communicable Knowledge


Inducing Process Models from Continuous Data

Pat Langley

Institute for the Study of Learning and Expertise

Javier Sanchez

CSLI / Stanford University

Ljupco Todorovski, Saso Dzeroski

Jozef Stefan Institute

Supported by NTT Communication Science Laboratories, by Grant NCC 2-1220 from NASA Ames Research Center, and by EU Grant IST-2000-26469.

Exploratory Research in Machine Learning

Dietterich (1990) claims an exploratory research report should:

• define a challenging new problem for machine learning;
• show that established methods cannot solve the problem;
• present an initial approach that addresses the new task; and
• outline an agenda for future research efforts in the area.

In this talk, we explore the problem of inducing process models from continuous data.

Inductive Process Modeling

[Diagram: training data and background knowledge feed into Induction, which produces learned knowledge.]

background knowledge

process exponential_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] × P

process logistic_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] × P × (1 - P / [0, 1, ∞])

process constant_inflow
  variables: I {inorganic_nutrient}
  equations: d[I,t] = [0, 1, ∞]

process consumption
  variables: P1 {population}, P2 {population}, nutrient_P2
  equations: d[P1,t] = [0, 1, ∞] × P1 × nutrient_P2,
             d[P2,t] = -[0, 1, ∞] × P1 × nutrient_P2

process no_saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P

process saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P / (P + [0, 1, ∞])

learned knowledge

model AquaticEcosystem
  variables: nitro, phyto, zoo, nutrient_nitro, nutrient_phyto
  observables: nitro, phyto, zoo

process phyto_exponential_growth
  equations: d[phyto,t] = 0.1 × phyto

process zoo_logistic_growth
  equations: d[zoo,t] = 0.1 × zoo / (1 - zoo / 1.5)

process phyto_nitro_consumption
  equations: d[nitro,t] = -1 × phyto × nutrient_nitro,
             d[phyto,t] = 1 × phyto × nutrient_nitro

process phyto_nitro_no_saturation
  equations: nutrient_nitro = nitro

process zoo_phyto_consumption
  equations: d[phyto,t] = -1 × zoo × nutrient_phyto,
             d[zoo,t] = 1 × zoo × nutrient_phyto

process zoo_phyto_saturation
  equations: nutrient_phyto = phyto / (phyto + 0.5)

Inductive Process Modeling

training data: Observed values for a set of continuous variables as they vary over time or situations

background knowledge: Generic processes that characterize causal relationships among variables in terms of conditional equations

Induction

learned model: A specific process model that explains the observed values and predicts future data accurately

A Process Model of an Ice-Water System

model WaterPhaseChange
  variables: temp, heat, ice_mass, water_mass
  observables: temp, heat, ice_mass, water_mass

process ice-warming
  conditions: ice_mass > 0, temp < 0
  equations: d[temp,t] = heat / (0.00206 × ice_mass)

process ice-melting
  conditions: ice_mass > 0, temp == 0
  equations: d[ice_mass,t] = -(18 × heat) / 6.02,
             d[water_mass,t] = (18 × heat) / 6.02

process water-warming
  conditions: ice_mass == 0, water_mass > 0, temp >= 0, temp < 100
  equations: d[temp,t] = heat / (0.004184 × water_mass)

[Plot: temp, ice_mass, and water_mass over Time.]
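The role of the conditions in this model can be illustrated with a minimal Python sketch. It is not the authors' simulator: the forward Euler update, the clamping of temp at 0 and 100 and of ice_mass at 0 (needed because a finite step would otherwise overshoot the exact conditions temp == 0 and ice_mass == 0), the assumption that heat is non-negative, and the initial values are all choices made here for illustration. The constants are those shown on the slide.

```python
def step(state, heat, dt):
    """Advance the ice-water state by one forward Euler step of size dt.

    Assumes heat >= 0. Clamping at the phase boundaries is an assumption
    of this sketch, not part of the slide's model.
    """
    temp = state["temp"]
    ice = state["ice_mass"]
    water = state["water_mass"]

    if ice > 0 and temp < 0:
        # process ice-warming: stop at 0 so ice-melting can become active
        temp = min(temp + dt * heat / (0.00206 * ice), 0.0)
    elif ice > 0 and temp == 0:
        # process ice-melting: ice lost equals water gained
        melted = min(dt * (18 * heat) / 6.02, ice)
        ice -= melted
        water += melted
    elif ice == 0 and water > 0 and 0 <= temp < 100:
        # process water-warming: stop at 100, where no process applies
        temp = min(temp + dt * heat / (0.004184 * water), 100.0)

    return {"temp": temp, "ice_mass": ice, "water_mass": water}


# Hypothetical initial conditions and constant heat input.
state = {"temp": -10.0, "ice_mass": 100.0, "water_mass": 0.0}
for _ in range(60):
    state = step(state, heat=1.0, dt=1.0)
```

Because the three sets of conditions are mutually exclusive, exactly one process is active at a time, so the trajectory passes through warming ice, melting at constant temperature, and warming water.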

Why Are Process Models Interesting?

Process models are a crucial target for machine learning because:

• they incorporate scientific formalisms rather than AI notations;
• that are easily communicable to scientists and engineers;
• they move beyond descriptive generalization to explanation;
• while retaining the modularity needed to support induction.

These reasons point to process models as an ideal representation for scientific and engineering knowledge.

Process models are an important alternative to formalisms used currently in machine learning.

Challenges of Inductive Process Modeling

Process model induction differs from typical learning tasks in that:

• process models characterize behavior of dynamical systems;
• variables are mainly continuous and data are unsupervised;
• observations are not independently and identically distributed;
• process models contain unobservable processes and variables;
• multiple processes can interact to produce complex behavior.

Compensating factors include a focus on deterministic systems and the availability of background knowledge.

Can Existing Methods Induce Process Models?

[Figure: small examples of what five established methods produce.]

• regression trees (tests such as B>6, C>0, C>4 with numeric leaf values)
• explanation-based learning
• equation discovery (e.g., d[ice_mass,t] = -(18 × heat) / 6.02, d[water_mass,t] = (18 × heat) / 6.02)
• hidden Markov models (states with numeric output parameters and transition probabilities)
• inductive logic programming (e.g., clauses defining gcd(X,Y,D))

Facets of Inductive Process Modeling

To describe a system that learns process models, we must specify:

• characteristics of the data (observations to be explained);
• a representation for background knowledge (generic processes);
• a representation for learned knowledge (process models);
• a performance element that makes predictions (a simulator);
• a learning method that induces process models.

We will use an example from population dynamics to illustrate an initial approach to inductive process modeling.

Data for an Aquatic Ecosystem

Generic Processes for Population Dynamics

process exponential_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] × P

process exponential_decay
  variables: P {population}
  equations: d[P,t] = -[0, 1, ∞] × P

process logistic_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] × P × (1 - P / [0, 1, ∞])

process constant_inflow
  variables: I {inorganic_nutrient}
  equations: d[I,t] = [0, 1, ∞]

process consumption
  variables: P1 {population}, P2 {population}, nutrient_P2 {number}
  equations: d[P1,t] = [0, 1, ∞] × P1 × nutrient_P2,
             d[P2,t] = -[0, 1, ∞] × P1 × nutrient_P2

process no_saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P

process saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P / (P + [0, 1, ∞])

Process Model for an Aquatic Ecosystem

model AquaticEcosystem
  variables: nitro, phyto, zoo, nutrient_nitro, nutrient_phyto
  observables: nitro, phyto, zoo

process phyto_exponential_growth
  equations: d[phyto,t] = 0.1 × phyto

process zoo_logistic_growth
  equations: d[zoo,t] = 0.1 × zoo / (1 - zoo / 1.5)

process phyto_nitro_consumption
  equations: d[nitro,t] = -1 × phyto × nutrient_nitro,
             d[phyto,t] = 1 × phyto × nutrient_nitro

process phyto_nitro_no_saturation
  equations: nutrient_nitro = nitro

process zoo_phyto_consumption
  equations: d[phyto,t] = -1 × zoo × nutrient_phyto,
             d[zoo,t] = 1 × zoo × nutrient_phyto

process zoo_phyto_saturation
  equations: nutrient_phyto = phyto / (phyto + 0.5)

Making Predictions with Process Models

• Specify initial values for input variables and the size for time steps
• On each time step, check conditions to decide which processes are active
• Solve algebraic and differential equations with known values
• Propagate values and recurse to solve other equations
• Add the effects of different processes on each variable
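The steps above can be sketched in Python as follows. This is an illustrative simplification rather than the simulator used in the talk: the `Process` class, the forward Euler update, and the step size are assumptions, and algebraic equations are assumed to depend only on state variables, so a single pass replaces the recursive propagation. The example uses three processes from the aquatic ecosystem model to show two processes adding their effects on phyto.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, float]


@dataclass
class Process:
    """Hypothetical container for one process of a model."""
    name: str
    # variable -> contribution of this process to d[variable,t]
    derivatives: Dict[str, Callable[[State], float]] = field(default_factory=dict)
    # variable -> algebraic definition of that variable
    algebraic: Dict[str, Callable[[State], float]] = field(default_factory=dict)
    # the process is active only when every condition holds
    conditions: List[Callable[[State], bool]] = field(default_factory=list)


def step(processes, state, dt):
    """One forward Euler step over the currently active processes."""
    active = [p for p in processes if all(c(state) for c in p.conditions)]

    # Solve algebraic equations first (single pass; see assumptions above).
    values = dict(state)
    for p in active:
        for var, equation in p.algebraic.items():
            values[var] = equation(values)

    # Sum the effects of all active processes on each variable.
    rates = {}
    for p in active:
        for var, equation in p.derivatives.items():
            rates[var] = rates.get(var, 0.0) + equation(values)

    new_state = dict(state)
    for var, rate in rates.items():
        new_state[var] = state[var] + dt * rate
    return new_state


def simulate(processes, initial, dt, n_steps):
    trajectory = [dict(initial)]
    for _ in range(n_steps):
        trajectory.append(step(processes, trajectory[-1], dt))
    return trajectory


processes = [
    Process("phyto_exponential_growth",
            derivatives={"phyto": lambda s: 0.1 * s["phyto"]}),
    Process("phyto_nitro_no_saturation",
            algebraic={"nutrient_nitro": lambda s: s["nitro"]}),
    Process("phyto_nitro_consumption",
            derivatives={
                "nitro": lambda s: -1 * s["phyto"] * s["nutrient_nitro"],
                "phyto": lambda s: 1 * s["phyto"] * s["nutrient_nitro"],
            }),
]
trajectory = simulate(processes, {"nitro": 1.0, "phyto": 0.01}, dt=0.1, n_steps=100)
```

Keeping each process as a separate object mirrors the modularity of the representation: adding or removing a process changes the model without touching the integration loop.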

The IPM Method for Process Model Induction

• Find all ways to instantiate known generic processes with specific variables
• Combine subsets of instantiated processes into generic models
• Remove candidates that are too complex or not connected graphs
• For each generic model, search for good parameter values
• Return parameterized model with the smallest error
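The first three steps (instantiation, combination, and filtering) can be sketched in Python. The encoding below is hypothetical: the slot types given to each generic process, the treatment of `number` as matching any variable, the variable types, and the size limit standing in for "too complex" are assumptions made for illustration, not IPM's actual data structures.

```python
from itertools import combinations, permutations

# Hypothetical encoding: generic process name -> types of its variable slots.
GENERIC_PROCESSES = {
    "exponential_growth": ("population",),
    "constant_inflow": ("inorganic_nutrient",),
    "consumption": ("population", "number"),
}

# Assumed types for the observed variables.
VARIABLE_TYPES = {
    "nitro": "inorganic_nutrient",
    "phyto": "population",
    "zoo": "population",
}


def matches(variable, slot_type):
    # Assumption: "number" acts as a supertype that any variable satisfies.
    return slot_type == "number" or VARIABLE_TYPES[variable] == slot_type


def instantiate_all():
    """Bind every generic process to every type-compatible tuple of variables."""
    instances = []
    for name, slots in GENERIC_PROCESSES.items():
        for combo in permutations(VARIABLE_TYPES, len(slots)):
            if all(matches(v, s) for v, s in zip(combo, slots)):
                instances.append((name, combo))
    return instances


def is_connected(candidate):
    """Variables are nodes; each process links all variables it mentions."""
    variables = {v for _, bound in candidate for v in bound}
    if not variables:
        return False
    reached, frontier = set(), [next(iter(variables))]
    while frontier:
        v = frontier.pop()
        if v in reached:
            continue
        reached.add(v)
        for _, bound in candidate:
            if v in bound:
                frontier.extend(bound)
    return reached == variables


def candidate_structures(max_processes):
    """Yield connected subsets of instantiated processes up to a size limit."""
    instances = instantiate_all()
    for size in range(1, max_processes + 1):
        for subset in combinations(instances, size):
            if is_connected(subset):
                yield subset
```

Each yielded structure would then be handed to a parameter search, which is where most of the computation goes.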

Initial Evaluation of IPM Algorithm

To demonstrate IPM's functionality at inducing process models, we ran it on synthetic data for a known system.

1. We used the aquatic ecosystem model to generate data for 100 time steps, setting nitrogen = 1.0, phyto = 0.01, zoo = 0.01;
2. We replaced each ‘true’ value x with x × (1 + r × 0.05), where r came from a Gaussian distribution (μ = 0 and σ = 1);
3. We ran IPM on these noisy data, giving it type constraints and generic processes as background knowledge.
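The noise step can be written as a short Python sketch, reading it as replacing each value x with x × (1 + r × 0.05), where r is drawn from a standard Gaussian. The function name, the `seed` parameter, and the sample values are introduced here for illustration only.

```python
import random


def add_relative_noise(values, scale=0.05, seed=None):
    """Replace each x with x * (1 + r * scale), where r ~ N(0, 1)."""
    rng = random.Random(seed)
    return [x * (1.0 + rng.gauss(0.0, 1.0) * scale) for x in values]


# Hypothetical clean values; the seed only makes the example repeatable.
clean = [1.0, 0.01, 0.01]
noisy = add_relative_noise(clean, seed=0)
```

Because the noise is multiplicative, its magnitude scales with each value, so small populations receive proportionally small perturbations.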

The IPM algorithm examined a space of 2196 generic models, each with an embedded parameter optimization.

Predictions from IPM’s Induced Model

Process Model Generated by IPM

model AquaticEcosystem
  variables: nitro, phyto, zoo, nutrient_nitro_1, nutrient_nitro_2, nutrient_phyto
  observables: nitro, phyto, zoo

process phyto_exponential_growth
  equations: d[phyto,t] = 0.089 × phyto

process zoo_logistic_growth
  equations: d[zoo,t] = 0.013 × zoo / (1 - zoo / 0.469)

process phyto_nitro_consumption
  equations: d[nitro,t] = -1.174 × phyto × nutrient_nitro_1,
             d[phyto,t] = 1.058 × phyto × nutrient_nitro_1

process phyto_nitro_no_saturation
  equations: nutrient_nitro_1 = nitro

process zoo_phyto_consumption
  equations: d[phyto,t] = -0.986 × zoo × nutrient_phyto,
             d[zoo,t] = 1.089 × zoo × nutrient_phyto

process zoo_phyto_saturation
  equations: nutrient_phyto = phyto / (phyto + 0.487)

Process Model Generated by IPM (continued)

process nitro_constant_inflow
  equations: d[nitro,t] = 0.067

process zoo_nitro_consumption
  equations: d[nitro,t] = -0.470 × zoo × nutrient_nitro_2,
             d[zoo,t] = 1.089 × zoo × nutrient_nitro_2

process zoo_nitro_saturation
  equations: nutrient_nitro_2 = nitro / (nitro + 0.020)

These extra processes complicate the model but have little effect on its behavior or its predictive accuracy.

A Proposed Research Agenda

Future research on process modeling should explore methods that:

• reduce variance and overfitting (e.g., through pruning);
• determine the conditions on processes from training data;
• associate variables with physical entities to constrain search;
• use a taxonomy of process types to organize and limit search;
• use knowledge of dimensions and conservation to limit search;
• support the induction of qualitative process models; and
• revise existing process models rather than construct them.

This work should draw on traditional induction methods, which have many relevant ideas.

Evaluation of Process Models

Research on this new class of problems should follow the accepted standards; thus, papers should:

• make explicit claims about an induction method's abilities;
• support these claims with experimental or theoretical evidence;
• study behavior on natural data sets to ensure relevance;
• utilize synthetic data sets to vary dimensions of interest; and
• incorporate ideas from other tasks and utilize existing methods whenever sensible.

In addition, the focus on communicability and use of background knowledge suggests collaborations with domain experts.

Concluding Remarks

In this exploratory research contribution, we have:

• proposed a new problem that involves induction of process models from components to explain observations;
• argued that this task does not lend itself to established methods;
• proposed a formalism for models and background knowledge;
• presented an initial system that induces such process models;
• demonstrated its functionality in a population dynamics domain; and
• outlined an agenda for future research in this new area.

Process model induction has great potential to aid development of models in science and engineering.

In Memoriam

Early last year, computational scientific discovery lost two of its founding fathers:

• Herbert A. Simon (1916 – 2001)
• Jan M. Zytkow (1945 – 2001)

Both contributed to the field in many ways: posing new problems, inventing methods, training students, and organizing meetings.

Moreover, both were interdisciplinary researchers who contributed to computer science, psychology, philosophy, and statistics.

Herb Simon and Jan Zytkow were excellent role models that we should all aim to emulate.

The LaGramge Discovery System

Our approach to inductive process modeling builds on LaGramge (Todorovski & Dzeroski, 1997), a discovery system that:

• specifies a space of abstract numeric equations in terms of a context-free grammar;
• searches exhaustively through this space, to a given depth, to generate candidate abstract equations;
• calls on established optimization techniques to determine the parameters for each equation; and
• uses either squared error or minimum description length to select its final equations.

LaGramge has rediscovered an impressive class of differential and algebraic equations from noisy data.
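The idea of searching a grammar-defined equation space to a given depth can be illustrated with a toy Python sketch. The grammar, the `const` placeholder for a parameter to be fitted later, and the depth convention are assumptions chosen for illustration; they do not reproduce LaGramge's actual grammar format or search procedure.

```python
from itertools import product

# Toy context-free grammar for abstract equations over variables x and y.
# "const" stands for a numeric parameter whose value is fitted afterwards.
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["const", "*", "V"], ["V"]],
    "V": [["x"], ["y"]],
}


def expand(symbol, depth):
    """Return every token list derivable from symbol within `depth`
    nested rule applications (terminals cost nothing)."""
    if symbol not in GRAMMAR:
        return [[symbol]]          # terminal symbol
    if depth == 0:
        return []                  # nonterminal, but no budget left
    results = []
    for production in GRAMMAR[symbol]:
        parts = [expand(s, depth - 1) for s in production]
        for combo in product(*parts):
            results.append([token for part in combo for token in part])
    return results


equations = [" ".join(tokens) for tokens in expand("E", 4)]
```

Raising the depth bound enlarges the space quickly, which is why an exhaustive search needs such a limit.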

Making Predictions with Process Models

To simulate a given process model’s behavior over time, we can:

• specify initial values for input variables and time step size;
• on each time step, determine which processes are active;
• solve active algebraic/differential equations with known values;
• propagate values and recursively solve other active equations;
• when multiple processes influence the same variable, assume their effects are additive.

This performance element makes specific predictions that we can compare to observations.

A Method for Process Model Induction

We have implemented IPM, an algorithm that constructs process models from generic components in four stages:

1. Find all ways to instantiate known generic processes with specific variables;
2. Combine subsets of instantiated processes into generic models, each specifying an explanatory structure;
   2a. Ensure that each candidate consists of a connected graph;
   2b. Limit the maximum number of processes that can connect any two variables and the total number of processes;
3. Translate the candidate into a context-free grammar and invoke LaGramge to search for good parameter values;
4. Return the model with the least error produced by LaGramge.
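Stages 3 and 4 amount to fitting parameters for each candidate structure and keeping the one with the least error. The Python sketch below shows that selection loop under strong simplifications: a plain grid search stands in for LaGramge's optimization, each candidate has a single parameter and a single variable, and the candidate set, grid, and synthetic data are hypothetical.

```python
def simulate(rate, p0, dt, n_steps):
    """Forward Euler integration of dP/dt = rate(P), starting from p0."""
    values = [p0]
    for _ in range(n_steps):
        values.append(values[-1] + dt * rate(values[-1]))
    return values


# Hypothetical candidate structures, each with one free parameter a.
CANDIDATES = {
    "exponential_growth": lambda a: (lambda p: a * p),
    "constant_inflow": lambda a: (lambda p: a),
}


def squared_error(predicted, observed):
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def fit(observed, dt):
    """Return (error, structure name, parameter) with the least squared error."""
    grid = [i / 100 for i in range(101)]   # a in {0.00, 0.01, ..., 1.00}
    best = None
    for name, make_rate in CANDIDATES.items():
        for a in grid:
            predicted = simulate(make_rate(a), observed[0], dt, len(observed) - 1)
            error = squared_error(predicted, observed)
            if best is None or error < best[0]:
                best = (error, name, a)
    return best


# Synthetic, noise-free observations generated by exponential growth.
observed = simulate(lambda p: 0.1 * p, 1.0, 1.0, 20)
error, name, a = fit(observed, dt=1.0)
```

With noise-free data the generating structure is recovered exactly; on noisy data the error of the best model would be positive and extra processes could be selected.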