Dynamic Energy Budget theory


Basic Methods in Theoretical Biology
1 Methodology
2 Mathematical toolkit
3 Models for processes
4 Model-based statistics
http://www.bio.vu.nl/thb/course/tb/tb.pdf
Empirical cycle 1.1
Assumptions summarize insight 1.1
• task of research: make all assumptions explicit
these should fully specify subsequent model formulations
• assumptions: interface between experimentalist and theoretician
• discrepancy between model predictions and measurements:
identify which assumption needs replacement
• models that give wrong predictions can be very useful
to increase insight
• structure the list of assumptions according to replaceability (mind consistency!)
Model: definition & aims 1.1
• model:
scientific statement in mathematical language
“all models are wrong, some are useful”
• aims:
structuring thought;
the single most useful property of models:
“a model is not more than you put into it”
how do factors interact? (mechanisms/consequences)
design of experiments, interpretation of results
inter-, extra-polation (prediction)
decision/management (risk analysis)
• observations/measurements:
require interpretation, so involve assumptions
best strategy: be as explicit as possible in assumptions
Model properties 1.1
• language errors:
mathematical, dimensions, conservation laws
• properties:
generic (with respect to application)
realistic (precision; consistency with data)
simple (math. analysis, aid in thinking)
complex models are easy to make, difficult to test
simple models that capture essence are difficult to make
plasticity in parameters (support, testability)
• ideals:
assumptions for mechanisms (coherence, consistency)
distinction action variables vs measured quantities
need for core and auxiliary theory
Presumptions to Laws 1.1
Laws, Theories, Hypotheses, Presumptions: decrease in demonstrated support
amount of support is always limited
Proofs only exist in mathematics
Role of abstract concepts:
0: “facts”, no predictions possible
large: “general theories”, predictions possible
Theories and Models 1.1
Theory: set of coherent and consistent assumptions
from which models can be derived for particular situations
Models may or may not represent theories
it depends on the assumptions on which they are based
If a model itself is the assumption, it is only a description;
if it is inconsistent with data and must be rejected, you have nothing left
If a model that represents a theory must be rejected,
a systematic search can start for the assumptions that need replacement
Unrealistic models can be very useful
in guiding research to improve assumptions (= insight)
Many models don’t need to be tested against data
because they fail more important consistency tests
Testability of models/theories comes in gradations
Auxiliary theory 1.1
Quantities that are easy to measure (e.g. respiration, body weight)
have contributions from several processes
→ they are not suitable as variables in explanatory models
Variables in explanatory models are not directly measurable
→ we need auxiliary theory to link core theory to measurements
Standard DEB model:
isomorph with 1 reserve & 1 structure that feeds on 1 type of food
Measurements typically involve interpretations, models 1.1
Given:
“the air temperature in this room is 19 degrees Celsius”
Used equipment: mercury thermometer
Assumption: the room has a temperature (spatially homogeneous)
Actual measurement: height of mercury column
Height of the mercury column → temperature: model!
How realistic is this model?
What if the temperature is changing?
Task: make assumptions explicit and be aware of them
Question: what is calibration?
Complex models 1.1
• hardly contribute to insight
• hardly allow parameter estimation
• hardly allow falsification
Avoid complexity by
• delineating modules
• linking modules in simple ways
• estimating parameters of modules only
Causation 1.1
Cause and effect sequences can work in chains
A → B → C
But are problematic in networks of A, B and C
The framework of dynamic systems allows
for a holistic approach
Dimension rules 1.2
• quantities left and right of = must have equal dimensions
• + and – only defined for quantities with same dimension
• ratios of variables with similar dimensions are only dimensionless if
addition of these variables has a meaning within the model context
• never apply transcendental functions (log, exp, sin, …) to quantities with a dimension
What about pH, and pH1 – pH2?
• don’t replace parameters by their values in model representations
y(x) = a x + b, with a = 0.2 M^-1, b = 5 → y(x) = 0.2 x + 5
What dimensions do y and x have? Distinguish dimensions and units!
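The pH question can be explored numerically. The following is a minimal sketch, assuming that pH is taken as −log10 of a proton concentration divided by a reference concentration; the function name `ph` and the example concentrations are hypothetical. It shows that the log of a bare number changes with the unit, while pH1 – pH2 is the log of a ratio and does not.

```python
import math

def ph(conc, reference=1.0):
    """pH as -log10 of the dimensionless ratio conc/reference.

    conc and reference must be expressed in the same unit
    (an assumption of this sketch).
    """
    return -math.log10(conc / reference)

# Two hypothetical proton concentrations, in mol/l and again in mmol/l.
c1_M, c2_M = 1e-7, 1e-5
c1_mM, c2_mM = c1_M * 1e3, c2_M * 1e3

# Log of the bare number: the result depends on the unit that was used.
naive_M = -math.log10(c1_M)
naive_mM = -math.log10(c1_mM)

# Difference of two such logs: the log of a ratio, so the unit cancels.
diff_M = ph(c1_M) - ph(c2_M)
diff_mM = ph(c1_mM) - ph(c2_mM)

print("log of bare number:", naive_M, naive_mM)
print("difference:", diff_M, diff_mM)
```

The single value only has a meaning once the reference concentration is fixed; the difference needs no such choice.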
Models with dimension problems 1.2
• Allometric model: y = a W^b
y: some quantity
a: proportionality constant
W: body weight
b: allometric parameter in (2/3, 1)
Usual form: ln y = ln a + b ln W
Alternative form: y = y0 (W/W0)^b, with y0 = a W0^b
Alternative model: y = a L^2 + b L^3, where L ∝ W^(1/3)
• Freundlich’s model: C = k c^(1/n)
C: density of compound in soil
k: proportionality constant
c: concentration in liquid
n: parameter in (1.4, 5)
Alternative form: C = C0 (c/c0)^(1/n), with C0 = k c0^(1/n)
Alternative model: C = 2 C0 c (c0 + c)^-1 (Langmuir’s model)
Problem: no natural reference values W0, c0
Values of y0, C0 depend on the arbitrary choice
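The dimension problem of the allometric model can be made concrete with a small sketch. All numbers below are hypothetical and serve only as an illustration: the same curve needs a different value of a when W is expressed in another unit (the conversion factor depends on b), and the value of y0 in the alternative form depends on the arbitrary reference weight W0.

```python
import math

def allometric(W, a, b):
    """y = a W^b."""
    return a * W ** b

# Hypothetical parameter values; b lies in (2/3, 1).
a_g, b = 2.0, 0.75            # a belongs to W expressed in g
a_kg = a_g * 1000.0 ** b      # same curve with W expressed in kg

W_g = 250.0
W_kg = W_g / 1000.0
y_from_g = allometric(W_g, a_g, b)
y_from_kg = allometric(W_kg, a_kg, b)

def reference_form(W, W0, a, b):
    """Alternative form y = y0 (W/W0)^b with y0 = a W0^b; returns (y0, y)."""
    y0 = a * W0 ** b
    return y0, y0 * (W / W0) ** b

# Two arbitrary reference weights (in g): same prediction, different y0.
y0_small, y_small = reference_form(W_g, 1.0, a_g, b)
y0_large, y_large = reference_form(W_g, 100.0, a_g, b)

print("a per unit:", a_g, a_kg)
print("y0 per reference:", y0_small, y0_large)
print("predictions:", y_from_g, y_from_kg, y_small, y_large)
```

The predictions agree in all four cases, but neither a nor y0 can be interpreted without knowing the unit or the reference that was chosen.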
Egg development time 1.2
D = exp(3.3956 + 0.2193 ln(T) − 0.3414 (ln(T))^2)
D = exp(a + b ln(T) − c (ln(T))^2)
dim(a) = ln t
dim(b) = ln t / ln K
dim(c) = ln t / (ln K)^2
D: egg development time
T: temperature in Kelvin
Bottrell, H. H., Duncan, A., Gliwicz, Z. M., Grygierek, E., Herzig, A.,
Hillbricht-Ilkowska, A., Kurasawa, H., Larsson, P., Weglenska, T. 1976.
A review of some problems in zooplankton production studies.
Norw. J. Zool. 24: 419-456
Space-time scales 1.3
Each process has its characteristic domain of space-time scales
Figure: space against time, with the levels system earth, ecosystem, population, individual, cell, molecule
When changing the space-time scale,
new processes will become important
others will become less important
Models with many variables & parameters
hardly contribute to insight
Problematic research areas 1.3
Small time scale combined with large spatial scale
Large time scale combined with small spatial scale
Reason: likely to involve models with
large number of variables and parameters
Such models rarely contribute to new insight
due to uncertainties in formulation and parameter values
Different models can fit equally well 1.5
Figure: O2 consumption (μl/h) against length (mm), with two fitted curves:
a L^2 + b L^3
with a = 0.0336 μl h^-1 mm^-2
b = 0.01845 μl h^-1 mm^-3
a L^b
with a = 0.0156 μl h^-1 mm^-2.437
b = 2.437
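How closely a power curve can follow a sum of a surface and a volume term can be checked with a small sketch. It takes the first curve with the parameter values quoted above, evaluates it on a hypothetical length range of 1 to 4 mm (the range of the data is not given here), and fits a L^b to these values by least squares on log scale. The fit is to the curve, not to the original data, so the resulting parameter values need not match the ones quoted above.

```python
import math

def surface_plus_volume(L, a=0.0336, b=0.01845):
    """a L^2 + b L^3 with the parameter values of the first curve."""
    return a * L ** 2 + b * L ** 3

# Hypothetical length range in mm: 1.0, 1.5, ..., 4.0.
lengths = [1.0 + 0.5 * i for i in range(7)]
x = [math.log(L) for L in lengths]
y = [math.log(surface_plus_volume(L)) for L in lengths]

# Ordinary least squares for ln y = intercept + slope * ln L.
n = len(x)
x_mean, y_mean = sum(x) / n, sum(y) / n
slope = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
         / sum((xi - x_mean) ** 2 for xi in x))
intercept = y_mean - slope * x_mean

def power_law(L):
    """Fitted power curve exp(intercept) * L^slope."""
    return math.exp(intercept) * L ** slope

# Largest relative difference between the two curves on the chosen range.
max_rel_dev = max(abs(power_law(L) / surface_plus_volume(L) - 1.0)
                  for L in lengths)

print(f"fitted exponent: {slope:.3f}")
print(f"largest relative deviation: {max_rel_dev:.1%}")
```

With typical scatter in respiration data, a deviation of this size between the two curves would be hard to detect, which is the point of the slide.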
Plasticity in parameters 1.7
If the plasticity of the shapes of y(x|a) as a function of a is large:
• few problems in estimating the value of a from {xi,yi}i
(small confidence intervals)
• little support from data for underlying assumptions
(if data were different: other parameter value results,
but still a good fit, so no rejection of assumption)
A model can fit data well for wrong reasons
Biodegradation of compounds 1.7
n-th order model:
dX/dt = −k X^n
X(t) = (X0^(1−n) − (1 − n) k t)^(1/(1−n))
Monod model:
dX/dt = −k X/(K + X)
0 = X(t) − X0 + K ln{X(t)/X0} + k t
Special cases:
n = 0 (n-th order), K ≪ X0 (Monod): X(t) = X0 − k t; t < X0/k
n → 1 (n-th order): X(t) = X0 exp{−k t}
K ≫ X0 (Monod): X(t) = X0 exp{−k t/K}
Time to reach a X0:
n-th order: t(aX0) = X0^(1−n) k^-1 (1 − a^(1−n))/(1 − n)
Monod: t(aX0) = X0 k^-1 (1 − a) − K k^-1 ln a
X: conc. of compound
t: time
n: order
X0: X at time 0
k: degradation rate
K: saturation constant
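The two degradation models can be compared numerically with a short sketch. It assumes the sign convention that X decreases in time (dX/dt = −k X^n and dX/dt = −k X/(K + X)); the function names and parameter values are hypothetical. The Monod curve has no explicit solution, so the implicit relation between X(t) and t is solved by bisection.

```python
import math

def nth_order(t, X0, k, n):
    """X(t) for dX/dt = -k X^n; zero once the compound is exhausted."""
    if n == 1:
        return X0 * math.exp(-k * t)
    base = X0 ** (1 - n) - (1 - n) * k * t
    return base ** (1 / (1 - n)) if base > 0 else 0.0

def monod(t, X0, k, K, iterations=200):
    """X(t) for dX/dt = -k X/(K + X), from the implicit solution
    0 = X - X0 + K ln(X/X0) + k t, solved by bisection on (0, X0]."""
    lo, hi = 0.0, X0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        g = mid - X0 + K * math.log(mid / X0) + k * t
        if g > 0:       # g increases with X, so the root lies below mid
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def time_nth_order(a, X0, k, n):
    """Time at which X = a X0 in the n-th order model (n != 1)."""
    return X0 ** (1 - n) / k * (1 - a ** (1 - n)) / (1 - n)

def time_monod(a, X0, k, K):
    """Time at which X = a X0 in the Monod model."""
    return X0 / k * (1 - a) - K / k * math.log(a)

# Hypothetical parameter values.
X0, k, n, K = 1.0, 0.5, 0.5, 0.3
t_half_nth = time_nth_order(0.5, X0, k, n)
t_half_monod = time_monod(0.5, X0, k, K)
print("time to reach X0/2:", t_half_nth, t_half_monod)
print("check:", nth_order(t_half_nth, X0, k, n), monod(t_half_monod, X0, k, K))
```

Evaluating `monod` with K much smaller or much larger than X0 reproduces the linear and exponential special cases listed above.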
Biodegradation of compounds 1.7
Figure: scaled conc. against scaled time, one panel for the Monod model and one for the n-th order model
Verification and falsification 1.9
Verification cannot work
because different models can fit data equally well
Falsification cannot work
because models are idealized simplifications of reality
“All models are wrong, but some are useful”
Support works to some extent
Usefulness works
but depends on context (aim of model)
a model without context is meaningless
Model without dimension problem 1.2
Arrhenius model: ln k = a – T0 /T
k: some rate
T: absolute temperature
a: parameter
T0: Arrhenius temperature
Alternative form:
k = k0 exp{1 – T0 /T}, with k0 = exp{a – 1}
Difference with allometric model:
no reference value required to solve dimension problem
Central limit theorems 2.6
The sum of n independent identically (i.i.) distributed random variables (with finite variance)
becomes normally distributed for increasing n.
Z = X + Y
f_Z(z) = ∫ f_X(z − y) f_Y(y) dy; P(Z = z) = Σ_y P(X = z − y) P(Y = y)
The sum of n independent point processes tends to behave as a
Poisson process for increasing n.
The number of events in a time interval is i.i. Poisson distributed
Time intervals between subsequent events are i.i. exponentially distributed
Figure panels: Poisson probability; exponential probability density
Sums of random variables 2.6
Y = Σ_{i=1}^n X_i; Var(Y) = n Var(X_i)
Exponential:
f_X(x) = λ exp(−λx)
f_Y(y) = λ/Γ(n) (λy)^(n−1) exp(−λy)
Poisson:
P(X = x) = λ^x/x! exp(−λ)
P(Y = y) = (nλ)^y/y! exp(−nλ)
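The statement about sums of Poisson variables can be checked with the summation rule for Z = X + Y. The sketch below (hypothetical function names, arbitrary value of λ) convolves two Poisson probability functions with parameter λ and compares the result with a Poisson probability function with parameter 2λ.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) = lam^x / x! * exp(-lam)."""
    return lam ** x / math.factorial(x) * math.exp(-lam)

def convolve_pmf(p, q, z):
    """P(Z = z) for Z = X + Y, with X and Y independent and
    non-negative integer valued: sum over y of P(X = z - y) P(Y = y)."""
    return sum(p(z - y) * q(y) for y in range(z + 1))

lam = 1.5  # arbitrary example value

def single(x):
    """Probability function of one Poisson variable with parameter lam."""
    return poisson_pmf(x, lam)

sum_of_two = [convolve_pmf(single, single, z) for z in range(10)]
direct = [poisson_pmf(z, 2 * lam) for z in range(10)]

for z, (s, d) in enumerate(zip(sum_of_two, direct)):
    print(z, round(s, 6), round(d, 6))
```

Repeating the convolution n − 1 times gives the probability function with parameter nλ.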
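Normal probability density 2.6
Figure: normal probability density against (x − μ)/σ, with σ and the ≈ 95% region marked
f_X(x) = 1/(σ √(2π)) exp{−(1/2) ((x − μ)/σ)^2}
For a vector x of n variables with covariance matrix Σ:
f_X(x) = 1/√((2π)^n |Σ|) exp{−(1/2) (x − μ)' Σ^-1 (x − μ)}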
Dynamic systems 3.2
Defined by simultaneous behaviour of
input, state variable, output
Supply systems: input + state variables → output
Demand systems: input ← state variables + output
Real systems: mixtures between supply & demand systems
Constraints: mass, energy balance equations
State variables: span a state space
behaviour: usually a set of ODEs with parameters
Trajectory: map of the behaviour of the state variables in state space
Parameters:
constant, functions of time, functions of modifying variables
compound parameters: functions of parameters
Statistics 4.1
Deals with
• estimation of parameter values, and confidence in these values
• tests of hypotheses about parameter values
does a parameter value differ from a known value?
do parameter values differ between two samples?
Deals NOT with
• does model 1 fit better than model 2
if model 1 is not a special case of model 2
Statistical methods assume that the model is given
(Non-parametric methods only use some properties of the given
model, rather than its full specification)
Stochastic vs deterministic models 4.1
Only stochastic models can be tested against experimental data
Standard way to extend deterministic model to stochastic one:
regression model: y(x|a,b,..) = f(x|a,b,..) + e, with e ~ N(0, σ^2)
Originates from physics, where e stands for measurement error
Problem:
deviations from model are frequently not measurement errors
Alternatives:
• deterministic systems with stochastic inputs
• differences in parameter values between individuals
Problem:
parameter estimation methods become very complex
Stochastic vs deterministic models 4.1
Tossing a die can be modeled in two ways
• Stochastically: each possible outcome has the same probability
• Deterministically: detailed modelling of take off and bouncing,
with initial conditions; many parameters
Imperfect control of the process makes the deterministic model impractical
Large scatter 4.1
• complicates parameter estimation
• complicates falsification
Avoid large scatter by
• Standardization of factors that contribute to measurements
• Stratified sampling
Kinds of statistics 4.1
Descriptive statistics
sometimes useful, frequently boring
Mathematical statistics
beautiful mathematical construct
rarely applicable due to assumptions to keep it simple
Scientific statistics
still in its childhood due to research workers being specialised
upcoming thanks to increase of computational power
(Monte Carlo studies)
Independent observations 4.1
If X and Y are independent: f_{X,Y}(x, y) = f_X(x) f_Y(y)
Statements to remember 4.1
• “proving” something statistically is absurd
• if you do not know the power of your test,
you don’t know what you are doing while testing
• you need to specify the alternative hypothesis to know the power
this involves knowledge about the subject (biology, chemistry, ..)
• parameters only have a meaning if the model is “true”
this involves knowledge about the subject
Nested models 4.5
Full model: y(x) = w0 + w1 x + w2 x^2
w2 = 0: y(x) = w0 + w1 x
w1 = 0: y(x) = w0 + w2 x^2
w1 = 0 and w2 = 0: y(x) = w0
Figure: Venn diagram of these nested models
Testing of hypothesis 4.5
Error of the first kind:
reject null hypothesis while it is true
Error of the second kind:
accept null hypothesis while the alternative hypothesis is true
Level of significance of a statistical test:
α = probability of error of the first kind
Power of a statistical test:
1 − β = 1 – probability of error of the second kind (β)
No certainty in statistics
Probability of each decision, given the state of the null hypothesis:
decision    null hypothesis true    null hypothesis false
accept      1 − α                   β
reject      α                       1 − β
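The link between the level of significance, the alternative hypothesis and the power can be illustrated for one simple case. The sketch below assumes a one-sided test on the mean of normally distributed observations with known standard deviation; this choice of test, the function name and all numbers are assumptions for illustration only. It uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

def power_one_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the test of H0: mean = mu0 against H1: mean = mu1 (>= mu0),
    for n independent normal observations with known sd sigma."""
    se = sigma / n ** 0.5
    # Reject H0 if the sample mean exceeds this critical value.
    critical = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Error of the second kind: accept H0 while H1 is true.
    beta = NormalDist(mu1, se).cdf(critical)
    return 1 - beta

# The power depends on the alternative hypothesis and on the sample size.
for n in (5, 25, 100):
    print(n, round(power_one_sided_z(0.0, 0.5, 1.0, n), 3))
```

Without a value for `mu1`, which has to come from knowledge about the subject, the power cannot be computed at all.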
Parameter estimation 4.6
Most frequently used method: Maximization of (log) Likelihood
likelihood: probability of finding observed data (given the model),
considered as function of parameter values
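As a minimal illustration of maximizing a log likelihood, the sketch below uses the exponential probability density f_X(x) = λ exp(−λx) with hypothetical data. For this model the maximum can also be written down directly (number of data divided by their sum), so a crude grid search can be compared with the exact value; the grid search is only a stand-in for a proper optimizer.

```python
import math

def log_likelihood(lam, data):
    """Log likelihood of rate lam for independent, exponentially
    distributed data: sum over data of ln(lam) - lam * x."""
    return sum(math.log(lam) - lam * x for x in data)

# Hypothetical observations (e.g. time intervals between events).
data = [0.8, 1.9, 0.3, 2.4, 1.1, 0.5]

# Exact maximum for this model: number of data / sum of data.
lam_exact = len(data) / sum(data)

# Crude numerical maximization on a grid with step 0.01.
grid = [0.01 * i for i in range(1, 501)]
lam_grid = max(grid, key=lambda lam: log_likelihood(lam, data))

print("exact:", lam_exact)
print("grid :", lam_grid)
```

For models without an explicit maximum, the same log likelihood function would be handed to a numerical optimization routine.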
If we repeat the collection of data many times
(same conditions, same number of data)
the resulting ML estimate
Profile likelihood 4.6
Figure: profile likelihood, with the large sample approximation and the 95% conf interval
Comparison of models 4.6
Akaike Information Criterion
for sample size n and K parameters:
−2 log L(θ) + 2K n/(n − K − 1)
in the case of a regression model:
n log σ^2 + 2K n/(n − K − 1)
You can compare goodness of fit of different models to the same data
but statistics will not help you to choose between the models
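The two expressions for the criterion are easy to evaluate. The sketch below uses hypothetical function names and assumes, for the regression case, that σ^2 is estimated as the mean of the squared residuals; whether K should count the variance parameter is not stated above and is left to the user.

```python
import math

def aic_small_sample(log_lik, n, K):
    """-2 log L + 2 K n / (n - K - 1), for sample size n and K parameters."""
    if n - K - 1 <= 0:
        raise ValueError("need n > K + 1")
    return -2.0 * log_lik + 2.0 * K * n / (n - K - 1)

def aic_regression(residuals, K):
    """n log(sigma^2) + 2 K n / (n - K - 1), with sigma^2 taken as the
    mean squared residual (an assumption of this sketch)."""
    n = len(residuals)
    if n - K - 1 <= 0:
        raise ValueError("need n > K + 1")
    sigma2 = sum(r * r for r in residuals) / n
    return n * math.log(sigma2) + 2.0 * K * n / (n - K - 1)

# Hypothetical example: log likelihood -10 for 20 data points, 3 parameters.
print(aic_small_sample(-10.0, 20, 3))

# Hypothetical residuals of a fitted regression model with 2 parameters.
print(aic_regression([1.0, -1.0] * 5, 2))
```

The values are only comparable between models that were fitted to the same data.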
Confidence intervals 4.6
L(t) = L∞ − (L∞ − L0) exp(−rB t)
≈ L0 + (L∞ − L0) rB t for small t
L0 = 1
Figure: length (mm) against time (d), with fits excluding and including point 4
Figure: 95% conf intervals for rB and L∞, excluding and including point 4
correlations among parameter estimates
can have big effects on simultaneous conf intervals
parameter    estimate excluding point 4    sd excluding point 4    estimate including point 4    sd including point 4
L∞, mm       6.46                          1.08                    3.37                          0.096
rB, d^-1     0.099                         0.022                   0.277                         0.023
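The growth curve and its small-t approximation can be evaluated with the two sets of estimates. The sketch below assumes that the estimates are 6.46 mm and 0.099 d^-1 (excluding point 4) and 3.37 mm and 0.277 d^-1 (including point 4) for the ultimate length and the growth rate, and that L0 = 1; the function names are hypothetical. The initial slope (L∞ − L0) rB shows why the two parameters are hard to separate when the data cover only the early part of the curve.

```python
import math

def von_bertalanffy(t, L_inf, r_B, L_0=1.0):
    """L(t) = L_inf - (L_inf - L_0) exp(-r_B t)."""
    return L_inf - (L_inf - L_0) * math.exp(-r_B * t)

def linear_approx(t, L_inf, r_B, L_0=1.0):
    """Approximation for small t: L_0 + (L_inf - L_0) r_B t."""
    return L_0 + (L_inf - L_0) * r_B * t

# Estimates (L_inf in mm, r_B in 1/d) excluding and including point 4.
fits = {
    "excluding point 4": (6.46, 0.099),
    "including point 4": (3.37, 0.277),
}

initial_slopes = {}
for label, (L_inf, r_B) in fits.items():
    initial_slopes[label] = (L_inf - 1.0) * r_B
    print(label,
          "initial slope:", round(initial_slopes[label], 3),
          "L(1):", round(von_bertalanffy(1.0, L_inf, r_B), 3),
          "L(20):", round(von_bertalanffy(20.0, L_inf, r_B), 3))
```

The two fits differ much less in their initial slope than in the separate values of the two parameters, while they diverge clearly at later times.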