Uncertainty in Design Space


Application of Monte Carlo
Methods for Process Modeling
John Kauffman, Changning Guo
FDA\CDER Division of Pharmaceutical Analysis
Jean-Marie Geoffroy
Takeda Global Research and Development
The opinions expressed in this presentation are those of the authors, and do
not necessarily represent the opinions or policies of the FDA.
1
Outline
1. Why propagate uncertainty in regression-based process models?
2. Why use Monte Carlo (MC) simulation?
3. Why solve regression models with MC?
2
What is Design Space?
• Design space is “the multidimensional
combination and interaction of input
variables and process parameters that
have been demonstrated to provide
assurance of quality”. (ICH Q8)
• Process modeling (DOE) is a central
component of design space determination.
3
Design Space Schematic
[Figure: schematic of the Knowledge Space, with axes Parameter #1 and Parameter #2]
4
Design Space Schematic
with uncertainty
[Figure: schematic of the Knowledge Space with uncertainty, with axes Parameter #1 and Parameter #2]
5
Case Study #1: Modeling 45 minute
dissolution (D45) of a tableting process
• $3^2$ Factorial Experimental Design
– Granulating Water (GS: 36-38 kg)
– Granulating Power (P: 18.5-22.5 kW)
• Nested Compression Factors
– Compression Force (CF: 11.5-17.5 kN)
– Press Speed (S: 70-110 kTPH)
• Least Squares Predictive Model*
$D45 = 68.35 - 1.34(GS) - 2.88(P) - 8.95(CF) + 2.43(GS)^2$
* Parameter values are mean-centered and range-scaled.
Publication Reference
“Application of Quality by Design Knowledge (QbD) From Site Transfers to Commercial
Operations Already in Progress,” J. PAT, Jan/Feb, pg. 8, 2006.
6
Diagram of
Experimental Design
[Figure: grid of granulation water (36, 37, 38 kg) versus granulation power (18.5, 20.5, 22.5 kW), with nested force and speed axes]
7
Experimentation and Process Modeling
D45exp 1    GSexp 1    Pexp 1    CFexp 1
D45exp 2    GSexp 2    Pexp 2    CFexp 2
D45exp 3    GSexp 3    Pexp 3    CFexp 3
⋮           ⋮          ⋮         ⋮
8
Experimentation and Process Modeling
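$D45_{pred,1} = B_0 + B_1 \cdot GS_{exp,1} + B_2 \cdot P_{exp,1} + B_3 \cdot CF_{exp,1} + B_4 \cdot GS_{exp,1}^2$
$D45_{pred,2} = B_0 + B_1 \cdot GS_{exp,2} + B_2 \cdot P_{exp,2} + B_3 \cdot CF_{exp,2} + B_4 \cdot GS_{exp,2}^2$
$D45_{pred,3} = B_0 + B_1 \cdot GS_{exp,3} + B_2 \cdot P_{exp,3} + B_3 \cdot CF_{exp,3} + B_4 \cdot GS_{exp,3}^2$
$\vdots$
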
Propagation of uncertainty in process model predictions:
All Model Coefficient variances and Process Variable variances
contribute to each predicted Response uncertainty in a model-dependent
manner.
9
Propagation of Uncertainty in Regression
Modeling
• What procedures can be used to estimate
uncertainty in design space?
• What is the benefit of propagating
uncertainty using Monte Carlo simulation?
10
The Process Model:
Matrix Representation
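$$
\begin{bmatrix} D45_{pred,1} \\ D45_{pred,2} \\ D45_{pred,3} \\ \vdots \end{bmatrix}
=
\begin{bmatrix}
1 & GS_{exp,1} & P_{exp,1} & CF_{exp,1} & GS_{exp,1}^2 \\
1 & GS_{exp,2} & P_{exp,2} & CF_{exp,2} & GS_{exp,2}^2 \\
1 & GS_{exp,3} & P_{exp,3} & CF_{exp,3} & GS_{exp,3}^2 \\
\vdots & \vdots & \vdots & \vdots & \vdots
\end{bmatrix}
\begin{bmatrix} B_0 \\ B_1 \\ B_2 \\ B_3 \\ B_4 \end{bmatrix}
$$
Response matrix (R) = Design matrix (D) × B matrix (B)
$R = DB$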
11
Least Squares Solution to a
Process Model
Matrix Representation of Process Model:
R = DB
Define the pseudoinverse of D:
$D^\dagger = (D^T D)^{-1} D^T$
Solving for the Model Coefficients:
$D^\dagger R = B$
The pseudoinverse solution of a matrix equation gives the
least squares best estimates of the B coefficients!
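The pseudoinverse solution can be sketched in NumPy as follows. This is only an illustration: the 18-run coded design and the noisy responses are hypothetical (generated from the coefficients quoted earlier plus made-up noise), not the experimental data behind the slides.

```python
import numpy as np

# Hypothetical coded design (mean-centered, range-scaled): 3 x 3 levels of GS and P,
# with two CF levels nested at each point. This is NOT the authors' data.
levels = (-1.0, 0.0, 1.0)
runs = np.array([(gs, p, cf) for gs in levels for p in levels for cf in (-1.0, 1.0)])
gs, p, cf = runs.T

# Design matrix D with columns 1, GS, P, CF, GS^2.
D = np.column_stack([np.ones(len(runs)), gs, p, cf, gs**2])

# Synthetic responses: slide coefficients plus assumed noise (illustration only).
rng = np.random.default_rng(0)
B_true = np.array([68.35, -1.34, -2.88, -8.95, 2.43])
R = D @ B_true + rng.normal(0.0, 2.0, size=len(runs))

# Pseudoinverse D† = (D^T D)^-1 D^T, then B = D† R.
D_pinv = np.linalg.inv(D.T @ D) @ D.T
B_hat = D_pinv @ R

# Cross-check against NumPy's built-in least-squares solver.
B_lstsq, *_ = np.linalg.lstsq(D, R, rcond=None)
print(np.round(B_hat, 2))
```

For a full-column-rank D the explicit formula, `np.linalg.pinv(D)`, and `np.linalg.lstsq` should all agree; the library routines are usually preferred numerically.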
12
Estimating Variance in Prediction:
The Basis for Uncertainty in Design Space
Response covariance matrix:
$Cov(R) = B[Cov(D)]B^T$
Jth experimental variance = Jth diagonal element of Cov(R)
Assumptions: Only D has uncertainty.
Problems:
1.) We know that B has uncertainty.
2.) We know that uncertainties in D will be correlated, but we don’t know Cov(D).
13
Estimating Variance of Process Model
Regression Coefficients
Model coefficient covariance matrix:
$Cov(B) = D^\dagger [Cov(R)] D^{\dagger T}$
$Cov(B) = [D^T D]^{-1} \delta_R^2$
Response variance (p = # model coefficients, N = # experiments):
$\delta_R^2 = \frac{\sum_{i=1}^{N} (R_i - \hat{R}_i)^2}{N - p}$
Jth Model coefficient variance = Jth diagonal element of Cov(B)
Assumptions: Only R has uncertainty; Errors uncorrelated and constant
Problems: 1.) We know that D (matrix of input variables) has uncertainty.
2.) We suspect that uncertainties may be correlated.
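As a minimal sketch of the regression-based estimate, the function below computes the residual variance with N - p degrees of freedom and scales the inverse of DᵀD by it. The function name, the variable `s2`, and the example data are assumptions made for illustration; the calculation carries the slide's assumptions that only R has uncertainty and that the errors are uncorrelated with constant variance.

```python
import numpy as np

def coefficient_covariance(D, R):
    """Least-squares coefficients, residual variance, and Cov(B) = s2 * (D^T D)^-1."""
    D = np.asarray(D, dtype=float)
    R = np.asarray(R, dtype=float)
    n_experiments, n_coefficients = D.shape
    B, *_ = np.linalg.lstsq(D, R, rcond=None)
    residuals = R - D @ B
    # Response variance: sum of squared residuals over (N - p).
    s2 = residuals @ residuals / (n_experiments - n_coefficients)
    cov_B = s2 * np.linalg.inv(D.T @ D)
    return B, s2, cov_B

# Hypothetical example data (not from the case study).
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 13)
D = np.column_stack([np.ones_like(x), x, x**2])
R = 70.0 - 3.0 * x + 2.0 * x**2 + rng.normal(0.0, 1.5, size=x.size)

B, s2, cov_B = coefficient_covariance(D, R)
coef_sd = np.sqrt(np.diag(cov_B))  # Jth coefficient std. dev. from the Jth diagonal element
print(np.round(B, 2), np.round(coef_sd, 2))
```

The same matrix results from the general form D†[Cov(R)]D†ᵀ when Cov(R) is `s2` times the identity, which is the case this shortcut covers.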
14
Monte Carlo Methods
• Develop a mathematical model.
– The Process Model.
• Add random variables.
– Replace quantities of interest with random numbers selected
from appropriate distribution functions that are expected to
describe the variables.
• Monitor selected output variables.
– Output variables become distributions whose properties are
determined by the model and the distributions of the random
variables.
• Advantage #1: We make no assumptions concerning
sources of uncertainty or covariance between variables.
15
Case Study #1: Influence of Process
Parameter Variation on Prediction
• Model Conditions
– GS mean = 36 kg
– P mean = 20 kW
– CF mean = 14 kN
– Input parameter standard deviations were varied.
– Dissolution values were predicted.
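A simulation of this kind might look like the sketch below, which uses the input standard deviations of Simulation 1 (0.25 kg, 1 kW, 1 kN). Two things are assumptions rather than statements from the slides: the inputs are drawn as independent normal variables, and "mean-centered and range-scaled" is taken to mean (value - range midpoint) / (half the range) over the design ranges.

```python
import numpy as np

# Coefficients of the least-squares model quoted in the case study (coded units).
B0, B_GS, B_P, B_CF, B_GS2 = 68.35, -1.34, -2.88, -8.95, 2.43

# ASSUMPTION: coding is (value - midpoint) / half-range, using the design ranges
# GS 36-38 kg, P 18.5-22.5 kW, CF 11.5-17.5 kN.
CODING = {"GS": (37.0, 1.0), "P": (20.5, 2.0), "CF": (14.5, 3.0)}

def coded(values, name):
    midpoint, half_range = CODING[name]
    return (values - midpoint) / half_range

def simulate_d45(means, sds, n_draws=100_000, seed=0):
    """Draw independent normal inputs and push them through the process model."""
    rng = np.random.default_rng(seed)
    gs = coded(rng.normal(means["GS"], sds["GS"], n_draws), "GS")
    p = coded(rng.normal(means["P"], sds["P"], n_draws), "P")
    cf = coded(rng.normal(means["CF"], sds["CF"], n_draws), "CF")
    return B0 + B_GS * gs + B_P * p + B_CF * cf + B_GS2 * gs**2

d45 = simulate_d45(
    means={"GS": 36.0, "P": 20.0, "CF": 14.0},
    sds={"GS": 0.25, "P": 1.0, "CF": 1.0},
)
print(f"mean = {d45.mean():.1f}%, std. dev. = {d45.std(ddof=1):.2f}%")
```

Under these assumptions the simulated mean and standard deviation come out close to the values reported for Simulation 1 (74.6% and 3.70%), which suggests the assumed coding is at least roughly consistent with the original work.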
16
Example: Simulation 1
[Figure: input distributions for Water, Power, and Force, and the resulting distribution for D45 Simulation #1]
Distribution for Water (GS): Mean = 36 kg, Std. Dev. = 0.25 kg
Distribution for Power (P): Mean = 20 kW, Std. Dev. = 1 kW
Distribution for Force (CF): Mean = 14 kN, Std. Dev. = 1 kN
D45 Simulation Result: Mean = 74.6%, Std. Dev. = 3.70%
$D45 = 68.35 - 1.34(GS) - 2.88(P) - 8.95(CF) + 2.43(GS)^2$
17
Example: Simulations 1-4
[Figure: histograms of D45 Simulations #1 to #4]
Simulation   GS Std. Dev.   P Std. Dev.   CF Std. Dev.   D45 Std. Dev. (input)   D45 Simulation Mean   D45 Simulation Std. Dev.
#1           0.25 kg        1 kW          1 kN           0%                      74.6%                 3.70%
#2           0.5 kg         1 kW          1 kN           0%                      75.0%                 4.59%
#3           1 kg           1 kW          1 kN           0%                      76.9%                 7.89%
#4           1 kg           2 kW          1 kN           0%                      76.9%                 8.26%
18
Influence of Process Parameter Variation
• Increase in granulation water mass (GS)
variance:
– Increases predicted D45 variance.
– Slightly shifts predicted D45 means.
– Skews the predicted D45 distributions.
• Increase in granulator power (P) endpoint
variance:
– Increases predicted D45 variance.
– Does not shift predicted D45 means.
– Does not skew the predicted D45 distributions.
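One way to see why GS behaves differently from P is to take the expectation of the model. The short derivation below is a sketch that treats the coded GS value as a random variable with mean $\mu_{GS}$ and variance $\sigma_{GS}^2$ (symbols introduced here for illustration) and uses $E[x^2] = \mu^2 + \sigma^2$:

$$
E[D45] = B_0 + B_1 \mu_{GS} + B_2 \mu_P + B_3 \mu_{CF} + B_4 \left( \mu_{GS}^2 + \sigma_{GS}^2 \right)
$$

$$
\text{shift in mean due to GS variance} = B_4 \sigma_{GS}^2 = 2.43\, \sigma_{GS}^2
$$

P and CF enter the model only linearly, so their variances widen the D45 distribution without moving its mean. The squared GS term is what moves the mean, and because the square of a symmetric variable is not symmetric, it is also the source of the skew.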
19
Influence of Dissolution
Measurement Error
• Model Conditions
– GS mean = 36 kg
– P mean = 20 kW
– CF mean = 14 kN
– Input parameter standard deviations were varied.
– Dissolution measurement error was added.
– Dissolution values were predicted.
20
Example: Simulations 5-7
[Figure: histograms of the Simulation 5-7 Control and of Simulations #5 to #7]
Simulation      GS Std. Dev.   P Std. Dev.   CF Std. Dev.   D45 Std. Dev. (input)   D45 Simulation Mean   D45 Simulation Std. Dev.
5-7 Control     0.5 kg         1 kW          0.5 kN         0%                      75.0%                 3.88%
#5              0.5 kg         1 kW          0.5 kN         2%                      75.0%                 4.37%
#6              0.5 kg         1 kW          0.5 kN         4%                      75.0%                 5.58%
#7              0.5 kg         1 kW          0.5 kN         6%                      75.0%                 7.16%
21
Influence of Dissolution
Measurement Error
• Increase in D45 measurement variance:
– does not shift predicted D45 means.
– does not appear to skew predicted D45
distributions.
– increases predicted D45 variance.
• Advantage #2: We get the distribution, not just
the standard deviation.
• Advantage #3: Sensitivity analysis allows us to
prioritize process improvement.
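The variance increase reported for Simulations 5-7 is close to what a simple quadrature sum gives. The check below is a sketch that assumes the dissolution measurement error is zero-mean, additive, and independent of the process inputs; the function name is hypothetical, and the numbers are the control and simulation standard deviations reported above.

```python
import math

def combined_sd(process_sd, measurement_sd):
    """Std. dev. of the sum of two independent error sources (root sum of squares)."""
    return math.hypot(process_sd, measurement_sd)

control_sd = 3.88  # predicted D45 std. dev. with no measurement error (%)
reported = {2.0: 4.37, 4.0: 5.58, 6.0: 7.16}  # measurement SD (%) -> reported D45 SD (%)

for measurement_sd, slide_sd in reported.items():
    estimate = combined_sd(control_sd, measurement_sd)
    print(f"measurement SD {measurement_sd:.0f}%: quadrature {estimate:.2f}%, reported {slide_sd:.2f}%")
```

The agreement to within a few hundredths of a percent is consistent with the observation that measurement error widens the predicted distribution without shifting or skewing it.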
22
Measurement Uncertainty and
Prediction Uncertainty
D45 (%)
Experiment   Measured Mean   Model Prediction   Measured St. Dev.
1            69.0            69.2               3.1
2            73.5            72.3               3.1
3            71.9            72.1               1.9
4            67.1            65.5               2.9
5            69.8            71.2               1.9
6            75.5            75.0               0.8
7            65.4            66.6               2.7
8            75.8            77.3               1.1
9            56.1            59.4               3.7
10           61.0            59.4               3.7
11           67.2            68.3               2.4   Benchmark
12           77.0            77.3               1.1
13           72.7            68.3               2.4
Standard Error of Prediction: 2.4
Standard Error (RMS Measurement Standard Deviation): 2.6
23
Monte Carlo Prediction Error
D45 Monte Carlo Simulation Parameters
                            Measurement   #1           #2           #3           #4
N =                         6             6            10000        10000        6
Random Coefficients
  B matrix                                regression   regression   regression   regression
  B Std. Dev.                             regression   regression   regression   0
Random Response
  R Std. Dev.                             0            0            0            1.7%
Random Inputs
  P Std. Dev.                             0            0            0.5          0.5
  GS Std. Dev.                            0            0            0.25         0.25
  CF Std. Dev.                            0            0            0.5          0.5
Std. Error (Measurement) /
Std. Error of Prediction    2.6%          1.7%         1.6%         2.6%         2.5%
Result using estimated coefficient St. Dev.
24
Prediction Error Based on Estimated
Coefficient Standard Deviations
• Estimated model coefficient standard deviations
do not predict the observed response
uncertainty.
• Can we use Monte Carlo simulation to provide
better estimates of model coefficient standard
deviations?
25
Propagation of Uncertainty in
Process Modeling
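The pseudoinverse of D:
$D^\dagger = (D^T D)^{-1} D^T$
Solving for the Model Coefficients:
$D^\dagger R = B$
$B_1 = D^\dagger_{11} \cdot R_{Exp,1} + D^\dagger_{12} \cdot R_{Exp,2} + D^\dagger_{13} \cdot R_{Exp,3} + \dots$
$B_2 = D^\dagger_{21} \cdot R_{Exp,1} + D^\dagger_{22} \cdot R_{Exp,2} + D^\dagger_{23} \cdot R_{Exp,3} + \dots$
$\vdots$
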
1. Assign random variables to Dissolution values (R) and use Monte
Carlo simulations to propagate error to the model coefficients (B).
2. Assign random variables to Process Parameters (D) and use Monte
Carlo simulations to propagate error to B.
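The two steps above can be sketched as a loop that redraws R and/or D on every trial and re-solves for B with the pseudoinverse. Everything specific here is hypothetical: the coded 18-run design, the response and input standard deviations, and the function names are chosen only to illustrate the procedure, and the draws are assumed to be independent and normal.

```python
import numpy as np

def design_matrix(x):
    """Build D with columns 1, GS, P, CF, GS^2 from an (N, 3) array of coded inputs."""
    gs, p, cf = x.T
    return np.column_stack([np.ones(len(x)), gs, p, cf, gs**2])

def mc_coefficients(x_nominal, x_sd, r_mean, r_sd, n_trials=2000, seed=0):
    """Return an (n_trials, 5) array of coefficient vectors B = D† R."""
    rng = np.random.default_rng(seed)
    draws = np.empty((n_trials, 5))
    for t in range(n_trials):
        x = rng.normal(x_nominal, x_sd)  # random process parameters (D)
        r = rng.normal(r_mean, r_sd)     # random dissolution values (R)
        draws[t] = np.linalg.pinv(design_matrix(x)) @ r
    return draws

# Hypothetical coded design and noise levels (not the case-study data).
levels = (-1.0, 0.0, 1.0)
x_nominal = np.array([(g, p, c) for g in levels for p in levels for c in (-1.0, 1.0)])
B_true = np.array([68.35, -1.34, -2.88, -8.95, 2.43])
r_mean = design_matrix(x_nominal) @ B_true

random_r = mc_coefficients(x_nominal, x_sd=0.0, r_mean=r_mean, r_sd=2.0)
random_r_and_d = mc_coefficients(
    x_nominal, x_sd=np.array([0.25, 0.5, 0.33]), r_mean=r_mean, r_sd=2.0
)

print("random R   :", np.round(random_r.mean(axis=0), 2), np.round(random_r.std(axis=0, ddof=1), 2))
print("random R&D :", np.round(random_r_and_d.mean(axis=0), 2), np.round(random_r_and_d.std(axis=0, ddof=1), 2))
```

Each column of the returned array is an empirical distribution for one coefficient, so means, standard deviations, and skew can be read from it directly rather than assumed.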
26
How Do Variances in Process Parameters
Influence Model Coefficients?
• Simulation #1 (1-0.25-1)
– Measured D45 means and standard deviations.
– P 19-23 kW ± 1 kW
– GS 36-38 kg ± 0.25 kg
– CF 12-18 kN ± 1 kN
• Compare to regression distributions
– Model coefficient means
– Model coefficient standard deviations
27
How Do Variances in Process Parameters
Influence Model Coefficients?
[Figure: distributions of the Bias, Power (P), Water (GS), Force (CF), and Water² coefficients; legend: Regression, Random Input]
Increase in process parameter variance causes a shift in some model coefficients.
Increase in process parameter variance increases model coefficient variance.
28
How Do Variances in Process Parameters
Influence Model Coefficients?
– Simulation #1 (1-0.25-1)
– Simulation #2 (1-0.5-1)
– Simulation #3 (1-1-1)
– Simulation #4 (2-1-1)
• Increasing input parameter variance:
– increases variance in the model coefficients.
– can skew the model coefficient distribution.
– can shift model coefficient means.
29
Estimated Model Coefficient Uncertainties
from Monte Carlo Simulation
                           Regression      Monte Carlo Simulation Parameters
                                           #1              #2              #3              #4
P Std. Dev.                                –               1               1               0.5
GS Std. Dev.                               –               0.25            0.25            0.25
CF Std. Dev.                               –               1               1               0.5
R Std. Dev.                                measured        –               measured        2.5%
B0                         68.35 ± 0.83    68.35 ± 0.93    68.41 ± 1.06    68.39 ± 1.44    68.39 ± 0.9
B(GS)                      -1.34 ± 1.07    -1.34 ± 1.21    -1.23 ± 1.49    -1.22 ± 1.96    -1.24 ± 1.28
B(P)                       -2.88 ± 0.96    -2.88 ± 1.04    -2.13 ± 1.18    -2.11 ± 1.53    -2.63 ± 1.06
B(CF)                      -8.95 ± 1.17    -8.95 ± 1.38    -7.37 ± 1.35    -7.37 ± 1.87    -8.52 ± 1.2
B(GS^2)                    2.44 ± 1.35     2.44 ± 1.52     2.13 ± 1.84     2.14 ± 2.43     2.15 ± 1.57
Std. Error of Prediction   1.6%            1.8%            2.1%            2.8%            1.8%
30
Case Study #2: Nasal Spray
Performance Models
• A nasal spray product is a combination of a therapeutic
formulation and a delivery device.
• 3-level, 4-factor Box-Behnken designs
– Pfeiffer nasal spray pump
– Placebo formulations (CMC & Tween 80 solutions)
– Reference:
• Changning Guo, Keith J. Stine, John F. Kauffman, William H. Doub. 2008.
“Assessment of the influence factors on in vitro testing of nasal
sprays using Box-Behnken experimental design,” European Journal of
Pharmaceutical Sciences 35 (12) 417–426.
• Changning Guo, Wei Ye, John F. Kauffman, William H. Doub. “Evaluation
of Impaction Force of Nasal Sprays and Metered-Dose Inhalers Using
the Texture Analyser,” Journal of Pharmaceutical Sciences. In press.
31
Response Variables
Parameters used to describe the shape of a nasal spray plume:
spray pattern area, plume width.
Spray Pattern: measures the
cross sectional uniformity of
the spray
Plume geometry: measures the
side view of a spray plume at its
fully developed phase
32
Response Variables
• Droplet Size Distribution
– Volume Median Diameter D50
[Figure: droplet size distribution; cumulative distribution Q3 (%) and density distribution q3* versus particle size (µm)]
• Impaction Force
33
Nasal Spray Response Models
• Optimized regression models
Responses            Prediction Equations                                            R^2
Spray Pattern Area   R = 338 + 47S + 156V - 279C - 89VC - 54V^2 + 148C^2             0.97
Plume Width          R = 26.7 + 8.3V - 8.1C - 4.3V^2 + 4.3C^2                        0.96
D50                  R = 34.5  27.6V  19.6C  27.1VC  20.8V^2                         0.90
Impaction Force      R = 4.45 + 0.34S + 1.51V + 0.23T + 0.41S^2 - 0.74V^2            0.94
34
Spray Pattern Model
$R = 338 + 47S + 156V - 279C - 89VC - 54V^2 + 148C^2$
Spray pattern   regression          Random R            Random D            Random R&D
model term      Mean   Std. Dev.    Mean   Std. Dev.    Mean   Std. Dev.    Mean   Std. Dev.
offset          338    15           338    4            338    4            338    5
S               47     13           46     4            46     4            46     5
V               156    13           155    3            155    4            155    4
C               -279   13           -279   5            -279   4            -279   6
VC              -89    23           -89    7            -89    7            -89    8
V^2             -54    18           -54    5            -54    6            -54    7
C^2             148    18           148    5            148    5            148    7
Variances from the input variables and from the spray pattern area
measurements have a similar level of influence on the
model coefficients.
35
Plume Width Model
$R = 26.7 + 8.3V - 8.1C - 4.3V^2 + 4.3C^2$
Plume Geometry   regression          Random R            Random D            Random R&D
model term       Mean   Std. Dev.    Mean   Std. Dev.    Mean   Std. Dev.    Mean   Std. Dev.
offset           26.7   0.6          26.7   0.3          26.7   0.2          26.7   0.3
V                8.3    0.5          8.3    0.3          8.3    0.1          8.3    0.4
C                -8.1   0.5          -8.1   0.3          -8.1   0.1          -8.1   0.3
V^2              -4.4   0.8          -4.4   0.4          -4.3   0.2          -4.3   0.4
C^2              4.4    0.7          4.4    0.4          4.4    0.2          4.4    0.4
Variance from the plume width measurements has more
influence on the model coefficients than variance from the
input variables.
36
Droplet Size Model – D50
D50 = 34.5  27.6V  19.6C  27.1VC  20.8V^2
[Figure: distributions of the offset, V, C, VC, and V^2 coefficients for Random R only, Random D only, and Random R&D]
Variance from the input variables has more influence on the model
coefficients than variance from the D50 measurements.
37
Impaction Force Model
$R = 4.45 + 0.34S + 1.51V + 0.23T + 0.41S^2 - 0.74V^2$
Impaction force   regression          Random R            Random D            Random R&D
model term        Mean    Std. Dev.   Mean    Std. Dev.   Mean    Std. Dev.   Mean    Std. Dev.
offset            4.44    0.11        4.44    0.04        4.45    0.04        4.45    0.06
S                 0.34    0.09        0.34    0.04        0.33    0.04        0.33    0.06
V                 1.51    0.09        1.51    0.03        1.50    0.03        1.50    0.04
T                 0.23    0.09        0.23    0.04        0.23    0.03        0.23    0.05
S^2               0.41    0.13        0.41    0.05        0.39    0.06        0.39    0.07
V^2               -0.72   0.12        -0.72   0.04        -0.72   0.04        -0.72   0.08
Variances from the input variables and from the impaction force
measurements have a similar level of influence on the
model coefficients.
38
How Do Variances in Formulation and
Actuation Influence Model Coefficients?
• The means of model coefficients show good agreement between
regression results and Monte Carlo simulations
• The standard deviations of model coefficients obtained from
regression results are larger than those from Monte Carlo
simulation.
– The estimated standard deviations from regression may
overestimate the uncertainties in the model coefficients.
– Using regression-based coefficient standard deviations to define design
space may result in a narrower range of input variable values than is
necessary to meet the desired confidence level.
39
Advantages of Monte Carlo Simulation
1. We make no assumptions concerning
sources of uncertainty or variable
covariance.
2. We see the distribution of output variable
values, not just a standard deviation.
3. Sensitivity analysis allows us to prioritize
high risk input variables and improve
process control.
40