Simulation with Arena - Virginia Commonwealth University


Terminating Statistical Analysis
By Dr. Jason Merrick
Statistical Analysis of Output Data: Terminating Simulations
• Random input leads to random output (RIRO)
• Run a simulation (once) — what does it mean?
– Was this run “typical” or not?
– Variability from run to run (of the same model)?
• Need statistical analysis of output data
• Time frame of simulations
– Terminating: Specific starting, stopping conditions
– Steady-state: Long-run (technically forever)
– Here: Terminating
Simulation with Arena — Intermediate Modeling
and Terminating Statistical Analysis
C6/2
Point and Interval Estimation
• Suppose we are trying to estimate an output measure $E[Y] = \theta$ based upon a simulated sample $Y_1,\ldots,Y_n$
• We come up with an estimate $\hat{\theta}$
– For instance $\hat{\theta} = \bar{Y} = \dfrac{1}{n}\sum_{i=1}^{n} Y_i$
• How good is this estimate?
– Unbiased
– Low Variance (possibly minimum variance)
– Consistent
– Confidence Interval
T-distribution
• The t-statistic is given by
$t = \dfrac{\hat{\theta} - \theta}{\hat{\sigma}(\hat{\theta})}$
– If the $Y_1,\ldots,Y_n$ are normally distributed and $\hat{\theta} = \bar{Y}$ then the t-statistic is t-distributed
– If the $Y_1,\ldots,Y_n$ are not normally distributed, but $\hat{\theta} = \bar{Y}$ then the t-statistic is approximately t-distributed thanks to the Central Limit Theorem
• requires a reasonably large sample size n
– We require an estimate of the variance of $\hat{\theta}$, denoted $\sigma^2(\hat{\theta})$
• An approximate confidence interval for $\theta$ is then
$[\hat{\theta} - t_{f,1-\alpha/2}\,\hat{\sigma}(\hat{\theta}),\ \hat{\theta} + t_{f,1-\alpha/2}\,\hat{\sigma}(\hat{\theta})]$
– The center of the confidence interval is $\hat{\theta}$
– The half-width of the confidence interval is $t_{f,1-\alpha/2}\,\hat{\sigma}(\hat{\theta})$
– $t_{f,1-\alpha/2}$ is the $100(1-\alpha/2)\%$ percentile of a t-distribution with f degrees of freedom.
[Figure: T-distribution Confidence Interval. Axes: Sample Repetition and Parameter Value.]
T-distribution Confidence Interval
• Case 1: Y1,…,Yn are independent
– This is the case when you are making n independent
replications of the simulations
• Terminating simulations
• Try and force this with steady-state simulations
– Compute your estimate $\hat{\theta}$ and then compute the sample variance
$s^2 = \dfrac{\sum_{i=1}^{n} (Y_i - \hat{\theta})^2}{n - 1}$
– $s^2$ is an unbiased estimator of the population variance, so $s^2/n$ is an unbiased estimator of $\sigma^2(\hat{\theta})$ with $f = n-1$ degrees of freedom
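The Case 1 calculation can be sketched in a few lines of Python. This is only an illustration: the function name `t_confidence_interval` and the ten replication outputs are hypothetical, and the critical value 2.262 (t with 9 degrees of freedom, 95% two-sided) is read from a t table rather than computed.

```python
import math
import statistics


def t_confidence_interval(y, t_crit):
    """Return (center, half_width) for independent replication outputs y.

    t_crit is the t critical value with n-1 degrees of freedom,
    supplied by the caller (for example from a t table).
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two replications")
    center = statistics.fmean(y)      # point estimate: the sample mean
    s2 = statistics.variance(y)       # sample variance with divisor n-1
    half_width = t_crit * math.sqrt(s2 / n)
    return center, half_width


# Hypothetical outputs from n = 10 independent replications.
y = [12.1, 9.8, 11.4, 10.6, 13.0, 10.2, 11.9, 9.5, 12.4, 11.1]
T_CRIT_9_DF_95 = 2.262                # table value, assumed 95% confidence

center, half_width = t_confidence_interval(y, T_CRIT_9_DF_95)
print(f"{center:.3f} +/- {half_width:.3f}")
```

Passing the critical value in keeps the sketch free of third-party dependencies; a statistics library could supply it instead.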
T-distribution Confidence Interval
• Case 2: Y1,…,Yn are not independent
– This is the case when you are using data generated within
a single simulation run
• sequences of observations in long-run steady-state simulations
– $s^2/n$ is a biased estimator of $\sigma^2(\hat{\theta})$
– $Y_1,\ldots,Y_n$ is an auto-correlated sequence or a time-series
– Suppose that our point estimator for $\theta$ is $\hat{\theta} = \bar{Y}$; a general result from mathematical statistics is
$\sigma^2(\hat{\theta}) = \dfrac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} \operatorname{cov}(Y_i, Y_j)$
T-distribution Confidence Interval
• Case 2: Y1,…,Yn are not independent
– For n observations there are $n^2$ covariances to estimate
– However, most simulations are covariance stationary, that is, for all i, j and k
$\operatorname{cov}(Y_i, Y_{i+k}) = \operatorname{cov}(Y_j, Y_{j+k})$
– Recall that k is the lag, so for a given lag, the covariance remains the same throughout the sequence
– If this is the case then there are n-1 lagged covariances to estimate, denoted $\gamma_k$, and
$\sigma^2(\hat{\theta}) = \dfrac{\sigma^2}{n}\left[1 + 2\sum_{k=1}^{n-1}\left(1 - \dfrac{k}{n}\right)\dfrac{\gamma_k}{\sigma^2}\right]$
Time-Series Examples
[Figure: four time-series plots of Observed Value against Time or Observations. Panels: positively correlated sequence with lag 1; positively correlated sequence with lags 1 & 2; negatively correlated sequence with lag 1; positively correlated, covariance non-stationary sequence.]
T-distribution Confidence Interval
• Case 2: Y1,…,Yn are not independent
– What is the effect of this bias term?
$B = \dfrac{E[s^2/n]}{\sigma^2(\hat{\theta})} = \dfrac{n/c - 1}{n - 1}$, where $c = 1 + 2\sum_{k=1}^{n-1}\left(1 - \dfrac{k}{n}\right)\dfrac{\gamma_k}{\sigma^2}$
– For primarily positively correlated sequences B < 1, so the
half-width of the confidence interval will be too small
• Overstating the precision => make conclusions you shouldn’t
– For primarily negatively correlated sequences B > 1, so the
half-width of the confidence interval will be too large
• Underestimating the precision => don’t make conclusions you
should
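The direction of the bias can be illustrated numerically. The sketch below computes B from a list of lag autocorrelations $\gamma_k/\sigma^2$; the geometric correlation patterns ($0.5^k$ and $(-0.5)^k$) and the function name `bias_factor` are assumptions chosen only for illustration.

```python
def bias_factor(rho):
    """B = (n/c - 1) / (n - 1), where rho[k-1] = gamma_k / sigma^2 for k = 1..n-1."""
    n = len(rho) + 1
    c = 1 + 2 * sum((1 - k / n) * r for k, r in enumerate(rho, start=1))
    return (n / c - 1) / (n - 1)


n = 30
independent = [0.0] * (n - 1)                       # no correlation at any lag
positive = [0.5**k for k in range(1, n)]            # assumed positive pattern
negative = [(-0.5) ** k for k in range(1, n)]       # assumed alternating pattern

b_ind = bias_factor(independent)
b_pos = bias_factor(positive)
b_neg = bias_factor(negative)

print(b_ind, b_pos, b_neg)
```

For independent data B equals 1, so $s^2/n$ is unbiased; the positive pattern gives B below 1 and the alternating pattern gives B above 1, matching the two bullets above.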
Strategy for Terminating Simulations
• For terminating case, make IID replications
– Simulate module: Number of Replications field
– Check both boxes for Initialization Between Reps.
– Get multiple independent Summary Reports
– Different random seeds for each replication
• How many replications?
– Trial and error (now)
– Approximate no. for acceptable precision
– Sequential sampling
• Save summary statistics (e.g. average, variance)
across replications
– Statistics Module, Outputs Area, save to files
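Outside Arena, the sequential-sampling idea can be sketched as follows: start with an initial number of replications, then add one at a time until the half-width falls below a target h. Everything here is an assumption made for illustration: `run_replication` is a hypothetical stand-in model (average wait of the first 200 customers at a single-server queue that starts empty), each replication gets its own seed, and the normal critical value z replaces the t value, which is only reasonable once n is moderately large.

```python
import math
import random
import statistics


def run_replication(seed, customers=200, arrival_rate=0.8, service_rate=1.0):
    """Hypothetical terminating model: mean wait in queue over a fixed customer count."""
    rng = random.Random(seed)             # a different seed for each replication
    wait, total = 0.0, 0.0
    for _ in range(customers):
        total += wait
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        wait = max(0.0, wait + service - interarrival)   # Lindley recursion
    return total / customers


def sequential_replications(h, n0=10, max_reps=2000, alpha=0.05):
    """Add replications until the half-width is below h or max_reps is reached."""
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)   # normal stand-in for t
    results = [run_replication(seed) for seed in range(n0)]
    while True:
        n = len(results)
        half_width = z * statistics.stdev(results) / math.sqrt(n)
        if half_width < h or n >= max_reps:
            return statistics.fmean(results), half_width, n
        results.append(run_replication(seed=n))          # next unused seed


mean_wait, half_width, n_used = sequential_replications(h=0.5)
print(mean_wait, half_width, n_used)
```

The `max_reps` cap is a safeguard so the loop always ends, even if the target precision is unrealistic for the model.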
Half Width and Number of Replications
• Prefer smaller confidence intervals (precision)
• Notation:
– n = no. replications
– $\bar{Y}$ = sample mean
– s = sample standard deviation
– $t_{n-1,1-\alpha/2}$ = critical value from t tables
• Confidence interval: $\bar{Y} \pm t_{n-1,1-\alpha/2}\,\dfrac{s}{\sqrt{n}}$
• Half-width = $t_{n-1,1-\alpha/2}\,\dfrac{s}{\sqrt{n}}$
– Want this to be “small,” say < h where h is prespecified
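Half Width and Number of Replications
$Y_{11}, Y_{12}, \ldots, Y_{1m_1} \rightarrow \bar{Y}_1$
$Y_{21}, Y_{22}, \ldots, Y_{2m_2} \rightarrow \bar{Y}_2$
$\vdots$
$Y_{n1}, Y_{n2}, \ldots, Y_{nm_n} \rightarrow \bar{Y}_n$
$\bar{Y}_1, \ldots, \bar{Y}_n \rightarrow \bar{Y},\ s^2$
• To improve the half-width $t_{n-1,1-\alpha/2}\,\dfrac{s}{\sqrt{n}}$, we can
– Increase the length of each simulation run and so increase the $m_i$
– What does increasing the run length do?
– Increase the number of replications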
Half Width and Number of Replications (cont’d.)
• Set half-width = h, solve for $n = t_{n-1,1-\alpha/2}^2\,\dfrac{s^2}{h^2}$
• Not really solved for n (t, s depend on n)
• Approximation:
– Replace t by z, corresponding normal critical value
– Pretend that current s will hold for larger samples
– Get $n \approx z_{1-\alpha/2}^2\,\dfrac{s^2}{h^2}$, where s = sample standard deviation from “initial” number $n_0$ of replications
• Easier but different approximation: $n \approx n_0\,\dfrac{h_0^2}{h^2}$, where $h_0$ = half width from “initial” number $n_0$ of replications
n grows quadratically
as h decreases.
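Both approximations are simple enough to evaluate directly. The sketch below uses hypothetical function names and made-up inputs (s = 10, h = 2, and an initial run of $n_0$ = 10 replications with half-width $h_0$ = 5); rounding up to a whole number of replications is an added assumption, not something stated on the slide.

```python
import math
import statistics


def reps_normal_approx(s, h, alpha=0.05):
    """n ~ z_{1-alpha/2}^2 * s^2 / h^2, rounded up to a whole replication."""
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(z**2 * s**2 / h**2)


def reps_half_width_ratio(n0, h0, h):
    """n ~ n0 * h0^2 / h^2, rounded up to a whole replication."""
    return math.ceil(n0 * h0**2 / h**2)


# Made-up inputs for illustration only.
n_z = reps_normal_approx(s=10.0, h=2.0)
n_ratio = reps_half_width_ratio(n0=10, h0=5.0, h=2.0)
n_ratio_halved = reps_half_width_ratio(n0=10, h0=5.0, h=1.0)

print(n_z, n_ratio, n_ratio_halved)
```

Halving the target half-width from 2 to 1 roughly quadruples the required replications, which is the quadratic growth noted above.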