Pseudo-Random Numbers: Linear Congruential Generators As

Download Report

Transcript Pseudo-Random Numbers: Linear Congruential Generators As

Pseudo-Random Numbers: Linear Congruential Generators
As digital computers are deterministic machines, they are incapable of generating random
numbers. Instead they generate large cycles of pseudo-random numbers (i.e.) numbers
which will pass most of the statistical tests for randomness. However, if the program is rerun
on a subsequent day the exact same stream of numbers will be produced. To overcome this
problem, since the chain of numbers is very large, a starting seed which uses the current
date and I.D. number of the user can be used to enter the chain at different points every time
the program is run. a pseudo-random number generator is said to be full cycle if it leaves no
gaps in the range of numbers it generates.
Form:
X0
a given “seed”
Xi+1 = ( a Xi + c ) mod m
Example
Xi+1 = 7 Xi mod 11 ->
Xi+1 = 6 Xi mod 11 ->
Example [IMSL]
[SIMSCRIPT]
[RANDU]
Xi+1 = 16807 Xi
mod (231 - 1)
Xi+1 = 6303600167 Xi mod (231 - 1)
Xi+1 = 65539 Xi
mod (231 )
Lemma.
1, 7, 5, 2, 3, 10, 4, 6, 9, 8, (1)
1, 6, 3, 7, 9, 10, 5, 8, 4, 2, (1)
a prime modulus
Xi+1 = a Xi mod m, with m prime, is full cycle
if m divides ai - 1 for i = m-1, but for no smaller i
Xi+1 = ( a Xi + c ) mod m is full cycle
if c and m are relatively prime and if p divides m then p divides a - 1,
if 4 divides m then 4 divides a - 1
Tests for Randomness
In addition to various “Run Tests” and non-parametric tests, the following are useful.
Chi-Square Test
We partition the M random numbers generated into n classes and calculate the test statistic
S ( Oj - Ej )2 / Ej ~ c2M-n-1 distribution for M large, min {Ej } > 4 and Ej ~ 4 M0.4.
Kolmogorov Smirniv Test.
Order the observations so that X1  X2  …  Xn and calculate over i = { 1, 2, … , n}
K+n =  n Max { i / n - F ( Xi ) }
K-n =  n Max { F ( Xi ) - (i - 1) / n}
If K+n or K-n are either too big or too small, reject
H0: Distribution is F
at the relevant confidence interval.
k-Dimensional Spectral Tests
1. Regard the numbers as points on [0, 1] ~ uniformly distributed
2. Regard the pairs ( Xi, Xi+1 ) as points in [0, 1]2 ~ uniformly distributed
3. Regard the triplets ( Xi, Xi+1, Xi+2 ) as points on [0, 1]3 …..
Lemma. The sequences of size n from the generator Xi+1 = a Xi mod m
fall on at most (n! m)1/n parallel hyperplanes.
The Spectral Value of a generator is the maximum distance between the hyperplanes. A
generator with a small spectral value is desirable.
Example. The diagrams show
that Xi+1 = 7 Xi mod 11
has a spectral value of 3.479,
while that of
Xi+1 = 6 Xi mod 11
is 4.919.
3.479
4.919
Newspaper Boy Problem
This is a seminal example, which shows how a
(Historical Data)
forecasting filter can use historical data to
Index Demand Freq.
optimise parameter settings in nonlinear models.
i
Xi
fi
S fi
The newspaper boy buys papers at 40p, sells
1
10
10
10
them at 50p and gets 20p back on unsold copies
2
30
25
35
at the end of the day. His “corporate strategy” is
3
70
45
80
to buy 90% of the demand that occurred on the
4
100
15
95
previous day.
5
130
5 100
Day
Random #
Buy
Demand
Sell
Return
Profit (p)
Cumulative (p)
0
70
-
1
2
0.25 0.60
63
20
22
52
22
20
41
0
-600 200
-600 -400
3
4
0.80 0.55
47
63
70
48
47
48
0
15
470 180
70 250
5
6
0.90 0.45
43
81
90
39
43
39
0
42
430 -450
680 230
F ( Xi )
0.10
0.35
0.80
0.95
1.00
Nonuniform Random Numbers - Search
In simulation models we typically disregard the first 10,000 iterations to allow the system
reach an equilibrium and then monitor the simulation over the next 100,000 iterations.
Binary search is quite an efficient algorithm for finding the simulation value relative to a
look-up key. In simulation models as this look-up facility will be used a large number of
times, it is important that the most efficient algorithm is used. Bucket search is the most
efficient algorithm known for searching tables.
Suppose
X1 < X2 < … < Xn
Place m equally spaced buckets on [0, 1], with m a power of 2 and n  m  3 n.
Let
pi = Min { j: F (Xj ) > (i - 1) / m}
qi = Min { j: F (Xj )  i / m }
Algorithm:
Generate the uniform [0, 1] random number U
Search pk pk+1, … , qk for j with F (Xj - 1) < U  F (Xj )
Return j - 1
Interpolate for the exact value X = [F(Xj) - U]Xj-1 + [U - F(Xj-1)]Xj
F(Xj) - F(Xj-1)
Example [Newspaper Boy]
Suppose U = 0.96 and m = 8, so we search {4, … , 5} for j : F( X4) < 0.96  F (X5 )
Return 4 and so X = 106, by interpolation.
Generating Nonuniform Random Numbers
1
Lemma. U uniform [0, 1] <=> Prob { U   } = 
U

Lemma. F(X) increasing => F(X) uniform [0, 1]
Prob { F(X)   } = Prob { X  F-1 ( )]
= F F-1 ( ) =  .
Lemma. F-1(U) has the distribution function F.
Prob { F-1 (U)  t} = Prob { U  F( t) } = F( t).
1
1
1
F (x)

X
F-1 (  )
Exponential.
F( x) = 1 - exp ( - l x) ,
for x  0, so
-1
F (U) = - ln ( 1 - U ) / l has an exponential distribution with rate l .
For convenience, tables of - ln ( 1 - U ) have been published. This
distribution is the kernel of simulation systems for modelling queues.
Cauchy
This is the Student t1 distribution has median m, scale s and has no mean
or higher moments. Using this distribution will result in the simulation
model crashing. The sequence in which it crashes can be used to debug
the simulation model. Later a proper random number generator can be
substituted.
F(x) = 1 / p Tan -1 [(x - m ) / s ] + 1 / 2
F-1 ( U) = s Tan [p ( U - 1/2 )] + m
`
Normal
The Box-Muller formulae generate approximate n (0, 1) pairs:
Yi = Cos ( 2 p Ui+1 )  -2 ln (Ui)
Zi = Sin ( 2 p Ui+1 )  -2 ln (Ui)
Illustration of M | M | 1
The diagram illustrates the dual nature of queuing statistics. T is the length of the simulation
run.
State
No. in the System
Expected number in the system =
Shaded Area / T.
This corresponds to numerical integration
(accumulation) and requires careful computer
3
coding.
Expected waiting time in the system =
{S Service Time Ends i - S Arrival Time i }
Number of Customers n
This is a tally operation, requiring the use of three
registers and is easy to implement.
E1 E2
E3 E4
E5
E6 T
0
1 2
3 4 5
6 Stage
Computer simulation languages, such as SIMSCRIPT, are a direct implementation of these
ideas. The one caveat that needs to be added at this time, is that writing good simulation
models requires considerable skill. In particular, unless the variance terms are controlled,
each simulation run will result in wildly differing results.