ISE525: Generating Random Variables

Download Report

Transcript ISE525: Generating Random Variables

ISE525: Generating Random
Variables
Sources of randomness in a computer?
• Methods for generating random numbers:
– Time of day (Seconds since midnight)
– 10438901, 98714982747,
87819374327498,1237477,657418,
– Gamma ray counters
– Rand Tables
– CEFT generator
• Closed Eye Finger Typing
Pseudo Random Numbers
• Pseudo random numbers:
– Generated algorithmically.
– Can be replicated with original parameters.
LCGs
• A linear congruential generator (LCG) represents one
of the oldest and best-known pseudorandom number
generator algorithms.
• The generator is defined by the recurrence relation:
– Xn+1 = (a Xn + c) modulo m
• where Xn is the sequence of pseudorandom values, and
• The period of a general LCG is at most m, and for some
choices of a much less than that. The LCG will have a
full iff:
– (a) andc are are relatively prime,
– a-1 is divisible by all prime factors of m
– a – 1 is a multiple of 4 if m is a multiple of 4.
LCGs continued
• While LCGs are capable of producing decent
pseudo random numbers, they are extremely
sensitive to the choice of the coefficients c, m,
and a.
• Historically, poor choices had led to ineffective
implementations of LCGs. A particularly
illustrative example of this is RANDU which was
widely used in the early 1970s and resulted in
many results that are currently in doubt because
of the use of this poor LCG.
Multiply With Carry Generator
• Simple to implement.
• In its most common form, a lag-r MWC
generator requires a base b, a multiplier a,
and a set of r+1 random seed (starting) values,
consisting of r residues of b i.e. {x0, x1, x2 ,...,
xr−1}, and an initial carry cr−1 < a.
Mersenne Twisters
• Mersenne primes: 243,112,609 − 1
• Mersenne twisters:
– It has a period of 219937 − 1
– It has a very high order of dimensional
equidistribution.
– It passes numerous tests for statistical
randomness.
• Diehard tests
Diehard Tests
• Tests for random number generators:
– Birthday spacings: The spacings between the
points should be asymptotically Poisson
distributed.
DieHard Tests
• Overlapping permutations: Analyze
sequences of five consecutive random
numbers. The 5! possible orderings should
occur with statistically equal probability.
• Parking lot test: Randomly place unit circles in
a 100 x 100 square. If the circle overlaps an
existing one, try again. After 12,000 tries, the
number of successfully "parked" circles should
follow a certain normal distribution.
Random Parking Problem
• In 1958, a Hungarian mathematician, Alfréd Rényi, presented an
interesting problem: "We place at random on the interval (0, x ) unit
intervals not having points in common. What is the expected
number of intervals we can thus place on (0, x )?". By assuming the
unit interval a car on the road, the problem is often called "random
car parking problem". Rényi (1958) solved this problem analytically
and has obtained the solution
lim( M ( x) / x)  0.747590
x 
• Rényi, A. (1958) On a one-dimensional problem concerning random
space-filling: Publications of Mathematical Institute of Hungarian
Academy of Sciences, 3, 109-127.
DieHard Tests
• Runs test: Generate a long sequence of
random floats on [0,1). Count ascending and
descending runs. The counts should follow a
certain distribution.
• Overlapping sums test: Generate a long
sequence of random floats on [0,1). Add
sequences of 100 consecutive floats. The sums
should be normally distributed with
characteristic mean and sigma.
DieHard Tests
• Minimum distance test: Randomly place 8,000
points in a 10,000 x 10,000 square, then find the
minimum distance between the pairs. The square of
this distance should be exponentially distributed
with a certain mean.
• Random spheres test: Randomly choose 4,000
points in a cube of edge 1,000. Center a sphere on
each point, whose radius is the minimum distance to
another point. The smallest sphere's volume should
be exponentially distributed with a certain mean.
DieHard Tests
– The squeeze test: Multiply 231 by random floats
on [0,1) until you reach 1. Repeat this 100,000
times. The number of floats needed to reach 1
should follow a certain distribution.
– Monkey tests: Treat sequences of some number
of bits as "words". Count the overlapping words in
a stream. The number of "words" that don't
appear should follow a known distribution.
Multiply and Carry RNG
xn  (axnr
 axnr  cn1 
 cn1 ) mod b, cn  
,
n

r
,

b


Approaches for RN generation
•
•
•
•
•
Inverse transform
Composition
Convolution
Acceptance-rejection
Special properties
Inverse transform methods
• Suppose X is continuous with cumulative distribution function (CDF) F(x) =
P(X <= x) for all real numbers x that is strictly increasing over all x.
• Step 2 involves solving the equation F(X) = U for X; the solution is written
• X = F – 1(U), i.e., we must invert the CDF F
• Inverting F might be easy (exponential), or difficult (normal) in which case
numerical methods might be necessary (and worthwhile—can be made
“exact” up to machine accuracy)
• Algorithm:
– 1. Generate U ~ U(0, 1) (random-number generator)
– 2. Find X such that F(X) = U and return this value X
Inverse Transform Method
• Uniform [a,b]
ITM
• Exponential distribution:
x
f ( x)  e , x  0
ITM (Weibull)
ITM (Weibull)
ITM (Discrete cases)
Algorithm:
1. Generate U ~ U(0,1) (random-number generator)
2. Find the smallest positive integer I such that U <= F(xI)
3. Return X = xI
ITM (Discrete case)
ITM (Discrete case)
Simulation of a deck of cards
Composition method for generating
random variables
• This applies when the distribution function F
from which the RN is desired can be expressed
as a convex combination of other distribution
functions, F1 , F2 , F3 etc.
• Specifically, we assume for all x, F(x) can be
written as:


j 1
j 1
F ( x)  p j Fj ( x), where p j  0,  p j  1
• This also works if :


j 1
j 1
f ( x)  p j f j ( x), where p j  0,  p j  1
Composition
• In general, the composition algorithm is:
– Generate a positive random integer J such that
• P(J=j) = p_j for j = 1,2,…
• Return X with distribution function Fj
• Example: the double-exponential (or Laplace)
distribution has the density f(x) = 0.5 e|x| for
all x > 0.
Laplace distribution
Composition
• We can express the density as:
f ( x)  0.5ex I( ,0)  0.5e x I[0,) ( x)
• Where IA denotes the indicator function of the set
A, defined by:
 1 if x  A
I A ( x)  
 0 otherwise
• So, the algorithm for generating laplacian
functions is:
– First, generate U1 and U2 as IID U(0,1). If U1 < .5,
return X = ln(U2 ), else return X = -ln(U2 )
Example using Excel
Example 8.4
• For 0<a<1, the right trapeziodal distribution
has the density:
a  2(1  a) x if 0  x 1
f ( x)  
0 otherwise

Convolutions
• If X = Y1 + Y2 + Y3 + Y4 + Y5 …
• Use the following algorithm for generating X:
– Generate Y1, Y2, Y1, …, Ym IID with distribution
function G.
– Return X = Y1 + Y2 + …+ Ym
• Example:
– Erlang: the m-Erlang random variable X with mean
can be defined as the sum of m IID Random 
variables with a common mean of  / m
Acceptance-Rejection
• Specify a function t that majorizes the density
function f, i.e.
t ( x)  f ( x)  x
• Now,


c   t ( x)   f ( x)  1

• However, 
r ( x)  t ( x) / c
is a density function.
Acceptance Rejection
• The algorithm for generating random variables
using acceptance rejection is:
– Generate Y having the density r.
– Generate U ~ U(0,1) independent of Y
– If U <= f(Y)/t(Y), return X=Y; otherwise try again.
• Example: Consider the beta(4,3) distribution
on the unit interval:
60 x3 (1  x)2 if 0  x 1
f ( x)  
0 otherwise

60 x3 (1  x)2 if 0  x 1
f ( x)  
0 otherwise

2.5
t  2.0736
2
1.5
1
0.5
1
4
7
10
13
16
19
22
25
28
31
34
37
40
43
46
49
52
55
58
61
64
67
70
73
76
79
82
85
88
91
94
97
100
0
Inverse Transform Method ?
Acceptance Rejection
x0
0
 1
r ( x)  
0  x 1
 2.0736
 1
otherwise
2.0736 if 0  x 1
t ( x)  
 0 otherwise
So, we first generate Y~U(0,1).
f(Y) = 60*Y^3(1-Y)^2/2.0736
If U < f(Y), return Y, else reject Y and regenerate
12
10
8
6
Series1
4
2
0
1
2
3
4
5
6
7
8
9
10
Acceptance Rejection continued
• The acceptance rejection is used mostly for generating
bounded random variables – although if the variable is
not bounded, it can still be used.
• The choice of the majorizing function is critical for the
efficiency of the method.
• The majorizing function can be composed of segments
of several functions – piecewise linear for instance.
2.5
2
1.5
1
0.5
1
4
7
10
13
16
19
22
25
28
31
34
37
40
43
46
49
52
55
58
61
64
67
70
73
76
79
82
85
88
91
94
97
100
0
Acceptance Rejection continued
• Adding a minorizing function
2.5
2
1.5
1
0.5
0
1
5
9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 77 81 85 89 93 97 101
Pros and cons ?
Special Properties
 2 distribution.
• If Y ~N(0,1), Y2 has a
2
2


• If Z1 , Z2 … Zm ~ 1 then Y=Z1 + Z2 + … + Zm ~ m
• We exploit the special relationships between
distributions to generate random variables.
Generating continuous RVs
•
•
•
•
Uniform:
Exponential:
M-Erlang:
Gamma
Generating Normal Variates
• Normal:
N(m,s2 ) can be obtained from N(0,1 ).
– Box and Muller method:
•
•
•
•
Generate U1 and U2
X1 = sqrt(-2 ln U1) cos(2 PI U2)
X2 = sqrt(-2 ln U2) sin( 2 PI U2)
What if an LCG is used for U1 and U2 ?
Generating Normal Variates
• Generate U1 and U2. Vi = 2Ui -1.
• W=V12 +V22 .
• If W > 1, go back to step 1, else let
Y  (2ln W ) / W
• X1 = V1*Y1, X 2 = V2*Y2
• X1 and X2 are N(0,1)
• Excel comparison:
Bezier Distributions
• Beziers are spline curves that fit arbitrary
distributions, given information about
percentiles.
• They can also be constructed to match moments
of a given distribution.
Bezier curves
• Given points P1 and P2 , a linear Bézier curve
is simply a linear bezier curve is the straight
line between the two points:
B(t )  P0  t ( P1  P0 )  (1  t ) P0  tP1, t [0,1]
• A cubic bezier is
B(t )  (1  t )3 P0  3(1  t )2 tP1  3(1  t )t 2 P2  t 3 P3 , t [0,1]
Generating Bezier Variates
Triangular Distributions
x
0 x2

f ( x)   2

0 otherwise
x 0
0
 2
x
F ( x)  
0 x2
4

 1 otherwise
1
0
1
2
Triangular Distributions
1
2 x3
 2 ( x  2)

f ( x)   1
x
3 x  6
 2 (2  3 )

0
otherwise

x2
0
1
 ( x  2) 2
2 x3

F ( x)   4
 1 ( x 2  12 x  24) 3  x  6
 12

1
otherwise

Special Discrete Distributions
• Arbitrary Discrete Distribution:
– Generate U ~U(0,1)
– Return the nonnegative integer X = 1, satisfying:
i 1
I
j 0
j 0
 p( j )  U   p( j )
Geometric distribution
• Number of failures before a success.
p( X  x)  p(1  p)
x 1
p( X  x)  1  (1  p)
ln(r )
x
ln(q)
x
Generating Poisson RVs
• algorithm Poisson Random Number (Knuth):
init:
Let L ← e−λ, k ← 0 and p ← 1.
• do:
k ← k + 1.
Generate uniform random number u in [0,1] and let
p ← p × u.
• while p ≥ L.
• return k − 1.
Example of Poisson Distribution
Generation
Generating Binomial RVs
If Yi is geometrically distributed with parameter p, then the smallest integer x
x 1
for which  yi  n is binomially distributed with parameters (p,n).
i 1
Random Processes
•
•
•
•
Markov Chains
Semi-Markov Chains
Renewal Processes
Martingales
Properties of Markov Chains
Generating random transition matrices
Example: