Generating Random Numbers W. Implementation in C
Random Number Generation
Biostatistics 615/815
Lecture 14
Homework 5, Question 1:
Quick Sort Optimization …
[Figure: number of comparisons (thousands) and running time (ms) plotted against M, for M from 0 to 60]
Homework 5, Question 1:
Merge-Sort Optimization
[Figure: number of comparisons (thousands) and running time (ms) plotted against M, for M from 0 to 60]
Homework 5, Question 2:
• Comparison of Hashing Strategies
    • Linear hashing
    • Double hashing
• Interesting aspects:
    • Memory dramatically impacts performance
    • In double hashing, the second hash function must be chosen carefully:
        • Specifically, it should never return 0, 1, or any multiple of the table size M
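A minimal sketch of a second hash function that respects these constraints, assuming a prime table size and unsigned integer keys. The names TABLE_SIZE, Hash1, Hash2 and Probe are illustrative and are not taken from the homework solution.

```c
#define TABLE_SIZE 1009   /* assumed prime table size M */

/* Primary hash: initial slot in 0 .. M-1. */
unsigned Hash1(unsigned key)
{
    return key % TABLE_SIZE;
}

/* Secondary hash: step size in 2 .. M-1, so it is never 0, never 1,
   and never a multiple of M. */
unsigned Hash2(unsigned key)
{
    return 2 + key % (TABLE_SIZE - 2);
}

/* Slot examined on the i-th probe (i = 0, 1, 2, ...) for this key. */
unsigned Probe(unsigned key, unsigned i)
{
    return (Hash1(key) + i * Hash2(key)) % TABLE_SIZE;
}
```

Because M is prime and the step is never a multiple of M, the probe sequence for any key visits every slot of the table before repeating.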
Today
• Random Number Generators
    • Key ingredient of statistical computing
int main()
{
int i;
• Multiplicative congruential generator: $I_{j+1} = a I_j \bmod m$
• Where
    • $I_j$ is the $j$th number in the sequence
    • $m$ is a large prime integer
    • $a$ is an integer in the range $2, \ldots, m - 1$
Rescaling
• $U_j = I_j / m$
• Good sources:
    • Numerical Recipes in C
    • Park and Miller (1988), Communications of the ACM
double Random()
{
// Advance the sequence, then rescale the new seed to the interval (0, 1)
seed = (a * seed) % m;
return seed / (double) m;
}
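To make the behaviour of the update seed = (a * seed) % m concrete, here is a toy version with deliberately tiny constants (m = 31, a = 3). These values and the names TOY_A, TOY_M, ToyRandom and ToyPeriod are assumptions chosen only so that the whole cycle can be inspected; they are far too small for real use.

```c
#define TOY_A 3
#define TOY_M 31

static int toy_seed = 1;

/* Same update as the generator above, with toy constants. */
double ToyRandom(void)
{
    toy_seed = (TOY_A * toy_seed) % TOY_M;
    return toy_seed / (double) TOY_M;
}

/* Number of calls needed before the seed returns to its starting value. */
int ToyPeriod(void)
{
    int start = toy_seed, steps = 0;

    do {
        ToyRandom();
        steps++;
    } while (toy_seed != start);

    return steps;
}
```

With these constants the seed visits every value from 1 to 30 before repeating, which is the longest cycle possible for m = 31. For practical values of m close to 2^31, the product a * seed no longer fits in a 32-bit integer, so the update cannot be coded this directly.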
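• Decompose $m = aq + r$, where
    • $q = \lfloor m / a \rfloor$
    • $r = m \bmod a$
    • $r < q$
• Then
$$
a I_j \bmod m =
\begin{cases}
a \, (I_j \bmod q) - r \lfloor I_j / q \rfloor & \text{if this quantity is} \ge 0 \\
a \, (I_j \bmod q) - r \lfloor I_j / q \rfloor + m & \text{otherwise}
\end{cases}
$$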
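The decomposition above (commonly called Schrage's method) can be checked numerically. The sketch below assumes the constants a = 16807 and m = 2147483647 and compares the 32-bit result with a direct 64-bit computation; the names LCG_A, LCG_M, LCG_Q, LCG_R, MultiplySchrage and MultiplyDirect are illustrative.

```c
#define LCG_A 16807
#define LCG_M 2147483647
#define LCG_Q (LCG_M / LCG_A)   /* q = 127773 */
#define LCG_R (LCG_M % LCG_A)   /* r = 2836   */

/* a * x mod m using only 32-bit signed arithmetic, for 0 < x < m. */
int MultiplySchrage(int x)
{
    int k = x / LCG_Q;
    int result = LCG_A * (x % LCG_Q) - LCG_R * k;

    return result < 0 ? result + LCG_M : result;
}

/* Reference value computed directly with 64-bit arithmetic. */
int MultiplyDirect(int x)
{
    return (int) (((long long) LCG_A * x) % LCG_M);
}
```

Park and Miller (1988) give a convenient check for these constants: starting from a seed of 1, the value after 10000 steps should be 1043618065.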
Random Number Generator:
A Portable Implementation
#define RAND_A 16807
#define RAND_M 2147483647
#define RAND_Q 127773
#define RAND_R 2836
#define RAND_SCALE (1.0 / RAND_M)
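double Random()
{
// Compute (RAND_A * seed) % RAND_M without overflowing a 32-bit integer
int k = seed / RAND_Q;
seed = RAND_A * (seed - k * RAND_Q) - k * RAND_R;
if (seed < 0) seed += RAND_M;
return seed * RAND_SCALE;
}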
Example: Shuffling (Part I)
if (seed == 0) seed = 1;
random_next = random_tbl[0];
}
Example: Shuffling (Part II)
double Random()
{
// Generate the next number in the sequence
int k = seed / RAND_Q, index;
seed = RAND_A * (seed - k * RAND_Q) - k * RAND_R;
if (seed < 0) seed += RAND_M;
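The shuffling fragments above are incomplete, so the following is only a sketch of how a shuffle table is typically wrapped around the basic generator. The names random_tbl and random_next come from the slides; RAND_TBL, RAND_DIV, NextSeed and SetupRandomNumbers, the table size of 32 and the warm-up count are assumptions.

```c
#define RAND_A 16807
#define RAND_M 2147483647
#define RAND_Q 127773
#define RAND_R 2836
#define RAND_SCALE (1.0 / RAND_M)
#define RAND_TBL 32                              /* assumed table size */
#define RAND_DIV (1 + (RAND_M - 1) / RAND_TBL)   /* maps a value to a table slot */

static int seed = 1;
static int random_tbl[RAND_TBL];
static int random_next = 0;

/* One step of the basic generator. */
static int NextSeed(void)
{
    int k = seed / RAND_Q;
    seed = RAND_A * (seed - k * RAND_Q) - k * RAND_R;
    if (seed < 0) seed += RAND_M;
    return seed;
}

/* Must be called once, with 0 <= s < RAND_M, before Random(). */
void SetupRandomNumbers(int s)
{
    int j;

    seed = s;
    if (seed == 0) seed = 1;

    for (j = 0; j < 8; j++) NextSeed();          /* assumed warm-up */
    for (j = RAND_TBL - 1; j >= 0; j--) random_tbl[j] = NextSeed();

    random_next = random_tbl[0];
}

double Random(void)
{
    int index;

    /* Generate the next number in the sequence */
    NextSeed();

    /* The previous output selects which table entry is returned */
    index = random_next / RAND_DIV;
    random_next = random_tbl[index];

    /* Refill the slot with the freshly generated value */
    random_tbl[index] = seed;

    return random_next * RAND_SCALE;
}
```

The table decouples the order in which values are returned from the order in which they are generated, which is intended to break up correlations between successive outputs of the basic generator.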
double Random()
{
int k, result;
k = seed1 / RAND_Q1;
seed1 = RAND_A1 * (seed1 - k * RAND_Q1) - k * RAND_R1;
if (seed1 < 0) seed1 += RAND_M1;
k = seed2 / RAND_Q2;
seed2 = RAND_A2 * (seed2 - k * RAND_Q2) - k * RAND_R2;
if (seed2 < 0) seed2 += RAND_M2;
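The fragment above advances two generators, but the constants and the final combining step are not shown. The sketch below fills them in with the values published by L'Ecuyer for a combined generator (the same ones used by ran2 in Numerical Recipes in C). Both the constants and the combining step are assumptions, not necessarily the lecture's exact code; RAND_SCALE1 is an illustrative name.

```c
#define RAND_A1 40014
#define RAND_M1 2147483563
#define RAND_Q1 53668
#define RAND_R1 12211

#define RAND_A2 40692
#define RAND_M2 2147483399
#define RAND_Q2 52774
#define RAND_R2 3791

#define RAND_SCALE1 (1.0 / RAND_M1)

static int seed1 = 1, seed2 = 1;

double Random(void)
{
    int k, result;

    /* Advance the first generator */
    k = seed1 / RAND_Q1;
    seed1 = RAND_A1 * (seed1 - k * RAND_Q1) - k * RAND_R1;
    if (seed1 < 0) seed1 += RAND_M1;

    /* Advance the second generator */
    k = seed2 / RAND_Q2;
    seed2 = RAND_A2 * (seed2 - k * RAND_Q2) - k * RAND_R2;
    if (seed2 < 0) seed2 += RAND_M2;

    /* Combine the two streams, keeping the result in 1 .. RAND_M1 - 1 */
    result = seed1 - seed2;
    if (result < 1) result += RAND_M1 - 1;

    return result * RAND_SCALE1;
}
```

Because the two moduli differ, the two sequences repeat at different points, so the combined sequence has a much longer period than either generator alone.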
• Define
    • Cumulative distribution function $F(x)$
    • Inverse cumulative distribution function $F^{-1}(x)$
• Sample $x \sim U(0,1)$
• Evaluate $F^{-1}(x)$
Example: Exponential Distribution
• Consider:
    • $f(x) = e^{-x}$
    • $F(x) = 1 - e^{-x}$
    • $F^{-1}(y) = -\ln(1 - y)$
double RandomExp()
{
return -log(Random());
}
Example: Categorical Data
• To sample from a discrete set of outcomes, use:
return outcome;
}
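Only the end of the categorical sampling function survives above. A minimal sketch of one common approach, walking the cumulative probabilities until they exceed a uniform draw, is given below. The function name SampleCategorical and its parameters are assumptions; the uniform draw u is passed in explicitly so that the function can be checked without a generator.

```c
/* Return the index of the outcome selected by a uniform draw u in [0, 1),
   given n >= 1 probabilities that sum to 1. */
int SampleCategorical(const double *prob, int n, double u)
{
    int outcome = 0;
    double cumulative = prob[0];

    /* Walk the cumulative distribution until it exceeds u; the bound on
       outcome guards against rounding error in the final sum. */
    while (outcome < n - 1 && u >= cumulative)
        cumulative += prob[++outcome];

    return outcome;
}
```

In use, u would come from a uniform generator such as Random(), so outcome i is returned with probability prob[i].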
More Useful Examples
• Numerical Recipes in C has additional examples, including algorithms for sampling from normal and gamma distributions
The Mersenne Twister
• Current gold standard random number generator
• Web: www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html
    • Or Google for “Mersenne Twister”
// After these lines, lo has the bottom 31 bits of result, hi has bits 32 and up
lo += (hi & 0x7FFF) << 16; // Combine lower 15 bits of hi with lo’s upper bits
hi >>= 15; // Discard the lower 15 bits of hi
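The two lines above appear to come from a division-free version of the a = 16807, m = 2^31 - 1 generator that splits the seed into 16-bit halves, a technique usually credited to D. Carta. The surrounding code is not shown, so the sketch below is a reconstruction under that assumption; the function name NextSeedNoDivide and the final folding step are not from the slides.

```c
#include <stdint.h>

static uint32_t seed = 1;

/* One step of seed = 16807 * seed mod (2^31 - 1), without division.
   Expects 0 < seed < 2^31 - 1. */
uint32_t NextSeedNoDivide(void)
{
    uint32_t lo = 16807u * (seed & 0xFFFF);   /* product with the low 16 bits  */
    uint32_t hi = 16807u * (seed >> 16);      /* product with the high 15 bits */

    lo += (hi & 0x7FFF) << 16;   /* Combine lower 15 bits of hi with lo's upper bits */
    hi >>= 15;                   /* Discard the lower 15 bits of hi */

    /* 2^31 is congruent to 1 modulo 2^31 - 1, so the bits above
       position 31 are simply added back in. */
    lo += hi;
    if (lo > 0x7FFFFFFF) lo -= 0x7FFFFFFF;

    seed = lo;
    return seed;
}
```

Starting from a seed of 1, this produces the same sequence as the portable implementation, including the Park and Miller check value of 1043618065 after 10000 steps.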