SCOUG OS/2 For You - May 1993
Random Number Generators and the SCOUG Raffle Program
i.e., Is Our Raffle Fair?
by Michael Lavender
My raffle program has had a lot of ribbing over how random it really is.
I've viewed this as just joking around, but I was asked to make a
presentation on the creation of the raffle program. Since that would
basically be "how Mike programs," I've decided to broaden the subject and
write about randomness in general and programming in particular.
The real purpose for random number generators is to test out statistical
theories for flaws. One would take a random number creator (dice, pack of
cards, coins, etc.) and produce the necessary number of numbers for the
test. Since it's possible to need as many as 100,000 numbers for a test,
it would not surprise me to find out that a large number of economics and
statistics graduate students changed majors to something less repetitious
and more lifeaffirming then rolling dice all week; something like
accounting.
The definition of a pseudorandom number sequence is as follows: "An
ordered set of numbers that has been determined by some defined arithmetic
process but is effectively a random number sequence for the purpose for
which it is required." The series need not be naturally random. But, it is
necessary that:
Every number must have the same chance of appearing as every
other number.
This means, in the examples above, if you used cards to pick your numbers,
the cards would have to be shuffled for each and every draw. The other
feature of a good pseudorandom sequence is the period length of the
sequence. If a random series only lasts a few numbers before repeating,
its use in complex statistical theories is negligible.
Is Random Random?
The definition also brings up the question of whether the random
generators we take for granted (dice, coins, cards, etc.) are truly
random. People, after some practice, can make a coin come up heads or
tails. There is very little randomness in most card shuffling. And, raffle
tickets in a bag are usually picked by ignoring the top layer and staying
away from the sides of the bag, as if that made the selection more random.
Two different products appeared early in the century to alleviate graduate
students from the humdrum of existence (unless their professor liked doing
things the old-fashioned way): books with nothing but random numbers and
machines that rolled dice or picked cards or whatever. The books were
bought by both scientists and spies. The machines were used to pick
lottery winners as well as help graduate students. Once computers were
invented, the idea of a mathematical solution to the problem was sought.
Technically Speaking...
In 1946, the idea of using squares of numbers in a unique way was put
forward by John von Neumann. If a number is squared, the resultant number
will have either twice, or one less than twice, the number of digits as
the original. Those who use(d) slide rulers might remember. The idea was
to take the middle numbers of the square: so if you started with a number
of four digits and squared it, you would take the middle four digits from
the resulting seven or eight digit number, and you would end up with a
number that was for all intents random. Although it worked fairly well
with numbers of many digits, if the number was of few digits, the series
would repeat itself in a short while, which isn't desirable.
D. H. Lehmer, in 1949 wrote a mathematical equation to produce random
numbers:
Xn+1 = (aXn + c) mod m
where X is a number in a random series, m is the range of possible numbers
in the series, and a, c, and X0 are arbitrary numbers between 0 and m.
The possible range for the random set should, in a computer, be the size
of the computer's word. A word, in computerese, is the number of bits run
by the computer in a cycle. In the 32 bit world of OS/2, this means 232,
or 4 thousand million.
The period length goal is the size of m, which the book's author proved is
the longest possible period obtainable from the equation above. How to
possibly achieve this? First a must be larger than or equal to 2, or the
series will not be random-like at all. Another variable, b, is introduced
and described as one less than a. b then is defined as a multiple of prime
denominators of m (in computers, a multiple of 2) and c is a number
relatively prime in relation to m (in computers, prime would be any odd
number). From there, it is a matter of picking out numbers, checking the
output for repeating series, and then comparing the output with formulas
that try to determine randomness.
What does all this mean to you?
Well, not much, since most of us neophyte programmers use the random
number generators in the programming language we work in. BASIC and REXX
both have such generators, and I have been using them with the hope that
the programmers have followed good testing procedures. Researching this
article has got me thinking, however, about designing a random number
generator as a learning project in C or C++ - but that's another issue.
In REXX's random number generator, there is a parameter called the "seed."
The seed can be viewed as the starter of the generator, or, in the
equation above, X0. The random number generator will run exactly the same
series for each particular seed. The advantage is that this allows for
controlled experiments on programs.
For our raffle program, the seed allows for recovery from a breakdown.
Thus, if the system should go down, temporary files keeping the seed, the
number of steps run in the generator, and other information allow the
program to resume where it left off and keep going as if nothing had gone
wrong. It also means that the seed must be different each time.
The seed in REXX must be an integer. Nowhere in the on-line manual does it
say what the upper limit is. I ran a program that told me that nine digit
numbers will not work in REXX. This limits me to construct a hopefully
random eight digit or less seed.
Randomness is Fostered with Lots of Calculations
To make the program random, I put in the time, date, number of days since
January 1, Year 1, and a random file pick from the OS2 directory. That
file was then converted to numbers and the collection of numbers were
added and multiplied and divided to attain a final number of seven or
eight digits.
The creation of the membership sign up program adds to the randomness of
the raffle program. My earlier versions of the raffle program used an
alphabetic list of members. Even though that list is changing each month,
the same seed might produce a very similar list of winners between two
raffles. Now, however, the membership sign up outputs the list in the same
order as the members sign in. This produces a nearly perfect random list,
that with a variable seed, will - hopefully - create a truly random
raffle.
The Southern California OS/2 User Group
P.O. Box 26904
Santa Ana, CA 92799-6904, USA
Copyright 1994 the Southern California OS/2 User Group. ALL RIGHTS
RESERVED.
SCOUG is a trademark of the Southern California OS/2 User Group.
OS/2, Workplace Shell, and IBM are registered trademarks of International
Business Machines Corporation.
All other trademarks remain the property of their respective owners.
|