# Random number

In statistics, a random number is a single observation (outcome) of a specified random variable. Where no distribution is specified, the continuous uniform distribution on the interval [0,1) is usually, but not always, intended.

In an informal sense, there is some circularity in this definition, as the idea of a random variable itself rests on the concept of randomness. A number cannot be random in itself, only in the sense of how it was generated. Informally, to generate a random number means that, before it was generated, all elements of some set were equally probable as outcomes. In particular, this means that knowledge of earlier numbers generated by this process, or by some other process, does not yield any extra information about the next number. This is equivalent to statistical independence.


## Importance of random numbers

Statistical practice is based on statistical theory, which is itself founded on the concept of randomness. Many elements of statistical practice depend on the emulation of randomness through random numbers. Where those random numbers fall short of the conceptual ideal of randomness, any subsequent statistical analysis may suffer from bias. Elements of statistical practice that depend on randomness include choosing a representative sample, disguising the protocol of a study from a participant (see randomized controlled trial), and Monte Carlo simulation.

Randomness is also important in other activities such as cryptography and gambling, while pseudo-random numbers are of general importance in programming and computer science.

## Reliable sources of random numbers

### Tables of random numbers

Tables of random numbers have the desired properties no matter how numbers are chosen from the table: by row, column, diagonal or irregularly. The first such table was published by a student of Karl Pearson's in 1927, and a number of other such tables have been developed since. The first tables were generated in a variety of ways: one (by L.H.C. Tippett) took its numbers "at random" from census registers, another (by R.A. Fisher and Francis Yates) used numbers taken "at random" from logarithm tables, and in 1939 M.G. Kendall and B. Babington Smith published a set of 100,000 digits produced by a specialized machine in conjunction with a human operator. In the mid-1940s, the RAND Corporation set out to develop a large table of random numbers for use with the Monte Carlo method and, using a hardware random number generator, produced A Million Random Digits with 100,000 Normal Deviates. The RAND table used an electronic simulation of a roulette wheel attached to a computer, the results of which were then carefully filtered and tested before being used to generate the table. The RAND table was an important breakthrough in delivering random numbers because such a large and carefully prepared table had never before been available (the largest previously published table was ten times smaller), and because it was also available on IBM punch cards, which allowed for its use in computers. In the 1950s, a hardware random number generator named ERNIE was used to draw winning numbers in the British Premium Bond lottery.

The first "testing" of random numbers was developed by M.G. Kendall and B. Babington Smith in the late 1930s, and was based upon looking for certain types of probabilistic expectations in a given sequence. The simplest test checked that roughly equal numbers of 1s, 2s, 3s, etc. were present; more complicated tests counted the number of digits between successive 0s and compared the total counts with their expected probabilities. Over the years more complicated tests were developed. Kendall and Smith also created the notion of local randomness, whereby a given set of random numbers would be broken down and tested in segments. In their set of 100,000 numbers, for example, two of the thousands were somewhat less "locally random" than the rest, but the set as a whole would pass its tests. As a consequence, Kendall and Smith advised their readers not to use those particular thousands by themselves.

When a table is carefully prepared, the filtering and testing processes remove any noticeable bias or asymmetry from the hardware-generated original numbers, so that such tables provide the most "reliable" random numbers available to the casual user. Note, however, that any published table (and in fact any previously prepared table at all) is unusable for cryptographic purposes, since the existence of the public (or private) table provides a way for an attacker to break any cryptosystem using the random numbers as an input. In short, the numbers in such tables are not unpredictable; they can be stolen or copied by an attacker.

### Hardware random-number generators

Some physical phenomena, such as thermal noise in zener diodes, appear to be truly random and can be used as the basis for hardware random number generators. However, many mechanical phenomena feature asymmetries and biases that make their outcomes not truly random. The many successful attempts by gamblers to exploit such phenomena, especially in roulette and blackjack, are testimony to these effects.

There are several imaginative sources of random numbers online. Perhaps the most notable is LavaRand, which creates random numbers from images taken of a lava lamp. Random.org takes the more obvious approach of listening to atmospheric noise. Details about how each turns its input into random numbers can be found on the respective sites.

## Sources that approximate random numbers

### Pseudo-random numbers

Pseudo-random number generators (PRNGs) are algorithms that can automatically create long runs (for example, millions of numbers long) with good random properties, but eventually the sequence repeats exactly (or the memory usage grows without bound). One of the most common PRNGs is the linear congruential generator, which uses the recurrence

$$X_{n+1} = (aX_n + b) \bmod m$$

to generate numbers. The sequence can contain at most m distinct values before it repeats, where m is the modulus. See the article on the linear congruential generator for more details. Another, much earlier method of generating random numbers was the so-called middle-square method. It is simple to understand but is not a good generator: you take the previous number, square it, and extract the middle digits of the square to use as the next number.
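Both recurrences can be sketched in a few lines of Python. This is a minimal illustration rather than a recommended generator: the function names, the toy parameters a = 5, b = 3, m = 16, and the four-digit width for the middle-square method are assumptions chosen only to keep the numbers small.

```python
from itertools import islice


def lcg(seed, a, b, m):
    """Yield X_{n+1} = (a * X_n + b) mod m indefinitely, starting from seed."""
    x = seed
    while True:
        x = (a * x + b) % m
        yield x


def middle_square(seed, digits=4):
    """Yield middle-square values: square the number, zero-pad the square
    to 2 * digits characters, and keep the middle `digits` characters."""
    x = seed
    start = digits // 2
    while True:
        squared = str(x * x).zfill(2 * digits)
        x = int(squared[start:start + digits])
        yield x


# Toy parameters (illustrative assumption): with m = 16 this generator
# visits every value 0..15 once before the sequence repeats.
small_cycle = list(islice(lcg(seed=0, a=5, b=3, m=16), 16))

# Middle-square from the seed 1234: 1234^2 = 01522756 -> 5227, and so on.
ms_values = list(islice(middle_square(1234), 3))

# A weakness of the method: 2500^2 = 06250000, whose middle is 2500 again.
stuck = next(middle_square(2500))
```

With these toy parameters the linear congruential generator reaches its maximum period of m values, while the middle-square generator can become stuck on a fixed point such as 2500, which is one reason it is considered a poor generator.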

Pseudo-random numbers are very useful in developing Monte Carlo simulations, as debugging is facilitated by the ability to run the same sequence of random numbers again by starting from the same seed. They are also used in cryptography, provided that the seed is kept secret and the generator itself is cryptographically strong. Sender and receiver can then generate the same set of numbers automatically to use as keys.
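The reproducibility that makes seeded generators convenient for debugging can be sketched with Python's standard `random` module. The function name `estimate_pi` and the sample sizes are illustrative assumptions; the point is only that the same seed gives the same run.

```python
import random


def estimate_pi(samples, seed):
    """Monte Carlo estimate of pi: the fraction of uniform points in the
    unit square that fall inside the quarter circle, multiplied by 4."""
    rng = random.Random(seed)  # private generator, so the seed fully fixes the run
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y < 1.0:
            inside += 1
    return 4.0 * inside / samples


first_run = estimate_pi(10_000, seed=1)
repeat_run = estimate_pi(10_000, seed=1)   # same seed: identical sequence and result
other_run = estimate_pi(10_000, seed=2)    # different seed: an independent-looking run
```

Because `first_run` and `repeat_run` consume exactly the same sequence, any bug observed in one run can be replayed in the other.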

#### Random enough

The generation of pseudo-random numbers is an important and common task in computer programming. While cryptography and certain numerical algorithms require a very high degree of apparent randomness, many other operations need only a modest amount of unpredictability. Simple examples include presenting a user with a "Random Quote of the Day" or determining which way a villain might move in a computer game. Weaker forms of randomness are also closely associated with hash algorithms and with amortized searching and sorting algorithms.
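The distinction between "random enough" and cryptographic use might look like this in Python, whose standard library offers both a general-purpose `random` module and a `secrets` module intended for security-sensitive values. The quote list and both function names are hypothetical, introduced only for illustration.

```python
import random
import secrets

# Hypothetical data for the example.
QUOTES = [
    "Fortune favours the prepared mind.",
    "Chance is a word void of sense.",
    "The die is cast.",
]


def quote_of_the_day(day_number):
    """Modest unpredictability is enough here: a seeded general-purpose
    generator gives every visitor the same quote on a given day."""
    rng = random.Random(day_number)
    return rng.choice(QUOTES)


def new_session_token():
    """Security-sensitive values should come from the operating system's
    randomness source rather than a seeded PRNG."""
    return secrets.token_hex(16)  # 16 random bytes as 32 hexadecimal characters


todays_quote = quote_of_the_day(7)
token = new_session_token()
```

The design choice is that the quote selection is deliberately reproducible, whereas the token must not be predictable from any seed an attacker could guess.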

### Hardware random-number generators

Many mechanical methods of generating random numbers tend to be unreliable. Hardware random number generators need much care to ensure adequate mixing and should be checked for randomness before use.

## Testing random numbers

The first tests for random numbers were published by M.G. Kendall and B. Babington Smith in the Journal of the Royal Statistical Society in 1938. They were built on statistical tools such as Pearson's chi-square test, which were developed to determine whether experimental phenomena matched their theoretical probabilities (Pearson originally developed his test by showing that a number of dice experiments by W.F.R. Weldon did not display "random" behavior).

Kendall and Smith's original four tests were hypothesis tests, which took as their null hypothesis the idea that each number in a given random sequence had an equal chance of occurring, and that various other patterns in the data should also be distributed equiprobably.

Their first test, the frequency test, was very basic: it checked that there were roughly the same number of 0s, 1s, 2s, 3s, etc. The second test, the serial test, did the same thing for sequences of two digits at a time (00, 01, 02, etc.), comparing their observed frequencies with the frequencies expected if they were equally distributed. The third test, the poker test, tested for certain sequences of five numbers at a time (aaaaa, aaaab, aaabb, etc.), based on hands in the game of poker. The fourth test, the gap test, looked at the distances between 0s (00 would be a distance of 0, 010 would be a distance of 1, 02250 would be a distance of 3, etc.). If a given sequence was able to pass all of these tests within a given degree of significance (generally 5%), then it was judged to be, in their words, "locally random". Kendall and Smith differentiated "local randomness" from "true randomness" in that many sequences generated with truly random methods might not display "local randomness" to a given degree: very large sequences might contain long runs of a single digit. Such a run might be "random" on the scale of the entire sequence, but a smaller block containing it would not be "random" (it would not pass their tests) and would be useless for a number of statistical applications.
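The frequency test and the gap measurement described above can be sketched as follows. This is an illustration in the spirit of those tests, not a reconstruction of Kendall and Smith's actual procedure: the function names are assumptions, and the critical value 16.919 is the upper 5% point of the chi-square distribution with 9 degrees of freedom.

```python
import random
from collections import Counter

# Upper 5% point of the chi-square distribution with 9 degrees of freedom
# (ten digit classes minus one).
CHI2_CRIT_9DF_5PCT = 16.919


def frequency_chi2(digits):
    """Chi-square statistic comparing the observed count of each digit 0-9
    with the count expected if all ten digits were equally likely."""
    counts = Counter(digits)
    expected = len(digits) / 10
    return sum((counts[d] - expected) ** 2 / expected for d in range(10))


def gaps_between_zeros(digits):
    """Number of digits lying between each pair of successive 0s,
    so 00 gives 0, 010 gives 1 and 02250 gives 3."""
    gaps = []
    last_zero = None
    for i, d in enumerate(digits):
        if d == 0:
            if last_zero is not None:
                gaps.append(i - last_zero - 1)
            last_zero = i
    return gaps


# Apply the frequency test to 10,000 pseudo-random digits.
rng = random.Random(2024)
sample = [rng.randrange(10) for _ in range(10_000)]
statistic = frequency_chi2(sample)
passes_frequency_test = statistic < CHI2_CRIT_9DF_5PCT
```

A sequence consisting of a single repeated digit produces a very large statistic and fails, while a perfectly balanced sequence gives a statistic of zero. A full gap test would go on to compare the observed gap lengths with their expected probabilities, which this sketch omits.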

As random number sets became more and more common, more tests of increasing sophistication were used. Some modern tests plot random digits as points in three-dimensional space, which can then be rotated to look for hidden patterns. In 1995, the statistician George Marsaglia created a set of tests known as DIEHARD, which he distributed with a CD-ROM of 5 billion pseudorandom numbers. (http://stat.fsu.edu/pub/diehard/)

## References

• M.G. Kendall and B. Babington Smith, "Randomness and Random Sampling Numbers," Journal of the Royal Statistical Society 101:1 (1938), 147-166.
