Pseudorandom number generator
|
A pseudorandom number generator (PRNG) is an algorithm that generates a sequence of numbers, the elements of which are approximately independent of each other.
The outputs of pseudorandom number generators are not truly random—they only approximate some of the properties of random numbers. John von Neumann emphasized this with the remark "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin."1 While "truly" random numbers can be generated using hardware random number generators, pseudorandom numbers are a critical part of modern computing, from cryptography to the Monte Carlo method for simulating physical systems. Careful mathematical analysis is required to ensure that the generated numbers are sufficiently "random;" as Robert R. Coveyou of Oak Ridge National Laboratory once titled an article, "The generation of random numbers is too important to be left to chance."2
Most such algorithms attempt to produce samples that are uniformly distributed. Common classes of algorithms are linear congruential generators, lagged Fibonacci generators, linear feedback shift registers and generalised feedback shift registers. Recent instances of algorithms include Blum Blum Shub, Fortuna, and the Mersenne Twister.
Contents |
Inherent nonrandomness
Because any PRNG run on a deterministic computer (contrast quantum computer) is a deterministic algorithm, its output will inevitably have one property that a true random sequence would not exhibit: guaranteed periodicity. It is certain that if the generator uses only a fixed amount of memory then, given a sufficient number of iterations, the generator will revisit the same internal state twice, after which it will repeat forever. A generator that isn't periodic can be designed, but its memory requirements would grow as it ran. In addition, a PRNG can be started from an arbitrary starting point, or seed state, and will always produce an identical sequence from that point on. The practical significance of this periodicity is limited. The length of the maximum period doubles with each bit of added memory. It is easy to build PRNGs with periods so long that no computer could complete one cycle in the expected lifetime of the universe. It is an open question, and one central to cryptography, whether there is any way to distinguish the output of a well designed PRNG from perfect random noise without knowing its seed. Most cryptography relies on the assumption that it is infeasible to distinguish a suitable PRNG from noise; the simplest example is a stream cipher, which (often) works by taking the exclusive or of the secret message with the output of a PRNG. The design of such PRNGs is extremely difficult, and most programs use much simpler PRNGs.
In practice, many common PRNGs exhibit artifacts which can cause them to fail statistically significant tests. These include, but are certainly not limited to:
- Shorter than expected periods for some seed states (not full period)
- Poor dimensional distribution
- Successive values are not independent
- Some bits may be 'more random' than others
- Lack of uniformity
Defects exhibited by a flawed PRNG may range from unnoticeable to ridiculous. The RANDU random number algorithm used for decades on mainframe computers was flawed, and much research work of that time is less reliable than it should have been as a result. Sometimes, but not always, modeling problems are noticed and lead to improvements in PRNGs.
Early approaches
The first PRNG was suggested by John von Neumann in 1946, known as the middle-square method. The method was very simple: take any given number, square it, and remove the middle digits of the resulting number as your "random number" and the seed for your next operation. For example, squaring the number "1111" would result in "1234321", which would provide "3432" as the "random" number. Repeating process would give you "7786" as the next result, and so on. Von Neumann used numbers of 10 digits in length, but the process was the same.
The problem with the "middle square" method is that even the best sequences must eventually repeat themselves, some doing so very quickly. Some sequences, such as "0000", will destroy the process immediately. Von Neumann was aware of such things, but for his purposes he found it sufficient and its errors easy to detect (he was worried that mathematical "fixes" would simply hide the errors rather than get rid of them). On the ENIAC computer which he was using, using the "middle square" method generated numbers at a rate some two hundred times more quickly than reading numbers from punch cards. In Von Neumann's view, hardware random number generators would not work, because if they did not record the numbers generated, they could not later be tested for errors. If they did record their numbers, they would tax the computer's memory and its ability to read and write numbers. If the numbers were written to cards, they would take too long to read. The middle-sequare method was simple and worked for his purposes, but later applications of the Monte Carlo method would require more elaborate generators.
Mersenne twister
The recent invention of the Mersenne twister algorithm, by Makoto Matsumoto and Takuji Nishimura in 1997, avoids most of these problems. It has a colossal period of 219937-1 iterations, is proven to be equidistributed in 623 dimensions (for 32-bit values), and runs faster than all but the least statistically desirable generators. It is now increasingly becoming the "random number generator of choice" for statistical simulations and generative modeling.
However, it is possible to efficiently analyze the output of the Mersenne Twister and recognize the numbers as being non-random (as with the Berlekamp-Massey algorithm or an extension from it, such as the Reed-Sloane algorithm). At this time this has no known negative effects except making the Mersenne Twister unusable as cryptographically secure PRNG.
Cryptographically secure pseudorandom number generators
- Main article: Cryptographically secure pseudo-random number generator
A PRNG suitable for cryptographic applications is called a cryptographically secure PRNG (CSPRNG). Its output should not only pass all statistical tests for randomness but satisfy some additional cryptographic requirements.
Some classes of CSPRNGs include the following:
- Stream ciphers or block ciphers running in counter or output feedback mode.
- Special designs with a security proof. For example Blum Blum Shub has a strong conditional security proof, though it is slow.
- PRNGs that have been designed specifically to be cryptographically secure. One example is ISAAC, which is fast and whose security recommendations feature, among others, a very large expected cycle time.
See also
- Pseudo-random binary sequence
- Randomized algorithm
- Random number generator
- Random number generator attack
- List of pseudorandom number generators
- Hardware random number generator
- randomness
Notes
1 "Various techniques used in connection with random digits", Applied Mathematics Series, no. 12, 36-38 (1951).
2 Peterson. Ivars. The Jungles of Randomness: A Mathematical Safari. Wiley, NY, 1998. (pp. 178) ISBN 0-471-16449-6
References
- Donald E. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms, 3rd edition (Addison-Wesley, Boston, 1998).
- J. Viega, Practical Random Number Generation in Software (http://www.acsac.org/2003/papers/79.pdf), in Proc. 19th Annual Computer Security Applications Conference, Dec. 2003.
- John von Neumann, "Various techniques used in connection with random digits," in A.S. Householder, G.E. Forsythe, and H.H. Germond, eds., Monte Carlo Method, National Bureau of Standards Applied Mathematics Series, 12 (Washington, D.C.: U.S. Government Printing Office, 1951): 36-38.
External links
- The GNU Scientific Library (http://www.gnu.org/software/gsl/). A free (GPL) C library that includes a number of PRNG algorithms.de:Pseudozufallszahlengenerator
ja:擬似乱数 ru:Генератор псевдослучайных чисел fi:Nennissatunnaislukugeneraattori zh:随机函数