Random Number Generators: True Randomness

Apr 16

11:00

2006

Karim J A Brathwaite

A survey of various random number generating algorithms are studied and presented. The presented algorithms are compared and contrasted for their various efforts to recreate the events of randomality. This brings into question, the essence of randomality and the feasibility of its pursuit by these and other algorithms.

I Introduction:

Random (adj): a: lacking a definite plan, purpose, or pattern. b: made, done, or chosen at random c: relating to, having, or being elements or events with definite probability of occurrence. d: being or relating to a set or to an element of a set each of whose elements has equal probability of occurrence. [Oxford English Dictionary]

Before commencing deep discussion of the art of “true randomality”, it must first be made clear that true randomness is theoretically impossible by the defining principals of our universe. The definition above clearly defines the paradox that surrounds the concept of randomality when subject to probability. In essence we will claim that “truly random” is the state in which for a given set A, for any i, element in A, i if chosen at random, has a probability of [1/|A|] (where |x| denotes cardinality of the set) of occurrence. This is how we judge the “randomness” of a Random Number Generator (RNG(s)), by its ability to exploit probability; given a set A, a perfect RNG will not repeat an element before the set is exhausted; described as the period of a generator, its point of repetition.

I.i Various Uses of Random Number Generators:

Random numbers have a multitude of applications. Of particular interest to this study and intended future studies by the author is Cryptography. Many cryptographic protocols make use of RNGs, particularly, public key cryptography (RSA) and some implementations of symmetric ciphers (DES, Serpent). Besides cryptographic function however, RNGs are used in Simulations, for the realistic recreation of “natural” occurrences; in this case, computer games are qualified as simulations, in which RNGs are heavily used in conjunction with probability weights (Gaussian). They’re also used for integrity testing on various computer applications during development, even hardware tests, such as GPU to memory pipelines on AGP based graphic cards. Among those mentioned are many other uses and purposes for the development and “perfection” of making truly random choices.

I.ii Brief Algorithm Introduction:

Random number generating algorithms come in multiple flavors; these can be separated into two main groups, linear number generators (LNG(s)) and non-linear number generators (NLNG(s)). Each group contains multiple types of RNGs and each of these, have their purpose and uses. It is important to know that although not all generators are made equal, good generators have purposes for which other good generators are not suited to perform.

II Linear Congruential Generators:

LNGs ultimately generate a sequence of integers between 0 and a given modulus, for the equation Ij+1 = aIj + c (mod m). In this equation, m is used as the modulus, a is a multiplier, and c is an increment. The sequence will repeats within a period of m, where m is usually prime.

The advantage of the LNG is immediately noticed by the equation above; its fast, requiring one multiplication, one addition, and one modulus. This immediately lends itself to explaining its wide uses. This type of generator is used in a multitude of applications for which the nature of the random sequence doesn’t matter, only that it be a different sequence from execution to execution; Monte Carlo simulations for example. The speed and simplicity of the algorithm has also led to the amount of flavors developed for different applications. For instance, the equation above is the same equation that the ANSI C/C++ committee has dubbed for use with these languages, given appropriate values for a, c, and m. Besides the linear equation, there are also:

Quadratic LCG:
Xn = (aXn-12 + bXn-1 + c) mod m

Cubic LCG:
Xn = (aXn-13 + bXn-12 + cXn-1 + d) mod m

It is possible to force an LCG to be probabilistically correct, by creating uniform deviations. The problem with focusing on the deviation of an LCG is that a well deviated LCG will have an extended period, but its complexity increases, due to the deviation, sometimes beyond the useful range of LCGs. It is also possible to use deviations normalized in a given interval, such as Gaussian deviations, where the period is lengthened such that every integer in the given field is selected.

III Inversive Congruential Generators and Linear Feedback Shift Registers:

Defined as our non-linear generators, ICGs and EICGs, developed by Eichenauer and Lehn (1986, 1993) are defined by the following congruence:

yn+1 = ayn + b (mod m)

Where, for practicality, m is also chosen as a prime number in parallel to LCGs. These two generators are most notable due to their speed and period efficiency compromise. They produce relatively long periods, in (5 – 10x average) more time, than LCG.

III.i Linear Feedback Shift Registers:

LFSR generators are of more relevance here, as a suitable alternative to LCGs for cryptographic applications. Shift registers work on the concept of generating on the bit level, instead of in a base 10 finite field. They address the problem of creating uniform distribution of single random bits, with 0 and 1 equally probable.

LFSR generators are made up of two parts, the shift register and the feedback function. The shift register is the bit sequence, in which the size of the shift register is determined by the length of its bits. When bits are needed, the register is shifted 1 bit to the right, and the left bit is computed based on the remaining bits in the register. The simplest method of representing LFSRs feedback function is the XOR of key register bits. The method used by the feedback function is called tap sequence.

It’s obvious, that an n-bit LFSR can be in 2n -1 states; noting that the size of the register denotes its length. Therefore, a 4 bit register (1111 would be called a 4 bit register, based on the cardinality of the register, rather than the value of the high order bit, base 10) would yield 24-1, or 15 unique states. With an output sequence of the least significant bits following the shifts, the size of the register defines a period or 2n – 1 bits before repetition, therefore yielding a base 10 value of 22**n -1 as maximum value, and period. In order to reach this maximum period, a primitive polynomial must be formed by the tap sequence, where the degree of the polynomial yields the length of the shift register.

III.i.a Primitive Polynomials in Linear Feedback Shift Registers

A primitive Polynomial is defined as an irreducible polynomial P of degree d in [Z/p[x]] (denoting finite field of p[x] in Z) is primitive if P divides xd – 1 but does not divide xi – 1 for any integer i with 0 < i < d. Therefore, a polynomial P of degree d is primitive if and only if:

xn = 1 mod P
but,
xi != 1 mod P for 0 < i < d

It is not necessary to check all smaller exponential values than d, but only possible values from the divisors of degree d.

There is no easy method for generating primitive polynomials modulo 2 for a given degree. Therefore, choosing random polynomials and testing if its primitive is the most used method.

III.i.b Maximizing and Reaching the Period of an LFSR

As stated above, the period of any LFSR is based on the size of the shift register; the size of the shift register is based on the degree of the polynomial, therefore the period of the LFSR is determined by the polynomial used. Primitive polynomials are necessary here because if the polynomial used is not primitive, the period will be shortened, and there may be bad states which may shorten the period further. Therefore choosing a primitive polynomial is of dominant importance to any LFSR.

Given the polynomial, x8+x4+x3+x2+x which is primitive modulo 2, a maximal period can be produced. The first exponent is the length of the shift register, and all exponents except 1 and 0 are used to specify the tap sequence by the feedback function, where low degree terms correspond to the left most register bits. Therefore, in a given 8-bit shift register, a new bit is generated by XORing the 8th, 4th, 3rd and 2nd bits of the register together. The LFSR that results from this polynomial will have a precise period of 28-1.

IV Discussion:

The Linear Congruential Generators we presented here for an ease into the concept of random number generation, and for study of a variety of methods of generation. However, in the domain of cryptography, LCGs are all but irrelevant. LFSR, and their “cousins” Non-Linear Feedback Shift Registers are of more relevance to the authors interest. Besides the methods described above, another interesting method of generating random numbers are also worth brief mention.

Symmetric block ciphers are also capable of producing randomality. Beginning with block product ciphers like LUCIFER and DES, the use of block ciphers to produce random bits, although many times slower than LCG and even LFSR, yields considerably random bit sequences. The more recent successor to DES, AES (Rijndael algorithm) and even the runner up contender to AES, Serpent, have also been used for the generation of random bits.

Essentially, the philosophical debate of the concept of randomality displays the obvious difficulty in constructing a truly random algorithm. The fact that most random generation algorithms are based on random sequences, be they number sequences or bit sequences, undermines the term random from the very first line of code and comments justifying its feasibility. However, random sequences are frequently used, and make amply use of the ignorance factor. As long as new methods of seeding and new primitive polynomials are found, for which we have an infinite supply, the number generators above will remain in use. The possible existence of a truly random number algorithm is questionable, but with many algorithms and sequences still remaining unbroken and uncharted, that existence cannot be denied.

Article "tagged" as: