|
Introduction
Most nature processes are at random. From an infinite packaged order -the Big Bang- the Universe is relentlessly going to chaos, total absence of life, of movement. However randomness in lines, planes and volumes at a certain scale, seen from “upwards” may look like ordered, neat, with well defined lines surfaces and volumes. Dead tend to make all things the same. However, following a sequence of infinite “however”, we suppose that our Universe is causal and as such everything that happens in a system, either open or closed, is supposed to be “thermodynamically” deterministic, that is nature processes would be at last “pseudo random”, with all events being “at large” (at God scale perhaps?) predictable.
Pseudo Random Numbers: Randomness could be emulated by a computer program, for instance to generate (if binary) a sequence of (2^128 -1) strings of 128 bits each, all different but “at large” cycling in a predetermined sequence, from a given “seed” to go back to the same seed after (2^128-1) steps. They are predictable in the sense that given one string you may “guess” the string m steps ahead via a mathematic transformation, that is they are pseudo random generated by a virtual machines with (2^n – 1) different states S(j) at “time-steps” j’s. All those states are predictable from a given “seed” S(0) by an expression of the following type: S(j) = (T^j).S(0) where T^j is the “j power” of a unitary transformation T0 that obtain S(j+1) from S(j).
Random numbers are used for games, operations research, simulation, scientific experimenting and for the creation of crypto keys. In these cases, by obvious reasons, numbers should be unpredictable or at least extremely difficult to predict/unveil!. Ideally crypto keys should be created by True Random Numbers generators. True random numbers are those generated by nature process, for instance the position of a gel particle within a Brownian movement. Most nature processes are “entropy sources” in the sense that entropy is a state function.
Each state, for instance an ice cube, meanwhile its temperature is T has certain entropy that depends only of the temperature, no matter how the ice cube was formed. In the other extreme of complexity we may imagine a man of today with entropy that would depends on his “state” as a man alive, no matter how he achieves this state. Leaving this ice cube in an environment at a higher temperature, let’s say 20 Centigrade degrees, it will experiment a “degradation” becoming water, dying as ice.
If we isolate the ice cube and the room where the ice cube is on a table and consider both as a system we may measure that the air around the ice will frozen a little and we note that the ice cube start to melt becoming water that slides along the table wasting some energy: thermal energy transforms in kinetic energy. When a new equilibrium state is attained many things changed, no more ice cube being the most substantial change!. Globally we intuit that the order of the Universe, even though infinitesimally, has been degraded. This degradation is measured by the increase of its entropy, the more entropy the more the disorder is. In some extent entropy is also considered the “arrow of time” because as we go forward in time the entropy of an isolated system can only increase or remains the same, never decrease!. Entropy is in this concern the clock of the Universe.
Our concern is particularly related to Information Theory where entropy means how much randomness (or uncertainty) an event has. If we have to guess a number (or any object) out of a sequence of 16 our uncertainty “degree of ignorance” has a value of 16 if all numbers are equally probable. Now we are going to program a sort of “guessing game” with the aid of an informant that will provide us information -in the most economic way imagined- in order to reduce our ignorance gradually (we were tempted to say “bitwise”). First of all to impress the “public” we have to organize a little the scenario: the objects (letters) will be presented along a sequence in positions 0 to 15, representing pairs [position object] as depicted below:
|
Position |
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
|
Number |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
16 |
|
Letter |
z |
% |
4 |
a |
h |
w |
c |
* |
# |
1 |
& |
5 |
9 |
@ |
+ |
= |
Let’s program now the game supposing that we ignore the positional binary system. At most we were trained to know that a 0 means that something is in the lower half of a sequence and that a 1 means the contrary. To make things easier our game will deal with sets having 2^n objects, 2, 4, 8, 16, 32… 2^n, being n a natural number.
a) Someone of the public selects an object and informs our assistant either the object or the position chosen.
b) Our assistant must provide us the minimum but sufficient information to “guess” the chosen object, of course avoiding any unfair form.
Be the object chosen the number 11 (or the symbol &). Our assistant must provide us pieces of information to reduce our “ignorance” progressively; let’s say along a binary progression of the following type: 16 => 8 => 4 => 2 => 1. If now our assistant whisper the binary sequence [1010] in our ears to inform us that the right answer is the symbol located in position 10 in decimal we may say that we passed from an uncertainty measured by 16 to the certainty, namely from 16 to 1. The same information could be obtained progressively if previously we agreed with our assistant the procedure (as discussed above) : 1 meaning that the answer is in the upper half of the uncertainty range. So the sequence [1010] will mean:
Step 1: [1] the number is in the upper half [9 10 11 12 13 14 15 16] being 8 the level of our uncertainty after receiving the first “bit” of information.
Step2: [0] Once received the second bit as 0, according to the agreed procedure, our uncertainty is now reduced to 4, being certain that our number will be in the lower half, among numbers [9 10 11 12];
Step 3: [1] Our informant whisper now the third bit as a 1 and we know then that our number will be one of two, namely [11 12];
Step 4: [0] finally our informant whisper the fourth bit that permit us to be certain that the right answer is 11!.
Shannon defined Information Entropy as a variable related to the probabilities of events, outcomes of a random process like for instance flipping a coin. If all the symbol outcomes are equally likely then increasing the number of symbols should the entropy increases.
Information - Formal definition
Claude E. Shannon
in its Mathematical Theory of Communications (a reprint from The Bell System Technical Journal) defines entropy H(x) in terms of a discrete random event x, with n possible states or outcomes as:
Where p(i) are probabilities of outcomes i’s. In our last examples n1=16, p1(i) = 1/16 for i=1, 2, 3,…, 16, if all outcomes were equally probable resulting H1=4, and if n2=2, (flipping a coin) p2(0-Head)= ½ , p2(1-Tail) = ½ , giving H1=1, being H in both cases the number of bits necessary to represent precisely the uncertainty of the guessing. Effectively with four bits we may represent 16 equally different outcomes, from 0000 to 1111 and with one bit the two flipping of coin outcomes, 0 for Head and 1 for Tail.
True Random Numbers
There are many Internet services providing True Random Numbers, using natural sources like radiation, Brownian movement, and quantum mechanics effects. Below we list some of them:
1. HotBits is a site from Switzerland at FermiLab. They said textually about themselves:
HotBits is an Internet resource that brings genuine random numbers, generated by a process fundamentally governed by the inherent uncertainty in the quantum mechanical laws of nature, directly to your computer in a variety of forms. HotBits are generated by timing successive pairs of radioactive decays detected by a Geiger-Müller tube interfaced to a computer.
The
HotBits generation hardware
produces data at a modest rate (about 30 bytes per second). Once the random bytes are delivered, they are immediately discarded--the same data will never be sent to any other user and no records are kept of the data at this or any other site.
A really good source of entropy is a radioactive source. The points in time at which a radioactive source decays are completely unpredictable, and can be sampled and fed into a computer, avoiding any buffering mechanisms in the operating system. In fact, this is what the HotBits people at Fermilab in Switzerland are doing.
Another sources of entropy could be atmospheric noise from a radio, like that used here at
random.org
, or even just background noise from an office or laboratory. The
lavarand
people at Silicon Graphics have been clever enough to use lava lamps to generate random numbers, so their entropy source not only gives them entropy, it also looks good! The latest random number generator to come online (both lavarand and HotBits precede random.org) is Damon Hart-Davis'
Java EntropyPool
which gathers random bits from a variety of sources including HotBits and random.org, but also from web page hits received by the EntropyPool's web server.
2. The
LavaRND people
at Silicon Graphics maintain a site that provides random numbers from lava lamps (kidding!). They proudly announce that LavaRND is a cryptographically sound random number generator. At its heart, they use not lava but some small thermal noise sources like some chips easy to find in Web cams like CCD.
3.
Random.org
. They use atmospheric noise to generate random numbers. They proudly announce themselves as the only service which offers a large (16K) block of numbers at once. They tuned a radio into a frequency where nobody is transmitting.
The atmospheric noise picked up by the receiver is fed into a workstation through the microphone port where it is sampled by a program at a given rate and the upper seven bits of each sample are discarded. The remaining bits are turned into a stream of bits with supposedly has a high entropy value. Skew Correction is performed on the bit stream, in order to ensure that there is an approximately even distribution of 0s and 1s.
4.
ID Quantique
. Photons - light particles - are sent one by one onto a semi-transparent mirror and detected. The exclusive events (reflection - transmission) are associated to "0" - "1" bit values. The operation of Quantis is continuously monitored to ensure immediate detection of a failure and disabling of the random bit stream. A High bit rate of 4Mbits/sec (up to 16Mbits/sec for PCI card) could be obtained.
The Quantum random bits generation process. See
its justification
.
Random Numbers Products and Manufacturers
The Marsaglia CD-ROM
One of the best sources of random numbers is the
George Marsaglia
CD-ROM containing 600 megabytes of random numbers. These were produced by many sources (for example rap music).
Some suspected key ideas were used on it such as “if we have two collections of all possible bytes intentionally unordered (as octets) XOR them may kill all possible remaining order”. Marsaglia used three hardware sources identified by region origin: Canada, Germany and California.
Note: Some randomness gurus have found serious failures to this generator. One source of error could be in programs that manage data because some tricky details. For instance one of these experts found that Canadian and German generators failed when tested and inspecting the series they discover the following pattern: each byte containing a 10 was preceded by a byte containing a 13!. What could have happened?. One explanation was: when you write a program to write bytes to disk, you must open it as binary. If not the program thinks you are trying to write an end-of-line and the convention on a PC is to represent this by a carriage return (13) followed by a line feed (10).
Here are some manufacturers’ web sites:
-
• Colorado
http://www.comscire.com/
-
• Holland
http://valley.interact.nl/av/com/orion/home.html
-
• Canada
http://www.tundra.com/
(find the data encryption section, then look under RBG1210. My device is an NM810 which is 2?8? RBG1210s on a PC card)
-
• Protego from Sweden,
http://www.protego.se
-
• ID Quantique,
http://www.idquantique.com/products/quantis.htm
Skew Correction
Recommended source: Request for Comments,
RFC 1750
We are going to describe here a De-Skew technique, originally due to von Neumann. Suppose the probability of getting a 0 is equal to ½ plus a given error e and that all bits are independent. To eliminate skew, John von Neumann suggested to following algorithm:
• Read the bits two at a time.
• Skip 00’s and 11’s pairs.
• For 01 and 10 outputs the first bit.
Meaning to exam a bit stream as a sequence of non-overlapping pairs. We could then discard any 00 or 11 pairs found, interpret 01 as a 0 and 10 as a 1. Assume the probability of a 1 is 0.5 + e and the probability of a 0 is 0.5 - e where e is the eccentricity of the source. Then the probability of each pair is as follows:
|
pair
|
probability
|
|
00
|
(0.5 – e)^2 = 0.25 – e – e^2
|
|
01
|
(0.5 – e)*(0.5 + e) = 0.25 – e^2
|
|
10
|
(0.5 + e)*(0.5 - e) = 0.25 – e^2
|
|
11
|
(0.5 + e)^2 = 0.25 + e + e^2
|
So the probability of a 0 in the de-skewed sequence is the same as for a 1!. This technique will completely eliminate any bias but at the expense of taking an indeterminate number of input bits for any particular desired number of output bits. The probability of any particular pair being discarded is 0.5 + 2e^2 so the expected number of input bits to produce X output bits is X/(0.25 - e^2).
Defining Randomness
Sequences at random
Let’s consider infinite sequences of x’s pertaining to a GF2
.
Definition 1: Density D(x, n)
D(x, n) = 1/n. ∑ (x(i)), for i<n
Where the ∑ operator stands for Summa extended over i. This is like the “average” of ones. The limiting density of x’s is Lim(n)D(x, n) if that limit exists.
A true random sequence should have a limiting density ½ (though one might wonder why the limit should actually exist in the strict sense of convergence in mathematic analysis sense).
In fact, one would look for a certain degree of convergence: the D(x, n) should tend to ½
even though not too rapidly.
Definition 2: An infinite sequence of x’s pertaining to a GF2 is Mises random if the limiting density of any subsequence (xij) is ½ where the subsequence is selected by a rule named Auswahlregel.
For example, all the following sequences should have limiting density ½.
x0, x1, x2, . . . , xn, . . .
x0, x2, x4, . . . , x2n, . . .
x1, x4, x7, . . . , x3n+1, . . .
x0, x1, x4, . . . , xn2, . . .
Auswahlregeln Rule
This rule must be defined as a function N => N without any knowledge of x: otherwise
we can simply pick a subsequence of all 0’s. Now suppose we have a countable system of Auswahlregeln and our sequence passes all these tests. In other words, we have accounted for many increasing functions f : N => N and for each of these: lim(n)D((xf(n)), n) = ½ apply. Then we can think the sequence of x’s as being true random.
Some random numbers tables
1000 numbers ranged between 1 and 10^6 - First Half
|
402731 |
355255 |
154123 |
584556 |
751892 |
903723 |
25205 |
760565 |
897040 |
213990 |
|
441393 |
677505 |
513222 |
477979 |
575753 |
153682 |
927392 |
462215 |
326299 |
154300 |
|
7113 |
538794 |
659094 |
940086 |
739828 |
164042 |
345218 |
924871 |
276389 |
877453 |
|
112344 |
699253 |
160452 |
245731 |
935663 |
131555 |
97529 |
453863 |
237865 |
446778 |
|
582918 |
659913 |
579583 |
267023 |
236868 |
776264 |
579878 |
158904 |
245705 |
314755 |
|
242246 |
644370 |
29066 |
55101 |
727869 |
10294 |
713041 |
664440 |
47941 |
581262 |
|
106445 |
963934 |
422305 |
389524 |
858656 |
312743 |
463920 |
143510 |
626996 |
75941 |
|
44558 |
322874 |
665393 |
370611 |
884583 |
700677 |
644060 |
841021 |
455419 |
285948 |
|
852984 |
651258 |
181086 |
601009 |
311878 |
348734 |
364168 |
817977 |
278107 |
638529 |
|
503425 |
222469 |
879648 |
542972 |
36080 |
72762 |
220882 |
229886 |
511935 |
113689 |
|
820528 |
65764 |
771115 |
868606 |
297175 |
886905 |
822500 |
895128 |
103563 |
82383 |
|
454158 |
540267 |
436079 |
20744 |
10822 |
967651 |
544366 |
804219 |
451866 |
861483 |
|
973190 |
671425 |
239898 |
457862 |
271177 |
301561 |
594764 |
467726 |
624052 |
513146 |
|
651157 |
26436 |
969007 |
730913 |
564395 |
402834 |
569136 |
716368 |
195069 |
982773 |
|
941901 |
502563 |
94654 |
766071 |
433906 |
695429 |
168568 |
957576 |
728272 |
148213 |
|
564964 |
731952 |
555924 |
755902 |
244080 |
250173 |
889551 |
96071 |
421689 |
313064 |
|
244314 |
167124 |
910755 |
109523 |
879437 |
869055 |
141570 |
221321 |
885127 |
792362 |
|
234977 |
366080 |
467802 |
234934 |
971705 |
389998 |
312822 |
502094 |
970228 |
508835 |
|
58146 |
707416 |
176557 |
411654 |
84764 |
988424 |
125248 |
106819 |
229198 |
302001 |
|
635963 |
851857 |
436919 |
875096 |
116619 |
170512 |
131842 |
205997 |
955607 |
786409 |
|
798168 |
616861 |
920186 |
747493 |
371193 |
89771 |
99944 |
701803 |
337795 |
321898 |
|
335517 |
629836 |
76756 |
32623 |
561271 |
765490 |
573580 |
903601 |
416345 |
585323 |
|
942989 |
541129 |
110975 |
10323 |
514031 |
491242 |
34238 |
374835 |
334675 |
627431 |
|
651885 |
743818 |
304934 |
663833 |
998674 |
577361 |
313107 |
399227 |
64378 |
271688 |
|
956407 |
831546 |
109956 |
433681 |
280359 |
728984 |
831667 |
482074 |
247478 |
748427 |
|
80184 |
406763 |
5193 |
530165 |
841365 |
43432 |
689502 |
968111 |
767587 |
244725 |
|
663521 |
171253 |
27660 |
684025 |
882715 |
759566 |
26623 |
768598 |
19665 |
484674 |
|
339912 |
355548 |
535437 |
673437 |
868915 |
529508 |
436557 |
12535 |
386206 |
817160 |
|
165361 |
204483 |
393140 |
773686 |
392295 |
201825 |
760986 |
137368 |
435325 |
743865 |
|
465678 |
440884 |
605953 |
584710 |
851201 |
253401 |
966675 |
200924 |
985525 |
439910 |
|
929779 |
823883 |
874702 |
661457 |
274790 |
748398 |
957099 |
196545 |
409026 |
273619 |
|
13936 |
861360 |
427853 |
193261 |
469031 |
690423 |
643138 |
682794 |
573367 |
153231 |
|
208994 |
| |