Introduction
Most natural processes are random. From the infinitely packed order of the Big Bang, the Universe is relentlessly drifting toward chaos, toward the total absence of life and movement. However, randomness in lines, planes and volumes at a certain scale, seen from “above”, may look ordered and neat, with well-defined lines, surfaces and volumes. Death tends to make all things the same. However, following a sequence of infinite “howevers”, we suppose that our Universe is causal, and as such everything that happens in a system, either open or closed, is supposed to be “thermodynamically” deterministic; that is, natural processes would in the end be “pseudo random”, with all events being “at large” (at God’s scale, perhaps?) predictable.
Pseudo Random Numbers: Randomness can be emulated by a computer program, for instance to generate (if binary) a sequence of (2^128 - 1) strings of 128 bits each, all different but “at large” cycling in a predetermined sequence, starting from a given “seed” and returning to the same seed after (2^128 - 1) steps. They are predictable in the sense that, given one string, you may “guess” the string m steps ahead via a mathematical transformation; that is, they are pseudo random numbers generated by a virtual machine with (2^n - 1) different states S(j) at “timesteps” j. All those states are predictable from a given seed S(0) by an expression of the following type: S(j) = (T^j).S(0), where T^j is the “j-th power” of a unitary transformation T that obtains S(j+1) from S(j).
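As a sketch of this state-transition idea, a minimal linear congruential generator makes it concrete: every state S(j) follows deterministically from the seed S(0). (This is not the 128-bit machine described above; the constants below are the well-known “Numerical Recipes” LCG parameters, used here only for illustration.)

```python
# A minimal linear congruential generator (LCG): a sketch of a
# pseudo-random machine whose every state is predictable from the seed.
# Constants are the classic "Numerical Recipes" LCG parameters.
A, C, M = 1664525, 1013904223, 2**32

def step(state):
    """One application of the transformation T: S(j) -> S(j+1)."""
    return (A * state + C) % M

def state_after(seed, j):
    """S(j) = T^j applied to S(0): fully determined by the seed."""
    s = seed
    for _ in range(j):
        s = step(s)
    return s

# Two runs from the same seed reproduce the sequence exactly.
print(state_after(42, 10) == state_after(42, 10))  # True
```

Because the modulus is 2^32, this generator cycles with period at most 2^32 steps, mirroring the fixed cycle of states described above.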
Random numbers are used in games, operations research, simulation, scientific experiments and the creation of cryptographic keys. In these cases, for obvious reasons, the numbers should be unpredictable, or at least extremely difficult to predict or unveil! Ideally, crypto keys should be created by True Random Number generators. True random numbers are those generated by a natural process, for instance the position of a gel particle undergoing Brownian motion. Most natural processes are “entropy sources” in the sense that entropy is a state function.
Each state has a definite entropy: an ice cube, while its temperature is T, has a certain entropy that depends only on that temperature, no matter how the ice cube was formed. At the other extreme of complexity we may imagine a man of today, with an entropy that depends only on his “state” as a living man, no matter how he reached that state. Leaving this ice cube in an environment at a higher temperature, let’s say 20 degrees Centigrade, it will experience a “degradation”, becoming water, dying as ice.
If we isolate the room where the ice cube sits on a table and consider both as a system, we may measure that the air around the ice cools a little, and we note that the ice cube starts to melt, becoming water that slides along the table, wasting some energy: thermal energy transforms into kinetic energy. When a new equilibrium state is attained many things have changed, no more ice cube being the most substantial change! Globally we intuit that the order of the Universe, even though infinitesimally, has been degraded. This degradation is measured by the increase of its entropy: the more entropy, the more disorder. To some extent entropy is also considered the “arrow of time”, because as we go forward in time the entropy of an isolated system can only increase or remain the same, never decrease! Entropy is, in this sense, the clock of the Universe.
Our concern is particularly related to Information Theory, where entropy means how much randomness (or uncertainty) an event has. If we have to guess a number (or any object) out of a sequence of 16, our uncertainty, our “degree of ignorance”, has a value of 16 if all numbers are equally probable. Now we are going to program a sort of “guessing game” with the aid of an informant who will provide us information in the most economical way imaginable, in order to reduce our ignorance gradually (we were tempted to say “bitwise”). First of all, to impress the “public”, we have to organize the scenario a little: the objects (letters) will be presented along a sequence in positions 0 to 15, representing pairs [position, object] as depicted below:
Position:  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Number:    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
Letter:    z   %   4   a   h   w   c   *   #   1   &   5   9   @   +   =
Let’s program the game now, supposing that we ignore the positional binary system. At most we were trained to know that a 0 means that something is in the lower half of a sequence and that a 1 means the contrary. To make things easier, our game will deal with sets having 2^n objects: 2, 4, 8, 16, 32, … 2^n, n being a natural number.
a) Someone from the public selects an object and informs our assistant of either the object or the position chosen.
b) Our assistant must provide us the minimum but sufficient information to “guess” the chosen object, of course avoiding any unfair means.
Let the chosen object be the number 11 (or the symbol &). Our assistant must provide us pieces of information that reduce our “ignorance” progressively, let’s say along a binary progression of the following type: 16 => 8 => 4 => 2 => 1. If our assistant whispers the binary sequence [1010] in our ear to inform us that the right answer is the symbol located at position 10 in decimal, we may say that we passed from an uncertainty measured by 16 to certainty, namely from 16 to 1. The same information could be obtained progressively if we previously agreed on the procedure with our assistant (as discussed above): a 1 meaning that the answer is in the upper half of the uncertainty range. So the sequence [1010] will mean:
Step 1: [1] The number is in the upper half [9 10 11 12 13 14 15 16], 8 being the level of our uncertainty after receiving the first “bit” of information.
Step 2: [0] Once the second bit is received as 0, according to the agreed procedure our uncertainty is reduced to 4, being certain that our number is in the lower half, among the numbers [9 10 11 12];
Step 3: [1] Our informant now whispers the third bit as a 1, and we then know that our number is one of two, namely [11 12];
Step 4: [0] Finally our informant whispers the fourth bit, which permits us to be certain that the right answer is 11!
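The four steps above can be sketched in a few lines of Python (an illustrative sketch; the range 1 to 16 and the upper/lower-half convention are taken from the game as described):

```python
# Sketch of the guessing game: each bit halves the uncertainty.
# 1 means "upper half of the remaining range", 0 means "lower half".
def guess(bits, low=1, high=16):
    for b in bits:
        mid = (low + high) // 2
        if b == 1:
            low = mid + 1   # answer is in the upper half
        else:
            high = mid      # answer is in the lower half
    assert low == high      # four bits resolve 16 possibilities
    return low

print(guess([1, 0, 1, 0]))  # 11
```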
Shannon defined Information Entropy as a variable related to the probabilities of events, the outcomes of a random process such as flipping a coin. If all the symbol outcomes are equally likely, then increasing the number of symbols increases the entropy.
Information: Formal Definition
Claude E. Shannon, in his Mathematical Theory of Communication (a reprint from The Bell System Technical Journal), defines the entropy H(x) of a discrete random event x, with n possible states or outcomes, as:
H(x) = - ∑ p(i).log2(p(i)), summed over i = 1, 2, …, n
where the p(i) are the probabilities of the outcomes i. In our last examples n1 = 16, with p1(i) = 1/16 for i = 1, 2, 3, …, 16 if all outcomes are equally probable, resulting in H1 = 4; and n2 = 2 (flipping a coin), with p2(Head) = ½ and p2(Tail) = ½, giving H2 = 1. In both cases H is the number of bits necessary to represent precisely the uncertainty of the guessing. Effectively, with four bits we may represent 16 equally probable outcomes, from 0000 to 1111, and with one bit the two coin-flip outcomes, 0 for Head and 1 for Tail.
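Shannon’s formula can be checked directly in a few lines of Python, reproducing both worked examples (16 equiprobable outcomes and a fair coin):

```python
# Shannon entropy H = -sum p(i) * log2 p(i).
# 16 equally likely outcomes give H = 4 bits; a fair coin gives 1 bit.
from math import log2

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([1/16] * 16))  # 4.0
print(entropy([0.5, 0.5]))   # 1.0
```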
True Random Numbers
There are many Internet services providing True Random Numbers, using natural sources like radiation, Brownian motion and quantum mechanical effects. Below we list some of them:
1. HotBits is a site hosted at Fourmilab in Switzerland. They say, textually, about themselves:
HotBits is an Internet resource that brings genuine random numbers, generated by a process fundamentally governed by the inherent uncertainty in the quantum mechanical laws of nature, directly to your computer in a variety of forms. HotBits are generated by timing successive pairs of radioactive decays detected by a Geiger-Müller tube interfaced to a computer.
The HotBits generation hardware produces data at a modest rate (about 30 bytes per second). Once the random bytes are delivered, they are immediately discarded: the same data will never be sent to any other user, and no records are kept of the data at this or any other site.
A really good source of entropy is a radioactive source. The points in time at which a radioactive source decays are completely unpredictable, and they can be sampled and fed into a computer, avoiding any buffering mechanisms in the operating system. In fact, this is what the HotBits people at Fourmilab in Switzerland are doing.
Other sources of entropy could be atmospheric noise from a radio, like that used at random.org, or even just background noise from an office or laboratory. The lavarand people at Silicon Graphics have been clever enough to use lava lamps to generate random numbers, so their entropy source not only gives them entropy, it also looks good! The latest random number generator to come online (both lavarand and HotBits precede random.org) is Damon Hart-Davis’ Java EntropyPool, which gathers random bits from a variety of sources including HotBits and random.org, but also from web page hits received by the EntropyPool’s web server.
2. The LavaRND people at Silicon Graphics maintain a site that provides random numbers from lava lamps (kidding!). They proudly announce that LavaRND is a cryptographically sound random number generator. At its heart they use not lava but small thermal noise sources, such as the CCD chips easily found in webcams.
3. Random.org. They use atmospheric noise to generate random numbers. They proudly announce themselves as the only service offering a large (16K) block of numbers at once. They tuned a radio to a frequency where nobody is transmitting.
The atmospheric noise picked up by the receiver is fed into a workstation through the microphone port, where it is sampled by a program at a given rate and the upper seven bits of each sample are discarded. The remaining bits are turned into a stream of bits which supposedly has a high entropy value. Skew correction is performed on the bit stream in order to ensure an approximately even distribution of 0s and 1s.
4. ID Quantique. Photons (light particles) are sent one by one onto a semi-transparent mirror and detected. The exclusive events (reflection / transmission) are associated with the “0” / “1” bit values. The operation of Quantis is continuously monitored to ensure immediate detection of a failure and disabling of the random bit stream. A high bit rate of 4 Mbits/sec (up to 16 Mbits/sec for the PCI card) can be obtained.
The quantum random bit generation process: see its justification.
Random Numbers Products and Manufacturers
The Marsaglia CD-ROM
One of the best sources of random numbers is the George Marsaglia CD-ROM, containing 600 megabytes of random numbers. These were produced by combining many sources (for example, rap music).
Some suspected key ideas were used in it, such as: “if we have two collections of all possible bytes, intentionally unordered (as octets), XORing them may kill all possible remaining order”. Marsaglia used three hardware sources identified by region of origin: Canada, Germany and California.
Note: Some randomness gurus have found serious failures in this generator. One source of error could be in the programs that manage the data, because of some tricky details. For instance, one of these experts found that the Canadian and German generators failed when tested, and on inspecting the series they discovered the following pattern: each byte containing a 10 was preceded by a byte containing a 13! What could have happened? One explanation: when you write a program to write bytes to disk, you must open the file as binary. If not, the program thinks you are trying to write an end-of-line, and the convention on a PC is to represent this by a carriage return (13) followed by a line feed (10).
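The telltale pattern described in the note can be checked mechanically. The sketch below (an illustrative helper, not part of any cited test suite) measures how often a 10 byte is preceded by a 13 byte; a value near 1.0 betrays text-mode carriage-return/line-feed translation:

```python
# Detect the CR/LF artifact: in genuinely random bytes, a 10 (line
# feed) should only rarely be preceded by a 13 (carriage return).
# A file written in text mode on a PC shows the pairing everywhere.
def crlf_ratio(data: bytes) -> float:
    lf_positions = [i for i in range(1, len(data)) if data[i] == 10]
    if not lf_positions:
        return 0.0
    preceded = sum(1 for i in lf_positions if data[i - 1] == 13)
    return preceded / len(lf_positions)

print(crlf_ratio(b"\x0d\x0a" * 4))    # 1.0 (every LF preceded by CR)
print(crlf_ratio(bytes(range(256))))  # 0.0 (no artificial pairing)
```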
Here are some manufacturers’ web sites:
• Colorado: http://www.comscire.com/
• Holland: http://valley.interact.nl/av/com/orion/home.html
• Canada: http://www.tundra.com/ (find the data encryption section, then look under RBG1210. My device is an NM810, which is 2?8? RBG1210s on a PC card)
• Protego, from Sweden: http://www.protego.se
• ID Quantique: http://www.idquantique.com/products/quantis.htm
Skew Correction
Recommended source: Request for Comments RFC 1750.
We are going to describe here a de-skew technique originally due to von Neumann. Suppose the probability of getting a given bit deviates from ½ by an error e, and that all bits are independent. To eliminate skew, John von Neumann suggested the following algorithm:
• Read the bits two at a time.
• Skip the 00 and 11 pairs.
• For 01 and 10, output the first bit.
This means examining the bit stream as a sequence of non-overlapping pairs. We then discard any 00 or 11 pair found, and interpret 01 as a 0 and 10 as a 1. Assume the probability of a 1 is 0.5 + e and the probability of a 0 is 0.5 - e, where e is the eccentricity (bias) of the source. Then the probability of each pair is as follows:
pair  probability
00    (0.5 - e)^2 = 0.25 - e + e^2
01    (0.5 - e).(0.5 + e) = 0.25 - e^2
10    (0.5 + e).(0.5 - e) = 0.25 - e^2
11    (0.5 + e)^2 = 0.25 + e + e^2
So the probability of a 0 in the deskewed sequence is the same as that of a 1! This technique completely eliminates any bias, but at the expense of taking an indeterminate number of input bits for any particular desired number of output bits. The probability of any particular pair being discarded is 0.5 + 2e^2, so the expected number of input bits needed to produce X output bits is X/(0.25 - e^2).
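Von Neumann’s algorithm is short enough to state directly in Python (a minimal sketch of the three steps above):

```python
# Von Neumann deskew: read bits in non-overlapping pairs, discard
# 00 and 11, and output the first bit of each 01 or 10 pair.
def deskew(bits):
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:          # keep only 01 and 10 pairs
            out.append(a)   # the first bit of the pair is the output
    return out

print(deskew([0, 0, 0, 1, 1, 0, 1, 1]))  # [0, 1]
```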
Defining Randomness
Sequences at random
Let’s consider infinite sequences x with elements in GF(2).
Definition 1: Density D(x, n)
D(x, n) = (1/n). ∑ x(i), for i < n
where the ∑ operator stands for the sum extended over i. This is the “average” number of ones. The limiting density of x is Lim(n→∞) D(x, n), if that limit exists.
A true random sequence should have a limiting density of ½ (though one might wonder why the limit should actually exist in the strict sense of convergence of mathematical analysis). In fact, one would look for a certain degree of convergence: D(x, n) should tend to ½, though not too rapidly.
Definition 2: An infinite sequence x with elements in GF(2) is Mises random if the limiting density of any subsequence (x(i(j))) is ½, where the subsequence is selected by a rule named an Auswahlregel (selection rule).
For example, all the following sequences should have limiting density ½:
x0, x1, x2, …, xn, …
x0, x2, x4, …, x2n, …
x1, x4, x7, …, x3n+1, …
x0, x1, x4, …, xn^2, …
The Auswahlregel (Selection Rule)
This rule must be defined as a function N => N without any knowledge of x: otherwise we could simply pick a subsequence of all 0’s. Now suppose we have a countable system of Auswahlregeln and our sequence passes all these tests. In other words, we have accounted for many increasing functions f: N => N, and for each of these lim(n) D((x(f(n))), n) = ½ holds. Then we may think of the sequence x as being truly random.
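A small numerical sketch makes the point: an alternating sequence has density ½, yet the selection rule f(k) = 2k (which looks only at indices, never at values) exposes it as non-random:

```python
# Density D(x, n) = (1/n) * sum of x(i) for i < n, computed both for
# the full sequence and along a selected subsequence (an Auswahlregel
# that uses only the indices, never the values of x).
def density(x, n):
    return sum(x[:n]) / n

x = [0, 1] * 500  # a trivially alternating 0/1 sequence of length 1000

print(density(x, 1000))                 # 0.5 (looks balanced overall)
even = [x[2 * k] for k in range(500)]   # selection rule f(k) = 2k
print(density(even, 500))               # 0.0 -> fails the Mises test
```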
Some random number tables
1000 numbers in the range 1 to 10^6: First Half
402731 
355255 
154123 
584556 
751892 
903723 
25205 
760565 
897040 
213990 
441393 
677505 
513222 
477979 
575753 
153682 
927392 
462215 
326299 
154300 
7113 
538794 
659094 
940086 
739828 
164042 
345218 
924871 
276389 
877453 
112344 
699253 
160452 
245731 
935663 
131555 
97529 
453863 
237865 
446778 
582918 
659913 
579583 
267023 
236868 
776264 
579878 
158904 
245705 
314755 
242246 
644370 
29066 
55101 
727869 
10294 
713041 
664440 
47941 
581262 
106445 
963934 
422305 
389524 
858656 
312743 
463920 
143510 
626996 
75941 
44558 
322874 
665393 
370611 
884583 
700677 
644060 
841021 
455419 
285948 
852984 
651258 
181086 
601009 
311878 
348734 
364168 
817977 
278107 
638529 
503425 
222469 
879648 
542972 
36080 
72762 
220882 
229886 
511935 
113689 
820528 
65764 
771115 
868606 
297175 
886905 
822500 
895128 
103563 
82383 
454158 
540267 
436079 
20744 
10822 
967651 
544366 
804219 
451866 
861483 
973190 
671425 
239898 
457862 
271177 
301561 
594764 
467726 
624052 
513146 
651157 
26436 
969007 
730913 
564395 
402834 
569136 
716368 
195069 
982773 
941901 
502563 
94654 
766071 
433906 
695429 
168568 
957576 
728272 
148213 
564964 
731952 
555924 
755902 
244080 
250173 
889551 
96071 
421689 
313064 
244314 
167124 
910755 
109523 
879437 
869055 
141570 
221321 
885127 
792362 
234977 
366080 
467802 
234934 
971705 
389998 
312822 
502094 
970228 
508835 
58146 
707416 
176557 
411654 
84764 
988424 
125248 
106819 
229198 
302001 
635963 
851857 
436919 
875096 
116619 
170512 
131842 
205997 
955607 
786409 
798168 
616861 
920186 
747493 
371193 
89771 
99944 
701803 
337795 
321898 
335517 
629836 
76756 
32623 
561271 
765490 
573580 
903601 
416345 
585323 
942989 
541129 
110975 
10323 
514031 
491242 
34238 
374835 
334675 
627431 
651885 
743818 
304934 
663833 
998674 
577361 
313107 
399227 
64378 
271688 
956407 
831546 
109956 
433681 
280359 
728984 
831667 
482074 
247478 
748427 
80184 
406763 
5193 
530165 
841365 
43432 
689502 
968111 
767587 
244725 
663521 
171253 
27660 
684025 
882715 
759566 
26623 
768598 
19665 
484674 
339912 
355548 
535437 
673437 
868915 
529508 
436557 
12535 
386206 
817160 
165361 
204483 
393140 
773686 
392295 
201825 
760986 
137368 
435325 
743865 
465678 
440884 
605953 
584710 
851201 
253401 
966675 
200924 
985525 
439910 
929779 
823883 
874702 
661457 
274790 
748398 
957099 
196545 
409026 
273619 
13936 
861360 
427853 
193261 
469031 
690423 
643138 
682794 
573367 
153231 
208994 
937108 
844273 
821546 
162671 
705495 
506970 
386180 
778256 
813997 
907508 
665302 
583885 
352004 
66542 
879896 
553498 
960091 
119094 
363836 
370235 
818862 
141234 
392104 
96156 
385213 
963452 
264686 
773125 
607811 
340552 
859759 
819024 
599492 
199309 
527948 
33991 
471319 
561111 
914422 
760311 
77303 
789400 
604991 
655287 
459663 
205745 
322275 
448294 
512560 
365314 
848266 
689923 
409228 
798852 
78319 
542320 
642772 
24522 
525587 
825543 
426495 
91718 
764801 
642175 
871543 
756602 
501042 
619124 
59665 
9357 
757087 
854838 
862245 
162107 
602905 
531981 
336645 
48774 
665606 
636871 
656134 
512035 
171716 
988495 
794742 
145965 
665021 
879351 
51008 
562383 
304588 
704141 
22833 
384694 
758538 
425808 
778008 
991437 
855172 
128354 
68051 
661493 
103890 
205699 
307198 
862970 
671014 
229046 
209262 
74240 
927183 
465301 
327361 
839239 
916390 
425635 
359064 
538934 
650876 
134409 
408439 
374379 
876828 
439344 
226248 
185475 
578841 
432027 
719581 
298798 
749018 
792228 
92700 
431475 
660083 
705749 
387091 
894808 
523139 
362575 
268843 
709018 
863779 
64615 
340339 
928097 
255587 
27048 
278296 
838362 
625096 
197433 
440369 
167165 
962867 
778557 
26203 
904885 
467099 
682499 
68030 
793724 
204422 
666646 
941242 
793207 
987888 
123860 
799254 
399158 
333009 
163327 
168446 
612715 
554773 
350460 
953904 
543950 
639771 
1000 numbers in the range 1 to 10^6: Second Half
507368 
741560 
267648 
60201 
686812 
549492 
942751 
605533 
633070 
696625 
96195 
817291 
244691 
343187 
740342 
847998 
150521 
154055 
113408 
55556 
753021 
600817 
996556 
206266 
529291 
556184 
233396 
282657 
177960 
510283 
588077 
106376 
474659 
989934 
399735 
163972 
373956 
350341 
807149 
170869 
563289 
420543 
917312 
383036 
704262 
160937 
666785 
181754 
227104 
309281 
851083 
507075 
17199 
441110 
412477 
997084 
786149 
749333 
912248 
238208 
480289 
560399 
640650 
662733 
121869 
141432 
494298 
540186 
626572 
504770 
607672 
63544 
612575 
544909 
557693 
817951 
94969 
910596 
174155 
637526 
770918 
108629 
309371 
551619 
695373 
723873 
360181 
593923 
293386 
81866 
763764 
210197 
499000 
239312 
907660 
215696 
266324 
820661 
690647 
125907 
240812 
261508 
748336 
438925 
406423 
858922 
236916 
495915 
928165 
74811 
102363 
659287 
64973 
958067 
365198 
864384 
857272 
184884 
1515 
350970 
606544 
888663 
802780 
113272 
335680 
705320 
20085 
127327 
880542 
992949 
691119 
169669 
2256 
762437 
841847 
843387 
670241 
12729 
788545 
416216 
866910 
603139 
222251 
336233 
227213 
393772 
894677 
778420 
724251 
256433 
313627 
419089 
86153 
138042 
652751 
438991 
389899 
623023 
773061 
383 
427148 
221092 
512927 
194856 
720270 
873615 
721935 
279893 
588572 
414098 
913787 
394235 
123153 
241451 
570025 
788980 
970076 
569283 
115794 
169025 
421442 
586507 
269760 
339576 
638478 
398400 
707585 
163898 
170094 
271350 
74685 
255993 
580046 
714423 
514854 
23866 
326097 
729762 
885200 
378001 
107956 
316428 
998855 
587125 
413658 
868643 
980244 
148746 
98687 
530640 
969331 
899957 
279348 
107830 
444774 
115092 
518064 
803026 
468996 
137469 
628251 
623816 
830101 
922496 
676894 
187753 
678496 
456730 
433268 
342736 
460017 
67891 
781569 
989942 
613858 
651343 
937955 
204282 
421223 
506955 
551492 
27154 
203805 
983230 
672484 
96096 
215460 
268737 
428982 
186315 
272166 
874152 
462995 
612945 
939274 
607422 
10587 
67834 
899126 
847122 
168761 
70969 
36479 
971218 
890343 
526252 
323322 
375821 
300350 
259904 
320130 
154035 
841271 
314972 
118815 
791875 
257867 
493601 
41166 
468622 
807177 
101938 
94352 
583628 
424882 
714102 
118249 
214981 
225805 
1337 
565596 
97764 
681091 
971430 
626475 
292899 
177735 
973600 
415421 
113407 
948023 
669948 
788894 
285744 
876513 
113451 
648163 
766619 
791165 
310145 
828782 
124910 
876774 
969 
989787 
671817 
733136 
203128 
120576 
522427 
373614 
56491 
371052 
843224 
780484 
330051 
117301 
941224 
911782 
278482 
644946 
534257 
234257 
579261 
422763 
350480 
576118 
528989 
255457 
167680 
182345 
106639 
490194 
284026 
545174 
820313 
369547 
342332 
399341 
404579 
87283 
608957 
891362 
745380 
410440 
161179 
335042 
997210 
46436 
679168 
252228 
960905 
195084 
727861 
222337 
282351 
972626 
941294 
519452 
501394 
791025 
744013 
975885 
426401 
484236 
116913 
934294 
190242 
362931 
923620 
244690 
32545 
605965 
607058 
148153 
513295 
80594 
797572 
959889 
631540 
252375 
322443 
944438 
490524 
181308 
450773 
706513 
454344 
741861 
982455 
51763 
130881 
205573 
619091 
245597 
7673 
491550 
129109 
543035 
165523 
593634 
197807 
614668 
275675 
146747 
673127 
382458 
272429 
198392 
771932 
615020 
36851 
694004 
270003 
660497 
689770 
845168 
800741 
616901 
89884 
524483 
173943 
564949 
721992 
470836 
729368 
525571 
27529 
488340 
465165 
567928 
757253 
196049 
61459 
109223 
605898 
89911 
641598 
657670 
105019 
323666 
653545 
408192 
879387 
101907 
725919 
425115 
65179 
404303 
211800 
151786 
360788 
610547 
616733 
499082 
336837 
797667 
523431 
111795 
460528 
478648 
208509 
78451 
535893 
837780 
123702 
388973 
116954 
897058 
73235 
700775 
311922 
390914 
594050 
843244 
98180 
70525 
423444 
119363 
361505 
511949 
559334 
383043 
982303 
250125 
551059 
450925 
152187 
815219 
803037 
256 hex pairs
79 
41 
e6 
44 
17 
de 
d9 
f8 
3b 
b2 
4f 
45 
c8 
be 
7c 
ca 
17 
09 
6c 
ee 
17 
fb 
a3 
39 
bd 
c0 
12 
e6 
50 
24 
d6 
18 
61 
69 
35 
c3 
80 
10 
a8 
9a 
4a 
e8 
b6 
0c 
6f 
06 
2c 
5f 
09 
2e 
a7 
5b 
c2 
01 
ae 
98 
8a 
77 
aa 
89 
b4 
de 
83 
ef 
f9 
2a 
af 
cd 
3f 
51 
8d 
43 
d0 
01 
d4 
dd 
aa 
43 
dc 
87 
9e 
9d 
c8 
ac 
ae 
48 
c7 
46 
99 
24 
f0 
cc 
1b 
95 
0c 
c0 
d9 
2f 
a2 
e2 
8f 
47 
70 
97 
f3 
69 
55 
4e 
35 
a8 
31 
c7 
12 
ec 
68 
e3 
d4 
74 
4c 
5a 
99 
2d 
8d 
e2 
3d 
08 
42 
68 
5c 
bc 
6b 
56 
06 
26 
7d 
68 
8e 
bb 
1e 
db 
59 
df 
11 
5d 
b8 
85 
f9 
9c 
67 
03 
08 
9a 
c8 
63 
d8 
8f 
ea 
f1 
11 
74 
4f 
e4 
29 
81 
15 
9b 
58 
68 
ed 
d7 
bf 
ec 
e5 
e9 
12 
5e 
e0 
3d 
69 
29 
f1 
e7 
ae 
7b 
a3 
5e 
7c 
a2 
02 
7f 
28 
b3 
0e 
6b 
f8 
83 
76 
1b 
04 
cd 
70 
9e 
e6 
37 
83 
b9 
f6 
ef 
01 
9b 
75 
a7 
b8 
26 
91 
ab 
ee 
62 
68 
82 
80 
74 
26 
cc 
95 
ac 
82 
d7 
20 
e0 
57 
be 
06 
5f 
78 
84 
60 
f3 
19 
80 
aa 
8b 
ce 
8e 
fa 
dc 
49 
08 
56 
c0 
c8 
ff 
52 
d7 
26 
44 
256 octal bytes
010 
250 
262 
242 
366 
012 
113 
002 
360 
376 
236 
347 
263 
317 
225 
132 
240 
301 
102 
373 
366 
212 
220 
261 
370 
312 
066 
317 
161 
071 
321 
305 
064 
367 
106 
140 
356 
107 
103 
335 
120 
226 
306 
170 
030 
172 
053 
202 
306 
133 
255 
366 
105 
130 
006 
026 
037 
115 
105 
212 
212 
037 
324 
345 
337 
315 
001 
032 
044 
275 
226 
164 
313 
163 
345 
253 
342 
327 
355 
227 
150 
331 
047 
101 
243 
025 
124 
145 
146 
327 
067 
240 
321 
016 
132 
222 
271 
301 
315 
272 
342 
346 
253 
270 
054 
233 
243 
137 
016 
373 
122 
273 
304 
313 
052 
366 
276 
024 
047 
361 
157 
106 
141 
115 
376 
332 
221 
342 
073 
156 
133 
014 
343 
161 
356 
244 
026 
251 
073 
213 
342 
332 
252 
020 
333 
132 
057 
350 
354 
173 
167 
202 
214 
154 
273 
135 
201 
105 
336 
142 
250 
024 
052 
366 
042 
151 
045 
373 
007 
200 
267 
265 
122 
201 
212 
025 
375 
036 
223 
356 
257 
174 
220 
314 
214 
325 
122 
151 
124 
204 
375 
063 
025 
352 
110 
117 
035 
376 
025 
016 
032 
305 
176 
343 
366 
304 
065 
367 
177 
160 
265 
316 
175 
141 
171 
100 
345 
375 
306 
115 
061 
060 
176 
031 
213 
254 
063 
211 
046 
074 
240 
353 
060 
341 
240 
144 
125 
274 
334 
266 
221 
053 
144 
065 
355 
313 
017 
034 
332 
304 
333 
167 
267 
116 
164 
147 
Monte Carlo Techniques
Source: Department of Physics, Drexel University, Philadelphia.
This technique is fundamentally used to simulate statistical events, and it derives from integral calculus. Let’s suppose we have to compute H by integrating a function f over a given n-space as follows:
H = ∫∫∫…∫ f(x(1), x(2), x(3), …, x(n)) dx(1).dx(2).dx(3)…dx(n)
Where the ∫’s stand for integrals defined over domains in an n-dimensional space, the elementary computation is (f.dv), f standing for an n-dimensional function and dv for the infinitesimal hypercube defined by dv = dx(1).dx(2).dx(3)…dx(n). In its discrete approximation this would mean m^n elementary computations of the core algorithm in a Cartesian space, namely m^n infinitesimal calculations of the form H <= H + f.dv along n loops of range m, where m stands for the number of sample points along each dimension, supposedly all equal. If, on the contrary, each dimension has its own particular m(i), the Cartesian space will have m(1).m(2).m(3)…m(n) points instead.
Note: That is too much for m and n values easily found in simulations: for example, n = 6 and m = 1000 give a 1000^6 = 10^18 point grid for some weather forecast models! We may navigate through this space systematically by “layers” in “brute force” mode (through all points) or following random walks. The Monte Carlo approach estimates H via random samples within the Cartesian m^n space.
One-dimensional case (1D)
For a given f defined over an interval [a, b], the computation of H will have, in pseudo code, the following form:
H = 0
Do Core from i = 0 till i = m-1, i++
Core: H = H + f(i)
H = delta.H
Where:
delta is the interval length divided into m parts, a practical infinitesimal;
f(i) is an array of m values of f along [a, b].
Then the content of H at the end of the computation will be
H = delta. ∑(i) f(i),
which reads: delta multiplied by the sum of all array values. It is the same as
H = delta.m.<f>,
where <f> is the average of f along the interval [a, b]. As delta.m is the length L of the interval [a, b], it results that
H = L.<f>
So instead of computing H via “brute force” along m elementary computations, the Monte Carlo technique tries to get a “good enough” statistical estimator of <f>. The economy rests on choosing a step larger than the “delta” step. Let’s suppose that m = 10^8 points; applying the brute force method would imply making 10^8 calculations and keeping at hand an equal number of f(i) values. What would be the resulting “error” in H if we used 1000 points instead of 10^8? Perhaps not too much for our purpose. We may then take 1000 points spaced regularly along the interval [a, b], the corresponding f values, and obtain <f> and its corresponding “variance” by the expression
var(f) = <f^2> - <f>^2,
where <f^2> means the average of the squares.
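As an illustrative sketch of the 1D case (the integrand f(x) = x^2 on [0, 1] is an assumption chosen for this example; its exact integral is 1/3), the estimator H = L.<f> and the variance <f^2> - <f>^2 can be computed from 1000 random samples:

```python
# 1D Monte Carlo sketch: estimate H = L * <f> for f(x) = x^2 on [0, 1]
# (exact value 1/3) from 1000 random sample points instead of a grid.
import random

random.seed(1)                      # fixed seed for reproducibility
a, b, N = 0.0, 1.0, 1000
samples = [random.uniform(a, b) for _ in range(N)]
fvals = [x * x for x in samples]

mean_f = sum(fvals) / N             # <f>
mean_f2 = sum(v * v for v in fvals) / N
variance = mean_f2 - mean_f ** 2    # <f^2> - <f>^2, as above

H = (b - a) * mean_f                # H = L * <f>
print(H)                            # close to 1/3
```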
Pi Estimation by Monte Carlo
One of the classical Monte Carlo demos is the determination of the number “pi” to a certain level of precision. It is based on the “quadrature” paradox of the ancient world (the Babylonians knew Pi with reasonable precision as 25/8 = 3.125, which compared with Pi = 3.1416 gives an error of 0.5%). If we integrate a very elementary f (for instance, f = 1 within the X[-1, 1], Y[-1, 1] domain and zero outside it) over a circle of radius 1, enclosed in a square of side 2, we may find the following expressions:
Area of the circle = Pi.(1)^2 = Pi
Area of the square = 2 x 2 = 4
Then the ratio (area of the circle)/(area of the square) = Pi/4 is a parameter that can be approached statistically by Monte Carlo techniques as follows. The idea is to “throw”/“pick” a zero-dimension point chosen at random within the interval [-1, 1] in both dimensions X and Y of a Cartesian space. The probability p of these points “falling” inside either figure, the circle or the square, is in relation to their areas:
p(circle) = Pi/4
p(square) = 1
We may then generate N random pairs [x, y] of coordinates, for example within the range [-0.99999, 0.99999], and simply compute a Boolean variable P telling us whether the point falls inside or outside the circle. The pseudo code could be something like the one depicted below:
H = 0
For each of the N pairs:
x <= Random[-1, 1]
y <= Random[-1, 1]
If (x^2 + y^2 <= 1) P = 1, and P = 0 otherwise
H = H + P
At the end of the computation the estimate of Pi is given by four times H/N. As this is a pseudo Bernoulli process, H follows the Binomial Distribution, which gives:
E(H) = N.p
Variance(H) = N.p.(1 - p)
In numbers, if we generate 10,000 pairs [x, y]:
N = 10,000
E(H) = 10,000 x 3.1416/4 = 10,000 x 0.7854 = 7854
totaling 7854 expected points within the circle, with a standard deviation given by:
Variance = 10,000 x 0.7854 x 0.2146 = 1685 => standard deviation = (1685)^(1/2) ≈ 41.
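The pseudo code above can be run directly; with N = 10,000 points the hit count typically lands within a few standard deviations (about 41, as computed above) of the expected 7854:

```python
# Monte Carlo pi: throw N random points in the [-1, 1] x [-1, 1]
# square and count how many fall inside the unit circle; then
# pi is estimated as 4 * H / N.
import random

random.seed(7)                      # fixed seed for reproducibility
N = 10_000
H = sum(1 for _ in range(N)
        if random.uniform(-1, 1) ** 2 + random.uniform(-1, 1) ** 2 <= 1)

pi_estimate = 4 * H / N
print(pi_estimate)                  # close to 3.1416
```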
Statistics and Tests
NIST,
http://csrc.nist.gov/rng/
The quality of random numbers can be measured in a variety of ways. One common method is to compute the information density, or entropy, of a series of numbers. The higher the entropy of a series of numbers, the more difficult it is to predict a given number on the basis of the preceding numbers in the series. A sequence of good random numbers will have a high level of entropy, although a high level of entropy does not guarantee randomness. Paradoxically, compressed files have a high level of entropy, but since their data is highly structured their randomness is not guaranteed; in fact they are considered non-random. Hence, for a thorough test of a random number generator, computing the level of entropy of the numbers alone is not enough.
Guide to Statistical Tests
http://csrc.nist.gov/rng/rng9.html
A total of sixteen statistical tests were developed, implemented and evaluated. The following describes each of these tests.

• Frequency (Monobit) Test

o The focus of the test is the proportion of zeroes and ones for the entire sequence. The purpose of this test is to determine whether the number of ones and zeros in a sequence is approximately the same as would be expected for a truly random sequence. The test assesses the closeness of the fraction of ones to ½; that is, the number of ones and zeroes in a sequence should be about the same.

• Test for Frequency within a Block

o The focus of the test is the proportion of zeroes and ones within M-bit blocks. The purpose of this test is to determine whether the frequency of ones in an M-bit block is approximately M/2.

• Runs Test

o The focus of this test is the total number of zero and one runs in the entire sequence, where a run is an uninterrupted sequence of identical bits. A run of length k means that a run consists of exactly k identical bits and is bounded before and after with a bit of the opposite value. The purpose of the runs test is to determine whether the number of runs of ones and zeros of various lengths is as expected for a random sequence. In particular, this test determines whether the oscillation between such substrings is too fast or too slow.

• Test for the Longest Run of Ones in a Block

o The focus of the test is the longest run of ones within M-bit blocks. The purpose of this test is to determine whether the length of the longest run of ones within the tested sequence is consistent with the length of the longest run of ones that would be expected in a random sequence. Note that an irregularity in the expected length of the longest run of ones implies that there is also an irregularity in the expected length of the longest run of zeroes. Long runs of zeroes were not evaluated separately due to a concern about statistical independence among the tests.
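Extracting the longest run of ones is straightforward; a sketch is below (the actual test tabulates this quantity per M-bit block and compares the distribution against theoretical frequencies, which we omit):

```python
def longest_run_of_ones(bits: str) -> int:
    """Length of the longest maximal run of ones in a 0/1 string."""
    # Splitting on "0" leaves exactly the runs of ones (and empty strings).
    return max((len(run) for run in bits.split("0")), default=0)

print(longest_run_of_ones("110010111"))  # 3
print(longest_run_of_ones("0000"))       # 0
```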

• Random Binary Matrix Rank Test

o The focus of the test is the rank of disjoint submatrices of the entire sequence. The purpose of this test is to check for linear dependence among fixed length substrings of the original sequence.
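The core computation is matrix rank over GF(2); a compact sketch is below, with each row encoded as an integer bitmask (the function name and the 3×3 examples are ours; the NIST test applies this to larger submatrices built from the bit stream):

```python
def gf2_rank(rows) -> int:
    """Rank over GF(2) of a binary matrix whose rows are int bitmasks."""
    rows = list(rows)          # do not mutate the caller's list
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            lsb = pivot & -pivot   # lowest set bit serves as the pivot column
            # Eliminate the pivot column from every remaining row.
            rows = [r ^ pivot if r & lsb else r for r in rows]
    return rank

print(gf2_rank([0b100, 0b010, 0b001]))  # 3: identity matrix has full rank
print(gf2_rank([0b110, 0b110, 0b000]))  # 1: duplicate and zero rows add nothing
```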

• Discrete Fourier Transform (Spectral) Test

o The focus of this test is the peak heights in the discrete Fast Fourier Transform. The purpose of this test is to detect periodic features (i.e., repetitive patterns that are near each other) in the tested sequence that would indicate a deviation from the assumption of randomness.
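A rough sketch of the idea, using a direct O(n²) DFT for clarity (a real implementation would use an FFT): bits are mapped to ±1, and the fraction of low-frequency peaks below the 95% threshold is reported. The function name is ours, and the threshold formula follows the commonly cited revised form:

```python
import cmath
from math import log, sqrt

def spectral_pass_fraction(bits: str) -> float:
    """Fraction of the first n/2 DFT magnitudes below the 95% peak threshold."""
    x = [1 if b == "1" else -1 for b in bits]
    n = len(x)
    t = sqrt(log(1 / 0.05) * n)  # ~5% of peaks exceed this if the bits are random
    below = 0
    for k in range(n // 2):
        s = sum(x[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
        below += abs(s) < t
    return below / (n // 2)

print(spectral_pass_fraction("11111111"))  # 0.75: the DC peak alone exceeds t
```

For a random sequence the returned fraction should be close to 0.95; strong periodic structure pushes more peaks above the threshold.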

• Non-overlapping (Aperiodic) Template Matching Test

o The focus of this test is the number of occurrences of predefined target substrings. The purpose of this test is to reject sequences that exhibit too many occurrences of a given non-periodic (aperiodic) pattern. For this test and for the Overlapping Template Matching test, an m-bit window is used to search for a specific m-bit pattern. If the pattern is not found, the window slides one bit position. For this test, when the pattern is found, the window is reset to the bit after the found pattern, and the search resumes.
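The window behaviour described above can be sketched as follows (counting only; the test then compares these counts against theoretical expectations, which we omit):

```python
def nonoverlapping_count(bits: str, template: str) -> int:
    """Count template matches; after a hit the window restarts past the match."""
    count, i, m = 0, 0, len(template)
    while i <= len(bits) - m:
        if bits[i:i + m] == template:
            count += 1
            i += m          # jump past the match: occurrences cannot overlap
        else:
            i += 1          # no match: slide the window one bit
    return count

print(nonoverlapping_count("10101", "101"))  # 1: the second "101" overlaps the first
```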

• Overlapping (Periodic) Template Matching Test

o The focus of this test is the number of occurrences of predefined target substrings. The purpose of this test is to reject sequences that show deviations from the expected number of runs of ones of a given length. Note that when there is a deviation from the expected number of ones of a given length, there is also a deviation in the runs of zeroes. Runs of zeroes were not evaluated separately due to a concern about statistical independence among the tests. For this test and for the Non-overlapping Template Matching test, an m-bit window is used to search for a specific m-bit pattern. If the pattern is not found, the window slides one bit position. For this test, when the pattern is found, the window again slides one bit, and the search is resumed.
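The contrasting window behaviour, where the window always advances a single bit even after a match, can be sketched as:

```python
def overlapping_count(bits: str, template: str) -> int:
    """Count template matches; the window slides one bit even after a hit."""
    m = len(template)
    return sum(bits[i:i + m] == template for i in range(len(bits) - m + 1))

print(overlapping_count("10101", "101"))  # 2: overlapping occurrences both count
```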

• Maurer's Universal Statistical Test

o The focus of this test is the number of bits between matching patterns. The purpose of the test is to detect whether or not the sequence can be significantly compressed without loss of information. An overly compressible sequence is considered to be nonrandom.
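A simplified sketch of Maurer's statistic: the sequence is split into L-bit blocks, the first Q blocks initialize a table of last-seen positions, and each remaining block contributes the log of the distance to its previous occurrence. One simplification to flag: the real test requires Q to be large enough that every L-bit block appears in the initialization segment; here a never-seen block falls back to its absolute position:

```python
from math import log2

def maurer_fn(bits: str, L: int, Q: int) -> float:
    """Average log2 distance between repeats of L-bit blocks (Maurer's f_n)."""
    usable = len(bits) - len(bits) % L
    blocks = [bits[i:i + L] for i in range(0, usable, L)]
    last = {}
    for i, blk in enumerate(blocks[:Q], start=1):       # initialization segment
        last[blk] = i
    total = 0.0
    for i, blk in enumerate(blocks[Q:], start=Q + 1):   # test segment
        total += log2(i - last.get(blk, 0))  # simplification: unseen blocks use i
        last[blk] = i
    return total / (len(blocks) - Q)

print(maurer_fn("0001000100", 2, 2))  # 1.0: every test block repeats at distance 2
```

Highly compressible sequences repeat blocks at short distances and therefore yield a small f_n.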

• Lempel-Ziv Complexity Test

o The focus of this test is the number of cumulatively distinct patterns (words) in the sequence. The purpose of the test is to determine how far the tested sequence can be compressed. The sequence is considered to be nonrandom if it can be significantly compressed. A random sequence will have a characteristic number of distinct patterns.
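The word count can be sketched with an LZ78-style parse (a simplification of the test's parsing; a trailing partial word is discarded, and the function name is ours):

```python
def lz_distinct_words(bits: str) -> int:
    """Number of distinct words in an LZ78-style parse of the sequence."""
    words, w = set(), ""
    for b in bits:
        w += b
        if w not in words:   # a new word is completed: record it and start over
            words.add(w)
            w = ""
    return len(words)

print(lz_distinct_words("010110"))  # 4: parse 0 | 1 | 01 | 10
```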

• Linear Complexity Test

o The focus of this test is the length of a generating feedback register. The purpose of this test is to determine whether or not the sequence is complex enough to be considered random. Random sequences are characterized by a longer feedback register. A short feedback register implies nonrandomness.
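The length of the shortest generating linear feedback shift register is computed by the Berlekamp-Massey algorithm; a compact GF(2) sketch is below (the test itself applies this per block and compares the resulting complexities against an expected distribution, which we omit):

```python
def berlekamp_massey(bits: str) -> int:
    """Length L of the shortest LFSR (over GF(2)) generating the bit sequence."""
    s = [int(b) for b in bits]
    n = len(s)
    c, b = [1] + [0] * n, [1] + [0] * n   # current and previous connection polys
    L, m = 0, -1
    for N in range(n):
        # Discrepancy between the LFSR's prediction and the actual bit.
        d = s[N]
        for i in range(1, L + 1):
            d ^= c[i] & s[N - i]
        if d:
            t = c[:]
            for i in range(n + 1 - (N - m)):
                c[N - m + i] ^= b[i]
            if 2 * L <= N:
                L, m, b = N + 1 - L, N, t
    return L

print(berlekamp_massey("0101"))  # 2: s[n] = s[n-2] suffices
print(berlekamp_massey("0001"))  # 4: a late first one forces a long register
```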

• Serial Test

o The focus of this test is the frequency of each and every overlapping m-bit pattern across the entire sequence. The purpose of this test is to determine whether the number of occurrences of the 2^m possible overlapping m-bit patterns is approximately the same as would be expected for a random sequence. The patterns may overlap.
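The ψ² statistic underlying the test can be sketched as below; the sequence is treated as cyclic so every start index yields a full m-bit pattern (the p-values are then formed from differences such as ψ²_m − ψ²_{m−1}, omitted here):

```python
from collections import Counter

def psi_squared(bits: str, m: int) -> float:
    """psi^2 statistic: (2^m / n) * sum(counts^2) - n over cyclic m-bit patterns."""
    n = len(bits)
    ext = bits + bits[:m - 1]              # wrap around so every start index counts
    counts = Counter(ext[i:i + m] for i in range(n))
    return (2 ** m / n) * sum(c * c for c in counts.values()) - n

print(psi_squared("0011011101", 3))  # ~2.8 for this 10-bit example
```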

• Approximate Entropy Test

o The focus of this test is the frequency of each and every overlapping m-bit pattern. The purpose of the test is to compare the frequency of overlapping blocks of two consecutive/adjacent lengths (m and m+1) against the expected result for a random sequence.
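A sketch of the comparison: compute the log-frequency sum φ(m) over cyclic overlapping m-bit patterns at lengths m and m+1 and take the difference. For random bits ApEn approaches ln 2 ≈ 0.693; regular sequences fall toward 0:

```python
from collections import Counter
from math import log

def _phi(bits: str, m: int) -> float:
    """Sum of p*ln(p) over overlapping m-bit patterns (sequence treated as cyclic)."""
    n = len(bits)
    ext = bits + bits[:m - 1]
    counts = Counter(ext[i:i + m] for i in range(n))
    return sum((c / n) * log(c / n) for c in counts.values())

def approximate_entropy(bits: str, m: int) -> float:
    """ApEn(m) = phi(m) - phi(m+1)."""
    return _phi(bits, m) - _phi(bits, m + 1)

print(approximate_entropy("01" * 8, 1))  # 0.0: perfectly regular alternation
```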

• Cumulative Sum (Cusum) Test

o The focus of this test is the maximal excursion (from zero) of the random walk defined by the cumulative sum of adjusted (-1, +1) digits in the sequence. The purpose of the test is to determine whether the cumulative sum of the partial sequences occurring in the tested sequence is too large or too small relative to the expected behavior of that cumulative sum for random sequences. This cumulative sum may be considered as a random walk. For a random sequence, the random walk should be near zero. For nonrandom sequences, the excursions of this random walk away from zero will be too large.
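The maximal excursion itself is simple to compute; a forward-mode sketch (the test also runs the walk in reverse, and the p-value formula over this statistic is omitted):

```python
def cusum_max_excursion(bits: str) -> int:
    """Maximal |partial sum| of the (-1, +1)-adjusted sequence (forward mode)."""
    s = peak = 0
    for b in bits:
        s += 1 if b == "1" else -1   # adjust 0 -> -1, 1 -> +1
        peak = max(peak, abs(s))
    return peak

print(cusum_max_excursion("1011010111"))  # 4: the walk drifts up to +4
```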

• Random Excursions Test

o The focus of this test is the number of cycles having exactly K visits in a cumulative sum random walk. The cumulative sum random walk is formed from the partial sums of the (0,1) sequence after it is adjusted to (-1, +1). A random excursion of a random walk consists of a sequence of n steps of unit length taken at random that begin at and return to the origin. The purpose of this test is to determine if the number of visits to a state within a random walk exceeds what one would expect for a random sequence.
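The cycle bookkeeping can be sketched as below: the walk is cut into zero-to-zero cycles and the visits to a given state x are counted per cycle (the test does this for x in {-4,…,-1, 1,…,4} and compares the counts to theoretical probabilities, which we omit):

```python
def visits_per_cycle(bits: str, x: int):
    """Visits to state x within each zero-to-zero cycle of the (-1, +1) walk."""
    s, visits, cycles = 0, 0, []
    for b in bits:
        s += 1 if b == "1" else -1
        visits += (s == x)
        if s == 0:            # the walk returned to the origin: one cycle ends
            cycles.append(visits)
            visits = 0
    if s != 0:                # pad a final return to zero, closing the last cycle
        cycles.append(visits)
    return cycles

print(visits_per_cycle("0110110101", 1))  # [0, 1, 3]: three cycles in this walk
```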

• Random Excursions Variant Test

o The focus of this test is the number of times that a particular state occurs in a cumulative sum random walk. The purpose of this test is to detect deviations from the expected number of occurrences of various states in the random walk.
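The variant's per-state count drops the cycle structure and simply totals the visits to state x over the whole walk; a sketch:

```python
def state_visits(bits: str, x: int) -> int:
    """Total visits of the (-1, +1) cumulative-sum walk to state x."""
    s = count = 0
    for b in bits:
        s += 1 if b == "1" else -1
        count += (s == x)
    return count

print(state_visits("0110110101", 1))   # 4
print(state_visits("0110110101", -1))  # 1
```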
