# Detecting Subtle Departures from Randomness

Entitled “Detecting Subtle Departures from Randomness”, the full version in PDF format is accessible in the “Free Books and Articles” section, here.

Figure 1 below shows two plots arising from two different, non-periodic pseudo-random sequences of +1 and -1 in about equal proportions, distributed seemingly randomly. The left one is associated to prime numbers, assumed to be distributed somewhat randomly. The right one is based on the binary digits of the square root of 2, also assumed to be distributed somewhat randomly.

The test proposed here clearly magnifies the lack of randomness of the first sequence, while unable to find patterns in the latter. The implications are significant: prime numbers are not distributed as randomly as most people think. A consequence is that pretty much all congruential pseudo-random number generators, which rely on properties of the prime numbers in one way or another, do not produce perfectly random sequences, and may be hacked. This includes the Mersenne twister implemented in Python and other programming languages. It makes many cryptographic systems unsecure, unless they are reinforced.

I discuss, in simple English, how to detect weak deviations from randomness, and workarounds to get better random-looking and unbreakable sequences. The theoretical background is related to the famous unsolved Riemann Hypothesis in number theory. This article also offers a strong, state-of-the-art summary on this topic, accessible to machine learning practitioners or beginners, and to decision makers in the field. The topic is usually explained in obscure jargon or inane generalities. To the contrary, this article will intrigue you with the beauty and power of this theory.

## Abstract

I discuss a new test of randomness for pseudo random number generators (PRNG), to detect subtle patterns in binary sequences. The test shows that congruential PRNGs, even the best ones, have flaws that can be exacerbated by the choice of the seed. This includes the Mersenne twister used in many programming languages including Python. I also show that the digits of some numbers such as the square root of 2205, conjectured to be perfectly random, fail this new test, despite the fact that they pass all the standard tests. I propose a methodology to avoid these flaws, implemented in Python. The test is particularly useful when high quality randomness is needed. This includes cryptographic and military-grade security applications, as well as synthetic data generation and simulation-intensive Markov chain Monte Carlo methods.

The origin of this test is in number theory and connected to the Riemann Hypothesis. In particular, it is based on Rademacher stochastic processes. These random multiplicative functions are a number-theoretic version of Bernoulli trials. My article features state-of-the-art research on this topic, as well as an original, simple, integer-based formula to compute square roots to generate random digits. It is offered with a Python implementation that handles integers with millions of digits.

Introduction

Strong Pseudo-random numbers

• New test of randomness for PRNG’s
• Theoretical background: the law of the iterated logarithm
• Connection to the Generalized Riemann Hypothesis

Testing well-known sequences

• Reverse-engineering a pseudo-random sequence
• Illustrations

Python code

• Fixes to the faulty random function in Python
• Prime test implementation to detect subtle flaws in PRNG’s
• Special formula to compute 10 million digits of square root of 2

References