In my recent article “Piercing the Deepest Mathematical Mystery” posted here, I paved the way to proving a famous, centuries-old conjecture: are the digits of major mathematical constants such as π, e, log 2, or √2 evenly distributed? No one has ever managed to prove even the most basic trivialities, such as whether the limiting proportion of ‘0’ or ‘1’ in the binary expansion of any of these constants exists, or whether it oscillates indefinitely between 0% and 100%.

Here I provide an overview of the new framework built to uncover deep results about the digit distribution of Euler’s number e, discuss the latest developments, share a 10x faster version of the code, and highlight potential new research areas arising from my discovery, in LLMs, AI, quantum dynamics, high-performance computing, cryptography, dynamical systems, number theory, and more. Perhaps the most interesting part is testing LLMs and other AI tools to assess their reasoning capabilities on a fascinating math problem with no solution posted anywhere.
The LLM challenge
You can use any AI tool. In my paper, I also mention alternatives to LLMs, but all of them rely on deep neural networks. The goal is not to ask AI to solve a very tough problem and show that it cannot. Instead, you want to provide as many hints as possible, all the insights already uncovered by human intelligence, to help it succeed. Then measure success with a few metrics, and compare the tools on their ability either to come up with a final proof, or to discover deeper insights and formulas that help a human finalize the proof of a new, seminal result.
My first experiments suggest that Grok and DeepSeek do better on the first questions I asked, compared to Perplexity or OpenAI. While I describe a 2.5-petabyte dataset, any tool that can do better with much less – say, a terabyte – should get a much higher rating.

The questions I ask cover many aspects of the problem. For instance, one of them consists of assessing whether the digit sum function is gap-free, as in Figure 1 (I expect the answer to be positive), or whether we face a situation like the one in Figure 2. The latter would make a final proof more complicated. The end goal is a simple formula able to generate all the model parameters. Better yet, prove that the formula in question is correct, thus formally establishing a ground-breaking result regarding the digits of e.
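For readers who want to experiment before opening the paper, here is a minimal gap-test sketch. It assumes, for illustration only, that the digit sum function is S(n), the base-10 digit sum of (2^n + 1)^(2^n) (the gigantic integers discussed in the next section), and that “gap-free” means no integer is skipped between consecutive observed values of S(n); the precise definitions behind Figures 1 and 2 are in the paper.

```python
# Minimal sketch of a gap test -- my own illustration, not the paper's code.
# Assumption: S(n) = base-10 digit sum of (2^n + 1)^(2^n), and "gap-free" (Figure 1)
# means no integer is skipped between consecutive observed values of S(n).

def decimal_digit_sum(x: int) -> int:
    """Sum of the base-10 digits of x; a divmod loop avoids any int-to-string digit limit."""
    s = 0
    while x:
        x, r = divmod(x, 10)
        s += r
    return s

def S(n: int) -> int:
    """Digit sum of (2^n + 1)^(2^n); only feasible in this direct form for small n."""
    return decimal_digit_sum((2**n + 1) ** (2**n))

def missing_values(values):
    """Integers skipped between consecutive sorted values (empty list means gap-free)."""
    v = sorted(set(values))
    return [k for a, b in zip(v, v[1:]) for k in range(a + 1, b)]

obs = [S(n) for n in range(1, 13)]
print("S(n) for n = 1..12:", obs)
print("gaps:", missing_values(obs) or "none -- gap-free on this small range")
```

An empty gap list on a small range is evidence, not a proof; the point of the challenge is to see whether an AI tool can turn such patterns into a formula and, ultimately, a proof.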
Access the paper, Python code, and dataset
The 13-page PDF with many illustrations is available (for free) as paper 52, here. It links to a subset of the whole dataset on GitHub. It also features fast Python code (also with a link to GitHub) to deal with gigantic numbers larger than (2^n + 1)^(2^n) with n = 10^5 and to uncover patterns in their digit sum function, as well as the questions to ask AI and LLMs, the applications, how to generate the full dataset, and state-of-the-art research and references on the topic.
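As a side note on the binary expansion, the binary digit sum of such a number is just its popcount, which Python exposes directly; the snippet below is my own illustration under that reading, with small values of n, not the GitHub code that accompanies the paper.

```python
# Quick sketch, not the GitHub code from the paper: for the binary expansion, the digit
# sum of (2^n + 1)^(2^n) is simply its popcount, so int.bit_count() (Python 3.10+)
# returns it without converting the gigantic integer to a digit string.
# Small n only here; n = 10^5 requires the optimizations described in the paper.

def binary_digit_sum(n: int) -> int:
    x = (2**n + 1) ** (2**n)   # exact arbitrary-precision integer arithmetic
    return x.bit_count()       # number of '1' bits in the binary expansion

for n in range(1, 11):
    print(n, binary_digit_sum(n))
```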
To not miss future articles, subscribe to my AI newsletter, here.
About the Author

Vincent Granville is a pioneering GenAI scientist, co-founder at BondingAI.io, the LLM 2.0 platform for hallucination-free, secure, in-house, lightning-fast Enterprise AI at scale with zero weight and no GPU. He is also an author (Elsevier, Wiley), a publisher, and a successful entrepreneur with a multi-million-dollar exit. Vincent’s past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. He completed a post-doc in computational statistics at the University of Cambridge.