Math-free, Parameter-free Gradient Descent in Python

The full version of this article, in PDF format, is accessible in the “Free Books and Articles” section, here.

I discuss techniques related to the gradient descent method in 2D. The goal is to find the minima of a target function, called the cost function. The values of the function are computed at evenly spaced locations on a grid and stored in memory. Because of this, the approach is not directly based on derivatives, and there is no calculus involved. It implicitly uses discrete derivatives, but first and foremost, it is a simple geometric algorithm. The learning parameter typically attached to gradient descent is explicitly specified here: it is equal to the granularity of the mesh and does not need fine-tuning. In addition to gradient descent and ascent, I also show how to build contour lines and orthogonal trajectories with the exact same algorithm.
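To make the idea concrete, here is a minimal sketch of a grid-based descent. It is not the code from the PDF: the function names (`grid_descent`, `bowl`), the 8-neighbor rule, and the test function are assumptions chosen for illustration. The cost function is evaluated once at each grid node, and each move goes to the lowest neighboring node, so the step length is simply the mesh granularity.

```python
def grid_descent(f, x_min, y_min, step, n, start):
    """Walk downhill on an n x n grid until no neighboring node is lower.

    Hypothetical helper for illustration; the article's algorithm may differ.
    """
    # Precompute the cost function at evenly spaced grid nodes.
    z = [[f(x_min + i * step, y_min + j * step) for j in range(n)]
         for i in range(n)]
    i, j = start
    path = [(i, j)]
    while True:
        best = (i, j)
        # Compare the current node with its (up to) 8 neighbors.
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                a, b = i + di, j + dj
                if 0 <= a < n and 0 <= b < n and z[a][b] < z[best[0]][best[1]]:
                    best = (a, b)
        if best == (i, j):  # no strictly lower neighbor: local minimum on the grid
            break
        i, j = best
        path.append(best)
    # Convert grid indices back to (x, y) coordinates.
    return [(x_min + a * step, y_min + b * step) for a, b in path]


def bowl(x, y):
    # Assumed test function with a single minimum at (1, -0.5).
    return (x - 1.0) ** 2 + (y + 0.5) ** 2


path = grid_descent(bowl, -2.0, -2.0, 0.1, 41, start=(2, 38))
print(len(path), path[-1])
```

Replacing the `<` comparison with `>` turns the same loop into gradient ascent. No derivative is ever computed, and the only tuning choice is the mesh size.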

I apply the method to investigate one of the most famous unsolved problems in mathematics: the Riemann Hypothesis. The functions studied here are defined on the complex plane. However, no advanced knowledge of complex calculus is required, as I use the standard 2D space in my illustrations. I show how the distribution of the minima of |ζ(σ + it)| can be studied by looking at (say) σ = 2 rather than σ = 1/2. On the line σ = 1/2, these minima are the non-trivial roots of the Riemann zeta function, all of which are conjectured to have σ = 1/2. It is a lot easier to work with σ > 1 due to accelerated convergence. In the process, I introduce synthetic functions with arbitrary infinite Hadamard products (the best-known example is the sine function) to assess non-Dirichlet functions that may behave like ζ, and to gain further insight into the problem and its generalizations. My presentation is mostly in simple English and accessible to first-year college students.
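As a rough illustration of why σ > 1 is convenient, the sketch below approximates ζ(σ + it) at σ = 2 with a truncated Dirichlet series, which converges absolutely there, and then scans t on an evenly spaced grid for local minima of |ζ(2 + it)|. The function names, the number of terms, and the grid spacing are assumptions made for this example, not values taken from the article.

```python
import cmath
import math


def zeta_partial(s, n_terms=2000):
    """Truncated Dirichlet series sum of n**(-s); reasonable only for Re(s) > 1."""
    return sum(cmath.exp(-s * math.log(n)) for n in range(1, n_terms + 1))


def local_minima(sigma, t_max, dt, n_terms=2000):
    """Return grid values of t where |zeta(sigma + i t)| is lower than both neighbors."""
    n_points = int(round(t_max / dt)) + 1
    ts = [k * dt for k in range(n_points)]
    mod = [abs(zeta_partial(complex(sigma, t), n_terms)) for t in ts]
    return [ts[k] for k in range(1, n_points - 1)
            if mod[k] < mod[k - 1] and mod[k] < mod[k + 1]]


# Sanity check against the known value zeta(2) = pi^2 / 6.
print(abs(zeta_partial(2)), math.pi ** 2 / 6)

# Approximate locations (in t) of the local minima on the line sigma = 2.
minima = local_minima(sigma=2.0, t_max=40.0, dt=0.05)
print(minima)
```

At σ = 2 the truncation error after N terms is roughly 1/N, so a couple of thousand terms already give about three correct digits. The same series does not converge at σ = 1/2, where other formulas are needed.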


Convergence path for 100 random starting points

Table of Contents

  1. Introduction
  2. Gradient descent and related optimization techniques
    . . . Implementation details
    . . . General comments about the methodology and parameters
    . . . Mathematical version of gradient descent and orthogonal trajectories
  3. Distribution of minima and the Riemann Hypothesis
    . . . Root taxonomy
    . . . Studying root propagation with synthetic math functions
  4. Python code
    . . . Contours and orthogonal trajectories
    . . . Animated gradient descent starting with 100 random points

Download the Article

The technical article, entitled Math-free, Parameter-free Gradient Descent in Python, is accessible in the “Free Books and Articles” section, here. It contains links to my GitHub files, so that you can easily copy and paste the code. The text highlighted in orange in the PDF document marks keywords that will be incorporated in the index when I aggregate all my related articles into books about machine learning, visualization and Python, similar to these ones. The text highlighted in blue corresponds to external clickable links, mostly references. Red is used for internal links pointing to a section, bibliography entry, equation, and so on.

To avoid missing future articles, sign up for our newsletter, here.

About the Author

Vincent Granville is a pioneering data scientist and machine learning expert, co-founder of Data Science Central (acquired by TechTarget in 2020), founder of, former VC-funded executive, author and patent owner. Vincent’s past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. Vincent is also a former post-doc at Cambridge University and the National Institute of Statistical Sciences (NISS).

Vincent has published in Journal of Number Theory, Journal of the Royal Statistical Society (Series B), and IEEE Transactions on Pattern Analysis and Machine Intelligence. He is also the author of multiple books, including “Intuitive Machine Learning and Explainable AI”, available here. He lives in Washington state, and enjoys doing research on spatial stochastic processes, chaotic dynamical systems, experimental math and probabilistic number theory.
