# Math-free, Parameter-free Gradient Descent in Python

The full version of this article, entitled “Math-free, Parameter-free Gradient Descent in Python”, is available in PDF format in the “Free Books and Articles” section.

I discuss techniques related to the gradient descent method in 2D. The goal is to find the minima of a target function, called the cost function. The values of the function are computed at evenly spaced locations on a grid and stored in memory. Because of this, the approach does not rely directly on derivatives, and no calculus is involved: it implicitly uses discrete derivatives, but above all it is a simple geometric algorithm. The learning rate typically attached to gradient descent needs no fine-tuning here: it is implicitly set to the granularity of the mesh. In addition to gradient descent and ascent, I also show how to build contour lines and orthogonal trajectories with the exact same algorithm.
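To make the idea concrete, here is a minimal sketch (not the article's actual implementation) of derivative-free descent on a precomputed grid: from the current cell, move to the lowest of the eight neighboring cells, and stop when no neighbor is lower. The step size is the mesh granularity, so there is no learning rate to tune. The function `grid_descent` and the sample cost function are illustrative assumptions.

```python
import numpy as np

def grid_descent(z, start, max_steps=10_000):
    """Descend on a precomputed grid z by repeatedly moving to the
    lowest of the 8 neighboring cells; stop at a local minimum.
    The step equals the mesh granularity, so no learning rate is needed."""
    i, j = start
    n, m = z.shape
    for _ in range(max_steps):
        best = (i, j)
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                a, b = i + di, j + dj
                if 0 <= a < n and 0 <= b < m and z[a, b] < z[best]:
                    best = (a, b)
        if best == (i, j):          # no lower neighbor: local minimum
            return i, j
        i, j = best
    return i, j

# Illustrative cost function f(x, y) = (x - 1)^2 + (y + 0.5)^2 on a mesh
xs = np.linspace(-2, 2, 201)
ys = np.linspace(-2, 2, 201)
X, Y = np.meshgrid(xs, ys, indexing="ij")
Z = (X - 1) ** 2 + (Y + 0.5) ** 2
i, j = grid_descent(Z, start=(0, 0))
print(xs[i], ys[j])   # lands on the grid point closest to the true minimum (1, -0.5)
```

Gradient ascent is the same routine with the inequality reversed; contour lines and orthogonal trajectories reuse the same neighbor-scanning machinery.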

I apply the method to investigate one of the most famous unsolved problems in mathematics: the Riemann Hypothesis. The functions studied here are defined on the complex plane. However, no advanced knowledge of complex calculus is required, as I use the standard 2D space in my illustrations. I show how the distribution of the minima of |ζ(σ + it)| can be studied by looking at (say) σ = 2 rather than σ = 1/2. These minima are the non-trivial roots of the Riemann zeta function, all of which are conjectured to have σ = 1/2. It is much easier to work with σ > 1, thanks to accelerated convergence. In the process, I introduce synthetic functions with arbitrary infinite Hadamard products (the best known being the sine function) to assess non-Dirichlet functions that may behave like ζ, and to gain further insight into, and generalizations of, the problem. My presentation is mostly in plain English and accessible to first-year college students.
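As a hedged illustration of why σ > 1 is convenient: the Dirichlet series ζ(s) = Σ n⁻ˢ converges for Re(s) > 1, so |ζ(2 + it)| can be approximated by a short partial sum, and its dips along t can then be located with the same minimum-finding mindset. The sketch below (my own, not the article's code; `zeta_partial` and the grid resolution are assumptions) scans t ∈ [5, 30] and reports the local minima of |ζ(2 + it)|.

```python
import math

def zeta_partial(s, n_terms=2000):
    """Partial Dirichlet sum for zeta(s); accurate for Re(s) > 1,
    where the tail decays like n^(1 - Re(s))."""
    return sum(k ** (-s) for k in range(1, n_terms + 1))

sigma = 2.0
ts = [0.02 * k for k in range(250, 1501)]           # t from 5 to 30, step 0.02
vals = [abs(zeta_partial(complex(sigma, t))) for t in ts]

# Local minima in t: strictly lower than both neighbors on the grid
minima = [ts[i] for i in range(1, len(vals) - 1)
          if vals[i] < vals[i - 1] and vals[i] < vals[i + 1]]
print(minima)
```

The article's observation is that the locations of these dips, computed far from the critical line, still carry information about the ordinates of the non-trivial zeros; the snippet only shows how to produce the dip locations, not that correspondence itself.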


1. Introduction
2. Gradient descent and related optimization techniques
. . . Implementation details
. . . Mathematical version of gradient descent and orthogonal trajectories
3. Distribution of minima and the Riemann Hypothesis
. . . Root taxonomy
. . . Studying root propagation with synthetic math functions
4. Python code
. . . Contours and orthogonal trajectories
. . . Animated gradient descent starting with 100 random points