Some people climb Mount Everest solo in winter, with no oxygen. Some mathematicians prove difficult theorems using only elementary arithmetic. Such a proof, despite being labeled “elementary”, is typically far more complicated than proofs based on advanced mathematical theory. The people accomplishing these feats are very rare.
For years, I have developed machine learning techniques that barely use any mathematics. I view it as a sport. Not that I don’t know anything about mathematics, quite the contrary. I believe you must be very math-savvy to achieve such accomplishments. This article epitomizes math-free machine learning. It is the result of years of research. The highly non-linear methodology described here may not be easier to grasp than math-heavy techniques. It has its own tricks. Yet, you could, in principle, teach it to middle school students.
I did not compromise in any way on the quality or efficiency of the technique for the sake of gaining the “math-free” label. What I describe here is a high-performing technique in its own right. You can use it to solve various problems: multivariate regression, interpolation, data compression, prediction, or spatial modeling (well, without “model”). It comes with prediction intervals. Yet there is no statistical or probability model behind it, no calculus, no matrix algebra, no regression coefficients, no bootstrapping, no resampling, not even square roots.
I included the Python source code, the synthetic data sets and some spreadsheets, for easy replicability and to help you understand what the technique does. One of the applications discussed is in number theory, related to the famous Riemann Hypothesis. The technical article, entitled Interpretable Machine Learning: Multipurpose, Model-free, Math-free Fuzzy Regression, is accessible in the “Free Books and Articles” section, here.
The innovative technique discussed here does much more than regression. It is useful in signal processing, in particular spatial filtering and smoothing. Initially designed using hyperplanes, the original version can be confused with support vector machines or support vector regression. However, the closest analogy is fuzzy regression. A weighted version based on splines makes it somewhat related to nearest neighbor or inverse distance interpolation, and highly non-linear. In the end, it is a kriging-like spatial regression, with many potential applications ranging from compression to signal enhancement and prediction. It comes with confidence intervals for the predicted values, despite the absence of a statistical model. A predicted value is determined by hundreds or thousands of splines. The splines play the role of nodes in neural networks. Unlike neural networks, all the parameters (the distances to the splines) have a natural interpretation.
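To make the inverse distance analogy concrete, here is a minimal sketch of classic inverse distance interpolation. It is not the spline-based technique from the article, only the textbook relative mentioned above. The function names, the `power` parameter and the toy data are assumptions chosen for illustration. Note that it works with squared distances, so no square root is ever taken, and that it reproduces the training responses exactly at training points.

```python
# Illustrative sketch only: textbook inverse distance weighting,
# NOT the spline-based method of the article. All names are hypothetical.

def squared_distance(p, q):
    """Sum of squared coordinate differences; no square root is needed."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def idw_predict(x, train_points, train_values, power=2, eps=1e-12):
    """Predict the response at x as a weighted average of training responses.

    Each weight is 1 / (squared distance ** power), so nearby training
    points dominate the prediction.
    """
    numerator = 0.0
    denominator = 0.0
    for point, value in zip(train_points, train_values):
        d2 = squared_distance(x, point)
        if d2 < eps:
            # x coincides with a training point: return its response as is
            return value
        weight = 1.0 / d2 ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Toy data (assumption): response z = x + y observed at the unit square corners
train_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
train_values = [0.0, 1.0, 1.0, 2.0]

center = idw_predict((0.5, 0.5), train_points, train_values)
corner = idw_predict((1.0, 1.0), train_points, train_values)
print(center, corner)
```

At the center of the square, all four training points are equally distant, so the prediction is simply their average. Increasing `power` makes the prediction depend more strongly on the closest points.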
The methodology was tested on synthetic data. The performance, depending on hyperparameters and the number of splines, is measured on the validation set, not on the training set. Despite (by design) nearly perfect predictions for training set points, it is robust against outliers, numerically stable, and does not lead to overfitting. There are no regression coefficients, no intercept, no matrix algebra involved, no calculus, no statistics beyond empirical percentiles, and not even square roots. It is accessible to high school students. Despite the apparent simplicity, the technique is far from trivial. In its simplest form, the splines are similar to multivariate Lagrange interpolation polynomials. Python code is included in this document.
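As an illustration of what “no statistics beyond empirical percentiles” can look like in practice, here is a small sketch that turns a collection of individual predictions for one location into an interval. The exact construction used in the article is described in the PDF; the version below is only an assumption. It uses the nearest-rank percentile with hypothetical function names and made-up numbers, and relies only on sorting and integer arithmetic.

```python
# Illustrative sketch only: one possible way to get an interval from
# empirical percentiles. Names and data are hypothetical, not from the article.

def empirical_percentile(values, p):
    """Nearest-rank percentile: p is an integer between 1 and 100."""
    ordered = sorted(values)
    n = len(ordered)
    # Ceiling of p * n / 100, computed with integer arithmetic only
    rank = max(1, -(-p * n // 100))
    return ordered[rank - 1]

def prediction_interval(individual_predictions, lower_p=5, upper_p=95):
    """Return (lower bound, upper bound) from the empirical percentiles."""
    lower = empirical_percentile(individual_predictions, lower_p)
    upper = empirical_percentile(individual_predictions, upper_p)
    return lower, upper

# Made-up individual predictions for a single location (assumption)
predictions = [12.0, 3.0, 7.0, 20.0, 1.0, 15.0, 9.0, 18.0, 5.0, 11.0,
               2.0, 14.0, 8.0, 19.0, 4.0, 16.0, 10.0, 17.0, 6.0, 13.0]

lower, upper = prediction_interval(predictions)
median = empirical_percentile(predictions, 50)
print(lower, median, upper)
```

No distributional assumption is involved: the bounds are simply observed values picked by rank, which is why a few extreme individual predictions have little influence on them.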
Table of Contents
The article covers the following topics:
Full, non-linear model in higher dimensions
- Geometric proximity, weights, and numerical stability
- Predicted values and prediction intervals
- Illustration, with spreadsheet
- Performance assessment
- Amplitude restoration
Python code and data sets
The technical article, entitled Interpretable Machine Learning: Multipurpose, Model-free, Math-free Fuzzy Regression, is accessible in the “Free Books and Articles” section, here. In this PDF document, the text highlighted in orange marks keywords that will be incorporated in the index when I aggregate all my related articles into a single book about innovative machine learning techniques. The text highlighted in blue corresponds to external clickable links, mostly references. Red is used for internal links, pointing to a section, bibliography entry, equation, and so on.
To not miss future articles, sign up for our newsletter, here.