New Interpolation Methods for Data Synthetization and Prediction

The full version of this article, entitled “New Interpolation Methods for Synthetization and Prediction”, is available in PDF format in the “Free Books and Articles” section as paper #16, here. This article is an extract from my book “Synthetic Data and Generative AI”, available here.

I describe little-known original interpolation methods with applications to real-life datasets. These simple techniques are easy to implement and can be used for regression or prediction. They offer an alternative to model-based statistical methods. Applications include interpolating ocean tides at Dublin, predicting temperatures in the Chicago area with geospatial data, and a problem in astronomy: planet alignments and frequency of these events. In one example, the 5-min data can be replaced by 80-min measurements, with the 5-min increments reconstructed via interpolation, without noticeable loss. Thus, my algorithm can be used for data compression.
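To make the compression idea concrete, here is a minimal sketch under stated assumptions. It is not the algorithm from the article: it uses a plain local cubic Lagrange interpolant as a stand-in, and the signal is a synthetic tide-like curve with made-up amplitudes and periods, not the Dublin data. The function names (`signal`, `cubic_lagrange`) are hypothetical.

```python
import math

STEP_FINE = 5      # minutes between original measurements
STEP_COARSE = 80   # minutes between retained measurements

def signal(t):
    """Synthetic tide-like curve (made-up amplitudes; periods in minutes)."""
    return math.sin(2 * math.pi * t / 745.0) + 0.5 * math.sin(2 * math.pi * t / 720.0)

def cubic_lagrange(xs, ys, x):
    """Interpolate at x using four consecutive, equally spaced nodes around x.

    The four-node window is shifted inward near the two ends of the series.
    """
    h = xs[1] - xs[0]
    i = int((x - xs[0]) // h)                  # index of the interval containing x
    start = min(max(i - 1, 0), len(xs) - 4)    # first node of the window
    nodes = range(start, start + 4)
    total = 0.0
    for j in nodes:
        weight = 1.0
        for k in nodes:
            if k != j:
                weight *= (x - xs[k]) / (xs[j] - xs[k])
        total += ys[j] * weight
    return total

# Two days of 5-min observations
fine_t = list(range(0, 2880 + 1, STEP_FINE))
fine_y = [signal(t) for t in fine_t]

# "Compressed" version: keep one observation every 80 minutes
coarse_t = fine_t[::STEP_COARSE // STEP_FINE]
coarse_y = fine_y[::STEP_COARSE // STEP_FINE]

# Reconstruct the 5-min increments and measure the worst-case error
rebuilt = [cubic_lagrange(coarse_t, coarse_y, t) for t in fine_t]
max_err = max(abs(a - b) for a, b in zip(fine_y, rebuilt))
print(len(fine_t), len(coarse_t), max_err)
```

In this toy setting, 37 retained values stand in for 577 original ones. How small the reconstruction error is on real data depends on how smooth the series is between retained points.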

The first technique has strong ties to Fourier methods. In addition to the above applications, I show how it can be used to efficiently interpolate complex mathematical functions such as Bessel and Riemann zeta. For those familiar with MATLAB or Mathematica, this is an opportunity to play with the mpmath library in Python and see how it compares with the traditional tools in this context. In the process, I also show how the methodology can be used to generate synthetic data, be it time series or geospatial data.

Depending on the parameters, in the geospatial context, the interpolation is either close to nearest-neighbor methods, close to kriging (also known as Gaussian process regression), or a truly original hybrid mix of additive and multiplicative techniques. There is an option not to interpolate at locations far away from the training set, where regression or interpolation results may be meaningless, regardless of the technique used. Another application is detecting the full extent of an oil field after drilling only a dozen wells. Likewise, the temperature dataset has only a few stations with actual measurements, and the goal is to obtain interpolated values fully covering a specific area.
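The option of refusing to interpolate far from the training set can be illustrated with a short sketch. It uses classic inverse distance weighting as a stand-in, not the hybrid method described in the article. The function name `idw_interpolate`, the parameters `power` and `max_dist`, and the station values are all hypothetical, chosen for illustration only.

```python
import math

def idw_interpolate(stations, x, y, power=2.0, max_dist=1.0):
    """Inverse distance weighting with a distance cutoff.

    stations: list of (x, y, value) tuples, the training set.
    Returns None when the nearest station is farther than max_dist,
    that is, where an interpolated value would likely be meaningless.
    """
    weights, values = [], []
    nearest = math.inf
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value              # exactly on a station: return its value
        nearest = min(nearest, d)
        weights.append(d ** -power)   # closer stations get larger weights
        values.append(value)
    if nearest > max_dist:
        return None                   # too far from the training set
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Made-up temperatures at the four corners of a unit square
stations = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0), (0.0, 1.0, 14.0), (1.0, 1.0, 16.0)]

center = idw_interpolate(stations, 0.5, 0.5)    # equidistant: plain average
on_station = idw_interpolate(stations, 1.0, 0.0)
far_away = idw_interpolate(stations, 5.0, 5.0)  # outside the cutoff
print(center, on_station, far_away)
```

A large `power` pushes the result toward nearest-neighbor behavior, while a small one gives smoother, more averaged maps. The cutoff is a design choice: returning no value is often preferable to returning an extrapolated one that looks precise but is not.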

The second technique is based on ordinary least squares, the same method used to solve polynomial or multivariate regression. Instead of highly unstable polynomials that lead to overfitting, I focus on generic functions that avoid these pitfalls, using an iterative greedy algorithm to find the optimum. In particular, a solution based on orthogonal functions leads to a simple implementation with a direct and elegant solution.
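Why orthogonal functions simplify least squares can be seen in a small sketch. When the basis functions are orthogonal on the sample points, the normal equations decouple, and each coefficient is a ratio of two sums, with no matrix inversion. The example below assumes a uniform grid over one period and a trigonometric basis. It illustrates the general principle only, not the greedy algorithm from the article, and `fit_orthogonal` is a hypothetical name.

```python
import math

def fit_orthogonal(xs, ys, basis):
    """Least-squares coefficients for a basis that is orthogonal on xs.

    Each coefficient is <y, phi> / <phi, phi>, computed independently
    of the others.
    """
    coeffs = []
    for phi in basis:
        vals = [phi(x) for x in xs]
        num = sum(v * y for v, y in zip(vals, ys))
        den = sum(v * v for v in vals)
        coeffs.append(num / den)
    return coeffs

# Uniform grid over one period: sines and cosines are orthogonal on it
n = 64
xs = [2 * math.pi * i / n for i in range(n)]

# Noise-free test response with known coefficients 2, 3, -1.5
ys = [2.0 + 3.0 * math.cos(x) - 1.5 * math.sin(2 * x) for x in xs]

basis = [
    lambda x: 1.0,
    math.cos,
    lambda x: math.sin(2 * x),
    lambda x: math.cos(3 * x),   # not present in ys: coefficient should be near 0
]
coeffs = fit_orthogonal(xs, ys, basis)
print(coeffs)
```

Because each coefficient is computed on its own, adding or removing a basis function does not change the others. This is in contrast to polynomial regression, where every coefficient can shift when the degree changes.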

Table of Contents

Introduction
First method
. . . Example with infinite summation
. . . Applications: ocean tides, planet alignment
. . . Problem in two dimensions
. . . Spatial interpolation of the temperature dataset
Second method
. . . From unstable polynomials to robust orthogonal regression
. . . Using orthogonal functions
. . . Application to regression
Python code
. . . Time series interpolation
. . . Geospatial temperature dataset
. . . Regression with Fourier series

Temperature map in the Chicago area: real data (round dots) blended with synthetic data

Download the Article

The technical article, entitled New Interpolation Methods for Synthetization and Prediction, is accessible in the “Free Books and Articles” section as paper #16, here. It contains links to my GitHub files, so that you can easily copy and paste the code. The text highlighted in orange in this PDF document marks keywords that will be incorporated in the index when I aggregate all my related articles into books about machine learning, visualization and Python, similar to these ones. The text highlighted in blue corresponds to external clickable links, mostly references. Red is used for internal links pointing to a section, bibliography entry, equation, and so on.

To avoid missing future articles, sign up for our newsletter, here.

About the Author

Vincent Granville is a pioneering GenAI scientist, co-founder at BondingAI.io, the LLM 2.0 platform for hallucination-free, secure, in-house, lightning-fast Enterprise AI at scale with zero weight and no GPU. He is also an author (Elsevier, Wiley), publisher, and successful entrepreneur with a multi-million-dollar exit. Vincent’s past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. He completed a post-doc in computational statistics at the University of Cambridge.
