Synthesizing Geospatial Data with A Simple NoGAN Technique

If you regularly read my articles, you know that I have developed several techniques for data synthetization. Many are explained in detail in my upcoming book Synthetic Data and Generative AI (Elsevier), available here. The book covers generative adversarial networks (GANs), copulas, agent-based modeling, methods based on interpolation, correlated noise mixtures, and more.

The technique presented here was first tested on time series and then extended to geospatial data, a 2D generalization. It relies on exact multivariate interpolation and is designed to avoid overfitting. Most other methods, such as kriging, produce smoothed data and are thus not suitable for data synthetization. By contrast, this method preserves local irregularities and spikes. For instance, it can be used to model chaotic processes or reconstruct full elevation maps when the altitude is known only for a small number of locations.
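To make the term "exact interpolation" concrete, here is a minimal Python sketch. It does not reproduce the method from my book: it uses plain inverse-distance weighting, a generic textbook interpolator chosen here only because it returns the observed value at every training location. The function name idw_interpolate, the power parameter, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def idw_interpolate(train_xy, train_z, query_xy, power=2.0):
    """Inverse-distance-weighted interpolation (illustrative, not the book's method).

    Exact in the sense that a query located on a training point
    returns the observed value at that point.
    """
    train_xy = np.asarray(train_xy, dtype=float)
    train_z = np.asarray(train_z, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    # Pairwise Euclidean distances, shape (n_query, n_train)
    d = np.linalg.norm(query_xy[:, None, :] - train_xy[None, :, :], axis=2)
    out = np.empty(len(query_xy))
    for i, row in enumerate(d):
        hit = np.flatnonzero(row == 0.0)
        if hit.size:
            # Query coincides with a training point: return its value as is
            out[i] = train_z[hit[0]]
        else:
            w = 1.0 / row**power
            out[i] = np.dot(w, train_z) / w.sum()
    return out

# Toy training set: 30 random locations in the unit square (hypothetical data)
rng = np.random.default_rng(0)
train_xy = rng.uniform(0, 1, size=(30, 2))
train_z = np.sin(6 * train_xy[:, 0]) * np.cos(6 * train_xy[:, 1])

# Interpolate on a 50 x 50 grid covering the same square
gx, gy = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
grid_xy = np.column_stack([gx.ravel(), gy.ravel()])
grid_z = idw_interpolate(train_xy, train_z, grid_xy).reshape(gx.shape)
```

Evaluating idw_interpolate at the training locations returns train_z unchanged, which is the exactness property. Inverse-distance weighting on its own still flattens the surface between training points, so it should be read as an illustration of exactness only, not of how local irregularities are preserved.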

Figure 1: Interpolated grid based on training points (the circles)

The algorithm is explained in Chapter 9 of my book, where it is applied to the Chicago temperature dataset consisting of 31 locations. Here I illustrate how it works on much larger datasets. The main novelty is measuring smoothness in higher dimensions. The concept may sound trivial, but in two or three dimensions there is no agreed-upon definition, and it is not obvious how to compare the smoothness of two datasets produced by different algorithms or covering different geographic areas. I address this issue.

The full technical article with Python code and case study is presented as a project for participants enrolled in my GenAI certification program. It consists of multiple steps to complete. My own solutions are included. Download the document as paper #28, here.

Figure 2 features a function of the second-order gradient, used to estimate the smoothness of the data shown in Figure 1. The data consisted of a few dozen points (shown as small circles), and the full grid was interpolated from this training set to produce the image in question. Because the true value is known everywhere in this example, I was able to assess the accuracy of my interpolation method.
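As a rough illustration of a smoothness measure built on second-order differences, the sketch below computes the mean absolute value of the five-point discrete Laplacian over the interior of a grid. This is only one possible choice and an assumption on my part for this example: the exact function used for Figure 2 is given in the technical paper, and the name laplacian_roughness and the toy surfaces are hypothetical.

```python
import numpy as np

def laplacian_roughness(z):
    """Mean absolute five-point discrete Laplacian over interior grid cells.

    Lower values indicate a smoother surface. Assumes unit grid spacing.
    """
    z = np.asarray(z, dtype=float)
    lap = (
        z[:-2, 1:-1] + z[2:, 1:-1]      # neighbors along the first axis
        + z[1:-1, :-2] + z[1:-1, 2:]    # neighbors along the second axis
        - 4.0 * z[1:-1, 1:-1]           # center cell
    )
    return float(np.mean(np.abs(lap)))

# Two toy surfaces on the same 40 x 40 grid (hypothetical data)
x, y = np.meshgrid(np.linspace(0, 1, 40), np.linspace(0, 1, 40))
plane = 2.0 * x + 3.0 * y                      # perfectly smooth: zero curvature
rng = np.random.default_rng(1)
noisy = plane + 0.1 * rng.standard_normal(plane.shape)

print(laplacian_roughness(plane), laplacian_roughness(noisy))
```

The plane scores essentially zero while the noisy surface scores much higher. Comparing datasets with different value ranges or grid resolutions would require some normalization, which this sketch does not attempt.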

NoGAN is a class of synthetization algorithms that do not rely on neural networks, such as generative adversarial networks (GANs), for training. They run much faster, lead to explainable AI, and some produce even better results. My upcoming article “Generative AI Technology Break-through: Spectacular Performance of New, NoGAN Synthesizer” will be the first seminal paper on this topic, following a trend started with NoSQL, NoCode, and NoMath in other contexts.

To not miss future articles and to discover the benefits offered to subscribers only, visit our newsletter sign-up page, here. Subscription is free.

About the Author

Vincent Granville is a pioneering GenAI scientist, co-founder at BondingAI.io, the LLM 2.0 platform for hallucination-free, secure, in-house, lightning-fast Enterprise AI at scale with zero weight and no GPU. He is also author (Elsevier, Wiley), publisher, and successful entrepreneur with multi-million-dollar exit. Vincent’s past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. He completed a post-doc in computational statistics at University of Cambridge.
