
New Python Library to Evaluate AI-generated Data and Compare Models
- Vincent Granville
- September 19, 2023
Called GenAI-Evaluation, the library can be used, for instance, to assess the quality of tabular synthetic data. In this case, it measures how faithfully the synthetization mimics the real data it is derived from, by comparing the full joint empirical cumulative distribution functions (ECDFs) attached to the two datasets. It works with both categorical and numerical features, and returns […]
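To make the idea of comparing joint ECDFs concrete, here is a minimal sketch in plain Python. It is not the library's API: the function names `joint_ecdf` and `max_ecdf_gap` are hypothetical, the gap is only checked at the observed rows (an approximation of the true maximum), and only numerical features are handled.

```python
import random

def joint_ecdf(data, point):
    """Fraction of rows in `data` that are <= `point` in every coordinate."""
    hits = sum(all(x <= p for x, p in zip(row, point)) for row in data)
    return hits / len(data)

def max_ecdf_gap(real, synth):
    """Largest absolute gap between the two joint ECDFs.

    Assumption: the gap is evaluated only at the observed rows of both
    datasets, which approximates a Kolmogorov-Smirnov style distance.
    """
    points = real + synth
    return max(abs(joint_ecdf(real, p) - joint_ecdf(synth, p)) for p in points)

random.seed(0)
# Toy "real" data, a faithful synthetic copy, and a shifted (poor) one.
real = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
good = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
bad = [(random.gauss(2, 1), random.gauss(2, 1)) for _ in range(200)]

print(max_ecdf_gap(real, real))  # 0.0: a dataset compared with itself
print(max_ecdf_gap(real, good))  # small gap expected
print(max_ecdf_gap(real, bad))   # larger gap expected for the shifted data
```

A value near 0 suggests the synthetic data reproduces the joint distribution well, while a value near 1 suggests a poor match. Because the joint ECDF looks at all features at once, it can catch broken dependencies that per-feature comparisons miss.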
How to Fix a Failing Generative Adversarial Network
- Vincent Granville
- August 12, 2023
In this article, I explore different front-end strategies to improve a generative adversarial network (GAN) that leads to poor synthetization, in the context of tabular data generation. It is well known that tabular data is much more challenging than images when using deep neural networks for synthetization. An algorithm may work very well […]
Synthesizing Geospatial Data with a Simple NoGAN Technique
- Vincent Granville
- July 28, 2023
If you regularly read my articles, you know that I have developed several different techniques for data synthetization. Many are explained in detail in my upcoming book Synthetic Data and Generative AI (Elsevier), available here. It includes generative adversarial networks (GANs), copulas, agent-based modeling, methods based on interpolation, correlated noise mixtures, and more. The technique presented […]
Sound Generation in Python: Turning Your Data into Music
- Vincent Granville
- July 13, 2023
Not long ago, I published an article here entitled “The Sound that Data Makes”. The goal was to turn data (random noise in this case) into music. The hope was that by “listening” to your data, you could gain a different kind of insight, not conveyed by visualizations or tabular summaries. This article is […]
Generative AI: Synthetic Data Vendor Comparison and Benchmarking Best Practices
- Vincent Granville
- June 16, 2023
The goal of data synthetization is to produce artificial data that mimics the patterns and features present in existing, real data. Many generation methods and evaluation techniques are available, depending on the purpose, the type of data, and the application field. Everyone is familiar with synthetic images in the context of computer vision, or synthetic text […]
Massively Speed-Up your Learning Algorithm, with Stochastic Thinning
- Vincent Granville
- April 7, 2023
Dramatically speed up your learning algorithm with stochastic thinning. Includes a use case, Python code, and regression and neural network illustrations.
Feature Clustering: A Simple Solution to Many Machine Learning Problems
- Vincent Granville
- March 12, 2023
Feature clustering is an unsupervised machine learning technique to separate the features of a dataset into homogeneous groups. In short, it is a clustering procedure, but performed on the features rather than on the observations. Such techniques often rely on a similarity metric, measuring how close two features are to each other. In this article, […]
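As an illustration of clustering features rather than observations, the sketch below groups columns whose absolute Pearson correlation exceeds a threshold. The similarity metric, the threshold, the single-linkage grouping, the function name `cluster_features` and the toy data are all assumptions made for this example. The excerpt does not say which metric or procedure the article uses.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster_features(columns, threshold=0.8):
    """Group feature names linked by |correlation| >= threshold.

    Assumption: single linkage, implemented with a small union-find.
    """
    names = list(columns)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the root, compressing the path on the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())

# Hypothetical toy dataset: each key is a feature, each list a column.
columns = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "income": [40, 90, 35, 50, 60],
    "spend": [21, 44, 18, 27, 29],
}

print(cluster_features(columns))
# [['height_cm', 'height_in'], ['income', 'spend']]
```

Each resulting group holds features that carry largely redundant information, so one representative per group could be kept for modeling. Other similarity metrics would slot into the same structure by replacing `pearson`.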
Data Synthetization: enhanced GANs vs Copulas
- Vincent Granville
- March 8, 2023
Using case studies, I compare generative adversarial networks (GANs) with copulas to synthesize tabular data. I discuss back-end and front-end improvements to help GANs better replicate the correlation structure present in the real data. Likewise, I discuss methods to further improve copulas, including transforms, the use of separate copulas for each population segment, and parametric […]
New Book on Synthetic Data: Version 3.0 Just Released
- Vincent Granville
- February 3, 2023
Update on March 3, 2023: Version 4.0 has been released and now replaces version 3.0 on the e-Store. It contains a new full chapter on enhanced generative adversarial networks (GANs), with a comparison to copula-based methods for data synthetization and illustrations on real-life datasets. The book has grown considerably since version 1.0. It started with synthetic […]
New Interpolation Methods for Data Synthetization and Prediction
- Vincent Granville
- January 14, 2023
Entitled “New Interpolation Methods for Synthetization and Prediction”, the full version in PDF format is accessible in the “Free Books and Articles” section, here. This article is an extract from my book “Synthetic Data and Generative AI”, available here. I describe little-known original interpolation methods with applications to real-life datasets. These simple techniques are easy to implement and can […]