Explainable AI Featured Posts Generative AI Machine Learning Podcasts Synthetic Data

NoGAN: New Generation of Synthetic Data (Video)

My talk at the Generative AI Conference, London, September 2023. View or download the PowerPoint presentation, here. I introduce a new, NoGAN alternative to standard tabular data synthetization. It is designed to run faster by several orders of magnitude, compared to training generative adversarial networks (GAN). In addition, the quality of the generated data is […]

Explainable AI Featured Posts Generative AI Python Statistical Science Synthetic Data

GenAI: Fast Data Synthetization with Distribution-free Hierarchical Bayesian Models

Deep learning models such as generative adversarial networks (GAN) require a lot of computing power, and are thus expensive. Also, they may not converge. What if you could produce better data synthetizations, in a fraction of the time, with explainable AI and substantial cost savings? This is what Hierarchical Deep Resampling was designed for. It […]

Data Sets Explainable AI Featured Posts Generative AI Machine Learning Synthetic Data

New Python Library to Evaluate AI-generated Data and Compare Models

The library is called GenAI-Evaluation. You can use it, for instance, to assess the quality of tabular synthetic data. In this case, it measures how faithfully the synthetization mimics the real data it is derived from, by comparing the full joint empirical cumulative distribution functions (ECDF) attached to the two datasets. It works with both categorical and numerical features, and returns […]
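To make the idea of comparing joint ECDFs concrete, here is a minimal NumPy sketch. It is not the library's API: the function names `joint_ecdf` and `ecdf_distance`, the choice of query points (rows sampled from the real data), and the use of the maximum absolute gap as the score are all assumptions made for illustration, and the sketch handles numerical features only.

```python
import numpy as np

def joint_ecdf(data, points):
    """Multivariate ECDF of `data` evaluated at each row of `points`.

    Returns, for every query point, the fraction of rows in `data`
    that are <= that point in every coordinate.
    """
    # Shape (m, n, d) comparison, reduced over features, then over rows.
    below = (data[None, :, :] <= points[:, None, :]).all(axis=2)
    return below.mean(axis=1)

def ecdf_distance(real, synth, n_points=500, seed=0):
    """Largest absolute gap between the two joint ECDFs (hypothetical score).

    Query points are rows drawn at random from the real data; this is an
    illustrative choice, not necessarily what the library does.
    """
    rng = np.random.default_rng(seed)
    points = real[rng.integers(0, len(real), size=n_points)]
    gap = np.abs(joint_ecdf(real, points) - joint_ecdf(synth, points))
    return float(gap.max())

# Toy data: two correlated numerical features.
rng = np.random.default_rng(42)
cov = [[1.0, 0.8], [0.8, 1.0]]
real = rng.multivariate_normal([0.0, 0.0], cov, size=2000)

# A "good" synthetization: a fresh sample from the same distribution.
synth_good = rng.multivariate_normal([0.0, 0.0], cov, size=2000)

# A "bad" one: same marginals, but the correlation is destroyed
# by shuffling the second column independently of the first.
synth_bad = real.copy()
synth_bad[:, 1] = rng.permutation(real[:, 1])

d_good = ecdf_distance(real, synth_good)
d_bad = ecdf_distance(real, synth_bad)
print(f"good: {d_good:.3f}  bad: {d_bad:.3f}")
```

The shuffled dataset keeps each marginal distribution intact, so a feature-by-feature comparison would not flag it. Only a comparison of the joint distributions reveals the lost correlation, which is why the excerpt stresses the full joint ECDF.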

Data Sets Featured Posts Generative AI Machine Learning Python Synthetic Data

How to Fix a Failing Generative Adversarial Network

In this article, I explore different front-end strategies to improve a generative adversarial network (GAN) that leads to poor synthetization, in the context of tabular data generation. It is well known that tabular data is a lot more challenging than images, when using deep neural networks for synthetization purposes. An algorithm may work very well […]

Explainable AI Featured Posts Generative AI Synthetic Data

Generative AI Technology Break-through: Spectacular Performance of New Synthesizer

Neural network methods have overshadowed all other techniques in the last decade, to the point that alternatives are simply ignored. And for good reasons: techniques such as generative adversarial networks (GAN) proved very successful in some contexts, especially computer vision. Indeed, there have been several attempts to turn every problem and traditional method — regression, […]

Data Sets Explainable AI Featured Posts Generative AI Synthetic Data Time Series

Synthesizing Geospatial Data with A Simple NoGAN Technique

If you regularly read my articles, you know that I developed several different techniques for data synthetization. Many are explained in detail in my upcoming book Synthetic Data and Generative AI (Elsevier), available here. It includes generative adversarial networks (GANs), copulas, agent-based modeling, methods based on interpolation, correlated noise mixtures, and more. The technique presented […]

Data Sets Experimental Math Featured Posts Generative AI Synthetic Data Time Series

Sound Generation in Python: Turning Your Data into Music

Not long ago, I published an article here entitled “The Sound that Data Makes”. The goal was to turn data (random noise in this case) into music. The hope was that by “listening” to your data, you could gain a different kind of insight, not conveyed by visualizations or tabular summaries. This article is […]

Featured Posts Generative AI Machine Learning Synthetic Data

Generated Data vs Monte-Carlo Simulations: What are the Differences?

I sometimes get asked this question: could you use simulations instead of synthetizations? Below is my answer, also focusing on some particular aspects of data synthetizations that differentiate them from other techniques.

Simulations do not simulate joint distributions

Sure, if all your features behave like a mixture of multivariate normal distributions, you can use GMMs […]

Books Featured Posts Generative AI Python Synthetic Data

My Book on Generative AI Now on Amazon

Published by Elsevier, available in print in January 2024. You can pre-order it now, here. The PDF version is available on my e-store. The book is fully written. It is offered at a steep discount to participants in my Generative AI certification program, available here. This is the first technical book focusing on synthetic data and its […]

Data Sets Featured Posts Synthetic Data

Generative AI: Synthetic Data Vendor Comparison and Benchmarking Best Practices

The goal of data synthetization is to produce artificial data that mimics the patterns and features present in existing, real data. Many generation methods and evaluation techniques are available, depending on purposes, the type of data, and the application field. Everyone is familiar with synthetic images in the context of computer vision, or synthetic text […]
