Easy Trick to Debias GenAI Models: Quantile Convolution

All of the GenAI apps that I tested, including my own, have the same problem. They cannot easily generate data outside the observation range. As an example, let’s focus on the insurance dataset discussed in my new book. I use it to generate synthetic data with GAN (generative adversarial networks) and the NoGAN models discussed […]
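To see why the observation range acts as a hard limit, here is a minimal sketch that resamples from an empirical quantile function. It is not the quantile convolution method from the article, whose details are not shown in this excerpt. The exponential toy data, the `empirical_quantile` helper and the `bandwidth` value are all assumptions made for illustration. The last step adds Gaussian noise to each draw, a generic smoothing that lets values leave the observed range.

```python
import random

def empirical_quantile(sorted_obs, u):
    """Empirical quantile at level u (0 <= u <= 1), interpolating linearly
    between consecutive order statistics."""
    pos = u * (len(sorted_obs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_obs) - 1)
    return sorted_obs[lo] + (pos - lo) * (sorted_obs[hi] - sorted_obs[lo])

rng = random.Random(0)

# Toy "real" data (assumption): 1,000 draws from an exponential distribution.
obs = sorted(rng.expovariate(1.0) for _ in range(1000))

# Plain quantile resampling: every synthetic value lies between min and max.
synth = [empirical_quantile(obs, rng.random()) for _ in range(10000)]

# Generic smoothing (assumption, not the article's method): add Gaussian noise.
bandwidth = 0.5  # hypothetical value, would need tuning on real data
smoothed = [x + rng.gauss(0, bandwidth) for x in synth]

print("observed range :", obs[0], obs[-1])
print("resampled range:", min(synth), max(synth))
print("smoothed range :", min(smoothed), max(smoothed))
```

Note that naive additive noise can also produce impossible values, such as negative amounts for a positive feature, so it only illustrates the problem rather than solving it.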

New Book: Understanding Deep Learning

By Simon Prince, computer science Professor at the University of Bath. To be published by MIT Press, Dec 2023. The author shares the associated Jupyter notebooks on his website, here. Very popular, it got over 5,000 likes when the author announced the upcoming book on LinkedIn. I pre-ordered my copy. Summary: An authoritative, accessible, and […]

Quantum Derivatives, GenAI, and the Riemann Hypothesis

Have you ever encountered a function or cumulative probability distribution (CDF) that is nowhere differentiable, yet continuous everywhere? Some are featured in this article. For a CDF, it means that it does not have a probability density function (PDF), and for a standard function, it has no derivative. At least, not until now. The quantum […]

Number Theory: Longest Runs of Zeros in Binary Digits of Square Root of 2

Studying the longest head runs in coin tossing has a very long history, starting in gaming and probability theory. Today, it has applications in cryptography and insurance. For random sequences or Bernoulli trials, the associated statistical properties and distributions have been studied in detail, even when the proportions of zero and one are different. Yet, […]
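As a small illustration of the quantity in the title, the sketch below computes the first n binary digits of the square root of 2 with exact integer arithmetic, then finds the longest run of zeros. The function names are hypothetical, and the comparison with log2(n), the typical order of magnitude for the longest run in n fair coin tosses, is only a rough yardstick, not a result from the article.

```python
from math import isqrt, log2

def sqrt2_binary_digits(n):
    """First n binary digits of sqrt(2) after the binary point, as a string."""
    # floor(sqrt(2) * 2^n) equals isqrt(2 * 4^n); it has n + 1 bits,
    # the leading bit being the integer part (1).
    scaled = isqrt(2 << (2 * n))
    return bin(scaled)[3:]  # drop the '0b' prefix and the integer part

def longest_zero_run(bits):
    """Length of the longest block of consecutive '0' characters."""
    return max(len(run) for run in bits.split("1"))

n = 10000
digits = sqrt2_binary_digits(n)
print("longest run of zeros in first", n, "digits:", longest_zero_run(digits))
print("log2(n) for comparison:", round(log2(n), 2))
```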

NoGAN: New Generation of Synthetic Data (Video)

My talk at the Generative AI Conference, London, September 2023. View or download the PowerPoint presentation, here. I introduce a new, NoGAN alternative to standard tabular data synthetization. It is designed to run faster by several orders of magnitude, compared to training generative adversarial networks (GAN). In addition, the quality of the generated data is […]

GenAI: Fast Data Synthetization with Distribution-free Hierarchical Bayesian Models

Deep learning models such as generative adversarial networks (GAN) require a lot of computing power, and are thus expensive. Also, they may not converge. What if you could produce better data synthetizations, in a fraction of the time, with explainable AI and substantial cost savings? This is what Hierarchical Deep Resampling was designed for. It […]

New Python Library to Evaluate AI-generated Data and Compare Models

Called GenAI-Evaluation, the library can be used, for instance, to assess the quality of tabular synthetic data. In this case, it measures how faithfully the synthetization mimics the real data it is derived from, by comparing the full joint empirical distributions (ECDF) attached to the two datasets. It works both with categorical and numerical features, and returns […]
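The idea of comparing joint empirical distributions can be sketched in a few lines of plain Python. This is not the library's API or its actual algorithm: the function names, the choice of evaluation points and the toy Gaussian data are all assumptions. The sketch evaluates both multivariate ECDFs at points drawn from the pooled data and reports the largest absolute gap, a Kolmogorov-Smirnov style distance between 0 and 1.

```python
import random

def joint_ecdf(data, point):
    """Fraction of rows in data that are <= point in every coordinate."""
    hits = sum(all(x <= p for x, p in zip(row, point)) for row in data)
    return hits / len(data)

def ks_distance(real, synth, n_points=200, seed=0):
    """Largest gap between the two joint ECDFs over sampled evaluation points."""
    rng = random.Random(seed)
    pooled = real + synth
    points = [rng.choice(pooled) for _ in range(n_points)]
    return max(abs(joint_ecdf(real, p) - joint_ecdf(synth, p)) for p in points)

rng = random.Random(42)

# Toy data (assumption): two numerical features per row.
real = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
good = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]  # same distribution
bad = [(x + 2.0, y + 2.0) for x, y in good]                      # shifted copy

print("faithful synthetization:", ks_distance(real, good))
print("shifted synthetization :", ks_distance(real, bad))
```

A value near 0 suggests the synthetic data tracks the real joint distribution closely, while a value near 1 signals a poor match. This toy version handles numerical features only.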

How to Fix a Failing Generative Adversarial Network

In this article, I explore different front-end strategies to improve a generative adversarial network (GAN) that leads to poor synthetization, in the context of tabular data generation. It is well known that tabular data is a lot more challenging than images, when using deep neural networks for synthetization purposes. An algorithm may work very well […]

Generated Data vs Monte-Carlo Simulations: What are the Differences?

I sometimes get asked this question: could you use simulations instead of synthetizations? Below is my answer, also focusing on some particular aspects of data synthetizations that differentiate them from other techniques. Simulations do not simulate joint distributions. Sure, if all your features behave like a mixture of multivariate normal distributions, you can use GMMs […]

High-value AI and Machine Learning Certifications Under $50

Our AI/ML research lab now offers a quick path to certification in generative AI and other modern topics relevant to new developments in the industry, as well as traditional and specialized certifications and training. All in Python. Probably the fastest and most affordable way to earn professional, high-value credentials offered by one of the […]

