
My talk at the ODSC Conference, San Francisco, October 2023. It includes a notebook demonstration using our open-source Python libraries. View or download the PowerPoint presentation here.
I discuss NoGAN, an alternative to standard tabular data synthetization. It runs 1000x faster than GAN while consistently delivering better results according to the most sophisticated evaluation metric, implemented here for the first time. It is a game changer that significantly reduces costs: less cloud or GPU time, less training time, and manual parameter fine-tuning replaced by auto-tuning. It is now available as open source.
In real-life case studies, the synthetic data was generated in less than 5 seconds, versus 10 minutes with GAN. It was also of higher quality, as verified via cross-validation. Because the implementation is so fast, the hyperparameters can be fine-tuned automatically and efficiently. I also discuss next steps: further improving the speed, the faithfulness of the generated data, and auto-tuning, as well as Gaussian NoGAN and applications other than synthetization.
Additional material, including my book “Statistical Optimization for GenAI and Machine Learning”, can be found here. To avoid missing future articles and to access members-only content, sign up for my free newsletter here.
Speaker
Vincent is also a former post-doc at Cambridge University and the National Institute of Statistical Sciences (NISS). He has published in the Journal of Number Theory, the Journal of the Royal Statistical Society (Series B), and IEEE Transactions on Pattern Analysis and Machine Intelligence. He is the author of multiple books, including “Synthetic Data and Generative AI” (Elsevier, 2024). Vincent lives in Washington state and enjoys doing research on stochastic processes, dynamical systems, experimental math, and probabilistic number theory. He recently launched a GenAI certification program, offering state-of-the-art, enterprise-grade projects to participants.