My talk at the ODSC Conference, San Francisco, October 2023. It includes a notebook demonstration using our open-source Python libraries. View or download the PowerPoint presentation, here.
I discuss NoGAN, an alternative to standard tabular data synthetization. It runs 1000x faster than GAN while consistently delivering better results according to the most sophisticated evaluation metric, implemented here for the first time. It is a game changer that significantly reduces costs: less cloud or GPU time, shorter training time, and auto-tuning in place of manual parameter fine-tuning. It is now available as open source.
In real-life case studies, the synthetic data was generated in less than 5 seconds, versus 10 minutes with GAN. NoGAN also produced higher-quality results, verified via cross-validation. Because the implementation is so fast, the hyperparameters can be fine-tuned automatically and efficiently. I also discuss next steps to further improve the speed, the faithfulness of the generated data, auto-tuning, Gaussian NoGAN, and applications other than synthetization.
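To illustrate why a fast synthesizer makes auto-tuning practical, here is a minimal sketch. It is not the NoGAN code: `toy_synthesize`, `ks_distance`, `holdout_score`, and `tune` are hypothetical names, the synthesizer is a simple quantile-bin resampler used as a stand-in, and the per-feature Kolmogorov-Smirnov distance is a simplified substitute for the evaluation metric covered in the talk. The demo data is made up.

```python
import bisect
import random


def quantile_edges(values, n_bins):
    """Bin edges placed at equally spaced quantiles of one feature."""
    s = sorted(values)
    return [s[round(k * (len(s) - 1) / n_bins)] for k in range(n_bins + 1)]


def toy_synthesize(train, n_bins, n_samples, rng):
    """Stand-in synthesizer (an assumption, not NoGAN): pick a training row,
    then draw each feature uniformly inside the quantile bin of that row."""
    n_features = len(train[0])
    edges = [quantile_edges([row[j] for row in train], n_bins)
             for j in range(n_features)]
    synth = []
    for _ in range(n_samples):
        row = rng.choice(train)
        new_row = []
        for j, x in enumerate(row):
            e = edges[j]
            # Index of the bin containing x, clamped to [0, n_bins - 1].
            k = min(max(bisect.bisect_right(e, x) - 1, 0), n_bins - 1)
            new_row.append(rng.uniform(e[k], e[k + 1]))
        synth.append(new_row)
    return synth


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between
    the empirical CDFs of a and b."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in a + b:
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(fa - fb))
    return gap


def holdout_score(synth, valid):
    """Mean per-feature KS distance between synthetic and held-out data
    (lower is better; this only checks marginal distributions)."""
    n_features = len(valid[0])
    total = 0.0
    for j in range(n_features):
        total += ks_distance([r[j] for r in synth], [r[j] for r in valid])
    return total / n_features


def tune(train, valid, grid, seed=0):
    """Grid search: one cheap synthesis per candidate hyperparameter value."""
    best_bins, best_score = None, float("inf")
    for n_bins in grid:
        rng = random.Random(seed)
        synth = toy_synthesize(train, n_bins, len(valid), rng)
        score = holdout_score(synth, valid)
        if score < best_score:
            best_bins, best_score = n_bins, score
    return best_bins, best_score


# Made-up demo data: two correlated features.
data_rng = random.Random(42)
real = []
for _ in range(600):
    x = data_rng.gauss(0.0, 1.0)
    real.append([x, 0.8 * x + data_rng.gauss(0.0, 0.5)])
train, valid = real[:400], real[400:]

grid = [2, 5, 10, 20, 50]
best_bins, best_score = tune(train, valid, grid)
print(f"best n_bins = {best_bins}, holdout score = {best_score:.3f}")
```

Each candidate value costs one full synthesis plus one evaluation on held-out data, so a loop like this is only affordable when a single run takes seconds rather than minutes.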
Additional material, including my book “Statistical Optimization for GenAI and Machine Learning”, can be found here. To avoid missing future articles and to access members-only content, sign up for my free newsletter, here.
Vincent Granville is a pioneering GenAI scientist and machine learning expert, co-founder of Data Science Central (acquired by a publicly traded company in 2020), Chief AI Scientist at MLTechniques.com, former VC-funded executive, author, and patent owner (one patent relates to LLMs). Vincent’s past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET.
Vincent is also a former post-doc at Cambridge University and the National Institute of Statistical Sciences (NISS). He has published in the Journal of Number Theory, the Journal of the Royal Statistical Society (Series B), and IEEE Transactions on Pattern Analysis and Machine Intelligence. He is the author of multiple books, including “Synthetic Data and Generative AI” (Elsevier, 2024). Vincent lives in Washington state and enjoys doing research on stochastic processes, dynamical systems, experimental math, and probabilistic number theory. He recently launched a GenAI certification program, offering state-of-the-art, enterprise-grade projects to participants.