Breakthrough: Zero-Weight LLM for Accurate Predictions and High-Performance Clustering
- Vincent Granville
- May 4, 2024
While most AI companies keep building LLMs with more weights and tokens (one trillion is now a standard figure), I went in the opposite direction. Of course, zero weight means that there is no neural network behind the scenes. More specifically, it means that there is no lengthy black-box process to find the “best” weights […]
Genome: Synthesizing DNA Sequences with LLM Techniques
- Vincent Granville
- December 8, 2023
This methodology is not focused on genome data alone. The purpose is to design a generic solution that may also work in other contexts, such as synthesizing molecules. The problem involves dealing with a large amount of “text”. Indeed, the sequences discussed here consist of letter arrangements, from an alphabet that has 5 symbols: A, […]
10 GenAI Notebooks: OpenAI, LLM, RAG, GPT, and More
- Vincent Granville
- December 1, 2023
For developers and AI/ML professionals. This comprehensive free resource, offered by our sponsor, is designed to give you hands-on experience and deeper insight into building cutting-edge GenAI applications. 🌟 Special Opportunity: You can win a pair of Apple AirPods simply by following the tutorial and learning something new. How to Participate: Follow these 2 […]
Number Theory: Longest Runs of Zeros in Binary Digits of Square Root of 2
- Vincent Granville
- October 27, 2023
The study of the longest runs of heads in coin tossing has a very long history, with roots in gaming and probability theory. Today, it has applications in cryptography and insurance. For random sequences or Bernoulli trials, the associated statistical properties and distributions have been studied in detail, even when the proportions of zeros and ones differ. Yet, […]
New Python Library to Evaluate AI-generated Data and Compare Models
- Vincent Granville
- September 19, 2023
Called GenAI-Evaluation, the library can be used, for instance, to assess the quality of tabular synthetic data. In this case, it measures how faithfully the synthetization mimics the real data it is derived from, by comparing the full joint empirical cumulative distribution functions (ECDFs) attached to the two datasets. It works with both categorical and numerical features, and returns […]
How to Fix a Failing Generative Adversarial Network
- Vincent Granville
- August 12, 2023
In this article, I explore different front-end strategies to improve a generative adversarial network (GAN) that leads to poor synthetization, in the context of tabular data generation. It is well known that tabular data is much more challenging than images when using deep neural networks for synthetization purposes. An algorithm may work very well […]
Synthesizing Geospatial Data with A Simple NoGAN Technique
- Vincent Granville
- July 28, 2023
If you regularly read my articles, you know that I developed several different techniques for data synthetization. Many are explained in detail in my upcoming book Synthetic Data and Generative AI (Elsevier), available here. It includes generative adversarial networks (GANs), copulas, agent-based modeling, methods based on interpolation, correlated noise mixtures, and more. The technique presented […]
Sound Generation in Python: Turning Your Data into Music
- Vincent Granville
- July 13, 2023
Not long ago, I published an article here entitled “The Sound that Data Makes”. The goal was to turn data (random noise in this case) into music. The hope was that by “listening” to your data, you could gain a different kind of insight, not conveyed by visualizations or tabular summaries. This article is […]
Generative AI: Synthetic Data Vendor Comparison and Benchmarking Best Practices
- Vincent Granville
- June 16, 2023
The goal of data synthetization is to produce artificial data that mimics the patterns and features present in existing, real data. Many generation methods and evaluation techniques are available, depending on purposes, the type of data, and the application field. Everyone is familiar with synthetic images in the context of computer vision, or synthetic text […]
Massively Speed-Up your Learning Algorithm, with Stochastic Thinning
- Vincent Granville
- April 7, 2023
Dramatically speed up your learning algorithm with stochastic thinning. Includes a use case, Python code, and regression and neural network illustrations.