Statistical Science

Data Sets Featured Posts Machine Learning Statistical Science Synthetic Data

Military-grade Fast Random Number Generator Based on Quadratic Irrationals

This article is an extract from my book “Synthetic Data and Generative AI”, available here. There are very few serious articles in the literature dealing with digits of irrational numbers to build a pseudo-random number generator (PRNG). It seems that this idea was abandoned long ago due to the computational complexity and the misconception that such […]
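To make the idea concrete, here is a minimal sketch of how binary digits of a quadratic irrational can be produced with exact integer arithmetic. It is an illustration only, not the generator described in the book: the function name `sqrt_bits` and the choice of sqrt(2) are assumptions made for this example.

```python
from math import isqrt

def sqrt_bits(q, n):
    """Return the first n binary digits of the fractional part of sqrt(q).

    Hypothetical helper for illustration; q is a positive integer
    that is not a perfect square.
    """
    # isqrt(q * 4**n) equals floor(sqrt(q) * 2**n), computed exactly.
    scaled = isqrt(q << (2 * n))
    # Subtract the integer part of sqrt(q), shifted by n bits,
    # to keep only the fractional digits.
    frac = scaled - (isqrt(q) << n)
    # Read the n bits from most significant to least significant.
    return [(frac >> (n - 1 - k)) & 1 for k in range(n)]

bits = sqrt_bits(2, 10_000)
print(bits[:8])               # leading fractional bits of sqrt(2)
print(sum(bits) / len(bits))  # empirical proportion of ones
```

Because everything is done with Python integers, there is no floating-point rounding. The cost of `isqrt` grows with the number of digits requested, which is the kind of computational burden the excerpt alludes to.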

Featured Posts Machine Learning Statistical Science Synthetic Data Time Series Visualization

Machine Learning Cloud Regression: The Swiss Army Knife of Optimization

Entitled “Machine Learning Cloud Regression: The Swiss Army Knife of Optimization”, the full version in PDF format is accessible in the “Free Books and Articles” section, here. It is also discussed in detail, with Python code, in chapter 1 of my book “Intuitive Machine Learning and Explainable AI”, available here. Many machine learning and statistical techniques exist as seemingly unrelated, […]

Featured Posts Machine Learning Statistical Science Stochastic Systems Synthetic Data Time Series

Weird Random Walks: Synthetizing, Testing and Leveraging Quasi-randomness

Entitled “Weird Random Walks: Synthetizing, Testing and Leveraging Quasi-randomness”, the full version in PDF format is accessible in the “Free Books and Articles” section, here. I discuss different types of synthesized random walks that are almost perfectly random, in one and two dimensions. Besides their theoretical interest, they provide new modeling tools, especially for physicists, engineers, natural […]

Experimental Math Featured Posts Machine Learning Statistical Science

New Perspective on the Riemann Hypothesis

Entitled “New Perspective on the Riemann Hypothesis”, the full version in PDF format is accessible in the “Free Books and Articles” section, here. In about 10 pages (plus Python code, exercises and figures), this article constitutes a scratch course on the subject. It covers a large range of topics, both recent and unpublished, in a […]

Data Sets Featured Posts Machine Learning Podcasts Statistical Science Synthetic Data

Synthetic Data in Machine Learning: What, Why, How?

In this episode, Nicolai Baldin (CEO) and Simon Swan (Machine Learning Lead) of Synthesized welcome Vincent Granville, the founder of Data Science Central and MLTechniques.com, to discuss synthetic data generation, share secrets about machine learning on synthetic data, cover key challenges with synthetic data, and explore using generative models to solve issues related to fairness and […]

Books Explainable AI Featured Posts Machine Learning Statistical Science Synthetic Data Visualization

2nd Edition of My Book Now Published, with Python Code

The book covers supervised classification, including fractal classification, as well as unsupervised clustering, using an innovative approach. Datasets are first mapped onto an image, then processed using image filtering techniques. I discuss the analogy with neural networks, comparing very deep but sparse neural networks with standard networks. […]

Featured Posts Machine Learning Statistical Science Stochastic Systems Time Series

Gentle Introduction to Linear Algebra, with Spectacular Applications

This material is also discussed in detail, with Python code, in chapter 3 of my book “Intuitive Machine Learning and Explainable AI”, available here. This is not a traditional tutorial on linear algebra. The material presented here, in a compact style, is rarely taught in college classes. It covers a wide range of topics, while avoiding […]

Explainable AI Featured Posts Machine Learning ML with Excel Statistical Science Synthetic Data Visualization

Fuzzy Regression: A Generic, Model-free, Math-free Machine Learning Technique

Some people climb Mount Everest solo in winter, with no oxygen. Some mathematicians prove difficult theorems using only elementary arithmetic. Such a proof, despite being labeled “elementary”, is typically far more complicated than one based on advanced mathematical theory. The people accomplishing these feats are very rare. Introduction: For years, I have developed machine learning techniques […]

Data Sets Explainable AI Featured Posts Machine Learning ML with Excel Statistical Science Synthetic Data

Little Known Secrets about Interpretable Machine Learning on Synthetic Data

Entitled “Little Known Secrets about Interpretable Machine Learning on Synthetic Data”, the full version in PDF format is accessible in the “Free Books and Articles” section, here. This first article in a new series on synthetic data and explainable AI focuses on making linear regression more meaningful and controllable. It includes synthetic data, advanced machine learning with Excel, […]

Machine Learning Statistical Science

Why are Confidence Regions Elliptic? Simple Explanation

A 90% confidence region is a domain of minimum area, containing 90% of the mass of a distribution. By distribution, here I mean a bivariate probability distribution, though the concept is not specific to machine learning. The 90% is called the confidence level, and I denote it as γ. Confidence regions are a generalization of […]
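As a rough illustration of this definition, the sketch below assumes a bivariate Gaussian distribution, which the excerpt does not specify. Under that assumption, the minimum-area region of level γ is the ellipse where the squared Mahalanobis distance is at most -2 ln(1 - γ), because the contour lines of the density are ellipses and the squared distance follows a chi-square law with 2 degrees of freedom. The covariance values and function names are hypothetical, chosen only for the example.

```python
import math
import random

def mahalanobis_sq(x, y, sxx, sxy, syy):
    """Squared Mahalanobis distance of (x, y) from the origin,
    for the covariance matrix [[sxx, sxy], [sxy, syy]]."""
    det = sxx * syy - sxy * sxy
    return (syy * x * x - 2.0 * sxy * x * y + sxx * y * y) / det

def region_threshold(gamma):
    """Chi-square quantile with 2 degrees of freedom at level gamma."""
    return -2.0 * math.log(1.0 - gamma)

def estimate_coverage(gamma=0.90, n=100_000, sxx=2.0, sxy=0.8, syy=1.0, seed=0):
    """Monte Carlo estimate of the mass inside the gamma-level ellipse
    (assumed centered bivariate Gaussian, illustrative covariance)."""
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 covariance matrix, written out by hand.
    a = math.sqrt(sxx)
    b = sxy / a
    c = math.sqrt(syy - b * b)
    t = region_threshold(gamma)
    hits = 0
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x, y = a * z1, b * z1 + c * z2  # correlated Gaussian point
        if mahalanobis_sq(x, y, sxx, sxy, syy) <= t:
            hits += 1
    return hits / n

print(region_threshold(0.90))  # squared radius of the 90% ellipse
print(estimate_coverage())     # should be close to 0.90
```

For non-Gaussian distributions the minimum-area region is still bounded by a contour line of the density, but that contour is generally not an ellipse.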
