Cybersecurity Use Case: AI Agent for Anomaly Detection
The case discussed here concerns fraudulent paid clicks used to defraud a Google advertiser. The sophisticated click fraud scheme, involving clicking viruses, data centers, and other means, is undetected by Google. I worked with the law firm involved in the litigation to build an agent able to pinpoint the sources of fraudulent traffic. The agent processes […]
Visualizing Trading Strategies that Consistently Outperform the Stock Market
In this short paper, I discuss two topics. First, strategies to trade the S&P 500 index with few trades over long time periods, identifying the best exit, entry, and re-entry points along the way, to beat the baseline return. The baseline consists of staying long the whole time. The dataset has 40 years’ worth of daily […]
Fast Random Generators with Infinite Period for Large-Scale Reproducible AI and Cryptography
Modern GenAI apps rely on billions, if not trillions, of pseudo-random numbers. You find them in the construction of latent variables in nearly all deep neural networks and almost all applications: computer vision, synthetization, and LLMs. Yet few AI systems offer reproducibility, though those described in my recent book do. When producing so many random […]
New GenAI Evaluation Metric, Ultrafast Search, and Perfect Randomness
This article covers three different GenAI topics. First, I introduce one of the best pseudo-random number generators (PRNG) with infinite period. Then I show how to evaluate the synthesized numbers using the full multivariate empirical distribution (the same KS-based approach that I used for NoGAN evaluation), but this time with ultra-fast radix search, a competitor to […]
Sampling Outside the Observation Range with Quantile Convolution
All of the GenAI apps that I tested, including my own, have the same problem: they cannot easily generate data outside the observation range. As an example, let’s focus on the insurance dataset discussed in my new book. I use it to generate synthetic data with GAN (generative adversarial networks) and the NoGAN models discussed […]
Quantum Derivatives, GenAI, and the Riemann Hypothesis
Have you ever encountered a function or cumulative probability distribution (CDF) that is nowhere differentiable, yet continuous everywhere? Some are featured in this article. For a CDF, it means that it does not have a probability density function (PDF), and for a standard function, it has no derivative. At least, not until now. The quantum […]
Number Theory: Longest Runs of Zeros in Binary Digits of Square Root of 2
Studying the longest head runs in coin tossing has a very long history, starting in gaming and probability theory. Today, it has applications in cryptography and insurance. For random sequences or Bernoulli trials, the associated statistical properties and distributions have been studied in detail, even when the proportions of zero and one are different. Yet, […]
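To make the object of study concrete, here is a minimal Python sketch that extracts the first binary digits of the square root of 2 using exact integer arithmetic and measures the longest run of zeros among them. The function names `sqrt2_binary_digits` and `longest_zero_run` are hypothetical, chosen for illustration; this is not the method used in the article, only one straightforward way to reproduce the quantity it discusses.

```python
from math import isqrt

def sqrt2_binary_digits(n: int) -> str:
    """Return the first n binary digits of sqrt(2) after the binary point."""
    # floor(sqrt(2) * 2**n) equals isqrt(2 * 4**n), computed exactly on integers.
    scaled = isqrt(2 << (2 * n))
    bits = bin(scaled)[2:]  # leading '1' is the integer part, then n fractional bits
    return bits[1:]

def longest_zero_run(bits: str) -> int:
    """Length of the longest block of consecutive '0' characters."""
    # Splitting on '1' leaves exactly the maximal runs of zeros.
    return max(len(run) for run in bits.split("1"))

if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        digits = sqrt2_binary_digits(n)
        print(n, longest_zero_run(digits))
```

For a sequence of n fair coin tosses, the longest run of a given symbol is known to grow roughly like the base-2 logarithm of n, which gives a natural yardstick for the values printed above.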
GenAI: Fast Data Synthetization with Distribution-free Hierarchical Bayesian Models
Deep learning models such as generative adversarial networks (GAN) require a lot of computing power, and are thus expensive. Also, they may not converge. What if you could produce better data synthetizations, in a fraction of the time, with explainable AI and substantial cost savings? This is what Hierarchical Deep Resampling was designed for. It […]
A Synthetic Stock Exchange Played with Real Money
Not only that, but you can predict — more precisely compute with absolute certainty — what the value of any stock will be tomorrow. Transaction fees are well below 0.05% and the market, at least in the version presented here, is fair: in other words, a zero-sum game if you play by luck. If instead […]
Smart Grid Search for Faster Hyperparameter Tuning
The objective is two-fold. First, I introduce a 2-parameter generalization of the discrete geometric and zeta distributions: in effect, a combination of both. It allows you to simultaneously match the variance and mean in observed data, thanks to the two parameters p and α. By contrast, each distribution taken separately only has one parameter, and […]