# Fast Classification and Clustering via Image Convolution Filters

Subtitled “Alternative to Generative Mixture Models”, the full version of this article is available in PDF format in the “Free Books and Articles” section. It is also described in detail in my book “Stochastic Processes and Simulations: A Machine Learning Perspective”.

I explain, with Python code and numerous illustrations, how to turn traditional tabular data into images, in order to perform both clustering and supervised classification using simple image filtering techniques. I also explain how to generalize the methodology to higher dimensions, using tensors rather than images: after all, image bitmaps are 2D arrays or matrices, that is, 2D tensors. Because the algorithm classifies the entire space (in low dimensions), the resulting classification rule is very fast. I also discuss the convergence of the algorithm, and how to further improve its speed.

This short article covers many topics and can be used as a first introduction to synthetic data generation, mixture models, boundary effects, explainable AI, fractal classification, stochastic convergence, GPU machine learning, deep neural networks, and model-free Bayesian classification. I use very little math, making it accessible to non-mathematicians and lay readers. Introducing an original, intuitive approach to general classification problems, I explain in plain English how it relates to deep and very deep neural networks. In the process, I make connections to image segmentation, histogram equalization, hierarchical clustering, convolution filters, and stochastic processes. I also compare standard neural networks with very deep but sparse ones, in terms of speed and performance. The fractal classifier, an example of a very deep neural network, is illustrated with a Python-generated video. It is useful when dealing with massively overlapping clusters and a large number of observations. Hyperparameters allow you to fine-tune the level of cluster overlap in the synthetic data, and the shape of the clusters.

## Abstract

I generate synthetic data using a superimposition of stochastic processes, and compare this approach to Bayesian generative mixture models (Gaussian mixtures), explaining the benefits and differences. The actual classification and clustering algorithms are model-free, and performed on the GPU as image filters, after transforming the raw data into an image. I then discuss the generalization to 3D or 4D, and to higher dimensions with sparse tensors. The technique is particularly suitable when the number of observations is large, and the overlap between clusters is substantial.
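The two preliminary steps, simulating labeled points and mapping them onto a bitmap, can be sketched as follows. This is only a minimal illustration under simplifying assumptions, not the code from the full article: each cluster is drawn from independent logistic distributions around a fixed center (rather than from superimposed stochastic processes), and the names `make_clusters` and `to_bitmap`, the grid size, and the `bounds` of the observation window are all hypothetical.

```python
import numpy as np

def make_clusters(centers, n_per_cluster, scale, seed=0):
    """Draw 2D points around each center, with logistic-distributed coordinates.

    Assumption: one independent logistic distribution per coordinate;
    a larger `scale` produces more overlap between clusters.
    """
    rng = np.random.default_rng(seed)
    points, labels = [], []
    for k, (cx, cy) in enumerate(centers):
        xy = rng.logistic(loc=(cx, cy), scale=scale, size=(n_per_cluster, 2))
        points.append(xy)
        labels.append(np.full(n_per_cluster, k + 1))  # class labels start at 1
    return np.vstack(points), np.concatenate(labels)

def to_bitmap(points, labels, size=200, bounds=(-4.0, 4.0)):
    """Map each observation to a pixel; 0 means 'no observation here'."""
    lo, hi = bounds
    cols = np.floor((points[:, 0] - lo) / (hi - lo) * size).astype(int)
    rows = np.floor((points[:, 1] - lo) / (hi - lo) * size).astype(int)
    # Drop observations that fall outside the bitmap window.
    inside = (cols >= 0) & (cols < size) & (rows >= 0) & (rows < size)
    bitmap = np.zeros((size, size), dtype=np.int64)
    # If several points share a pixel, the last one written wins.
    bitmap[rows[inside], cols[inside]] = labels[inside]
    return bitmap

points, labels = make_clusters(
    centers=[(-1.0, -1.0), (1.0, 1.0)], n_per_cluster=500, scale=0.5
)
bitmap = to_bitmap(points, labels)
print(bitmap.shape, np.unique(bitmap))
```

Each pixel value plays the role of a color: 0 for an empty pixel, and 1 or 2 for the class of the observation that landed there.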

The classification can be done with few iterations and a large filter window. This is comparable to a neural network: the pixels in the local window are the nodes, and the weight function depends on their distance to the local center. Or you can implement the method with a large number of iterations (the equivalent of hundreds of layers in a deep neural network) and a tiny window. This latter case corresponds to a sparse network with zero or one connection per node. It is used to implement fractal classification, where point labeling changes at each iteration, around highly non-linear cluster boundaries. This is equivalent to putting a prior on class assignment probabilities in a Bayesian framework. Yet, classification is performed without an underlying model. Finally, the clustering (unsupervised) part of the algorithm relies on the same filtering techniques, combined with a color equalizer. The latter can be used to perform hierarchical clustering.
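To make the filtering idea concrete, here is a minimal NumPy sketch of one possible reading of the method, not the implementation from the full article. Each unlabeled pixel receives the class with the largest distance-weighted vote in its local window, and the filter is applied repeatedly until the whole bitmap is labeled. The weight function `1 / (1 + d^2)`, the rule that already-labeled pixels keep their label, and the names `distance_kernel`, `filter_pass`, and `classify_space` are assumptions made for illustration.

```python
import numpy as np

def distance_kernel(radius):
    """Weights that decrease with distance to the window center (assumed form)."""
    ax = np.arange(-radius, radius + 1)
    dy, dx = np.meshgrid(ax, ax, indexing="ij")
    return 1.0 / (1.0 + dx**2 + dy**2)

def filter_pass(bitmap, n_classes, kernel):
    """One application of the filter to every pixel of the bitmap."""
    r = kernel.shape[0] // 2
    h, w = bitmap.shape
    votes = np.zeros((n_classes, h, w))
    for k in range(1, n_classes + 1):
        # Zero padding: pixels outside the image cast no vote.
        padded = np.pad((bitmap == k).astype(float), r)
        for i in range(kernel.shape[0]):
            for j in range(kernel.shape[1]):
                votes[k - 1] += kernel[i, j] * padded[i:i + h, j:j + w]
    best = votes.argmax(axis=0) + 1          # ties go to the lowest class index
    has_vote = votes.max(axis=0) > 0
    out = bitmap.copy()
    fill = (bitmap == 0) & has_vote          # assumption: labeled pixels are kept
    out[fill] = best[fill]
    return out

def classify_space(bitmap, n_classes, radius=3, max_iter=100):
    """Iterate the filter until every pixel is labeled or nothing changes."""
    kernel = distance_kernel(radius)
    current = bitmap
    for _ in range(max_iter):
        if not (current == 0).any():
            break
        updated = filter_pass(current, n_classes, kernel)
        if np.array_equal(updated, current):
            break
        current = updated
    return current

# Toy bitmap with a single labeled observation per class.
seeds = np.zeros((20, 20), dtype=np.int64)
seeds[5, 5] = 1
seeds[14, 14] = 2
label_map = classify_space(seeds, n_classes=2, radius=3)
print((label_map == 1).sum(), (label_map == 2).sum())
```

In this sketch, a large `radius` labels the bitmap in a handful of passes, while `radius=1` needs many more passes with a tiny window, which mirrors the trade-off between a shallow network and a very deep, sparse one.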

The Python code, included in this document, is also on my GitHub repository. A data animation illustrates how simple the methodology is: each frame in the video represents one iteration, that is, a single application of the filter to all the data points. Indeed, the classifier can be used as a black-box system. It follows the modern trend of interpretable machine learning, also called explainable AI. The video shows how the algorithm converges to an optimum, producing a classification of the entire observation space. Classifying a new point is then immediate: read its color. The whole system is time-efficient: it does not require computing the pairwise distances between all training set points. However, it is memory-intensive. Large filters can be slow, though they require very few iterations. I discuss a simple technique to make them a lot faster.
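Once the whole observation space is classified, "reading the color" of a new point amounts to a single array lookup. The sketch below illustrates this under stated assumptions: the function name `classify_point`, the `bounds` of the observation window, and the choice to clamp out-of-window points to the nearest edge pixel are all hypothetical, and the label map is a toy one built by hand rather than the output of the filter.

```python
import numpy as np

def classify_point(x, y, label_map, bounds=(-4.0, 4.0)):
    """Return the class stored at the pixel that contains (x, y)."""
    lo, hi = bounds
    n_rows, n_cols = label_map.shape
    col = int(np.floor((x - lo) / (hi - lo) * n_cols))
    row = int(np.floor((y - lo) / (hi - lo) * n_rows))
    # Assumption: points outside the window are clamped to the nearest edge.
    col = min(max(col, 0), n_cols - 1)
    row = min(max(row, 0), n_rows - 1)
    return int(label_map[row, col])

# Toy label map: left half is class 1, right half is class 2.
toy_map = np.ones((100, 100), dtype=np.int64)
toy_map[:, 50:] = 2

print(classify_point(-2.0, 0.0, toy_map))
print(classify_point(3.0, 1.0, toy_map))
```

The cost of the lookup does not depend on the number of training points, which is where the speed comes from. The price is the memory needed to store one label per pixel.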

Introduction

Generating the synthetic data

• Simulations with logistic distribution
• Mapping the raw observations onto an image bitmap

Classification and unsupervised clustering

• Supervised classification based on convolution filters
• Clustering based on histogram equalization
• Fractal classification: deep neural network analogy
• Generalization to higher dimensions
• Towards a very fast implementation

Python code

• Fractal classification
• GPU classification and clustering