Massively Speed-Up your Learning Algorithm, with Stochastic Thinning

You have to see it to believe it! Imagine a technique where you randomly delete as many as 80% of the observations in your training set without decreasing predictive power (in many cases, actually improving it), while reducing computing time by an order of magnitude. In its simplest version, that’s what stochastic thinning does. Here, performance improvement is measured outside the training set, on the validation set (also called test data). I illustrate this method on a real-life dataset, in the context of regression and neural networks. In the latter case, it speeds up the training stage by a noticeable factor. The thinning process applies to the training set, and may involve multiple tiny random subsets called fractional training sets, which together represent less than 20% of the training data. It can also be used for data compression, or to measure the strength of a machine learning algorithm.

Predicted vs. observed (diagonal is perfect fit), after calibration (right plot)
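
To make the simplest version of the idea concrete, here is a minimal sketch in Python. It is not the code from the technical article: the synthetic data, the function names `thin_and_fit` and `mse`, and the use of ordinary least squares are all assumptions chosen for illustration. It keeps a random 20% of the training rows, fits a regression on that subset only, and compares the validation error against a fit on the full training set.

```python
import numpy as np

def thin_and_fit(X_train, y_train, keep_fraction=0.2, seed=0):
    """Fit least squares on a random subset of the training rows.

    Hypothetical helper for illustration; not the article's implementation.
    """
    rng = np.random.default_rng(seed)
    n = X_train.shape[0]
    n_keep = max(1, int(round(keep_fraction * n)))
    # Indices of the observations kept for learning; the rest are dropped.
    kept = rng.choice(n, size=n_keep, replace=False)
    A = np.column_stack([np.ones(n_keep), X_train[kept]])  # intercept column
    coef, *_ = np.linalg.lstsq(A, y_train[kept], rcond=None)
    return coef, kept

def mse(coef, X, y):
    """Mean squared error of the fitted linear model on (X, y)."""
    A = np.column_stack([np.ones(X.shape[0]), X])
    return float(np.mean((A @ coef - y) ** 2))

# Synthetic data (an assumption; the article uses a real-life dataset).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(5000, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + data_rng.normal(scale=0.5, size=5000)
X_train, y_train = X[:4000], y[:4000]
X_val, y_val = X[4000:], y[4000:]

coef_full, _ = thin_and_fit(X_train, y_train, keep_fraction=1.0)
coef_thin, kept = thin_and_fit(X_train, y_train, keep_fraction=0.2)

mse_full = mse(coef_full, X_val, y_val)
mse_thin = mse(coef_thin, X_val, y_val)
print(f"rows kept: {len(kept)} of {len(X_train)}")
print(f"validation MSE, full training set: {mse_full:.4f}")
print(f"validation MSE, thinned (20%):     {mse_thin:.4f}")
```

On well-behaved synthetic data like this, the two validation errors should be close. Whether thinning helps or hurts on a given real dataset is an empirical question, which is why the comparison is made on held-out data.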

I also show the potential limitations of the new technique, and introduce the concepts of leading or influential observations (those kept for learning purposes) and followers (observations dropped from the training set). The term “influential observations” should not be confused with its usage in statistics, although in both cases it leads to explainable AI. The neural network used in this article produces replicable results by controlling all sources of randomness, a property rarely satisfied in other implementations.

Neural nets, stochastic convergence: different seeds lead to different local optima
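
The role of the seed can be illustrated with a small sketch. This is a generic one-hidden-layer network written with NumPy, not the network from the technical article; the function name `train_tiny_net`, the architecture, and the hyperparameters are assumptions. All randomness (weight initialization and mini-batch order) flows through a single seeded generator, so the same seed reproduces the same weights, while a different seed leads to a different solution.

```python
import numpy as np

def train_tiny_net(X, y, seed, hidden=8, epochs=20, lr=0.05, batch=32):
    """Train a one-hidden-layer tanh network with mini-batch gradient descent.

    Hypothetical illustration: every random choice uses the same seeded
    generator, so the run is fully determined by `seed`.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    n = X.shape[0]
    for _ in range(epochs):
        order = rng.permutation(n)  # seeded shuffling of the observations
        for start in range(0, n, batch):
            idx = order[start:start + batch]
            Xb, yb = X[idx], y[idx]
            m = len(idx)
            H = np.tanh(Xb @ W1 + b1)       # hidden activations
            err = H @ W2 + b2 - yb          # prediction error
            # Gradients of the mean squared error on this mini-batch.
            dH = np.outer(err, W2) * (1.0 - H ** 2)
            gW2 = 2.0 * H.T @ err / m
            gb2 = 2.0 * err.mean()
            gW1 = 2.0 * Xb.T @ dH / m
            gb1 = 2.0 * dH.mean(axis=0)
            W1 -= lr * gW1
            b1 -= lr * gb1
            W2 -= lr * gW2
            b2 -= lr * gb2
    return W1, b1, W2, b2

# Synthetic data (an assumption, for illustration only).
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

run_a = train_tiny_net(X, y, seed=1)
run_b = train_tiny_net(X, y, seed=1)  # same seed as run_a
run_c = train_tiny_net(X, y, seed=2)  # different seed

same_seed_match = all(np.array_equal(p, q) for p, q in zip(run_a, run_b))
diff_seed_match = np.array_equal(run_a[0], run_c[0])
print("same seed, identical weights:     ", same_seed_match)
print("different seed, identical weights:", diff_seed_match)
```

In larger frameworks there can be additional sources of randomness beyond the ones seeded here, so this sketch only shows the principle.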

If you are new to neural networks and deep learning, or manage a group of engineers developing or using such tools, the full technical article (13 pages, including 6 pages of Python code) will give you a quick overview of the issues and benefits surrounding these methods. It also offers a solid high-level introduction to the subject, including how to discover and overcome (or leverage) the problems faced.

Download the Article

The technical article, entitled Massively Speed-Up your Learning Algorithm, with Stochastic Thinning, is accessible in the “Free Books and Articles” section as paper #23, here. It contains links to my GitHub files, so you can easily copy and paste the code. In this PDF document, text highlighted in orange marks keywords that will be incorporated in the index when I aggregate all my related articles into books about machine learning, visualization and Python. Text highlighted in blue corresponds to external clickable links, mostly references. Red is used for internal links pointing to a section, bibliography entry, equation, and so on.

To avoid missing future articles, sign up for our newsletter, here.

About the Author

Vincent Granville is a pioneering GenAI scientist and co-founder at BondingAI.io, the LLM 2.0 platform for hallucination-free, secure, in-house, lightning-fast Enterprise AI at scale with zero weight and no GPU. He is also an author (Elsevier, Wiley), publisher, and successful entrepreneur with a multi-million-dollar exit. Vincent’s past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. He completed a post-doc in computational statistics at the University of Cambridge.

