Advanced Machine Learning with Basic Excel: Simple Alternative to XGBoost

The full version of this article, entitled “Advanced Machine Learning with Basic Excel”, is available in PDF format in the “Free Books and Articles” section as paper #11, here. The method is also discussed in detail, with Python code, in chapter 2 of my book “Intuitive Machine Learning and Explainable AI”, available here.

I discuss ensemble methods that combine many mini decision trees, blended with regression, explained in simple English with both Excel and Python implementations. The case study is a natural language processing (NLP) problem. This is ideal reading for professionals who want to start light with machine learning (say, with Excel) and move quickly to much more advanced material and Python. The Python code is not just a call to some black-box functions, but a full-fledged, detailed procedure on its own. This algorithm is in the same category as boosting, bagging, stacking and AdaBoost.

Abstract

The method described here illustrates the concept of ensemble methods, applied to a real-life NLP problem: ranking articles published on a website to predict performance of future blog posts yet to be written, and help decide on title and other features to maximize traffic volume and quality, and thus revenue. The method, called hidden decision trees (HDT), implicitly builds a large number of small usable (possibly overlapping) decision trees. Observations that don’t fit in any usable node are classified with an alternate method, typically simplified logistic regression.
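The hybrid idea described above can be illustrated with a minimal Python sketch. It is not the article's implementation, and it rests on several assumptions: a "node" is taken to be a combination of binary feature values, a node is "usable" when it holds at least `min_size` observations, and the fallback is a simple additive model standing in for the simplified logistic regression mentioned in the text. All function names, the threshold, and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_usable_nodes(rows, min_size=2):
    """Group observations by feature combination; keep nodes with enough data."""
    groups = defaultdict(list)
    for features, score in rows:
        groups[tuple(features)].append(score)
    # A node's prediction is the average score of its observations.
    return {node: mean(scores) for node, scores in groups.items()
            if len(scores) >= min_size}


def fit_fallback(rows):
    """Additive stand-in for the fallback model (assumption, not logistic regression)."""
    overall = mean(score for _, score in rows)
    n_features = len(rows[0][0])
    effects = []
    for j in range(n_features):
        on = [score for features, score in rows if features[j] == 1]
        # Effect of feature j: average score when present, minus the overall average.
        effects.append(mean(on) - overall if on else 0.0)
    return overall, effects


def predict(features, nodes, fallback):
    """Use the node average when the node is usable, else the fallback model."""
    node = tuple(features)
    if node in nodes:
        return nodes[node], "node"
    overall, effects = fallback
    value = overall + sum(e for f, e in zip(features, effects) if f == 1)
    return value, "fallback"


# Toy data: (binary title features, traffic score). Purely illustrative.
rows = [
    ((1, 0), 10.0),
    ((1, 0), 12.0),
    ((0, 1), 4.0),
    ((0, 1), 6.0),
    ((1, 1), 9.0),   # only one observation, so this node is not usable
]
nodes = fit_usable_nodes(rows, min_size=2)
fallback = fit_fallback(rows)

print(predict((1, 0), nodes, fallback))  # served by a usable node
print(predict((1, 1), nodes, fallback))  # served by the fallback model
```

The point of the design is that well-populated nodes get a direct, easy-to-interpret prediction, while sparse feature combinations are still covered by a smoother model.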

This hybrid procedure offers the best of both worlds: decision tree combos and regression models. It is intuitive and simple to implement. The code is written in Python, and I also offer a light version in basic Excel. The interactive Excel version is targeted at analysts interested in learning Python or machine learning. HDT fits in the same category as bagging, boosting, stacking and AdaBoost. This article encourages you to understand all the details, upgrade the technique if needed, and play with the full code or spreadsheet as if you wrote it yourself. This is in contrast with using black-box Python functions without understanding their inner workings and limitations. Finally, I discuss how to build model-free confidence intervals for the predicted values.
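One common way to obtain a confidence interval without assuming a statistical model is to read it off the empirical distribution of the scores observed in a node. The sketch below takes that approach; whether it matches the construction used in the full article is an assumption, and the function name, the trimming rule, and the sample scores are hypothetical.

```python
def empirical_interval(values, level=0.90):
    """Interval obtained by trimming an equal share of observations from each tail."""
    if not values:
        raise ValueError("need at least one observation")
    ordered = sorted(values)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    # Number of observations dropped from each end; the small epsilon
    # guards against floating-point error in tail * n.
    k = int(tail * n + 1e-9)
    k = min(k, (n - 1) // 2)  # always keep at least one observation
    return ordered[k], ordered[n - 1 - k]


# Hypothetical scores observed in a single node.
node_scores = [3.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 20.0]
low, high = empirical_interval(node_scores, level=0.80)
print(low, high)
```

No distribution is fitted here: the bounds are actual observed values, so the interval only becomes reliable once a node contains enough observations.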

Table of Contents

Methodology
. . . How hidden decision trees (HDT) work
. . . NLP Case study: summary and findings
. . . Parameters
. . . Improving the methodology
Implementation details
. . . Correcting for bias
. . . . . . Time-adjusted scores
. . . Excel spreadsheet
. . . Python code and dataset
Model-free confidence intervals and perfect nodes
. . . Interesting asymptotic properties of confidence intervals

Download the Article

The technical article is accessible in the “Free Books and Articles” section as paper #11, here. Text highlighted in orange in this PDF document marks keywords that will be incorporated in the index when I aggregate all my related articles into a single book about innovative machine learning techniques. Text highlighted in blue corresponds to external clickable links, mostly references. Red is used for internal links, pointing to a section, bibliography entry, equation, and so on.

To avoid missing future articles, sign up for our newsletter, here.

About the Author

Vincent Granville is a pioneering GenAI scientist, co-founder at BondingAI.io, the LLM 2.0 platform for hallucination-free, secure, in-house, lightning-fast Enterprise AI at scale with zero weight and no GPU. He is also an author (Elsevier, Wiley), publisher, and successful entrepreneur with a multi-million-dollar exit. Vincent’s past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. He completed a post-doc in computational statistics at the University of Cambridge.
