
This complements the list that I posted earlier under the title “Math for Machine Learning: 14 Must-Read Books”, available here. Many of the following books have a free PDF version, their own website and GitHub repository, and usually you can purchase the print version. Some are self-published, with the PDF version regularly updated, and even browsable online. I included a few textbooks from top companies and universities. Whenever possible, the link to the free version is posted here.
Books Published in 2020 or Later
This section covers new books, technical and non-technical, directly or indirectly related to machine learning. They quickly gained a lot of popularity.
1. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. November 2022 (850 pages). By Aurélien Géron. With this updated third edition, the author explores a range of techniques, starting with simple linear regression and progressing to deep neural networks. Numerous code examples and exercises throughout the book help you apply what you’ve learned. Programming experience is all you need to get started. Visit the GitHub repository here, and see what’s new in the third edition, here. You can purchase this O’Reilly book, here.
2. Learning Scientific Programming with Python. This is the second edition, published in 2020. With 570 pages in the print version, it is a good reference on the topic, especially for Matplotlib, NumPy and SciPy users and beginners. The book, authored by Christian Hill, is available here on the SciPython.com website. The best way to use the online browsable version is via the search box on the website, which is much more useful than the table of contents.
3. Python Quick Guide. This is more of an online reference than a book, but it is particularly useful and well written. The quick guide, available here, explains how to get started, with detailed installation instructions. The full guide, here, is easy to navigate thanks to the table of contents in the left panel. You can buy the PDF version here (405 pages). This tutorial is designed for software programmers who need to learn the Python programming language from scratch. You should have a basic understanding of computer programming terminology. A basic understanding of any programming language is a plus.
4. Intuitive Machine Learning and Explainable AI. Self-published by Vincent Granville, 2022 (156 pages). With 25 pages of Python code on GitHub (here), animated data visualizations on YouTube (here), and a glossary. Several of the 11 chapters are based on articles available for free, here. The most recent full version of the book can be purchased here, together with its companion book. Contains a lot of new, previously unpublished material, ranging from simple to advanced but always explained in simple English. Examples include the convergence of very deep neural networks, a simple alternative to XGBoost, and a generic unsupervised regression technique that covers all regression methods (generic logistic, Lasso and so on) as well as curve fitting and clustering in a single constrained optimization algorithm. Heavy use of synthetic data and model-free inference. Read more here.

5. Concise Machine Learning. Lecture notes for UC Berkeley’s class CS 189/289A, by Jonathan Shewchuk (2022). This report contains lecture notes for UC Berkeley’s introductory class on Machine Learning. It covers many methods for classification and regression, and several methods for clustering and dimensionality reduction. It is concise because nothing is included that cannot be written or spoken in a single semester’s lectures and because the choice of topics is limited to a small selection of particularly useful, popular algorithms. You can download these notes here (172 pages) or view them on LinkedIn, here.
6. Practical Statistics for Data Scientists. O’Reilly, 2nd edition, 2020 (368 pages). By Peter Bruce, Andrew Bruce, and Peter Gedeck. It covers 50+ essential concepts using R and Python. The PDF version of the book (and a few others about machine learning with R) is posted on ResearchGate, here. Available on Amazon, here. The preface says: “We are well aware of the limitations of traditional statistics instruction: statistics as a discipline is a century and a half old, and most statistics textbooks and courses are laden with the momentum and inertia of an ocean liner. All the methods in this book have some connection — historical or methodological — to the discipline of statistics. Methods that evolved mainly out of computer science, such as neural nets, are not included.” Good reading nonetheless, with a strong emphasis on modern statistical techniques used in machine learning.
7. Algorithms for Decision Making. By Mykel Kochenderfer, Associate Professor at Stanford University, with co-authors Tim Wheeler and Kyle Wray. Published by MIT Press in 2022 (700 pages). Website with free access: AlgorithmsBook.com. This book is about decision science. It will appeal more to people with an operations research background than to machine learning professionals. Over 100 optimization algorithms are covered, with mathematical explanations and formulas, computational complexity, and accompanying Julia code. Yet the style is concise.
8. A Brief Overview of 160 Cognitive Biases. By Murat Durmus. Includes a chapter on algorithmic biases. Self-published in 2022 (233 pages). The author posted a free PDF version on LinkedIn, here. You can buy the book here. Not a technical book, but well worth reading, and a hot topic recently. The author discusses perception biases and how they can permeate AI when the people designing and building the technology are unaware of the cultural and other biases ingrained in their own thinking.
Earlier Books
In this section, I included books that are very useful and popular, available for free if possible, yet not well known by everyone. In short, hidden gems. Of course there are other, older books, like the seminal masterpiece by Trevor Hastie, Robert Tibshirani and Jerome Friedman, “The Elements of Statistical Learning”, available for free here.
1. Foundations of Data Science. Published by Microsoft in 2017 (465 pages). Available for free, here. There is a video series associated with this PDF, accessible from the same web page. Authored by Avrim Blum, John Hopcroft and Ravi Kannan. Interesting topics covered in this tutorial include random graphs, Hidden Markov Models, wavelets, singular value decomposition and dimension reduction, random walks, generative adversarial networks, streaming and sketching (massive data problems), and clustering (K-means).
2. Pattern Recognition and Machine Learning. Published by Springer in 2006, authored by Christopher Bishop of Microsoft Research (758 pages). Available for free, here. Aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners, it assumes no previous knowledge of pattern recognition or machine learning concepts. Knowledge of multivariate calculus and basic linear algebra is required, and some familiarity with probability would be helpful though not essential, as the book includes a self-contained introduction to basic probability theory. Interesting topics include sparse kernel machines, approximate inference, Markov chain Monte Carlo and hidden Markov models. Of course it covers most of the classics known in 2006: principal component analysis, the EM algorithm, mixtures, graphical models, neural networks, and linear models.
3. A Brief Introduction to Machine Learning for Engineers. Preprint on arXiv, 237 pages, updated in 2018. Available here. Authored by Osvaldo Simeone, professor of information engineering at King’s College London. This is not a book about data engineering. This monograph aims at providing an introduction to key concepts, algorithms, and theoretical results in machine learning. The treatment concentrates on probabilistic models for supervised and unsupervised learning problems. It introduces fundamental concepts and algorithms by building on first principles, while also exposing the reader to more advanced topics with extensive pointers to the literature, within a unified notation and mathematical framework. The material covers discriminative and generative models, frequentist and Bayesian approaches, exact and approximate inference, as well as directed and undirected models. This monograph is meant as an entry point for researchers with a background in probability and linear algebra.
4. Data Clustering: Algorithms and Applications. Edited by Charu Aggarwal and Chandan Reddy. Published in 2014 (648 pages). In-depth monograph surveying the topic, with over 25 contributing authors. Available for free, here. Covers grid and density-based clustering, probabilistic, graph and hierarchical models, feature selection, time series, big data, categorical data, text and multimedia clustering, as well as uncertain and semi-supervised clustering.
In addition, some university textbooks are available online. For instance, Applied Data Science (Columbia University, 2013), Machine Learning for Intelligent Systems (Cornell University, 2018, CS 4780/CS 5780 courses), and Introduction to Probability (Harvard University, 2019, Stat 110 course). Berkeley has several textbooks freely available, for instance A Comprehensive Guide to Machine Learning (2019, CS 189 course) and Mathematics for Machine Learning (2018). See also Deep Learning (MIT Press, 2016). The online textbook Dive into Deep Learning has been adopted by 400 universities, according to its authors. Vipul Patel, Chief Data Scientist at SAP, offers a large collection of tutorials covering pretty much everything, see here.