Course: Intuitive Machine Learning and Explainable AI

This course is based on my book “Intuitive Machine Learning and Explainable AI”, available here. Participants will receive a free copy of this book. To register, follow this link. To receive updates about my future courses, sign up for my newsletter here.

The information below provides a brief overview of the course. See also my video presentation here, and a 6-min video extract from this live course, here.

Course Description

Solid machine learning foundations presented by a world-leading expert. Full life cycle of machine learning development applied to enterprise-grade projects. Includes Python coding, scientific computing, optimization algorithms, explainable AI and state-of-the-art methods favoring simplicity, scalability, reusability, replicability, fast implementation, and easy maintenance. From data cleaning to model design, testing and feature selection, to great visualizations that are easy to “sell” to stakeholders and decision makers. Depending on the students' background and interests, topics may cover augmented data, generative and mixture models, big data, deep neural networks, image processing, machine learning on GPU, graph models, curve and shape fitting, taxonomy creation (NLP) and more. Numerous regression methods, including logistic regression and Lasso, are unified and presented under the same umbrella.

Prerequisites

Familiarity with basic linear algebra concepts such as elementary matrix operations. Familiarity with basic calculus principles such as the maximum and minimum of a function. Ability to install Python and the relevant libraries on your laptop (though I will explain how to do it). Familiarity with basic file processing on a laptop or in an online folder.

Who is this course for

Anyone with some analytic background (engineer, analyst, data scientist, quant, statistician, software developer, teacher), preferably with at least one year of college education including a first course in calculus, and some exposure to a programming language (C, C++, Java, Python, PHP, Perl, R, SQL). Experience with manipulating datasets, even if only in Excel, will help.

The course is suited to busy professionals and students who want to learn quickly and get to the important points without wasting time on long, boring videos. It is also ideal for self-learners who need a solid “jump-start” for career acceleration and are interested in quickly working on real-life problems.

Supervised classification (top) and unsupervised clustering (bottom)

What you will get out of the course

Be able to complete machine learning projects from beginning to end, just like a professional working in the industry, for projects ranging from NLP, clustering, and regression to computer vision. Learn how to learn, and become independent enough to solve future problems on your own. Tasks performed during the training include writing Python code and using Python libraries, data cleaning and exploratory analysis, modeling and testing using cross-validation methods, implementing model-free techniques, feature and model selection, testing black-box systems using synthetic data, and state-of-the-art data animations (including data videos and sound) to present your results. Successful completion of four modules comes with a personal recommendation (endorsement) on LinkedIn.

Participants are also encouraged to seek advice regarding various career options. The instructor, born into a modest family, has first-hand experience with all of the following and is happy to help you: raising VC money, working for various startups and large companies across multiple industries, self-funding a business from creation to its sale to a publicly traded company, starting your own blog and turning it into a multi-million-dollar revenue stream, and creating a strong online presence (including GitHub, LinkedIn) with so many connections that you will never have to look for a job again (jobs will come to you).

New time series models

Modules

More modules will be added. Currently, the following are offered.

  • Python — Installing Python, running Python scripts, using libraries and understanding what they do. Writing your own code to solve new problems, using the most appropriate data structures. This from-scratch course is more than an introduction to Python: it is aimed at making you capable of quickly obtaining the right information to solve any problem you may face, and at introducing you to scientific computing. I also discuss automated data cleaning and exploratory data analysis, as well as using GitHub. For code samples, see here.
  • Supervised and Unsupervised Learning — Covers the core of machine learning, including classification, clustering, regression, structuring unstructured data, cross-validation, model-fitting, feature selection, and a simple ensemble method related to boosted trees. Nearest neighbor graphs and deep neural networks are discussed in the context of GPU machine learning: classifying data using image processing techniques, after turning tabular data into images. New, simple clustering and mode-finding algorithm with exact solution (comparison to K-means).
  • Generative Models, Explainable AI and Synthetic Data — Testing black-box systems, designing better ones, and generating and leveraging rich synthetic data to improve the robustness of predictions, minimize overfitting, and assess when an algorithm does well, or not. Useful to deal with wide data and fraud analysis. This module also covers bootstrapping, alternatives to R-squared, minimum contrast estimation and dual confidence regions.
  • Visualization and Data Animation Techniques — Producing high-quality visualizations in Python, including animated GIFs, data videos, and even soundtracks to present insights that are easy to grasp by non-experts. Topics include optimum palettes, leveraging color transparency, video processing in R and Python, visualizing high-dimensional data, scatterplots for high-dimensional data, and sound processing in Python.
  • Time Series — Including random walks, 2D Brownian motions with strong clustering structure, integrated Brownian motions, smooth and chaotic processes, parameter estimation for non-periodic time series, pseudo-random numbers and prime test of randomness, auto-regressive processes, special time series and an introduction to discrete dynamical systems. Special topics: long-range autocorrelations, optimization techniques in the presence of numerical instability using hybrid Monte-Carlo simulations and fixed-point algorithms.
  • Natural Language Processing — Enterprise-grade web crawling and text parsing techniques are used to create keyword taxonomies, with numerous practical applications. Besides solving original real-life problems, the goal is to structure unstructured data, and to develop distributed algorithms that can be resumed without data loss after computer crashes. Computational complexity and fast clustering of text data are discussed.
  • Statistical Foundations — New tests of independence, large selection of probability distributions, simple alternative to K-means clustering, alternative to logistic regression, model-free tests of hypothesis. Fundamental theorems with applications: central limit, Berry-Esseen, Kolmogorov-Smirnov, law of the iterated logarithm, Le Cam’s theorem. Model-free confidence intervals with very fast convergence based on sample size, with both core theoretical results and applications. This module features a unified and original approach to all regression problems and curve fitting, using constrained optimization and gradient methods.
  • LaTeX — You will learn how to produce modern, top-quality, well-structured documents in LaTeX, including books with glossary, index, bibliography, tables, figures, a smart use of colors, cross-references and external clickable links. See examples here. Indeed, we use the LaTeX sources of textbooks and articles presented as teaching material in the other modules as templates to start building great PDF documents. The course starts with installing MiKTeX on your laptop, or using the Overleaf online platform.
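
To give a flavor of the cross-validation work mentioned in the Supervised and Unsupervised Learning module, here is a minimal sketch. It is not taken from the course material: the function name `k_fold_mse`, the synthetic dataset, and the choice of ordinary least squares as the model are all assumptions made for illustration only.

```python
import numpy as np

def k_fold_mse(X, y, k=5, seed=0):
    """Average test MSE of ordinary least squares over k folds (illustrative helper)."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(y))      # shuffle observations once
    folds = np.array_split(indices, k)     # k roughly equal-sized folds
    errors = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit on the training folds, with an intercept column prepended
        A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        # Evaluate on the held-out fold only
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        residuals = y[test_idx] - A_test @ coef
        errors.append(np.mean(residuals ** 2))
    return float(np.mean(errors))

# Synthetic data (hypothetical): linear signal plus small Gaussian noise
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = 4.0 + X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

cv_mse = k_fold_mse(X, y, k=5)
print(f"5-fold cross-validated MSE: {cv_mse:.4f}")
```

Because the noise has standard deviation 0.1, the cross-validated error should be close to the noise variance of 0.01. The same loop works with any other model by swapping out the fitting and prediction steps.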

Testimonials

These testimonials pertain to the training material published by the author.

  • Jackson Andreas Pola — Hello Vincent, I find all the materials you shared on your website extremely useful. I will share this with my colleagues who started their journey in machine learning. Again thank you for being connected on LinkedIn.  Kind regards, Jackson
  • Mohammed Alshahrani — Thanks Vincent always your materials are supportive. Most of my students used to review your online materials. You might not know but frankly your impact is very noticeable specially for low-income University students.
  • Isabel Marín — Very interesting your last article “The sound that the data make”. Would you be interested, once I have introduced my students to the basics, in participating in one of the classes online? Showing them your work.
  • Milan McGraw — Thank you Vincent, I appreciate your operational excellence and resources. You are an invaluable resource to the community!

About the Instructor

Vincent Granville is a pioneering data scientist and machine learning expert, co-founder of Data Science Central (acquired by TechTarget in 2020), former VC-funded executive, author and patent owner. Vincent’s past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, CNET, and InfoSpace. Vincent is also a former post-doc at Cambridge University and the National Institute of Statistical Sciences (NISS).

Vincent has published in the Journal of Number Theory, the Journal of the Royal Statistical Society (Series B), and IEEE Transactions on Pattern Analysis and Machine Intelligence. He is also the author of multiple books, available here. He lives in Washington state, and enjoys doing research on stochastic processes, dynamical systems, experimental math and probabilistic number theory.
