What this book covers

Chapter 1, Machine Learning Model Fundamentals, explains the most important theoretical concepts regarding machine learning models, including bias, variance, overfitting, underfitting, data normalization, and cost functions. It can be skipped by those readers with a strong knowledge of these concepts.

Chapter 2, Introduction to Semi-Supervised Learning, introduces the reader to the main elements of semi-supervised learning, focusing on inductive and transductive learning algorithms.

Chapter 3, Graph-Based Semi-Supervised Learning, continues the exploration of semi-supervised learning algorithms belonging to the families of graph-based and manifold learning models. Label propagation and non-linear dimensionality reduction are analyzed in different contexts, providing some effective solutions that can be immediately exploited using Scikit-Learn functionalities.

Chapter 4, Bayesian Networks and Hidden Markov Models, introduces the concepts of probabilistic modeling using direct acyclic graphs, Markov chains, and sequential processes.

Chapter 5, EM Algorithm and Applications, explains the generic structure of the Expectation-Maximization (EM) algorithm. We discuss some common applications, such as Gaussian mixture, Principal Component Analysis, Factor Analysis, and Independent Component Analysis. This chapter requires deep mathematical knowledge; however, the reader can skip the proofs and concentrate on the final results.

Chapter 6, Hebbian Learning and Self-Organizing Maps, introduces Hebb's rule, which is one of the oldest neuro-scientific concepts and whose applications are incredibly powerful. The chapter explains how a single neuron works and presents two complex models (Sanger network and Rubner-Tavan network) that can perform a Principal Component Analysis without the input covariance matrix.

Chapter 7, Clustering Algorithms, introduces some common and important unsupervised algorithms, such as k-Nearest Neighbors (based on KD Trees and Ball Trees), K-means (with K-means++ initialization), fuzzy C-means, and spectral clustering. Some important metrics (such as Silhouette score/plots) are also analyzed.

Chapter 8, Ensemble Learning, explains the main concepts of ensemble learning (bagging, boosting, and stacking), focusing on Random Forests, AdaBoost (with its variants), Gradient Boosting, and Voting Classifiers.

Chapter 9, Neural Networks for Machine Learning, introduces the concepts of neural computation, starting with the behavior of a perceptron and continuing the analysis of multi-layer perceptron, activation functions, back-propagation, stochastic gradient descent (and the most important optimization algorithm), regularization, dropout, and batch normalization.

Chapter 10, Advanced Neural Models, continues the explanation of the most important deep learning methods focusing on convolutional networks, recurrent networks, LSTM, and GRU.

Chapter 11, Autoencoders, explains the main concepts of an autoencoder, discussing its application in dimensionality reduction, denoising, and data generation (variational autoencoders).

Chapter 12, Generative Adversarial Networks, explains the concept of adversarial training. We focus on Deep Convolutional GANs and Wasserstein GANs. Both techniques are extremely powerful generative models that can learn the structure of an input data distribution and generate brand new samples without any additional information.

Chapter 13, Deep Belief Networks, introduces the concepts of Markov random fields, Restricted Boltzmann Machines, and Deep Belief Networks. These models can be employed both in supervised and unsupervised scenarios with excellent performance.

Chapter 14, Introduction to Reinforcement Learning, explains the main concepts of Reinforcement Learning (agent, policy, environment, reward, and value) and applies them to introduce policy and value iteration algorithms and Temporal-Difference Learning (TD(0)). The examples are based on a custom checkerboard environment.

Chapter 15, Advanced Policy Estimation Algorithms, extends the concepts defined in the previous chapter, discussing the TD(λ) algorithm, TD(0) Actor-Critic, SARSA, and Q-Learning. A basic example of Deep Q-Learning is also presented to allow the reader to immediately apply these concepts to more complex environments.