Duration
21 hours (usually 3 days including breaks)
Requirements
Engineering level
Audience: engineers and data scientists wishing to learn about neural networks and Deep Learning
Overview
Artificial intelligence has revolutionized many economic sectors (industry, medicine, communication, etc.) after transforming numerous scientific fields. Nevertheless, its portrayal in the mainstream media is often a fantasy, far removed from what the fields of Machine Learning and Deep Learning really are. The aim of this course is to give engineers who already have a solid command of computer tools (including a base in software programming) an introduction to Deep Learning, to its various fields of specialization, and therefore to the main network architectures in use today. Although the mathematical foundations are reviewed during the course, a mathematics level of roughly two years of post-secondary study (French BAC+2) is recommended for greater comfort. It is entirely possible to skip the mathematical axis and keep only a “systems” view, but this approach will greatly limit your understanding of the subject.
Course Outline
The course is divided into three separate days, the third being optional.
Day 1 – Machine Learning & Deep Learning: theoretical concepts
1. Introduction to AI, Machine Learning & Deep Learning
– History, basic concepts and common applications of artificial intelligence, far removed from the fantasies surrounding the field
– Collective Intelligence: aggregating knowledge shared by many virtual agents
– Genetic algorithms: evolving a population of virtual agents by selection
– Classical Machine Learning: definition.
– Types of tasks: supervised learning, unsupervised learning, reinforcement learning
– Types of actions: classification, regression, clustering, density estimation, dimensionality reduction
– Examples of Machine Learning algorithms: Linear regression, Naive Bayes, Random Tree
– Machine Learning vs. Deep Learning: problems on which Machine Learning remains the state of the art today (Random Forests & XGBoost); a minimal sketch of such a baseline follows this list
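To make the comparison concrete, here is a minimal sketch of one of the classical baselines cited above, a Random Forest in scikit-learn; the toy dataset and hyper-parameters are illustrative choices, not course material:

    # Minimal sketch: a classical ML baseline (Random Forest) on a toy dataset.
    # The dataset and hyper-parameters below are illustrative assumptions.
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train, y_train)                  # supervised classification
    print("test accuracy:", clf.score(X_test, y_test))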
2. Basic Concepts of a Neural Network (Application: multi-layer perceptron)
– Review of the mathematical foundations.
– Definition of a neural network: classical architecture, activations and weighting of previous activations, depth of a network
– Training a neural network: cost functions, back-propagation, stochastic gradient descent, maximum likelihood (a minimal training sketch follows this list).
– Modeling a neural network: shaping input and output data according to the type of problem (regression, classification, …). Curse of dimensionality. Distinction between multi-feature data and signals. Choice of a cost function according to the data.
– Approximating a function with a neural network: presentation and examples
– Approximating a distribution with a neural network: presentation and examples
– Data Augmentation: how to balance a dataset
– Generalization of a neural network’s results.
– Initialization and regularization of a neural network: L1/L2 regularization, batch normalization, etc.
– Optimization and convergence algorithms.
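As a preview of these concepts, here is a minimal sketch of a multi-layer perceptron trained by back-propagation and stochastic gradient descent, written with PyTorch (layer sizes and the dummy data are illustrative assumptions):

    # Minimal sketch: an MLP trained by back-propagation + SGD (PyTorch).
    # Layer sizes and the dummy batch are illustrative assumptions.
    import torch
    import torch.nn as nn

    model = nn.Sequential(              # classical architecture: stacked layers
        nn.Linear(10, 32), nn.ReLU(),   # activation of weighted previous activations
        nn.Linear(32, 2),               # output layer for a 2-class problem
    )
    loss_fn = nn.CrossEntropyLoss()     # cost function matched to classification
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    X = torch.randn(64, 10)             # dummy batch of multi-feature data
    y = torch.randint(0, 2, (64,))      # dummy class labels

    for step in range(100):
        opt.zero_grad()
        loss = loss_fn(model(X), y)     # evaluate the cost
        loss.backward()                 # back-propagation of gradients
        opt.step()                      # one stochastic gradient descent update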
3. Standard ML / DL Tools
Each tool is presented briefly, with its advantages, disadvantages, position in the ecosystem, and typical uses.
– Data management tools: Apache Spark, Apache Hadoop
– Machine Learning tools: NumPy, SciPy, scikit-learn
– High-level DL frameworks: PyTorch, Keras, Lasagne
– Low-level DL frameworks: Theano, Torch, Caffe, TensorFlow
Day 2 – Convolutional and Recurrent Networks
4. Convolutional Neural Networks (CNN).
– Presentation of CNNs: fundamental principles and applications
– Basic operation of a CNN: convolutional layer, use of a kernel, padding & stride, feature map generation, pooling layers; 1D, 2D and 3D extensions (see the sketch after this list).
– Presentation of the CNN architectures that brought the state of the art in image classification: LeNet, VGG networks, Network in Network, Inception, ResNet. Presentation of the innovations introduced by each architecture and their broader applications (1×1 convolutions, residual connections)
– Use of an attention model.
– Application to a common classification case (text or image)
– CNNs for generation: super-resolution, pixel-to-pixel segmentation. Presentation of the main strategies for upscaling feature maps in image generation.
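A minimal sketch of the basic CNN operations listed above, in PyTorch (channel counts and kernel sizes are illustrative assumptions):

    # Minimal sketch: kernel, padding & stride, feature maps, pooling (PyTorch).
    import torch
    import torch.nn as nn

    x = torch.randn(1, 3, 32, 32)           # one 2D RGB image (batch of 1)
    conv = nn.Conv2d(3, 16, kernel_size=3,  # 3x3 kernel, 16 output channels
                     stride=1, padding=1)   # padding & stride control output size
    pool = nn.MaxPool2d(kernel_size=2)      # pooling layer halves spatial size

    fmap = conv(x)                          # feature maps: (1, 16, 32, 32)
    out = pool(fmap)                        # after pooling: (1, 16, 16, 16)
    print(fmap.shape, out.shape)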
5. Recurrent Neural Networks (RNN).
– Presentation of RNNs: fundamental principles and applications.
– Basic operation of an RNN: hidden activations, back-propagation through time, unfolded version (see the sketch after this list).
– Evolution towards Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTM) networks. Presentation of the different states and the improvements brought by these architectures
– Convergence and vanishing gradient problems
– Classical architectures: time-series prediction, classification, etc.
– Encoder-decoder RNN architectures. Use of an attention model.
– NLP applications: word / character encoding, translation.
– Video applications: predicting the next frame of a video sequence.
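A minimal sketch of a recurrent network in PyTorch, here an LSTM whose last hidden activation is used for one-step time-series prediction (all sizes are illustrative assumptions):

    # Minimal sketch: LSTM hidden activations for time-series prediction (PyTorch).
    import torch
    import torch.nn as nn

    seq = torch.randn(8, 20, 5)     # 8 sequences, 20 time steps, 5 features each
    lstm = nn.LSTM(input_size=5, hidden_size=16, batch_first=True)
    head = nn.Linear(16, 1)         # predict the next value of the series

    outputs, (h_n, c_n) = lstm(seq) # hidden activations at every time step
    pred = head(outputs[:, -1])     # prediction from the last hidden state
    print(pred.shape)               # (8, 1)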
Day 3 – Generative Models and Reinforcement Learning
6. Generative models: Variational AutoEncoder (VAE) and Generative Adversarial Networks (GAN).
– Presentation of generative models; link with the CNNs seen on Day 2
– Auto-encoder: dimensionality reduction and limited generation
– Variational auto-encoder: generative model and approximation of a data distribution. Definition and use of the latent space. Reparameterization trick (see the sketch after this list). Applications and observed limits
– Generative Adversarial Networks: fundamentals. Dual-network architecture (generator and discriminator) with alternating training; available cost functions.
– Convergence of a GAN and difficulties encountered.
– Improving convergence: Wasserstein GAN, BEGAN. Earth Mover’s Distance.
– Applications: generation of images or photographs, text generation, super-resolution.
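A minimal sketch of the reparameterization trick at the heart of the VAE: sampling z = mu + sigma * eps keeps the sampling step differentiable. The encoder that would produce mu and log_var is assumed, and the sizes are illustrative:

    # Minimal sketch: the VAE reparameterization trick (PyTorch).
    # mu and log_var would normally come from an encoder network (assumed here).
    import torch

    mu = torch.zeros(8, 2, requires_grad=True)       # latent means (dim 2)
    log_var = torch.zeros(8, 2, requires_grad=True)  # latent log-variances

    eps = torch.randn_like(mu)               # noise drawn outside the graph
    z = mu + torch.exp(0.5 * log_var) * eps  # latent sample; gradients flow to mu, log_var
    z.sum().backward()                       # back-propagation through sampling works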
7. Deep Reinforcement Learning.
– Presentation of reinforcement learning: controlling an agent in an environment defined by a state and possible actions
– Use of a neural network to approximate the state function
– Deep Q-Learning: experience replay, and application to the control of a video game (see the sketch after this list).
– Optimization of the learning policy. On-policy vs. off-policy. Actor-critic architecture. A3C.
– Applications: control of a single video game or a digital system.
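A minimal sketch of the experience-replay mechanism used in Deep Q-Learning: transitions are stored in a buffer and sampled in random mini-batches to break temporal correlations. The environment and the Q-network are assumed to exist elsewhere:

    # Minimal sketch: an experience replay buffer for Deep Q-Learning.
    # States are assumed to be tensors; environment and Q-network not shown.
    import random
    from collections import deque

    import torch

    replay = deque(maxlen=10_000)       # fixed-size buffer of past transitions

    def store(state, action, reward, next_state, done):
        replay.append((state, action, reward, next_state, done))

    def sample_batch(batch_size=32):
        batch = random.sample(replay, batch_size)   # random, decorrelated batch
        states, actions, rewards, next_states, dones = zip(*batch)
        return (torch.stack(states), torch.tensor(actions),
                torch.tensor(rewards), torch.stack(next_states),
                torch.tensor(dones))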