1. Intro to Deep Learning

Here you will learn about:

  1. Perceptron

  2. Neural networks

  3. Training neural networks

    • Optimization

    • Backpropagation

Motivation: hand engineered features are time and energy consuming and not scalable in practice. Can we learn the underlying features directly from data?


Why now? Big data, hardware (GPUs), and software (PyTorch)

  1. Perceptron

Idea: a perceptron takes input data, computes a weighted sum, and outputs a prediction through an activation function.



Why activation function?

To introduce non-linearity, so the network can learn more complex dependencies.
Recall: a linear combination of linear combinations is still just a linear combination :(
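A minimal sketch of a single perceptron with a sigmoid activation (the weights, bias, and input values below are made up for illustration):

```python
import numpy as np

def perceptron(x, w, b):
    """Weighted sum of inputs plus bias, passed through a non-linear activation."""
    z = np.dot(w, x) + b          # linear combination of inputs
    return 1 / (1 + np.exp(-z))   # sigmoid activation introduces non-linearity

x = np.array([1.0, 2.0])   # input data
w = np.array([0.5, -0.5])  # weights
b = 0.1                    # bias
y = perceptron(x, w, b)    # a scalar prediction in (0, 1)
```

Without the sigmoid, stacking such units would only ever produce linear functions of the input.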

2. Neural Networks

Starting from a single perceptron, we can build a version with two outputs:

Next, we can add more intermediate (hidden) neurons between the inputs and the outputs.
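The two ideas above can be sketched together as a tiny two-layer network: a hidden layer of intermediate neurons feeding two output neurons (layer sizes and random weights here are arbitrary choices for illustration):

```python
import numpy as np

def relu(z):
    # A common choice of non-linear activation for hidden neurons
    return np.maximum(0, z)

def mlp_forward(x, W1, b1, W2, b2):
    h = relu(W1 @ x + b1)   # hidden layer: intermediate neurons
    return W2 @ h + b2      # output layer: two outputs

rng = np.random.default_rng(0)
x = rng.normal(size=3)                         # 3 input features
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # 4 hidden neurons
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)  # 2 outputs
y = mlp_forward(x, W1, b1, W2, b2)             # prediction vector of shape (2,)
```

Adding more hidden neurons (or more hidden layers) increases the family of functions the network can represent.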

3. Training Neural Networks

The goal of training: find the optimal weights

  1. Define the optimization problem: find the argmin of the loss function

  2. Gradient descent

  3. Computing the gradient: chain rule and backpropagation

  4. Loss landscape and learning rate

  5. Stochastic gradient descent aka mini-batches

  6. Overfitting

  7. Dropout and early stopping
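Steps 1-5 above can be sketched end to end on a toy problem: mini-batch stochastic gradient descent on a linear model with a mean-squared-error loss, where the gradient follows from the chain rule. The data, learning rate, and batch size below are arbitrary illustrative choices:

```python
import numpy as np

# Synthetic noiseless data: targets are an exact linear function of the inputs
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

w = np.zeros(3)   # weights to learn
lr = 0.1          # learning rate: too large diverges, too small crawls
for epoch in range(100):
    perm = rng.permutation(256)
    for i in range(0, 256, 32):                    # mini-batches => stochastic gradient descent
        Xb, yb = X[perm[i:i+32]], y[perm[i:i+32]]
        pred = Xb @ w                              # forward pass
        grad = 2 * Xb.T @ (pred - yb) / len(yb)    # gradient of MSE loss via the chain rule
        w -= lr * grad                             # gradient descent step

loss = np.mean((X @ w - y) ** 2)
```

For deep networks the gradient is not written out by hand like this; backpropagation applies the same chain rule layer by layer, and frameworks like PyTorch automate it. Dropout and early stopping (step 7) would then be layered on top to combat overfitting.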