10 steps on the road to Deep Learning (part 2)

Julien Simon
Jan 15, 2018 · 10 min read

In the first part, I told you about the first five steps you could take to get started with Deep Learning (DL):

1 — You can do it

2 — Ignore Math (for now)

3 — Ride the snake

4 — Walk before you run

5 — Pick a library (and don’t look back)

Let’s look at the five next ones.

The way to Deep Learning is NOT shut. Draw your sword!

6 — It’s code, so code!

DL is code. Most of us learned to code by reading someone else’s code and by figuring out what it does. This applies to DL too.

Image classification is a fun problem to start with. Reasonably-sized data sets such as Dogs vs. Cats, MNIST, CIFAR or German Traffic Signs are a good starting point.

As you probably know, there are many different neural network architectures. Initially, you should focus on fully connected networks (aka multi-layer perceptrons, aka MLPs). Yes, they’re basic, but they’re easy to understand. Early on, don’t worry about building the best model possible: focus on understanding the concepts (training set, validation set, loss function, optimization, hyperparameters) and the steps (data preparation, training loop, etc.) required to train and use a model.

So, grab some examples based on your favorite library, read them, figure out what each bit does, run them and by all means tweak them: add layers, resize layers, change hyperparameters (batch size, learning rate, number of epochs, etc.), predict with your own data, and so on. Building intuition about how a given change may affect the performance of a model is extremely important.
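If you want a concrete starting point, here’s a minimal sketch of a fully connected network in Keras, trained on MNIST. The layer sizes and hyperparameters are illustrative, which is exactly the point: change them and watch what happens.

```python
# A minimal fully connected network in Keras. Layer sizes and
# hyperparameters are illustrative: tweak them and observe the effect.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0
x_test = x_test.reshape(-1, 784).astype('float32') / 255.0
y_train, y_test = to_categorical(y_train), to_categorical(y_test)

model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=128, epochs=5, validation_split=0.1)
print(model.evaluate(x_test, y_test))  # [test loss, test accuracy]
```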

Once you’re comfortable with fully connected networks, I’d suggest looking at Convolutional Neural Networks (aka CNNs). Don’t let operations like convolution and pooling scare you away. Use the black box approach we discussed earlier.
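If you do want to peek inside the black box, a CNN is just a different stack of layers. Here’s a sketch in Keras for the same MNIST-style input; filter counts and kernel sizes are illustrative.

```python
# A small CNN in Keras (filter counts and kernel sizes are illustrative).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),   # downsample feature maps
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),              # back to a flat vector for classification
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```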

Later on, you’ll be ready to look at other architectures like LSTM, GAN, etc. They are pretty weird beasts, so please don’t start with these :)

7 — Welcome to the jungle

After a while, you’ll be ready to expand your horizon and start exploring additional techniques used by DL practitioners. Here are a few that I found very valuable to study in detail.

Reusing pre-trained models

Most libraries come with a model zoo, i.e. a set of models pre-trained on large data sets (such as ImageNet for image classification models). You can download the models and use them right away. No training needed!
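For example, with Keras you can pull a ResNet50 pre-trained on ImageNet in a couple of lines. A minimal sketch, assuming you have a local image file (‘elephant.jpg’ here is a hypothetical name):

```python
# Load a pre-trained ResNet50 and classify an image, no training needed.
import numpy as np
from tensorflow.keras.applications.resnet50 import (
    ResNet50, preprocess_input, decode_predictions)
from tensorflow.keras.preprocessing import image

model = ResNet50(weights='imagenet')  # downloads pre-trained weights

img = image.load_img('elephant.jpg', target_size=(224, 224))  # hypothetical file
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])  # top 3 ImageNet classes
```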

Fine-tuning pre-trained models

These models can be further trained on your own data. This is a very powerful technique that lets you reuse the initial training while specializing the model for your own purpose. All the cool kids do it, e.g. Expedia or Condé Nast.
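A common recipe, sketched here in Keras: keep the pre-trained convolutional base, freeze it, and train a new classification head on your own data. The class count and layer sizes below are illustrative.

```python
# Fine-tuning sketch: freeze a pre-trained base, train a new head.
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

base = ResNet50(weights='imagenet', include_top=False,
                input_shape=(224, 224, 3))
base.trainable = False  # reuse the ImageNet features as-is, at least at first

x = GlobalAveragePooling2D()(base.output)
x = Dense(256, activation='relu')(x)
outputs = Dense(5, activation='softmax')(x)  # 5 is an illustrative class count
model = Model(base.input, outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(your_images, your_labels, ...)  # train the head on your own data
# Optionally unfreeze the top layers of 'base' afterwards and re-train
# with a low learning rate.
```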

Increasing data set diversity with data augmentation

Data augmentation is another powerful technique, which creates additional samples from existing ones, e.g. by cropping, resizing or distorting images. This is especially useful if your initial data set is too small to train from scratch, or even for successful fine-tuning. DL libraries provide data augmentation APIs, and you can also use the standalone imgaug library.
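For instance, Keras’ ImageDataGenerator produces randomly transformed variants of your training images on the fly. The transformation ranges below are illustrative:

```python
# On-the-fly data augmentation with Keras (parameter values are illustrative).
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=15,       # random rotations up to 15 degrees
    width_shift_range=0.1,   # random horizontal shifts
    height_shift_range=0.1,  # random vertical shifts
    zoom_range=0.1,          # random zooms
    horizontal_flip=True,    # random left-right flips
)
# Feed the augmented stream straight into training:
# model.fit(datagen.flow(x_train, y_train, batch_size=32), epochs=10)
```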

Studying additional building blocks for neural networks

For instance, you could look at special layers like Dropout and Batch Normalization, both of which help networks learn and generalize better.
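Here’s how they slot into a network, sketched in Keras; the placement and rates are illustrative.

```python
# Adding Dropout and Batch Normalization to a fully connected network.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization

model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    BatchNormalization(),  # normalize activations to stabilize training
    Dropout(0.5),          # randomly drop 50% of units to fight overfitting
    Dense(64, activation='relu'),
    Dropout(0.25),
    Dense(10, activation='softmax'),
])
```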

You could also dive deeper into activation functions such as sigmoid, ReLU, LeakyReLU and softmax: try to understand how they differ and when you’d use one or the other. The same goes for loss functions and regularization.
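A quick way to build intuition is to implement them yourself and run a few values through them. A minimal numpy sketch:

```python
# Common activation functions, implemented from scratch with numpy.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))        # squashes values into (0, 1)

def relu(x):
    return np.maximum(0.0, x)              # zero for negative inputs

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)   # small slope for negatives

def softmax(x):
    e = np.exp(x - np.max(x))              # shift for numerical stability
    return e / e.sum()                     # outputs sum to 1 (probabilities)

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z), relu(z), leaky_relu(z), softmax(z))
```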

Last but not least, you should spend some time learning about optimization algorithms. Chances are you’ve used SGD, but there are many more: Adagrad, Adadelta, Adam, etc.
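In Keras, for example, swapping one optimizer for another is a one-line change, which makes them easy to compare. The settings below are illustrative:

```python
# Comparing optimizers in Keras: only the 'optimizer' argument changes.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD, Adagrad, Adadelta, Adam

for opt in [SGD(learning_rate=0.01, momentum=0.9),
            Adagrad(), Adadelta(), Adam()]:
    # Rebuild the model each time so every optimizer starts from scratch.
    model = Sequential([Dense(10, activation='softmax', input_shape=(784,))])
    model.compile(optimizer=opt, loss='categorical_crossentropy',
                  metrics=['accuracy'])
    # model.fit(...) with identical data, then compare the learning curves.
```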

And don’t forget: code or it didn’t happen. Try adding one of these Lego blocks to your toy networks, see what happens and try to figure it out.

At this point, Math will inevitably show up in pretty much every blog post or article you’re reading.

Some of you will decide not to go any further. Maybe you simply won’t be interested in spending the time and effort required to climb that wall.

Guess what: THAT’S OK! DON’T FEEL BAD! Even if you stop here, you already know A LOT — more than 99% of developers out there — and you’re now able to use Deep Learning in your applications. You can skip the next two steps and go directly to the Resources section.

Some of you will soldier on. Congratulations and good luck with your studies. Please remember to show patience and respect when less experienced developers ask for your help. There are more than enough snarky ML/DL specialists out there, we definitely don’t need more ;)

8 — Stop worrying and learn to love Math

Caveat emptor

I took a lot of Math classes during my studies (French engineers will know what ‘a lot’ means). I thought that I had forgotten everything over the years, but surprisingly it came back pretty fast. Depending on your own studies, you’ll either fly through this stuff, or sweat and curse all the way. Stay focused, move at your own pace and remember, YOU CAN DO IT!

Having said that, here’s a list of topics you’ll have to study if you want to start understanding how DL building blocks really work.

Statistics and probabilities

Since DL is able to extract features automatically from data, I felt that mastering statistics and probabilities was less important than for classical ML, where they play a major role in feature engineering.

Still, you can’t really ignore this stuff. The material below is pretty basic but it will certainly come in handy at some point.

Linear algebra

Having a good grasp of linear algebra (vectors, matrices, etc.) is mandatory for understanding how forward propagation, backpropagation and so on are implemented.
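To make this concrete: forward propagation through a fully connected layer is just a matrix multiplication plus a bias, followed by an activation. A minimal numpy sketch, with illustrative shapes:

```python
# Forward propagation through one fully connected layer, in pure numpy.
import numpy as np

batch_size, n_in, n_out = 4, 784, 128
X = np.random.randn(batch_size, n_in)    # a mini-batch of inputs
W = np.random.randn(n_in, n_out) * 0.01  # weight matrix
b = np.zeros(n_out)                      # bias vector

Z = X @ W + b            # linear transform: (4, 784) @ (784, 128) -> (4, 128)
A = np.maximum(0.0, Z)   # ReLU activation
print(A.shape)           # (4, 128)
```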

Derivatives

Derivatives are the foundation on which optimization algorithms such as SGD are built. Again, this is mandatory study if you want to understand them in detail.

If you’re really rusty (or learning this for the first time), start with this one.

Then, you should study this as well (the first 3 chapters should be enough). You need to know partial derivatives in order to grasp the cosmic beauty of backpropagation :)
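One concrete exercise: compute a derivative analytically with the chain rule, then check it against a numerical approximation. A minimal numpy sketch for the toy loss f(w) = (w*x - y)^2, with illustrative values:

```python
# Analytical vs numerical derivative for a tiny loss f(w) = (w*x - y)^2.
import numpy as np

x, y, w = 2.0, 1.0, 0.8

def loss(w):
    return (w * x - y) ** 2

analytical = 2 * (w * x - y) * x   # chain rule: df/dw
eps = 1e-6                         # central finite difference
numerical = (loss(w + eps) - loss(w - eps)) / (2 * eps)
print(analytical, numerical)       # the two values should match closely
```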

9 — Stare into the abyss

Once you’ve (re)sharpened your Math skills, you will be ready to figure out how these mysterious DL Lego blocks work. Black boxes no more!

Here are some selected topics you should now be able to crack:

  • Forward propagation and backpropagation,
  • Gradients,
  • SGD and other optimization algorithms,
  • Convolution and transposed convolution.

A great way to put your knowledge to the test is to stop using high-level APIs such as model.fit() or model.train(). Try switching to examples based on a custom training loop, where every step of the training process is explicitly defined. Once again, read, understand and hack away!
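As an illustration, here’s a fully explicit training loop for a linear model in pure numpy: forward pass, hand-computed gradient, SGD update. The data, learning rate and epoch count are all illustrative:

```python
# A fully explicit training loop: forward pass, gradients, SGD update.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                  # 100 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)    # noisy synthetic targets

w = np.zeros(3)
lr = 0.1                                       # learning rate
for epoch in range(50):
    y_hat = X @ w                              # forward propagation
    error = y_hat - y
    loss = (error ** 2).mean()                 # mean squared error
    grad = 2 * X.T @ error / len(y)            # gradient, computed by hand
    w -= lr * grad                             # SGD update
print(w)                                       # should be close to true_w
```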

Every now and then, you’ll stumble on a Math concept that you don’t know well. Just go and learn it. You’ve gone too far to let it stop you, right?

10 — Keep learning

There are a million resources out there, but here are a few that I found very valuable. When embarking on such a learning adventure, it definitely helps to rely on structured content. Blog posts and Stack Overflow will only take you so far :)

Books

Data Science from Scratch — Joel Grus

Exactly what the title says, and a great combination of the tools and techniques you’ll need along the way: Python essentials (probably not enough if you’ve never worked with it before), linear algebra, statistics, probability, plotting data, machine learning algorithms, etc.

The book doesn’t rely on any existing ML/DL framework. Pure Python all the way. Buy it with your eyes closed.

Machine Learning Mastery — Jason Brownlee

Jason runs machinelearningmastery.com, possibly my favorite ML/DL blog. Tons of well-explained, pragmatic examples on how to solve problems! He also authored multiple books based on his posts.

This one is a really good introduction to ML algorithms, but I’m sure they’re all equally good.

Gluon: the straight dope — Zachary Lipton

This is possibly the best online DL book you’ll come across.

Zachary is a Research Scientist at AWS, but I wouldn’t mention his work if it sucked ;)

This guide is a no-nonsense introduction to DL: algorithms, network architectures, etc. Zach starts by explaining general concepts and then shows you how to implement them with MXNet and Gluon. Even if you’ve decided to work with a different library, the conceptual material alone is extremely valuable.

Deep Learning — Ian Goodfellow, Yoshua Bengio and Aaron Courville

Don’t let the simple title fool you. This is as BRUTAL as it gets (and I mean that in a good way). The authors are world experts on Deep Learning and the book is math-heavy (quite an understatement). Being able to read it is a goal in itself. Awesome, but definitely NOT for beginners.

The book is available online, as well as in printed form (Amazon.com).

MOOC

Fast.ai — Jeremy Howard & Rachel Thomas

Jeremy and Rachel have built two free 7-week courses, which are equally brilliant. You’ll be coding from day 1 (mostly in Keras), which is definitely in line with what we’ve been discussing so far: less focus on Math, more focus on code. Production quality is a bit rough at times, but I cannot recommend these courses enough.

Machine Learning and Deep Learning on Coursera — Andrew Ng

Andrew is one of the leading experts in Deep Learning. He authored two reference courses on Coursera, one on Machine Learning and one on Deep Learning.

The Machine Learning course will take you through all major Machine Learning algorithms. Unlike the fast.ai course, the focus here is on algorithms and there’s definitely more Math involved than you’ll probably like :) There are some programming assignments based on GNU Octave. I understand why you’d pick Octave over Matlab, but IMHO this course is screaming for a new revision based on Python. Having said that, it’s extremely interesting!

The Deep Learning course was launched a few months ago. It will take you through neural networks, all major network architectures, etc. Here too, the focus is on understanding algorithms, although the programming assignments are based on Python, Keras and TensorFlow. A great quality course, led by a brilliant instructor.

Blogs

There’s a zillion of them, but here are a few that are definitely worth your time.

Conclusion

It’s a long road for sure, and it will probably take you many months of hard work to get through all of this. If you have focus and dedication, you will make it and acquire these skills. Good luck on your journey to Deep Learning!

As always, thank you for reading. Please reach out on Twitter if you have questions.

If you liked this post, why not share it on Facebook, Twitter or LinkedIn for more people to enjoy? Thank you :)

The soundtrack to this post was live Judas Priest. Hail the Metal Gods.
