Deep Learning – Course Introduction

Hi everyone. This is the introduction to the course on deep learning, which will be offered through NPTEL. Over the past decade or so, deep learning has become very prevalent and finds applications in a wide range of areas such as speech, computer vision, and natural language processing; most of the state-of-the-art systems in these areas, including those from companies like Google, Facebook, and so on, use deep learning as the underlying solution.

So in this course we will learn some of the foundational or fundamental blocks of deep learning. In particular, we will start right from the basics, with a single neuron, the perceptron or the sigmoid neuron, and from there go to a multilayered network of neurons, commonly known as a multilayer perceptron. We will look at algorithms for training such networks; the specific algorithm we will study is backpropagation, which uses gradient descent. We will then look at several applications of feedforward neural networks, such as autoencoders, word2vec, and so on.
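To make that starting point concrete, here is a minimal sketch, not taken from the lecture, of a single sigmoid neuron trained with gradient descent on a made-up toy problem; the data, learning rate, and number of steps are arbitrary illustrative choices, and backpropagation simply extends this same chain-rule gradient computation to networks with many layers of such neurons.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: 4 examples with 2 features each and binary labels (an AND-like target).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0])

rng = np.random.default_rng(0)
w = rng.normal(size=2)                     # weights of the single neuron
b = 0.0                                    # bias
lr = 1.0                                   # learning rate

for step in range(5000):
    y_hat = sigmoid(X @ w + b)             # forward pass
    grad_z = (y_hat - y) / len(y)          # d(cross-entropy)/d(pre-activation)
    w -= lr * (X.T @ grad_z)               # gradient-descent update for the weights
    b -= lr * grad_z.sum()                 # gradient-descent update for the bias

print(np.round(sigmoid(X @ w + b), 2))     # predictions move toward [0, 0, 0, 1]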
Then we will move on to the next type of neural network, the recurrent neural network, which finds applications in areas where you have to deal with sequences. Sequences are omnipresent: a sentence in natural language text can be thought of as a sequence of words, and words themselves can be thought of as sequences of characters. Such sequences also occur in other areas, such as speech, where you have a sequence of phonemes, and video, which can be treated as a sequence of images. So how do you deal with such sequential data when you want to do various things on top of it? You might want to do classification, or sequence prediction: for example, given a sentence in a source language, you might want to predict the equivalent sequence in a target language. All these applications require something known as recurrent neural networks. We will look at those, and at the algorithm for training recurrent neural networks, which is again backpropagation, but with a twist to it, known as backpropagation through time. We will look at the math behind that, at some of the challenges in training recurrent neural networks, and, to overcome these challenges, at other types of RNNs, namely LSTMs and gated recurrent units.
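As a rough illustration of the recurrence involved, here is a minimal sketch, with made-up dimensions and random placeholder weights, of a vanilla RNN cell unrolled over a toy sequence; backpropagation through time is ordinary backpropagation applied to this unrolled computation.

import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 3, 4, 5

W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input-to-hidden weights
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden-to-hidden weights
b_h = np.zeros(hidden_dim)

xs = rng.normal(size=(seq_len, input_dim))    # a toy input sequence
h = np.zeros(hidden_dim)                      # initial hidden state

states = []
for x_t in xs:                                # the same weights are reused at every step
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)  # h_t = tanh(W x_t + U h_{t-1} + b)
    states.append(h)

print(np.array(states).shape)                 # (5, 4): one hidden state per time step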
The third type of neural network that we will look at is the convolutional neural network, which largely finds application in the vision domain. When you have an image, how do you come up with a good representation for it and then do various tasks on top of that, such as classification, object detection, segmentation, and so on? The underlying block here, prevalent in almost all computer vision or image processing applications, is the convolutional neural network, which uses the convolution operation to come up with an abstract representation of an image, and not just one representation but a deep hierarchy of increasingly abstract representations. Right, so we will look at what a convolutional neural network is, how it differs from a feedforward neural network, and so on.
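For a concrete picture of the convolution operation mentioned above, here is a minimal sketch, with a made-up image and filter, that slides a small filter over the image to produce a feature map; stacking many such filtering layers, together with nonlinearities and pooling, is what yields the deep, hierarchical representations.

import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image (no padding, stride 1) and return the feature map.
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 "image"
kernel = np.array([[1.0, 0.0, -1.0],               # a simple vertical-edge-like filter
                   [1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0]])

print(conv2d(image, kernel).shape)                 # (4, 4) feature map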
Once we are done with these three fundamental blocks, that is, feedforward neural networks, recurrent neural networks, and convolutional neural networks, we will put them all together and look at something known as encoder-decoder models. These models take some kind of input, say an image, speech, or text, encode it into a representation, and decode some output from that representation. The output could either be a classification output or a sequence in itself: for example, you could encode an image and then generate a caption for it, or encode a video and then try to generate a caption for the video, and so on. These encoder-decoder models use a combination of the fundamental blocks, RNNs, CNNs, and feedforward neural networks, and combine them in interesting ways to apply them to various downstream tasks such as image captioning, machine translation, document summarization, and so on.
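Just to show the data flow, here is a minimal, purely illustrative encoder-decoder sketch in which a recurrent encoder compresses an input sequence into a single vector and a recurrent decoder unrolls that vector into an output sequence; the random weights and dimensions are placeholders, whereas real models learn these parameters and build the encoder and decoder out of the blocks described above.

import numpy as np

rng = np.random.default_rng(0)
in_dim, hid_dim, out_dim, in_len, out_len = 3, 4, 2, 5, 3

W_enc = rng.normal(scale=0.1, size=(hid_dim, in_dim + hid_dim))  # encoder weights
W_dec = rng.normal(scale=0.1, size=(hid_dim, hid_dim))           # decoder weights
W_out = rng.normal(scale=0.1, size=(out_dim, hid_dim))           # output projection

def encode(xs):
    h = np.zeros(hid_dim)
    for x_t in xs:                                    # recurrent encoder
        h = np.tanh(W_enc @ np.concatenate([x_t, h]))
    return h                                          # fixed-size representation of the input

def decode(h, steps):
    outputs = []
    for _ in range(steps):                            # recurrent decoder
        h = np.tanh(W_dec @ h)
        outputs.append(W_out @ h)                     # e.g. scores over an output vocabulary
    return np.array(outputs)

xs = rng.normal(size=(in_len, in_dim))                # toy input sequence
print(decode(encode(xs), out_len).shape)              # (3, 2): one output vector per decoded step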
Another critical component of these models is something known as the attention network, which learns to pay attention to the important parts of the input. For example, suppose you are trying to write a caption for an image showing a boy throwing a Frisbee in a park. Among all the background, the main components of the image are just the boy, the Frisbee, and the green grass that indicates the park; everything else in the picture gets summarized into these three main objects. A model that can generate a good caption for this image should learn to pay attention to these critical components of the input. And this is not restricted to images; it also happens, for instance, in document classification. If you want to find out whether a document talks about politics, sports, or finance, there will be some important words in the document that you need to focus on, words that reveal the type, or class, of the document. For such tasks too, it is very important to find the important words in the input and pay attention to them, and this is done by something known as the attention mechanism. We will look at what the attention mechanism is and how to integrate it with encoder-decoder models.
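As one small example of the idea, here is a sketch of dot-product attention, one of several scoring functions used in practice, with arbitrary shapes and random values: given a query, say the current decoder state, and a set of encoder states, it computes weights that sum to one and takes a weighted average of the states.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
enc_states = rng.normal(size=(5, 4))   # 5 encoder states, each of dimension 4
query = rng.normal(size=4)             # current decoder state

scores = enc_states @ query            # one relevance score per input position
weights = softmax(scores)              # attention weights, non-negative and summing to 1
context = weights @ enc_states         # weighted summary of the input, fed to the decoder

print(np.round(weights, 2), context.shape)   # shows which input positions receive attention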
So that is the main part of the course, and we will be structuring it into 30 hours, or 12 weeks, of teaching. Apart from that, beyond these basic models, we also have an extended version of the course, where we will talk about deep generative models, that is, the use of neural networks for learning probability distributions. The four main paradigms that we will look at here are restricted Boltzmann machines, variational autoencoders, autoregressive models, and generative adversarial networks. We will look at some of the theory behind these, how they all connect together, what their relative advantages and disadvantages are, and the taxonomy under which all these different models fall. That extended version may not be a part of the main syllabus; the main syllabus will contain only feedforward neural networks, RNNs, CNNs, and sequence-to-sequence models with the attention mechanism, that is, encoder-decoder models with attention. So that is the introduction to the course. I hope you enroll for it and enjoy the course. Thank you.
