Convolutional Neural Networks (ConvNets) have become extremely popular in the modern research and development community as a way to ease a wide range of applications. From image processing to pattern recognition and classification, ConvNets have been experimented with in a variety of domains and have produced excellent results. A ConvNet is essentially a modified multi-layer perceptron designed to require far less processing than comparable networks. Being a deep learning model, its architecture contains a large number of hidden layers between the input layer and the output layer to achieve strong optimization. These hidden layers are typically combinations of the following four popular layer types: Convolutional, Rectified Linear Unit (ReLU), Pooling and Fully Connected. Without going too deeply into the mathematics, the above-mentioned layers are explained in layman's terms for simplicity and better understanding. Convolutional layers are the core layers and perform most of the computational work. They slide masks (filters) of various dimensions over the input image, where each mask responds to a specific feature, thereby detecting small patterns in the image and mapping to them. Each filter produces its own convolved image; thus, a single input image is split into 'n' filtered images stacked together, where each layer is connected to the next. Stacking these numerous filtered images gives us the convolutional layer.
The ReLU layer performs a normalization step: it introduces non-linearity into the ConvNet, since most real-life information, i.e. the data on which the network trains, is non-linear. The ReLU operation is performed at the pixel level, transforming each negative value to zero. Other non-linear functions such as the hyperbolic tangent (tanh) or the sigmoid function could also be used, but ReLU has been found to produce better results in most cases. Next comes the Pooling Layer, which shrinks the image to a more essential form. Max-Pooling is one of the most popular layers used.
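Max-pooling over non-overlapping 2×2 blocks, for example, can be sketched in NumPy as follows (a simplified illustration, not tied to any particular framework):

```python
import numpy as np

def max_pool2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block."""
    h, w = feature_map.shape
    # Trim odd edges, group pixels into 2x2 blocks, then take the
    # block-wise maximum (halving each spatial dimension).
    blocks = feature_map[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

fm = np.array([[1., 3., 2., 0.],
               [4., 2., 1., 5.],
               [0., 1., 3., 2.],
               [2., 6., 0., 1.]])
print(max_pool2x2(fm))
# [[4. 5.]
#  [6. 3.]]
```

Swapping `max` for `mean` in the same sketch would give average pooling instead.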
It takes the maximum, i.e. the most prominent feature, from each block of neurons in the previous layer. On the same note, Average Pooling takes the average value from each cluster of neurons in the previous layer. The above three layers are cascaded after each other repetitively, as the application requires, which is also known as Deep Stacking. The Fully Connected Layer connects each and every neuron of one layer with every neuron in the next layer; it is very similar to a multilayer perceptron neural network. The deep stacked layers contain high-level features of the input image, and the Fully Connected Layer makes use of these features to classify the image into different classes based on our training dataset. The output probabilities of this layer sum to 1, which is achieved by the Softmax Activation Function: it takes a vector of real-valued scores and compresses it into a vector of values between zero (the lowest) and one (the highest) such that these values add up to one.
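For illustration, the softmax function described above can be written as follows (a generic NumPy sketch; the example scores are arbitrary):

```python
import numpy as np

def softmax(scores):
    """Compress real-valued scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)   # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs)        # roughly [0.659 0.242 0.099]
print(probs.sum())  # sums to 1
```

The largest score receives the largest probability, which is why the fully connected layer's class with the highest softmax output is taken as the predicted class.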