Understanding Neural Network Foundations: Perceptron, ADALINE and MLP
The perceptron and ADALINE are often considered prototypes of deep learning; understanding them is important for understanding the architecture of neural networks.
Perceptron
The perceptron (a binary classifier) is an algorithm that classifies data into two categories by determining which class an input belongs to. Unlike regression models, which focus on prediction and analysis, the perceptron specializes in classifying inputs into one of two classes, represented as $\{0, 1\}$.
Formulation and Learning Rule
The learning rule does not use partial derivatives; instead, it relies on the data being linearly separable.
Initialize the weights to $0$ or small random values.
Calculate the output for each training sample $j$: $y_j = f(\mathbf{w} \cdot \mathbf{x}_j)$, where $f$ is the Heaviside step function.
Update the weights: $w_i \leftarrow w_i + \eta\,(d_j - y_j)\,x_{j,i}$, where $d_j$ is the target label and $\eta$ is the learning rate. A minimal code sketch of these steps follows.
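As a concrete illustration, here is a minimal NumPy sketch of the three steps above. The AND-gate dataset, learning rate, and epoch count are assumptions chosen purely for demonstration:

```python
import numpy as np

# Minimal sketch of the perceptron learning rule.
# AND-gate data, eta, and epoch count are assumed for illustration.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # input samples x_j
d = np.array([0, 0, 0, 1])                      # target labels d_j in {0, 1}

w = np.zeros(X.shape[1])  # step 1: initialize weights to 0
b = 0.0                   # bias term
eta = 0.1                 # learning rate

for epoch in range(10):
    for x_j, d_j in zip(X, d):
        y_j = 1 if np.dot(w, x_j) + b > 0 else 0  # step 2: Heaviside-step output
        w = w + eta * (d_j - y_j) * x_j           # step 3: weight update
        b = b + eta * (d_j - y_j)

print(w, b)  # a separating hyperplane for the AND data
```

Because the AND data is linearly separable, the updates stop changing the weights after a few epochs, which is exactly the convergence guarantee the perceptron offers.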
Comparing Perceptron and Logistic Regression
The perceptron is often confused with logistic regression. Although the two share similarities, they are distinct models.
The perceptron outputs a hard class label based on a threshold, while logistic regression outputs a probability using the sigmoid function (see the code sketch below).
Logistic regression is a probabilistic model, while the perceptron is deterministic.
The perceptron is trained with its own error-driven update rule (closely related to the hinge loss) rather than gradient descent on a cross-entropy loss, which is how logistic regression is trained.
The perceptron's learning algorithm converges only when the data is linearly separable, while logistic regression's convex loss can always be optimized.
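The contrast between a hard label and a probability can be made concrete in a few lines of code. In this sketch, the weights and input are hypothetical values chosen only for illustration:

```python
import numpy as np

def sigmoid(z):
    # logistic function: squashes any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# hypothetical pre-trained weights and input, chosen only for illustration
w, b = np.array([2.0, -1.0]), 0.5
x = np.array([1.0, 3.0])
z = np.dot(w, x) + b  # the same linear response feeds both models

perceptron_label = 1 if z > 0 else 0    # perceptron: hard label from a threshold
logistic_prob = sigmoid(z)              # logistic regression: probability of class 1
print(perceptron_label, logistic_prob)  # 0 vs. ~0.38
```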
ADALINE
ADALINE (Adaptive Linear Neuron, or Adaptive Linear Element) is an enhanced version of the perceptron: it classifies data using a layer of parallel perceptron-like units, and this structure serves as a prototype for artificial neural networks. Its key enhancement is that the weights are updated from the linear output before thresholding, rather than from the thresholded class label.
Formulation and Learning Rule
Like a node in a multi-layer neural network, it takes multiple inputs and produces a single output, $y = \sum_{i=1}^{n} w_i x_i + \theta$, where $\theta$ is a constant bias term.
The least mean square error is calculated as $E = (d - y)^2$.
Update the weights: $w_i \leftarrow w_i + \eta\,(d - y)\,x_i$.
$\eta$ represents the learning rate.
$d$ represents the target output value. A sketch of this update loop follows below.
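Putting the rule together, here is a minimal sketch of an ADALINE trained with this least-mean-squares (Widrow-Hoff) update. The AND-gate data with targets in $\{-1, +1\}$, the learning rate, and the epoch count are illustrative assumptions:

```python
import numpy as np

# ADALINE trained with the least-mean-squares (Widrow-Hoff) rule above.
# AND-gate data with targets in {-1, +1}, eta, and epochs are assumed examples.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([-1, -1, -1, 1], dtype=float)  # target outputs d

w = np.zeros(X.shape[1])  # weights
theta = 0.0               # bias constant
eta = 0.1                 # learning rate

for epoch in range(100):
    for x_i, d_i in zip(X, d):
        y = np.dot(w, x_i) + theta  # linear output, no thresholding
        w += eta * (d_i - y) * x_i  # LMS update driven by (d - y)
        theta += eta * (d_i - y)

print(np.sign(X @ w + theta))  # thresholded only at prediction time: [-1 -1 -1 1]
```

Note the design difference from the perceptron sketch: the error $(d - y)$ is computed on the raw linear output, so the weights keep improving even for correctly classified samples.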
MLP, Multi-Layer Perceptron
A Multi-Layer Perceptron (MLP, a fully connected artificial neural network) consists of at least three layers of perceptrons (input, hidden, and output) with non-linear activation functions. Its primary purpose is to classify inputs that are not linearly separable.
Every perceptron in an MLP is fully connected to the next layer of perceptrons, giving each perceptron multidimensional weights $w_{i,j}$.
These perceptrons function as signal processing units, similar to neurons in the human brain, which is why they are called neurons.
Every neuron has an activation function that maps its scalar response non-linearly into a bounded range. This is a crucial concept in MLPs: without the non-linearity, the stacked layers would collapse into a single linear map, so it is what enables their functionality and improves their performance.
In most implementations, MLPs use the hyperbolic tangent or sigmoid function as their activation function.
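A quick sketch showing how these two activations squash the same inputs into different ranges:

```python
import numpy as np

def sigmoid(z):
    # sigmoid maps any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])
print(np.tanh(z))   # hyperbolic tangent maps into (-1, 1)
print(sigmoid(z))   # sigmoid maps into (0, 1)
```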
Formulation and Learning Rule
The learning rule is based on the concept of a neuron: it updates the neurons' weights by calculating partial derivatives of the cost/loss function. This represents a fundamental mechanism of modern deep learning.
Calculate the error at the output layer using $e_j(n) = d_j(n) - y_j(n)$, where $d_j(n)$ represents the desired output value of the model for data point $n$; the total error is $\mathcal{E}(n) = \frac{1}{2}\sum_j e_j^2(n)$.
Update each weight using the gradient descent rule: $\Delta w_{ji}(n) = -\eta\,\frac{\partial \mathcal{E}(n)}{\partial v_j(n)}\,y_i(n)$.
$v_j(n)$ represents the induced local field (the weighted sum of inputs) of neuron $j$.
$y_i(n)$ represents the output of neuron $i$ in the previous layer.
$w_{ji}$ represents the weight from neuron $i$ to neuron $j$, while $\eta$ is the learning rate.
$\frac{\partial \mathcal{E}(n)}{\partial v_j(n)}$ is the partial derivative of $\mathcal{E}(n)$ with respect to $v_j(n)$, which for an output neuron can be expressed as $-\frac{\partial \mathcal{E}(n)}{\partial v_j(n)} = e_j(n)\,\phi'(v_j(n))$, where $\phi$ represents the activation function. These steps are assembled in the sketch below.
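Here is a minimal one-hidden-layer MLP trained with the backpropagation equations above. The XOR dataset, layer sizes, learning rate, and epoch count are assumptions for illustration; the sigmoid is used throughout so that $\phi'(v) = y(1 - y)$:

```python
import numpy as np

# A one-hidden-layer MLP trained with the backpropagation rule above.
# XOR data, layer sizes, eta, and epoch count are assumed for illustration.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
D = np.array([[0], [1], [1], [0]], dtype=float)              # desired outputs d_j

W1 = rng.normal(0.0, 1.0, (2, 4))  # input -> hidden weights w_{i,j}
b1 = np.zeros(4)
W2 = rng.normal(0.0, 1.0, (4, 1))  # hidden -> output weights
b2 = np.zeros(1)
eta = 0.5                          # learning rate

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

for epoch in range(10000):
    # forward pass: v is the induced local field, y the neuron output
    v1 = X @ W1 + b1
    y1 = sigmoid(v1)
    v2 = y1 @ W2 + b2
    y2 = sigmoid(v2)

    # output-layer error: e_j(n) = d_j(n) - y_j(n)
    e = D - y2

    # local gradients: -dE/dv = e * phi'(v); for the sigmoid, phi'(v) = y(1 - y)
    delta2 = e * y2 * (1.0 - y2)
    delta1 = (delta2 @ W2.T) * y1 * (1.0 - y1)

    # gradient-descent updates: delta_w_ji = eta * delta_j * y_i
    W2 += eta * y1.T @ delta2
    b2 += eta * delta2.sum(axis=0)
    W1 += eta * X.T @ delta1
    b1 += eta * delta1.sum(axis=0)

pred = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
print(np.round(pred.ravel(), 2))  # approaches [0, 1, 1, 0] once training succeeds
```

XOR is the classic example of a problem no single perceptron can solve, which is why it is a fitting demonstration of what the hidden layer and non-linear activation add.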