From Artificial Neural Networks to Deep Neural Networks: The Birth of the AI Generation
An Artificial Neural Network (ANN) is a machine learning model inspired by the biological structure of the human brain. It consists of neurons connected by links that, like synapses, pass signals represented as real numbers. Each neuron receives signals from other neurons, combines them using its weights, and applies a non-linear activation function to produce a new signal for subsequent neurons.
Neuron: The Basic Building Block of ANNs
A neuron receives multiple inputs and produces a single output. These inputs can be either raw data or outputs from previous neurons. The output is calculated by taking the weighted sum of the inputs and adding a bias; this value is then typically passed through an activation function to generate the final signal.
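As a concrete illustration, here is a minimal sketch of that computation in Python; the input values, weights, bias, and the choice of a sigmoid activation are all arbitrary assumptions for the example:

```python
import numpy as np

def sigmoid(z):
    # A common non-linear activation function.
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical values, chosen only for illustration.
inputs = np.array([0.5, -1.2, 3.0])   # signals from raw data or previous neurons
weights = np.array([0.4, 0.7, -0.2])  # one weight per input
bias = 0.1

# Weighted sum of the inputs plus the bias, passed through the activation.
z = np.dot(weights, inputs) + bias
output = sigmoid(z)
print(output)  # the neuron's final signal
```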
Neural Network Layers: Organizing Neurons for Processing
An ANN can be viewed as a graph with a layered architecture of neurons. This architecture becomes more complex in modern ANNs, such as deep learning models. Neurons in each layer connect to neurons in the next layer, enabling the model to capture complex and abstract features.
A layer that receives signals from external sources (input data or training data) is called an Input Layer.
A layer that receives signals from the input layer or other hidden layers is called a Hidden Layer.
A layer that produces the final signal is called an Output Layer.
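A minimal sketch of a forward pass through these three layer types, assuming numpy, ReLU activations in the hidden layers, randomly initialized weights, and arbitrary layer sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

# Layer sizes are arbitrary for this sketch: 4 inputs, two hidden layers of 8, 2 outputs.
sizes = [4, 8, 8, 2]
weights = [rng.normal(scale=0.5, size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]

def forward(x):
    # Input layer: receives the external signal x.
    a = x
    # Hidden layers: each layer feeds the next, building up more abstract features.
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(W @ a + b)
    # Output layer: produces the final signal (left linear here).
    W, b = weights[-1], biases[-1]
    return W @ a + b

print(forward(rng.normal(size=4)))
```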
Hyperparameters: Training Control Variables
Hyperparameters are values that control the training of neural networks. They define the model architecture and remain constant while the training algorithm runs; they are typically set before training begins. They heavily influence model performance, and one of the main goals in ANN development is finding good hyperparameter settings.
Learning rate determines the step size for updating weights. A high learning rate reduces training cost by taking fewer, larger steps, but it can overshoot good minima and end at lower accuracy; a low learning rate makes training more costly but tends to converge to higher accuracy.
Batch size refers to the number of samples processed in each training iteration. It affects training speed, memory usage, and model regularization. Large batches provide more stable gradient estimates but require more memory; small batches use less memory but produce noisier updates.
Depth refers to the number of hidden layers in the model. Greater depth allows the model to learn more complex patterns; however, it also increases the possibility of overfitting. The appropriate depth is determined by both the complexity of the problem and the volume of available data.
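To make these roles concrete, here is a minimal sketch of a mini-batch gradient-descent loop, assuming numpy, a toy linear regression model, and arbitrary hyperparameter values (none of them recommendations). Depth is omitted because a linear model has no hidden layers; the layer sketch above shows where depth would enter.

```python
import numpy as np

# Hyperparameters: set before training and held constant while the weights change.
LEARNING_RATE = 0.05   # step size for each weight update
BATCH_SIZE = 32        # samples processed per training iteration
EPOCHS = 10

rng = np.random.default_rng(0)

# Toy regression data, purely for illustration.
X = rng.normal(size=(1024, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_w + rng.normal(scale=0.1, size=1024)

w = np.zeros(4)  # model weights (a linear model keeps the sketch short)
for epoch in range(EPOCHS):
    perm = rng.permutation(len(X))
    for start in range(0, len(X), BATCH_SIZE):
        idx = perm[start:start + BATCH_SIZE]
        xb, yb = X[idx], y[idx]
        grad = 2 * xb.T @ (xb @ w - yb) / len(idx)  # gradient of the mean squared error
        w -= LEARNING_RATE * grad                   # step scaled by the learning rate

print(w)  # should end up close to true_w
```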
Backpropagation
Backpropagation is an algorithm that makes gradient computation efficient in deep neural networks. It calculates the gradients of the loss function with respect to the weights by working backward through the network. This key technique makes it possible to train large, complex neural architectures.
In modern AI/ML, "forward propagation" refers to the process where input data flows through all layers of the network to produce the output, while "backpropagation" represents a method to simplify gradient calculations using the chain rule:

$$\frac{\partial C}{\partial w^{l}_{jk}} = \frac{\partial C}{\partial a^{l}_{j}} \cdot \frac{\partial a^{l}_{j}}{\partial w^{l}_{jk}}, \qquad a^{l}_{j} = f^{l}\!\left(\sum_{k} w^{l}_{jk}\, a^{l-1}_{k}\right)$$

where each $\partial C / \partial a^{l}_{j}$ is itself obtained recursively from the gradients already computed for layer $l+1$.
Here the notation is:

$x$ is the input and $y$ is the output.
$C$ is the loss/cost function.
$L$ is the number of layers.
$f^{l}$ is the activation function of layer $l$.
$a^{l}_{j}$ is the activation of the $j$-th node in layer $l$.
$W^{l}$ is the weight matrix between layers $l-1$ and $l$; $w^{l}_{jk}$ is the weight connecting the $k$-th node in layer $l-1$ to the $j$-th node in layer $l$.

Some experts describe backpropagation as merely a way of traversing a computation graph in reverse; its main benefit, however, comes from this recursive reuse of intermediate gradients, which applies across all neural network architectures.
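Putting this notation to work, here is a minimal numeric sketch, assuming numpy, sigmoid activations for every $f^{l}$, and a squared-error loss for $C$ (all illustrative choices): the forward pass stores the activations $a^{l}$, the backward pass reuses them to compute each $\partial C/\partial w^{l}_{jk}$ via the chain rule, and a finite-difference check on one weight confirms the result.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A tiny network: 3 inputs -> 4 hidden -> 2 outputs (sizes are arbitrary).
W1 = rng.normal(scale=0.5, size=(4, 3))
W2 = rng.normal(scale=0.5, size=(2, 4))

x = rng.normal(size=3)    # input
y = np.array([0.0, 1.0])  # target output

# Forward propagation: store each layer's activations for reuse.
a0 = x
a1 = sigmoid(W1 @ a0)
a2 = sigmoid(W2 @ a1)            # network output
C = 0.5 * np.sum((a2 - y) ** 2)  # squared-error loss

# Backpropagation: chain rule, layer by layer. sigmoid'(z) = a * (1 - a).
delta2 = (a2 - y) * a2 * (1 - a2)        # dC/dz at the output layer
grad_W2 = np.outer(delta2, a1)           # dC/dW2

# Recursive step: propagate the error one layer back, reusing delta2.
delta1 = (W2.T @ delta2) * a1 * (1 - a1) # dC/dz at the hidden layer
grad_W1 = np.outer(delta1, a0)           # dC/dW1

# Numerical check of one entry: perturb one weight and re-run the forward pass.
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
Cp = 0.5 * np.sum((sigmoid(W2 @ sigmoid(W1p @ x)) - y) ** 2)
print(grad_W1[0, 0], (Cp - C) / eps)  # the two values should nearly match
```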
From Simple to Deep: The Evolution of Neural Networks
Deep Neural Network (DNN) refers to ANNs with multiple hidden layers. Although the overall architecture is similar to that of a traditional shallow ANN, stacking many hidden layers creates significant differences in performance.