Modern AI Systems: An In-depth Guide to Cutting-edge Technologies and Applications

Today's AI systems are evolving quickly and combining techniques across many approaches and domains. This page covers several notable topics in modern AI systems: multi-modal learning, Transformers, transfer learning, federated learning, and actionable AI.

Multi-modal Learning

Multi-Modal Learning is a machine learning approach in which a model learns from more than one data modality. For example, a model can be designed to take both text and images as input.

Video Generation models must understand both video and text modalities.
Autonomous Vehicles combine multiple modalities, including vision, radar, and vehicle speed data.
Real-World Examples: CLIP, DALL-E, Florence, Whisper, BEiT-3, and GPT-4V

Multi-modal learning can be divided into the four approaches below:

Early-Fusion (or Feature-level Fusion) means that the features of the different modalities are combined into a single input before one model processes them.
Late-Fusion (or Decision-level Fusion) refers to a system where separate models handle each modality, and a primary model combines their outputs into a single result.
Hybrid/Intermediate Fusion combines early and late fusion approaches, enabling interactions between modalities throughout the network.
Cross-modal Learning uses a secondary modality to support learning on the primary modality.
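
To make the difference between early and late fusion concrete, here is a minimal sketch in plain Python. The feature vectors, the weights, and the function names (`early_fusion`, `late_fusion`, `linear_score`) are all illustrative assumptions, not part of any library; a real system would use learned encoders instead of hand-written numbers.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def linear_score(features, weights, bias=0.0):
    """A stand-in 'model': weighted sum of features squashed to (0, 1)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sigmoid(sum(f * w for f, w in zip(features, weights)) + bias)

def early_fusion(image_feats, text_feats, joint_weights):
    """Feature-level fusion: concatenate the modalities, then score once."""
    joint = list(image_feats) + list(text_feats)
    return linear_score(joint, joint_weights)

def late_fusion(image_feats, text_feats, image_weights, text_weights, mix=0.5):
    """Decision-level fusion: score each modality separately, then combine."""
    p_image = linear_score(image_feats, image_weights)
    p_text = linear_score(text_feats, text_weights)
    return mix * p_image + (1.0 - mix) * p_text

# Hypothetical, hand-picked features and weights for illustration only.
image_feats = [0.2, 0.9]
text_feats = [0.7, 0.1, 0.4]
p_early = early_fusion(image_feats, text_feats, [0.5, -0.3, 0.8, 0.1, -0.6])
p_late = late_fusion(image_feats, text_feats, [0.5, -0.3], [0.8, 0.1, -0.6])
```

In the early-fusion path a single set of weights sees both modalities at once, so it can in principle learn interactions between them. In the late-fusion path each modality is scored independently and only the final decisions are mixed.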

Transformers Architecture/Transformers

A Transformer is a neural network architecture that processes all positions of a sequence in parallel using the self-attention mechanism. The architecture underpins most current large language models and has led to the creation of many AI startups.

In the Self-Attention Mechanism, learned query, key, and value weights work like a "learnable database" lookup that relates each word to every other word in the sequence.
Multi-Head Attention runs more than one self-attention head in parallel to capture more diverse patterns.
The Encoder-Decoder Structure forms the key building blocks of transformers. Transformers can vary in how they use encoders and decoders: BERT uses an encoder-only approach, while GPT uses a decoder-only approach.
Real-World Examples: GPT-4, BERT, T5, ViT, and DALL-E
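
The self-attention and multi-head ideas above can be sketched in a few lines of plain Python. This is a simplified illustration of scaled dot-product attention, assuming tiny hand-written matrices. The helper names (`matmul`, `self_attention`, `multi_head_attention`) are made up for this example, and the final output projection that real multi-head attention applies after concatenation is omitted.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over token vectors x."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]   # transpose of K
    scores = matmul(q, k_t)                # scores[i][j] = q_i . k_j
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights

def multi_head_attention(x, heads):
    """Run several heads in parallel and concatenate their outputs per token."""
    outputs = [self_attention(x, w_q, w_k, w_v)[0] for (w_q, w_k, w_v) in heads]
    return [sum((out[i] for out in outputs), []) for i in range(len(x))]

# Three tokens with two features each; identity projections for readability.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
out, attn = self_attention(x, identity, identity, identity)
two_heads = multi_head_attention(x, [(identity, identity, identity)] * 2)
```

Each row of `attn` is a probability distribution over the tokens, which is the "lookup" step: a token's query is compared with every key, and the resulting weights decide how much of each value is mixed into the output.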

Transfer Learning

Transfer Learning is a technique for reusing an already trained model on a different task.

In the pre-training phase, the model learns generic features from an enormous dataset.
After pre-training, the model undergoes Transfer/Fine-Tuning through two possible approaches:
- Option 1: Train only the newly added later layers, keeping the pre-trained layers frozen
- Option 2: Retrain all layers
Real-World Examples: VGG, ResNet, EfficientNet, BERT, GPT, T5, CLIP, and DALL-E
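
The two fine-tuning options can be illustrated with a deliberately tiny model. The sketch below assumes a two-weight linear model, where `w_backbone` stands in for the pre-trained layers and `w_head` for the newly added layer. The data, learning rate, and function names are hypothetical choices for illustration, not a recipe for real networks.

```python
def mse(w_backbone, w_head, data):
    """Mean squared error of the model y_hat = w_head * (w_backbone * x)."""
    return sum((w_head * w_backbone * x - y) ** 2 for x, y in data) / len(data)

def fine_tune(w_backbone, w_head, data, train_backbone, lr=0.01, steps=100):
    """Gradient descent; the backbone is updated only if train_backbone is True."""
    n = len(data)
    for _ in range(steps):
        grad_head = 0.0
        grad_backbone = 0.0
        for x, y in data:
            err = w_head * w_backbone * x - y
            grad_head += 2 * err * w_backbone * x
            grad_backbone += 2 * err * w_head * x
        w_head -= lr * grad_head / n
        if train_backbone:  # Option 2: retrain all layers
            w_backbone -= lr * grad_backbone / n
    return w_backbone, w_head

# Hypothetical target task: y = 3x. The backbone value 2.0 plays the role of
# pre-trained weights; the head starts from a fresh initial value.
task_data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
pretrained_backbone, new_head = 2.0, 0.5

# Option 1: the backbone stays frozen, only the new head is trained.
b1, h1 = fine_tune(pretrained_backbone, new_head, task_data, train_backbone=False)
# Option 2: every weight is updated.
b2, h2 = fine_tune(pretrained_backbone, new_head, task_data, train_backbone=True)
```

Option 1 is cheaper and keeps the generic features intact, which tends to suit small target datasets. Option 2 lets the whole model adapt, at the cost of more computation and a higher risk of overwriting what was learned in pre-training.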

Federated Learning

Federated Learning is a decentralized approach to machine learning. Multiple devices or organizations train and develop a model together without sharing raw data. Instead of sending data to a main server, each device receives the model, trains it locally, and sends back the updated model.

Initialization: The central server initializes the global model.
Local Training: The server sends the model to participating devices, and each device trains it using its local data.
Model Updates without Data: Devices encrypt and send only the model's parameter changes to the central server instead of the original data.
Federated Averaging: The server averages the received updates, typically weighted by how much data each device trained on, to improve the global model.
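
The four steps above can be sketched as a short simulation. It assumes a one-parameter linear model and two in-memory "devices"; the function names, learning rate, and data are illustrative assumptions, and the encryption of updates mentioned above is left out for brevity.

```python
def local_train(w, data, lr=0.05, epochs=1):
    """Local Training: gradient descent on one device's private data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def client_update(global_w, data):
    """Model Updates without Data: return only the parameter change and sample count."""
    return local_train(global_w, data) - global_w, len(data)

def federated_average(global_w, updates):
    """Federated Averaging: apply the sample-weighted mean of the client deltas."""
    total = sum(n for _, n in updates)
    return global_w + sum(delta * n for delta, n in updates) / total

def run_rounds(clients, rounds=30, global_w=0.0):
    """Initialization at global_w, then repeated train-and-average rounds."""
    for _ in range(rounds):
        updates = [client_update(global_w, data) for data in clients]
        global_w = federated_average(global_w, updates)
    return global_w

# Two hypothetical devices whose private data both follow y = 2x.
device_a = [(1.0, 2.0), (2.0, 4.0)]
device_b = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
final_w = run_rounds([device_a, device_b])
```

Note that the server only ever sees `(delta, n)` pairs. The raw `(x, y)` samples never leave `client_update`, which is the core privacy property of the approach.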

Actionable AI

Actionable AI is an artificial intelligence system that goes beyond simply analyzing data and predicting results to directly drive decisions and actions. While traditional AI was limited to providing insights, actionable AI connects analysis with execution to produce automated responses.

Traditional AI focused on pattern detection and prediction through modeling. For example: "Sales are expected to decrease by 10% next quarter."
In contrast, actionable AI recommends specific actions. For example: "To respond to next quarter's sales decline, increase advertising spend by 15%."

An actionable AI system needs the following three elements in place:

Real-Time Data Processing: The system must accurately analyze data as it arrives, in real time.
Decision Automation: AI must present clear and actionable courses of action.
Human-In-The-Loop (HITL): While AI proposes actions, final decisions require human approval. Therefore, humans must establish and follow systematic rules for reviewing and approving proposals.

Actionable AI systems come in two forms: an automated system aims to fully automate decision-making and coordination, while a prescriptive system aims to suggest decisions to humans.
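
A small sketch can show how the prescriptive and automated modes differ in code. Everything here is hypothetical: the threshold rule in `recommend` simply mirrors the sales example above, and the `approve` callback stands in for whatever human review process an organization defines.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Recommendation:
    action: str
    reason: str

def recommend(forecast_change_pct: float) -> Optional[Recommendation]:
    """Decision Automation: turn a forecast into a concrete proposal.

    The -10% threshold and the 15% response are illustrative assumptions.
    """
    if forecast_change_pct <= -10.0:
        return Recommendation(
            action="increase advertising spend by 15%",
            reason=f"sales forecast {forecast_change_pct:+.0f}% next quarter",
        )
    return None

def act(
    forecast_change_pct: float,
    execute: Callable[[Recommendation], None],
    approve: Optional[Callable[[Recommendation], bool]] = None,
) -> str:
    """Run one decision cycle.

    approve=None models a fully automated system; passing an approve
    callback models a prescriptive, human-in-the-loop system.
    """
    rec = recommend(forecast_change_pct)
    if rec is None:
        return "no_action"
    if approve is not None and not approve(rec):
        return "rejected"
    execute(rec)
    return "executed"

executed_actions = []
# Prescriptive mode: a human reviewer (simulated here) rejects the proposal.
status_rejected = act(-10.0, executed_actions.append, approve=lambda rec: False)
# Automated mode: the same proposal is executed without review.
status_auto = act(-10.0, executed_actions.append)
```

Keeping the approval step as an explicit parameter makes it easy to see, and to audit, which decisions are allowed to run without a human in the loop.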


Last updated 3 months ago
