The Prototypical Network is a few-shot learning algorithm that enables effective learning from limited data. It is most commonly applied to image classification, where it has demonstrated strong performance in few-shot scenarios.
Introduction
Prototypical Networks address the overfitting problem in few-shot learning scenarios. When working with extremely limited data, a classifier must rely on a simple inductive bias. The network achieves this by clustering the embeddings of each class around a single "prototype" that represents the class in the embedding space.
Formulation and Problem Set-up
The support set Sₖ for a class k can be written as Sₖ = {(x₁, y₁), …, (xₙ, yₙ)}, where:
xᵢ ∈ ℝᴰ is a D-dimensional feature vector
yᵢ ∈ {1, …, K} is the corresponding class label
k is the index of a class
The Prototypical Network computes a prototype cₖ ∈ ℝᴹ for each class in an M-dimensional embedding space. An embedding function fθ : ℝᴰ → ℝᴹ maps each input into this space, and the prototype is the mean of the embedded support points: cₖ = (1 / |Sₖ|) · ∑₍ₓᵢ,yᵢ₎∈Sₖ fθ(xᵢ).
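The prototype computation above is just a per-class mean in embedding space. A minimal sketch with NumPy, assuming the support points have already been embedded by some fθ (the embedding values below are made up for illustration):

```python
import numpy as np

# Hypothetical support embeddings: 6 points already mapped to M=3
# dimensions by an embedding network f_theta (not shown here).
embeddings = np.array([
    [0.0, 0.0, 1.0], [0.2, 0.0, 0.8],   # class 0
    [1.0, 1.0, 0.0], [0.8, 1.2, 0.0],   # class 1
    [2.0, 0.0, 0.0], [2.2, 0.2, 0.0],   # class 2
])
labels = np.array([0, 0, 1, 1, 2, 2])

# c_k = (1 / |S_k|) * sum of f_theta(x_i) over the support set of class k
prototypes = np.stack([embeddings[labels == k].mean(axis=0) for k in range(3)])
print(prototypes[0])  # mean of the two class-0 embeddings -> [0.1, 0.0, 0.9]
```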
Classification is performed by computing the distance from a query embedding to the prototype of each class, using the squared Euclidean distance d : ℝᴹ × ℝᴹ → [0, +∞), d(z, z′) = ∑ᵢ₌₁ᴹ (zᵢ − z′ᵢ)², and taking a softmax over the negative distances: pθ(y = k | x) = exp(−d(fθ(x), cₖ)) / ∑ₖ′ exp(−d(fθ(x), cₖ′)).
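The distance-to-probability step can be sketched in a few lines; the prototype and query values here are toy numbers, not from the paper:

```python
import numpy as np

def proto_probs(query_emb, prototypes):
    """p_theta(y=k|x): softmax over negative squared Euclidean distances."""
    # d(f_theta(x), c_k) = sum_i (f(x)_i - (c_k)_i)^2
    d = ((prototypes - query_emb) ** 2).sum(axis=1)
    logits = -d
    logits = logits - logits.max()   # subtract max for numerical stability
    e = np.exp(logits)
    return e / e.sum()

# Toy prototypes and a query point lying near class 0.
protos = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
p = proto_probs(np.array([0.1, 0.1]), protos)
print(p.argmax())  # -> 0 (smallest distance wins the softmax)
```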
Training minimizes the negative log-probability J(θ) = −log pθ(y = k | x) of the true class via SGD. Each training episode is formed by randomly selecting a subset of classes and sampling a few examples of each as the support set; the remaining samples of those classes serve as query points.
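The episode construction described above (random class subset, small support split, remainder as queries) can be sketched as follows; the pool of labelled data and the way/shot sizes are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical labelled pool: 100 examples over 10 classes, D=8 features.
X = rng.normal(size=(100, 8))
y = np.repeat(np.arange(10), 10)

def sample_episode(X, y, n_way=3, n_support=2, n_query=3):
    """One episode: pick n_way classes, take n_support samples of each as
    the support set; held-out samples of the same classes become queries."""
    classes = rng.choice(np.unique(y), size=n_way, replace=False)
    sx, sy, qx, qy = [], [], [], []
    for new_label, k in enumerate(classes):
        idx = rng.permutation(np.flatnonzero(y == k))
        sx.append(X[idx[:n_support]]); sy += [new_label] * n_support
        qx.append(X[idx[n_support:n_support + n_query]]); qy += [new_label] * n_query
    return np.vstack(sx), np.array(sy), np.vstack(qx), np.array(qy)

sx, sy, qx, qy = sample_episode(X, y)
print(sx.shape, qx.shape)  # (6, 8) (9, 8)
```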
Algorithm
The training procedure for the embedding model fθ(x) consists of two phases per episode. In Phase 1, the prototype cₖ of each class is computed from the support set. In Phase 2, the loss J is computed on the query set, and the weights θ of the embedding network are updated.
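The two phases can be sketched as a single forward pass. This is a minimal sketch with a stand-in linear embedding and made-up episode data; a real implementation uses a CNN for fθ and autograd plus SGD for the Phase 2 weight update, which is only noted in a comment here:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_softmax(logits):
    m = logits.max(axis=1, keepdims=True)
    return logits - m - np.log(np.exp(logits - m).sum(axis=1, keepdims=True))

def episode_loss(sx, sy, qx, qy, W, n_classes):
    # Phase 1: class prototypes from the embedded support set
    z = sx @ W                                   # stand-in linear f_theta
    protos = np.stack([z[sy == k].mean(axis=0) for k in range(n_classes)])
    # Phase 2: loss J = -log p_theta(y=k|x), averaged over the query set
    qz = qx @ W
    d = ((qz[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    logp = log_softmax(-d)
    return -logp[np.arange(len(qy)), qy].mean()

# Toy episode: 2 classes in D=4, embedded into M=2 by a random W.
sx = rng.normal(size=(4, 4)); sy = np.array([0, 0, 1, 1])
qx = rng.normal(size=(6, 4)); qy = np.array([0, 0, 0, 1, 1, 1])
W = rng.normal(size=(4, 2))
J = episode_loss(sx, sy, qx, qy, W, n_classes=2)
print(J)  # a positive scalar; in practice autograd + SGD would update W here
```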
The experimental section of the paper demonstrates that performing meta-level and base-level learning simultaneously has a positive effect on the optimization of both learners. The analysis of these results also suggests that MAML can converge in fewer steps because it avoids overfitting and accounts for the distribution and representation shared across tasks.