Lif31up's Blog

Prototypical Networks for Few-shot Learning


Last updated 1 month ago

A prototypical network is a few-shot learning method: a model that can learn effectively from only a handful of examples. Its performance has been demonstrated in image classification and several other fields, and its main advantage is that it can classify accurately even with limited data.

Diagram showing the clustering logic of a prototypical network

In an extremely data-limited setting such as few-shot learning, a classifier must have a simple inductive bias to avoid overfitting. To achieve this, examples are clustered around a prototype that represents each class. The clustering is carried out by a network that applies a non-linear mapping from the input data into an embedding space, and the prototype of each class is defined in that space.

This content is based on this paper.

Formulation

The support set $S_k$ for class $k$ is formulated as follows:

  • $S_k = \{(x_1, y_1), \ldots, (x_N, y_N)\}$, where:

    • $x_i \in \mathbb{R}^D$ is a $D$-dimensional input

    • $y_i \in \{1, \ldots, K\}$ is its label

    • $k$ is the index of a class, so every pair in $S_k$ has $y_i = k$
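As a minimal sketch of this definition, the snippet below groups a labeled support set into per-class subsets $S_k$. The toy data and the helper name `split_by_class` are assumptions made for illustration and do not come from the paper.

```python
from collections import defaultdict

def split_by_class(support):
    """Group (x, y) pairs by label y, giving one subset S_k per class k."""
    by_class = defaultdict(list)
    for x, y in support:
        by_class[y].append(x)
    return dict(by_class)

# Toy support set: D = 2 features, K = 2 classes (values are made up).
support = [
    ([0.0, 0.0], 1),
    ([1.0, 0.0], 1),
    ([4.0, 4.0], 2),
    ([5.0, 4.0], 2),
]

S = split_by_class(support)
print(sorted(S))   # class indices present in the support set
print(len(S[1]))   # number of examples in S_1
```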

Given this, the following can be defined:

  • A prototypical network uses an $M$-dimensional prototype $c_k \in \mathbb{R}^M$ as the representative of each class. To obtain the embeddings this requires, it uses an embedding function $f_\theta: \mathbb{R}^D \rightarrow \mathbb{R}^M$ with parameters $\theta$. Each prototype is the mean of the embedded support points of its class:

    • $\text{Prototype} = c_k = \frac{1}{|S_k|} \sum_{(x_i, y_i) \in S_k} f_\theta(x_i)$

  • Given a distance function $d: \mathbb{R}^M \times \mathbb{R}^M \rightarrow [0, +\infty)$, the prototypical network computes the distance from a query point $x$ to each class prototype in the embedding space, then applies a softmax over the negative distances.

    • $p_\theta(y = k \mid x) = \frac{\exp(-d(f_\theta(x), c_k))}{\sum_{k'} \exp(-d(f_\theta(x), c_{k'}))}$

    • The distance function $d(\cdot, \cdot)$ is the Euclidean distance: $d(u, v) = \sqrt{\sum_{i=1}^{M} (u_i - v_i)^2}$ for $u, v \in \mathbb{R}^M$

  • Training minimizes the negative log-probability $J(\theta) = -\log p_\theta(y = k \mid x)$ of the true class $k$ with SGD. The dataset for one episode consists of a randomly chosen subset of classes and a few random samples from each, which form the support set. The remaining samples are used as query points.
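The definitions above can be sketched in plain Python. This is an illustrative sketch rather than the paper's implementation: `embed` is assumed to be the identity function, standing in for a trained $f_\theta$, and the helper names and toy data are hypothetical.

```python
import math

def embed(x):
    # Assumption: identity embedding stands in for a trained f_theta.
    return list(x)

def prototype(examples):
    """c_k: mean of the embedded support examples of one class."""
    embedded = [embed(x) for x in examples]
    dim = len(embedded[0])
    return [sum(e[i] for e in embedded) / len(embedded) for i in range(dim)]

def distance(u, v):
    """Euclidean distance between two embedded points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def class_probabilities(x, prototypes):
    """Softmax over negative distances from f_theta(x) to every prototype."""
    z = embed(x)
    scores = {k: -distance(z, c) for k, c in prototypes.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}

def nll(x, k, prototypes):
    """J = -log p(y = k | x) for a query x whose true class is k."""
    return -math.log(class_probabilities(x, prototypes)[k])

# Toy support set, already grouped by class (values are made up).
support_by_class = {
    1: [[0.0, 0.0], [2.0, 0.0]],
    2: [[4.0, 4.0], [6.0, 4.0]],
}
prototypes = {k: prototype(xs) for k, xs in support_by_class.items()}
query = [1.0, 1.0]
probs = class_probabilities(query, prototypes)
print(prototypes)
print(probs)
print(nll(query, 1, prototypes))
```

The query lies much closer to the prototype of class 1 than to that of class 2, so class 1 receives most of the probability mass and the loss for the label 1 is small.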

Algorithm

The training algorithm for the model $f_\theta(x)$ can be described in two steps. In step 1, the prototypes $c_k$ are computed from the support set. In step 2, the loss $J$ is computed on the query set and the weights $\theta$ of the embedding network are updated.

  • Softmax: $p_\theta(y = k \mid x) = \frac{\exp(-d(f_\theta(x), c_k))}{\sum_{k'} \exp(-d(f_\theta(x), c_{k'}))}$

  • Negative log-probability: $J(\theta) = -\log p_\theta(y = k \mid x)$
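A small worked example may make the two formulas concrete. The distances are made-up numbers, not results from the paper: suppose there are two classes, the true class of the query is $k = 1$, and the distances to the prototypes are $d(f_\theta(x), c_1) = 1$ and $d(f_\theta(x), c_2) = 3$. Then:

$$
p_\theta(y = 1 \mid x) = \frac{e^{-1}}{e^{-1} + e^{-3}} = \frac{1}{1 + e^{-2}} \approx 0.881,
\qquad
J(\theta) = -\log 0.881 \approx 0.127
$$

Moving the query closer to $c_1$ or farther from $c_2$ raises this probability and lowers the loss, which is what the update of $\theta$ aims for.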

  1. Step 1:

    1. $S = \text{RandomSample}(D, N_k)$

    2. $c_k = \frac{1}{|S_k|} \sum_{(x_i, y_i) \in S_k} f_\theta(x_i)$

  2. Step 2:

    1. $Q = \text{RandomSample}(D \setminus S, N_q)$

    2. for each $(x, k)$ in $Q$:

      1. $p_\theta(y = k \mid x) = \frac{\exp(-d(f_\theta(x), c_k))}{\sum_{k'} \exp(-d(f_\theta(x), c_{k'}))}$

        • $J(\theta) = -\log p_\theta(y = k \mid x)$

        • $\theta \leftarrow \theta - \alpha \nabla_\theta J(\theta)$, where $\alpha$ is the learning rate
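The two steps can be sketched end to end in plain Python. Several things here are assumptions made to keep the example self-contained: a per-dimension scaling replaces a real embedding network, a central-difference estimate replaces backpropagation, the loss is averaged over the query set before a single update, and the function names and toy data are hypothetical. A practical implementation would more likely rely on an automatic differentiation library.

```python
import math
import random

def sample_episode(data, n_classes, n_support, n_query, rng):
    """Pick classes, then split each class into a support set and a disjoint query set."""
    support, query = {}, {}
    for k in rng.sample(sorted(data), n_classes):
        picked = rng.sample(data[k], n_support + n_query)
        support[k] = picked[:n_support]   # S_k
        query[k] = picked[n_support:]     # Q_k, drawn from the remainder
    return support, query

def embed(theta, x):
    # Assumption: a per-dimension scaling stands in for a real network f_theta.
    return [t * xi for t, xi in zip(theta, x)]

def episode_loss(theta, support, query):
    """Mean of J = -log p(y = k | x) over all query points."""
    # Step 1: prototypes from the support set.
    protos = {}
    for k, xs in support.items():
        zs = [embed(theta, x) for x in xs]
        protos[k] = [sum(col) / len(zs) for col in zip(*zs)]
    # Step 2: softmax over negative distances, then negative log-probability.
    losses = []
    for k, xs in query.items():
        for x in xs:
            z = embed(theta, x)
            neg_d = {j: -math.dist(z, c) for j, c in protos.items()}
            log_norm = math.log(sum(math.exp(v) for v in neg_d.values()))
            losses.append(log_norm - neg_d[k])
    return sum(losses) / len(losses)

def sgd_step(theta, support, query, lr=0.1, eps=1e-5):
    """theta <- theta - lr * grad J, with a central-difference gradient estimate."""
    grad = []
    for i in range(len(theta)):
        up = theta[:i] + [theta[i] + eps] + theta[i + 1:]
        down = theta[:i] + [theta[i] - eps] + theta[i + 1:]
        grad.append((episode_loss(up, support, query)
                     - episode_loss(down, support, query)) / (2 * eps))
    return [t - lr * g for t, g in zip(theta, grad)]

# Toy dataset: two well-separated classes in 2-D (made-up values).
data = {
    0: [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]],
    1: [[2.0, 2.0], [2.2, 2.1], [2.1, 2.3], [2.3, 2.2]],
}
rng = random.Random(0)
support, query = sample_episode(data, n_classes=2, n_support=2, n_query=2, rng=rng)

theta = [1.0, 1.0]
before = episode_loss(theta, support, query)
theta = sgd_step(theta, support, query)
after = episode_loss(theta, support, query)
print(before, after)
```

Note that the update subtracts the gradient, so the loss on this episode goes down after the step. In this toy setup the step stretches both dimensions slightly, which pushes the two class prototypes further apart relative to the spread within each class.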

Check out the implementation code I wrote myself!
