[Deep learning] What is 'Style transfer'?

by ์œ ๋ฏธ๋ฏธYoomimi 2024. 7. 7.

 

The result of style transfer (CVPR 2016)

 

✨ What is Style Transfer?

Style transfer is a technique that changes the 'style' of an image while leaving its 'content' intact. Two papers published in 2016 contributed greatly to its development: "Image Style Transfer Using Convolutional Neural Networks" (CVPR 2016) and "Perceptual Losses for Real-Time Style Transfer and Super-Resolution" (ECCV 2016).

 

 

1. Image Style Transfer Using Convolutional Neural Networks (CVPR 2016)

 

This paper studies how to perform style transfer with a CNN. The generated image combines the structure of the content image with the style of the style image. A single pretrained CNN (VGG-19 in the paper) extracts features from the content image, the style image, and the generated image. How the two are 'combined' comes down to the loss function. The loss has two parts: the content loss minimizes the difference between the content image and the generated image, and the style loss minimizes the difference in style between the style image and the generated image.

First, the content loss. F is the feature map of the generated image at layer l, and P is the feature map of the content image at layer l. The loss is a sum of squared differences between F and P, multiplied by 1/2 so that the derivative comes out cleanly.
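Written out with the symbols above, the content loss can be sketched as follows. The indices i (filter) and j (position within the feature map) are notation assumed here, since the text only names F, P, and l.

```latex
\mathcal{L}_{content} = \frac{1}{2} \sum_{i,j} \left( F^{l}_{ij} - P^{l}_{ij} \right)^{2}
```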

Next, the style loss. G is the Gram matrix of the generated image at layer l, and A is the Gram matrix of the style image at layer l. (The Gram matrix is computed from inner products between the feature maps of a layer.) The squared difference between G and A is first normalized using N_l, the number of filters in layer l, and M_l, the size of each feature map in layer l. Each layer's term is then multiplied by its weight w_l, and the terms are summed over all L layers to give the final style loss.
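Following the description above, the style loss can be sketched in three steps. The per-layer term E_l, the indices i, j, k, and the 1/4 factor follow the usual statement of this loss and are assumptions relative to the text, which only names G, A, N_l, M_l, w_l, and L.

```latex
\begin{aligned}
G^{l}_{ij} &= \sum_{k} F^{l}_{ik} F^{l}_{jk} \\
E_{l} &= \frac{1}{4 N_{l}^{2} M_{l}^{2}} \sum_{i,j} \left( G^{l}_{ij} - A^{l}_{ij} \right)^{2} \\
\mathcal{L}_{style} &= \sum_{l=1}^{L} w_{l} E_{l}
\end{aligned}
```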

 

 

The total loss is then a weighted sum of the content loss and the style loss. Tuning the weights alpha and beta is the key to getting the desired result.
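As a sketch, with alpha weighting the content loss and beta weighting the style loss, the total loss can be written as below. A larger alpha relative to beta keeps more of the content image's structure, and a larger beta pushes the result toward the style image.

```latex
\mathcal{L}_{total} = \alpha \, \mathcal{L}_{content} + \beta \, \mathcal{L}_{style}
```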

 

The algorithm of style transfer

 

 

2. Perceptual Losses for Real-Time Style Transfer and Super-Resolution (ECCV 2016)

 

Losses such as MSE that minimize the per-pixel difference between two images can work well in simple cases, such as low-resolution or noisy images, but they have limits on high-resolution images or images with complex patterns. The CVPR 2016 method above already compares CNN features rather than raw pixels, but it has to run an iterative optimization for every output image, which is slow. This paper therefore trains a feed-forward network for style transfer and super-resolution using a perceptual loss. The perceptual loss is computed from feature maps extracted at intermediate layers of a CNN, which helps preserve the high-level visual information of the image.

 

This is the content loss. ϕ_j is the feature map at the j-th layer of the network, y_hat is the generated image, and y is the content image.
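With these symbols, the content (feature reconstruction) loss can be sketched as a normalized squared distance between the two feature maps. C_j, H_j, and W_j (the channel count, height, and width of the feature map at layer j) are names assumed here for the normalization.

```latex
\ell_{feat}^{\phi, j}(\hat{y}, y) = \frac{1}{C_j H_j W_j} \left\| \phi_j(\hat{y}) - \phi_j(y) \right\|_2^2
```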

 

The style loss uses the Gram matrix to measure the interactions (correlations) between the channels of a feature map. The Gram matrix used here is the channel-by-channel inner product of the feature map, normalized by the size of the feature map.
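A minimal NumPy sketch of this Gram matrix and the resulting style loss is shown below. It assumes a feature map stored as a (C, H, W) array and uses random arrays in place of real CNN features; the function names gram_matrix and style_loss are hypothetical, chosen only for this illustration.

```python
import numpy as np


def gram_matrix(features):
    """Return the (C, C) Gram matrix of a (C, H, W) feature map.

    Entry (c, c') is the inner product of channels c and c' over all
    spatial positions, divided by C * H * W.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)  # one row per channel
    return flat @ flat.T / (c * h * w)


def style_loss(generated, style):
    """Squared Frobenius norm of the difference between two Gram matrices."""
    diff = gram_matrix(generated) - gram_matrix(style)
    return float(np.sum(diff ** 2))


# Random arrays stand in for real CNN feature maps (an assumption of this sketch).
rng = np.random.default_rng(0)
style_feat = rng.standard_normal((4, 8, 8))
gen_feat = rng.standard_normal((4, 8, 8))

print(gram_matrix(style_feat).shape)       # (4, 4)
print(style_loss(style_feat, style_feat))  # 0.0

# Shuffling the spatial positions leaves the Gram matrix unchanged.
perm = rng.permutation(8 * 8)
shuffled = style_feat.reshape(4, -1)[:, perm].reshape(4, 8, 8)
print(np.allclose(gram_matrix(shuffled), gram_matrix(style_feat)))  # True
```

The last check is the reason the Gram matrix works as a style descriptor: it records which channels fire together, not where in the image they fire.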

 

In addition, a total variation loss is computed, to maintain the spatial smoothness of the image and to reduce artifacts (unnatural distortions or noise) in the generated image.
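The exact form of the total variation term is not given in the text, so the sketch below uses one common variant as an assumption: the sum of absolute differences between vertically and horizontally adjacent pixels. The function name total_variation_loss is hypothetical.

```python
import numpy as np


def total_variation_loss(image):
    """Sum of absolute differences between neighboring pixels.

    image: array of shape (H, W) or (H, W, C).
    This is the anisotropic (L1) variant, chosen here as an assumption.
    """
    dh = np.abs(image[1:, ...] - image[:-1, ...])        # vertical neighbors
    dw = np.abs(image[:, 1:, ...] - image[:, :-1, ...])  # horizontal neighbors
    return float(dh.sum() + dw.sum())


# A constant image is perfectly smooth, a checkerboard is as noisy as it gets.
flat = np.full((4, 4), 0.5)
checker = (np.indices((4, 4)).sum(axis=0) % 2).astype(float)

print(total_variation_loss(flat))     # 0.0
print(total_variation_loss(checker))  # 24.0
```

Adding this term to the objective penalizes pixel-level noise like the checkerboard, which is how it pushes the generated image toward smoother output.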

 

 

So the final loss function to optimize is a weighted sum of the content loss, the style loss, and the total variation loss.
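As a sketch, the combined objective can be written as below. The weights λ_c, λ_s, and λ_TV and the symbol y_s for the style image are names assumed here; y_hat and y are the generated image and the content image as defined above.

```latex
\mathcal{L}(\hat{y}) = \lambda_{c} \, \ell_{feat}^{\phi, j}(\hat{y}, y) + \lambda_{s} \, \ell_{style}(\hat{y}, y_{s}) + \lambda_{TV} \, \ell_{TV}(\hat{y})
```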