[Paper Review] Freeze the Discriminator: a Simple Baseline for Fine-Tuning GANs


  • Authors: Sangwoo Mo, Minsu Cho, Jinwoo Shin
  • Affiliation: KAIST, POSTECH
  • Conference: CVPR 2020
  • slide

Motivation

  • GANs require long training times and large amounts of training data
    • e.g., BigGAN is trained on over 1M images for 120 GPU-days

Current state-of-the-art GANs, however, often require a large amount of training data and heavy computational resources, which thus limits the applicability of GANs in practical scenarios.

Previous Works

While several methods propose such transfer-learning approaches to training GANs, they are often prone to overfitting with limited training data or not robust in learning a significant distribution shift.

Experiments

  • The early layers of the discriminator learn generic features.
  • FreezeD (freezing those lower discriminator layers while fine-tuning the rest, sketched below) stably converges to better optima than naive fine-tuning.
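The core idea fits in a few lines of PyTorch. Below is a minimal sketch (not the authors' released code), assuming the pre-trained discriminator exposes its feature-extraction blocks as an ordered container named `D.blocks` (an assumed attribute name); only the unfrozen discriminator layers and the generator receive gradient updates during fine-tuning.

```python
import torch
from torch import nn


def freeze_lower_layers(D: nn.Module, num_frozen: int) -> None:
    """Freeze the first `num_frozen` blocks of a pre-trained discriminator.

    Assumes `D.blocks` is an ordered container (e.g. nn.ModuleList) holding
    the discriminator's blocks, input-side block first; the attribute name is
    an assumption for this sketch, not taken from the paper's code.
    """
    for block in list(D.blocks)[:num_frozen]:
        for p in block.parameters():
            p.requires_grad = False


# Only parameters that still require gradients are handed to the optimizer,
# so the frozen feature extractor stays fixed while the upper discriminator
# layers and the generator are fine-tuned on the target data:
# opt_D = torch.optim.Adam((p for p in D.parameters() if p.requires_grad), lr=2e-4)
# opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
```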

Which layer to freeze?

Intuitively, the lower layers of the discriminator learn generic features of images while the upper layers learn to classify whether the image is real or fake based on the extracted features.

  • Freezing up to an intermediate layer performs best
    • The best freeze depth depends on the distributional distance between the source dataset (used for pre-training) and the target dataset (used for fine-tuning); see the sketch after this list.
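Since the best depth depends on the source-target gap, one simple option is to treat the number of frozen blocks as a hyperparameter and sweep it. A hypothetical sketch: `fine_tune` and `evaluate` are caller-supplied placeholders (e.g. a short fine-tuning run and an FID computation), not anything from the paper, and `freeze_lower_layers` is the helper from the sketch above.

```python
import copy


def select_freeze_depth(G, D, fine_tune, evaluate, depths=(1, 2, 3, 4)):
    """Return the freeze depth whose fine-tuned generator scores best.

    `fine_tune(G, D)` trains the unfrozen parts on the target data and
    `evaluate(G)` returns a score where lower is better (e.g. FID); both are
    placeholders standing in for whatever pipeline is actually used.
    """
    best_depth, best_score = None, float("inf")
    for k in depths:
        G_k, D_k = copy.deepcopy(G), copy.deepcopy(D)
        freeze_lower_layers(D_k, num_frozen=k)  # helper from the sketch above
        fine_tune(G_k, D_k)
        score = evaluate(G_k)
        if score < best_score:
            best_depth, best_score = k, score
    return best_depth
```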

Thoughts

  • Transfer learning has been applied across many domains, so it feels almost inevitable that it would work for GANs too, yet it is still interesting to see it confirmed.
  • It is surprising that this works better than full fine-tuning.
  • It seems feature distillation could do even better.