[Paper Review] Freeze the Discriminator: a Simple Baseline for Fine-Tuning GANs


  • Authors: Sangwoo Mo, Minsu Cho, Jinwoo Shin
  • Affiliation: KAIST, POSTECH
  • Conference: CVPR 2020
  • slide

Motivation

  • GANs require long training times and large amounts of training data
    • e.g., BigGAN is trained on over 1M images for 120 GPU-days

Current state-of-the-art GANs, however, often require a large amount of training data and heavy computational resources, which thus limits the applicability of GANs in practical scenarios.

Previous Works

While several methods propose such transfer-learning approaches to training GANs, they are often prone to overfitting with limited training data or not robust in learning a significant distribution shift.

Experiments

  • The early layers of the discriminator learn generic features.
  • FreezeD (freezing those lower discriminator layers while fine-tuning the rest, sketched below) stably converges to better optima than naive fine-tuning.
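The core idea fits in a few lines of PyTorch. Below is a minimal sketch (not the authors' released code), assuming the pre-trained discriminator exposes its feature-extraction blocks as an ordered container named `D.blocks` (an assumed attribute name); only the unfrozen discriminator layers and the generator receive gradient updates during fine-tuning.

```python
import torch
from torch import nn


def freeze_lower_layers(D: nn.Module, num_frozen: int) -> None:
    """Freeze the first `num_frozen` blocks of a pre-trained discriminator.

    Assumes `D.blocks` is an ordered container (e.g. nn.ModuleList) holding
    the discriminator's blocks, input-side block first; the attribute name is
    an assumption for this sketch, not taken from the paper's code.
    """
    for block in list(D.blocks)[:num_frozen]:
        for p in block.parameters():
            p.requires_grad = False


# Only parameters that still require gradients are handed to the optimizer,
# so the frozen feature extractor stays fixed while the upper discriminator
# layers and the generator are fine-tuned on the target data:
# opt_D = torch.optim.Adam((p for p in D.parameters() if p.requires_grad), lr=2e-4)
# opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
```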

Which layer to freeze?

Intuitively, the lower layers of the discriminator learn generic features of images while the upper layers learn to classify whether the image is real or fake based on the extracted features.

  • Freezing up to an intermediate layer performs best
    • The best freeze depth depends on the distributional distance between the source dataset (used for pre-training) and the target dataset (used for fine-tuning); see the sketch after this list.
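Since the best depth depends on the source-target gap, one simple option is to treat the number of frozen blocks as a hyperparameter and sweep it. A hypothetical sketch: `fine_tune` and `evaluate` are caller-supplied placeholders (e.g. a short fine-tuning run and an FID computation), not anything from the paper, and `freeze_lower_layers` is the helper from the sketch above.

```python
import copy


def select_freeze_depth(G, D, fine_tune, evaluate, depths=(1, 2, 3, 4)):
    """Return the freeze depth whose fine-tuned generator scores best.

    `fine_tune(G, D)` trains the unfrozen parts on the target data and
    `evaluate(G)` returns a score where lower is better (e.g. FID); both are
    placeholders standing in for whatever pipeline is actually used.
    """
    best_depth, best_score = None, float("inf")
    for k in depths:
        G_k, D_k = copy.deepcopy(G), copy.deepcopy(D)
        freeze_lower_layers(D_k, num_frozen=k)  # helper from the sketch above
        fine_tune(G_k, D_k)
        score = evaluate(G_k)
        if score < best_score:
            best_depth, best_score = k, score
    return best_depth
```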

Thoughts

  • Transfer learning has been applied across many domains, so it feels almost inevitable that it would work for GANs too, yet it is still interesting to see it confirmed.
  • It is surprising that this works better than full fine-tuning.
  • It seems feature distillation could do even better.