GigaGAN is a large-scale Generative Adversarial Network (GAN) for text-to-image synthesis. The 1-billion-parameter model was presented at CVPR 2023.
Features of GigaGAN
- High Performance: GigaGAN achieves a lower (better) Fréchet Inception Distance (FID) than Stable Diffusion v1.5, DALL·E 2, and Parti-750M. It generates a 512px image in 0.13 seconds, significantly faster than diffusion and autoregressive models.
- Disentangled, Continuous, and Controllable Latent Space: GigaGAN's latent space allows layout-preserving fine style control, in which a different prompt is applied at the fine scales.
- High-Resolution Image Generation: GigaGAN can also be used to train an efficient, higher-quality upsampler, which allows it to synthesize ultra-high-resolution images at 4K resolution in 3.66 seconds.
- Text-to-Image Synthesis: Given a text prompt, GigaGAN generates a corresponding image, which makes it useful for creative and design applications.
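
The FID figures mentioned in the list above compare the statistics of real and generated image features, and a lower value means the two distributions are closer. The sketch below shows only the distance computation, under stated assumptions: it is not GigaGAN's evaluation code, the function name `frechet_distance` is hypothetical, and random vectors stand in for the Inception-v3 features that a real FID evaluation would extract from images.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is an array of shape (num_samples, feature_dim) with
    feature_dim >= 2. In a real FID evaluation the rows would be
    Inception-v3 activations; here they are arbitrary vectors.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # covariance matrices (up to floating-point noise, hence the clip).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Stand-in features (assumption: random data, not image features).
rng = np.random.default_rng(0)
real = rng.normal(size=(2000, 8))
fake_close = rng.normal(size=(2000, 8))          # same distribution as `real`
fake_far = rng.normal(loc=0.5, size=(2000, 8))   # shifted mean

fid_close = frechet_distance(real, fake_close)
fid_far = frechet_distance(real, fake_far)
print(f"close: {fid_close:.4f}  far: {fid_far:.4f}")
```

The feature set drawn from the same distribution scores much lower than the shifted one, which is the sense in which a lower FID indicates generated images that are statistically closer to real ones.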