About MegaPortraits

MegaPortraits advances neural head avatar technology to megapixel resolution. It focuses on the challenging task of cross-driving synthesis, in which the appearance of the driving image differs significantly from that of the animated source image. The main objective is a model that renders high-quality images and generalizes to novel views and motions.

Features of MegaPortraits

  1. Megapixel Resolution: Brings neural head avatars to megapixel resolution.
  2. Cross-Driving Synthesis: Targets cross-driving synthesis, in which the appearance of the driving image differs substantially from that of the animated source image.
  3. New Neural Architectures and Training Methods: Proposes a set of new neural architectures and training methods that use both medium-resolution video data and high-resolution image data to reach the desired rendered image quality and generalization to novel views and motions.
  4. High-Quality Neural Avatars: The proposed architectures and methods produce convincing high-resolution neural avatars that outperform competing methods in the cross-driving scenario.
  5. Distillation into Lightweight Model: A trained high-resolution neural avatar model can be distilled into a lightweight student model for real-time operation.
  6. Identity Lock: The lightweight model locks the identities of neural avatars to several dozen pre-defined source images, which is essential for many practical applications.
  7. One-Shot Creation: The system is designed for one-shot creation of high-resolution human avatars, termed megapixel portraits or MegaPortraits.
  8. Two-Stage Training: The model is trained in two stages, followed by an optional distillation stage for faster inference.
  9. Standard Training Setup: At each training step, two random frames are sampled from the dataset: the source frame and the driver frame. The model then imposes the motion of the driver frame onto the appearance of the source frame to produce an output image.
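
The training setup in item 9 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names sample_training_pair and training_step, the model and loss_fn callables, and the representation of the dataset as a list of videos are all hypothetical. It also assumes that both frames come from the same video, so that the driver frame can serve as the reconstruction target; the text above only says the frames are sampled from the dataset.

```python
import random
from typing import Callable, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")  # stand-in for an image tensor


def sample_training_pair(
    videos: Sequence[Sequence[Frame]], rng: random.Random
) -> Tuple[Frame, Frame]:
    """Pick a source frame and a driver frame for one training step.

    Assumption: both frames are drawn from the same video (same person),
    at two distinct positions.
    """
    # Only videos with at least two frames can yield a distinct pair.
    eligible = [video for video in videos if len(video) >= 2]
    video = rng.choice(eligible)
    source_idx, driver_idx = rng.sample(range(len(video)), 2)
    return video[source_idx], video[driver_idx]


def training_step(
    videos: Sequence[Sequence[Frame]],
    model: Callable[[Frame, Frame], Frame],
    loss_fn: Callable[[Frame, Frame], float],
    rng: random.Random,
) -> float:
    """One hypothetical step: impose the driver's motion on the source."""
    source, driver = sample_training_pair(videos, rng)
    # The model combines the source appearance with the driver motion.
    output = model(source, driver)
    # Assumption: with same-video pairs, the driver frame is the target.
    return loss_fn(output, driver)
```

In a real pipeline the frames would be image tensors and the loss would be backpropagated through the model; here the callables are left abstract so the sampling logic stays visible.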