Official implementation of EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars.
EMOPortraits introduces a novel approach for generating realistic and expressive one-shot head avatars driven by multimodal inputs, including extreme and asymmetric emotions.
For more details, please refer to:
You can set up the environment using the provided conda-pack archive:
- Download the Environment Archive
  Download sav.tar.gz from Google Drive or Yandex Disk.
- Unpack the Environment
  # Create a directory for the environment
  mkdir -p sav_env
  # Unpack the tar.gz archive into the directory
  tar -xzf sav.tar.gz -C sav_env
- Using Python Without Activating
  # Run Python directly from the unpacked environment
  ./sav_env/bin/python
- Activating the Environment
  # Activate the environment
  source sav_env/bin/activate
  Once activated, you can run Python as usual:
  (sav_env) $ python
- Cleanup Prefixes
  After activating the environment, you may need to run the following command to fix any issues with environment paths:
  (sav_env) $ conda-unpack
This command can also be run without activating the environment, as long as Python is installed on the machine.
Note: This option may not work as it has not been thoroughly tested.
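The unpacking step above can be wrapped in a small check that fails early if the archive is missing or does not contain an interpreter. This is only a sketch: `unpack_env` is a hypothetical helper, not part of the repository, and it assumes the archive places Python at `bin/python`, as the commands above imply.

```shell
# Sketch only: unpack_env is a hypothetical helper, not part of this repository.
# It unpacks a conda-pack archive and confirms an interpreter exists at bin/python.
unpack_env() {
    ue_archive="$1"   # e.g. sav.tar.gz
    ue_target="$2"    # e.g. sav_env

    if [ ! -f "$ue_archive" ]; then
        echo "archive not found: $ue_archive" >&2
        return 1
    fi

    # Same two commands as in the steps above, with error checks.
    mkdir -p "$ue_target" || return 1
    tar -xzf "$ue_archive" -C "$ue_target" || return 1

    if [ ! -x "$ue_target/bin/python" ]; then
        echo "no executable at $ue_target/bin/python" >&2
        return 1
    fi
    echo "environment ready: $ue_target"
}
```

A possible invocation is `unpack_env sav.tar.gz sav_env`, followed by the activation and `conda-unpack` steps described above.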
Due to limitations with conda-pack, the following repositories need to be installed manually:
- Face Detection: Install from GitHub
  git clone https://github.com/hhj1897/face_detection.git
  cd face_detection
  pip install -e .
- ROI Tanh Warping: Install from GitHub
  git clone https://github.com/ibug-group/roi_tanh_warping.git
  cd roi_tanh_warping
  pip install -e .
- Face Parsing: Install from GitHub
  git clone https://github.com/hhj1897/face_parsing.git
  cd face_parsing
  pip install -e .
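Since the three installs follow the same clone-then-install pattern, they can be expressed as one loop. The sketch below is an illustration rather than a script shipped with the project: `repo_dir` and `install_all` are hypothetical names, and it assumes `git` and the environment's `pip` are on the PATH (for example, after activating sav_env).

```shell
# URLs copied from the manual installation steps above.
REPO_URLS="https://github.com/hhj1897/face_detection.git
https://github.com/ibug-group/roi_tanh_warping.git
https://github.com/hhj1897/face_parsing.git"

# Hypothetical helper: derive the checkout directory name from a clone URL.
repo_dir() {
    basename "$1" .git
}

# Hypothetical helper: clone each repository if it is not already present,
# then install it in editable mode.
install_all() {
    for ia_url in $REPO_URLS; do
        ia_dir=$(repo_dir "$ia_url")
        if [ ! -d "$ia_dir" ]; then
            git clone "$ia_url" "$ia_dir" || return 1
        fi
        # The subshell keeps the caller's working directory unchanged.
        (cd "$ia_dir" && pip install -e .) || return 1
    done
}
```

Calling `install_all` from the directory where the repositories should live runs all three installs. Unlike the manual commands, it never leaves the shell inside a cloned directory, so the next clone does not end up nested inside the previous one.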
- Download Required Files
  Please download the following files from Google Drive or Yandex Disk:
  - logs.zip (contains the main model weights; not yet available)
  - logs_s2.zip (contains the stage 2 model weights)
  - repos.zip (contains dependency repositories and their weights)
- Extract Files
  Extract all the downloaded zip files into the root directory of the project:
  unzip logs.zip -d ./
  unzip logs_s2.zip -d ./
  unzip repos.zip -d ./
- Download and Extract Loss Models
  Navigate to the losses directory:
  cd losses
  Download the following files into it:
  - loss_model_weights.zip
  - gaze_models.zip
  Extract them within the same losses directory:
  unzip loss_model_weights.zip -d ./
  unzip gaze_models.zip -d ./
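Before extracting, it can help to confirm that every archive was actually downloaded, since a missing file otherwise only surfaces as an unzip error. The following is a minimal sketch under that assumption: `missing_archives` and `extract_archives` are hypothetical helpers, and the file names are the ones listed above.

```shell
# Hypothetical helper: print each expected archive that is absent from a directory.
missing_archives() {
    ma_dir="$1"; shift
    for ma_name in "$@"; do
        [ -f "$ma_dir/$ma_name" ] || echo "$ma_name"
    done
}

# Hypothetical helper: extract the listed archives in place,
# stopping at the first failure.
extract_archives() {
    ea_dir="$1"; shift
    for ea_name in "$@"; do
        unzip -q "$ea_dir/$ea_name" -d "$ea_dir" || return 1
    done
}
```

For example, `missing_archives . logs.zip logs_s2.zip repos.zip` run from the project root lists whatever still needs downloading (logs.zip is expected to appear there while the main weights are unavailable), and `extract_archives losses loss_model_weights.zip gaze_models.zip` unpacks the loss models.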
Instructions on how to run the code, train models, and perform inference will be added here.
This repository is primarily intended for demonstration purposes, allowing enthusiasts to explore the network architecture and training procedures in detail. The primary author is not currently affiliated with academia and may not have the capacity to actively maintain this repository. Community contributions and support are highly encouraged.
A significant factor contributing to the success and quality of the results is the dataset used for training. The original model was trained on a high-quality (HQ) version of the VoxCeleb2 dataset, which is no longer publicly available. However, there are now newer datasets of higher quality and larger scale. Utilizing these can potentially yield even better results, as seen in recent methods that build upon ideas presented in the MegaPortraits paper.
Our FEED dataset (link), introduced in our paper, was instrumental in incorporating asymmetric and extreme emotions into the latent emotion space. We encourage the community to actively use and expand upon this dataset. Given that the final version is slightly smaller (due to some participants withdrawing consent), supplementing it with other datasets containing extreme emotions (e.g., NeRSemble) can enhance model performance, especially when attempting to replicate or improve upon the techniques presented in EMOPortraits.
We are providing a version of the pre-trained model weights (located in logs.zip):
- Retrain_with_17_V1_New_rand_MM_SEC_4_drop_02_stm_10_CV_05_1_1
This model will be retrained using the same parameters as described in our paper but with 17 IDs in the FEED dataset instead of the original 23. Since the FEED dataset samples were used 25% of the time during training, this change might slightly affect performance in intensive tests.
Please refer to notebooks/E_emo_infer_video.ipynb
We extend our gratitude to all contributors and participants who made this project possible. Special thanks to the developers of the datasets and tools that were instrumental in our research.
This project is licensed under the Creative Commons BY-NC-SA 4.0 license. You are free to use, modify, and distribute this work non-commercially, as long as appropriate credit is given and any derivative works are licensed under identical terms.