How Do You Perceive My Face? Recognizing Facial Expressions in Multi-Modal Context by Modeling Mental Representations

Florian Blume*, Runfeng Qu*, Pia Bideau, Martin Maier, Rasha Abdel Rahman, Olaf Hellwich
Link to paper: https://arxiv.org/abs/2409.02566
- Clone the repository:
git clone [todo]
- Create the Anaconda environment:
cd context_matters
conda env create -f environment.yaml
- For evaluation and testing, we provide pre-trained models on RAVDESS and MEAD.
Training the entire framework consists of three steps.
Step 1: In Residual_vaegan.yaml, set data_path to the path of your dataset and in_channel to match your input data, then run
python run.py --config configs/Residual_vaegan.yaml --command fit
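The two fields mentioned above might look roughly as follows; the exact key names and nesting in configs/Residual_vaegan.yaml may differ, so treat this as a sketch:

```yaml
# Sketch of configs/Residual_vaegan.yaml; key names, nesting, and paths are assumptions.
data_path: /path/to/your/dataset   # root folder of RAVDESS or MEAD
in_channel: 3                      # channels of the input data, e.g. 3 for RGB frames
```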
Step 2: In MLP_classify.yaml, set ckpt to the path of the face reconstruction model you obtained in step 1, then run
python run.py --config configs/MLP_classify.yaml --command fit
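Again as a sketch (the actual key may be nested differently in configs/MLP_classify.yaml, and the checkpoint path is hypothetical):

```yaml
# Sketch of configs/MLP_classify.yaml; nesting and path are assumptions.
ckpt: /path/to/step1_checkpoints/last.ckpt   # face reconstruction model trained in step 1
```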
Step 3: In ia_attention_class.yaml, fill in the ckpts of backbone_1, backbone_2, and classifier with the models you trained in the previous two steps. Finally, run
python run.py --config configs/ia_attention_class.yaml --command fit
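The three checkpoint entries could be organized like this; the real structure of configs/ia_attention_class.yaml may differ and the paths are placeholders:

```yaml
# Sketch of configs/ia_attention_class.yaml; key layout and paths are assumptions.
backbone_1:
  ckpt: /path/to/checkpoints/backbone_1.ckpt   # a checkpoint trained in steps 1-2
backbone_2:
  ckpt: /path/to/checkpoints/backbone_2.ckpt   # a checkpoint trained in steps 1-2
classifier:
  ckpt: /path/to/checkpoints/classifier.ckpt   # a checkpoint trained in steps 1-2
```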
For evaluation, edit the ckpt in ia_attention_class.yaml to point to the model you trained or to the one we provide, and run
python run.py --config configs/ia_attention_class.yaml --command eval
It will print the validation accuracy, followed by the test accuracy.
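For reference, the evaluation entry might look like this (again a sketch with a hypothetical path):

```yaml
# Sketch; the eval run is assumed to load a single full-model checkpoint.
ckpt: /path/to/ia_attention_class/last.ckpt   # your trained model or the provided one
```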
You can select an image and an audio file from the provided dataset and produce the merged face image with the following command:
python interface.py --image path/to/image --audio path/to/audio