first commit based on ggvad-genea2023
rltonoli committed Oct 21, 2024
1 parent fa8b9c8 commit f6efad5
Showing 58 changed files with 11,590 additions and 1 deletion.
135 changes: 135 additions & 0 deletions .gitignore
@@ -0,0 +1,135 @@
# Removing datasets

wavlm/*.pt

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/
generate_wavlm_reps.ipynb
28 changes: 28 additions & 0 deletions Dockerfile
@@ -0,0 +1,28 @@
FROM nvidia/cuda:12.2.0-devel-ubuntu22.04

# Make Miniconda available on PATH during the build (ARG) and at runtime (ENV)
ENV PATH="/root/miniconda3/bin:${PATH}"
ARG PATH="/root/miniconda3/bin:${PATH}"

# System packages: wget/git for fetching code, nano for editing, ffmpeg for audio/video processing
RUN apt-get update
RUN apt-get install -y wget git nano ffmpeg

# Install Miniconda (Python 3.7) non-interactively and remove the installer afterwards
RUN wget \
https://repo.anaconda.com/miniconda/Miniconda3-py37_4.8.3-Linux-x86_64.sh \
&& mkdir /root/.conda \
&& bash Miniconda3-py37_4.8.3-Linux-x86_64.sh -b \
&& rm -f Miniconda3-py37_4.8.3-Linux-x86_64.sh

RUN conda --version

WORKDIR /root
COPY environment.yml /root

# Update conda tooling, then create the ggvad environment from environment.yml
RUN conda install tqdm -f
RUN conda update conda
RUN conda install pip
RUN conda --version
RUN conda env create -f environment.yml

# Run the remaining commands inside the ggvad environment; CLIP is installed from source
SHELL ["conda", "run", "-n", "ggvad", "/bin/bash", "-c"]
RUN pip install git+https://github.com/openai/CLIP.git
99 changes: 98 additions & 1 deletion README.md
@@ -1,2 +1,99 @@
# stylistic-gesture
Official repository for the paper "Stylistic Co-Speech Gesture Generation: Modeling Personality and Communicative Styles in Virtual Agents".

## Preparing environment

1. Git clone this repo

2. Enter the repo and create docker image using

```sh
docker build -t ggvad .
```

3. Run container using

```sh
docker run --rm -it --gpus device=GPU_NUMBER --userns=host --shm-size 64G -v /MY_DIR/ggvad-genea2023:/workspace/ggvad/ -p PORT_NUMBER --name CONTAINER_NAME ggvad:latest /bin/bash
```

for example:
```sh
docker run --rm -it --gpus device=0 --userns=host --shm-size 64G -v C:\ProgramFiles\ggvad-genea2023:/workspace/my_repo -p '8888:8888' --name my_container ggvad:latest /bin/bash
```

> ### CUDA version < 12.0:
>
> If you have an earlier CUDA or nvcc release, you will need to adjust the Dockerfile. Change the first line to `FROM pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel` and remove the Miniconda installation step (the `RUN wget ... Miniconda3 ...` block), since conda is already installed in the PyTorch image. Then, run the container using:
>
> ```sh
> nvidia-docker run --rm -it -e NVIDIA_VISIBLE_DEVICES=GPU_NUMBER --runtime=nvidia --userns=host --shm-size 64G -v /work/rodolfo.tonoli/GestureDiffusion:/workspace/gesture-diffusion/ -p $port --name gestdiff_container$number multimodal-research-group-mdm:latest /bin/bash
> ```
OR use the shell script `ggvad_container.sh` (don't forget to change the volume path) with the flags `-g`, `-n`, and `-p`. For example:
```sh
sh ggvad_container.sh -g 0 -n my_container -p '8888:8888'
```
4. Activate cuda environment:
```sh
source activate ggvad
```
## Data pre-processing
1. Get the GENEA Challenge 2023 dataset and put it into `./dataset/`
(Our system is monadic, so you'll only need the main-agent's data.)
2. Download the [WavLM Base +](https://github.com/microsoft/unilm/tree/master/wavlm) and put it into the folder `/wavlm/`
3. Inside the folder `/workspace/ggvad`, run
```sh
python -m data_loaders.gesture.scripts.genea_prep
```
This will convert the BVH files to npy representations, downsample the wav files to 16 kHz and save them as npy arrays, and convert these arrays to WavLM representations (see the sketch after this list). The VAD data must be processed separately due to Python library incompatibilities.
4. (Optional) Process VAD data
We provide the speech activity data (from SpeechBrain's VAD), but if you wish to process it yourself, redo the "Preparing environment" steps for the speechbrain environment: build the image using the Dockerfile inside `speechbrain` (`docker build -t speechbrain .`), run the container (`docker run ... --name CONTAINER_NAME speechbrain:latest /bin/bash`), and run the command below (a minimal VAD sketch also follows this list):
```sh
python -m data_loaders.gesture.scripts.genea_prep_vad
```
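
For reference, the sketch below illustrates the audio side of step 3: downsampling a wav file to 16 kHz and extracting WavLM representations. This is a minimal sketch, not the actual `genea_prep` code; the module path `wavlm.WavLM`, the checkpoint name `WavLM-Base+.pt`, and the file paths are assumptions.

```python
# Minimal sketch (not the actual genea_prep code): wav -> 16 kHz -> WavLM features.
# Assumes WavLM/WavLMConfig come from https://github.com/microsoft/unilm/tree/master/wavlm
import numpy as np
import librosa
import torch
from wavlm.WavLM import WavLM, WavLMConfig  # module path is an assumption

# Load the audio resampled to 16 kHz and save it as an npy array (paths are assumptions)
audio, sr = librosa.load("dataset/main-agent/wav/sample.wav", sr=16000)
np.save("dataset/main-agent/audio16k_npy/sample.npy", audio)

# Load the WavLM Base+ checkpoint and extract frame-level representations
checkpoint = torch.load("wavlm/WavLM-Base+.pt", map_location="cpu")
cfg = WavLMConfig(checkpoint["cfg"])
model = WavLM(cfg)
model.load_state_dict(checkpoint["model"])
model.eval()

wav = torch.from_numpy(audio).float().unsqueeze(0)  # (1, num_samples)
if cfg.normalize:
    wav = torch.nn.functional.layer_norm(wav, wav.shape)
with torch.no_grad():
    reps = model.extract_features(wav)[0]  # (1, num_frames, feature_dim)
np.save("dataset/main-agent/wavlm_npy/sample.npy", reps.squeeze(0).numpy())
```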
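
Similarly, here is a minimal sketch of obtaining speech-activity information with SpeechBrain's pretrained VAD (not the actual `genea_prep_vad` code); the `vad-crdnn-libriparty` model, the 16 kHz mono input, and the file paths are assumptions.

```python
# Minimal sketch (not the actual genea_prep_vad code): speech activity with SpeechBrain's VAD.
# In newer SpeechBrain releases the import lives under speechbrain.inference instead.
import numpy as np
from speechbrain.pretrained import VAD

vad = VAD.from_hparams(
    source="speechbrain/vad-crdnn-libriparty",
    savedir="pretrained_models/vad-crdnn-libriparty",
)

# Frame-level speech probabilities and the resulting speech segments (paths are assumptions)
prob_chunks = vad.get_speech_prob_file("dataset/main-agent/wav_16k/sample.wav")
boundaries = vad.get_speech_segments("dataset/main-agent/wav_16k/sample.wav")
np.save("dataset/main-agent/vad_npy/sample.npy", prob_chunks.squeeze().detach().cpu().numpy())
print(boundaries)  # rows of [start_sec, end_sec] for detected speech
```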
## Train model
To train the model described in the paper, use the following command inside the repo:
```sh
python -m train.train_mdm --save_dir save/my_model_run --dataset genea2023+ --step 10 --use_text --use_vad True --use_wavlm True
```
## Gesture Generation
Generate motion using the trained model by running the following command. If you wish to generate gestures with the pretrained GENEA Challenge model, use `--model_path ./save/default_vad_wavlm/model000290000.pt`
```sh
python -m sample.generate --model_path ./save/my_model_run/model000XXXXXX.pt
```
## Render
To render the official GENEA 2023 visualizations, follow the instructions provided [here](https://github.com/TeoNikolov/genea_visualizer/)
## Cite
If you wish to cite this repo or the paper:
```text
@article{tonoli2024stylistic,
  author  = {Tonoli, Rodolfo L. and Costa, Paula D. P.},
  title   = {Stylistic Co-Speech Gesture Generation: Modeling Personality and Communicative Styles in Virtual Agents},
  journal = {N/A},
  year    = {N/A},
}
```
Empty file added data_loaders/__init__.py
Empty file.
