while setting variables #85

Open
palashmoon opened this issue May 13, 2024 · 28 comments

@palashmoon

Hello, can you please help me with what values I should set for the variables in each file? For example:

PASE_CONFIG=
PASE_CHCK_PATH=
USE_AUTH_TOKEN=

What should the values for these be?

@david-gimeno
Collaborator

david-gimeno commented May 14, 2024

Hi Palash,

I guess we are talking about the script ./scripts/feature_extraction/extract-dvlog-pase+-feats.sh for extracting the audio-based PASE+ features. There are different aspects to take into account:

· The videos of the dataset are expected to be placed at ./data/D-vlog/videos/, as indicated by the variable $VIDEO_DIR.
· You should clone the official PASE+ repo: https://github.com/santi-pdp/pase.git
· You then have to follow the instructions here to download the pre-trained model provided by the original authors and to find out which config file you should use.

Therefore, once you have cloned the repo and downloaded the pre-trained model checkpoint, you should be able to set these variables, e.g., as follows:

PASE_CONFIG=./pase/cfg/frontend/PASE+.cfg
PASE_CHCK_PATH=./pase/FE_e199.ckpt

Regarding the USE_AUTH_TOKEN variable, it is the authentication token commonly required when using certain HuggingFace models. Please find the instructions for using PyAnnote here.
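In case it helps, here is a minimal, hedged sketch of how such a token is typically passed to pyannote.audio (this is not our script; the pipeline name below is only illustrative, so follow the PyAnnote instructions linked above):

    import os
    from pyannote.audio import Pipeline

    # Read the HuggingFace access token (e.g., "hf_...") from the environment
    # and pass it when loading a gated pyannote model.
    token = os.environ["USE_AUTH_TOKEN"]
    pipeline = Pipeline.from_pretrained("pyannote/voice-activity-detection", use_auth_token=token)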

@palashmoon
Author

Thank you @david-gimeno for the help.
Can you help me set up these variables too?

  1. INSTBLINK_CONFIG=
  2. ETH_XGAZE_CONFIG=
  3. BODY_LANDMARKER_PATH=./data/D-vlog/body_landmarks
  4. HAND_LANDMARKER_PATH=./data/D-vlog/hand_landmarks

@david-gimeno
Collaborator

You can find the toolkit we employed to extract each modality in the paper (see page 7). However, I can see that configuring all these feature extractors is not as easy as I expected. Let's go step by step.

  • Configuring InstBlink: You should clone the following repo and download the checkpoint indicated here. Then, by inspecting the script the original authors wrote, we can infer the following configuration:
    INSTBLINK_CONFIG=./MPEblink/configs/instblink/instblink_r50.py
    INSTBLINK_CHCK_PATH=./MPEblink/pretrained_models/instblink_r50.pth

  • Configuring ETH_XGaze: Similarly, you should clone the following repo. The authors provide config files for various models. In our work, we used the ETH-XGaze detector. So, the config file should be as follows:
    ETH_XGAZE_CONFIG=./pytorch_mpiigaze_demo/ptgaze/data/configs/eth-xgaze.yaml

  • Configuring MediaPipe's Models: As indicated in the paper, we based our body and hand landmarkers on MediaPipe by Google. Take into account that this platform offers plenty of models and you are not limited to using the same ones as ours. However, I can share the specific models we employed. If you read this tutorial, you can find the Linux command to download the model checkpoint for the body pose estimator. For the hand landmark detector, you can find the checkpoint download command here. How did I find this tutorial? Search Google for "hand landmark mediapipe" and then click on "Python Code Example". Therefore, once you download the model checkpoints, the configuration should be something as follows (a loading sanity check is sketched after this list):

            `BODY_LANDMARKER_PATH=./landmarkers/pose_landmarker.task`
            `HAND_LANDMARKER_PATH=./landmarkers/hand_landmarker.task`
    

Note that the /landmarkers/ directory is not created automatically; it is just there to avoid making a mess in our code.
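Once the two .task checkpoints are downloaded and the variables above are set, a quick sanity check with the MediaPipe Tasks API could look roughly like this (a sketch assuming you renamed the files as shown above; it is not part of our scripts):

    from mediapipe.tasks import python as mp_tasks
    from mediapipe.tasks.python import vision

    # Load the body pose landmarker from the downloaded .task checkpoint.
    pose_landmarker = vision.PoseLandmarker.create_from_options(
        vision.PoseLandmarkerOptions(
            base_options=mp_tasks.BaseOptions(model_asset_path="./landmarkers/pose_landmarker.task")))

    # Load the hand landmarker in the same way.
    hand_landmarker = vision.HandLandmarker.create_from_options(
        vision.HandLandmarkerOptions(
            base_options=mp_tasks.BaseOptions(model_asset_path="./landmarkers/hand_landmarker.task")))

    print("Both MediaPipe landmarkers loaded correctly.")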

@palashmoon
Author

palashmoon commented May 15, 2024

Thank you @david-gimeno for the quick response.
I cloned the repo for gaze as you mentioned in the above comment and used ETH_XGAZE_CONFIG=./pytorch_mpiigaze_demo/ptgaze/data/configs/eth-xgaze.yaml,
but using this gives me one more issue:
[screenshot of the error]
I am unable to find this checkpoint anywhere on the net. Can you please provide some suggestions for this?
One more thing: why are we cloning pytorch_mpiigaze_demo, and how can I resolve this issue?
Thank you so much once again.

@david-gimeno
Collaborator

I would need more details. What OS are you using, Ubuntu? And how did you set the variable MPIIGAZE_DIR=?

@palashmoon
Author

Hi, I am using Ubuntu. I have set them like this:

MPIIGAZE_DIR=./scripts/conda_envs/feature_extractors/pytorch_mpiigaze_demo/
ETH_XGAZE_CONFIG=./scripts/conda_envs/feature_extractors/pytorch_mpiigaze_demo/ptgaze/data/configs/eth-xgaze.yaml

please let me know if you need any other information.

@david-gimeno
Collaborator

david-gimeno commented May 15, 2024

According to the script, the model checkpoints should be downloaded automatically. So, let's try using absolute paths, just in case. However, unless you modified our repo folder structure, your paths may be wrong because ./scripts/conda_envs/feature_extractors/... doesn't exist; it should be ./scripts/feature_extractors/...

@palashmoon
Author

palashmoon commented May 15, 2024

Actually, I have cloned mpiigaze inside this path: ./scripts/conda_envs/feature_extractors/pytorch_mpiigaze_demo/
Should I still use the path ./scripts/feature_extractors/...?

@david-gimeno
Collaborator

david-gimeno commented May 15, 2024

Okay, I think the problem is in the config file pytorch_mpiigaze_demo/ptgaze/data/configs/eth-xgaze.yaml. Open it with a text editor and modify the paths to the checkpoints according to the way you structured your project. I mean, you should replace the ~/ prefix.

@palashmoon
Author

palashmoon commented May 15, 2024

Hi @david-gimeno. I have updated the file like this:

gaze_estimator:
  checkpoint: ./scripts/conda_envs/feature_extractors/pytorch_mpiigaze_demo/ptgaze/models/eth-xgaze_resnet18.pth
  camera_params: ${PACKAGE_ROOT}/data/calib/sample_params.yaml
  use_dummy_camera_params: false
  normalized_camera_params: ${PACKAGE_ROOT}/data/normalized_camera_params/eth-xgaze.yaml
  normalized_camera_distance: 0.6
  image_size: [224, 224]

but I am still getting the same error. Actually, there is no checkpoint named eth-xgaze_resnet18.pth inside the models folder...
This is the current structure:
[screenshot of the directory structure]

@david-gimeno
Collaborator

david-gimeno commented May 15, 2024

Our script uses the function download_ethxgaze_model(), which is defined in the original repo of the gaze tracker here. It returns the path where the model checkpoint should be downloaded. Try to modify our script to print that path and check if it matches the one specified in the config file.
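For example, a quick debugging sketch (assuming download_ethxgaze_model() is importable from ptgaze.utils, as in the original pytorch_mpiigaze_demo repo):

    from ptgaze.utils import download_ethxgaze_model

    # Download (if needed) and print the path where ptgaze stores the ETH-XGaze checkpoint,
    # then compare it with the `checkpoint:` entry of eth-xgaze.yaml.
    ckpt_path = download_ethxgaze_model()
    print(ckpt_path)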

If the error persists, you should contact the original authors of the gaze tracker code.

Note: Linux hides files whose names start with a dot, but there are ways to see them even if they are hidden (e.g., ls -a).

@palashmoon
Author

palashmoon commented May 16, 2024

Thank you @david-gimeno, I was able to extract the gaze features. After downloading the model, it was stored under a path like

~/.ptgaze/...

which Python was not expanding properly, so I used os.path.expanduser("~/.ptgaze/...") to get the correct path.
Thanks for the help again.
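(For reference, the fix was along these lines; the exact file name under ~/.ptgaze/ is only illustrative:)

    import os

    # "~" is not expanded automatically when the string is used as a file path,
    # so expand it explicitly before opening/loading the checkpoint.
    ckpt_path = os.path.expanduser("~/.ptgaze/models/eth-xgaze_resnet18.pth")  # illustrative path
    print(ckpt_path)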

@palashmoon
Author

palashmoon commented May 16, 2024

Hi @david-gimeno. I am trying to extract the EmoNet features using the following code; it is looking for a face_ID inside the directory.
[screenshot of the code]
But the current structure is like this:
[screenshot of the directory structure]

The following is the error I am getting:
[screenshot of the error]
How can I extract the face IDs in the current locations? Can you please provide some suggestions on this?

Also, I am facing multiple issues while installing the requirements for PASE+. Can you please help me with this too?

@david-gimeno
Collaborator

It seems the script was expecting an additional level of directories. So, the script has been modified; you can check it here. Please update the repo and try again.

Regarding the requirements for each feature extractor, we provide info in the README of the repo. Please read it carefully. Nonetheless, take into account that these installations usually depend on the OS architecture and might fail on certain occasions. Issues related to these installations should be solved by contacting the original authors of the corresponding models.

@palashmoon
Author

palashmoon commented May 17, 2024

@david-gimeno, is there any workaround for the installation of requirements.txt? There are a few installation issues I am facing. I have raised an issue with the original authors too:
santi-pdp/pase#128. These are the current errors I am getting during installation.

@palashmoon
Author

Hi @david-gimeno. I have downloaded all the videos from the D-vlog dataset. Should I split them based on the IDs given in the test, train, and validation CSV files? Or is there a separate file, video_ids.csv (used as python3 ./scripts/feature_extraction/dvlog/extract_wavs.py --csv-path ./data/D-vlog/video_ids.csv --column-video-id video_id --video-dir $VIDEO_DIR --dest-dir $WAV_DIR), which is missing in the repo? Please help me with this.

@waHAHJIAHAO

waHAHJIAHAO commented Jul 15, 2024

Hi @palashmoon, could you share your D-vlog dataset? I have been looking for it for a long time.

@bucuram
Collaborator

bucuram commented Jul 18, 2024

@waHAHJIAHAO please reach out to the authors of the D-vlog dataset to request access: D-vlog: Multimodal Vlog Dataset for Depression Detection

@waHAHJIAHAO

Hi @bucuram, may I ask how to fill in this config file?
[screenshot of the config file]

@bucuram
Collaborator

bucuram commented Jul 31, 2024

Hi, @waHAHJIAHAO!

It should contain the paths to the data, for example:

reading-between-the-frames:
  d-vlog: data/D-vlog/
  d-vlog-original: data/D-vlog/splits/original/
  daic-woz: data/DAIC-WOZ/
  e-daic-woz: data/E-DAIC-WOZ/
  num_workers: 8

@waHAHJIAHAO

Thanks a lot, @bucuram!! And what should I fill in for the "reading-between-the-frames" field?

@bucuram
Collaborator

bucuram commented Jul 31, 2024

That field should remain as it is in the example above. Then, you can use the env config when running the experiments.

The env config is already set as ENV="reading-between-the-frames".

@waHAHJIAHAO

@bucuram Yes!! Thank you for the reply. Since I want to process a new dataset collected at my university, I have some questions about the D-vlog dataset. The file path is data/D-vlog/splits, and the CSV files there contain the four fields voice_presence, face_presence, body_presence, and hand_presence. Were these four fields originally present in the D-vlog dataset, or were they computed by your team later? If you did this pre-processing later, please tell me how to generate this part of the data. And does the absence of this part have any effect on the model?
[screenshot of the splits CSV]
This is part of my dataset:
[screenshot of the new dataset]

@david-gimeno
Collaborator

david-gimeno commented Aug 1, 2024

Hi @waHAHJIAHAO!

These four dataframe columns were not originally in the D-Vlog dataset. We computed these statistics thanks to the feature extraction scripts you can find in this directory. As you can observe, for example, in the face detection script and the body landmarks identification script, we were creating a numpy array with the indices of those frames where no face or no body was detected. Additionally, as you can also notice, we were zero-filling the final feature sequence representing the video sample.

So, how did we compute those voice, face, etc. presence values? Having the information I mentioned above and knowing the frames per second of each video clip, we can compute the number of seconds where the subject was actually talking, actually present in the scene, etc. These are statistics to know more about the dataset. What we actually used for model training was the array with the indices where there was, e.g., no face, to create a mask of 0's and 1's that tells the model where it shouldn't pay attention. A small sketch of both ideas is shown below.
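A minimal sketch of that computation (not our actual code; the example values are only illustrative):

    import numpy as np

    # Illustrative inputs: indices of the frames where no face was detected,
    # the total number of frames of the clip, and its frame rate.
    no_face_idxs = np.array([10, 11, 12, 250, 251])
    num_frames, fps = 7500, 25

    # 0/1 mask used during training: 0 marks frames the model should not attend to.
    face_mask = np.ones(num_frames, dtype=np.int64)
    face_mask[no_face_idxs] = 0

    # Dataset statistic: number of seconds in which a face was actually present.
    face_presence_seconds = face_mask.sum() / fps
    print(face_presence_seconds)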

@waHAHJIAHAO

Hi @david-gimeno, thank you for your careful reply. I have completed the pre-processing of the "presence" part. I pre-processed my new dataset according to D-vlog, generated segmented npz files to feed into the model, and wrote the following data processing scripts, which are logically consistent with the D-vlog ones. The current error is that my train dataloader fails in torch.stack() when loading the data. I read on the fifth page of your paper that there is a learnable modality encoder that can unify the output of each modality. I would like to ask where this encoder is, or do you have any suggestions for my problem?
[screenshots of the data processing scripts]
This is my error:
[screenshot of the torch.stack error]

@waHAHJIAHAO

I just use 5 modalities and printed their shapes; they look like this:
[screenshot of the tensor shapes]

@david-gimeno
Collaborator

david-gimeno commented Aug 5, 2024

@waHAHJIAHAO
The tensor shapes look nice. I guess 270 refers to the number of frames composing your context window, which, if your videos were recorded at 25 fps, should correspond to a roughly 10-second span. Are you using our code or are you implementing your own dataset? Anyway, I recommend you carefully inspect our dataset script; specifically, here you have a good starting point. You can debug how your data shapes look at every dataset step, either with some tools or with simple, yet effective, print() and exit() calls.

Regarding the learnable modality encoders that unify all inputs into the same dimensional space, you can find their implementation here. These modality encoders are subsequently used here when defining our Transformer-based model. Note that some of the modalities will be flattened (check this code and this config file). I agree that our model handles and takes into account a lot of details, but I believe that going step by step to understand the code is the proper way, and it will surely not be a waste of time :) A padding sketch for the torch.stack error is included below.
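Regarding the torch.stack error itself, it usually means the per-sample tensors of a modality do not all have the same number of frames. This is not our repo's collate logic, just a generic sketch of zero-padding variable-length sequences (plus the corresponding attention mask) before batching:

    import torch
    from torch.nn.utils.rnn import pad_sequence

    def collate_modality(seqs):
        """Zero-pad a list of (frames, feat_dim) tensors so they can be batched together."""
        lengths = torch.tensor([s.shape[0] for s in seqs])
        padded = pad_sequence(seqs, batch_first=True)                      # (batch, max_frames, feat_dim)
        mask = torch.arange(padded.shape[1])[None, :] < lengths[:, None]   # (batch, max_frames), True = real frame
        return padded, mask

    # Example: three clips of different lengths for one modality.
    batch, mask = collate_modality([torch.randn(270, 136), torch.randn(180, 136), torch.randn(240, 136)])
    print(batch.shape, mask.shape)  # torch.Size([3, 270, 136]) torch.Size([3, 270])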

@waHAHJIAHAO

waHAHJIAHAO commented Aug 17, 2024

@david-gimeno Thank you for the reply!!!! I have run my dataset successfully, but I found an issue: training does not seem to converge; the best results occur within a few epochs, and 200 epochs seem unnecessary.
