
Ensuring correct flow computation and tweaking n_rgb #140

Open
antortjim opened this issue Apr 21, 2023 · 6 comments

@antortjim

antortjim commented Apr 21, 2023

Dear deepethogram developers

I am testing the performance of my flow generators to ensure that they detect the subtle fly behaviors I am interested in. For this I am interactively running deepethogram.flow_generator.inference.extract_movie to get a better intuition of what the flow generators do.

In my tests, I see that microbehaviors like a leg twitch are not captured by the flow generator when using the default n_rgb=11. I wonder whether, because my videos have a high fps (150), 11 frames are simply not enough to capture the overall behavior: at 150 fps, 11 frames span only ~73 ms, so within that window the animal is almost completely static. I am also worried that the flow computation, while present, is almost invisible, which suggests it may not be working 100%.

twitch.mp4

I have tried looking through the code and through https://arxiv.org/pdf/1704.00389.pdf to understand what the impact on speed and accuracy of increasing n_rgb in flow_generator training would be, but I could not find an intuitive explanation beyond what I can guess from reading the paper and the linked preprint.
Is the flow computation shown in the video normal, or is there indeed a problem? If so, would increasing n_rgb help? Or is there a better approach?

Thanks
Antonio

@antortjim

antortjim commented Apr 21, 2023

This is the same video after cropping out the flow (left) half, normalizing it, and recombining it with the original RGB. You can see that the flow contains square artifacts, probably caused by either

  1. reflection padding of the kernels, or
  2. the normalization, since the original span of the flow was 235 (min) to 255 (max), i.e. only ~20 out of 255.

As a result, the leg movement is not clear.

out.mp4

NOTE: This is how I normalized the flow so that its span becomes 0-255:

import cv2
import numpy as np

cap = cv2.VideoCapture("twitch.mp4")
absolute_min, absolute_max = 255, 0  # running span of the flow values

ret, frame = cap.read()
# the right half of each frame is the original RGB, the left half is the flow
original = np.uint8(frame[:, frame.shape[1]//2:, :])
frame = frame[:, :frame.shape[1]//2, :]
absolute_min = min(max(0, frame.min()), absolute_min)
absolute_max = max(min(255, frame.max()), absolute_max)

# stretch the flow's span to cover the full 0-255 range
frame = frame - frame.min()
frame = 255 * (frame / frame.max())
frame = np.uint8(frame)
frame = np.hstack([frame, original])
print(absolute_min, absolute_max)
# 235 255

@antortjim

This is the same experiment for a Proboscis Extension (PE) behavior:

PE.mp4
out.mp4

@antortjim

antortjim commented Apr 21, 2023

Also, please note that the artifacts in the flow are not a result of the .mp4 encoding, since they are also present in the .png files I saved without compression, like this:

# https://stackoverflow.com/a/59921563/3541756
cv2.imwrite(f"{str(i).zfill(6)}.png", frame, [cv2.IMWRITE_PNG_COMPRESSION, 0])

@antortjim

antortjim commented May 4, 2023

Revisiting my question: by studying the code further, I noticed that deepethogram.flow_generator.inference.extract_movie takes an argument I had glossed over, which caps how much flow magnitude (i.e. movement strength) the colors can represent. It sets the maximum movement that will be encoded with a distinct hue and saturation; above that cap, the maximum saturation is used regardless of the flow magnitude.

This argument is maxval. By default it is 5, but I noticed the maximum magnitude in my videos is around 0.5, which means my colors only cover around 10% of the possible space, i.e. at most 10% saturation is represented in my output video. This translates into overly white videos, given how the HSV system works (low S means white).

So to sum up, there is no need to normalize the video like I was doing (which was also a wrong way to normalize it). Instead, the maxval parameter should be set to roughly the maximum magnitude in your flow outputs.
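For anyone reading along, here is a minimal, self-contained sketch of the common flow-to-HSV visualization convention (not necessarily deepethogram's exact implementation, which may differ in details): direction maps to hue, and magnitude, clipped at maxval, maps to saturation, so a maxval far above the real magnitudes leaves everything near-white.

import cv2
import numpy as np

def flow_to_rgb(flow, maxval=0.5):
    # flow: H x W x 2 array of (dx, dy) displacements
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])  # ang in radians
    hsv = np.zeros((*flow.shape[:2], 3), dtype=np.uint8)
    hsv[..., 0] = np.uint8(ang * 180 / np.pi / 2)  # hue encodes direction (OpenCV hue is 0-179)
    # saturation encodes magnitude, clipped at maxval; if maxval is far above
    # the true magnitudes, saturation stays near 0 and the image looks white
    hsv[..., 1] = np.uint8(255 * np.clip(mag / maxval, 0, 1))
    hsv[..., 2] = 255
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB)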

@antortjim

antortjim commented May 4, 2023

I have made this commit so that the HiddenTwoStream model uses the num_images given by the config rather than the hardcoded value of 11.

shaliulab@9359aed

I am training DEG by calling the two scripts in https://github.com/shaliulab/vsc-scripts/tree/91ea5e1acd1663e2b14f832f76941ad346717596/manual/deepethogram/train sequentially, like so:

python flow_generator_train.py     --n-rgb 55 --epochs 10
python feature_extractor_train.py  --n-rgb 55 --epochs 10  --flow-weights $PATH_TO_WEIGHTS_FROM_LAST_RUN --feat-weights pretrained

To be more explicit, this is how I tell the DEG config that the models should use a different number of flows:

cfg.flow_generator.n_rgb=args.n_rgb
cfg.feature_extractor.n_flows=args.n_rgb-1
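As a self-contained illustration (the config below is a minimal stand-in, not deepethogram's full schema), the two fields can be kept in sync from a single n_rgb value, since n_rgb frames yield n_rgb - 1 consecutive-pair flows:

from omegaconf import OmegaConf

# stand-in config holding only the two relevant fields
cfg = OmegaConf.create({
    "flow_generator": {"n_rgb": 11},
    "feature_extractor": {"n_flows": 10},
})

n_rgb = 55  # e.g. args.n_rgb from the CLI
cfg.flow_generator.n_rgb = n_rgb
cfg.feature_extractor.n_flows = n_rgb - 1  # one flow per consecutive frame pair
assert cfg.feature_extractor.n_flows == cfg.flow_generator.n_rgb - 1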

A discussion on whether n_rgb needs to be matched to the framerate of the videos, and if so, whether it makes more sense to increase n_rgb or to downsample the videos' framerate (see the sketch below), would be appreciated.
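For reference, here is a minimal sketch of the downsampling alternative (filenames and the step factor are hypothetical): keeping every 5th frame of a 150 fps video yields 30 fps, so the default n_rgb=11 would then span ~0.37 s instead of ~73 ms.

import cv2

# hypothetical filenames; keep every 5th frame: 150 fps -> 30 fps
cap = cv2.VideoCapture("input.mp4")
step = 5
fps_out = cap.get(cv2.CAP_PROP_FPS) / step
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter("input_30fps.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps_out, (width, height))

i = 0
while True:
    ret, frame = cap.read()
    if not ret:
        break
    if i % step == 0:
        out.write(frame)
    i += 1

cap.release()
out.release()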

@antortjim

Regardless of what n_rgb I use, the videos produced with extract_movie (and maxval=0.5) consistently show a pattern where the fly has a homogeneous color throughout the recording. This is not what I expected, since there is no movement in much of the recording; also, movements in opposite directions get the same color, rather than opposite colors.

FlyHostel2_6X_2023-03-13_14-00-00_000205_001_flows.mp4
