In the RVM model, the GRU layer accounts for a large share of the computation. A natural question is: would it be better to replace the GRU layer with a Conv layer of the same computational cost? A simple 'yes' or 'no' answer would be greatly appreciated.
Recently I have been trying to build a matting model with excellent performance. I have read many recently proposed video matting papers and tested their matting performance. Even though RVM was proposed two years ago, it is the best open-source model (including training code) in my tests. Could you share some tips for improving the performance of RVM? I believe you have a lot of good ideas worth trying, and it would be greatly appreciated if you could share some of your insights here. Thank you very much!
No. The whole point of our research is to replace conv with GRU. The recurrent GRU architecture lets the model analyze the video sequence with temporal memory. If you replace it with Conv, the model will treat each frame independently, and the output will flicker.
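To make the temporal-memory point concrete, here is a toy scalar sketch. It is not RVM's actual ConvGRU: the weights in `P`, the helper names, and the two frame sequences are made up for illustration, and the gate formula follows one common GRU convention. It only shows that a stateless per-frame function gives the same output for the same frame, whatever came before, while a GRU-style update carries a hidden state whose value depends on earlier frames.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def per_frame(x, w=0.8):
    # Stateless stand-in for a conv layer applied to each frame on its own.
    return math.tanh(w * x)

def gru_step(x, h, p):
    # Scalar GRU-style update; p holds hypothetical hand-picked weights.
    z = sigmoid(p["wz"] * x + p["uz"] * h)          # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h)          # reset gate
    c = math.tanh(p["wc"] * x + p["uc"] * (r * h))  # candidate state
    return (1.0 - z) * h + z * c                    # blend old state and candidate

def run_gru(frames, p, h=0.0):
    # Feed frames in order, carrying the hidden state from one frame to the next.
    outputs = []
    for x in frames:
        h = gru_step(x, h, p)
        outputs.append(h)
    return outputs

# Hypothetical weights and inputs, chosen only for the demonstration.
P = {"wz": 0.5, "uz": 0.5, "wr": 0.5, "ur": 0.5, "wc": 1.0, "uc": 1.0}
seq_a = [1.0, 1.0, 1.0, 0.5]  # bright history, then the shared last frame
seq_b = [0.0, 0.0, 0.0, 0.5]  # dark history, then the same last frame

stateless_a = per_frame(seq_a[-1])
stateless_b = per_frame(seq_b[-1])
recurrent_a = run_gru(seq_a, P)[-1]
recurrent_b = run_gru(seq_b, P)[-1]

print("per-frame:", stateless_a, stateless_b)
print("recurrent:", recurrent_a, recurrent_b)
```

The per-frame outputs for the last frame are identical in both sequences, while the recurrent outputs differ because the hidden state remembers the earlier frames. In a real model, that memory is what lets the network keep predictions consistent across frames.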
I have not been following matting research lately, but here are some ideas off the top of my head:
Use a transformer instead of ConvGRU to model temporal relations.
Use a better backbone based on ViT, such as DINOv2.
Treat matting as a generative task, for example with a diffusion objective.