Hello Tsun-An, I have read your paper "IMPROVING PERCEPTUAL QUALITY BY PHONE-FORTIFIED PERCEPTUAL LOSS FOR SPEECH ENHANCEMENT", which is very well written. I have a question about lines 41 and 42 of 'dataset.py': why do you use the constant 16384 to constrain the input length? Does this constant have any special meaning? Thank you~
Hi, we are grateful to know that you are interested in our work!
The input is truncated because of limited VRAM: DCU-Net20 is quite large.
To generate an output with the same length as its input, the input length needs to be a power of two (2^n), so we chose 16384 (2^14) samples, which is about 1 second long.
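For readers who want to see the idea concretely, here is a minimal sketch of fixing every clip to 16384 samples. It is not the code from `dataset.py`: the helper names `fix_length` and `is_power_of_two`, the random-crop and zero-pad policy, and the 16 kHz sampling rate used for the duration estimate are all assumptions made for illustration.

```python
import random

SEGMENT_LEN = 16384  # 2**14 samples, the constant discussed in this thread


def fix_length(wave, segment_len=SEGMENT_LEN, rng=random):
    """Return exactly `segment_len` samples from `wave` (hypothetical helper).

    Longer inputs are randomly cropped; shorter inputs are zero-padded
    at the end. Both policies are assumptions, not the repository's code.
    """
    n = len(wave)
    if n >= segment_len:
        # randint is inclusive on both ends, so the last valid start is n - segment_len
        start = rng.randint(0, n - segment_len)
        return list(wave[start:start + segment_len])
    return list(wave) + [0.0] * (segment_len - n)


def is_power_of_two(n):
    """True when n is a positive power of two (exactly one bit set)."""
    return n > 0 and n & (n - 1) == 0


clip = [0.0] * 40000  # stand-in for a longer utterance
segment = fix_length(clip)
print(len(segment), is_power_of_two(len(segment)))

# Duration in seconds, assuming 16 kHz audio (the sampling rate is an assumption)
print(SEGMENT_LEN / 16000)
```

Cropping at a random offset rather than always from the start is a common choice because it exposes the model to different parts of each utterance across epochs, but a fixed crop would satisfy the length constraint just as well.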
Referenced code: PhoneFortifiedPerceptualLoss/dataset.py, lines 41 and 42 in d763760