E.g., line 62 in spec_augment_tensorflow.py:
'''
fbank_size = tf.shape(spectrogram)
n, v = fbank_size[1], fbank_size[2]
'''
Here 'n' is used as the length of the time axis, and 'v' as the length of the frequency axis.
But in spec_augment_test_TF.py, the reshaped mel_spectrogram from librosa should have shape (-1, n_mels, t, 1), which means fbank_size[1] is actually the length of frequency and fbank_size[2] is the length of time.
Am I wrong, or did I miss something?
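To make the mismatch concrete, here is a minimal sketch using NumPy only. The sizes (80 mel bins, 200 frames) are made up for illustration, and the zero array is just a stand-in for a librosa mel spectrogram, which has shape (n_mels, t):

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_mels, t = 80, 200

# Stand-in for a librosa mel spectrogram, which has shape (n_mels, t).
mel_spectrogram = np.zeros((n_mels, t), dtype=np.float32)

# Same reshape as described for spec_augment_test_TF.py.
batch = mel_spectrogram.reshape(-1, n_mels, t, 1)

# Same unpacking as in spec_augment_tensorflow.py.
fbank_size = batch.shape
n, v = fbank_size[1], fbank_size[2]

print(n, v)  # n holds the number of mel bins, v the number of frames
```

With this layout, 'n' ends up holding the frequency length and 'v' the time length, which is the opposite of how the two names are used afterwards.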
To me it looks like all the dimensions are in the wrong order, at least in the TensorFlow script. For me, the script applies the time warp along the frequency axis, for instance. An easy fix might be to transpose the spectrogram, pass it to the program, and then transpose the result back, though I haven't tried it.
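A rough sketch of that transpose workaround, untested against the actual repository. `mask_leading_steps_axis1` is a hypothetical stand-in for any augmentation that treats axis 1 as time, and `apply_with_time_first` is an illustrative helper, not part of the library:

```python
import numpy as np


def mask_leading_steps_axis1(batch, width):
    """Hypothetical augmentation that assumes axis 1 is time:
    zeroes the first `width` steps along axis 1."""
    out = batch.copy()
    out[:, :width, :, :] = 0.0
    return out


def apply_with_time_first(batch_freq_time, augment, **kwargs):
    """Swap (batch, freq, time, 1) to (batch, time, freq, 1),
    run the augmentation, then swap the axes back."""
    time_first = np.transpose(batch_freq_time, (0, 2, 1, 3))
    augmented = augment(time_first, **kwargs)
    return np.transpose(augmented, (0, 2, 1, 3))


# Batch laid out as (batch, n_mels, t, 1), as in the test script.
batch = np.ones((1, 80, 200, 1), dtype=np.float32)
result = apply_with_time_first(batch, mask_leading_steps_axis1, width=10)

print(result.shape)  # layout is unchanged: (1, 80, 200, 1)
```

In this sketch the first 10 time frames are zeroed across all mel bins, so the operation lands on the time axis even though the input keeps its original (frequency, time) layout. With TensorFlow tensors, `tf.transpose` with the same permutation would play the same role.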