
Unintentionally commented-out line in 02-pytorch-asr main_ctc.py? #22

Open

NickleDave opened this issue Oct 28, 2022 · 3 comments

@NickleDave

Hi again @jeremyfix

I noticed in this solution that this comment appears to contain another line of code that maybe should not be commented out:

```python
# compute the log_softmax unpacked_predictions = unpacked_predictions.log_softmax(dim=2) # T, B, vocab_size
```

shouldn't it actually be

```python
# compute the log_softmax
unpacked_predictions = unpacked_predictions.log_softmax(dim=2)  # T, B, vocab_size
```

so that you transform the "logits" into log probabilities?
If there's some reason you're intentionally not converting to log softmax, I'd be curious to know.

@jeremyfix
Owner

Good catch, that's actually a mistake.

wrap_ctc is indeed called before CTCLoss, which expects log probabilities.

That part of the code is also not expected to differ from the base code provided to be filled in.
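For reference, here is a minimal sketch of the intended flow: raw logits are converted to log probabilities with `log_softmax` before being passed to `nn.CTCLoss`. The variable names and sizes below are illustrative, not the lab's actual code; only the `(T, B, vocab_size)` layout is taken from the comment above.

```python
import torch
import torch.nn as nn

T, B, vocab_size = 50, 4, 30            # illustrative sizes
logits = torch.randn(T, B, vocab_size)  # raw network outputs, (T, B, vocab_size)

# CTCLoss expects log probabilities, so apply log_softmax over the vocab dimension
log_probs = logits.log_softmax(dim=2)

ctc = nn.CTCLoss(blank=0)
targets = torch.randint(1, vocab_size, (B, 10))         # dummy label sequences (no blanks)
input_lengths = torch.full((B,), T, dtype=torch.long)
target_lengths = torch.full((B,), 10, dtype=torch.long)

loss = ctc(log_probs, targets, input_lengths, target_lengths)
```

Feeding raw logits instead of `log_probs` here would not crash, which is why the bug is easy to miss: the loss is still computable, just wrong.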

@jeremyfix
Owner

@NickleDave I'm reopening this issue; I realize that this lab work is not working perfectly.

At least when tested on the French corpus of CommonVoice, my code fails to overfit a single minibatch and fails to overfit the training set. However, when training on a larger corpus, it ends up producing recognitions that look like the ground truth, so there must still be a bug somewhere. I spent some time going over the code, but I'm not able to catch anything wrong.

As you were digging deep into the code, did you possibly discover other issues? Did it work when you tried it, maybe on languages other than French?

Thank you for your insights.

@jeremyfix jeremyfix reopened this Jan 26, 2023
@NickleDave
Author

Hi @jeremyfix, thank you for letting me know about this.

Long story short, we are extending a previously designed model for annotating birdsong:
https://github.com/yardencsGitHub/tweetynet
But I am in the middle of a big revamp of the framework we use to run experiments:
https://github.com/vocalpy/vak/tree/version-1.0
I expect to be back to running experiments by the end of February.

Mainly I was looking at your code since it's one of the only good detailed examples I could find of using the rnn.utils API for a model that is not pure NLP.
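(For anyone landing here for the same reason: the basic pattern is pad → pack → recurrent layer → unpack. A minimal sketch with made-up sizes, not taken from either codebase:)

```python
import torch
from torch.nn.utils import rnn

# Two variable-length feature sequences, (length, feature_dim); sizes are illustrative
seqs = [torch.randn(5, 3), torch.randn(3, 3)]
lengths = torch.tensor([5, 3])

padded = rnn.pad_sequence(seqs, batch_first=True)  # (B=2, T=5, 3)
packed = rnn.pack_padded_sequence(padded, lengths,
                                  batch_first=True, enforce_sorted=True)

lstm = torch.nn.LSTM(input_size=3, hidden_size=4, batch_first=True)
packed_out, _ = lstm(packed)

# Unpack back to a padded tensor plus the original lengths
out, out_lengths = rnn.pad_packed_sequence(packed_out, batch_first=True)
```

Packing ensures the LSTM never sees the padding steps, so the outputs past each sequence's true length are zeros rather than garbage.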

I haven't discovered any other issues, but I will definitely tell you if I do.
