
Question about the decoder's input during training #70

Open
Lebron-Harden opened this issue Oct 18, 2021 · 0 comments

Comments

@Lebron-Harden

I've recently been working on recognition of ancient texts, using the attention module from PaddleOCR as the sequence-prediction head, but I've run into a problem:

In the attention decoder, if I feed the decoder's output from the previous time step as the input at the current step, training goes poorly: convergence is very slow and accuracy never climbs. But if I instead feed the ground-truth label of the previous step as the current input, convergence takes off; training accuracy quickly reaches 1, yet prediction accuracy stays at 0. It looks as if feeding the ground-truth labels directly as model inputs means the model never actually gets trained.
From my own understanding of seq2seq models, though, feeding the previous step's ground-truth label as the current input during training should steer the model in the right direction, make it converge more easily, and yield a better model; instead I'm seeing prediction accuracy stuck at 0. I'm genuinely confused, and I'd be grateful if someone more experienced could clear this up.
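For reference, the two regimes described above are usually called teacher forcing (ground-truth label as the next input) and free-running decoding (previous prediction as the next input), and the gap between them at inference time is the classic train/test mismatch. A minimal sketch of the two loop structures, where `decoder_step`, `VOCAB`, and `SOS` are hypothetical placeholders and not PaddleOCR's actual API:

```python
# Sketch of teacher forcing vs. free-running decoding.
# `decoder_step` is a dummy stand-in for one attention-decoder step;
# it only illustrates the loop structure, not a real model.
import random

VOCAB = 5   # toy vocabulary size (hypothetical)
SOS = 0     # start-of-sequence token id (hypothetical)

def decoder_step(prev_token, state):
    """Dummy step: returns fake logits over VOCAB and an updated state."""
    random.seed(prev_token + state)          # deterministic toy "model"
    logits = [random.random() for _ in range(VOCAB)]
    return logits, state + 1

def train_decode(targets):
    """Teacher forcing: the input at step t is the ground-truth token t-1."""
    state = 0
    inputs = [SOS] + targets[:-1]            # shift labels right by one
    outputs = []
    for tok in inputs:
        logits, state = decoder_step(tok, state)
        outputs.append(logits)               # a loss would compare these to `targets`
    return outputs

def infer_decode(max_len):
    """Free-running: the input at step t is the model's own previous prediction."""
    state, prev = 0, SOS
    preds = []
    for _ in range(max_len):
        logits, state = decoder_step(prev, state)
        prev = max(range(VOCAB), key=lambda i: logits[i])  # greedy argmax
        preds.append(prev)
    return preds
```

Note that `train_decode` never consumes its own predictions, while `infer_decode` must, which is why a model trained only with teacher forcing can look perfect on training accuracy yet fail badly at inference.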