And this trained model takes an image and outputs a list of actions... in other words, it is a multi-class classifier.
I guess that counts as knowledge distillation too 😏
A multi-class classifier...? I thought it was reinforcement learning T_T
One model computes the reward value and the other does the classification; that is how the PPO algorithm itself works.
PPO does work that way, but this code does not. I don't recommend reading or using this code as a reference; it is a pure waste of time.
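For anyone following the distinction being argued here, a minimal sketch may help. It is not taken from this repository, and the function names (`cross_entropy_loss`, `one_step_advantage`, `ppo_clip_loss`) are hypothetical. It contrasts the loss of a plain supervised multi-class classifier, which needs a ground-truth action label, with the textbook PPO clipped surrogate loss, which instead needs the old policy's probability for the sampled action and an advantage estimate. In standard PPO that estimate comes from a critic predicting the value (expected return) rather than the reward itself.

```python
import math


def cross_entropy_loss(probs, label):
    """Supervised multi-class loss: requires a ground-truth action label."""
    return -math.log(probs[label])


def one_step_advantage(reward, value, next_value, gamma=0.99):
    """One-step advantage estimate built from the critic's value predictions."""
    return reward + gamma * next_value - value


def ppo_clip_loss(new_prob, old_prob, advantage, eps=0.2):
    """PPO clipped surrogate loss for one sampled action (to be minimized)."""
    ratio = new_prob / old_prob  # how far the policy moved from the old one
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    # Take the more pessimistic of the unclipped and clipped objectives.
    return -min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    # Classifier: the label (index 1) comes from a dataset, no reward involved.
    print(cross_entropy_loss([0.25, 0.5, 0.25], 1))

    # PPO: no label, only a sampled action, its old probability, and an advantage.
    adv = one_step_advantage(reward=1.0, value=0.5, next_value=0.5)
    print(ppo_clip_loss(new_prob=0.6, old_prob=0.4, advantage=adv))
```

If a training loop only ever minimizes something like the first function against fixed labels, it is behaving as a classifier (or imitation learner), whatever the repository is called.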