Potential error in eval_gsm8k.py #23

Open
hbin0701 opened this issue Dec 27, 2023 · 0 comments

Dear authors, thank you for the amazing work and for sharing your code and data!

I have a question about your evaluation code: currently, if the model outputs an answer with a decimal point, the answer is automatically rounded to the nearest integer.

As a result, a wrong answer (e.g., 8.5) can be counted as correct (e.g., as 9) despite a calculation error, and such errors occur fairly often in model generations.

For this reason, I believe a stricter evaluation may be needed.
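
For illustration, here is a minimal sketch of what a stricter check could look like. The function names `lenient_match` and `strict_match` are hypothetical and are not taken from eval_gsm8k.py; the lenient version only mimics the rounding behavior described above and may differ from how the script actually rounds. Because Python's built-in `round` rounds exact halves to the nearest even integer, the sketch uses 8.6 rather than 8.5 as the wrong prediction.

```python
from decimal import Decimal, InvalidOperation


def lenient_match(pred: str, gold: str) -> bool:
    """Hypothetical stand-in for rounding-based scoring (an assumption)."""
    # Both values are rounded to an integer before comparison,
    # so a prediction of 8.6 matches a gold answer of 9.
    return round(float(pred)) == round(float(gold))


def strict_match(pred: str, gold: str) -> bool:
    """Hypothetical stricter check: compare exact numeric values."""
    try:
        # Drop thousands separators, then compare without any rounding.
        # Decimal treats "9.0" and "9" as numerically equal.
        return Decimal(pred.replace(",", "")) == Decimal(gold.replace(",", ""))
    except InvalidOperation:
        # A prediction that is not a number counts as wrong.
        return False


if __name__ == "__main__":
    print(lenient_match("8.6", "9"))  # rounding hides the error
    print(strict_match("8.6", "9"))   # exact comparison rejects it
    print(strict_match("9.0", "9"))   # formatting differences still pass
```

With exact comparison, an answer such as 9.0 is still accepted for a gold answer of 9, while a near miss such as 8.6 is rejected.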
