<QUESTION> Does Llama 3 add an <EOS> token during pretraining? #362

Open
bugm opened this issue Nov 20, 2024 · 0 comments
bugm commented Nov 20, 2024

Hello,
I noticed that the Llama 3 tokenizer loaded with hf transformers.AutoTokenizer only adds a <BOS> token when calling the encode function. May I ask which behavior was used during Llama 3 pretraining: adding only a <BOS> token, or adding both <BOS> and <EOS> tokens for each training document?
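For reference, a minimal sketch of the observed behavior (the checkpoint name meta-llama/Meta-Llama-3-8B is an assumption here for illustration, and access to the gated weights is required; any Llama 3 tokenizer should show the same default):

```python
# Minimal sketch of the observation above: encode() prepends only a BOS
# token by default and does not append an EOS token.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

# encode() uses add_special_tokens=True by default.
ids = tokenizer.encode("Hello world")

print(tokenizer.bos_token)                # <|begin_of_text|>
print(tokenizer.eos_token)                # <|end_of_text|>
print(ids[0] == tokenizer.bos_token_id)   # True: a BOS token is prepended
print(ids[-1] == tokenizer.eos_token_id)  # False: no EOS token is appended
```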
