Question: Is the model available Instruction tuned? #3

Open
CHesketh76 opened this issue Feb 29, 2024 · 0 comments
Comments

CHesketh76 commented Feb 29, 2024

Hello,

Just wondering whether the model you provided on Hugging Face was instruction-tuned to perform the needle-in-a-haystack test.

Also, hypothetically speaking, would some of the practices used to reduce GPU requirements also apply to SSM models? For example, Unsloth reduces GPU demand enough that consumer GPUs can train Llama 2 7B and Mistral 7B models. My 8 GB GPU was able to fine-tune Mistral for a small use case of mine. It would be absolutely amazing to see a Mamba-7B model train on half the resources that Unsloth needs for Mistral 7B.
