
Missing Transformers initializer for Falcon models #1988

Open
martin-gorner opened this issue Nov 19, 2024 · 3 comments

@martin-gorner
Contributor

martin-gorner commented Nov 19, 2024

Repro code:
model5 = keras_hub.models.CausalLM.from_preset("hf://tiiuae/falcon-7b-instruct", dtype="bfloat16")
Result:
ValueError: KerasHub has no converter for huggingface/transformers models with model type 'falcon'

Now that the Falcon model family exists in KerasHub, this should work.
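
For comparison, the same call path already succeeds for architectures that have a Transformers converter; for example (the Mistral preset name below is just an illustration):

model_mistral = keras_hub.models.CausalLM.from_preset("hf://mistralai/Mistral-7B-v0.1", dtype="bfloat16")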

@mehtamansi29 mehtamansi29 self-assigned this Dec 4, 2024
@mehtamansi29
Collaborator

Hi @martin-gorner -

Thanks for reporting the issue. You can initialize the falcon-7b-instruct model using the transformers AutoTokenizer and AutoModelForCausalLM classes.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "tiiuae/falcon-7b-instruct"

# Load the tokenizer and model directly from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
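
As a quick check that the objects above work, a minimal generation call (the prompt and token count are illustrative):

inputs = tokenizer("Write a haiku about falcons.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))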

And to load a model from the Falcon family that is already available as a KerasHub preset (falcon_refinedweb_1b_en), you can do this:
model5 = keras_hub.models.CausalLM.from_preset("hf://keras/falcon_refinedweb_1b_en", dtype="bfloat16")
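
And a quick smoke test of the loaded model (prompt and max_length are illustrative):

print(model5.generate("What is machine learning?", max_length=64))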

Attached a gist here for reference.

github-actions bot commented Dec 19, 2024

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label Dec 19, 2024
@martin-gorner
Contributor Author

Thanks @mehtamansi29, but this issue is filed against keras-hub. The problem is initializing a KerasHub model from the safetensors checkpoint, as is already possible for Llama, Gemma, etc. I'm logging this because cross-compatibility between KerasHub and Transformers checkpoints is not guaranteed even when the model architecture exists on both sides: a checkpoint translation module is also necessary. It is important to keep track of which model architectures have this translation module implemented and which do not.
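
For context on what such a translation module involves, here is a heavily simplified sketch, not KerasHub's actual converter API (the real converters live under keras_hub/src/utils/transformers/, and every weight name below is an assumption for illustration):

# Illustrative sketch only -- weight names on both sides are assumptions.
from safetensors import safe_open

def convert_falcon_weights(safetensors_path, backbone):
    with safe_open(safetensors_path, framework="np") as hf:
        # Token embeddings: Transformers tensor name -> KerasHub layer.
        backbone.token_embedding.embeddings.assign(
            hf.get_tensor("transformer.word_embeddings.weight")
        )
        # ...and similarly for every attention, MLP, and norm weight,
        # including any transposes or splits (e.g. fused QKV) where
        # the two layouts disagree.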
