Add optional arg to specify device for Transformer model. #165

Open · wants to merge 2 commits into main
Conversation

anordin95 (Contributor)

Hi,

First off, I wanted to say thanks for publishing this work so openly!

For curiosity's sake, I've been trying to run the models locally on my Mac M1, so my device options are 'cpu' and 'mps'. Either way, I need a way to specify the device rather than always using cuda.
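For reference, a rough sketch of the kind of device selection I mean; the fallback logic here is illustrative and not part of this diff:

    import torch

    # Prefer CUDA where available, then Apple's Metal backend (MPS),
    # then fall back to the CPU.
    if torch.cuda.is_available():
        device = torch.device("cuda")
    elif torch.backends.mps.is_available():
        device = torch.device("mps")
    else:
        device = torch.device("cpu")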

facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Oct 2, 2024
ashwinb (Contributor) left a comment:


Did you not need any other changes in generation.py?

anordin95 (Contributor, Author) commented on Oct 2, 2024:

> Did you not need any other changes in generation.py?

I probably would, but I haven't been able to run any of the generative models: my machine already struggles with the 1B and 3B models and usually kills the 8B models for using too much memory.

Update: I didn't find any other .cuda() calls, so there may be no other changes needed.

Update #2: there are other ways to hard-code "cuda" usage besides .cuda() calls (oops, silly me). I believe I found them all and updated them appropriately.
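For illustration, the pattern of change looks roughly like this; the variable name is made up, and the real call sites are in the diff:

    # Before: the device is hard-coded at tensor creation.
    mask = torch.full((seqlen, seqlen), float("-inf"), device="cuda")

    # After: the caller-supplied device is threaded through.
    mask = torch.full((seqlen, seqlen), float("-inf"), device=device)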


anordin95 (Contributor, Author) commented:

I went ahead and tried to find all the hard-coded "cuda" device calls and replace them appropriately.

anordin95 requested a review from ashwinb on Oct 2, 2024 at 22:41
TAMERALKHATE3B commented on the diff:

models/llama3/reference_impl/multimodal/model.py:

        torch.set_default_tensor_type(torch.cuda.HalfTensor)
    else:
        torch.set_default_tensor_type(torch.float16)


torch.set_default_tensor_type seems to be deprecated starting from PyTorch 2.1; see the note in the API description: https://pytorch.org/docs/2.5/generated/torch.set_default_tensor_type.html#torch-set-default-tensor-type

Maybe it's better to use torch.set_default_dtype(torch.float16)?
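A sketch of what I mean, assuming the goal is to keep the old behavior: set_default_tensor_type(torch.cuda.HalfTensor) set both the default dtype and the default device, so both would need replacing.

    # Deprecated since PyTorch 2.1:
    #   torch.set_default_tensor_type(torch.cuda.HalfTensor)

    # Non-deprecated equivalent: set dtype and device separately.
    torch.set_default_dtype(torch.float16)
    torch.set_default_device(device)  # e.g. "cuda", "mps", or "cpu"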

        model.setup_cache(model_args.max_batch_size, torch.bfloat16)
    else:
-       model = Transformer(model_args)
+       model = Transformer(model_args, device=device)


On my side, applying this PR and passing device to Transformer on initialization is not enough; I still need to call model.to(device) in a subsequent step, i.e.:

    model = Transformer(model_args, device=device)
    model.to(device)

This does not quite make sense to me, though. If we pass device at construction, we should not also have to call .to() to cast everything to the same device. I think it implies one of two fixes: either .to() should be called at the end of Transformer's __init__(), or passing device should not be required at all and the classes should be initialized device-agnostically (created on CPU), after which .to() should just work, since it seems to run recursively over all submodules. I think the reason it did not work is a couple of places where .cuda() is called on tensor creation.
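To make the first option concrete, a minimal sketch; the Transformer internals are elided, and this is not the PR's actual code:

    import torch
    import torch.nn as nn

    class Transformer(nn.Module):
        def __init__(self, args, device="cpu"):
            super().__init__()
            # Build all submodules device-agnostically (on CPU) ...
            # Then move everything recursively as the final step, so
            # callers never need a separate model.to(device).
            self.to(torch.device(device))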
