Add optional arg to specify device for Transformer model. #165
base: main
Conversation
Did you not need any other changes in generation.py?
I probably would, but I haven't been able to use any of the generative models. My machine already struggles with the 1B and 3B models and usually kills the 8B models for using too much memory.
Update #2: There are other ways to hard-code "cuda" usage (oops, silly me). I believe I found them all and updated them appropriately.
I went ahead and tried to find all the hard-coded "cuda" device calls and replace them appropriately.
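To illustrate the kind of change being described, here is a minimal before/after sketch; the tensor and helper names are made up for the example and are not from this PR:

```python
import torch

# Before: the device is hard-coded, so 'cpu' and 'mps' machines cannot run it.
#   mask = torch.full((1, 1), float("-inf"), device="cuda")

# After: the device comes from an argument instead.
def make_mask(device: torch.device) -> torch.Tensor:
    return torch.full((1, 1), float("-inf"), device=device)

mask = make_mask(torch.device("cpu"))
```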
models/llama3/reference_impl/multimodal/model.py

    torch.set_default_tensor_type(torch.cuda.HalfTensor)
else:
    torch.set_default_tensor_type(torch.float16)
torch.set_default_tensor_type seems to be deprecated starting from PyTorch 2.1; see the note in the API description: https://pytorch.org/docs/2.5/generated/torch.set_default_tensor_type.html#torch-set-default-tensor-type. Maybe it's better to use torch.set_default_dtype(torch.float16)?
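A minimal sketch of what the suggested replacement might look like; the set_defaults helper and the device check are illustrative, not part of the PR, and torch.set_default_device assumes PyTorch >= 2.0:

```python
import torch

def set_defaults(device: str) -> None:
    # Deprecated since PyTorch 2.1:
    #   torch.set_default_tensor_type(torch.cuda.HalfTensor)
    # Recommended replacement: set the default dtype and device separately.
    torch.set_default_dtype(torch.float16)
    if device == "cuda":
        torch.set_default_device("cuda")

set_defaults("cpu")
print(torch.empty(1).dtype)  # torch.float16
```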
    model.setup_cache(model_args.max_batch_size, torch.bfloat16)
else:
-   model = Transformer(model_args)
+   model = Transformer(model_args, device=device)
On my side it's not enough to apply this PR and pass device to Transformer on initialization. I still need to call model.to(device) in a subsequent step, i.e.:

    model = Transformer(model_args, device=device)
    model.to(device)

This does not quite make sense to me, though. If we pass device on class creation, then we should not also be required to call .to() to cast everything to the same device. I guess it implies that either .to() should be done at the end of Transformer's __init__(), or passing device should not be required at all and the initialization of the classes should be fixed so they are device-agnostic (created on CPU); then .to() should just work, since it seems to run recursively over all submodules. I think the reason it did not work is a couple of places where .cuda() is called on tensor creation.
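A minimal sketch of the device-agnostic pattern described above; SmallTransformer is a stand-in class, not the repo's Transformer. Parameters and buffers are created on CPU without any .cuda() calls, so a single .to(device) moves everything recursively:

```python
import torch
import torch.nn as nn

class SmallTransformer(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        # Created on CPU; no hard-coded .cuda() on tensor creation.
        self.register_buffer("freqs", torch.arange(dim, dtype=torch.float32))
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x + self.freqs)

device = "mps" if torch.backends.mps.is_available() else "cpu"
model = SmallTransformer().to(device)  # .to() recurses over submodules and buffers
out = model(torch.randn(1, 64, device=device))
```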
Hi,
First off, I wanted to say thanks for publishing this work so openly!
For curiosity's sake, I've been trying to run the models locally on my Mac M1, so my device options are 'cpu' and 'mps'. Either way, I need a way to specify the device rather than always using cuda.
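A minimal sketch of the device selection this enables; the fallback order is illustrative, and the commented Transformer call mirrors the keyword proposed in this PR rather than a finalized API:

```python
import torch

# Pick the best available device: CUDA, then Apple's MPS, then CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# With this PR, the device can then be forwarded to the model:
# model = Transformer(model_args, device=device)
```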