The output is so bad - total garbage what I am doing wrong? It is also super slow and requires huge amount of RAM #32
Comments
Thanks for reporting, will look into this and get back to you tomorrow |
In half-precision mode, the 6.7b model fits on a 3090. As for the output quality, you need to tweak the generation parameters a little; this blog post explains quite a bit. Here's a snippet of how I use it:
import torch, gc
from transformers import AutoTokenizer, OPTForCausalLM
tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-6.7b")
tokenizer.pad_token_id = 1
tokenizer.padding_side = 'left'
tokenizer.model_max_length = 2020
model = OPTForCausalLM.from_pretrained("facebook/galactica-6.7b", device_map="auto", torch_dtype=torch.float16)
input_text = """# Scientific article.
title: Contrastive analysis of models used for DRAM simulation.
# Introduction
"""
input_ids = tokenizer(input_text, return_tensors="pt", padding='max_length').input_ids.to("cuda")
outputs = model.generate(input_ids,
max_new_tokens=1000,
do_sample=True,
temperature=0.7,
top_k=25,
top_p=0.9,
no_repeat_ngram_size=10,
early_stopping=True)
print(tokenizer.decode(outputs[0]).lstrip('<pad>'))
gc.collect()
torch.cuda.empty_cache()
|
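One detail in the snippet above worth a note: `str.lstrip('<pad>')` treats its argument as a set of characters, not as a prefix, so it can also eat leading letters of the generated text if they happen to be `p`, `a`, or `d`. A minimal sketch of a prefix-based alternative follows; the helper name `strip_leading_pad` is made up for illustration and is not part of transformers.

```python
def strip_leading_pad(text, pad="<pad>"):
    """Remove repeated occurrences of the literal pad string from the start of text."""
    while text.startswith(pad):
        text = text[len(pad):]
    return text


decoded = "<pad><pad>data movement in DRAM"

# lstrip removes any leading run of the characters '<', 'p', 'a', 'd', '>'.
print(decoded.lstrip("<pad>"))
# The helper removes only whole "<pad>" prefixes.
print(strip_leading_pad(decoded))
```

With the prompts used in this thread the difference does not show, because they start with `#` or `T`, but it can matter for other prompts.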
@AbstractQbit ty very much for the answer. May I ask something regarding this format?
So does the text generator understand the # character as a special character and do something with it? What do these 2 parameters do? |
The authors say in the paper that the model was trained on text in markdown format, so giving markdown-ish prompts to the model should probably work best, I guess. I took the padding params from here: lines 85 to 86 in f6d9b0a
|
@AbstractQbit ty so much for the answers about these hyperparameters. Have you tested them, or how did you come up with those values?
|
@FurkanGozukara Those are just what I've ended up with after playing around with the model for a bit. There was no real methodology for picking those. They just produced somewhat sensible output, so I've shared them here as a starting point for you. There are no one-size-fits-all parameters, you'll have to experiment yourself to tailor them to your needs. As to what they do, please refer to the article I've linked above. I'm not an NLP expert, so I can't explain them any better than HF people. |
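For anyone who wants a rough picture of what `temperature`, `top_k` and `top_p` do before reading the linked article, here is a minimal pure-Python sketch. It is not the transformers implementation; the function name `filter_next_token_probs` and the toy logits are assumptions made for illustration. It scales the logits by the temperature, keeps the `top_k` highest, then keeps the smallest set of those whose cumulative probability reaches `top_p`.

```python
import math


def filter_next_token_probs(logits, temperature=0.7, top_k=25, top_p=0.9):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    scaled = [x / temperature for x in logits]

    # Top-k: keep only the k highest-scoring tokens.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    kept = order[:top_k]

    # Softmax over the kept tokens (subtract the max for numerical stability).
    peak = max(scaled[i] for i in kept)
    exps = {i: math.exp(scaled[i] - peak) for i in kept}
    total = sum(exps.values())
    probs = {i: e / total for i, e in exps.items()}

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    nucleus, cumulative = [], 0.0
    for i in kept:
        nucleus.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalise so the surviving probabilities sum to 1.
    total = sum(probs[i] for i in nucleus)
    return {i: probs[i] / total for i in nucleus}


toy_logits = [2.0, 1.0, 0.0, -1.0]
result = filter_next_token_probs(toy_logits, temperature=1.0, top_k=3, top_p=0.9)
print(sorted(result))
```

In this toy case the fourth token is dropped by `top_k=3`, and the third is dropped by `top_p=0.9` because the first two already cover about 91% of the probability mass. Sampling then draws only from what is left.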
@FurkanGozukara I played around with your prompt, and this is what the model came up with.
I followed the parameters from @AbstractQbit.
from transformers import AutoTokenizer, OPTForCausalLM
import torch
tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-6.7b")
tokenizer.pad_token_id = 1
tokenizer.padding_side = 'left'
tokenizer.model_max_length = 4020
model = OPTForCausalLM.from_pretrained("facebook/galactica-6.7b", device_map="auto", torch_dtype=torch.float16)
#input_text = "The Transformer architecture [START_REF]"
input_text = "Title: The benefits of deadlifting\n\n"
input_ids = tokenizer(input_text, padding='max_length', return_tensors="pt").input_ids.to("cuda")
outputs = model.generate(input_ids, max_new_tokens=1000,
do_sample=True,
temperature=0.7,
top_k=25,
top_p=0.9,
no_repeat_ngram_size=10,
early_stopping=True)
print(tokenizer.decode(outputs[0]).lstrip('<pad>')) |
@AbstractQbit your answer has helped me a lot, many thanks! Do you (or anyone else) know how to use the |
I'm getting a CUDA error with @legor's and @AbstractQbit's code. Details from the bug report when setting CUDA_LAUNCH_BLOCKING=1:
/home/conda/feedstock_root/build_artifacts/pytorch-recipe_1660087551192/work/aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [57,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
(the same assertion repeats for threads [1,0,0] through [31,0,0] of block [57,0,0], and for threads [32,0,0] through [63,0,0] of block [165,0,0])
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In [10], line 1
----> 1 outputs_ids = model.generate(
2 input_ids, max_new_tokens=1000,
3 do_sample=True, temperature=0.7,
4 top_k=25, top_p=0.9,
5 no_repeat_ngram_size=10,
6 early_stopping=True
7 ) |
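An assertion like `srcIndex < srcSelectDimSize` in an index-select kernel generally means some index used in a lookup (for example a token id or a position) is larger than the table being indexed. Whether that is the cause here has not been confirmed in this thread. A minimal sketch of a sanity check to run before `generate` follows; the helper name `find_index_problems` and the example limits are assumptions for illustration, and the real limits should be read from `model.config.vocab_size` and `model.config.max_position_embeddings`.

```python
def find_index_problems(token_ids, vocab_size, max_positions, max_new_tokens=0):
    """Return reasons why an embedding lookup could go out of range (empty list if none)."""
    problems = []

    # Token ids must lie in [0, vocab_size).
    bad = [t for t in token_ids if t < 0 or t >= vocab_size]
    if bad:
        problems.append(f"{len(bad)} token id(s) outside [0, {vocab_size})")

    # Conservative length check: prompt (including padding) plus new tokens.
    total = len(token_ids) + max_new_tokens
    if total > max_positions:
        problems.append(f"sequence length {total} exceeds {max_positions} positions")

    return problems


# Illustrative values only, not read from the real model config.
ok = find_index_problems([0, 5, 49999], vocab_size=50000, max_positions=2048)
too_long = find_index_problems([1] * 4020, vocab_size=50000,
                               max_positions=2048, max_new_tokens=1000)
print(ok)
print(too_long)
```

The length check is deliberately conservative: depending on how a model derives positions from the attention mask, left padding may not count toward the position limit, so a warning here is a hint to investigate, not proof of the bug.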
@legor ty so much. I wonder why they released a model without any proper example like yours. |
Unfortunately, Hugging Face doesn't support newdoc; I don't know why. I also wonder about your other questions. |
Here is my entire command:
And the output is total repetition and garbage. I am trying to generate an article based on the topic sentence I provide.
Also, even 28 GB of VRAM is not enough for the 6.7b model. I am testing CPU runtime on IPU, and it has been running for more than 2 hours with just the 6.7b model.
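A back-of-the-envelope calculation may explain the memory figures reported in this thread. The sketch below assumes a nominal 6.7 billion parameters and counts only the weights (no activations, no key/value cache), so real usage will be higher; the function name `weight_memory_gib` is made up for illustration.

```python
def weight_memory_gib(n_params, bytes_per_param):
    """Memory needed for the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


n_params = 6.7e9  # nominal size of the 6.7b model (assumption)

fp32_gib = weight_memory_gib(n_params, 4)  # default float32 loading
fp16_gib = weight_memory_gib(n_params, 2)  # torch_dtype=torch.float16

print(f"float32 weights: {fp32_gib:.1f} GiB")
print(f"float16 weights: {fp16_gib:.1f} GiB")
```

Under these assumptions the float32 weights alone take roughly 25 GiB, which leaves very little of a 28 GB device for anything else, while float16 needs about half of that and is consistent with the model fitting on a 24 GB 3090 as mentioned elsewhere in the thread.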
The output is below: