Fully compatible with the open clip Tokeniser #24925
Comments
cc @patil-suraj and @ArthurZucker
Hey, I am not really sure I understand the issue here. The following works:

```python
>>> tokenizer = CLIPTokenizer.from_pretrained("stabilityai/stable-diffusion-2-1", subfolder="tokenizer", pad_token="<|endoftext|>")
>>> print(tokenizer.pad_token_id)
49407
>>> tokenizer.pad_token_id = 0
>>> print(tokenizer.pad_token)
'!'
```
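For context, a quick check of why id 0 maps to `!` (a sketch against the standard OpenAI CLIP checkpoint, which is not the repo used in this thread):

```python
>>> from transformers import CLIPTokenizer
>>> tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
>>> tokenizer.convert_ids_to_tokens(0)  # first entry of the BPE vocab
'!'
>>> tokenizer.convert_tokens_to_ids("!</w>")  # word-final form produced for a literal "!"
256
```

So id 0 is the non-final `!` symbol, while a literal `!` in a prompt normally becomes `!</w>` (id 256), which matches the open_clip ids shown later in the thread.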
You are right, my implementation does not seem to be perfect. The purpose of this implementation is to match open_clip when converting from text to token ids. What I want to know is how to set `pad_token_id` to 0 without affecting words that are normally used, like `!`.
You can't really do that unless you train a new tokenizer 😅
cc @patil-suraj to explain the context around Stable Diffusion and OAI vs. openCLIP here maybe
Hey @laksjdjf, indeed, there's a discrepancy between the padding token used by open_clip (id 0, i.e. `!`) and the default one in the `transformers` CLIP tokenizer. But note that for the tokenizer used in SD2 and SDXL it's already set correctly, cf. https://huggingface.co/stabilityai/stable-diffusion-2-1/blob/main/tokenizer/special_tokens_map.json#L16

And a bit more context about the padding token in CLIP. CLIP doesn't care about the padding token, and a wrong padding token will only affect inference when using all token embeddings (like Stable Diffusion). For training, even if the padding token is wrong (i.e. if we pad with eos instead of 0), it has no effect, because only tokens up to the eos position contribute to the loss.

So the way CLIP is trained, it doesn't care about the padding token. It'll only affect inference if a different token (compared to the padding token used for training) is used for padding. And this is already taken care of in the SD 2 and SDXL repos.
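For reference, a minimal check (a sketch, assuming the `subfolder` argument of `from_pretrained`) that the SD2 tokenizer already ships with `!` as its pad token:

```python
>>> from transformers import CLIPTokenizer
>>> tokenizer = CLIPTokenizer.from_pretrained("stabilityai/stable-diffusion-2-1", subfolder="tokenizer")
>>> tokenizer.pad_token, tokenizer.pad_token_id
('!', 0)
```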
My concern is that the above process will result in a literal `!` in a prompt being converted to the pad token id 0 rather than its normal token id.
Hey @laksjdjf, as @patil-suraj mentioned, CLIP never used a padding token for training. It was trained with a causal mask, and only tokens up to the eos token are taken into account when computing the CLIP contrastive loss. All tokens following the eos token have no influence on the model, so one could have added any token here. Now, it only matters if a pretrained CLIP is further fine-tuned, as is done for SD. In this case the padding token was used to influence the loss, and in that sense SD does not make a difference between pad tokens and `!` tokens.
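A small experiment illustrating this point (a sketch, not from this thread; it assumes the standard OpenAI CLIP checkpoint, where the pad token equals eos): because of the causal mask, swapping everything after eos leaves the pooled embedding unchanged.

```python
import torch
from transformers import CLIPTextModel, CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
model = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")

ids = tokenizer("a photo of a cat", padding="max_length", max_length=77, return_tensors="pt").input_ids

# Swap the default padding (eos, id 49407) after the first eos for id 0 ('!'),
# mimicking open_clip-style padding.
eos_idx = (ids[0] == tokenizer.eos_token_id).nonzero()[0].item()
ids_alt = ids.clone()
ids_alt[:, eos_idx + 1:] = 0

with torch.no_grad():
    pooled = model(ids).pooler_output        # taken at the eos position
    pooled_alt = model(ids_alt).pooler_output

print(torch.allclose(pooled, pooled_alt))  # True: the causal mask hides the padding
```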
Hmmm, maybe I just don't understand, but my question is about the behaviour of the tokenizer itself:

```python
>>> tokenizer = CLIPTokenizer.from_pretrained("stabilityai/stable-diffusion-xl-base-0.9", subfolder="tokenizer_2")
>>> prompt = "! !! !!!"
>>> input_ids = tokenizer(prompt, padding="max_length", max_length=tokenizer.model_max_length, truncation=True, return_tensors="pt").input_ids
>>> print(input_ids)
tensor([[49406, 0, 0, 0, 0, 0, 0, 49407, 0, 0, ...
>>> print(open_clip.tokenize(prompt))
tensor([[49406, 256, 748, 995, 49407, 0, 0, 0, 0, 0, ...
```
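Here a literal `!` comes out as 0 because `!` is registered as the pad (special) token in `tokenizer_2`, so it is split off before BPE. One possible workaround (a sketch, not an official fix; it assumes the base OpenAI vocabulary matches open_clip's, which the ids above suggest) is to tokenize with a tokenizer in which `!` is not special and pad with id 0 by hand:

```python
>>> from transformers import CLIPTokenizer
>>> tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")  # pad token is <|endoftext|>, so "!" is not special
>>> ids = tokenizer("! !! !!!", truncation=True, max_length=77).input_ids
>>> ids = ids + [0] * (77 - len(ids))  # pad with id 0, as open_clip does
>>> print(ids[:8])
[49406, 256, 748, 995, 49407, 0, 0, 0]
```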
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hello, the fact that the pad tokens differ across the two SDXL-base-1.0 tokenizers raises a `RuntimeError` in a specific case when using the compel library. I am not 100% sure where the responsibility lies in this case, can anyone advise? Thank you!
Hey! Would recommend you to open an issue on the compel repository, since that is where the error surfaces!
Feature request
The open clip tokeniser has a `pad_token_id` of 0, but this cannot be achieved because `CLIPTokenizer.__init__()` cannot set a `pad_token_id`.
Motivation
Related to huggingface/diffusers#4153
Stable Diffusion v2 and XL use the open clip tokeniser. To avoid adding dependencies, transformers should offer the same functionality.
Your contribution
It seems possible to set `pad_token_id` directly after loading, but that is not a realistic solution.
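To make the request concrete, a sketch of the current behaviour (the checkpoint name is the standard OpenAI CLIP repo, chosen here for illustration): passing `pad_token="!"` does yield `pad_token_id == 0`, but it also registers `!` as a special token, which changes how a literal `!` in a prompt is tokenized, as the open_clip comparison above shows:

```python
>>> from transformers import CLIPTokenizer
>>> tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32", pad_token="!")
>>> tokenizer.pad_token_id
0
>>> tokenizer("! !! !!!").input_ids  # literal "!" now hits the special-token path
[49406, 0, 0, 0, 0, 0, 0, 49407]
```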