Weighted Prompts for Diffusers stable diffusion pipeline #1506
This has unfortunately only been added as a community pipeline, which imo is a very broken system that just adds tons of work for the end user managing all these pipes, and is not very API friendly. https://github.com/huggingface/diffusers/blob/main/examples/community/lpw_stable_diffusion.py With community pipelines, you get only what each pipeline advertises, and nothing else. It's not like the many other repos out there, like AUTOMATIC1111's, where these things are packaged together for use with all available features, creating a robust and feature-rich system. |
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |
For future readers: For a direct use case, we have the following community pipeline: https://github.com/huggingface/diffusers/blob/main/examples/community/lpw_stable_diffusion.py You can also define your own attention processor that weighs certain prompts differently by making use of this API: |
@patrickvonplaten Are there any plans to integrate this into the main pipeline? As @WASasquatch said, the community pipeline implementation is not very user friendly. It seems like it would be pretty useful to have it built in as a feature, given how often prompt weighting is used in the community. |
Upvoting this as I think prompt weighting is indeed an important feature that should be added to diffusers. Thanks for your hard work! <3 |
cc @patil-suraj what do you think? |
My opinion here is that
Nevertheless, we could/should try to more actively maintain https://github.com/huggingface/diffusers/blob/main/examples/community/lpw_stable_diffusion.py and potentially write a documentation page about it. Also @SkyTNT, what do you think? :-) |
How does supporting prompt weighting transform diffusers toward a UI? I think the kind of usage that would be expected here is to be able to use weights in a way similar to this and let the backend do its magic ;) :
|
I agree with @Ephil012 . But I'm busy recently, so I may not be able to contribute. |
What does a user interface have to do with back-end functionality? |
@patrickvonplaten I'd argue that adding this feature does not lead to diffusers becoming a full-fledged UI. This would simply be a feature on the backend when inputting prompts (like alexisrolland mentioned). You mentioned that the goal of diffusers is to act as a backend for projects providing a SD UI. However, by not implementing this feature, it's arguably harder to use diffusers as a backend. When building a UI, most users expect prompt weighting to be built in. By not having it in diffusers, each project has to build its own implementation. This causes duplicated work between projects and in general makes using diffusers harder. Personally, I started looking for alternatives to diffusers to build my side project on top of, simply because it was missing essential features like prompt weighting.

I'd also argue other common features should be built in, such as long prompts (this may have already been added, not sure), but that's a discussion for another thread. Yes, there are community pipelines that can be used, but it would make sense to have it in the main pipeline too for maintainability and reliability.

As far as implementation goes, I do think that some projects might not want to follow the A1111 syntax. There could be a default syntax, which you could customize via code. Or you could take the approach imaginAIry does, where they allow you to create a list of prompts and set weights in code (example below). Either approach would allow for using your own syntax.
|
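Editor's note: the "list of prompts with weights set in code" approach mentioned above can be sketched in plain Python. All names here (`WeightedPrompt`, `blend_embeddings`) are hypothetical stand-ins, not the real imaginAIry API, and plain lists of floats stand in for text-encoder output tensors:

```python
from dataclasses import dataclass

@dataclass
class WeightedPrompt:
    """One prompt fragment with a relative weight (hypothetical helper)."""
    text: str
    weight: float = 1.0

def blend_embeddings(embeddings, weights):
    """Blend per-fragment embedding vectors by a normalized weighted average.

    A real pipeline would do this on text-encoder output tensors; here,
    plain lists of floats stand in for embeddings.
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    dim = len(embeddings[0])
    return [sum(w * e[i] for e, w in zip(embeddings, weights)) / total
            for i in range(dim)]

prompts = [WeightedPrompt("a sunset", 2.0), WeightedPrompt("an oil painting", 1.0)]
fake_embeddings = [[1.0, 0.0], [0.0, 3.0]]  # stand-ins for encoder outputs
blended = blend_embeddings(fake_embeddings, [p.weight for p in prompts])
```

The API shape is the interesting part: weights live in code, so no prompt-string syntax needs to be parsed at all.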
If you are going to refer people to the current InvokeAI code as an example of how to use diffusers as a backend, be warned that there are parts that are not pretty. 😆 This is definitely a place where we had to work around the StableDiffusionPipeline rather than with it. I see that
You've already identified other use cases for exposing an API that takes text embeddings directly, such as #205 and #1869. It's also always easier to pass values to things than it is to subclass and override template methods, so factoring such a method out of the existing StableDiffusionPipeline would be welcome. |
I have a work-in-progress project turning the prompt weighting code I built for InvokeAI into a library called `incite`. A simple way of providing painless weighting support would be for the stable diffusion pipeline to support conditioning vectors as alternative input to prompt strings. The process of doing weighted prompting would then look something like this:

```python
pipeline = StableDiffusionPipeline.from_pretrained(...)
incite = Incite(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)

# weight of 'fluffy' is increased, weight of 'dark' is decreased
positive_conditioning_tensor = incite.build_conditioning_tensor(
    "a fluffy+++ cat playing with a ball in a dark-- forest"
)
negative_conditioning_tensor = incite.build_conditioning_tensor(
    "ugly, poorly drawn, etc."
)
images = pipeline(positive_conditioning=positive_conditioning_tensor,
                  negative_conditioning=negative_conditioning_tensor).images
```

This in itself is just a first step, however, because being able to alter prompts on the fly unlocks all sorts of other possibilities. Here's a more advanced design:

```python
pipeline = StableDiffusionPipeline.from_pretrained(...)
incite = Incite(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)

# at 50% of the way through the diffusion process, replace the word "cat" with "dog"
prompt = "a cat.swap(dog, start=0.5) playing with a ball in the forest"
conditioning_scheduler = incite.build_conditioning_scheduler(
    positive_prompt=prompt,
    negative_prompt=""
)
images = pipeline(conditioning_scheduler=conditioning_scheduler).images
# at the start of every diffusion step the pipeline queries the conditioning_scheduler
# for positive and negative conditioning tensors to apply for that step
```

This unlocks the capability for, as one early reviewer, @raefu, put it, "a generalized macro language that ultimately creates conditioning vectors for every step of the image generation". With such a flexible model it would be possible to do wild things like performing image comparison operations with the latent image vector part-way through the diffusion process and then programmatically altering the conditioning/prompt based on what has been partially diffused already. The possibilities are endless, and really quite exciting. |
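Editor's note: the per-step scheduling idea described above can be sketched in plain Python. `SwapScheduler` is a hypothetical name, and a real scheduler would return conditioning tensors rather than prompt strings:

```python
class SwapScheduler:
    """Sketch of prompt scheduling: use one prompt before a given
    fraction of the diffusion steps has elapsed, another after."""

    def __init__(self, before: str, after: str, start: float):
        self.before, self.after, self.start = before, after, start

    def conditioning_for_step(self, step: int, total_steps: int) -> str:
        # A real implementation would return conditioning tensors here.
        return self.after if step / total_steps >= self.start else self.before

sched = SwapScheduler(
    before="a cat playing with a ball in the forest",
    after="a dog playing with a ball in the forest",
    start=0.5,
)
# The pipeline would query the scheduler once per diffusion step:
schedule = [sched.conditioning_for_step(i, 10) for i in range(10)]
```

Steps 0–4 get the "cat" prompt and steps 5–9 the "dog" prompt, mirroring the `.swap(dog, start=0.5)` example.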
Opening a PR that allows passing prompt embeddings directly. |
thanks @patrickvonplaten - with 0.12 and my prompt weighting library Compel (based on the InvokeAI weighting code) I can now do this to apply weights to different parts of the prompt:

```python
from compel import Compel
from diffusers import StableDiffusionPipeline

pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
compel = Compel(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)

# upweight "ball"
prompt = "a cat playing with a ball++ in the forest"
embeds = compel.build_conditioning_tensor(prompt)
image = pipeline(prompt_embeds=embeds).images[0]
```

works great - thank you! |
Very cool @damian0815 ! |
So coool I need to try this !! Thank you!! |
@damian0815 very cool! What would be the syntax if we want to add weight to a group of words rather than just a single word? Thanks! |
you can put the `(words you want to weight)++` in parentheses

`this (also (supports)-- nesting)+`

speech marks `"also work"+` like this |
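Editor's note: a toy illustration of how such `++`/`--` markers can be parsed into per-fragment weights. The 1.1/0.9 per-mark factors follow InvokeAI's documented convention but should be treated as assumptions about any particular library's internals, and nesting (which the real library supports) is deliberately not handled:

```python
import re

def parse_weights(prompt, up=1.1, down=0.9):
    """Split a prompt into (text, weight) pairs.

    '(group)++' weights a parenthesized group; 'word--' weights a single
    word; each extra '+' or '-' compounds the factor. No nesting.
    """
    pattern = re.compile(r'\(([^()]+)\)(\++|-+)|(\S+?)(\++|-+)(?=\s|$)|(\S+)')
    out = []
    for m in pattern.finditer(prompt):
        if m.group(1) is not None:          # parenthesized group with marks
            text, marks = m.group(1), m.group(2)
        elif m.group(3) is not None:        # single word with marks
            text, marks = m.group(3), m.group(4)
        else:                               # unweighted word
            text, marks = m.group(5), ''
        if marks.startswith('+'):
            weight = up ** len(marks)
        elif marks.startswith('-'):
            weight = down ** len(marks)
        else:
            weight = 1.0
        out.append((text, weight))
    return out

parsed = parse_weights("a fluffy++ cat in a (dark forest)--")
```

Each fragment's weight would then be applied to its token embeddings before the text encoder output is handed to the pipeline.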
Thanks @damian0815! Do you actually have a link to documentation describing the different syntaxes? I am also wondering how to add different levels of weight to different bags of words... is it just something like: `(this bag is heavy)+++` while `(this bag is medium)+` and `(this one is really light)---`? |
that's right @alexisrolland. docs are linked on the readme but it's basically adapted from what i wrote for InvokeAI - https://invoke-ai.github.io/InvokeAI/features/PROMPTS/#prompt-syntax-features |
@damian0815 If I may, I think it would be nice if your compel library supported the AUTOMATIC1111 syntax. |
nope, not happening. the Auto111 syntax is rubbish |
Ha ha ha as much as I agree with you, it's becoming the defacto standard 😀 I prefer your syntax too... |
what i might consider adding is a converter that can convert auto syntax to invoke syntax. pull requests welcome :) |
That would be fantastic... the best of both worlds ^^ |
BTW, another use case that should be somewhat easily enabled by this is long-weight prompting: #2136 (comment) |
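Editor's note: the usual approach behind long-prompt support is to split the token ids into encoder-sized chunks, encode each chunk separately, and concatenate the resulting embeddings. A minimal sketch of just the chunking step; the 75/77 sizes assume CLIP's 77-token window and the begin/end tokens each chunk gets when re-wrapped:

```python
def chunk_token_ids(token_ids, chunk_size=75):
    """Split token ids into encoder-sized chunks.

    CLIP's context window is 77 tokens; chunks of 75 leave room for the
    begin/end tokens added to each chunk. Each chunk would then be
    encoded separately and the embeddings concatenated along the
    sequence dimension.
    """
    return [token_ids[i:i + chunk_size]
            for i in range(0, len(token_ids), chunk_size)]

ids = list(range(160))  # stand-in token ids for a long prompt
chunks = chunk_token_ids(ids)
```

A 160-token prompt yields three chunks (75, 75, 10), so nothing gets silently truncated at the 77-token limit.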
@patrickvonplaten I saw that the PR added the ability to pass embeddings in now. From my understanding, you still need to either write the prompt weighting code yourself or use a third party library (like compel). Do you know if there's any plans to add built in prompt weighting (similar to the LPW community pipeline) into one of the main Stable Diffusion pipelines? That way people don't have to use a third party code for this functionality. |
Prompt weighting won't be included in the main pipeline in order to keep the pipeline simple, so that users can easily follow and modify the pipeline on their own. The philosophy behind this is explained in this doc; we encourage users to give it a read :) |
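Editor's note: for readers who do want to roll their own weighting on top of `prompt_embeds`, a common approach (the one the LPW community pipeline takes) is to scale each weighted token's embedding and then rescale the result so the overall mean is preserved. A stdlib sketch with nested lists standing in for tensors; the helper name is hypothetical:

```python
def apply_token_weights(embeds, weights):
    """Scale each token's embedding row by its weight, then rescale the
    whole matrix so its overall mean is preserved (avoids the embedding
    magnitudes drifting, which would change image brightness/contrast).

    embeds: one row (list of floats) per token; weights: one float per row.
    """
    weighted = [[w * x for x in row] for row, w in zip(embeds, weights)]
    n = sum(len(row) for row in embeds)
    old_mean = sum(x for row in embeds for x in row) / n
    new_mean = sum(x for row in weighted for x in row) / n
    if new_mean == 0:
        return weighted
    scale = old_mean / new_mean
    return [[x * scale for x in row] for row in weighted]

# Upweight the second of three tokens by 1.5x:
embeds = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
out = apply_token_weights(embeds, [1.0, 1.5, 1.0])
```

The scaled matrix would then be passed as `prompt_embeds` in place of the plain text-encoder output.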
Maybe drop all that state of the art stuff, then. It's antiquated already. You all need to do better. People are going to be modifying this pipe and be lost, because of the lack of proper support, for shenanigans. As it stands, most big places using Diffusers aren't even using your pipes, but the community ones, and racking their heads on your backwards logic and "philosophy" (one of the worst things to talk about in open source code; your philosophy should be whatever the people want, otherwise just sell an API and be a business where this is expected behavior)
|
Hello @damian0815 I am trying to use your [...]

```python
compel = Compel(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)
prompt_embeds = compel.build_conditioning_tensor(payload.prompt) if payload.prompt else None
negative_prompt_embeds = compel.build_conditioning_tensor(payload.negative_prompt) if payload.negative_prompt else None
# [...]
pipeline(
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_prompt_embeds,
    image=init_images,
    strength=payload.init_image_noise,
    num_inference_steps=payload.steps,
    guidance_scale=payload.guidance,
    num_images_per_prompt=payload.num_images,
    generator=generators
)
```

Returns
I checked my |
hi @alexisrolland, check that you're on at least diffusers v0.12. if that doesn't fix it, please post the full stack trace (on the compel github issues rather than here) |
@damian0815 yes I'm on |
@patil-suraj I read the doc you sent. It helped clarify a lot of things for me. Thanks!

However, the one concern I have is about community pipeline support. Some of these pipelines provide essential features to devs, but seem to be less well maintained than the main pipeline. As a result, it makes devs hesitant to build on top of them or diffusers in general. The same goes for third party libs. Would it make sense to keep a simple main pipeline and then make some of the community pipelines part of the official pipelines list that are more actively maintained by huggingface? That way the philosophy of keeping stuff simple is adhered to, but it also provides devs with the features they need without worrying about whether a community pipeline will be abandoned in the future. I know it involves a lot of commitment to support a new pipeline, but I figured I might as well ask in case. I feel like officially supporting this will attract more people to diffusers vs other libraries.

On an unrelated note, should some of the stuff for compel be moved to another thread on that repo? It seems a lot of this thread has become a troubleshooting thread for a separate library. It might make sense to move the talk to compel's repo so that it's easier for people to find in the future while also keeping this thread more on topic. |
yeah that's probably my bad for not immediately redirecting people there. i'll be sure to do so in the future. |
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |
I could not find anything for diffusers and unfortunately I'm not on the level yet where I can implement it myself. :)
It would be amazing to be able to weight prompts like "a dog with a hat:0.5"
Thank you for this amazing library !!