Fooocus 2.1.0 Image Prompts (Midjourney Image Prompts) #557
Replies: 94 comments 134 replies
-
Awesome!
-
Using more images leads to worse result quality
-
Wow ... Fooocus is getting better and better
-
This looks like an incredible update! Could someone help me use it, please? When I try to put an image in the image prompt and render, I just get an error message:
EDIT: It works fine if I select "PyraCanny" or "CPDS", but if I try to use "Image Prompt" it gives me that ValueError.
-
Hard to say ... for me it works OK
-
2.1.19: PyraCanny improved a bit
-
@lllyasviel the new release is awesome. Is there any way to do style blending like this? Or like what is done here: https://www.tensorflow.org/tutorials/generative/style_transfer I have tried the latest image prompt but couldn't get similar results.
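For context on the linked tutorial: classical neural style transfer matches the Gram matrices of CNN feature maps, rather than conditioning generation on an image embedding the way Image Prompt does, which is one reason the results differ. A toy, pure-Python illustration of the Gram-matrix statistic (illustrative only; real implementations compute this on CNN feature maps, not raw pixels):

```python
# Sketch of the Gram-matrix style statistic used in neural style transfer
# (as in the linked TensorFlow tutorial). Pure-Python, illustrative only.

def gram_matrix(features):
    """features: list of per-location feature vectors (length C each).
    Returns the C x C channel-correlation matrix, averaged over locations."""
    n = len(features)
    c = len(features[0])
    gram = [[0.0] * c for _ in range(c)]
    for vec in features:
        for i in range(c):
            for j in range(c):
                gram[i][j] += vec[i] * vec[j] / n
    return gram

def style_distance(feats_a, feats_b):
    """Sum of squared differences between the two Gram matrices:
    the quantity style-transfer optimization drives toward zero."""
    ga, gb = gram_matrix(feats_a), gram_matrix(feats_b)
    return sum((ga[i][j] - gb[i][j]) ** 2
               for i in range(len(ga)) for j in range(len(ga)))
```

Two images with the same Gram statistics share "style" in this sense even when their content differs; an image-prompt adapter instead injects a learned embedding of the whole reference image.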
-
fixed some errors in CPDS in 2.1.24
-
The new features are so cool. The "Image Prompt" one seems a bit like "Revision". I'm enjoying the pose control we get from "PyraCanny" too! Very nice! Thank you for the sweet new features!
-
2.1.24:
-
Hint: you do not need to turn off "Fooocus V2" in most cases. "Fooocus V2" is handled differently from text prompts, so you do not need to worry about unwanted text being added to your prompts.
-
Can anyone please help me figure out how I can put my (or anyone's) face into this?
-
Guys, how do I make my AI influencer stop looking at the camera?
-
How do prompt weights work?
Ciao, .mau.
…On Tue, Jun 25, 2024, 16:54 Manuel Schmid ***@***.***> wrote:
@theaccofai You can define your prompts accordingly; using keywords like "side view" or "head towards XYZ" may work. Also consider using prompt weights for specific keywords.
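Fooocus understands A1111-style attention syntax, where `(keyword:1.5)` up-weights `keyword` relative to the rest of the prompt. A minimal parsing sketch to show what that syntax encodes (illustrative only, not the actual Fooocus tokenizer, which also handles nesting, `[...]` down-weighting, and more):

```python
import re

# Minimal sketch of A1111-style prompt-weight parsing: "(token:1.5)"
# scales attention on "token" by 1.5; plain text gets weight 1.0.
# Illustrative only -- not the real Fooocus/A1111 parser.

WEIGHT_RE = re.compile(r"\(([^():]+):([0-9.]+)\)")

def parse_weights(prompt):
    """Return a list of (text, weight) chunks from a prompt string."""
    chunks, pos = [], 0
    for m in WEIGHT_RE.finditer(prompt):
        if m.start() > pos:
            chunks.append((prompt[pos:m.start()], 1.0))
        chunks.append((m.group(1), float(m.group(2))))
        pos = m.end()
    if pos < len(prompt):
        chunks.append((prompt[pos:], 1.0))
    return chunks
```

So a prompt like `a portrait, (side view:1.4), soft light` asks the model to pay roughly 1.4x attention to "side view", which is why weighting directional keywords can help steer the gaze.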
-
Hi Guys,
-
Hello guys, I launched Fooocus using Google Colab and used PyraCanny, Image Prompt, and other features. However, there is still a problem that I cannot fix. I have generated thousands of pics and still can't figure out how to make my AI influencer not look at the camera, or how to make facial expressions like models from Insta, for example sticking out the tongue with eyes closed, looking left, etc. Please help me figure this out; you can answer here or text me on Telegram, my nickname is @theaccofg
-
Dear friends, can someone help me create real-people photos in exactly this style using Fooocus? Thank you very much.
-
How can I download this on my Chromebook? I've been having some problems with it because I'm new to this and I really wanted to download it.
-
What is the best model or method to make faces similar but not identical, like siblings (brother/sister)? Also, how can I maintain the same style and posture?
-
Please can you explain the difference between "Stop At" and "Weight"? What effect does each have on the final image created? It's not clear, thanks.
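As far as I understand the advanced Image Prompt controls: "Weight" scales how strongly the reference influences each sampling step, while "Stop At" is the fraction of the sampling schedule after which the reference stops being applied, letting the later steps refine freely. A schematic sketch of that interaction (illustrative only, not Fooocus's actual sampler code):

```python
def control_strength(step, total_steps, weight, stop_at):
    """Schematic of how "Weight" and "Stop At" interact per sampling step:
    the reference influence equals `weight` while normalized progress is
    below `stop_at`, and drops to zero afterwards. Illustrative only."""
    progress = step / total_steps
    return weight if progress < stop_at else 0.0

def influence_schedule(total_steps, weight, stop_at):
    """Influence applied at each step of a run, under the sketch above."""
    return [control_strength(s, total_steps, weight, stop_at)
            for s in range(total_steps)]
```

So lowering "Weight" makes the reference gentler throughout, while lowering "Stop At" keeps full strength early but releases the image sooner, which tends to preserve composition while freeing up fine detail.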
-
Sorry for the dumb question, but is there a way to run Fooocus 2.5 from Colab? I noticed the warning "You are using gradio version 3.41.2, however version 4.44.1 is available, please upgrade." and I think maybe I am using an old notebook.
-
How can I configure Fooocus (via Docker) to use the CPU?
-
Broken image links in this guide. The guide is still useful, but I thought you might want to know.
-
Hello. I would like to thank everyone for their work; they do great work. I have a request: please provide the option of outputting as vector/SVG. (If this already exists, can you tell me how to do it? I searched but could not find it.) Thank you very much again.
-
Hello, great work, team! I would like to integrate this via an API; is this possible and permitted? I reviewed the Gradio API documentation included in the UI, but it seems a bit confusing. Could you please assist me? I also want to contribute in terms of development if possible. Thank you in advance!
-
Fooocus 2.1.0 completes the implementation of image prompts. Because almost all of Midjourney's features are included after this version, the version number jumps directly to 2.1.0.
Image Prompt is one of the most important features of Midjourney. Below is the banner from Midjourney:
In Fooocus, it looks like this:
Technically, this feature is based on a mixture of IP-Adapter, a pre-computed negative embedding from the Fooocus team, an attention hacking algorithm from the Fooocus team, and an adaptive balancing/weighting algorithm from the Fooocus team.
The motivation for these efforts is to achieve the best possible match to the Midjourney Image Prompt. In other software like A1111/ComfyUI/InvokeAI, IP-Adapter still has some open problems, like ignoring text prompts or over-burned results when multiple images are used. These problems are solved in Fooocus, so users can enjoy a Midjourney-like Image Prompt experience.
The detailed differences are in the table below:
Using this method will download 2.5GB of files the first time!
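The announcement does not publish the balancing algorithm itself, but the general idea behind avoiding "over-burned" results is to keep the total conditioning strength constant as more reference images are stacked, instead of letting their influences pile up. A purely illustrative sketch of that idea (not Fooocus's actual algorithm):

```python
def balance_weights(raw_weights):
    """Illustrative sketch of adaptive balancing: rescale per-image
    weights so the *total* conditioning strength stays constant no
    matter how many reference images are stacked, rather than letting
    influences accumulate (the "over-burned" failure mode seen when
    naively chaining multiple IP-Adapters). Not the Fooocus algorithm."""
    total = sum(raw_weights)
    if total == 0:
        return raw_weights
    return [w / total for w in raw_weights]
```

With naive chaining, two images at weight 1.0 each push the conditioning twice as hard as one; normalizing the weights keeps the overall pressure on the sampler roughly constant as images are added.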
Example: Single Image Prompt without Text Prompts
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
(seed 1234, here is the image)
(this example uses default style and Fooocus V2 style)
Example: Single Image Prompt with Text Prompts
Note that mixing text and IP-Adapter is extremely difficult in ComfyUI/A1111. Fooocus does not have this problem.
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
(this example uses default style and Fooocus V2 style)
Example: Multiple Images without Text Prompts
Note that mixing multiple IP-Adapters is likely to lower result quality in ComfyUI/A1111. Using Fooocus can resolve this to some extent.
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
(this example uses default style and Fooocus V2 style)
Example: Multiple Images with Text Prompts and Even Multiple Styles
This is almost impossible in A1111/ComfyUI, since mixing text with IP-Adapter is extremely difficult there, and mixing multiple IP-Adapters is likely to lower result quality.
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
This image is too complicated to understand at a glance, so I have annotated it here:
Mixing this many things makes them harder to recognize, but everything is there, and it does not fail or cause a quality decrease, unlike ComfyUI/A1111/InvokeAI.
Fooocus Image Prompt (Advanced)
If you check "Advanced", you will be able to use two structure controls:
PyraCanny: A pyramid-based Canny edge control. The reason is that SDXL uses 1024px images, and at such a high resolution standard Canny tends to miss some image details. This method detects Canny edges at multiple resolutions and then combines them softly, so that more structures are captured than with plain Canny. The pyramid part is from "Edge Drawing: A combined real-time edge and segment detector". You will download a 350MB control model when using it.
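The multi-resolution idea can be sketched as: detect edges at several downsampled scales, upsample the maps back to full resolution, and combine them softly so that structures missed at one scale survive from another. A toy sketch on grayscale list-of-lists images (illustrative only; the real preprocessor uses proper Canny edge detection, not this crude gradient threshold):

```python
def downsample(img):
    """Halve the resolution by 2x2 average pooling."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2*y][2*x] + img[2*y][2*x+1] +
              img[2*y+1][2*x] + img[2*y+1][2*x+1]) / 4.0
             for x in range(w)] for y in range(h)]

def edge_map(img):
    """Crude gradient-magnitude stand-in for a Canny detector."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x+1] - img[y][x]
            gy = img[y+1][x] - img[y][x]
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

def pyramid_edges(img, levels=3):
    """Detect edges at several scales, upsample back to full resolution,
    and soft-combine with a per-pixel max so coarse structures survive."""
    h, w = len(img), len(img[0])
    combined = [[0.0] * w for _ in range(h)]
    level = img
    for _ in range(levels):
        edges = edge_map(level)
        scale = h // len(level)
        for y in range(h):
            for x in range(w):
                e = edges[min(y // scale, len(edges) - 1)] \
                         [min(x // scale, len(edges[0]) - 1)]
                combined[y][x] = max(combined[y][x], e)
        if len(level) >= 4:
            level = downsample(level)
    return combined
```

Note how, on a simple step edge, the coarse scales thicken and reinforce the edge response around pixels the full-resolution pass leaves empty; that is the "more structures are captured" effect.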
CPDS: A structure extraction algorithm from "Contrast Preserving Decolorization (CPD)". "CPDS" means CPD Structure. The control model is modified by the Fooocus team; it starts from SAI's depth Control-LoRA. The reason for using this method is its fast speed and download-free preprocessor. Note that we only use the structure part of images; it is not really "decolorization". You will download a 350MB control model when using it.
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
(this example uses default style and Fooocus V2 style)
(this example uses default style and Fooocus V2 style)
(this example uses default style and Fooocus V2 style)
For developers:
In Developer Debug Mode, you can mix upscale/vary/inpaint with all of the above features if you know what you are doing and REALLY need it (the denoising strength can also be set in Developer mode). You can also get the preprocessor result by checking "Debug Preprocessor".
But keep in mind:
If you accidentally get satisfying results in Fooocus by tuning a lot of advanced parameters, try copying your positive prompt, reopening Fooocus, changing nothing, and pasting the prompt. You will find that the results are even better, and all those tunings were unnecessary. (The only exception is probably changing the base model in "Advanced".)