[feature request] Please implement TencentARC/PhotoMaker #1980
Comments
Refers to #1959 |
We will review some newer methods in the next round of development. Fooocus never just "supports" a feature. If we add it, we will do some research and present something like Fooocus PhotoMaker to compete with all other available software that only "supports" those features. If we cannot make it better than all the others, we will not add it. All features of Fooocus are unique and cannot be reproduced by other tools. Newer research may outperform IP-Adapter, but it may not outperform Fooocus Image Prompt. Thanks for your support for Fooocus, as always. |
Hi, please also implement https://github.com/csslc/ccsr?tab=readme-ov-file |
Fooocus is the best generation model, thanks. |
Fooocus Image Prompt is good, but it has some drawbacks that limit its use. It takes into account not only general facial features but also the position (rotation) of the head, the lighting, and the background. Often, even if the head is a PNG with no background, Fooocus FaceSwap still picks up individual pixels left over from the background. To avoid this, you have to reduce the weight in the settings, and then you lose face recognition. |
It is my pleasure to bring some ideas to the Fooocus team. My biggest hope is for Fooocus to catch up with DALL-E 3 in prompt understanding. I came across this repo recently and hope the team can take a look as well. LLaVA: Large Language and Vision Assistant Code: Demo: https://github.com/LLaVA-VL/LLaVA-Interactive-Demo
|
Please also take a look at this. Its demo results look very good at following complex prompts. https://github.com/YangLing0818/RPG-DiffusionMaster Abstract: RPG is a powerful training-free paradigm that can utilize proprietary MLLMs (e.g., GPT-4, Gemini-Pro) or open-source local MLLMs (e.g., miniGPT-4) as the prompt recaptioner and region planner with our complementary regional diffusion to achieve SOTA text-to-image generation and editing. Our framework is very flexible and can generalize to arbitrary MLLM architectures and diffusion backbones. |
I have made a simple integration in my fork of Fooocus using diffusers. Could use some improvement but it is working well. Not every scheduler has a mapping to a diffusers scheduler. |
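To illustrate the scheduler-mapping point in the comment above, here is a minimal sketch of how such a lookup with a fallback might look. It is not the code from the fork: the sampler names on the left, the chosen fallback, and the helper `resolve_scheduler_name` are all assumptions made for illustration. The values on the right are names of scheduler classes that exist in diffusers, kept as plain strings so the snippet runs without diffusers installed.

```python
from typing import Tuple

# Hypothetical mapping from sampler names to diffusers scheduler class names.
# The keys are assumed sampler identifiers; the fork may use different ones.
SAMPLER_TO_DIFFUSERS = {
    "euler": "EulerDiscreteScheduler",
    "euler_ancestral": "EulerAncestralDiscreteScheduler",
    "heun": "HeunDiscreteScheduler",
    "lms": "LMSDiscreteScheduler",
    "ddim": "DDIMScheduler",
    "dpmpp_2m": "DPMSolverMultistepScheduler",
    "uni_pc": "UniPCMultistepScheduler",
}

# Assumed fallback for samplers that have no diffusers counterpart.
FALLBACK_SCHEDULER = "EulerDiscreteScheduler"


def resolve_scheduler_name(sampler: str) -> Tuple[str, bool]:
    """Return (scheduler class name, whether an explicit mapping existed)."""
    name = SAMPLER_TO_DIFFUSERS.get(sampler)
    if name is None:
        # No mapping: fall back and report it so the caller can warn the user.
        return FALLBACK_SCHEDULER, False
    return name, True


if __name__ == "__main__":
    for sampler in ("euler", "some_unmapped_sampler"):
        name, mapped = resolve_scheduler_name(sampler)
        note = "" if mapped else " (fallback, no direct mapping)"
        print(f"{sampler} -> {name}{note}")
```

With diffusers installed, the resolved name could then be turned into a scheduler with something like `getattr(diffusers, name).from_config(pipe.scheduler.config)`, where `pipe` is an already loaded pipeline.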
I recently played with PhotoMaker, the newly released model and code from TencentARC.
The repo is here:
https://github.com/TencentARC/PhotoMaker
The results are very impressive. It does an extraordinary job of retaining the features of the reference face, better than IP-Adapter, while still supporting stylization.
Can you please take a look? @lllyasviel
Thanks.