[Pipeline] Add LEDITS++ pipelines (#6074)
* Setup LEdits++ file structure
* Fix import
* LEditsPP Stable Diffusion pipeline
* Include variable image aspect ratios
* Implement LEDITS++ for SDXL
* clean up LEditsPPPipelineStableDiffusion
* Adjust inversion output
* Added docu, more cleanup for LEditsPPPipelineStableDiffusion
* clean up LEditsPPPipelineStableDiffusionXL
* Update documentation
* Fix documentation import
* Add skeleton IF implementation
* Fix documentation typo
* Add LEDTIS docu to toctree
* Add missing title
* Finalize SD documentation
* Finalize SD-XL documentation
* Fix code style and quality
* Fix typo
* Fix return types
* added LEditsPPPipelineIF; minor changes for LEditsPPPipelineStableDiffusion and LEditsPPPipelineStableDiffusionXL
* Fix copy reference
* add documentation for IF
* Add first tests
* Fix batching for SD-XL
* Fix text encoding and perfect reconstruction for SD-XL
* Add tests for SD-XL, minor changes
* move user_mask to correct device, use cross_attention_kwargs also for inversion
* Example docstring
* Fix attention resolution for non-square images
* Refactoring for PR review
* Safely remove ledits_utils.py
* Style fixes
* Replace assertions with ValueError
* Remove LEditsPPPipelineIF
* Remove unecessary input checks
* Refactoring of CrossAttnProcessor
* Revert unecessary changes to scheduler
* Remove first progress-bar in inversion
* Refactor scheduler usage and reset
* Use imageprocessor instead of custom logic
* Fix scheduler init warning
* Fix error when running the pipeline in fp16
* Update documentation wrt perfect inversion
* Update tests
* Fix code quality and copy consistency
* Update LEditsPP import
* Remove enable/disable methods that are now in StableDiffusionMixin
* Change import in docs
* Revert import structure change
* Fix ledits imports

---------

Co-authored-by: Katharina Kornmeier <[email protected]>
1 parent 30132ab, commit 00eca4b
Showing 16 changed files with 4,052 additions and 4 deletions.
@@ -0,0 +1,54 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# LEDITS++

LEDITS++ was proposed in [LEDITS++: Limitless Image Editing using Text-to-Image Models](https://huggingface.co/papers/2311.16711) by Manuel Brack, Felix Friedrich, Katharina Kornmeier, Linoy Tsaban, Patrick Schramowski, Kristian Kersting, Apolinário Passos.

The abstract from the paper is:

*Text-to-image diffusion models have recently received increasing interest for their astonishing ability to produce high-fidelity images from solely text inputs. Subsequent research efforts aim to exploit and apply their capabilities to real image editing. However, existing image-to-image methods are often inefficient, imprecise, and of limited versatility. They either require time-consuming fine-tuning, deviate unnecessarily strongly from the input image, and/or lack support for multiple, simultaneous edits. To address these issues, we introduce LEDITS++, an efficient yet versatile and precise textual image manipulation technique. LEDITS++'s novel inversion approach requires no tuning nor optimization and produces high-fidelity results with a few diffusion steps. Second, our methodology supports multiple simultaneous edits and is architecture-agnostic. Third, we use a novel implicit masking technique that limits changes to relevant image regions. We propose the novel TEdBench++ benchmark as part of our exhaustive evaluation. Our results demonstrate the capabilities of LEDITS++ and its improvements over previous methods. The project page is available at https://leditsplusplus-project.static.hf.space .*

<Tip>

You can find additional information about LEDITS++ on the [project page](https://leditsplusplus-project.static.hf.space/index.html) and try it out in a [demo](https://huggingface.co/spaces/editing-images/leditsplusplus).

</Tip>

<Tip warning={true}>
Due to some backward compatibility issues with the current diffusers implementation of [`~schedulers.DPMSolverMultistepScheduler`], this implementation of LEDITS++ can no longer guarantee perfect inversion.
This issue is unlikely to have any noticeable effect on applied use cases. However, we provide an alternative implementation that guarantees perfect inversion in a dedicated [GitHub repo](https://github.com/ml-research/ledits_pp).
</Tip>

We provide two distinct pipelines based on different pre-trained models: one for Stable Diffusion and one for Stable Diffusion XL.
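
As a rough illustration of how these pipelines are meant to be used, the sketch below wraps the two-step workflow (first `invert` a real image, then call the pipeline with editing prompts) in a small helper. The helper name `edit_image_with_ledits_pp`, the `model_id` and `device` arguments, and the specific values for `num_inversion_steps`, `skip`, `edit_guidance_scale`, and `edit_threshold` are assumptions chosen for illustration, not recommended settings; the API reference for `invert` and `__call__` remains the authoritative source for the signatures.

```python
from typing import List


def edit_image_with_ledits_pp(model_id: str, image, editing_prompt: List[str], device: str = "cuda"):
    """Hypothetical helper: invert a real image, then apply text-guided edits.

    `model_id` is assumed to name a Stable Diffusion checkpoint and `image`
    is assumed to be a PIL image. Neither is fixed by the documentation.
    """
    # Imported inside the function so that defining the helper does not
    # require diffusers (or its backends) to be installed.
    from diffusers.pipelines.ledits_pp import LEditsPPPipelineStableDiffusion

    pipe = LEditsPPPipelineStableDiffusion.from_pretrained(model_id)
    pipe = pipe.to(device)

    # Step 1: invert the input image. The pipeline keeps the inversion
    # results internally for the editing call that follows.
    # The step count and skip value here are illustrative assumptions.
    pipe.invert(image=image, num_inversion_steps=50, skip=0.1)

    # Step 2: apply one or more edits described by text prompts.
    # The guidance scale and threshold are illustrative assumptions.
    output = pipe(
        editing_prompt=editing_prompt,
        edit_guidance_scale=10.0,
        edit_threshold=0.75,
    )
    return output.images[0]
```

A call might look like `edit_image_with_ledits_pp(model_id, image, ["cherry blossom"])`. The Stable Diffusion XL pipeline documents the same `invert` and `__call__` entry points, so a similar pattern should apply there, although its arguments may differ.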

## LEditsPPPipelineStableDiffusion
[[autodoc]] pipelines.ledits_pp.LEditsPPPipelineStableDiffusion
- all
- __call__
- invert

## LEditsPPPipelineStableDiffusionXL
[[autodoc]] pipelines.ledits_pp.LEditsPPPipelineStableDiffusionXL
- all
- __call__
- invert

## LEditsPPDiffusionPipelineOutput
[[autodoc]] pipelines.ledits_pp.pipeline_output.LEditsPPDiffusionPipelineOutput
- all

## LEditsPPInversionPipelineOutput
[[autodoc]] pipelines.ledits_pp.pipeline_output.LEditsPPInversionPipelineOutput
- all
@@ -0,0 +1,55 @@
from typing import TYPE_CHECKING

from ...utils import (
    DIFFUSERS_SLOW_IMPORT,
    OptionalDependencyNotAvailable,
    _LazyModule,
    get_objects_from_module,
    is_torch_available,
    is_transformers_available,
)


_dummy_objects = {}
_import_structure = {}

try:
    if not (is_transformers_available() and is_torch_available()):
        raise OptionalDependencyNotAvailable()
except OptionalDependencyNotAvailable:
    from ...utils import dummy_torch_and_transformers_objects  # noqa F403

    _dummy_objects.update(get_objects_from_module(dummy_torch_and_transformers_objects))
else:
    _import_structure["pipeline_leditspp_stable_diffusion"] = ["LEditsPPPipelineStableDiffusion"]
    _import_structure["pipeline_leditspp_stable_diffusion_xl"] = ["LEditsPPPipelineStableDiffusionXL"]

    _import_structure["pipeline_output"] = ["LEditsPPDiffusionPipelineOutput", "LEditsPPInversionPipelineOutput"]

if TYPE_CHECKING or DIFFUSERS_SLOW_IMPORT:
    try:
        if not (is_transformers_available() and is_torch_available()):
            raise OptionalDependencyNotAvailable()

    except OptionalDependencyNotAvailable:
        from ...utils.dummy_torch_and_transformers_objects import *
    else:
        from .pipeline_leditspp_stable_diffusion import (
            LEditsPPDiffusionPipelineOutput,
            LEditsPPInversionPipelineOutput,
            LEditsPPPipelineStableDiffusion,
        )
        from .pipeline_leditspp_stable_diffusion_xl import LEditsPPPipelineStableDiffusionXL

else:
    import sys

    sys.modules[__name__] = _LazyModule(
        __name__,
        globals()["__file__"],
        _import_structure,
        module_spec=__spec__,
    )

    for name, value in _dummy_objects.items():
        setattr(sys.modules[__name__], name, value)
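
The module above combines two ideas: an optional-dependency guard that decides which names to export, and a lazy module that defers the actual imports until a name is first accessed. The following is a minimal, standard-library-only sketch of that pattern, not diffusers' actual `_LazyModule`. Every name in it (`OptionalDependencyMissing`, `is_available`, `make_dummy`, `LazyModule`, `build_lazy_module`, and the deliberately nonexistent package `surely_missing_backend_xyz`) is hypothetical and chosen for illustration only.

```python
import importlib
import importlib.util
import types


class OptionalDependencyMissing(Exception):
    """Hypothetical stand-in for an 'optional dependency not available' error."""


def is_available(package: str) -> bool:
    # find_spec returns None when a top-level package cannot be located.
    return importlib.util.find_spec(package) is not None


def make_dummy(name: str, backend: str):
    """Build a placeholder class that fails loudly when someone tries to use it."""

    class _Dummy:
        def __init__(self, *args, **kwargs):
            raise ImportError(f"{name} requires the `{backend}` package, which is not installed.")

    _Dummy.__name__ = name
    return _Dummy


class LazyModule(types.ModuleType):
    """Simplified lazy module: an exported name is imported on first access."""

    def __init__(self, name, import_structure):
        super().__init__(name)
        # Map each exported name to the module that defines it.
        self._name_to_module = {
            attr: module for module, attrs in import_structure.items() for attr in attrs
        }

    def __getattr__(self, attr):
        # Only reached when normal attribute lookup fails.
        name_to_module = self.__dict__.get("_name_to_module", {})
        if attr not in name_to_module:
            raise AttributeError(f"module {self.__name__!r} has no attribute {attr!r}")
        value = getattr(importlib.import_module(name_to_module[attr]), attr)
        setattr(self, attr, value)  # cache so later lookups skip __getattr__
        return value


def build_lazy_module(name, guarded_exports):
    """`guarded_exports` maps a required package to {module: [exported names]}."""
    import_structure, dummy_objects = {}, {}
    for backend, exports in guarded_exports.items():
        try:
            if not is_available(backend):
                raise OptionalDependencyMissing()
        except OptionalDependencyMissing:
            # Backend missing: register placeholders instead of real exports.
            for names in exports.values():
                for attr in names:
                    dummy_objects[attr] = make_dummy(attr, backend)
        else:
            import_structure.update(exports)

    module = LazyModule(name, import_structure)
    for attr, value in dummy_objects.items():
        setattr(module, attr, value)
    return module


demo = build_lazy_module(
    "lazy_demo",
    {
        "json": {"json": ["dumps"]},
        "surely_missing_backend_xyz": {"surely_missing_backend_xyz.pipelines": ["FancyPipeline"]},
    },
)

print(demo.dumps({"a": 1}))  # resolved lazily from the standard-library json module
try:
    demo.FancyPipeline()  # placeholder: raises because the backend is missing
except ImportError as err:
    print(err)
```

In this sketch the placeholder objects are attached with `setattr` after the lazy module is built, mirroring the final loop over `_dummy_objects` in the file above. The point of the design is that importing the package stays cheap and never fails outright; an error surfaces only when a caller actually tries to use a class whose backend is absent.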